Forensic Science International 117 (2001) 163–173
Y-chromosome variation in a Norwegian population sample

B. Myhre Dupuy a,*, Rune Andreassen a, Anne Grete Flønes a, Karianne Tomassen a, Thore Egeland b, Maria Brion c, Angel Carracedo c, B. Olaisen a

a Institute of Forensic Medicine, University of Oslo, The National Hospital, 0027 Oslo, Norway
b Epidemiological Centre, The National Hospital, 0027 Oslo, Norway
c Institute of Legal Medicine, University of Santiago de Compostela, E-15705 Santiago de Compostela, Galicia, Spain

Received 19 October 1999; received in revised form 1 September 2000; accepted 2 September 2000
Abstract

Y-chromosome DNA profiles are promising tools in population genetics and forensic science. Here we present DNA profiles of 300 unrelated Y-chromosomes of Norwegian origin. The profile is composed of eight short tandem repeats (STRs) and one single nucleotide polymorphism (SNP). In more than 2/3 of the haplotypes the modular structure in the 5′ end of the minisatellite locus DYF155S1 was revealed by minisatellite variant repeat PCR (MVR-PCR). These haplotypes were also typed for deletions of fragment 50f2/C (DYF155S2). Allele distribution and paternity exclusion parameters are given for each marker. The degree of haplotype diversity and its implication for statistics are evaluated. In the 300 samples 177 different haplotypes were encountered, of which 137 were observed once only. Analysis showed that the main source of variation is within the population. The Fst values were less than 0.015 in general. Haplotype grouping by the SNP demonstrated two haplogroups (Tat/T and Tat/C). Haplogroup Tat/C, found in 5.7% of the present material, is the same haplogroup as encountered in 60% of Finnish males [Am. J. Hum. Genet. 62 (1998) 1171]. Mutation analysis in 150 father/son pairs (a total of 1200 meiotic events) revealed an average mutation frequency of 0.0042 (95% CI 0.0014–0.0097). © 2001 Elsevier Science Ireland Ltd. All rights reserved.

Keywords: Norway; DYS19; DYS385I/II; DYS388; DYS389I; DYS389II; DYS390; DYS391; DYS392; DYS393; DYF155S1; DYF155S2; Haplotypes; STRs
* Corresponding author. Tel.: +47-23071313; fax: +47-23071248. E-mail address: [email protected] (B.M. Dupuy).
0379-0738/01/$ – see front matter © 2001 Elsevier Science Ireland Ltd. All rights reserved. PII: S0379-0738(00)00397-2

1. Introduction

In the last few years great efforts have been made to find polymorphic Y-chromosome markers suitable for PCR amplification. The haploid, non-crossing-over behaviour of the chromosome gives it a potential that in many respects resembles that of the maternally inherited mtDNA. The advantages for forensic casework are obvious. The haploid state makes mixture patterns much simpler, and in woman/man stain mixtures pure male DNA profiles may be obtained. Y-chromosome markers will also be of extraordinary value in solving selected paternity cases [2,3].

Our goal was to study the distribution of Y-chromosome polymorphisms in a Norwegian population sample. We also wanted to add empirical data concerning mutation rates of Y-chromosome STRs. We hereby report population databases, haplotype distribution and mutation analysis of eight Y-chromosome markers: DYS19 (Genome Database (GDB): 121409), DYS388 (GDB: 365729), DYS389I and II (GDB:
366108), DYS390 (GDB: 366115), DYS391 (GDB: 366118), DYS392 (GDB: 456509) and DYS393 (GDB: 456649). DYS385I/II (GDB: 316257) was also typed, but the results were not included in the statistical analysis or the combined mutation rate. An ancient T → C transition, frequent in Asians and Northern Europeans [4], was typed for all samples. To obtain further information on the relationship among these Y-chromosomes, 232 haplotypes were also typed for the modular structure in the minisatellite locus MSY1 (DYF155S1) [5,6] and 251 haplotypes for the deletion polymorphism DYF155S2 (50f2/C) [7].

2. Material and methods

2.1. Material

The material consists of 300 unrelated Norwegian males from paternity case material analysed at the Institute of Forensic Medicine. Sequentially received samples were used, but samples with non-Norwegian sounding surnames were omitted, thereby excluding chromosomes of obvious foreign origin. Each of the 20 counties in Norway is represented with a minimum of seven and a maximum of 34 samples. For the mutation rate study we used paternity-verified male offspring of 150 of the 300 adult males. The kinship status was confirmed by five hypervariable VNTR loci (D2S44, D7S21, D7S22, D12S11 and D14S13).

2.2. DNA extraction and PCR amplification conditions

Blood samples (EDTA) were extracted by the salting-out method [8]. PCR conditions for Multiplex I (DYS19, DYS389I and II and DYS390) and Multiplex II (DYS391, DYS392 and DYS393) are as described by Kloosterman et al. [9]. DYS385I/II and DYS388 were amplified singleplex under the following conditions: 0.01 M Tris pH 8.3, 0.05 M KCl, 1.5 mM MgCl2 (2.5 mM MgCl2 for DYS388), 1% Triton X-100, 200 µM of each dNTP, 0.256 µM of each primer, 1 unit Taq DNA polymerase, 2 ng DNA. Reaction volume 25 µl; 2 min denaturation at 95 °C; 28 PCR cycles of 20 s at 94 °C, 45 s at 58 °C and 60 s at 72 °C; 10 min final extension at 72 °C.
2.3. Amplification of locus DYF155S1 and DYF155S2

The DYF155S1 and DYF155S2 loci were amplified using primers Y1A and Y1B as described in Jobling et al. [10]. The deletion polymorphism DYF155S2 was typed by visualising the PCR products on 1% ethidium bromide stained agarose gels. Samples showing an absence of the constant small fragment DYF155S2 together with amplification of locus DYF155S1 were typed as deletions. Locus DYF155S1 was typed by the MVR-PCR technique as described in Jobling et al. [10]. MVR-PCR analysis of alleles was performed from the 5′ end and into, but not beyond, a central block of type 3 repeats found in all 232 alleles typed. Only forward typing detecting type 1 and type 3 variants was performed, using the repeat-specific primers TAG1 and TAG3 labelled with a fluorescent dye at the 5′ end. The MVR-PCR products were separated and visualised using an ABI Prism™ 377 Sequencer and Genescan analysis software as described in Section 2.5. Length variation of the repeat blocks was not studied.

2.4. Detection of the T → C transition

The DNA samples were amplified using the Tat1 and Tat3 primers [4]. The 112-bp PCR fragment was digested by Hsp92II (Promega) and analysed by electrophoresis in a 3% agarose gel containing ethidium bromide in 1× TBE buffer. All alleles carrying the ancient T were digested while those alleles carrying the C remained undigested. The latter was confirmed by MaeII (Boehringer) digestion.

2.5. Electrophoretic methods, software and fragment length analysis

An ABI Prism™ 377 Sequencer was used for fragment length analysis of STRs and for the analysis of the modular structure of DYF155S1. A standard-length gel and the recommended electrophoretic conditions were applied: 4.25% acrylamide, 36 cm well-to-read, 3000 V, 2400 scans/h, 3.0 h run, 51 °C. ABI Prism 377 Genescan™ Analysis software was used. Fragment length analysis was performed with the GS500 internal standard (Perkin Elmer).
2.6. Nomenclature and definitions

The repeat number nomenclature follows the ISFH guidelines [11] as published by Kayser et al. [12]. For DYS385I/II the nomenclature of Schneider et al. [13], also used in Kayser et al. [12], was followed. The nomenclature for the different alleles in the other systems is as follows. The two alleles observed at the ancient T → C transition were typed T-alleles and C-alleles, respectively. The two variants observed in DYF155S1 were typed as type 1 alleles or type 3 alleles depending on the succession of blocks formed by the two repeat variants observed in the first 20 repeats from the 5′ end and into the repeat array. The succession of repeat blocks is called the modular structure. The two variants observed in DYF155S2 (50f2/C) were typed deletion positive (del+) or deletion negative (del−). Four of the most frequent haplotypes observed were designated B1/69, C1/22, C3/33 and A/49, following Kittles et al. [1]. The two haplogroups, sorted by the SNP, were designated Tat/T and Tat/C by Tyler-Smith (personal communication). Haplogroup Tat/C corresponds to haplogroup A in Kittles et al. [1]. Using STRs for grouping, Tat/T was split into three clusters numbered Tat/T I, Tat/T II and Tat/T III.

Here, we define the haploid state of a chromosome inherited from parent to child as a haplotype. It is composed of the set of markers used at any time. In this study the haplotype is composed of eight STRs and one SNP. For more than 2/3 of the samples the haplotype is extended to include the modular structure of a minisatellite (232 samples) and a deletion polymorphism (251 samples). A haplogroup is defined as a group of haplotypes sorted by binary markers. With a large number of STRs, >10 (P. de Knijff, personal communication), or when analysing well-defined populations [14], STRs might also be used for grouping; however, binary markers are to be recommended.

2.7. Phylogenetic and statistical analysis

We have used the Arlequin package [15] to calculate gene diversity, haplotype diversity, Fst, analysis of molecular variance (AMOVA) and population differentiation. For possible clustering of haplotypes into haplogroups using STRs, we have considered formal methods
as implemented in Phylip [16] and S-plus [17]. However, these approaches proved less informative than the informal method used in this study. In the informal approach we used the number of differences in numbers of repeats to define the distance between the haplotypes. The results reported are based on the informal approach as well as K-means clustering [18]. The exact confidence interval for the mutation rate was calculated using StatXact 3.0 (Cytel Software Corporation).

3. Results

3.1. Allele frequencies

The allele frequencies of the Y-STRs, designated by number of repeats, are given in Table 1. For DYS385I/II the frequencies of the allele pairs are given. The allele pair 11–14, which is a typical European allele pair, is the most frequent in the Norwegian population sample.

3.2. Mutations

Mutation studies for nine STRs have been carried out. However, DYS385I/II must be considered with great caution, e.g. DYS385I/II variant 11, 13 could also be 13, 11, so observed mutations may be wrongly interpreted or even missed. For this reason DYS385I/II is not included in the calculation of the combined mutation rate or in any other analysis performed. One hundred and fifty confirmed father/son pairs have been analysed. Descriptions of mutation steps and calculations of individual mutation rates are presented in Table 2. The average mutation rate based on 1200 meiotic allele transmissions is 0.0042 (95% CI 0.0014–0.0097).

3.3. Haplotype distribution

We observed 177 different haplotypes in 300 unrelated Norwegians. One hundred and thirty-seven were observed in one individual only, while the most frequent haplotype, B1/69, represented 8.7% of the population sample studied. The population sample was divided into six regions: East, Central, North, South and West each consisting of two to eight
Table 1
Allele frequencies of nine Y-linked STR loci in 300 unrelated Norwegians

Locus        Repeats   n     Frequency
DYS385I/II   10–14     4     0.013
             10–15     1     0.003
             11–11     2     0.007
             11–12     3     0.010
             11–13     35    0.117
             11–14     105   0.350
             11–15     16    0.053
             11–16     2     0.007
             12–12     1     0.003
             12–13     1     0.003
             12–14     7     0.023
             12–15     2     0.007
             12–16     1     0.003
             13–13     5     0.017
             13–14     26    0.087
             13–15     2     0.007
             13–16     2     0.007
             13–17     4     0.013
             13–19     1     0.003
             13–20     2     0.007
             14–14     33    0.110
             14–15     27    0.090
             14–16     2     0.007
             14–17     1     0.003
             14–19     1     0.003
             15–15     10    0.033
             15–16     2     0.007
             17–17     1     0.003
             18–19     1     0.003

DYS388       10        6     0.020
             12        177   0.590
             13        19    0.063
             14        90    0.300
             15        6     0.020
             16        2     0.007

DYS389I      9         96    0.320
             10        147   0.490
             11        57    0.190

DYS390       21        1     0.003
             22        48    0.160
             23        95    0.316
             24        89    0.296
             25        66    0.220
             26        1     0.003

DYS392       11        190   0.633
             12        15    0.050
             13        74    0.246
             14        21    0.070

DYS19        13        8     0.027
             14        158   0.527
             15        94    0.313
             16        36    0.120
             17        4     0.013

DYS389II     24        1     0.003
             25        86    0.286
             26        93    0.310
             27        79    0.263
             28        29    0.097
             29        12    0.040

DYS391       9         2     0.007
             10        164   0.546
             11        131   0.436
             12        2     0.007
             13        1     0.003

DYS393       10        1     0.003
             12        13    0.043
             13        244   0.813
             14        40    0.133
             15        2     0.007
Table 2
Description of mutation steps and mutation rates for individual loci

Locus        Father/son types    Mutation rate
DYS390       23/24, 24/25        0.013
DYS391       11/10               0.007
DYS389II     27/26, 27/28        0.013
DYS385I/II   11–15/11–13         0.007
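The rates in Table 2 and the pooled estimate in Section 3.2 can be checked with a short calculation. The sketch below is not the StatXact procedure used in the study; it assumes that the "exact" interval is the Clopper-Pearson binomial interval and solves for its bounds by bisection. The function names `binom_cdf` and `exact_ci` are illustrative only.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k + 1))

def exact_ci(k, n, alpha=0.05, iters=100):
    """Clopper-Pearson two-sided interval for k events in n trials (assumed method)."""
    def solve(f, target):
        # Bisection for f(p) == target; f must be monotone decreasing in p on (0, 1).
        lo, hi = 0.0, 1.0
        for _ in range(iters):
            mid = (lo + hi) / 2.0
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0

    # Lower bound: P(X >= k | p) = alpha/2, i.e. P(X <= k-1 | p) = 1 - alpha/2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1.0 - alpha / 2.0)
    # Upper bound: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2.0)
    return lower, upper

# Mutations observed in 150 father/son pairs (Table 2, DYS385I/II excluded).
pairs, n_loci = 150, 8
mutations = {"DYS390": 2, "DYS391": 1, "DYS389II": 2}

per_locus = {locus: count / pairs for locus, count in mutations.items()}
meioses = pairs * n_loci                  # 1200 meiotic transmissions
total = sum(mutations.values())           # 5 mutations
rate = total / meioses
low, high = exact_ci(total, meioses)

print(per_locus)
print(f"average rate {rate:.4f}, 95% CI {low:.4f} to {high:.4f}")
```

Under these assumptions the pooled rate and interval should land close to the reported 0.0042 (0.0014–0.0097), and the per-locus values close to 0.013 and 0.007.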
counties (Fig. 1). The capital Oslo was also defined as a separate region because of substantial immigration from the rest of the counties. The haplotypes and frequencies are listed in the Y-STR Haplotype Reference Database [19].

3.4. Genetic structure

The analysis by AMOVA shows that the main source of variation is within the population. The percentage of variation found between regions is only
Fig. 1. Map of Norway, separated into regions, and neighbouring countries showing the distribution of haplogroup Tat/C in the northern county Finnmark, Norway and Finland.
Table 3
Distance method: pairwise difference (a)

Source of variation                  d.f.   Sum of squares   Variance components   Percentage of variation
Between groups (b)                   5      19.808           0.00974 Va            0.28
Between populations within groups    13     45.305           0.00311 Vb            0.09
Within populations                   278    956.301          3.43993 Vc            99.63
Total                                296    1021.414         3.45278

(a) AMOVA design and results.
(b) Groups of counties.
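The percentage column of Table 3 follows directly from the variance components. The sketch below recomputes it, and also derives an overall fixation index as (Va + Vb) / total, which is the standard AMOVA definition but is not a value reported in the paper; the variable names are illustrative.

```python
# Variance components from Table 3 (AMOVA, pairwise-difference distance).
components = {
    "between groups (Va)": 0.00974,
    "between populations within groups (Vb)": 0.00311,
    "within populations (Vc)": 3.43993,
}

total_variance = sum(components.values())
percent = {name: 100.0 * value / total_variance for name, value in components.items()}

# Overall fixation index under the usual AMOVA definition (an assumption here,
# not a figure taken from the paper): share of variance not found within populations.
phi_st = (components["between groups (Va)"]
          + components["between populations within groups (Vb)"]) / total_variance

for name, value in percent.items():
    print(f"{name}: {value:.2f}%")
print(f"total variance: {total_variance:.5f}, Phi_ST (assumed definition): {phi_st:.4f}")
```

Almost all of the variance sits in the within-population component, which is what the text means by the absence of substructure between regions.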
0.28 while it is 99.63 within populations (Table 3). Using this method there is no evidence of population substructuring between regions in the Norwegian population sample. We also analysed the different regions for signs of substructuring using the pairwise difference test. The Fst values were less than 0.015 in general (Table 4), thus supporting the AMOVA analysis. The differentiation test between all groups of counties (regions) and between all counties gave P-values of 0.56585 ± 0.1245 and 0.76375 ± 0.1245 (2000 Markov steps).

3.5. Grouping of haplotypes into haplogroups

The haplotypes are clustered into two clearly defined haplogroups, Tat/C and Tat/T, based on the T → C mutation. Haplogroup Tat/C represents 5.7% of the population sample. This result is in accordance with published data [4]. Haplogroup Tat/C is over-represented in the northern county Finnmark, which borders Finland (Fig. 1). More than half (9 out of 17) of the samples in this county belong to haplogroup Tat/C. In addition, all C-alleles analysed showed the DYF155S2 deletion, a finding in good accordance with Zerjal et al. [4].
3.6. Frequent haplotypes

By comparing our data to the Finnish population sample [1] we were able to identify four haplotypes, defined by STRs (B1/69, C1/22, C3/33 and A/49), that were relatively frequent in both population samples (Tables 5 and 10). In the Finnish population sample these haplotypes belonged to three different haplogroups defined by DYZ3 and DYF155S2. By manually sorting the haplotypes based on these four haplotypes we could observe grouping. By comparing these groups to the haplogroups sorted by the SNP, both Tat/T and Tat/C could be defined; however, Tat/T was further divided into three possible clusters designated Tat/T I, Tat/T II and Tat/T III. The cluster Tat/T I comprises the haplotypes that deviate by two (d = 2) or fewer repeats from B1/69, and similarly for Tat/T II (C1/22), Tat/T III (C3/33) and Tat/C (A/49). Table 6 summarises the results. Rows 1–4 indicate the four possible clusters and the number of haplotypes observed, obtained by summing for a distance corresponding to d = 1, 2 and 3 repeats. For d = 2 only one haplotype occurred in two different haplogroups, while for d = 3, 12 overlapping haplotypes were observed. We therefore decided to use d = 2 as our measure of distance.
Table 4
Population pairwise Fst values (a)

Populations   East       Central    North     Oslo      South
Central       −0.00594
North          0.01343   −0.00022
Oslo           0.01026   −0.00688   0.01156
South          0.00187   −0.00806   0.00005   0.00049
West           0.01102   −0.00006   0.00336   0.00679   −0.01671

(a) Distance method: pairwise difference.
Table 5
Haplotypes used as centre points for informal clustering (B1/69, C1/22, C3/33 and A/49)

Haplotype   Haplogroup   19 (a)   389I   389II   390   391   392   393   388   Number (b)   %
B1/69       Tat/T I      14       9      25      23    10    11    13    14    26           8.7
C1/22       Tat/T II     14       10     26      24    11    13    13    12    10           3.3
C3/33       Tat/T III    15       10     27      25    11    11    13    12    6            2.0
A/49        Tat/C        14       11     27      24    11    14    14    12    4            1.3

(a) Abbreviation for DYS19.
(b) Total number of observed haplotypes in 300 Norwegian males.
Table 6
Distributions of haplotypes into possible haplogroups using different repeat distances (d) to define the optimal cluster (a)

                       d = 1   d = 2   d = 3
Haplogroup Tat/T I     41      67      87
Haplogroup Tat/T II    35      55      72
Haplogroup Tat/T III   29      44      64
Haplogroup Tat/C       9       14      21
Total                  114     180     244

(a) d: the number of repeat differences that defines the cluster around each haplotype. One haplotype for d = 2 and 12 haplotypes for d = 3 were ascertained in more than one of these haplogroups (overlapping).
3.7. Modular structure and deletion polymorphism

In this study 232 alleles were typed by MVR-PCR analysis. At least 20 repeats were analysed for each allele, and in most cases more than 25 repeats. With a few exceptions, all alleles typed could be divided into two groups based on the modular structure of the first repeat block(s) prior to the central repeat block. One group starts with a block of type 1 repeats 5′ to the central type 3 repeat block (shown as type 1 alleles in Fig. 2). The other group starts with a block of type 3 repeats followed by a block of type 1 repeats before the central type 3 repeat block (shown as type 3 alleles in Fig. 2). Type 3 alleles were mainly observed in haplogroup Tat/T I for d = 2, while type 1 alleles were mainly observed in the three other haplogroups (Tat/T II, Tat/T III and Tat/C, Table 7). This result, together with the result for DYF155S2, adds evidence to the clustering of haplotypes into Tat/T I and Tat/C based on the STR data; however, it does not separate Tat/T II from Tat/T III.

3.8. Characterisation of four common haplotypes

The number of differences separating the four haplotypes, that form the basis for the informal clustering
Fig. 2. Examples of the modular structure of type 1 and type 3 of DYF155S1. At least 20 repeats into the repeat array were typed for all alleles.
Table 7
Distribution of type 1 and type 3 alleles together with DYF155S2 into the four possible haplogroups defined by STR data (d = 2)

             DYF155S1                                DYF155S2
Haplogroup   Type 1   Type 3   Samples not typed    del+   del−   Samples not typed
Tat/T I      3        57       7                    0      58     9
Tat/T II     51       4        0                    1      49     5
Tat/T III    33       1        10                   0      35     9
Tat/C        11       3        0                    12     1      1
Table 8
Differences separating the centre point haplotypes (Table 5)

Haplotypes       No. of repeat differences in the STRs (a)   DYF155S1   DYF155S2   T → C transition
B1/69 → C1/22    8                                           +          −          −
B1/69 → C3/33    9                                           +          −          −
B1/69 → A/49     12                                          +          +          +
C1/22 → C3/33    5                                           −          −          −
C1/22 → A/49     4                                           −          +          +
C3/33 → A/49     7                                           −          +          +

(a) Assuming single-step mutations and no reverse mutations; (+): differ, (−): do not differ.
Table 9
Gene diversity by group and total

Locus          North    East     Central   West     South    Oslo     Norway
DYS19          0.5845   0.5992   0.6551    0.6190   0.6233   0.5741   0.6112
DYS388         0.5807   0.5718   0.5807    0.5640   0.5100   0.5450   0.5589
DYS389I        0.6725   0.6224   0.6303    0.5694   0.6533   0.6217   0.6235
DYS389II       0.7710   0.7185   0.7491    0.7183   0.7600   0.7513   0.7439
DYS390         0.7391   0.7644   0.7385    0.7019   0.7367   0.7540   0.7402
DYS391         0.5527   0.4925   0.5310    0.5084   0.5567   0.4762   0.5121
DYS392         0.5961   0.4922   0.4690    0.5873   0.5867   0.4153   0.5324
DYS393         0.4261   0.2848   0.2668    0.4122   0.1567   0.2619   0.3198
HT diversity   0.9952   0.9858   0.9894    0.9931   0.9900   0.9894   0.9889
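The per-locus values in the "Norway" column of Table 9 can be reproduced from the allele counts in Table 1. The sketch below assumes the unbiased estimator n/(n − 1) · (1 − Σp²), which appears consistent with the tabulated figures; the function name `gene_diversity` is illustrative, and the haplotype diversity row cannot be recomputed this way because it needs the full haplotype counts.

```python
def gene_diversity(counts):
    """Unbiased gene diversity, n/(n-1) * (1 - sum of squared allele frequencies)."""
    n = sum(counts)
    sum_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_sq)

# Allele counts for three loci, taken from Table 1 (300 unrelated Norwegians).
allele_counts = {
    "DYS19":    [8, 158, 94, 36, 4],
    "DYS389II": [1, 86, 93, 79, 29, 12],
    "DYS393":   [1, 13, 244, 40, 2],
}

for locus, counts in allele_counts.items():
    print(f"{locus}: GD = {gene_diversity(counts):.4f}")
```

The locus dominated by a single allele (DYS393, allele 13) gives the lowest diversity, while the locus with several alleles of similar frequency (DYS389II) gives the highest.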
by STRs, is shown in Table 8. Micro- and minisatellite mutations together with the SNP are presented. Haplotype B1/69 stands apart from the other three haplotypes, showing the greatest dissimilarity to the rest and especially to A/49.

3.9. Gene diversity

Table 9 shows the gene diversity (GD) by group and in total. The individual GD for the total sample ranges from 0.3198 for DYS393 (implying that two randomly chosen haplotypes from this population differ at this locus with probability 0.3198) to 0.7439 for DYS389II. When all loci are combined, the GD or exclusion probability reaches 98.89%.

4. Discussion

4.1. Mutation analysis

Based on 1200 meioses the average mutation rate for all loci (DYS385I/II excluded) was calculated to be
0.0042 (95% CI 0.0014–0.0097). This is close to the estimate of Kayser et al. [12] of 0.0032 based on 626 meioses in DYS19. In a Galician population study [20], 35 father/son combinations were analysed for seven Y-STR polymorphisms (245 meioses). No mutations were found. Lessig and Edelmann [21] found one single-repeat mutation in DYS390 when analysing 41 father/son pairs in four systems (164 meioses). In studies by Heyer et al. [22] and Kayser et al. [23], average mutation rates of 2.1 × 10⁻³ (213 meioses) and 3.17 × 10⁻³ (4999 meioses) were calculated, respectively.

4.2. Sorting criteria for haplotypes

Using binary markers as sorting criteria for haplotypes into haplogroups is widely accepted. However, using STRs or minisatellites for grouping is heavily debated, as haplotypes can be identical by state and not by descent. Only by increasing the number of STRs (>10) might one obtain non-overlapping haplogroups (P. de Knijff, personal communication). When using four of the frequent haplotypes observed
in the Finnish and the Norwegian population samples for clustering, the number of haplotypes for d = 2 in each of the four haplogroups is maximised, covering 60% of the population sample (81% for d = 3). Further, using STRs as sorting criteria for possible haplogroups in this study gives results in good agreement with the two haplogroups sorted by the SNP and with the groups sorted by the modular structure of the minisatellite locus DYF155S1 and the deletion polymorphism DYF155S2. However, Tat/T II and Tat/T III are only separated by the STR data (Table 8), and typing of additional binary markers would be needed to confirm whether these clusters represent true haplogroups.

4.3. Comparison of haplotype distribution

By comparing the Norwegian population sample to the Finnish population sample [1] we are able to identify all frequent haplotypes shared by the two populations. As Kittles et al. [1] used allele sizes in base pairs instead of typing the alleles by the number of repeat units, their haplotypes were translated into repeat units (types) before comparison. Seven microsatellite loci are shared between the two studies, together with DYF155S2, the latter defining haplogroup Tat/C, called haplogroup A in the Finnish population sample. For comparison of haplogroup Tat/C to haplogroup A, we used the same definition for grouping as Kittles et al. [1], taking into account that all C-alleles analysed showed the DYF155S2 deletion. Further comparison between haplogroups seems difficult because the two studies do not share the binary markers that define true haplogroups. The results are presented in Table 10. These data show that there is a significant difference in the distribution of the haplotype A/49 in the two populations. The very high frequency of haplotype A/49 in Finns compared to Norwegians may reflect a Finnish and/or Saami origin of this haplotype in Norway. This is supported by the fact that haplogroup Tat/C is over-represented in the northern county Finnmark, which borders Finland (Fig. 1). More than half of the samples in this county belong to haplogroup Tat/C, which is encountered in 60% of Finnish males. However, the frequencies of haplotypes C1/22, C3/33 and B1/69 are relatively similar in the two population groups.

Table 10
Comparison of distribution of frequent haplotypes based on seven STRs and haplogroup Tat/C between the present Norwegian and the published Finnish population samples [1]

Haplotype (haplogroup)   Finns % (n = 77)   Norwegians % (n = 300)
B1/69                    10.7               8.7
C1/22                    1.8                3.3
C3/33                    1.4                2.0
A/49 (Tat/C)             28.9 (60.0)        1.3 (5.7)

4.4. Y-chromosome ancestors

Unlike the relatively frequent mutations in STRs and minisatellites, the deletion generating polymorphism DYF155S2 and the events generating a new modular structure in DYF155S1 are regarded as much rarer events [6,10]. DYF155S1 contains an array of AT-rich tandem repeats that shows highly polymorphic length variation revealed by MVR-PCR [10]. However, unlike other minisatellites, different repeat variants are distributed in blocks of similar variant repeats within the repeat array [10]. Based on the distribution of such blocks, the alleles can be divided into groups with different modular structures. Each group contains alleles believed to be evolutionarily more closely related to each other than to alleles in other groups [10]. Thus, the deletion polymorphism and the modular structures in MSY1 might be used, in the same manner as binary markers, to group alleles in populations sharing a common ancestry. In agreement with this, typing of the modular structure in DYF155S1 alleles adds evidence to the grouping by STRs into haplogroup Tat/T I, while the deletion polymorphism DYF155S2 together with the T → C transition adds evidence to the STR grouping of alleles into haplogroup Tat/C.

4.5. Forensic and paternity evaluation

The male specificity of the human Y-chromosome, together with the haploid, non-crossing-over behaviour of the chromosome, makes it potentially useful in forensic studies and paternity testing. In forensic casework the haploid state makes mixture patterns and male/female stain mixtures much simpler to interpret.
In paternity testing the use of Y-chromosome markers may be of value in selected cases where the alleged father is not available. However, while Y-chromosome markers can be used confidently for exclusions, the impact of an inclusion may be very difficult to assess [24]. Large databases will ordinarily be needed, and due caution should be exercised regarding the genetic kinship and subpopulation relationship between alternative fathers. In this study 137 of 300 unrelated men had their "personal" Y-haplotype, while 163 shared a haplotype with at least one unrelated person. Even if almost half of the haplotypes observed thus appear unique, they are of course confined within lineages and most of them are shared between father and son. Calculations of the gene diversity (GD) demonstrate that by using eight microsatellites the cumulative GD reaches 98.89% among unrelated Norwegians. To increase the GD further, the addition of highly informative bilocal markers like YCAII and DYS413 is a promising strategy, as is the future replacement of less informative loci such as DYS388 and DYS393 with highly informative "new" systems.
Acknowledgements

We would like to thank Margurethe Stenersen who provided the population samples, Dr. Peter de Knijff who provided the allelic ladders and Dr. Tobias Gedde-Dahl Jr. for helpful discussions.

References

[1] R.A. Kittles, M. Perola, L. Peltonen, A.W. Bergen, R.A. Aragon, M. Virkkunen, M. Linnoila, D. Goldman, J.C. Long, Dual origins of Finns revealed by Y-chromosome haplotype variation, Am. J. Hum. Genet. 62 (1998) 1171–1179.
[2] S.D.J. Pena, R. Chakraborty, Paternity testing in the DNA era, Trends Genet. 10 (1994) 204–209.
[3] F.R. Santos et al., Testing deficiency paternity cases with a Y-linked tetranucleotide repeat polymorphism, in: S.D.J. Pena, R. Chakraborty, J.T. Epplen, A.J. Jeffreys (Eds.), DNA Fingerprinting: State of Science, Birkhäuser Verlag, Basel, 1993, pp. 261–265.
[4] T. Zerjal, B. Dashnyam, A. Pandya, M. Kayser, L. Roewer, F.R. Santos, W. Schiefenhövel, N. Fretwell, M.A. Jobling, S. Harihara, K. Shimizu, D. Semjidmaa, A. Sajantila, P. Salo, M.H. Crawford, E.K. Ginter, O.V. Evgrafov, C. Tyler-Smith, Genetic relationship of Asian and northern Europeans, revealed by Y-chromosomal DNA analysis, Am. J. Hum. Genet. 60 (1997) 1174–1183.
[5] M.A. Jobling, N. Fretwell, G.A. Dover, A.J. Jeffreys, Digital coding of human Y-chromosomes: MVR-PCR at Y-specific minisatellites, Cytogenet. Cell Genet. 67 (1994) 390.
[6] M.A. Jobling, E. Heyer, P. Dieltjes, P. de Knijff, Y-chromosome-specific microsatellite mutation rates re-examined using a minisatellite, MSY1, Hum. Mol. Genet. 8 (11) (1999) 2117–2120.
[7] M.A. Jobling, V. Samara, A. Pandya, N. Fretwell, B. Bernasconi, R.J. Mitchell, T. Gerelsaikhan, B. Dashnyam, A. Sajantila, P.J. Salo, Y. Nakahori, C.M. Disteche, K. Thangaraj, L. Singh, M.H. Crawford, C. Tyler-Smith, Recurrent duplication and deletion polymorphisms on the long arm of the Y-chromosome in normal males, Hum. Mol. Genet. 5 (11) (1996) 1767–1775.
[8] S.A. Miller, D.D. Dykes, H.F. Polesky, A simple salting-out procedure for extracting DNA from human nucleated cells, Nucleic Acids Res. 16 (1988) 1215.
[9] A.D. Kloosterman, M. Pouwels, P. Daselaar, H.J.T. Janssen, Population genetic study of Y-chromosome specific STR-loci in Dutch Caucasians, Prog. Forensic Genet. 7 (1998) 491–493.
[10] M.A. Jobling, N. Bouzekri, P.G. Taylor, Hypervariable digital DNA codes for human paternal lineages: MVR-PCR at the Y-specific minisatellite, MSY1 (DYF155S1), Hum. Mol. Genet. 7 (4) (1998) 643–653.
[11] DNA recommendations-1994 report concerning further recommendations of the DNA Commission of the ISFH regarding PCR-based polymorphisms in STR (short tandem repeat) systems, Forensic Sci. Int. 69, 103–104.
[12] M. Kayser, A. Caglià, D. Corach, N. Fretwell, C. Gehrig, G. Graziosi, F. Heidorn, S. Herrmann, B. Herzog, M. Hidding, K. Honda, M. Jobling, M. Krawczak, K. Leim, S. Meuser, E. Meyer, W. Oesterreich, A. Pandya, W. Parson, G. Penacino, A. Perez-Lezaun, A. Piccinini, M. Prinz, C. Schmitt, P.M. Schneider, R. Szibor, J. Teifel-Greding, G. Weichhold, P. de Knijff, L. Roewer, Evaluation of Y-chromosomal STRs: a multicenter study, Int. J. Legal Med. 110 (1997) 125–133, 141–149.
[13] P.M. Schneider, S. Meuser, W. Waiyawuth, Y. Seo, C. Rittner, Tandem repeat structure of the duplicated Y-chromosomal STR locus DYS385 and frequency studies in the German and three Asian populations, Forensic Sci. Int. 97 (1998) 61–70.
[14] N. Vandenberg, R.A.H. Van Oorschot, C. Tyler-Smith, R.J. Mitchell, Y-chromosome-specific microsatellite variation in Australian aboriginals, Hum. Biol. 71 (6) (1999) 915–931.
[15] S. Schneider, J.M. Kueffer, D. Roessli, L. Excoffier, Arlequin ver 1.1: a software for population genetic data analysis, Genetics and Biometry Laboratory, University of Geneva, Switzerland, 1997, http://anthropologie.uniqe.ch/arlequin/bug-report.html.
[16] J. Felsenstein, PHYLIP: Phylogeny inference package (Version 3.2), Cladistics 5 (1989) 164–166.
[17] W.N. Venables, B.D. Ripley, S-plus. Modern Applied Statistics with S-Plus, Springer, NY, 1994.
[18] J.A. Hartigan, M.A. Wong, A K-means clustering algorithm, Appl. Stat. 28 (1979) 100–108.
[19] S. Willuweit, L. Roewer, M. Krawczak, M. Kayser, P. de Knijff, Y-STR Haplotype Reference Database, based on the continuous submission of Y-STR haplotypes by the International Forensic Y-User Group, http://ystr.charite.de.
[20] C. Pestoni, M.L. Cal, M.V. Lareu, M.S. Rodriguez-Calvo, A. Carracedo, Y-chromosome STR haplotypes: genetic and sequencing data of the Galician population (NW Spain), Int. J. Legal Med. 112 (1998) 15–21.
[21] R. Lessig, J. Edelmann, Y-chromosome polymorphisms and haplotypes in the West Saxony Germany, Int. J. Legal Med. 111 (1998) 215–218.
[22] E. Heyer, J. Puymirat, P. Dieltjes, P. de Knijff, Estimating Y-chromosome specific microsatellite mutation frequencies using deep rooting pedigrees, Hum. Mol. Genet. 6 (1997) 799–803.
[23] M. Kayser, L. Roewer, M. Hedman, L. Henke, J. Henke, S. Brauer, C. Krüger, M. Krawczak, M. Nagy, T. Dobosz, R. Szibor, P. de Knijff, M. Stoneking, A. Sajantila, Characteristics and frequency of germline mutations at microsatellite loci from the human Y-chromosome, as revealed by direct observation in father/son pairs, Am. J. Hum. Genet. 66 (2000) 1580–1588.
[24] M.A. Jobling, A. Pandya, C. Tyler-Smith, The Y-chromosome in forensic analysis and paternity testing, Int. J. Legal Med. 110 (1997) 118–124.