Accepted Manuscript

Title: A novel multiplex assay amplifying 13 Y-STRs characterized by rapid and moderate mutation rate

Author: Urszula Rogalla, Marcin Woźniak, Jacek Swobodziński, Miroslava Derenko, Boris A. Malyarchuk, Irina Dambueva, Marek Koziński, Jacek Kubica, Tomasz Grzybowski

PII: S1872-4973(14)00244-0
DOI: http://dx.doi.org/doi:10.1016/j.fsigen.2014.11.004
Reference: FSIGEN 1267

To appear in: Forensic Science International: Genetics

Received date: 14-8-2014
Revised date: 4-11-2014
Accepted date: 6-11-2014

Please cite this article as: U. Rogalla, M. Woźniak, J. Swobodziński, M. Derenko, B.A. Malyarchuk, I. Dambueva, M. Koziński, J. Kubica, T. Grzybowski, A novel multiplex assay amplifying 13 Y-STRs characterized by rapid and moderate mutation rate, Forensic Science International: Genetics (2014), http://dx.doi.org/10.1016/j.fsigen.2014.11.004

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Title: A novel multiplex assay amplifying 13 Y-STRs characterized by rapid and moderate mutation rate
Authors: Urszula Rogalla1, Marcin Woźniak1, Jacek Swobodziński1, Miroslava Derenko2, Boris A. Malyarchuk2, Irina Dambueva3, Marek Koziński4, Jacek Kubica4, Tomasz Grzybowski1

1 Institute of Molecular and Forensic Genetics, Nicolaus Copernicus University, Collegium Medicum in Bydgoszcz, M. Skłodowskiej-Curie 9, 85-094 Bydgoszcz, Poland
2 Institute of Biological Problems of the North, Far-East Branch of the Russian Academy of Sciences, Portovaya str. 18, Magadan 685000, Russia
3 Institute of General and Experimental Biology, Russian Academy of Sciences, Ulan-Ude, Russia
4 Chair of Cardiology and Internal Disease, Nicolaus Copernicus University, Collegium Medicum in Bydgoszcz, M. Skłodowskiej-Curie 9, 85-094 Bydgoszcz, Poland

Email addresses:
U. Rogalla: [email protected]
M. Woźniak: [email protected]
J. Swobodziński: [email protected]
M. Derenko: [email protected]
B.A. Malyarchuk: [email protected]
I. Dambueva: [email protected]
M. Koziński: [email protected]
J. Kubica: [email protected]
T. Grzybowski: [email protected]

Corresponding author: Urszula Rogalla at The Nicolaus Copernicus University, Ludwik Rydygier Collegium Medicum, Institute of Forensic Medicine, Department of Molecular and Forensic Genetics, Skłodowskiej-Curie 9, Bydgoszcz, 85-094, Poland
[email protected], +48 525853556

Keywords: RM Y-STRs, Y chromosome, individual identification, Buryats
Abstract
As microsatellites located on the Y chromosome mutate at different rates, they may be exploited in evolutionary studies, in genealogical testing of a variety of populations and even, as shown recently, to aid individual identification. Currently available commercial Y-STR kits encompass mostly low to moderately mutating loci, making them a good choice for the first two applications. Some attempts have been made so far to utilise Y-STRs as a discriminatory tool for forensic purposes. Although all thirteen rapidly mutating (RM) Y-STRs have already been multiplexed, no single assay based on single-copy markers that allows at least a portion of close male relatives to be differentiated from one another is available. To fill this gap, we constructed and validated an assay composed of single-copy Y-STR markers only, with mutation rates ranging from 8×10⁻³ to 1×10⁻². Performance of the resulting combination of nine RM Y-STRs and four moderately mutating ones was tested on 361 father-son pairs and 1326 males from 9 populations, revealing an overall mutation rate of 1.607×10⁻¹ for the assay as a whole. Application of the proposed 13 Y-STR set to the differentiation of haplotypes present in the homogeneous population of Buryats resulted in a three-fold increase in discrimination compared with 10 Y-STRs from the PowerPlex® Y.
1. Introduction

Y chromosome analysis is an exceptionally valuable tool in both forensic and human population genetics, mainly owing to inheritance along the paternal line and susceptibility to genetic drift and patrilocality [1,2]. During criminal investigations, Y-STRs are most commonly used for solving sexual assault cases, where identification of the DNA from the male perpetrator requires overcoming a huge excess of the female component [3]. They are also of great use in the examination of DNA mixtures donated by multiple males, in deep studies of human population history [4] and even in deficiency paternity testing [5].

The major disadvantage of Y chromosome analysis, as seen from the forensic genetics community's point of view, is the inability to unambiguously differentiate between plausible male donors. Results obtained with most of the commercially available Y-STR kits encompassing up to 17 loci are usually restricted to identification of the male lineage the sample originates from and provide no potentially vital information on a particular individual. Meanwhile, it seems that it is not a rare occurrence for close relatives to be involved in the same crime [6].

A broad screening of Y-STRs performed by Ballantyne et al. [7] revealed significant differences in the mutation rates of these loci and identified an exceptional set of thirteen among them, termed "rapidly mutating" (RM), whose mutation rates exceed 1×10⁻². As was shown later [8], these RM Y-STRs are capable of differentiating nearly 27% of fathers and sons and 56.3% of brothers, while for the most broadly used Y-STR kit, Yfiler® (Life Technologies, Foster City, CA), these values are 4.5% and 10%, respectively. That outstanding ability was quickly noticed by the community, which resulted in the recent launch of the commercial Promega PowerPlex® Y23 kit (PPY23, Promega Corporation, Madison, WI) [9,10], which contains two tetranucleotide RM loci (DYS570 and DYS576). Notably, the YHRD database [11] has also been adapted to handle these new markers. In its most up-to-date version (release 47) one may browse 21909 PPY23 haplotypes and 873 haplotypes obtained with the recently announced AmpFLSTR® Yfiler® Plus kit [12], which encompasses, inter alia, 7 RM Y-STRs.

Rapidly mutating Y-STRs are undoubtedly of great use when attempting to differentiate male lineages in populations that underwent a recent bottleneck or founder effect resulting in a generally low diversity of haplotypes, as seen for example among Buryats or Finns [13-15]. Studies of this kind performed using Y-STRs with midrange mutation rates usually fail to give satisfactory results.
Although it is mostly the multi-copy Y-STRs that are characterized by the highest mutation rates, they suffer from some major drawbacks that need to be mentioned. If the evidence contains DNA from multiple donors, multi-copy STRs hinder proper determination of the number of males involved. Moreover, for inferring genealogical relationships of any kind, it would be beneficial to rely only on information from single-copy markers, as the multi-copy ones often exhibit values that cannot be unequivocally traced back to the ancestral state [16].
To address these issues, we asked whether a carefully chosen set of single-copy Y-STR markers characterized by medium to rapid mutation rates is capable of aiding male individual identification. We also aimed at constructing a single-tube multiplex assay amplifying these chosen loci and at checking whether it can also be of use in the field of population genetics.
2. Materials and Methods
2.1. Samples

Mutation rates of the chosen markers and the ability of the proposed assay to differentiate between male relatives were tested using 361 father-son pairs, all sampled among Poles. The assay's performance and the diversity of the markers were checked using 1329 samples collected in Europe (406 from Poland, 84 from Ukraine, 45 from the European part of the Russian Federation, including Pskov, Veliky Novgorod and Volot, 24 Nogais, 289 from Austria, 95 from Northern Italy), the Middle East (141 Palestinian Arabs) and Asia (210 Buryats from South Siberia and 35 Kazakhs from the Kereit clan). Sampling locations are shown in Figure 1. In order to check the assay's ability to resolve genealogies of closely related males, we additionally analyzed 14 samples from potentially related men sharing a common last name.
2.2. Loci selection and primer design

The assay encompasses 13 markers characterized by medium (DYS458, DYS516, DYS534, DYS611) to rapid (DYS449, DYS518, DYS526b, DYS547, DYS570, DYS576, DYS612, DYS626, DYS627) mutation rates. All of the markers were selected from the study by Ballantyne et al. [7] based upon 1) the highest mutation rate (lower threshold set at 6×10⁻³ to cover sampling bias), 2) the broadest allelic range (arbitrarily set as 8 or more), 3) being a single-copy marker. The only exception was DYS526, which is normally present in two copies. For this marker, primers were designed so as to amplify one allele only. All the primers were designed specifically for this study using Primer3Plus software [17]. Seven of the chosen markers (DYS611, DYS612, DYS526b, DYS516, DYS626, DYS547 and DYS534) are not present in any commercially available Y-STR system. For details on the markers' repeat structure, mutation rate, genomic location and observed allelic ranges, refer to Table 1.

2.3. Multiplex amplification
All thirteen markers were amplified in a single multiplex reaction. The PCR, in a final volume of 10 µL, contained 1x GoTaq G2 Reaction Buffer, 3 mM MgCl2, 160 µM of each dNTP, 0.8 U GoTaq G2 HotStart Polymerase (Promega) and 400 ng/µL bovine serum albumin (BSA, Promega), with forward and reverse primers in the concentrations listed in Table 2. Full profiles were obtained with 0.5 ng of DNA. Amplification was performed in GeneAmp 9700 thermal cyclers (Applied Biosystems) and started with 5 minutes of initial denaturation at 96°C. Subsequent cycles of denaturation at 96°C for 30 seconds, annealing at 61°C for 45 seconds and elongation at 72°C for 45 seconds were repeated a total of 30 times, followed by 10 minutes of final elongation at 72°C to avoid split peaks.
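As a rough check on the duration of the cycling programme described above, the programmed hold times can simply be summed. The short Python sketch below does only that; it assumes nothing beyond the step durations given in the text, and it deliberately ignores temperature ramping, which depends on the instrument and is not reported here. The names used are illustrative only.

```python
# Programmed hold times taken from the protocol text; ramp times are NOT included.
INITIAL_DENATURATION_S = 5 * 60          # 5 min at 96 °C
CYCLE_STEPS_S = {
    "denaturation_96C": 30,
    "annealing_61C": 45,
    "elongation_72C": 45,
}
CYCLES = 30
FINAL_ELONGATION_S = 10 * 60             # 10 min at 72 °C


def programmed_hold_time_s():
    """Sum of all programmed holds, in seconds (a lower bound on run time)."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_ELONGATION_S


total_s = programmed_hold_time_s()
print(f"Programmed hold time: {total_s / 60:.0f} min")
```

The actual time in the thermal cycler will be longer than this sum because of ramping between temperatures.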
2.4. Detection and genotyping

Products were separated using capillary electrophoresis on an ABI PRISM® 3130xl analyzer (Applied Biosystems) equipped with 36 cm capillaries and POP-7 polymer. The injection time was 23 s and the applied voltage was set to 1.2 kV. Samples for capillary electrophoresis were prepared by mixing 1 µl of PCR product with 9 µl Hi-Di Formamide™ (Applied Biosystems) and 0.3 µl GeneScan™ 600 LIZ® (Applied Biosystems). A peak detection threshold of 50 RFU was applied.

All genotyping was performed with GeneMapper ID v.3.2 software (Applied Biosystems) with a custom allelic ladder and bin sets for each marker. All alleles present in the allelic ladder were sequenced to confirm length and repeat unit structure, using Big Dye Terminator v.3.1 chemistry (Applied Biosystems). All father-son mutations revealed during this study were confirmed by a second amplification or by sequencing.

2.5. Sensitivity, specificity and inhibition
Sensitivity testing was performed using a series of diluted samples from male donors. The DNA concentration of these samples was previously assessed with the QuantiFiler® Duo kit (Applied Biosystems). To test whether the assay is specific for male DNA only, we performed 5 amplifications with 1 ng of female DNA template. We also checked the ability of the assay to amplify male DNA in the presence of an excess of female material (3 pairs of samples tested; ratios 1:1, 1:10, 1:100 and 1:1000), which reflects the conditions encountered when analysing samples collected from post-coital swabs. To assess the efficacy of the assay in amplifying DNA from multiple donors present in different proportions, we analysed extracts containing mixtures of male DNA (3 pairs tested) in the following ratios: 1:1, 1:5, 1:10. Species specificity was assessed by testing the performance of the assay in amplifying 10 ng of template DNA from the animals most likely to appear at a crime scene, including dog, cat, hamster, guinea pig, rabbit, horse and cattle. A BLAST cross-search reveals a high probability of successful amplification of most markers from DNA extracted from various primates. We have not tested this possibility experimentally, as primate species other than humans are virtually absent in Europe. Resistance to inhibition was assessed using the two inhibitors most commonly encountered in our practice, hematin and humic acid (HA), at varying concentrations (for details see the Results section).

2.6. Statistics

Mutations were counted directly and the mutation rate was calculated as the number of observed mutations divided by the number of father-son pairs. The 95% confidence intervals were estimated from the binomial probability distribution using the formula available at [18].

Haplotype diversities and discrimination capacity were calculated according to standard methodology [10]. Fst and Rst were computed using Arlequin v.3.5 software [19] and their graphical presentation in heatmap form was constructed using the gplots package for R [20]. Multidimensional scaling of both linearized Fst and Rst was performed using the Statistica package v.9.1 (StatSoft). Haplotype networks were constructed using the median-joining algorithm [21] implemented in the Network v.4.6.12 software [22].
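The mutation rate calculation described in the Statistics section can be sketched in a few lines of Python. The sketch below is only an illustration: it assumes that the interval in question is the exact (Clopper-Pearson) binomial interval, which may or may not be the precise formula implemented at [18], and it finds the bounds by simple bisection using the standard library only. The count of 5 mutations used for a single locus is hypothetical, since Table 1 is not reproduced here; the totals of 58 mutations and 361 father-son pairs, and the absence of mutations at DYS611, are taken from the text.

```python
from math import comb


def mutation_rate(mutations, meioses):
    """Direct-count estimate: observed mutations / father-son pairs."""
    return mutations / meioses


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo=0.0, hi=1.0, iters=100):
    """Root of a function f that increases on [lo, hi], found by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k events in n trials (assumed method)."""
    # Lower bound: P(X >= k | p) = alpha / 2 (zero when no events were seen).
    lower = 0.0 if k == 0 else _bisect(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper bound: P(X <= k | p) = alpha / 2 (one when every trial was an event).
    upper = 1.0 if k == n else _bisect(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper


PAIRS = 361
# All loci share the same denominator, so the sum of the per-locus rates
# equals the total number of mutations divided by the number of pairs.
print(f"overall rate: {mutation_rate(58, PAIRS):.4f}")
print("locus with 0 mutations (as for DYS611):", exact_binomial_ci(0, PAIRS))
print("locus with 5 mutations (hypothetical):", exact_binomial_ci(5, PAIRS))
```

Note that the interval is meaningful for a single locus, where each father-son pair is one trial. The assay-wide figure is a sum over 13 loci and is therefore not a simple binomial proportion with 361 trials.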
3. Results and discussion
3.1. Nomenclature

Nomenclature used for all the loci follows the ISFG recommendations [23] and is identical to that published in [7], including the DYS449 locus (see [24]). The only discrepancy in nomenclature concerns the DYS534 locus, which in our assay contains an additional (ATCT)11-13 microsatellite (rs72167351), making it necessary to add a value of 11 to 13 to the repeat number of the DYS534 marker.

3.2. Overall assay's performance
Sensitivity: As each assay to be used in forensics needs to be capable of amplifying the low amounts of DNA found at crime scenes (LT-DNA samples), we tested our PCR using varying amounts of DNA collected from female (1 ng) and male donors (31.25 pg to 10 ng). For female samples the reaction gave no signal of amplification, as expected. For male samples, full, well-balanced profiles, evenly distributed across all loci in all channels, were obtained in the presence of 500 pg to 1 ng of DNA, which is comparable to commercially available kits (see Supplementary Figure 3). With lower amounts of DNA template, down to 125 pg, we still observed full profiles, yet signal intensity was much weaker; therefore one needs to assume that some drop-outs may occur.

Mixtures: Most of the instances where Y-STR assays are employed are sexual assault cases; therefore it is crucial to test the assay's ability to amplify samples consisting of female:male and male:male DNA mixtures present in variable ratios. Our results suggest that the 13 Y-STR assay is able to amplify male DNA even if there is a huge excess of female material added to the reaction (up to 100:1). However, commercial assays successfully amplify male profiles in female-admixed samples at ratios ranging from 1:1000 (as for Yfiler [25]) to 1:24000 (for PPY23 [26]). The 13 Y-STR assay is also capable of retrieving full profiles of two male donors even if their DNA is present in a 1:10 proportion (the highest ratio tested).

Species specificity: We tested the assay's specificity for human DNA template by attempting to amplify non-human samples collected from dog, cat, guinea pig, hamster, rabbit, horse and cattle, the species most likely to be present at a crime scene in our climatic zone. As expected, amplification of DNA from all the aforementioned species did not yield detectable products at any locus.
Inhibition: Overcoming inhibition is a daily problem in forensic genetics, as samples from crime scenes can contain inhibitors that affect amplification. Therefore, we analysed the resistance of the assay to the two most commonly encountered inhibitors, hematin and humic acid. For hematin the range of examined concentrations was 1 to 30 μM, with 20 μM being the highest concentration allowing successful amplification across all 13 loci. This result is much worse than the one reported for PPY23 (500 μM [26]), yet comparable with the data for Yfiler, which exhibits overall inhibition if the hematin concentration exceeds 16 μM [25]. In the case of humic acid, we applied 1 to 30 ng/μL of inhibitor and observed non-problematic amplification up to the addition of 20 ng/μL of HA. It is worth noting, however, that the only PCR enhancer tested was BSA, without which overcoming HA inhibition was not possible at all (data not shown).
3.3. Efficacy for differentiating individuals

In order to test the efficacy of the assay in differentiating individuals, we tested 361 father-son pairs and performed a broad population study based on samples collected from individuals from 9 populations representing Europe, Asia and the Middle East. We found 58 mutations within father-son pairs (see Table 1 for details), corresponding to an overall mutation rate of 16.07% (the sum of the mutation rates for the separate loci), which is remarkably higher than the one given for Yfiler [27] and PPY23 and comparable with the data for Yfiler Plus [12], yet still much lower than that reported for the set built from 13 RM Y-STRs only [8]. One should note, however, that the highest mutation rates have been reported so far for multi-allelic markers [7], which we purposely avoided in our assay. According to our results, the most mutable single-allelic marker was DYS518, which is also included in the Yfiler Plus kit. The second most mutable marker in our study was DYS458, a well-established marker described in previous studies [7,28] as having a moderate mutation rate. We observed no single-meiosis mutations in our dataset for the DYS611 marker, yet given its extremely complex layout, one may presume that sequencing could reveal some trinucleotide repeat loss counterbalanced by a repeat gain, which is impossible to trace with fragment length electrophoresis. However, these estimates would certainly differ if a few thousand father-son pairs were analysed. Most of the observed mutations were single-step ones (13.5:1), with a slight majority of gains over losses (1.23:1).

Haplotype diversity and discrimination capacity values, as well as the number of haplotypes, including unique ones, are summarized in Table 3. There were no shared haplotypes between any samples from the populations under study. For the Polish population, which we analysed most extensively, there were 403 haplotypes in 406 individuals, resulting in haplotype diversity and discrimination capacity values of 0.99996 and 0.993, respectively. Overall, for all the populations studied (excluding Buryats and Kereits, which would severely bias the result due to their high homogeneity), HD and DC show values of 0.999931 and 0.969, respectively, which seem promising in terms of individual identification, especially when compared with other previously available Y-STR assays [10].
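The two summary statistics quoted above can be illustrated with a short Python sketch. It assumes the commonly used definitions, namely haplotype diversity as Nei's unbiased estimator, HD = n/(n-1) × (1 - Σ p_i²), and discrimination capacity as the number of distinct haplotypes divided by the sample size; we take these to match the "standard methodology" of [10], but that is an assumption. The sample below is hypothetical: the text reports only 403 haplotypes among 406 Polish males, and the split into 400 singletons plus three haplotypes seen twice is one of several distributions compatible with those totals.

```python
from collections import Counter


def haplotype_stats(haplotypes):
    """Return (haplotype diversity, discrimination capacity) for a sample.

    `haplotypes` is any sequence of hashable haplotype labels, for example
    tuples of allele calls across the 13 loci.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    hd = n / (n - 1) * (1 - sum_p2)   # Nei's unbiased estimator (assumed)
    dc = len(counts) / n              # distinct haplotypes / individuals
    return hd, dc


# Hypothetical sample: 400 unique haplotypes plus 3 haplotypes observed twice,
# i.e. 403 distinct haplotypes among 406 individuals.
sample = [f"H{i}" for i in range(400)]
sample += ["S0", "S0", "S1", "S1", "S2", "S2"]

hd, dc = haplotype_stats(sample)
print(f"HD = {hd:.5f}, DC = {dc:.3f}")
```

Under this assumed distribution the sketch gives values in line with those reported for the Polish sample; a different distribution of the shared haplotypes would change HD slightly while leaving DC unchanged.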
3.3.1. Practical application of the assay

We also took the opportunity to verify our assay's performance in distinguishing closely related males whilst performing genealogical testing for 14 males inhabiting diverse regions of central Poland but sharing the same last name, who were eager to investigate whether they have a common male ancestor. The gathered documentation was scarce, yet it contained evidence that there were at least four individuals representing one male lineage and separated from each other by six or nine meioses, depending on the branch of the genealogy. Yfiler testing excluded one of these individuals as a close relative of the others but simultaneously revealed that two more men from the tested group bear exactly the same haplotype as the supposedly related men. Altogether, 5 out of 14 males tested shared one Yfiler haplotype. In contrast, results obtained with the assay proposed in this study clearly distinguished all five men from each other by one to four mutations, which remains concordant with the calculated mutation rate of 16% (corresponding to one expected mutation per 6.2 meioses).
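The correspondence between the overall mutation rate and the number of meioses can be made explicit with a small Python sketch. The first function is plain arithmetic (the reciprocal of the rate). The second rests on a simplifying assumption that is ours, not a result of this study: the assay-wide rate is treated as the probability that a haplotype changes in any one meiosis, independently between meioses, and back mutations are ignored.

```python
MU = 58 / 361  # overall mutation rate of the assay reported in the text


def meioses_per_expected_mutation(mu):
    """Number of meioses in which one mutation is expected on average."""
    return 1 / mu


def prob_haplotypes_differ(mu, meioses):
    """Probability of at least one mutation over the given number of meioses.

    Simplified model (assumption): every meiosis independently changes the
    haplotype with probability mu; back mutations are not considered.
    """
    return 1 - (1 - mu) ** meioses


print(f"{meioses_per_expected_mutation(MU):.1f} meioses per expected mutation")
for m in (1, 6, 9):
    print(m, "meioses:", round(prob_haplotypes_differ(MU, m), 3))
```

Under this model, men separated by six or nine meioses are more likely than not to differ in at least one marker, which is in line with the genealogical case described above.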
3.4. Use of the assay in population and evolutionary genetics
Y-STR markers characterized by a rapid mutation rate may potentially be of great use not only for individual identification purposes but also for resolving phylogenies of homogeneous populations, whose diversity is low or has been dramatically reduced, for instance by a bottleneck effect. To answer the question of whether our assay may aid identification of individuals sampled from homogeneous populations, we chose Buryats from southern Siberia and the Kereit clan from Kazakhstan, both of which have been previously screened for Y-STR diversity [13,29]. Among Buryats (n=202), 58 haplotypes were reported based on the PowerPlex Y analyses. Three of these haplotypes were shared by 59% of all males in the group. Applying our set of markers allowed 163 unique haplotypes to be distinguished, which is an almost three-fold increase in resolution (see Figure 4 for a comparison of the haplotype networks). After combining our 13 markers with the 10 provided by the PowerPlex Y, we obtained only 11 more distinct haplotypes. In the case of Kereits (n=36) the difference was less spectacular, yet still remarkable. Using the 10 PowerPlex Y markers, only 15 haplotypes could be identified, whereas our set used separately allowed 25 haplotypes to be distinguished. These values do not point to the ability to identify individuals unambiguously, but they show that inclusion of some of the RM Y-STRs in the analyses results in a massive increase in the differentiation capabilities of the assay.

In order to investigate further the potential relationships between the populations under study as revealed with our set of markers, we calculated Fst and Rst distances between groups of samples and visualised them using heatmaps and MDS plots, as presented in Figure 2 and Figure 3. The results obtained with the two methods differed significantly from one another. On the MDS plot of Fst distances, all the populations grouped together with the exception of Kereits and Buryats, which were evidently separated. In the case of Rst, which is the usual choice for STR analysis, no single cluster could be distinguished, although the greatest differences were observed between Arabs, Kereits and the remaining populations. Thus, it seems that the chosen set of markers performs poorly in terms of providing data for inferring relationships between populations and in practice can probably be outperformed by a set of slowly or moderately mutating markers that preserves the uniqueness of populations or clinal changes across geographic regions. In the case of RM Y-STRs the chance of a random convergence of haplotypes is high, which makes them less useful in analysing populations with well-resolved phylogenies.
4. Conclusions

Rapidly mutating markers have recently proven able to increase the differentiation of related males, yet no single assay has been reported so far and it seems that analyses of multi-allelic markers may be cumbersome in some cases. The proposed multiplex assay, composed of markers exhibiting single-copy alleles, allows fathers to be distinguished from their sons in at least 16% of cases. It also appears to be useful in testing homogeneous populations characterised by low genetic diversity, as it is capable of differentiating many of the samples at the level of the individual. On the other hand, it seems that there is no point in applying the tested marker set to population genetics research focused on highly diverse populations, as the results would bring no new information and would rather blur the overall picture. Certainly, the use of RM Y-STRs should also be avoided in missing persons cases (when comparison with potential relatives is performed) and in paternity cases, as it would create interpretational issues.

It has been shown that the assay is human specific, requires only 0.5 ng of DNA for successful amplification, offers some, although limited, level of resistance to common inhibitors (humic acid and hematin) and is complete in just 2.5 hours including CE, making it a promising alternative to the commercially available kits.
For further comparative purposes, all the obtained haplotypes are available in Supplementary Table 1. Supplementary Figures 1 and 2 depict a representative electropherogram for the assay and the custom allelic ladder, respectively.
Acknowledgements

This study is funded by the Polish Ministry of Science and Higher Education Preludium Grant no. 2012/05/N/NZ8/00801 and the Faculty of Medicine CM NCU Grant for young scientists no. 07/WL/2013. We are also grateful to Mrs Mariola Mrozek for the excellent technical assistance.

Conflict of interest:

Authors declare no conflict of interests.

Ethical approval:

All samples were obtained with informed consent for studies of gene frequency variation from anonymous donors and from individuals participating in forensic paternity testing performed by the Institute of Molecular and Forensic Genetics in Bydgoszcz, Poland.

References

[1] M. Kayser. Uni-parental markers in human identity testing including forensic DNA analysis. Biotechniques 43 (2007) 3042.
[2] P.A. Underhill, T. Kivisild. Use of Y chromosome and mitochondrial DNA population structure in tracing human migrations. Annu. Rev. Genet. 41 (2007) 539–564.
[3] W. Parson, H. Niederstätter, A. Brandstätter, B. Berger. Improved specificity of Y-STR typing in DNA mixture samples. Int. J. Legal Med. 117 (2003) 109–114.
[4] W. Shi, Q. Ayub, M. Vermeulen, R.G. Shao, S. Zuniga, K. van der Gaag, P. de Knijff, M. Kayser, Y. Xue, C. Tyler-Smith. A worldwide survey of human male demographic history based on Y-SNP and Y-STR data from the HGDP-CEPH populations. Mol. Biol. Evol. 27 (2010) 385–393.
[5] M.A. Jobling, A. Pandya, C. Tyler-Smith. The Y chromosome in forensic analysis and paternity testing. Int. J. Legal Med. 110 (1997) 118–124.
[6] C.J. Gershaw, A.J. Schweighardt, L.C. Rourke, M.M. Wallace. Forensic utilization of familial searches in DNA databases. Forensic Sci. Int. Genet. 5 (2011) 16–20.
[7] K.N. Ballantyne, M. Goedbloed, R. Fang, O. Schaap, O. Lao, A. Wollstein, Y. Choi, K. van Duijn, M. Vermeulen, S. Brauer, R. Decorte, M. Poetsch, N. von-Wurmb-Schwark, P. de Knijff, D. Labuda, H. Vezina, K. Knoblauch, R. Lessig, L. Roewer, R. Ploski, T. Dobosz, L. Henke, J. Henke, M.R. Furtado, M. Kayser. Mutability of Y-chromosomal microsatellites: rates, characteristics, molecular bases, and forensic implications. Am. J. Hum. Genet. 87 (2010) 341–353.
[8] K.N. Ballantyne, A. Ralf, R. Aboukhalid, N.M. Achakzai, M.J. Anjos et al. Toward Male Individualization with Rapidly Mutating Y-Chromosomal Short Tandem Repeats. Hum. Mutat. 35 (2014) 1021–1032.
[9] J.M. Thompson, M.M. Ewing, W.E. Frank, J.J. Pogemiller, C.A. Nolde, D.J. Koehler, A.M. Shaffer, D.R. Rabbach, P.M. Fulmer, C.J. Sprecher, D.R. Storts. Developmental validation of the PowerPlex Y23 System: a single multiplex Y-STR analysis system for casework and database samples. Forensic Sci. Int. Genet. 7 (2013) 240–250.
[10] J. Purps et al. A global analysis of Y-chromosomal haplotype diversity for 23 STR loci. Forensic Sci. Int. Genet. (2014) doi: 10.1016/j.fsigen.2014.04.008
[11] YHRD database: http://yhrd.org
[12] Life Technologies, 2013 Future Trends in Forensic DNA Technology Seminar Series: Development of a Next Generation Y-STR Multiplex for Forensic Applications. http://www.slideshare.net/Lifetech_HID/2013-hid-universityyfilerplus, last accessed 14.10.2014.
[13] M. Woźniak, M. Derenko, B.A. Malyarchuk, I. Dambueva, T. Grzybowski, D. Miścicka-Śliwka. Allelic and haplotypic frequencies at 11 Y-STR loci in Buryats from South-East Siberia. Forensic Sci. Int. 164 (2006) 271–275.
[14] M. Hedman, V. Pimenoff, M. Lukka, P. Sistonen, A. Sajantila. Analysis of 16 Y STR loci in the Finnish population reveals a local reduction in the diversity of male lineages. Forensic Sci. Int. 142 (2004) 37–43.
[15] M. Hedman, A.M. Neuvonen, A. Sajantila, J.U. Palo. Dissecting the Finnish male uniformity: The value of additional Y-STR loci. Forensic Sci. Int. Genet. 5 (2011) 199–201.
[16] M. Vermeulen, A. Wollstein, K. van der Gaag, O. Lao, Y. Xue, Q. Wang, L. Roewer, H. Knoblauch, C. Tyler-Smith, P. de Knijff, M. Kayser. Improving global and regional resolution of male lineage differentiation by simple single-copy Y-chromosomal short tandem repeat polymorphisms. Forensic Sci. Int. Genet. 3 (2009) 205–13.
373
[17] A. Untergasser, I. Cutcutache, T. Koressaar, J Ye, B.C. Faircloth, M. Remm, S.G.
374
Rozen. Primer3--new capabilities and interfaces. Nucleic Acids Res. 40 (2012) e115.
375
[18] Exact Binomial and Poisson Confidence Intervals: http://statpages.org/confint.html
376
[19] L. Excoffier and H.E. L. Lischer. Arlequin suite ver 3.5: A new series of programs to
377
perform population genetics analyses under Linux and Windows. Molecular Ecology
378
Resources. 10 (2010) 564-567.
379
[20] R: http://cran.r-project.org
380
[21 H-J Bandelt, P Forster, A. Röhl. Median-joining networks for inferring intraspecific
381
phylogenies. Mol Biol Evol 16 (1999) 37-48.
382
[22] Network v.4.6.12 software: fluxus-engineering.com
383
[23] L. Gusmao, J.M. Butler, A. Carracedo, P. Gill, M. Kayser, W.R. Mayr, et al., DNA
384
Commission of the International Society of Forensic Genetics (ISFG): an update of the
385
recommendations on the use of Y-STRs in forensic analysis, Forensic Sci. Int. 157 (2006)
386
187–197.
387
[24] J. Mulero, J. Ballantyne, K. Ballantyne, B. Budowle, M. Coble, L. Gusmao, L. Roewer, M. Kayser. Nomenclature update and allele repeat structure for the markers DYS518 and DYS449. Forensic Sci. Int. Genet. (2014), http://dx.doi.org/10.1016/j.fsigen.2014.04.009
[25] J.J. Mulero, C.W. Chang, L.M. Calandro, R.L. Green, Y. Li, C.L. Johnson, L.K. Hennessy. Development and validation of the AmpFlSTR Yfiler PCR amplification kit: a male specific, single amplification 17 Y-STR multiplex system. J. Forensic Sci. 51 (2006) 64–75.
[26] J.M. Thompson, M.M. Ewing, W.E. Frank, J.J. Pogemiller, C.A. Nolde, D.J. Koehler, A.M. Shaffer, D.R. Rabbach, P.M. Fulmer, C.J. Sprecher, D.R. Storts. Developmental validation of the PowerPlex® Y23 System: a single multiplex Y-STR analysis system for casework and database samples. Forensic Sci. Int. Genet. 7 (2013) 240–50.
[27] M. Goedbloed, M. Vermeulen, R.N. Fang, M. Lembring, A. Wollstein, K. Ballantyne, O. Lao, S. Brauer, C. Krüger, L. Roewer, R. Lessig, R. Ploski, T. Dobosz, L. Henke, J. Henke, M.R. Furtado, M. Kayser. Comprehensive mutation analysis of 17 Y-chromosomal short tandem repeat polymorphisms included in the AmpFlSTR Yfiler PCR amplification kit. Int. J. Legal Med. 123 (2009) 471–82.
403
[28]K.N. Ballantyne, V. Keerl, A. Wollstein, Y. Choi, S.B. Zuniga, A. Ralf, M. Vermeulen, P.
404
de Knijff, M. Kayser, A new future of forensic Y-chromosome analysis: rapidly mutating Y-
405
STRs for differentiating male relatives and paternal lineages, Forensic Sci. Int. Genet. 6
406
(2012) 208–218.
407
[29] S. Abilev, B.A. Malyarchuk, M. Derenko, M. Wozniak, T. Grzybowski, I. Zakharov. The Y-
408
chromosome C3* star-cluster attributed to Genghis Khan’s descendants is present at high
409
frequency in the Kerey clan from Kazakhstan. Hum Biol. 84 (2012) 79-89.
410
Captions for figures:
Figure 1. Geographic distribution of samples: POL, Poles; UKR, Ukrainians; AUS, Austrians; IT, Italians; ROS, Russians (Pskov, Veliky Novgorod, Volot); NG, Nogais; BR, Buryats; KER, Kereits; AR, Palestinian Arabs.
Figure 2. MDS plots based on Fst and Rst for all the populations under study.
Figure 3. Heatmaps constructed with the R gplots package based on Fst and Rst results.
Figure 4. Networks constructed using the MJ algorithm for the Buryat population, obtained with data for 10 Y-STRs (from PowerPlex® Y) and for the 13 Y-STRs from our assay.
Number of mutations
DYS458
DYS449
Motif
95% confidence interval
Location
Obs. allelic range Total gains losses single multip.
(GAAA)11-24
simple
9.6x10-3 2.22x10-2 4.32x10-2
Yp11.2
10-21
(TTCT)13-19N22(TTCT)3N12(TTCT)13-19
complex
3x10-3 1.11x10-2 2.81x10-2
Yp11.2
26-38
complex
-
simple
4.5x10-3 1.39x10-2 3.2x10-2
complex
4.5x10-3 1.39x10-2 3.2x10-2
4
4
7
4
3
1
1
3
1
(TTC)5N9(TTC)4(CTC)1(TTC)3N9(TTC)5 (CTC)1(TTC)3N15(TTC)4 (CT)1(TTC)3 (CTC)1(TTC)3 N20 (TTC)3T(TTC)3N7(TTC)3N9 (TTC)4(TCC)1 (TTC)7-21N23 (TTC)4N4 [(TTC)1(CTC)1]2[(CTC)1(TTC)1]3
-
-
-
-
Yq11.221 12-24
5
3
2
5
0
Yp11.2
19-31
5
5
0
4
1
7x10-4 5.54x10-3 1.99x10-2
Yp11.2
11-28
2
0
2
2
0
1.5x10-2 3.05x10-2 5.39x10-2
Yq11.21
24-38
11
6
5
11
0
complex
1.7x10-3 8.31x10-3 2.4x10-2
Yq11.221 36-49
3
2
1
3
0
complex
6.1x10-3 1.66x10-2 3.58x10-2
Yq11.223 9-18
6
3
3
6
0
DYS627 (AGAA)3N16(AGAG)3(AAAG)12-24N81(AAGG)3
complex
7x10-4 5.54x10-3 1.99x10-2
Yp11.2
2
1
1
2
0
DYS534 (CTTT)3N8(CTTT)9-20N9(CTTT)3N169(ACTC)11-13
complex
1.7x10-3 8.31x10-3 2.4x10-2
Yq11.221 22-32
3
2
1
2
1
simple
1.7x10-3 8.31x10-3 2.4x10-2
Yp11.2
12-21
3
2
1
3
0
complex
6.1x10-3 1.66x10-2 3.58x10-2
Yq11.221 29-41
6
2
4
6
0
summed for all loci
16.07x10-2 (95% confidence interval: 12.43x10-2 - 20.27x10-2)
(CCT)5(CTT)1(TCT)4(CCT)1(TCT)19-31
DYS518 (AAAG)3(GAAG)1(AAAG)14-22(GGAG)1(AAAG)4N6(AAAG)11-19N27(AAGG)4
DYS516
DYS576
complex
DYS547 (CCTT)9-13T(CTTC)4-5N56(TTTC)10-22N10(CCTT)4(TCTC)1(TTTC)9-16N14(TTTC)3
DYS626 (GAAA)14-23N24(GAAA)3N6(GAAA)5(AAA)1(GAAA)2-3(GAAG)1(GAAA)3 complex
DYS612
(TTTC)14-24
(TTCT)4N30(TTCT)9-18
(AAAG)13-22
DYS526b (CCCT)3N20(CTTT)11-17(CCTT)6-10N113(CCTT)10-17
Yq11.221 15-20
0
DYS570
-
DYS611
8
Locus
Complexity
Mean mutation rate
8-22
Table 1. Summary of all 13 Y-STRs from the assay, including the structure of mutations
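The 95% confidence intervals in Table 1 are consistent with exact binomial limits of the kind produced by the calculator cited as [18]. The following is a minimal sketch of such a calculation, not the authors' procedure. It assumes 361 father-son pairs per locus; that denominator is not stated in the table and is inferred here from the tabulated rates (8 mutations in 361 transmissions gives about 2.22x10-2). The function names `binom_cdf` and `exact_binomial_ci` are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(g, target, tol=1e-12):
    """Bisection for g(p) = target on [0, 1], where g increases with p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(k, n, alpha=0.05):
    """Two-sided exact (Clopper-Pearson) interval for k events in n trials."""
    # Lower limit solves P(X >= k | p) = alpha / 2; it is 0 when k == 0.
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper limit solves P(X <= k | p) = alpha / 2; it is 1 when k == n.
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


N_PAIRS = 361  # ASSUMPTION: inferred from the tabulated rates, not stated in Table 1
rate = 8 / N_PAIRS
low, high = exact_binomial_ci(8, N_PAIRS)
print(f"rate={rate:.4f}, 95% CI=({low:.4f}, {high:.4f})")
```

Under that assumption, a locus with 8 observed mutations yields limits close to the 9.6x10-3 and 4.32x10-2 listed next to the 2.22x10-2 rate.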
Locus
Primers (3'-5')
C[μM] Dye
DYS458 F: TGCAGACTGAGCAACAGGAAT
Size range
0.32 6-FAM 166-218
DYS449 F: TGGAGTCTCTCAAGCCTGTTC
0.4 6-FAM 294-342
DYS611 F: CTGAAGCGATCCCCTGAGTAG
R: GGTTGGACAACAAGAGTAAGACAG 0.24 6-FAM 413-459
0.32 VIC
DYS612 F: TTCACACAGGTTCAGAGGTTTG
0.4 VIC
R: CTTGACACTTGCCATGGGTAT
DYS518 F: CTGGGCAACACAAGTGAAACT
107-147
R: GCTGAAATGCAGATATTCCCTA
R: ACTTGGCAACATAGCAGATCC
DYS570 F: GCTGTGTCCTCCAAGTTCCT
R: TTTCCTGACCTTGTGATCCAG
0.16 VIC
189-225
310-378
R: GCATCACATGTAGCACTCTGG
DYS547 F: GTTCCAATTCTATCCATGTTACTGC 0.8 VIC 412-508
R: CCTGAGTGACAGAGCATAAACG
0.32 NED 177-213
0.24 NED 240-288
0.64 NED 389-433
0.24 PET 165-205
0.8 PET 215-274
DYS526b F: CATTATGTATTCTGTTTGTTTTCAGC 1.2 PET 400-496
DYS516 F: GCCATGGTTTCTTGCTTCTTT
R: ACGAACCTGCAAATTGTTCAC
DYS627 F: AGCGCAGGATTCCATCTAAAA
R: GCCTTTCATTCTCTCCTTCGT
DYS534 F: TCATCCCTCATCTACCCAACA
R: TCAGTTCTTAACTCAACCAAACAA
DYS576 F: TTGGGCTGAGGAGTTCAATC
R: GGCAGTCTCATTTCCTGGAG
DYS626 F: CTGGGTGACAGAGTGCAAGAC
R: TTTGGGACATGTTTGTTCTTTC
R: GTTTGGGTTACTTCGCCAGA
Table 2. List of all the primers designed for the assay, with dye labels and expected product sizes in bp
| Population | Sample size | Unique (n=1) | n=2 | n=3 | n≥4 | HD | DC |
|---|---|---|---|---|---|---|---|
| Poles | 406 | 400 | 3 | - | - | 0.99996 | 0.993 |
| Ukrainians | 84 | 80 | 2 | - | - | 0.99942 | 0.976 |
| Russians | 45 | 43 | 1 | - | - | 0.99899 | 0.978 |
| Nogais | 24 | 20 | 2 | - | - | 0.99275 | 0.917 |
| Buryats | 210 | 150 | 13 | 3 | 3 | 0.99416 | 0.805 |
| Italians | 95 | 89 | 3 | - | - | 0.99932 | 0.968 |
| Austrians | 289 | 274 | 6 | 1 | - | 0.99978 | 0.972 |
| Arabs | 141 | 118 | 5 | 3 | 1 | 0.99797 | 0.901 |
| Kereits | 36 | 19 | 3 | 2 | 1 | 0.96984 | 0.694 |
| Total | | | | | | 0.99977 | 0.935 |

Table 3. Summary of distinct haplotypes as seen in various populations, with haplotype diversity (HD) and discrimination capacity (DC) values. Columns n=1 to n≥4 give the number of haplotypes observed in that many individuals.
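The haplotype diversity (HD) and discrimination capacity (DC) values in Table 3 can be checked from the haplotype counts. The following is a minimal sketch, assuming that HD is Nei's unbiased estimator, N/(N-1) x (1 - sum of squared haplotype frequencies), and that DC is the number of distinct haplotypes divided by the sample size. The table does not state the formulas; these definitions are an assumption that happens to reproduce the values listed for Poles. The function name `hd_and_dc` is hypothetical.

```python
def hd_and_dc(haplotype_counts):
    """Return (HD, DC) from a list holding, for each distinct haplotype,
    the number of individuals that carry it."""
    n = sum(haplotype_counts)
    sum_sq_freq = sum((c / n) ** 2 for c in haplotype_counts)
    hd = n / (n - 1) * (1 - sum_sq_freq)  # assumed: Nei's unbiased estimator
    dc = len(haplotype_counts) / n        # assumed: distinct haplotypes / samples
    return hd, dc


# Poles in Table 3: 406 samples, 400 unique haplotypes and 3 haplotypes seen twice.
poles = [1] * 400 + [2] * 3
hd, dc = hd_and_dc(poles)
print(f"HD={hd:.5f}, DC={dc:.3f}")
```

For rows with haplotypes shared by four or more individuals, the exact group sizes are needed as input and cannot be read from the table alone.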
Highlights
- Thirteen carefully selected Y-STRs are capable of differentiating fathers and sons in ca. 16% of cases
- The newly designed 13 Y-STR assay successfully amplifies 0.5-1 ng of DNA
- Rapidly mutating Y-STRs aid lineage differentiation in populations characterized by low genetic diversity
[Figure 1]
[Figure 2]
[Figure 3]
[Figure 4]