A top-down approach to DNA mixtures


Klaas Slooten

PII: S1872-4973(20)30021-1
DOI: https://doi.org/10.1016/j.fsigen.2020.102250
Reference: FSIGEN 102250
To appear in: Forensic Science International: Genetics
Received Date: 14 June 2019
Revised Date: 23 December 2019
Accepted Date: 16 January 2020

Please cite this article as: Klaas Slooten, A top-down approach to DNA mixtures, (2020), doi: https://doi.org/10.1016/j.fsigen.2020.102250

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2019 Published by Elsevier.

A top-down approach to DNA mixtures

Abstract


Presently, there exist many different models and algorithms for determining, in the form of a likelihood ratio, whether there is evidence that a person of interest contributed to a mixed trace profile. These methods have in common that they model the whole trace, hence all its contributors, which leads to the computation time being mostly determined by the assumed number of contributors. At some point, these calculations are no longer feasible. We present another approach, in which we target the contributors of the mixture in the order of their contribution. With this approach the calculation time depends on how many contributors are queried. This means that any trace can be subjected to likelihood ratio calculations in favour of being a relatively prominent contributor, and we can choose not to query it for all its contributors, e.g., if that is computationally not feasible, or not relevant for the case. We do so without using a quantitative peak height model, i.e., we do not define a peak height distribution. Instead, we work with subprofiles derived from the full trace profile, carrying out likelihood ratio calculations on these with a discrete method. This lack of modeling makes our method widely applicable. The results of our top-down method are slightly conservative with respect to those of a continuous model, and more so as we query less and less prominent contributors. We present results on mixtures with known contributors and on research data, analyzing traces with plausibly 6 or more contributors. If a top-k of most prominent contributors is targeted, it is not necessary to know how many other contributors there are for LR calculations, and the more prominent the queried contributor is relative to all others, the less the evidential value depends on the specifics of a chosen peak height model. For these contributors the qualitative statement that more input DNA leads to larger peaks suffices. The evidential value for a comparison with minor contributors, on the other hand, potentially depends much more on the chosen model. We also conclude that a trace's complexity, meaning its (in)ability to yield large LRs that are not too model-dependent, is not measured by its number of contributors; rather, it is the equality of contributions that makes it harder to obtain strong evidence.

Keywords: Likelihood ratios, DNA mixtures, Semi-continuous model, Deconvolution, Weight of evidence

1. Introduction

Currently there exist several models for likelihood ratio calculations on DNA mixtures. Discrete models base the likelihood ratio on the detected alleles, continuous models on the detected alleles and their peak heights. A continuous model therefore needs to be able to model peak heights. Very many models exist, and more would be possible. A first important aspect is the distribution from which peak heights are drawn once other parameters are conditioned on (e.g., the genotypes of the contributors, the mixture proportions). Some models use Gamma distributions (e.g., [1], [2], [3], [4], [5]), others use a Lognormal distribution (e.g., [6]); other choices are made in [7], [8], and no doubt more would be reasonably possible. In addition to modeling peak heights of alleles, decisions need to be made as to whether and how to model stutter (e.g., not at all, −1 stutters, −2, +1, −0.2 stutters, etc.) and the peak height distribution of the stutters (e.g., conditional on the length of the parental allele, or on the longest uninterrupted sequence, or conditional on the locus); how to model drop-in alleles (e.g., their peak height distributions, their probabilities, whether or not they are taken into the subpopulation correction); locus-specific sensitivity; degradation; and so on. It is easy to envisage dozens, or even hundreds, of models that are all slightly different from each other. It has also been noted that, especially for small amounts of input DNA, peak height distributions need not be unimodal and hence are not well described by a unimodal distribution, including all the ones we mentioned (cf. [9] and [10]). And yet, in many cases, when comparison studies are carried out (e.g., [11], [12], [13], [14]), it is observed that there is substantial agreement between the likelihood ratios that they compute, and we

note that this agreement is especially strong when the likelihood ratios are larger. One of the purposes of this paper is to show that this is so simply because what all models have in common is that they reflect, in one way or another, that sample contributors with more DNA in the sample will tend to give rise to larger peaks in the trace profile than contributors with less DNA. We will show that this principle alone is sufficient to obtain weights of evidence that closely resemble those of continuous models. We will do so by developing a model that uses the peak heights, but without specifying a peak height model. Before we proceed to give details on that model, we give two examples where it is apparent that any peak height model will give the same answer.

An at first sight trivial example is a single source trace. Suppose we have a locus with alleles (12, 16) with peak heights (1000, 1100). If we have a person of interest with genotype (12, 16), then the model needs to predict with what probability one would get peak heights exactly equal to (1000, 1100). For an unknown individual, all possible genotypes need to be considered, but the contribution to the profile likelihood coming from genotypes other than (12, 16) will be negligible, and the likelihood is almost exactly equal to the probability of seeing (12, 16) in the person of interest if that person is not a contributor, multiplied by the probability of obtaining peak heights (1000, 1100) for such a person. In the likelihood ratio this latter probability cancels out, and we get the inverse of the random match probability for genotype (12, 16). Thus, the whole peak height model drops out of the equation and the LR is model-independent. The model becomes important only when there is a non-negligible chance that the alleles are not from the contributor but a result of stutter, drop-in, dye bleeding, etc. For an actual mixture this need not be different.
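The single-source toy example above can be made concrete in code. This is a minimal sketch: the allele frequencies are hypothetical, chosen only to illustrate that the LR reduces to the inverse random match probability once the peak height model cancels.

```python
# Sketch of the single-source toy example: when the trace genotype is
# unambiguous, the peak height model cancels in the likelihood ratio and
# the LR reduces to the inverse random match probability.
# The allele frequencies below are hypothetical.
allele_freq = {"12": 0.15, "16": 0.20}  # assumed population frequencies

def random_match_probability(a, b, freq):
    """Probability that a random person has genotype (a, b), assuming
    Hardy-Weinberg proportions."""
    if a == b:
        return freq[a] ** 2
    return 2 * freq[a] * freq[b]

rmp = random_match_probability("12", "16", allele_freq)
lr = 1.0 / rmp  # model-independent LR for a clear single-source trace
print(round(lr, 2))  # -> 16.67
```

Any peak height model would multiply both the numerator and the denominator by the same probability of seeing heights (1000, 1100), so it drops out of `lr`.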
Suppose we consider a locus with alleles (15, 16, 17, 18) with peak heights (2500, 1000, 1100, 2600). Some peaks may have a stutter component, but they are too large to be purely stutter. The only possibility that really plays a role is the one corresponding to genotypes (15, 18) and (16, 17); the probability of seeing exactly these heights is not relevant, because it is needed under both hypotheses for the same genotypes. The only crucial assumption is that contributors tend to have higher peaks if their contribution is larger. Different models will predict the profiles with different probabilities, but the ratio between these probabilities under both hypotheses, i.e., the likelihood ratio, is for such profiles the same. These toy examples illustrate that the importance of the details of the peak height model depends on how well the trace profile allows us to distinguish between the contributors. For contributors that can be well distinguished, the precise peak height model matters less, and therefore the evidential value is largely model-independent. Our aim is to work with only the basic assumption that a larger contribution leads to higher peaks in the trace profile, and to refrain from making more assumptions as much as possible. We will see that this allows us to identify contributors to an extent that is, for the most prominent contributors, comparable to that of continuous models. Moreover, the method does not need the total number of contributors: we target the contributors serially, in order from most to least contributing person.

2. Likelihood ratios and probabilistic genotyping


Let us denote the trace profile by M, and let S be the person of interest with DNA profile g. There may be other profiles available from persons who play an undisputed role (e.g., a victim whose contribution is considered certain), and we denote by I the collection of all such profiles, excluding g. For example, I may be the profile of a person whose contribution to M is undisputed. We wish to calculate the likelihood ratio in favour of S having contributed to M, i.e., calculate

$$\mathrm{LR}(M, g \mid I) = \frac{P(M, g \mid H_p, I)}{P(M, g \mid H_d, I)} = \frac{P(M \mid H_p, g, I)}{P(M \mid H_d, g, I)} \cdot \frac{P(g \mid H_p, I)}{P(g \mid H_d, I)}, \tag{2.1}$$

where Hp states that S contributed to M and Hd states that S did not. In addition, if there are known contributors, these are declared to be contributors by both Hp and Hd. We will assume that Hp and Hd differ only with respect to the alleged contribution of S. That is, the relationship (if any) between S and the persons whose profiles are in I (if any) is the same under Hp and Hd. In that case, the last term in (2.1) is unity, and we get

$$\mathrm{LR}(M, g \mid I) = \frac{P(M \mid H_p, g, I)}{P(M \mid H_d, g, I)},$$

i.e., we need to compare the probabilities of seeing M under each hypothesis. These hypotheses, as stated so far, are not specific enough to be able to compute the required likelihoods. We need a model for the computations.

Schematically, we introduce a set of parameters Θ such that

$$P(M) = \int P(M \mid \theta) f(\theta)\, d\theta = \int P_\theta(M) f(\theta)\, d\theta,$$

representing that if the parameters have known values Θ = θ, we are able to compute Pθ(M) = P(M | θ). Since we want to make a statement about the profiles of the contributors, we include in θ the set of profiles of all the contributors. Therefore, the number n of contributors emerges as a parameter that we need. To emphasize this we take the profiles of the contributors out of θ for explicit, separate integration. We still call the remaining parameters θ, slightly abusing notation. Doing so we get, if we model the trace with n contributors,

$$P(M \mid H, g, I) = \int_\theta \sum_{g_1, \ldots, g_n} P_\theta(M \mid g_1, \ldots, g_n, H, g, I)\, P(g_1, \ldots, g_n \mid H, g, I)\, f(\theta \mid H, g, I)\, d\theta \tag{2.2}$$

We now make, for this exposition, some additional assumptions: we consider only prior densities f(θ | H, g, I) = f(θ) that are not influenced by the hypothesis H, the genotype g of the person of interest, or the additional profiles I. This means that we do not allow knowledge of these, without having seen the trace itself, to influence our probability assignments pertaining to the trace. The term P(g1, ..., gn | H, g, I) tells us how likely the gi are as contributor profiles, based on H, g, I, but not based on the trace. None of the events conditioned on can be left out in general, since H specifies how g and I come into the trace as contributors. For example, conditioning on Hp, according to which S contributes, (at least) one of the gi must equal g, namely the gi that represents the profile of S. The first term in (2.2), Pθ(M | g1, ..., gn), is the probability that we would see M if the contributors have profiles g1, ..., gn, conditional on H, g, I and if the model parameters are θ. Clearly now the conditioning on H, g, I is irrelevant. Then we can write (2.2) as

$$P(M \mid H, g, I) = \int_\theta \sum_{g_1, \ldots, g_n} P_\theta(M \mid g_1, \ldots, g_n)\, P(g_1, \ldots, g_n \mid H, g, I)\, f(\theta)\, d\theta \tag{2.3}$$


We call Pθ(M | g1, ..., gn) the statistical model, since it reflects our assessments of how trace profiles are generated if the parameters θ are known, and we call f(θ) the probabilistic model. It reflects our assumptions as to which parameters are deemed possible and with which weights they should be used by the statistical model. In other words, f describes our prior probability distribution for the parameters. Finally, the probabilities P(g1, ..., gn | H, g, I) reflect our population genetic model for the unknown contributors, conditioned on the assumptions on contribution as specified by H and I. We can only do this if we specify which index i is the profile of S. Therefore we introduce Hp,i by letting it put S in position i, i.e., it lets gi be g in P(g1, ..., gn | Hp,i, g, I). In the sum over the gi, all gi run over all possible profiles. Therefore, if f(θ) is a symmetric density satisfying f(σ(θ)) = f(θ) for all permutations σ and all θ, then P(M | Hp,i, g, I) is the same for all i and we obtain P(M | H, g, I) without having to specify in the hypothesis H an index i for the person(s) who are supposed to have contributed according to H. Therefore we get a single likelihood ratio LR(M, g) that is not specifically targeting any of the contributors. In this article we want to distinguish between the contributors on the basis of their relative contribution, so now we consider how that can be done in (2.3). In order to differentiate between the contributors, we need contributor-specific parameters in the statistical model. For example, we may have a separate dropout probability di per contributor (for a discrete model) or a parameter ri that measures the relative contribution of contributor i (for a continuous model). Suppose now we let f(θ) be non-zero only for increasing probabilities of dropout d1 < d2 < ... < dn, or for decreasing relative contributions r1 > r2 > ... > rn.
Then the P(M | Hp,i, g, I) are different, so we arrive at contributor-specific likelihoods. For example, the mixture likelihood P(M | Hp,1, g, I) corresponds to letting S be the most contributing person, and P(M | Hp,n, g, I) corresponds to letting S be the least contributing person to M, which is different provided, of course, n > 1. Therefore, we now have hypotheses refining Hp into Hp,i that state that S is the contributor with parameter θi, and we define LRi(M, g | I) as the LR for the hypotheses (Hp,i, Hd). For such statistical models we do not yet have a likelihood ratio LR(M, g | I), only the contributor-specific ones LRi(M, g | I). We can combine these into a likelihood ratio LR(M, g | I) if we

choose conditional probabilities πi = P(Hp,i | Hp) stating which contributor S can be. Taking πi = 1/n,

$$\mathrm{LR}(M, g \mid I) = \frac{1}{n} \sum_{i=1}^{n} \mathrm{LR}_i(M, g \mid I), \tag{2.4}$$

which is simply the average of the contributor-specific LRi. We now make (2.3) explicit for some models that we will need in this paper. First we will use the discrete method of [15]. In that case M is reduced to its detected alleles and θ = (d1, ..., dn, c), where di is the dropout probability for the contributor with profile gi, and c is a parameter for drop-in. We then assume that any allele a is detected with probability

$$P_{\vec{d},c}(a \in M \mid g_1, \ldots, g_n) = 1 - e^{-c p_a} \prod_{i=1}^{n} d_i^{\,n_{i,a}} \tag{2.5}$$

with ni,a ∈ {0, 1, 2} the number of alleles a in gi. The probability that M is the observed set of alleles is obtained from (2.5) by multiplication, i.e., assuming conditional independence of detection of alleles given the gi, the di, and c; see also [15]. We take f(d1, ..., dn) to be uniform on [0, 1]^n, and for c we use (here) some fixed value. Then (2.3) specializes into

$$P(M \mid H, g, I) = \int_0^1 \cdots \int_0^1 \sum_{g_1, \ldots, g_n} P_{\vec{x},c}(M \mid g_1, \ldots, g_n)\, P(g_1, \ldots, g_n \mid H, g, I)\, dx_1 \ldots dx_n. \tag{2.6}$$
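The detection model (2.5), together with the multiplication over alleles, can be sketched in a few lines of code. This is a minimal sketch of the formula only: the allele frequencies, dropout values and genotype copy counts below are illustrative, not data from the paper.

```python
import math

# Sketch of the discrete detection model of Eq. (2.5): allele a is detected
# with probability 1 - exp(-c * p_a) * prod_i d_i ** n_ia, where n_ia is the
# number of copies of allele a in the genotype of contributor i.
# All numbers below are illustrative.

def detection_prob(p_a, c, dropouts, copies):
    """P(a in M | g_1..g_n) for drop-in parameter c, allele frequency p_a,
    per-contributor dropout probabilities and per-contributor copy counts."""
    non_detection = math.exp(-c * p_a)
    for d_i, n_ia in zip(dropouts, copies):
        non_detection *= d_i ** n_ia
    return 1.0 - non_detection

def profile_log_prob(observed, c, dropouts, alleles):
    """Log-probability of the observed allele set, assuming conditional
    independence of detection across alleles given the genotypes, d_i and c.
    `alleles` maps allele -> (population frequency p_a, copy counts per
    contributor)."""
    logp = 0.0
    for a, (p_a, copies) in alleles.items():
        q = detection_prob(p_a, c, dropouts, copies)
        logp += math.log(q) if a in observed else math.log(1.0 - q)
    return logp

# Two contributors; contributor 1 has little dropout, contributor 2 has much:
alleles = {"12": (0.15, (2, 0)), "16": (0.20, (0, 1)), "17": (0.10, (0, 1))}
print(profile_log_prob({"12", "16"}, c=0.05, dropouts=(0.1, 0.6), alleles=alleles))
```

Note that for an allele carried by no contributor (all copy counts zero), the formula correctly reduces to the drop-in probability 1 − exp(−c·p_a).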


This is a model with a symmetric density f(θ), so that LR(M, g | I) is the likelihood ratio in favour of being a contributor, not in favour of being any specific one. Still, if the di that best explain the trace M are different from each other, this is taken into account in the likelihood ratio. Indeed, we have

$$\mathrm{LR}(M, g \mid I) = \int \mathrm{LR}_\theta(M, g \mid I)\, f(\theta \mid M, H_d, g, I)\, d\theta, \tag{2.7}$$


where f(θ | M, Hd, g, I) (cf. [15]) is the updated density for the probabilities of dropout given M, Hd, g, I, and LRθ(M, g) is the likelihood ratio evaluated for parameters θ. Therefore the likelihood ratio LR(M, g | I) is determined mostly by the probabilities of dropout that best explain the trace profile, taking the information in I into account and considering the profile g of the person of interest as that of a non-contributor, i.e., under Hd. This probability density is still symmetric under permutations, but may be supported mostly on vectors (d1, ..., dn) whose entries are different from each other. As a second example we briefly describe the continuous model of EuroForMix (for full details, cf. [16], [2]). Now M is the set {(a, ha) : ha ≥ T}, where a is an allele, ha its peak height measured in rfu, and T the detection threshold. Peak heights are assumed to follow Gamma distributions, parametrized such that sums of peak heights also follow a Gamma distribution. The parameters θ are (related to) the expected peak height, its variance, the relative proportion of contribution ri for each contributor, and the degradation slope (the fragment lengths of the alleles must therefore be given to the model as well). One needs the detection threshold for each locus because peaks with a height below that threshold are not recorded. On each locus, the expected peak height for contributor i is taken to be proportional to ri. Further, a separate peak height distribution for drop-in alleles is needed. If stutter is modeled, more assumptions are needed. Then, maximum likelihood estimates (MLE) are obtained for the parameters for the peak height expectation and variance, the mixing proportions ri and the degradation slope. This is done conditional on H, I, g and M, such that we obtain θ̂p and θ̂d. These values are plugged into (2.2), but note that we cannot write this as (2.3) (and (2.7) does not apply either) because of the dependence of the parameters on the hypotheses and on the data M, g, I. This is a model with an asymmetric density f(θ): the ri are estimated and their point estimates used. If we were to use the estimates θ̂d obtained under Hd for both likelihoods, we would get LRi(M, g | I), a likelihood ratio in favour of being the i-th contributor, where i is the contributor that S gives most support for being.
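The MLE step can be illustrated with a deliberately simplified sketch: peak heights are modeled as Gamma-distributed with mean proportional to the mixture proportion, and the proportion is estimated by grid search. The fixed shape parameter, the scale mu, and the genotype assignment are illustrative assumptions and not EuroForMix's actual parametrization, which additionally handles degradation, stutter, drop-in and detection thresholds.

```python
import math

# Very simplified sketch of the continuous-model idea: peak heights follow
# gamma distributions whose means are proportional to the mixture
# proportions r_i. The shape parameter, mu and genotype assignment are
# illustrative assumptions.

def gamma_logpdf(x, shape, mean):
    """Log-density of a gamma distribution parametrized by shape and mean."""
    scale = mean / shape
    return (shape - 1) * math.log(x) - x / scale - shape * math.log(scale) - math.lgamma(shape)

def log_likelihood(heights, copies, r, mu=1000.0, shape=10.0):
    """heights[a]: observed rfu; copies[a]: copies of allele a per genotype."""
    ll = 0.0
    for a, h in heights.items():
        expected = mu * sum(ri * n for ri, n in zip(r, copies[a]))
        ll += gamma_logpdf(h, shape, expected)
    return ll

# Locus with alleles (15, 16, 17, 18) as in the text's toy example; genotypes
# (15, 18) and (16, 17) assumed for the two contributors.
heights = {"15": 2500.0, "16": 1000.0, "17": 1100.0, "18": 2600.0}
copies = {"15": (1, 0), "16": (0, 1), "17": (0, 1), "18": (1, 0)}

# Maximum likelihood estimate of r_1 by a simple grid search:
grid = [i / 100 for i in range(1, 100)]
r1_hat = max(grid, key=lambda r1: log_likelihood(heights, copies, (r1, 1 - r1), mu=3600.0))
print(r1_hat)
```

The estimate lands near 0.7, i.e., near the ratio of the major contributor's summed peak heights to the total, which is exactly the "more DNA, larger peaks" principle the paper relies on.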


2.1. Deconvolution

A second way to evaluate the likelihood ratio (2.1) is as

$$\mathrm{LR}(M, g \mid I) = \frac{P(g \mid M, H_p, I)}{P(g \mid M, H_d, I)} \cdot \frac{P(M \mid H_p, I)}{P(M \mid H_d, I)} = \frac{P(g \mid M, H_p, I)}{P(g \mid M, H_d, I)},$$

using that we consider models satisfying P(M | Hp, I) = P(M | Hd, I). Suppose the hypotheses postulate no relatedness between S and the profiled persons in I. If we regard profiles of unrelated individuals as independent, then P(g | M, Hd, I) evaluates to the probability p(g) that a random population member has profile g, and we get

$$\mathrm{LR}(M, g \mid I) = \frac{P(g \mid M, H_p, I)}{p(g)}. \tag{2.8}$$


The relation P(g | M, Hd, I) = p(g) is not exactly true when we introduce dependence between the profiles of persons, such as when the standard subpopulation correction (i.e., the θ or FST correction) is applied. In that case, the alleles of M have a non-zero probability of being identical by descent with those of S, hence M makes us revise the probability that S has profile g to some extent, which is small for small θ. In that case, P(g | M, Hd, I) ≈ p(g) only. In any case, (2.8) remains approximately true and we will treat it as if it were exact. The term 'probabilistic genotyping' refers to the fact that, as (2.8) shows, one may regard likelihood ratio calculations as determining a probabilistic assessment P(g | M, Hp, I) of the profile of contributor S. The preceding arguments can easily be used to show that on the level of individual contributors we have

$$\mathrm{LR}_i(M, g \mid I) = \frac{P(g \mid M, H_{p,i}, I)}{p(g)}, \tag{2.9}$$


i.e., LRi(M, g) is large if we predict profile g to be that of contributor i with a probability that is much larger than the population probability p(g). Since S can only be one of the mixture contributors, we expect that if S is contributor j, then (cf. (2.4))

$$\mathrm{LR}(M, g \mid I) \approx \mathrm{LR}_j(M, g \mid I)/n, \tag{2.10}$$


since then LRj is expected to be much larger than the other LRk(M, g | I). In other words, if S is in reality contributor i, we expect P(g | Hp,i, M, I) to be much larger than the a priori probability p(g), but not the other P(g | Hp,j, M, I) for j ≠ i.

3. The top-down approach


3.1. Introduction

Suppose that the model orders the contributors from larger to smaller contribution. It gives a probabilistic description of the possible genotypes of each contributor. The more the probabilistic description for contributor i resembles a point distribution, the more LRi(M, g) approaches 1/p(g), and the better 'resolved' we say the contributor is. If we work under the assumption of n contributors for the trace profile, then from the point of view of (2.3) we see that we need to take all profiles g1, ..., gn into account; from the point of view of (2.8) these computations amount to predicting the genotypes of all contributors. For large n this becomes computationally prohibitive at some point. On the other hand, if we have a mixture with different mixture proportions for all contributors and we order them according to decreasing contribution, then for most mixtures we expect to be able to resolve contributor 1 better than contributor 2, etc., because more contribution leads to larger peaks, more detection of alleles, and less uncertainty as to whether the resulting peaks are allelic or not. This also means that we heuristically expect that generally E[LR1(M, g | I) | Hp,1] ≥ ... ≥ E[LRn(M, g | I) | Hp,n]. It may very well be the case that the last contributors can be resolved only so poorly that no useful LR is to be expected. In that case, we would need to model the trace with a very precise model, and even then we would perhaps not be able to gather evidence strong enough to be useful. Moreover, these LRj for the last contributors then have the potential to depend much more heavily on the specifics

of the chosen model than for the well resolved contributors: in the introduction we argued that for well resolved contributors the precise model is of minor importance. But if we are interested only in the most prominent contributor, it would be enough to compute LR1(M, g). In view of (2.10) we would be done if that yields strong evidence. We would have obtained evidence in favour of being the most prominent contributor, and hence by (2.4) and (2.10), also evidence in favour of being any contributor. The difference is a factor n, depending on how we formulate the hypotheses (to get evidence in favour of being this particular contributor, or just any contributor). If we had computed LR1(M, g | I) and had not found evidence, we could then consider the second contributor and compare it with S. In other words, we can approach our problem of determining whether there is evidence in favour of S having contributed by computing the series

$$\begin{aligned} \mathrm{LR}^{(1)}(M, g \mid I) &= \mathrm{LR}_1(M, g \mid I), \\ \mathrm{LR}^{(2)}(M, g \mid I) &= (\mathrm{LR}_1(M, g \mid I) + \mathrm{LR}_2(M, g \mid I))/2, \\ \mathrm{LR}^{(3)}(M, g \mid I) &= (\mathrm{LR}_1(M, g \mid I) + \mathrm{LR}_2(M, g \mid I) + \mathrm{LR}_3(M, g \mid I))/3, \\ &\;\;\vdots \end{aligned}$$


(which ends with LR^(n)(M, g | I) = LR(M, g | I) as the n-th computation). This series can be terminated if we find some LR^(j) ≫ 1 sufficiently large for our purposes, or if the computation becomes too complex. These computations amount to answering the series of questions:

• Is there evidence that the person of interest is the most prominent contributor? If yes, we are done. If not,


• Is there evidence that the person of interest is among the two most prominent contributors? If yes, we are done. If not,

• Is there evidence that the person of interest is among the three most prominent contributors? If yes, we are done. If not,


• etc.
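The sequential questioning above can be sketched as a simple stopping rule on the running averages LR^(k). The contributor-specific LR values and the evidence threshold below are illustrative only.

```python
# Sketch of the sequential top-k questioning: compute LR^(k), the average of
# the first k contributor-specific LRs, and stop as soon as the evidence is
# strong enough. The LR_i values and the threshold are illustrative.

def top_down_series(contributor_lrs, threshold=1e6):
    """Return [LR^(1), LR^(2), ...] where LR^(k) = (LR_1 + ... + LR_k)/k,
    stopping once the running average exceeds the threshold."""
    running_sum = 0.0
    results = []
    for k, lr in enumerate(contributor_lrs, start=1):
        running_sum += lr
        lr_k = running_sum / k
        results.append(lr_k)
        if lr_k > threshold:  # evidence for being in the top-k: we are done
            break
    return results

# Suppose S is in fact the second contributor: LR_2 is huge, the others small.
series = top_down_series([0.3, 1e9, 2.0, 0.1])
print(series)  # stops after k = 2
```

Note that the later contributors (here the third and fourth) never need to be evaluated once the evidence threshold is reached, which is exactly the computational advantage argued for above.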


If we can do this, we avoid modeling contributors that we are, at that point, not interested in. Finding evidence for belonging to a certain top-k is sufficient, regardless of how many contributors there are beyond the first k. The purpose of this paper is to give a method that is inspired by this way of approaching the contributors in the order of contribution. Once we have explained this (simple) method, we will also be able to explain why we compute a sequence of LR^(k)(M, g | I), corresponding to evidence for being in the top-k contributors, instead of the sequence LRk(M, g | I) for being the k-th contributor.

3.2. The top-down likelihood ratio


We can now introduce our top-down approach to mixture evaluation. We want to use only the principle that contributors with a larger relative contribution are responsible for larger peaks. Therefore we define Mα, locus by locus and replicate by replicate, as the subprofile of M that contains the smallest set of peaks such that the sum of the peak heights in Mα is at least a fraction α of the total sum of peak heights. We define this subprofile by carrying out this procedure on all loci separately and, if applicable, on every replicate analysis of the trace. In this way, on every locus Mα targets alleles of the most prominent contributors that together have contributed at least a fraction α of the total peak height. On every locus, Mα can be constructed iteratively, by taking in peaks starting with the largest one, and stopping when a fraction α or more of the total sum of peak heights has been taken into Mα. We only take alleles into Mα, not their peak heights, since we will use the discrete model (2.6) on the Mα. For example, suppose a trace has alleles (10, 11, 12, 13, 14, 15, 15.3, 16, 17) with peak heights (112, 733, 2265, 301, 527, 729, 486, 534, 1430) rfu. Then the sum of the peak heights is 7117 rfu, and we now describe Mα for α = 0.1, 0.2, ..., 0.9, 1. First, M0.1 = {12}, since allele 12 alone has 2265 rfu, which is more than a fraction α = 0.1 of the total. For the same reason, M0.2 = M0.3 = {12}, whereas M0.4 = M0.5 = {12, 17}, M0.6 = {11, 12, 17}, M0.7 = {11, 12, 15, 17}, M0.8 = {11, 12, 14, 15, 17}, M0.9 = {11, 12, 14, 15, 15.3, 16, 17}, and finally M1 is the whole trace. Of course the subprofiles Mα need not contain all alleles of the contributors they target, and they may also contain some alleles that do not belong to the contributors we targeted. If several contributors

share the same allele, its (in reality stacked) peak height may be so large that it is taken into Mα. Since in Mα we only register absence or presence of alleles, this is only undesirable when none of the targeted contributors have that allele. We do not claim (or aim for) a perfect derivation of a subprofile containing precisely the alleles of the most prominent contributors, but our procedure does target them, since we collect the largest peaks, which reasonably come from the most contributing persons. We note that the same holds for the full trace profile M = M1: in that profile we need not have all alleles of all contributors (there may be dropout) and we may have detected alleles not belonging to them (drop-ins or artefacts). Therefore, we compute likelihood ratios on all the Mα using dropout probabilities and drop-in, just as we would do for the trace M = M1. For each subprofile Mα, we compute the likelihood ratio LRα(M, g) := LR(Mα, g) using the discrete model specified in (2.6), where the number of modeled contributors nα used for Mα is taken to be the minimal required number nα = ⌈mα/2⌉, with mα the maximum number of peaks observed on the loci of Mα. So, we use the peak heights only for the definition of the subprofiles, and carry out calculations using only the alleles in these. It remains to define a final result, denoted LRtop-down(M, g), from the LRα. We define

$$\mathrm{LR}_{\text{top-down}}(M, g) = \max_\alpha \mathrm{LR}_\alpha(M, g), \tag{3.1}$$


i.e., we take the largest likelihood ratio obtained on any of the subprofiles Mα. The choice of taking the largest may seem, at first sight, counterintuitive, so we will elaborate on why we make this choice. We first give two examples, and then proceed with heuristics illustrated by these examples.

3.3. Examples and heuristics for the top-down approach
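Before turning to the examples, the subprofile construction of Section 3.2 can be sketched in code. This is a minimal single-locus sketch; the allele labels and peak heights are those of the worked example above.

```python
import math

# Sketch of the subprofile construction: on a locus, take peaks from largest
# to smallest until their summed height reaches a fraction alpha of the
# total. Heights (rfu) are from the worked example in the text.

def subprofile(alleles, heights, alpha):
    """Smallest set of largest peaks whose heights sum to >= alpha * total."""
    total = sum(heights)
    target = alpha * total
    acc = 0.0
    chosen = set()
    for a, h in sorted(zip(alleles, heights), key=lambda p: -p[1]):
        chosen.add(a)
        acc += h
        if acc >= target:
            break
    return chosen

def n_contributors(max_peaks):
    """Minimal number of contributors for a locus showing max_peaks peaks."""
    return math.ceil(max_peaks / 2)

alleles = ["10", "11", "12", "13", "14", "15", "15.3", "16", "17"]
heights = [112, 733, 2265, 301, 527, 729, 486, 534, 1430]
print(subprofile(alleles, heights, 0.1))  # -> {'12'}
print(subprofile(alleles, heights, 0.6))  # alleles 11, 12 and 17, in some order
print(n_contributors(len(subprofile(alleles, heights, 0.6))))  # -> 2
```

A full implementation would apply this per locus and per replicate, and take mα as the maximum peak count over all loci of Mα; the likelihood ratios on the resulting allele sets are then computed with the discrete model (2.6).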


We now proceed with two examples where we compute the LRα(M, g): one for a mixture with four replicates, and another for a mixture with a single analysis. As will be presented in more detail in 4.1, we have subjected laboratory generated profiles (and also research data) to this method. These laboratory generated mixtures are those used in [17], consisting of mixtures analyzed on the NGM loci, with either two or three contributors in various mixture proportions, some with replicate analyses. In Figure 1 we plot, for the profiles g of the sample contributors, Log(LR(Mα, g)) where M is the trace with code name 0.24 in [17] (two contributors in a targeted 5:1 proportion, four replicate analyses), as a function of α, computed on {0.05, 0.1, 0.15, ..., 0.95, 1}. We have omitted the LRα < 1 from this graph for more convenient scaling of the Log(LRα)-axis, and plotted them as LRα = 1 instead. In addition we plot the LRs obtained with EuroForMix, using the MLE approach, taken from [17]. These LRs are computed on the whole trace. For visual comparison purposes we plot them as a horizontal line.




Figure 1: Likelihood ratios LR(Mα , g) as a function of α (top graph with dots: major contributor; lower graph with dots: minor contributor), compared with the LR obtained by EuroForMix (MLE) (values plotted as dotted lines for better comparison)

From Figure 1 we see that the LRs obtained with the continuous model are best approximated by the largest LRα for both contributors. For the major contributor, the LR is maximal at α = 0.7, and for the minor contributor at α = 1, i.e., when the whole trace is modeled. For the major contributor,


the maximal LR once obtained, remains (approximately) stable when α increases. This can be explained by properties of the discrete model. The LR it computes by integration over the probabilities of dropout is dominated by the LR’s for probabilities of dropout that best explain the evaluated profiles (cf. (2.7)). The four replicates are best explained by one contributor without and one with dropout. The different reproducibilities of the alleles then give information about which alleles come from the same contributor. The largest contribution to the likelihood (2.3) is obtained when alleles with similar reproducibility are from the same contributor. Therefore, when alleles from the minor contributor become taken into Mα , they do not influence the LR for the major contributor much. For a continuous model, the largest contribution to the mixture likelihood is similarly obtained: it is largest when the genotypes fit the peak heights. If the profile of a prominent contributor can be well resolved, a continuous model therefore can be thought of as automatically conditioning on the profile of the major contributor. If only a single analysis has been done, then in the absence of replicates the dropout model cannot use reproducibility. Thus, when α grows and Mα contains more alleles, the newly added alleles cannot be recognized by the discrete model as coming from new contributors. Therefore the LRα will decrease after a maximum has been reached on the Mα , since alleles taken into a larger Mα can not be distinguished as coming from another contributor and the number of modeled contributors grows. For the continuous model, the likelihood ratio for a contributor is much less affected by the presence of less prominently present contributors, since it can still use the peak heights also on a single analysis as it does when there are replicates. 
This dependence of the LR on α for a mixture with a single analysis is exemplified in Figure 2, where we display the LR(Mα, g) obtained by the three contributors to mixture 12.5. This is a mixture with (500, 50, 50) pg of DNA for the contributors.

Figure 2: Likelihood ratios LR(Mα , g) (for all three contributors) as a function of α (graphs with dots) compared with the LR obtained by EuroForMix (MLE) (values plotted as dotted lines for better comparison)


In Figure 2, we see that the LR for the major contributor is practically the same with the top-down method as with the continuous model, and that for the minor contributors, the LR with the top-down method is conservative with respect to that of the continuous model. The LR for the major contributor is, moreover, easily retrieved: this LR is found at α = 0.55 and remains constant up to α = 0.7. Until then, the Mα are actually modeled with a single contributor. These examples have motivated us to define our top-down approach as in (3.1). We believe that this will be sufficient to reproduce the likelihood ratios obtained with a continuous model for the most prominent contributor, at least if the difference in contribution with the remaining contributors is large enough. For less prominent contributors, it will be a conservative approach. Indeed, apart from the first one, we do not target the contributors directly. The Mα form an increasing series of subprofiles, each expected to contain most of the trace alleles coming from the first k contributors (for some k ≥ 0), and also some of the alleles of subsequent ones, mostly from contributor k + 1. The sequence of subsequent LRα(M, g) therefore resembles a sequence containing the various LR1(M, g), (LR1(M, g) + LR2(M, g))/2, (LR1(M, g) + LR2(M, g) + LR3(M, g))/3, etc., interpolating between these. A continuous model can directly compute the LRi(M, g | I) in favour of being contributor i, and therefore we expect the top-down


LR to be increasingly conservative as we go deeper into the mixture, since it computes likelihood ratios with a discrete model in favour of being among the top-k, rather than with a continuous model for being contributor k. For an algorithm implementing the top-down approach, various choices must be made. In principle, since there are only finitely many alleles in a trace M, there are also finitely many different LRα(M, g), hence all of these can be determined. In practice, one may choose to discretize the interval [0, 1] in a trace-independent way, and in this article we have always carried out calculations for α in multiples of 0.05. This means that as many as 20 subprofiles are created; however, these are not always all distinct, so that at most 20 computations must be done. Further, the number of subprofiles to be analyzed depends on the criterion for stopping: one can decide to stop when an LR-threshold is reached, when a certain α is reached, or when more than a certain number of contributors needs to be modeled. The runtime for the likelihood ratios on subprofiles using n contributors is, by construction, the same as for any trace that is modeled with n contributors. Therefore, when the top-k is searched, several computations, each involving 1 ≤ n ≤ k contributors, must be done. Before we move on, we reflect on how we expect the method to perform. By the top-k contributors, or equivalently, the k most prominent contributors, we mean the k contributors with the largest contribution. This presents problems if some contributors have precisely the same amount of contribution. If we know the contributions (such as in laboratory generated mixtures), we say the top-k is identifiable if there are k contributors who each have contributed more than all the others. For example, if the mixing proportions are (10, 10, 5, 1, 1) then the top-1 is not identifiable, the top-2 is, the top-3 is, the top-4 is not, and the top-5 is.
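This notion of identifiability is easy to make precise given known amounts. A minimal sketch (our own illustration; the function name and interface are ours, not from the paper):

```python
def top_k_identifiable(amounts, k):
    """True if the top-k is identifiable: k contributors each contributed
    strictly more DNA than every one of the remaining contributors.
    Applicable when the amounts are known, e.g. laboratory mixtures."""
    if k <= 0 or k > len(amounts):
        return False
    s = sorted(amounts, reverse=True)
    # The full set of contributors is always identifiable as a whole.
    return k == len(s) or s[k - 1] > s[k]

# The example from the text, mixing proportions (10, 10, 5, 1, 1):
results = [top_k_identifiable((10, 10, 5, 1, 1), k) for k in range(1, 6)]
# -> [False, True, True, False, True]
```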
On such a mixture, we would expect the subprofiles that are derived first (as α increases, starting from zero) to contain alleles of both of the first contributors. Therefore, aiming for the top-1 is not expected to perform well, but aiming for the top-2 can be expected to yield evidence for both of the identifiable top-2. We expect the LR's to be similar to those of a continuous model, whose resolution of these contributors will also be affected by them having the same contribution. Aiming for the top-3, we would expect to find also the third contributor, although with an LR that becomes more conservative with respect to a continuous model than for the first two, since we evaluate the subprofiles with a discrete method. The efficacy of the top-down method can in general be expected to depend on the extent to which the targeted top-k is identifiable, i.e., on whether their relative contributions differ much from those of the remaining contributors. When we have a likelihood ratio in favour of being among the top-k contributors, we can use this result without specifying how many contributors we would have needed for the whole trace. If no evidence in favour of contribution to the top-k is found, we may not be able to exclude a person as contributor, but only fail to find evidence in favour of being among the top-k. This may be because the person of interest is not in the top-k, or because there is no (sufficiently) identifiable top-k in the profile, as in the previous example with proportions (10, 10, 5, 1, 1). In any case, in contrast to existing methods, the total number of contributors is not a parameter of the top-down method. We will further explain why we believe this is actually logical in 3.4 below, where we raise the question of what the notion of contributor represents. The previous examples indicated that taking the largest LRα as the top-down LR, as we defined it in (3.1), gives us a good approximation to the LR obtainable with a continuous model.
As we saw, for a continuous model we would expect the LR not to decline beyond the maximum LRα if we were to work with subprofiles with such a method. Now, we will discuss heuristically why we believe that working with subprofiles can be expected to have the properties that we want. First we emphasize that not only the subprofiles Mα for α < 1 are incomplete. Often the whole trace M = M1 can also be thought of as a subprofile of an (unobserved) larger profile that we would obtain with other sensitivity settings. It would be a concern if likelihood ratios obtained on M would usually decrease if we had that larger profile. The reason that such a decline in evidence should not happen is, broadly speaking, the following. At α = 0 we start with the empty profile M0. Then, as α increases, we obtain the alleles of the contributors, where their order of appearance is broadly according to decreasing contribution. For every contributor there will be a smallest α such that this contributor's alleles in M are all in Mα. Letting α increase further, we will add more alleles to the subsequent subprofiles. That does not affect the likelihood ratio for the already present contributors, if the newly added peaks are sufficiently smaller so as to be best explained as coming from other contributors. We therefore believe that, if we were to use a continuous model that reflects the reality of mixture profile generation adequately, LRα(M, g) would rise to some maximal value and then remain practically constant, if we use the peak heights in the mixture likelihood calculations for the Mα.


We do not use a continuous but a discrete model for the computations on the subprofiles Mα. This means that on any Mα, the peak heights are no longer taken into account; they only play a role in the definition of the Mα. Contrary to what we expect to happen for a continuous model, the LRα can therefore be expected to decrease when Mα takes in more and more alleles, especially when no replicates are available. This is indeed what we saw in Figure 2. There are advantages and disadvantages stemming from working with a discrete model in the calculation of LRα. The disadvantage is, of course, the loss of information. Advantages are that the discrete model is computationally easy, and that it can be applied to the subprofiles without modification. The dropout probabilities are, as for the full trace, the probabilities for each allele of each contributor not to contribute to the observed profile; and the drop-in model provides the opportunity to also observe alleles not present in any of the genotypes of the contributors. No adaptations have to be made for the model to be applicable to the subprofiles. This is not to say that we believe the discrete model only has advantages, and we believe that the top-down approach would in principle also be possible with a continuous model on the subprofiles. The advantage of that approach would increase with the order of the contributor we aim to find. The continuous model applied would need to predict the peak heights in a way that takes the procedure for defining the subprofiles into account. The approach we have taken does not correspond to an rfu threshold for inclusion, which will complicate the analysis for a continuous model. An alternative approach for defining the subprofiles, which is easy to combine with an evaluation by a continuous model, would be to simply take in alleles whose heights exceed a certain threshold, which one gradually lowers.
This would not correspond to a certain relative fraction of the total signal, but rather to a sequential unmasking of the profile by lowering the detection threshold. We have not tried such variants of the top-down approach, and we certainly do not exclude that these could outperform the method discussed in this article. We see the proposed approach more as the first of a new class of approaches than as the establishment of one. A concern with (3.1) may be that for a non-contributor we seem to maximize the opportunity to obtain a likelihood ratio in favour of contribution. We will argue here that, essentially because the subprofiles are derived on the basis of M alone, false positives are not expected to be more of a concern than for other models. Since we argued that the LRα, viewed as a function of α, interpolate between the various LR(k)(M, g), the false positive rates will be, we expect, similar to those of a serial query for being either contributor 1, and if not then for being contributor 1 or 2, etc. This again should roughly correspond to the combined false positive rates of several queries for the contributors directly with the LRk(M, g). It is possible to give crude bounds. We can apply the general property of likelihood ratios that

P(LR ≥ t | Hd) = P(LR ≥ t | Hp) · E[LR⁻¹ | LR ≥ t, Hp] ≤ 1/t, (3.2)

see [18] for a proof. The bound 1/t in (3.2) is of course usually not sharp. In particular, (3.2) applies to all the LRα on the subprofiles. Suppose we approximate (3.1) by discretizing [0, 1] using a number k of subprofiles Mα, say for α ∈ {1/k, 2/k, . . . , (k − 1)/k, 1}. If all the Mα were independent, the bound 1/t in (3.2) would become approximately k/t for large t, where t is the maximal LR on all subprofiles. Therefore it is impossible to have a large probability of strong false positives, with k/t being a very crude upper bound for the probability that a non-contributor has a top-down likelihood ratio exceeding t ≫ 1, if we take k independent subprofiles Mα. However, the Mα, being an increasing series of subprofiles, are far from independent. Any person fitting well into a particular Mα will also fit in the larger ones, but not conversely. This will reduce the false positive probabilities below this crude bound.
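The bound (3.2) can be checked numerically on any fully specified model. Below is a small self-contained illustration (a toy example of ours, not from the paper): an LR comparing two binomial hypotheses, for which the exact tail probability under Hd indeed never exceeds 1/t.

```python
from math import comb

def tail_prob_under_hd(n, t, p_p=0.8, p_d=0.5):
    """Exact P(LR >= t | Hd) for a toy model: n coin tosses,
    Hp: heads probability p_p, Hd: heads probability p_d; the LR for
    k observed heads is the ratio of the two binomial likelihoods."""
    prob = 0.0
    for k in range(n + 1):
        lr = (p_p ** k * (1 - p_p) ** (n - k)) / (p_d ** k * (1 - p_d) ** (n - k))
        if lr >= t:
            prob += comb(n, k) * p_d ** k * (1 - p_d) ** (n - k)
    return prob

# The Markov-type bound (3.2): P(LR >= t | Hd) <= 1/t for every t > 0.
checks = [tail_prob_under_hd(20, t) <= 1 / t for t in (2, 5, 10, 100)]
```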

3.4. The concept of a contributor

In the history of likelihood ratios for DNA mixtures the number of contributors has always played a very central role, but in our method the total number of contributors is not very important. It is relevant for its performance in that a large number of contributors will more easily give rise to such a complex profile that it does not give useful evidence any more, but not more than that. To shed some more light on why we believe this to be only reasonable, we first need to make sense of the concept of a mixture contributor. The number of contributors is typically not part of the original question that was asked, namely to determine whether there is evidence that the person of interest contributed DNA to a sample. But since in any model, according to (2.3), a trace profile of the sample is viewed as a superposition of (possibly


partial) profiles, with in addition the possibility of non-allelic peaks having another origin, we must assume some number n of contributors (or a probability distribution for it) to be able to calculate (2.3). It is therefore inevitable to introduce some number n of contributors. But what do we really mean when we talk about the (number of) contributors to a DNA mixture? We can think of three possible, but different, interpretations: sample contributors, profile contributors, and modeled contributors. The sample contributors are the persons whose DNA in reality is present in the sample with some minimum amount to be defined (say, one cell). This number may be known in laboratory experiments, but usually otherwise it is not. However, since we do not subject the sample to calculations, but the DNA profile obtained from it, we believe this number is not of ultimate interest. Indeed it may be that sample contributors have not been detected in the profile. It then would not make much sense to still include these persons as contributors in calculations on a profile from which they are absent. Looking at the profile itself, by a profile contributor, we mean those sample contributors whose DNA has actually been detected in the resulting DNA profile. Or, a bit more precisely, we could define them as the sample contributors the removal of which would have led to a different profile. By definition there are at most as many profile contributors as sample contributors, possibly strictly fewer. The difference in numbers will depend on the profiling technology (the sensitivity, the profile analysis procedure, the detection threshold, etc.), and of course on the amounts of DNA of the sample contributors to the sample. For any given trace profile, imagine that we increase the detection thresholds. As we do so, the profile will contain fewer and fewer alleles and hence we lose profile contributors, until there aren’t any left. 
The number of sample contributors is obviously unchanged, and hence we see that inferences about all sample contributors, based on the profile, are generally not possible. The number of profile contributors has a subtlety. If we believe it exists as a single number valid for all loci, we must define it on the basis of the full profile. But if some loci are more sensitive than others, it may be that on these loci alleles are obtained from sample contributors who remain undetected on other loci. Similarly, degradation may cause some loci not to register any alleles. These loci, when considered on their own, show zero profile contributors. Yet, when other loci show alleles and the profile is then classified as (say) having three contributors, then one has to classify the whole profile as a three-person profile. This makes it impossible to definitively establish the number of profile contributors, since that number could always be subject to upward change depending on the results of additional DNA typing. At best, one can therefore only establish a lower bound for the number of profile contributors. Even this is difficult, because a profile contributor may have contributed very little. In our opinion this is a drawback. Finally, there is the number of modeled contributors. This is the number n in (2.3). A modeled contributor is foremost a concept within the probabilistic model for the likelihood calculations, but what this concept represents (i.e., what it is really modeling) varies over the different approaches. The earliest approaches made no attempt to distinguish between contributors. The first, most basic models (sometimes called the binary or combinatorial method) only dealt with traces where all contributors were supposed to be fully detected in the profile, and no additional alleles were supposed to be present in the trace. The detected alleles, not the peak heights, were subjected to calculations.
A contributor in this model is a strong concept: since all of the contributors are modeled as fully detected, the mixture likelihood strongly depends on their number, hence so does the LR. On the bright side, with this assumption the mixture likelihood depends so strongly on the number of modeled contributors that maximum likelihood estimation of that number works fairly well (cf. [19]). A first generalization was the family of discrete models as implemented in LRmix(Studio) (cf. [20]), where dropout is allowed for but all contributors are modeled as having the same probability of dropout (except possibly persons with known DNA profile that are conditioned on under both hypotheses). The set of all detected alleles is therefore regarded as a superposition of partial DNA profiles of persons that are expected to be equally partially detected. As for the previous model without dropout, this induces a large dependence of the mixture likelihood on the number of contributors: if the detected alleles are a superposition of equally partial profiles, it is of importance how many there are. It is then only logical that the LR can depend strongly on the chosen n (e.g. [21] and references therein). A contributor is a weaker concept than it was for the binary models, but still quite a strong one, since the mixture likelihood (2.3) is still determined by terms where all contributors have detected alleles in the trace. This is different for the latest models that distinguish between the contributors based on their possibly different contribution. A modeled contributor is now a much weaker concept: it is someone that we may, but need not, see alleles of in the profile when computing (2.3). The dependence of the likelihood ratio on n will then be different. Increasing the number of modeled contributors no longer means that the same


alleles must be fairly divided over a larger number of persons; the superfluous modeled contributors may be estimated to have made no (meaningful) contribution to the trace profile. This means that we can expect that if the number n of modeled contributors is less than the number of profile contributors, not all profile contributors can be recovered. It may however still be perfectly possible that an underestimate does not affect the LR for the most prominent contributors, as was recently also observed in [22]. Contrary to the modeled contributor notions of the earlier models, it is now not conceptually problematic (but it is computationally unwise) to work with more modeled contributors than profile contributors, or even with a number exceeding the number of sample contributors. This is so since, if the model makes a likelihood-driven estimate of parameters including the individual contributions or probabilities of dropout, then superfluous contributors can be expected to have their contribution estimated to be close to zero. In terms of deconvolution, nothing is learned about a contributor if we see no alleles of that person. This means that for the superfluous contributors we will have LRi(M, g) = 1 for all g, cf. (2.9). Since the LR is, as in (2.4), the average of the LR's targeting the separate modeled contributors, we can predict what happens if we augment the number of modeled contributors. For a person of interest who is not a profile contributor, we will usually have LRi(M, g | I) ≪ 1 if modeled contributor i is well resolved. This means that LR(M, g | I) ≪ 1 if n is at most equal to what is minimally needed as number of profile contributors. Adding superfluous contributors k who have LRk(M, g | I) ≡ 1 will bring the LR(M, g | I) towards m/n, where m is the number of such superfluous contributors and n the total number of modeled contributors.
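This averaging effect can be sketched with a hypothetical numeric example (ours, using the averaging in (2.4); the numbers are illustrative only):

```python
def overall_lr(per_contributor_lrs):
    """Per (2.4): the overall LR is the average of the LRs in favour of
    being each of the n modeled contributors."""
    return sum(per_contributor_lrs) / len(per_contributor_lrs)

# A non-contributor on a well-resolved profile modeled with n = 3: LR_i << 1.
base = [1e-6, 1e-6, 1e-6]
strong_exclusion = overall_lr(base)          # ~1e-6

# Adding m = 2 superfluous contributors, each with LR_k = 1, drives the
# overall LR towards m/n = 2/5, so the strong exclusion disappears:
weak_exclusion = overall_lr(base + [1.0, 1.0])   # ~0.4
```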
A non-profile contributor's LR will therefore be determined by the number of superfluous modeled contributors, whereas for a profile contributor, the change in LR due to adding modeled contributors will be small. These effects are described in more detail in [15] and in [22]. Finally we note that, since (especially closely) related individuals tend to share (sometimes many) more alleles than unrelated individuals, one must make a choice for the (lack of) relatedness between the contributors whenever their number is to be estimated. In practice contributors are always supposed to be unrelated. This may lead to an underestimate of their number. For example, consider three brothers producing a trace together. The number of profile contributors may then be three. But three brothers together have at most four distinct alleles on every locus, which will lead to an allele count based estimate of there being two contributors. In fact, three brothers together amount to 1.75 unrelated contributors, since for each of their parents the probability that one of their alleles has not been passed on to any of the three siblings is 0.25. Thus, each parent (if heterozygous) will give rise to an expected 1 · (1/4) + 2 · (3/4) = 1.75 distinct alleles in the three siblings together. In general, if we assume related contributors, we will obtain a probability distribution for how many alleles occur in the profiles, together with a probability distribution of their multiplicity due to identical by descent sharing of alleles. The number of distinct alleles that we expect on a locus will generally not be an integer. In view of all this we view contributors primarily as model concepts, used as vehicles to draw pairs of alleles that can (but need not) be detected. This is why we prefer to speak of a trace modeled with n contributors, rather than of a trace having n contributors.
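The 1.75 in the three-brothers example can be verified by enumerating the 2³ equally likely inheritance patterns per heterozygous parent (a small check of ours):

```python
from itertools import product

def expected_distinct_parental_alleles(n_children=3):
    """Expected number of distinct alleles of one heterozygous parent
    observed among n children, each child inheriting one of the two
    parental alleles with probability 1/2."""
    draws = product((0, 1), repeat=n_children)
    return sum(len(set(d)) for d in draws) / 2 ** n_children

per_parent = expected_distinct_parental_alleles(3)  # 1*(1/4) + 2*(3/4) = 1.75
# Two parents give 2 * 1.75 = 3.5 expected distinct alleles; one unrelated
# heterozygous contributor shows 2, so three brothers amount to
# 3.5 / 2 = 1.75 "unrelated contributors" worth of distinct alleles.
```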
Even if we know that the laboratory used n contributor samples to generate the profile, this does not mean that we should use that number in likelihood calculations. Using fewer than n modeled contributors often implies that some sample contributors cannot be found. Using more is, for a good model, harmless. It makes strong exclusions impossible, but this is reasonable: if we take into account the possibility of non-detected contributors, anyone could be one of these. Finally, we make the connection between the top-down approach of this paper and the approach in [23], where the number of contributors is regarded as a nuisance parameter that can be integrated away, as for the other parameters. That approach is for the computation of LR(M, g), i.e., modeling the whole trace. If we restrict to a top-k of contributors, we remove the dependence of the likelihood ratio on the total number. The contributor concept has recently been discussed in [24]. What they call the target, resp. correct, resp. assigned number of contributors corresponds closely to our notions of sample, profile, and modeled contributors. The authors of [24] state that, as the first two numbers will not be known in casework, one can only make a reasonable choice for the assigned number in the calculations. This is in agreement with the discussion we have just presented.


4. Applications

In this section we describe the application of the top-down method (3.1) to various sets of mixtures and compare the results to those obtained with the continuous model of EuroForMix.

4.1. Comparison on laboratory generated mixtures on the NGM loci


To test the performance of the top-down method, we calculated the likelihood ratio on all combinations of the set of 59 mixtures and 33 reference profiles typed on the NGM loci that was described in [25]. This set consists of four two-person mixtures (each with four replicate analyses) and 55 three-person mixtures (six with four replicate analyses, and 49 with a single analysis) in varying mixing proportions, with DNA input ranging from 30 to 500 pg. In [17], a comparison was done between EuroForMix and LRmixStudio; in [26] we extended the comparison to the discrete model with integration over the probabilities of dropout (modeling the whole trace). In this section we compare the results of our top-down method with those obtained with EuroForMix using the MLE method, where a separate MLE is obtained for both hypotheses. This MLE method has also been used in other publications, such as [27],[28]. The top-down method was carried out by computing LRα for α ∈ {5%, 10%, . . . , 95%, 100%} and by setting LRtop−down = max_α LRα.
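As a sketch of this procedure, the following is one plausible implementation. The subprofile rule (per locus, take the tallest peaks until they cover a fraction α of the locus signal) is our reading and not necessarily the paper's exact definition, and `lr_for_subprofile` is a stand-in for the discrete-model LR computation:

```python
from typing import Callable, Dict, List

def top_down_lr(
    peaks: Dict[str, Dict[str, float]],          # locus -> allele -> peak height
    lr_for_subprofile: Callable[[Dict[str, List[str]]], float],
    alpha_step: float = 0.05,
) -> float:
    """LR_top-down = max over alpha of LR(M_alpha, g), cf. (3.1)."""
    lrs, seen = [], set()
    steps = int(round(1.0 / alpha_step))
    for i in range(1, steps + 1):
        alpha = i * alpha_step
        sub = {}
        for locus, heights in peaks.items():
            total = sum(heights.values())
            kept, acc = [], 0.0
            # Take the tallest peaks until they cover a fraction alpha of the signal.
            for allele, h in sorted(heights.items(), key=lambda kv: -kv[1]):
                kept.append(allele)
                acc += h
                if acc >= alpha * total:
                    break
            sub[locus] = tuple(sorted(kept))
        key = tuple(sorted(sub.items()))
        if key in seen:   # identical subprofiles need only one LR computation
            continue
        seen.add(key)
        lrs.append(lr_for_subprofile({l: list(a) for l, a in sub.items()}))
    return max(lrs)
```

With the 0.05 grid this creates at most 20 subprofiles, but duplicates are evaluated only once, matching the earlier remark; a stopping criterion (an LR threshold, a maximal α, or a maximal number of modeled contributors) can be added to the loop.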

4.1.1. Likelihood ratios for contributors

We start with comparing the LR's obtained for (sample) contributors. On the mixtures where replicates are available, we find, similarly to what we obtained in [26], that the LR's with or without a peak height model are very highly correlated, with the best linear fit indicating that the LR, on logarithmic scale, is 5% larger when the peak heights are modeled (by EuroForMix) than when they are not (with the top-down method). We display the results in Figure 3.


Figure 3: Likelihood ratios on the mixtures with four replicates from [17] obtained with the top-down method and with EuroForMix (MLE)


For the mixtures with a single analysis, we order the contributors by decreasing contribution. The likelihood ratios for the most prominent contributors are displayed in Figure 4; these correspond to the likelihood ratios for the first contributors of the mixtures, which had either (100, 50, 50), (250, 50, 50), (250, 250, 50), (500, 50, 50), or (500, 250, 50) pg of DNA for the contributors. We see in Figure 4 that the likelihood ratios obtained with the continuous model and the top-down method show good agreement, with the top-down method obtaining a slightly conservative LR with respect to EuroForMix. The results for the second contributor are displayed in Figure 5a and those for the third, least contributing contributor in Figure 5b. From Figures 4-5 we see that indeed, as we search for contributors further down in the ordering according to their contribution, the top-down method with the discrete model yields LR's that are increasingly conservative with respect to those of the continuous model. For the most contributing persons, the LR with the top-down method is close to that of the continuous model, and for the least contributing persons, their top-down LR requires an evaluation of the whole trace and is therefore the same as for the classical discrete model (cf. [15]); the top-down LR for the middle contributors is between these.


Figure 4: Likelihood ratios on the mixtures with a single analysis from [17] obtained with the top-down method and with EuroForMix (MLE) for the first contributor










(a) Second contributors


(b) Third contributors


Figure 5: Likelihood ratios on the mixtures with a single analysis from [17] obtained with the top-down method and with EuroForMix (MLE) for the second and third contributors


4.1.2. Likelihood ratios for unrelated non-contributors

Next we consider the likelihood ratios for the unrelated non-contributors. The results that we have confirm the argumentation in section 3.3. In total 1,667 LR's corresponding to unrelated non-contributors were computed: with EuroForMix assuming three contributors per mixture, and with the top-down method evaluating α = 0.05, 0.1, . . . , 0.95, 1, as we did for the contributors. Of the 1,667 LR's, there were 158 LR's larger than one with the top-down method compared to 105 with EuroForMix. However, almost all of these were small. There were 13 LR's larger than 10 with the top-down method and 15 with EuroForMix, the largest one being 715 with the top-down method and 162 with EuroForMix. Thus, the results for non-contributors are highly similar for the top-down method and EuroForMix.

4.2. Comparison on PPF6C profiles


A second experiment was to compute LR's as in 4.1, but now on trace profiles analyzed on the 23 autosomal loci of the PPF6C multiplex. These traces, and computations on them with EuroForMix, are reported upon in [29]. In total, 120 traces were generated, each in principle with three replicates. These 120 traces were divided into 30 traces each with two, three, four or five persons, in various mixing proportions. We treated each replicate separately, thus having 354 traces (six profiles did not pass a laboratory quality control step), and we computed the top-down LR with each of the 30 individuals that together were the contributors of all traces. For the five-person mixtures, these traces contained the following amounts of input DNA: either (300, 150, 150, 150, 150), (300, 30, 30, 30, 30), (150, 150, 60, 60, 60), (150, 30, 60, 30, 30) or (600, 30, 60, 30, 30) pg of DNA. These series are denoted as series A, B, C, D, E, respectively. For the traces with k < 5 contributors the first k amounts of each of these combinations were used. For example, the four-person traces of the D-series had (150, 30, 60, 30) pg of input DNA for the four contributors. There is thus variation in the relative proportions, in the absolute amounts, and in whether or not a single prominent contributor stands out.

We restricted the top-down LR to use only subprofiles with at most three modeled contributors, thus stopping if more than three were needed. As in 4.1 we used α ∈ {0.05, 0.1, . . . , 0.95, 1}. Thus each top-down LR requires at most 20 computations. If a subprofile Mα was not subjected to computations because more than three contributors would be modeled, all likelihood ratios with this subprofile were set equal to 1. Therefore, according to (3.1), if g is such that all LRα (M, g) with at most three contributors turned out to be less than one, we get LRtop−down (M, g) = 1.


4.2.1. Comparison with discrete method

First of all we compare the results of the top-down method with those of the classical discrete method. The latter results correspond to the evaluation of the full profile. By construction, the top-down method gives an LR that is at least equal to the LR of the discrete method, and is larger if there is a strict subprofile giving a larger LR than the full profile. We have separated the LR's realized by actual contributors and by non-contributors. In Figure 6 we display a comparison of the LR's for the contributors. From this figure we see that there are several effects at play. First, there is the category of traces that have not been evaluated by the discrete method since the number of contributors is too large, meaning that more than three modeled contributors are needed. These correspond to the dots on the vertical axis corresponding to Log(LR) = 0 for the discrete method. It is clear that many of these traces harbour evidence that can already be extracted at the expense of modeling at most three contributors. A second category is formed by the points for which the resulting LR is the same: these are comparisons where the person of interest's LR is maximal when the full trace is modeled. We see that almost all of the smallest LR's are of this type, which is logical since these will be the comparisons with the most minor contributors. The third category consists of those points where the top-down LR is larger than the discrete LR. These points correspond to contributors that give a large likelihood ratio on the subprofiles. We see that the difference in LR can be very substantial, and that we can extract these large likelihood ratios using only the profile data to define the subprofiles. The computational effort to retrieve these larger LR's is for most points considerably less than the one for the smaller LR obtained with the classical discrete method on the whole profile.

Figure 6: Likelihood ratios for contributors (1235 comparisons) with the classical discrete method and with the top-down method; dashed line corresponding to equality

Next, we consider the likelihood ratios obtained by the non-contributors. As for the contributors, by construction the LR obtained by the top-down method must be at least as large as for the discrete method, and one might wonder whether we obtain many false positives. As we see from Figure 7, this is not the case. Of course, the likelihood ratios are very different, and the LR might be much more uninformative with the top-down method than with the classical discrete method. This happens when a subprofile contains only incomplete information on its contributors; in that case the LR’s become more uninformative and this result can then define the final result. However, we should keep in mind that also with the classical discrete method, LR’s for non-contributors are (contrary to those for actual contributors) variable in that they depend crucially on the number of modeled contributors. If we added an additional contributor in the classical discrete model to those needed for the computations in Figure 6, this would eliminate all the strong exclusions obtained with the discrete method, for the same reason as we just outlined. More concrete results that confirm this can be found for example in [15].


Figure 7: Likelihood ratios for non-contributors (9,385 comparisons) with the classical discrete method and with the top-down method; dashed line corresponding to equality


Finally we take a look at the LR distribution for the non-contributors as obtained with the top-down method. As expected, a large proportion of these were precisely equal to 1. A histogram of the obtained LR’s on logarithmic scale is presented in Figure 8. The largest LR observed for a non-contributor was equal to 82, again in line with the arguments in 3.3.


Figure 8: Likelihood ratios for the non-contributors of the PPF6C profiles


4.2.2. Comparison with continuous method

From the preceding comparisons between the discrete method and the top-down method, we conclude that the results of the top-down method are quite advantageous compared with those of the classical discrete method. A more important question is of course how they compare to the results of continuous methods, and we now proceed to a comparison with the results obtained by EuroForMix on these data. As calculations using EuroForMix were time consuming, LR calculations were performed on a subset of possible comparisons, as described in [29]. No computations were done with EuroForMix on five-person traces. The calculated LR’s for sample contributors were usually not for the most prominent ones. We now present a comparison of the available results for the actual sample contributors, per number of sample contributors. For the two-person and three-person mixtures, we present the results in Figure 9. For the two-person mixtures, the results are effectively those of the classical discrete method, since the comparisons only involve the minor contributor. We see that, as expected, the evidential value obtained with the continuous method exceeds that of the discrete method. There is one trace for which the minor contributor gave LR = 1 with the top-down method; this was the minor contributor in one of the traces with a (600, 30) pg DNA input. For the three-person traces, comparisons with all types of contributor are possible. The top-down method is then most advantageous for the most contributing individual: the likelihood ratios obtained by each method are, for practical purposes, of equal use, but for the top-down method a comparison on a single source trace is sufficient to obtain these, whereas the continuous method models the whole trace.

(a) Two person mixtures (minors only)
(b) Three person mixtures
Figure 9: Comparison of all Log10 (LR)’s obtained by EuroForMix with those obtained by the top-down method, on the two- and three-person mixtures. Stars correspond to the identifiable top-1 contributors, gray squares to those in the identifiable top-2 (but not top-1) and circles to the remaining contributors.


Next, we consider the four-person mixtures. Now, a drawback of the mixture proportions of the mixtures we consider here becomes apparent: there are many contributors with precisely the same input DNA amount. If we sort the contributors according to contribution, we recall from 3.3 that we call the first k contributors an identifiable top-k if all contributors beyond the k-th have contributed strictly less than the k-th. For example, in a mixture with proportions (300, 300, 150, 150) the top-1 is not identifiable, the top-2 is, the top-3 is not and the top-4 is. None of the mixtures in this study with more than three contributors have an identifiable top-3. Therefore, in the four-person mixtures, the top-down method as we applied it here cannot be expected to find evidential weight for the contributors beyond the identifiable top-1 and top-2. In Figure 10 we present the comparison of all likelihood ratios for the four-person mixtures that were carried out with EuroForMix. For completeness, the likelihood ratios for contributors beyond the top-2 are also included, even though they were not a target of the top-down method as we applied it. Within the contributors that are in the identifiable top-2, there is one comparison that yields LR = 1 for the top-down method (the 60 pg contributor in a four-person E-series mixture, with input (600, 30, 60, 30) pg). In order to better see the convergence between the results of the top-down method and the continuous method, we give another presentation of the results of Figures 9 and 10 in Figure 11, for all contributors in the two-, three- and four-person mixtures that are targets of the way we applied the top-down method here, i.e., who are in an identifiable top-k with k ≤ 3. This time, we compare the Log10 (LR) obtained by the top-down method with the same quantity obtained by the continuous method, by considering the ratio Log10 (LREFM )/Log10 (LRtop-down ) of the weight of evidence of the continuous model with that of the top-down method.
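The identifiable top-k criterion recalled above can be sketched as a small check (the function name and the representation of contributions as a tuple of DNA input amounts are illustrative):

```python
def is_identifiable_top_k(contributions, k):
    """True if the first k contributors (sorted by contribution) form an
    identifiable top-k: every contributor beyond the k-th contributed
    strictly less than the k-th. contributions: input amounts, any order."""
    c = sorted(contributions, reverse=True)
    if k >= len(c):          # the top-k is everyone: trivially identifiable
        return True
    return c[k] < c[k - 1]   # first excluded contributor strictly smaller

# The (300, 300, 150, 150) example from the text: the top-1 is not
# identifiable (a tie at 300 pg), the top-2 is, the top-3 is not, the top-4 is.
mix = (300, 300, 150, 150)
```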
The reason to make this comparison is that we interpret Log10 (LR) (often called the weight of evidence) as a measure of the information in the profile that discriminates between the hypotheses.

Figure 10: Comparison of all Log10 (LR)’s obtained by EuroForMix with those obtained by the top-down method, on the four-person mixtures. Stars correspond to the contributors in the identifiable top-1, gray squares to those in the identifiable top-2 (but not in the top-1) and empty squares to contributors beyond the identifiable top-2.

This ratio of the Log10 (LR)’s tells us how much more information is obtained by the first method relative to how much is obtained by the second one. The four-person mixtures have no identifiable top-3, so only their contributors in the identifiable top-1 and top-2 could be included in Figure 11; for contributors beyond these we cannot expect the top-down LR to have reached its maximum when targeting the top-3.


Figure 11: Comparison of all Log10 (LR)’s obtained by EuroForMix, with those obtained by top-down method, on the two-, three- and four-person mixtures. As in Figures 9 and 10, stars correspond to the contributors in the identifiable top-1, gray squares to those in the identifiable top-2 (but not top-1), and black circles to the contributors in the identifiable top-3 (and not in the top-2 or top-1; there are such contributors for the three person mixtures only)

In Figure 11 we clearly see the trend for the LR’s of both methods to differ relatively less from each other as the weight of evidence becomes larger. For small weight of evidence, the spread between the results is relatively much larger. We note, however, that we do not intend the results of EuroForMix to serve as gold standard. Also for different continuous models we expect that the ratio between their obtained weights of evidence is larger for smaller weight of evidence, since these will depend more heavily on the more subtle choices in the model, such as the stutter model.

These series of traces also allow us to investigate to what extent the LR for a prominent contributor is influenced by the presence of the other, more minor contributors. To do so we computed the mean of the Log(LR)’s obtained for the first contributor of all series, as well as for the second contributor of the B-series, because in that series the first two contributors have the same amount of input DNA. For comparison to single source traces we also calculated the LR for a single source trace as the inverse of the random match probability. This is an upper bound for all LR’s that we can obtain, and the extent to which the average LR decreases when there are several contributors reflects the complexity of the mixtures. The results are summarized in Figure 12.
Figure 12: Likelihood ratios for the most prominent contributor(s), as a function of the total number of contributors

From Figure 12 we see that for the B-series and the E-series, where the most prominent contributor has contributed most relative to all others, the LR is hardly affected by the presence of these minor contributors, whatever their number. Generally the LR for the most prominent contributor is less affected by the presence of other contributors if the relative contribution of the major contributor is larger. In Figure 13 we plot the contribution of the investigated major contributor, relative to the total contribution of all contributors, and note that the resulting graphs show a good qualitative similarity to those in Figure 12.


Figure 13: Relative contribution of the most prominent contributor(s)

4.3. Application to research data

We have also applied the top-down method as described in this paper to traces collected in the course of a research experiment. The details of those experiments will be reported elsewhere. The set-up was to sample locations in cars to assess adequacy of sampling locations as well as prevalence and persistence of DNA traces in vehicles. In addition to locations within the cars, the clothing of the drivers was also sampled. Each car had at least one regular driver, and other persons served as incidental drivers.


The cars were sampled at various time intervals after such an event: either immediately, a day later or one week later. We refer to [30] for further details of the experiment with regard to the samples taken from clothing (other articles being in preparation). Separate experiments of this type were carried out at the Netherlands Forensic Institute and in Australia at the VPFCS. We report briefly on the first set, the second set having entirely similar results. In total, 471 traces were analyzed for this set, and we compared all of them with all 23 available reference profiles of the participating drivers and sampling volunteers, yielding a matrix with 10,833 LR’s. We used the top-down method going up to three contributors. For the purposes of this paper, it is insightful to consider how the obtained weights of evidence depend only weakly on the complexity of the trace as measured by the total allele count (TAC) over all loci, or by the maximum allele count (MAC) observed on at least one locus. The MAC’s of these traces were (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 12), occurring respectively (5, 12, 53, 102, 141, 78, 37, 26, 10, 6, 1) times; the TAC varied from 0 to 152. In Figure 14 we plot the distribution of LR’s that exceeded 1,000 as a function of the MAC and the TAC. From this Figure we see that traces with a large number of contributors can nonetheless provide very strong evidence in favour of having found a prominent contributor, and also that traces with very few alleles can still be quite informative. For example, the trace with a MAC of 12 yielded two LR’s larger than 1, namely 10^11.8 for the regular driver and 10^8.3 for the incidental driver. These were, apparently, the most prominent contributors of that trace. The fact that their DNA was mixed with that of an unknown number of other persons, possibly four or more others, does not render the trace profile uninformative with regard to the more prominent contributors.
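The two complexity measures used here are straightforward to compute. A minimal sketch, assuming a hypothetical profile representation as a dict mapping locus name to the list of detected alleles:

```python
def mac(profile):
    """Maximum allele count: the largest number of alleles on any one locus."""
    return max((len(alleles) for alleles in profile.values()), default=0)

def tac(profile):
    """Total allele count: detected alleles summed over all loci."""
    return sum(len(alleles) for alleles in profile.values())

# Illustrative (made-up) profile: MAC is 4, TAC is 9.
profile = {"D3S1358": [14, 15, 16, 17], "vWA": [16, 18], "FGA": [20, 22, 23]}
```

Since each diploid contributor carries at most two alleles per locus, a MAC of n implies at least ⌈n/2⌉ contributors; a MAC of 12 thus points to at least six.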
We set a threshold on the likelihood ratio equal to 1,000 to classify an LR as a possible link between person and trace. Almost all such LR’s occurred between a trace and a person where the trace was taken in a car that that person had actually been driving. There were also four cases of LR > 1,000 for a person who was not a driver of the experiment in which the trace was obtained. In three cases plausible explanations exist for the presence of their DNA: e.g., one case was a person who had indeed been in the sampled car. In one case (with Log10 (LR) = 3.1) the match was with a sampling volunteer, but no direct link was found. However, three of these four LR’s could also be computed with the traditional method modeling the whole trace, in which case they were comparable to the top-down LR’s. Thus these LR’s were for minor contributors, which makes their presence more plausible, and also shows that there is no difference in the number of false positives (if there are any at all) between the traditional and top-down approach on these traces.


Figure 14: Box-whisker plots of the distribution of Log10 (LRtop-down ) for likelihood ratios exceeding 1,000 (vertical axis), grouped by (a) maximum allele count (MAC) or (b) total allele count (TAC) of the traces (horizontal axis)

5. Discussion

With this approach we have shown that for the purpose of obtaining evidence in favour of contribution of a person of interest to a trace profile, a likelihood ratio can be assigned without it being necessary to have an estimate of the number of profile contributors before starting the computations. Our top-down approach is a serial one, targeting the contributors in the order of appearance, until one of the following stopping criteria is met: either (i) a sufficiently large LR is found, or (ii) the computations are aborted because they become too complex, i.e., if too many unknown contributors must be modeled, or (iii) if


a too large proportion of the total peak height needs to be taken into account (i.e., when α exceeds a certain value). One can also apply no stopping criterion and analyze the whole trace, if computationally feasible. But even if it is not feasible to analyze the whole trace, one can always query for the first contributors. We have also shown that for the more prominent contributors the likelihood ratios are quite model-independent, by conservatively but - for small top-k, say k ≤ 2 - sufficiently closely and in a computationally advantageous way, reproducing the likelihood ratios from a continuous model. We note that we do not make the assumption that doubling the contribution doubles the peak heights. We would need such an assumption if we wanted to estimate the relative contributions from the α that maximizes LR(Mα , g), but we refrained from doing that. If larger contributions lead to larger peaks, linearly or not, we expect the top-down approach to have the same performance.

A disadvantage is that not all available information is maximally exploited, since the subprofiles themselves are evaluated with a discrete model, and peak heights then no longer play a role. The subprofiles are sets of alleles, modeled as a superposition of partly present profiles of some number of contributors and drop-in alleles. The drop-in facility of the model only needs to make it possible to detect alleles that are not in the genotypes of the contributors, without having to specify a distribution on their peak heights. The notion of drop-in here has nothing to do with drop-in alleles as in blank control samples. Rather, we use the drop-in provision of the discrete model, since not all of the alleles in a subprofile that is evaluated as having k contributors need actually come from precisely k contributors; there may also be alleles of more minor contributors.
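To illustrate the role of the drop-out and drop-in provisions in a discrete model, here is a much-simplified single-locus likelihood. It is not the model of the paper (which integrates over dropout probabilities, uses allele frequencies for drop-in, and so on); it only shows how unexplained alleles can be accommodated without any peak height distribution:

```python
from collections import Counter

def locus_likelihood(observed, genotypes, d, c):
    """Simplified P(observed allele set | contributor genotypes):
    each of the n copies of an allele carried by the modeled contributors
    drops out independently with probability d; each observed allele not
    carried by any modeled contributor counts as a drop-in with probability
    c. (The factor for absence of further drop-ins is omitted for brevity.)
    """
    copies = Counter(a for g in genotypes for a in g)
    p = 1.0
    for allele, n in copies.items():
        if allele in observed:
            p *= 1.0 - d ** n  # at least one of the n copies was detected
        else:
            p *= d ** n        # all n copies dropped out
    for allele in observed:
        if allele not in copies:
            p *= c             # unexplained allele: drop-in
    return p
```

For instance, with one contributor of genotype (14, 15), dropout probability 0.1 and drop-in probability 0.05, observing exactly {14, 15} has probability 0.9 × 0.9 = 0.81, and an extra unexplained allele multiplies this by 0.05.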
Since we work with a discrete model, we need not specify peak heights of drop-in alleles, but only allow for their detection. In this model, drop-in alleles are simply alleles that arise via a mechanism that is independent of the one that leads to the modeled contributors having their alleles detected. They do not necessarily represent laboratory drop-ins in negative controls. As for the choice of drop-in rate, we have used the value c = 0.05 throughout. Other choices would have been possible, or even better; e.g., it might be preferable to integrate over some interval for the drop-in parameter c, just as we carried out integration for the dropout probabilities. However, this would have increased computation time and we do not think it would make a qualitative difference to the results that were obtained, especially not for the largest likelihood ratios.

Continuous models need to have probability distributions for the heights of all types of peaks, including stutter peaks, drop-in peaks, and other artefactual peaks. Factors like degradation and inhibition that may cause peak heights to differ across the loci also need to be taken into account. Different sensitivity across the loci, leading some loci to show alleles of more sample contributors than other loci, requires careful locus-specific peak height modeling. This is less relevant for the top-down method, because rescaling all peak heights by the same factor does not affect the Mα. Indeed, this is why we have not defined the subprofiles in terms of absolute peak heights by moving the detection threshold, but in terms of the heights relative to the total on each locus separately. This choice better deals with different sensitivities and degradation across loci. Of course, different approaches to defining the subsets analogous to our Mα could still be envisaged; for example, one might take a locus-specific approach and take in peaks above a locus-dependent peak height threshold.
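One plausible reading of the per-locus, relative-height construction described above can be sketched as follows (the precise definition of Mα is given in Section 3, not reproduced in this excerpt; the profile representation as a dict of per-locus peak heights is an assumption for illustration):

```python
def subprofile(profile, alpha):
    """Sketch of M_alpha: on each locus, take in peaks from the largest
    down until a fraction alpha of that locus's total height is covered.
    profile: dict locus -> {allele: peak_height}; returns allele sets."""
    m_alpha = {}
    for locus, peaks in profile.items():
        total = sum(peaks.values())
        kept, cum = [], 0.0
        for allele, height in sorted(peaks.items(), key=lambda kv: -kv[1]):
            if cum >= alpha * total:
                break
            kept.append(allele)
            cum += height
        m_alpha[locus] = set(kept)
    return m_alpha
```

Because only height ratios within a locus matter, rescaling all peak heights by a common factor leaves every Mα unchanged, matching the remark above about sensitivity differences across loci.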
Alternatively, one could first normalize the peak heights to sum to one, and then lower the detection threshold. If a continuous interpretation method were applied to the subprofiles, such choices might be more suitable for those models. Perhaps such an approach would even outperform the one presented here. We have not trialled other methods and therefore cannot carry out a comparison. The choice for Mα described here turned out to give results that were in our opinion satisfactory enough, and the rationale behind it is easy to understand.

Analysis settings and the detection threshold are also not relevant for all contributors when employing a top-down approach: for the contributors whose alleles are not affected by these settings, neither are the results of the top-down method. In particular, it is irrelevant whether a stutter filter has been applied when we target the most prominent contributors. Indeed, if stutter peaks are not larger than 5% of their parent peaks, then taking in peaks with α < 95% cannot include peaks that are purely stutter. These peaks may have a stutter component, but they would have been detected in the trace profile also without that component.

We believe that using a top-down approach is quite a natural one, and that it is reminiscent of human judgment. If there is a clear major component to the mixture, its contributor(s) are easily recognizable, without having to know much about the underlying minor contributors, not even exactly how many there are. In DNA casework, sometimes profiles are encountered with many contributors, but only a few major contributors. It is then not uncommon to derive (manually) a profile thought to correspond to the alleles from those most prominent contributors, and subject only that profile to calculations.


The top-down approach in this paper is an objectified and unbiased automated version of this manual approach. In this article we have shown that sometimes not much work has to be done in order to find evidence that is for practical purposes as useful as the LR computed by a full evaluation. A full evaluation may be more time consuming, or even impossible. Apart from practical computational aspects, we believe these results are useful (especially for less statistically trained practitioners) to make a more natural distinction between easy and hard cases. Within forensic laboratories there may be a tendency to disregard profiles that are deemed too complex, since they contain a large number of contributors. This article shows that, even if there are many contributors, it may be possible to find strong evidence in favour of contribution for some of them, but not always for all of them. We believe that it is not the number of profile contributors that is the most important factor determining the complexity of a mixture; rather, we think it is the equality of the mixture proportions and the total amount of DNA in the sample. The less DNA there is, and the more even the mixture proportions are, the less it is possible to target the contributors individually, and hence sequentially. Contributors who have a contribution similar to that of others are more difficult to find evidence for. This hinders the top-down approach, but the lack of information in peak heights also hinders a continuous model. Therefore we would describe equality of mixture proportions as a complexity measure that is inherent to the problem. The most minor contributors are generally harder to distinguish, and finding evidence in favour of their contribution requires much more subtle modeling than for the major components. Our method does not aim to find these; it aims to find the prominent contributors requiring the least effort.
We have shown in Section 4 that this is in practice very well possible, even if there are many profile contributors. We conclude by reflecting on the applicability of the top-down method for reporting. Deciding when a method may be regarded as validated is a difficult problem, which can be approached from many angles. One approach is to establish types of profiles that the method can be applied to, e.g., a maximum number of contributors or a minimal contribution. Since these are all unknown in casework, we would reason along other lines. We have seen in this article that if the top-down method finds evidence in favour of contribution for a person of interest who actually is a contributor, then these LR’s are generally slightly conservative with respect to those of the compared continuous model. The larger the LR, the closer it is to that of the continuous model. For smaller LR’s, the top-down method may yield an LR that is larger than the one of the continuous model. But these LR’s come from more minor contributors, where we have less reason to treat this particular continuous model as ground truth, if we would accept the existence of one at all. For non-contributors, the distributions of LR’s larger than one are comparable. We also keep in mind that in general, modeling choices will affect the ability to find evidence for actual contributors much more than the probabilities to find evidence for a non-contributor. Indeed, whereas the latter are bounded by (3.2), the ability to find evidence for persons of interest who are actual contributors is determined by the model choices. Therefore, we would rather say that we consider a range of LR’s to be validated: looking at Figure 11, it is reasonable to say that if the LR with the top-down method exceeds, say, 10^10 on a trace typed on the same loci as those in Figure 11, then this LR may be used without recomputation by a more refined model.
Based on the results in this study, the Netherlands Forensic Institute currently uses the top-down method as one of its tools for likelihood ratio calculations, alongside DNAStatistX (cf. [31]). A standard application of the top-down method is to trace profiles that (may) contain more than four contributors, especially when searching for the top-2 contributors. The resulting likelihood ratio is then used for reporting. Its precise value is usually not needed, since the NFI reports all LR’s greater than 10^9 simply as “at least one billion”. Another application is to quickly screen all combinations of traces and references where both lists contain many (a few hundred) items and the traces may be complex, as we did in Section 4.3.

References

[1] R. Cowell, T. Graversen, S. Lauritzen, J. Mortera, Analysis of forensic DNA mixtures with artefacts, Journal of the Royal Statistical Society. Series C: Applied Statistics 64 (1) (2015) 1 – 48.
[2] Ø. Bleka, G. Storvik, P. Gill, EuroForMix: An open source software based on a continuous model to evaluate STR DNA profiles from a mixture of contributors with artefacts, Forensic Science International: Genetics 21 (2016) 35 – 44.
[3] R. Puch-Solis, L. Rodgers, A. Mazumder, S. Pope, I. Evett, J. Curran, D. Balding, Evaluating forensic DNA profiles using peak heights, allowing for multiple donors, allelic dropout and stutters, Forensic Science International: Genetics 7 (2013) 555 – 563.
[4] C. Steele, M. Greenhalgh, D. Balding, Evaluation of low-template DNA profiles using peak heights, Statistical Applications in Genetics and Molecular Biology 15 (5) (2016) 431 – 445.
[5] S. Manabe, C. Morimoto, Y. Hamano, S. Fujimoto, K. Tamaki, Development and validation of open-source software for DNA mixture interpretation based on a quantitative continuous model, PLOS ONE 12 (2017) 1 – 18.
[6] D. Taylor, J. Bright, J. Buckleton, The interpretation of single source and mixed DNA profiles, Forensic Science International: Genetics 7 (5) (2013) 516 – 528.
[7] M. W. Perlin, M. M. Legler, C. E. Spencer, J. L. Smith, W. P. Allan, J. L. Belrose, B. W. Duceman, Validating TrueAllele DNA mixture interpretation, Journal of Forensic Sciences 56 (6) (2011) 1430 – 1447.
[8] H. Swaminathan, A. Garg, C. M. Grgicak, M. Medard, D. S. Lun, CEESIt: A computational tool for the interpretation of STR mixtures, Forensic Science International: Genetics 22 (2016) 149 – 160.
[9] R. Cowell, Computation of marginal distributions of peak-heights in electropherograms for analysing single source and mixture STR DNA samples, Forensic Science International: Genetics 35 (2018) 164 – 168.
[10] R. Cowell, A unifying framework for the modelling and analysis of STR DNA samples arising in forensic casework, arXiv:1802.09863 (2018).
[11] Y. You, D. Balding, A comparison of software for the evaluation of complex DNA profiles, Forensic Science International: Genetics 40 (2019) 114 – 119.
[12] J. Buckleton, J.-A. Bright, K. Cheng, B. Budowle, M. Coble, NIST interlaboratory studies involving DNA mixtures (MIX13): A modern analysis, Forensic Science International: Genetics 37 (2018) 172 – 179.
[13] E. Alladio, M. Omedei, S. Cisana, G. D. Amico, D. Caneparo, M. Vincenti, P. Garofano, DNA mixtures interpretation – a proof-of-concept multi-software comparison highlighting different probabilistic methods’ performances on challenging samples, Forensic Science International: Genetics 37 (2018) 143 – 150.
[14] C. Steele, D. Balding, Statistical evaluation of forensic DNA profile evidence, Annual Review of Statistics and Its Application 1 (2014) 361 – 384.
[15] K. Slooten, Accurate assessment of the weight of evidence for DNA mixtures by integrating the likelihood ratio, Forensic Science International: Genetics 27 (2017) 1 – 16.
[16] R. Cowell, T. Graversen, S. Lauritzen, J. Mortera, Analysis of forensic DNA mixtures with artefacts, Journal of the Royal Statistical Society Series C (with discussion) 64 (1) (2014) 1 – 48.
[17] Ø. Bleka, C. Benschop, G. Storvik, P. Gill, A comparative study of qualitative and quantitative models used to interpret complex STR DNA profiles, Forensic Science International: Genetics 25 (2016) 85 – 96.
[18] K. Slooten, T. Egeland, Exclusion probabilities and likelihood ratios with applications to mixtures, International Journal of Legal Medicine 130 (1) (2016) 39 – 57.
[19] H. Haned, L. Pène, J. Lobry, A. Dufour, D. Pontier, Estimating the number of contributors to forensic DNA mixtures: does maximum likelihood perform better than maximum allele count?, Journal of Forensic Sciences 56 (1) (2011) 23 – 28.
[20] H. Haned, K. Slooten, P. Gill, Exploratory data analysis for the interpretation of low template DNA mixtures, Forensic Science International: Genetics 6 (6) (2012) 762 – 774.
[21] C. Benschop, H. Haned, L. Jeurissen, P. Gill, T. Sijen, The effect of varying the number of contributors on likelihood ratios for complex DNA mixtures, Forensic Science International: Genetics 19 (2015) 92 – 99.
[22] T. Bille, S. Weitz, J. Buckleton, J.-A. Bright, Interpreting a major component from a mixed DNA profile with an unknown number of minor contributors, Forensic Science International: Genetics 40 (2019) 150 – 159.
[23] K. Slooten, A. Caliebe, Contributors are a nuisance (parameter) for DNA mixture evidence evaluation, Forensic Science International: Genetics 37 (2018) 116 – 125.
[24] J. Buckleton, J.-A. Bright, K. Cheng, H. Kelly, D. Taylor, The effect of varying the number of contributors in the prosecution and alternate propositions, Forensic Science International: Genetics 38 (2019) 225 – 231.
[25] C. Benschop, T. Sijen, LoCIM-tool: An expert’s assistant for inferring the major contributor’s alleles in mixed consensus DNA profiles, Forensic Science International: Genetics 11 (2014) 154 – 165.
[26] K. Slooten, The information gain from peak height data in DNA mixtures, Forensic Science International: Genetics 36 (2018) 119 – 123.
[27] P. J. Green, J. Mortera, Paternity testing and other inference about relationships from DNA mixtures, Forensic Science International: Genetics 28 (2017) 128 – 137.
[28] T. Graversen, J. Mortera, G. Lago, The Yara Gambirasio case: Combining evidence in a complex DNA mixture case, Forensic Science International: Genetics 40 (2019) 52 – 63.
[29] C. Benschop, A. Nijveld, F. Duijs, T. Sijen, An assessment of the performance of the probabilistic genotyping software EuroForMix: Trends in likelihood ratios and analysis of Type I & II errors, Forensic Science International: Genetics 42 (2019) 31 – 38.
[30] T. D. Wolff, L. Aarts, M. van den Berge, T. Boyko, R. van Oorschot, M. Zuidberg, B. Kokshoorn, Prevalence of DNA in vehicles: linking clothing of a suspect to car occupancy, Australian Journal of Forensic Sciences (2019) 1 – 4.
[31] C. Benschop, et al., DNAxs/DNAStatistX: Development and validation of a software suite for the data management and probabilistic interpretation of DNA profiles, Forensic Science International: Genetics 42 (2019) 81 – 89.
