Evaluation of DNA match probability in criminal case


Forensic Science International 116 (2001) 139-148

Jae Won Lee a,*, Hye-Seung Lee b, Mira Park c, Juck-Joon Hwang b

a Department of Statistics, Korea University, 5-1 Anamdong, Sungbuk-gu, Seoul, South Korea
b Department of Legal Medicine, Korea University, 126-1 Anamdong, Sungbuk-gu, Seoul, South Korea
c Department of Pre-Medicine, Eulji Medical College, 143-5 Yongdodong, Chung-gu, Taejeon, South Korea

Received 20 March 2000; received in revised form 30 May 2000; accepted 26 July 2000

Abstract

The new emphasis on quantification of evidence has led to perplexing courtroom decisions, and it has been difficult for forensic scientists to pursue logical arguments. In particular, when DNA evidence is evaluated, both the genetic relationship between the two compared persons and the examined locus system should be considered, but this point has not yet drawn much attention. In this paper, we suggest calculating the match probability by using the coancestry coefficient when a family relationship is considered, and we compare the performance of the resulting identification values, which depend on how the match probability is calculated, under various situations. © 2001 Elsevier Science Ireland Ltd. All rights reserved.

Keywords: Identification using DNA evidence; Match probability; Coancestry coefficient; Genetic locus system

1. Introduction

Forensic science is experiencing a period of rapid change because of the dramatic evolution of DNA profiling. One consequence of this new technology is that the forensic scientist must express numerically the power of the DNA evidence presented in court. Identification using DNA has therefore begun to involve the evaluation of the DNA evidence rather than the simple exclusion of non-matching persons. Evaluation is necessary because there is a chance that two persons have DNA profile patterns (i.e. genetic types) that match at the examined loci.

This research was supported by the Korea Science and Engineering Foundation grant, 1998.
* Corresponding author. Tel.: +82-2-3290-2237; fax: +82-2-924-9895. E-mail address: [email protected] (J.W. Lee).

0379-0738/01/$ - see front matter © 2001 Elsevier Science Ireland Ltd. All rights reserved. PII: S0379-0738(00)00356-X

DNA evidence is evaluated by calculating the weight of the evidence. Given two profiles that could plausibly be observations from the same individual, one from the crime scene and one from the suspect, the forensic scientist is charged with providing the weight of the DNA evidence. The weight is expressed as the likelihood ratio, which is the ratio of the probability of the DNA evidence assuming that the evidence sample came from the suspect to the corresponding probability assuming that the evidence sample came from another person. In the calculation of the likelihood ratio, the probability assuming that the evidence sample came from another person is called the match probability, and it depends on the genetic relation between the suspect and the criminal. However, in most cases, the match probability has been calculated assuming that the suspect is not related to the person who left the crime sample, and thus the calculated likelihood ratio tends to overestimate the DNA evidence. For this reason, an alternative method considering the genetic relation within the same subpopulation was proposed [1], and an approach using a distance metric for the case in which the suspect might be a relative of the criminal was proposed [2]. In addition, the case in which the suspect might be a brother of the criminal was analysed [3,4]. However, since it is not clear whether two persons belong to the same subpopulation or have a family relationship, the analyst in practice calculates the match probability assuming that the genetic relationship between the two compared persons is given.

In this paper, using the coancestry coefficient, which is the usual measure of relatedness for two individuals, we describe a method for calculating the match probability in a case where the criminal may be one of the relatives of the suspect. Since an individual whose relationship is more distant than the third degree can usually be regarded as unrelated, we consider only relationships that are not more distant than the third degree. Relatedness between the compared persons arises more frequently in paternity cases, and we have described a method for the paternity case in which the alleged father is a relative of the true father [9]. We also compare the likelihood ratios derived by the method we describe with those from the conventional methods for various types of genetic locus system. To do this, we generated simulated data and examined the observed frequency of false matches in each of the genetic relationships.

We describe the conventional methods for calculating the match probability in Section 2 and propose a method in Section 3. In Section 4, through simulation, we compare the performance of these methods for various types of genetic locus system and genetic relationship. Finally, further considerations for identification are discussed in Section 5.

2. Conventional methods

Generally, to express the likelihood ratio (LR) numerically for identification, we set up two competing hypotheses, $H_d$ and $H_p$, and calculate the probability of the data under each. Here $H_d$ is the defence proposition and $H_p$ is the prosecution proposition:

$H_d$: Some other person left the crime sample.
$H_p$: The suspect left the crime sample.

When $E$ and $I$ denote the DNA evidence and the non-DNA evidence, the LR is $P(E \mid H_p, I)/P(E \mid H_d, I)$, where $E = (G_S, G_C)$ and $G_S$ and $G_C$ denote the DNA typing results (genotypes) of the suspect and the criminal, respectively. The probability of the DNA evidence assuming that the evidence sample came from the suspect, $P(E \mid H_p, I)$, is equal to 1 when the laboratory is assumed to be error-free, so

$$\mathrm{LR} = \frac{P(E \mid H_p, I)}{P(E \mid H_d, I)} = \frac{1}{P(E \mid H_d, I)}, \tag{1}$$

and thus the LR is determined by the probability assuming that the evidence sample came from another person, $P(E \mid H_d, I)$, which is called the match probability.

When we intend to identify the suspect by using the DNA evidence, we must consider all possible types of criminals to calculate the match probability. Such situations can be divided according to the genetic relationship between the suspect and the criminal. That is, when the genetic relation is considered, "$H_d$: some other person left the crime sample" can be divided into "the suspect is a relative of the criminal" and "the suspect is not a relative of the criminal", and thus

$$P(E \mid H_d, I) = P(E \mid R)P(R) + P(E \mid R^C)P(R^C), \tag{2}$$

where $R$ means that the suspect is a relative of the criminal and $R^C$ means that the suspect is not a relative of the criminal. The calculation of $P(E \mid H_d, I)$ also depends on whether the matched genotype is homozygous or heterozygous. The following two methods are conventionally used for the calculation of $P(E \mid H_d, I)$ in Europe and elsewhere.

2.1. Method 1: the suspect is not related to the criminal

In Eq. (2), if we assume that the suspect is not related to the criminal, then $P(R) = 0$ (so $P(R^C) = 1$) and the match probability is $P(E \mid H_d, I) = P(E \mid R^C)P(R^C)$. This means that only the genotype of the criminal is used for the DNA evidence, $E$. By the product rule,

$$P(E \mid H_d, I) = \begin{cases} p_i^2, & G_C = G_S = A_iA_i, \\ 2p_ip_j, & G_C = G_S = A_iA_j, \end{cases} \tag{3}$$

where $p_i$ is the reference proportion for allele $A_i$. As this equation shows, the match probability depends only on the allele frequencies in the reference population.
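As a minimal sketch of Eqs. (1) and (3), the Python snippet below computes the product-rule match probability at a single locus and the corresponding likelihood ratio. The function names and the allele proportions in the usage lines are illustrative assumptions, not values from this study.

```python
from typing import Optional


def match_prob_unrelated(p_i: float, p_j: Optional[float] = None) -> float:
    """Eq. (3): product-rule match probability at one locus.

    p_i, p_j are reference allele proportions; pass only p_i for a
    homozygous match (A_i A_i), both for a heterozygous match (A_i A_j).
    """
    if p_j is None:
        return p_i ** 2          # homozygote: p_i^2
    return 2.0 * p_i * p_j       # heterozygote: 2 p_i p_j


def likelihood_ratio(match_prob: float) -> float:
    """Eq. (1): LR = 1 / P(E | Hd, I), assuming an error-free laboratory."""
    return 1.0 / match_prob


# Hypothetical allele proportions, chosen only for illustration.
lr_hom = likelihood_ratio(match_prob_unrelated(0.1))        # 1 / 0.01
lr_het = likelihood_ratio(match_prob_unrelated(0.1, 0.2))   # 1 / 0.04
```

Because the loci are treated as independent later in the paper, per-locus values of this kind would be multiplied (or their logarithms summed) to obtain the LR for a whole profile.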


2.2. Method 2: the criminal belongs to the same subpopulation as the suspect

A method that considers the genetic relation within the population has been proposed [1]. In Eq. (2), if it is assumed that the suspect is related to the criminal, $P(R^C) = 0$ and $P(E \mid H_d, I) = P(E \mid R)P(R)$. For $R$, to estimate the genetic relation in a population, we should consider both the relationship between alleles of different individuals in one subpopulation and that in different subpopulations. Thus, when the criminal belongs to the same subpopulation as the suspect,

$$P(E \mid H_d, I) = \begin{cases} \dfrac{[2\theta + (1-\theta)p_i][3\theta + (1-\theta)p_i]}{(1+\theta)(1+2\theta)}, & G_C = G_S = A_iA_i, \\[2ex] \dfrac{2[\theta + (1-\theta)p_i][\theta + (1-\theta)p_j]}{(1+\theta)(1+2\theta)}, & G_C = G_S = A_iA_j, \end{cases} \tag{4}$$

where the quantity $\theta$ describes the relationship of alleles within the subpopulation relative to that among subpopulations, and $p_i$ is the reference proportion for allele $A_i$. With $\theta = 0$, Eq. (4) reduces to Eq. (3). In this method, $\theta$ should be estimated in proportion to the variance of allele frequency among the subpopulations, but its estimation is not yet practical because each scientist defines the subpopulation in his or her own way. For this reason, studies on the practical estimation of $\theta$ have continued [5-8], and they report that the estimates of $\theta$ are generally less than 0.05. However, the 1996 National Research Council (NRC) report suggested values in the range 0.01-0.03, and these have been used tentatively. In this paper, we use $\theta = 0.02$.

3. A new method considering the family relationship of suspect and criminal

We describe a method for the case in which the suspect might be a relative of the criminal. Although a method using a distance metric that takes measurement error into account has been proposed [2], we propose a method using the coancestry coefficient without considering measurement error. The method for the paternity case in which the alleged father is a relative of the true father was described in [9]. As in Method 2 of Section 2, if we assume that the suspect is related to the criminal, $P(E \mid H_d, I) = P(E \mid R)P(R)$. For $R$, we consider the relationships that are not more distant than the third degree, because an individual whose relationship is more distant than the third degree can usually be regarded as unrelated. The following relationships are considered:

R1: The suspect is a sib of the criminal.
R2: The suspect is an uncle of the criminal.
R3: The suspect is a first cousin of the criminal.
R4: The suspect is a first cousin once removed of the criminal.
R5: The suspect is a second cousin of the criminal.
R6: The suspect is a second cousin once removed of the criminal.
R7: The suspect is a third cousin of the criminal.

When we consider the above seven possible relationships, recalling that $E = (G_S, G_C)$, where $G_S$ and $G_C$ denote the DNA typing results (genotypes) of the suspect and the criminal, respectively, Eq. (2) becomes

$$P(E \mid H_d, I) = P(E \mid R)P(R) = \sum_{l=1}^{7} P(E \mid R_l)P(R_l) = \sum_{l=1}^{7} P(G_C \mid G_S, R_l)P(R_l). \tag{5}$$
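Eq. (5) is a weighted sum over the seven relationships. The sketch below, which is an illustration rather than the authors' code, evaluates that sum in exact fractions for the coancestry form given later in this section (Eqs. (6) and (7)), taking $\theta_{CS}$ and $\Delta_{C+S}$ from Table 1. The weights 1:2:4:4:8:8:16 (normalised by 43) are an assumption inferred from the rounded $P(R_l)'$ column of Table 1, and the helper names are hypothetical.

```python
from fractions import Fraction as F

# (theta_CS, Delta_C+S) for R1..R7, as listed in Table 1.
RELATIONS = [(F(1, 4), F(1, 8))] + [(F(1, 2 ** k), F(0)) for k in range(3, 9)]

# P(R_l)': assumed exact form of the rounded Table 1 values (0.023, ..., 0.372).
PRIOR_1 = [F(w, 43) for w in (1, 2, 4, 4, 8, 8, 16)]
# P(R_l)'': equal weights 1/7, as in Table 1.
PRIOR_2 = [F(1, 7)] * 7


def mixture_coefficients(priors):
    """Collapse Eq. (5) into coefficients of the Eq. (6)/(7) terms.

    Returns the weights multiplying
      hom_cross: p_i(1-p_i)    hom_sq: (1-p_i)^2
      het_prod:  p_i p_j       het_sum: (p_i+p_j)    het_const: 1
    (the p_i^2 term keeps coefficient 1 because the priors sum to 1).
    """
    coef = dict.fromkeys(
        ("hom_cross", "hom_sq", "het_prod", "het_sum", "het_const"), F(0)
    )
    for (theta, delta), w in zip(RELATIONS, priors):
        coef["hom_cross"] += w * 4 * theta
        coef["hom_sq"] += w * 2 * delta
        coef["het_prod"] += w * 2 * (1 - 4 * theta + 2 * delta)
        coef["het_sum"] += w * 2 * (theta - delta)
        coef["het_const"] += w * 2 * delta
    return coef


coef_1 = mixture_coefficients(PRIOR_1)  # weights P(R_l)'
coef_2 = mixture_coefficients(PRIOR_2)  # weights P(R_l)''
```

Under these assumptions the sums are 9/86, 1/172, 155/86, 2/43 and 1/172 for $P(R_l)'$, and 127/448, 1/28, 337/224, 95/896 and 1/28 for $P(R_l)''$, which are the coefficients that appear in Eqs. (8) and (9).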

To evaluate Eq. (5), we use the coancestry coefficient between the suspect and the assumed criminal in each relationship, $R_l$. $P(R_l)$ can be determined in two different ways. First, $P(R_l)'$ is determined by the expected number of relatives in each relationship, assuming that the number of offspring per family is Poisson distributed [4]. Second, $P(R_l)''$ is determined by considering the probability of the genotype given that a person in the relationship $R_l$ with the criminal is the suspect, $P(G_S \mid R_l)$.

Let $\theta_{CS}$ be the probability that two alleles, one taken at random from each of $C$ and $S$, are identical by descent (IBD, the event that the two alleles are copies of the same ancestral allele), and let $\Delta_{C+S}$ be the average of the two probabilities that $C$ and $S$ have two pairs of IBD alleles. In this notation, $\theta_{CS}$ is the usual measure of relatedness for individuals $C$ and $S$, and it is called the coancestry coefficient. When $p_i$ and $p_j$ denote the frequencies of alleles $A_i$ and $A_j$, if the matched genotype is homozygous ($G_C = G_S = A_iA_i$),

$$P(G_C \mid G_S, R) = p_i^2 + 4\theta_{CS}\,p_i(1-p_i) + 2\Delta_{C+S}(1-p_i)^2, \tag{6}$$

and if the matched genotype is heterozygous ($G_C = G_S = A_iA_j$, $i \neq j$),

$$P(G_C \mid G_S, R) = 2(1 - 4\theta_{CS} + 2\Delta_{C+S})\,p_ip_j + 2(\theta_{CS} - \Delta_{C+S})(p_i + p_j) + 2\Delta_{C+S}. \tag{7}$$

For the various types of relationship ($R_l$, $l = 1, \ldots, 7$), $P(G_C \mid G_S, R_l)$ and $P(R_l)$, which depend on the values of $\theta_{CS}$ and $\Delta_{C+S}$, are shown in Table 1. Therefore, the match probability $P(E \mid H_d, I)$ can be obtained as follows:

$$P(E \mid H_d, I) = \begin{cases} p_i^2 + \dfrac{9}{86}\,p_i(1-p_i) + \dfrac{1}{172}(1-p_i)^2, & G_C = G_S = A_iA_i, \\[2ex] \dfrac{155}{86}\,p_ip_j + \dfrac{2}{43}(p_i + p_j) + \dfrac{1}{172}, & G_C = G_S = A_iA_j, \end{cases} \tag{8}$$

where $P(R_l)'$ is used, and

$$P(E \mid H_d, I) = \begin{cases} p_i^2 + \dfrac{127}{448}\,p_i(1-p_i) + \dfrac{1}{28}(1-p_i)^2, & G_C = G_S = A_iA_i, \\[2ex] \dfrac{337}{224}\,p_ip_j + \dfrac{95}{896}(p_i + p_j) + \dfrac{1}{28}, & G_C = G_S = A_iA_j, \end{cases} \tag{9}$$

where $P(R_l)''$ is used.

In addition, the case in which the suspect might be the brother of the criminal, which was analysed in [3,4], is a special case that considers only $R_1$ among the above relationships. The evidence that the suspect is the brother of the criminal could have an important effect on the evaluation of DNA evidence. In addition, the calculation of the match probability for this case gives the most conservative result, which means that the probability that a person who is not the criminal is falsely identified as the criminal is lowest. The match probability in this case is calculated by

$$P(E \mid H_d, I) = \begin{cases} p_i^2 + p_i(1-p_i) + 0.25(1-p_i)^2, & G_C = G_S = A_iA_i, \\ 0.5\,p_ip_j + 0.25(p_i + p_j) + 0.25, & G_C = G_S = A_iA_j. \end{cases} \tag{10}$$

4. Simulation studies

For identification using DNA evidence, the genetic relationship of the compared persons and the number of loci used should be examined. However, since it is clear neither how to take account of the information on the genetic relationship of two persons nor how many loci should be examined, use of the conventional methods assuming a given relationship of two persons has had some validity. In this section, by using data simulated from the genetic relationship assumed for calculating each match probability, we analyse the DNA evidence both when the same person is compared ($H_p$) and when different persons are compared ($H_d$). To do this, for various types of genetic locus system and for the genetic relationship assumed in calculating each match probability, we compare the observed probabilities of false match and the performance of the methods described above. The former examines the efficiency of the genetic locus system for the relationship, and the latter examines the performance of the
five LR values depending on the calculation of the match probability.

Table 1
Probabilities for each relationship: P(G_C|G_S,R_l), P(R_l)' and P(R_l)''

Relationship  theta_CS  Delta_C+S  P(G_C|G_S,R_l), G_C = G_S = A_iA_i     P(G_C|G_S,R_l), G_C = G_S = A_iA_j, i != j  P(R_l)'  P(R_l)''
R1            1/4       1/8        p_i^2 + p_i(1-p_i) + (1/4)(1-p_i)^2    (1/2)p_ip_j + (1/4)(p_i+p_j) + 1/4          0.023    1/7
R2            1/8       0          p_i^2 + (1/2)p_i(1-p_i)                p_ip_j + (1/4)(p_i+p_j)                     0.047    1/7
R3            1/16      0          p_i^2 + (1/4)p_i(1-p_i)                (3/2)p_ip_j + (1/8)(p_i+p_j)                0.093    1/7
R4            1/32      0          p_i^2 + (1/8)p_i(1-p_i)                (7/4)p_ip_j + (1/16)(p_i+p_j)               0.093    1/7
R5            1/64      0          p_i^2 + (1/16)p_i(1-p_i)               (15/8)p_ip_j + (1/32)(p_i+p_j)              0.186    1/7
R6            1/128     0          p_i^2 + (1/32)p_i(1-p_i)               (31/16)p_ip_j + (1/64)(p_i+p_j)             0.186    1/7
R7            1/256     0          p_i^2 + (1/64)p_i(1-p_i)               (63/32)p_ip_j + (1/128)(p_i+p_j)            0.372    1/7

For the simulations, we used the allele frequencies of nine STR loci (D3S1358, vWA, FGA, D8S1179, D21S11, D18S51, D5S818, D13S317, D7S820) estimated from the Korean population at the Institute of Legal Medicine at Korea University [10], and generated 100 reference populations, each composed of the genotype data for 500 persons. When STR loci are used, the discrimination power of the examined loci might affect the identification result. Discrimination power in criminal cases is estimated by the probability of selecting two persons with the same genotype, which is the probability that two persons are falsely identified. In the analysis of DNA evidence, the discrimination power of the genetic locus system gets higher as the number of examined loci increases, under the assumption of independent alleles within and between loci. Thus, for the various types of genetic locus system, we composed the three-locus system (D18S51, FGA, D8S1179), the five-locus system (the three-locus system + D13S317, D21S11), the seven-locus system (the five-locus system + vWA, D5S818), and the nine-locus system (the seven-locus system + D3S1358, D7S820), in descending order of the discrimination power of each locus.

The five LR values are calculated cumulatively under the independence assumption among the loci in each genetic locus system and are denoted M1, M2, M3, M4 and M5 here. M1 is calculated by Eq. (3) and assumes that the suspect is not related to the criminal, and M2 is calculated by Eq. (4) and assumes that the criminal belongs to the same subpopulation as the suspect. M3 and M4 are based on $P(R_l)'$ and $P(R_l)''$, respectively, assuming that the relationships are not more distant than the third degree. That is, M3 is calculated by Eq. (8) and M4 is calculated by Eq. (9). M5 is calculated by Eq. (10) and assumes that the suspect is a brother of the criminal. For the various types of genetic relationship, we consider the relationship assumed for calculating each match probability. That is, 'not related' for M1, 'same subpopulation' for M2, 'family relationship_1' for M3, 'family relationship_2' for M4 and 'brother' for M5 are considered.

4.1. Observed probability of false match

The observed probability of false match was calculated as the probability that the genotypes of two compared persons match. Table 2 shows the average over 100 reference populations for each genetic relationship, where each population is composed of 500 persons. As shown in Table 2, a larger number of loci gives a smaller observed probability of false match. The difference among the genetic relationships gets larger when a small number of loci is examined. In particular, the observed probability of false match in the 'brother' relationship decreases much more than that in the other relationships as loci are added. In addition, in 'family relationship_1', where the relationships not more distant than the third degree are weighted by the expected number of each relationship, the probability is almost the same as that in 'not related' and 'same subpopulation'.

4.2. Performance of the five log10 LR values

We examined the performance of the five LR values for the same person and for different persons. Each LR value was cumulatively calculated by the method for the genetic relationship in 100 reference
populations composed of 500 persons when the genetic relationship assumed for calculating the match probability is considered.

Table 2
The observed false match probability depending on the locus system when each genetic relationship is considered

Locus system  Suspect and criminal
              Not related  Same subpopulation  Family relationship_1  Family relationship_2  Brother
Three         2.017E-04    2.086E-04           1.777E-04              3.165E-04              1.571E-02
Five          1.199E-04    1.259E-04           1.129E-04              2.211E-04              1.619E-03
Seven         9.523E-05    9.884E-05           9.419E-04              1.826E-04              8.887E-04
Nine          8.577E-05    8.882E-05           8.657E-04              1.665E-04              7.954E-04

From now on, in all tables and figures, we consider the log10 LR values instead of the LR values. In evaluating DNA evidence, the LR value is calculated only when the two compared samples match, and thus we considered only the cases in which the two samples match. Figs. 1 and 2 show the performance of the five methods in the nine-locus system when the same person is compared and when different persons are compared, respectively. As shown in Figs. 1 and 2, the performance of the log10 LR values for the same person does not seem to be much different from that for different persons when each method is compared. However, the distribution of log10 LR values for different persons lies to the left of the distribution for the same person. It seems that M3 gives the largest value and M5 gives the smallest value (M3 > M1 = M2 > M4 > M5).
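To make the five calculations concrete, the following sketch evaluates the per-locus match probabilities behind M1 to M5 (Eqs. (3), (4), (8), (9) and (10)) for one matching profile and accumulates log10 LR over loci under the independence assumption. It is an illustration only: the three-locus profile and its allele proportions are invented, $\theta = 0.02$ follows Section 2.2, Eq. (4) is used in its Balding-Nichols form [1] (heterozygote term $2[\theta + (1-\theta)p_i][\theta + (1-\theta)p_j]$), and the function names are hypothetical.

```python
import math

THETA = 0.02  # subpopulation parameter used in this paper for Eq. (4)

# Coefficients (hom_cross, hom_sq, het_prod, het_sum, het_const) of
#   homozygote:   p_i^2 + hom_cross*p_i*(1-p_i) + hom_sq*(1-p_i)^2
#   heterozygote: het_prod*p_i*p_j + het_sum*(p_i+p_j) + het_const
KIN_COEF = {
    "M3": (9 / 86, 1 / 172, 155 / 86, 2 / 43, 1 / 172),      # Eq. (8)
    "M4": (127 / 448, 1 / 28, 337 / 224, 95 / 896, 1 / 28),  # Eq. (9)
    "M5": (1.0, 0.25, 0.5, 0.25, 0.25),                      # Eq. (10)
}


def match_prob(method, p_i, p_j=None):
    """Per-locus match probability; p_j=None means a homozygous match."""
    if method == "M1":                                   # Eq. (3)
        return p_i ** 2 if p_j is None else 2 * p_i * p_j
    if method == "M2":                                   # Eq. (4)
        t = THETA
        denom = (1 + t) * (1 + 2 * t)
        if p_j is None:
            return (2 * t + (1 - t) * p_i) * (3 * t + (1 - t) * p_i) / denom
        return 2 * (t + (1 - t) * p_i) * (t + (1 - t) * p_j) / denom
    hom_cross, hom_sq, het_prod, het_sum, het_const = KIN_COEF[method]
    if p_j is None:
        return p_i ** 2 + hom_cross * p_i * (1 - p_i) + hom_sq * (1 - p_i) ** 2
    return het_prod * p_i * p_j + het_sum * (p_i + p_j) + het_const


def log10_lr(method, profile):
    """Cumulative log10 LR over loci: sum of -log10(match probability)."""
    return -sum(math.log10(match_prob(method, *locus)) for locus in profile)


# Invented matching profile: one (p_i,) or (p_i, p_j) tuple per locus.
profile = [(0.10, 0.20), (0.15,), (0.05, 0.30)]
llr = {m: log10_lr(m, profile) for m in ("M1", "M2", "M3", "M4", "M5")}
```

For this invented profile the log10 LR decreases from M1 to M5, which is the same direction as the ordering of the methods within each population in Table 4; the numerical values themselves are not comparable with the simulation results.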

Now, we compare the performance of the five log10 LR values in each of the locus systems and the genetic relationships.

4.2.1. Performance of the five log10 LR values in each locus system

Table 3 shows the performance of the five log10 LR values for the same person and for different persons when each genetic locus system is used. To compare the differences among the genetic locus systems, each LR value was calculated by the method assumed in each relationship. As shown in Table 3, a larger number of loci gives larger log10 LR values and also larger variation. Moreover, in each method, the pattern seems to have a significant effect on the log10 LR values. However, the rate of increase of the log10 LR values does not differ notably among the five methods, and it slows as the number of loci used grows.
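The effect of the number of loci can also be seen directly from the definition of discrimination power used in this section, namely the probability that two persons share the same genotype. As a rough sketch, the snippet below computes that probability for two unrelated persons under Hardy-Weinberg proportions and independence between loci (both are assumptions of this example), using invented allele proportions rather than the Korean STR data [10]. The function names are hypothetical.

```python
from itertools import combinations
from math import prod


def locus_match_prob(freqs):
    """Chance that two unrelated persons share a genotype at one locus.

    Sums the squared genotype proportions: p_i^2 for each homozygote
    and 2*p_i*p_j for each heterozygote.
    """
    hom = sum((p * p) ** 2 for p in freqs)
    het = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return hom + het


def system_match_prob(loci):
    """Multiply per-locus probabilities, assuming independent loci."""
    return prod(locus_match_prob(f) for f in loci)


# Invented allele proportions for three loci (each list sums to 1).
loci = [
    [0.5, 0.3, 0.2],
    [0.4, 0.3, 0.2, 0.1],
    [0.25, 0.25, 0.25, 0.25],
]
# Match probability of the one-, two- and three-locus systems.
cumulative = [system_match_prob(loci[:k]) for k in range(1, len(loci) + 1)]
```

Each added locus multiplies the match probability by a factor below 1, so the values in `cumulative` fall as loci are added, in line with the trend reported for the observed false match probability.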

Fig. 1. Performance of the five methods for the same person ($H_p$) using the nine-locus system (LLR = log10 LR, Freq. = P(log10 LR)).

Fig. 2. Performance of the five methods for different persons ($H_d$) using the nine-locus system (LLR = log10 LR, Freq. = P(log10 LR)).

Although the log10 LR values obtained with M1, M2 and M3 differ little, among these three methods the variation of the log10 LR values is larger for the same person than for different persons and is smallest when M3 is used. Of the five methods, M1 and M5 give the largest and the smallest variation, respectively (M1 > M2 > M3 > M4 > M5).

Table 3
The log10 LR values for the same person (Hp) and for the different persons (Hd) depending on the locus system

Method in population         Locus system  Same person (Hp)                  Different persons (Hd)
                                           Mean    Median  S.D.   Range      Mean    Median  S.D.   Range
M1 in not related            Three         4.628   4.489   0.771  7.539      4.189   4.102   0.555  3.510
                             Five          7.274   7.126   1.025  12.00      6.727   6.642   0.770  4.750
                             Seven         9.681   9.554   1.150  14.045     9.134   9.044   0.916  5.917
                             Nine          11.911  11.762  1.307  18.459     11.337  11.212  1.069  6.197
M2 in same subpopulation     Three         4.636   4.563   0.766  5.537      4.155   4.062   0.590  3.715
                             Five          7.219   7.123   1.009  8.355      6.639   6.549   0.873  5.594
                             Seven         9.555   9.463   1.150  10.137     8.943   8.871   0.978  5.712
                             Nine          11.837  11.730  1.330  13.454     11.207  11.052  1.151  7.002
M3 in family relationship_1  Three         4.782   4.787   0.628  3.435      4.413   4.329   0.639  3.058
                             Five          7.477   7.469   0.859  6.194      7.099   7.056   0.903  5.227
                             Seven         9.884   9.854   0.978  7.367      9.491   9.441   0.979  6.162
                             Nine          12.238  12.222  1.166  10.059     11.842  11.858  1.164  8.093
M4 in family relationship_2  Three         4.007   4.011   0.147  0.804      3.991   3.991   0.146  0.755
                             Five          6.340   6.317   0.331  1.831      6.310   6.291   0.334  1.831
                             Seven         8.533   8.525   0.400  2.474      8.508   8.503   0.395  2.347
                             Nine          10.874  10.870  0.481  3.270      10.854  10.864  0.471  3.163
M5 in brother                Three         1.761   1.762   0.025  0.093      1.760   1.762   0.025  0.093
                             Five          2.881   2.881   0.057  0.280      2.879   2.879   0.059  0.279
                             Seven         3.906   3.909   0.071  0.421      3.906   3.906   0.072  0.421
                             Nine          5.029   5.038   0.096  0.557      5.027   5.036   0.101  0.557

4.2.2. Performance of the five log10 LR values in each relationship

Table 4 shows the performance of the five log10 LR values for the same person and for different persons in each genetic relationship when the nine-locus system is used. In identification using DNA evidence, the real relationship between the two compared persons is usually unknown, and thus it might be valid for the analyst to assume that a relationship is given. As shown in Table 4, a closer relationship commonly gives larger log10 LR values for each method, and the variation of the log10 LR values also gets larger. Fig. 3 shows the performance of the five methods for the same person in the population assuming each genetic relationship. In all five genetic relationships, the distribution of log10 LR values lies furthest to the right for M1, followed by M2, M3, M4 and M5. For each method, the log10 LR values in the 'not related' population are smaller than those in the related populations. In all populations, the variation of the log10 LR values is largest when M1 is used. In addition, the log10 LR values get larger in more closely related populations for each method, and the rate of increase is highest when M1 is used (M1 > M2 > M3 > M4 > M5). Thus, as shown in Fig. 3, in the 'brother' population, which is the most closely related population, the distributions of the five log10 LR values are separated from one another.

Table 4
The log10 LR values for the same person (Hp) and for the different persons (Hd) depending on the calculation of match probability in each relationship (nine loci)

Relationship           Method  Same person (Hp)                  Different persons (Hd)
                               Mean    Median  S.D.   Range      Mean    Median  S.D.   Range
Not related            M1      11.911  11.762  1.307  18.459     11.337  11.212  1.069  6.197
                       M2      10.886  10.806  0.938  10.773     10.498  10.452  0.830  4.925
                       M3      10.333  10.274  0.795  8.366      10.020  9.968   0.722  4.105
                       M4      8.288   8.258   0.468  4.500      8.118   8.094   0.438  2.747
                       M5      3.952   3.946   0.145  1.345      3.902   3.900   0.139  0.801
Same subpopulation     M1      13.832  13.516  2.367  33.643     12.708  12.402  1.916  11.568
                       M2      11.837  11.730  1.330  13.454     11.207  11.052  1.151  7.002
                       M3      11.029  10.964  1.045  10.627     10.559  10.452  0.931  5.647
                       M4      8.618   8.590   0.573  5.481      8.375   8.320   0.520  3.163
                       M5      4.037   4.032   0.174  1.716      3.965   3.952   0.159  0.994
Family relationship_1  M1      16.741  16.545  2.998  32.976     15.735  15.622  2.844  19.874
                       M2      13.510  13.454  1.579  13.167     12.957  12.983  1.560  11.178
                       M3      12.238  12.222  1.166  10.059     11.842  11.858  1.164  8.093
                       M4      9.252   9.252   0.616  5.066      9.052   9.069   0.623  4.136
                       M5      4.236   4.237   0.187  1.517      4.178   4.187   0.187  1.209
Family relationship_2  M1      26.728  26.622  3.275  26.043     26.620  26.523  3.223  25.937
                       M2      18.197  18.169  1.494  10.381     18.125  18.115  1.456  9.559
                       M3      15.527  15.513  0.970  6.735      15.485  15.504  0.950  6.666
                       M4      10.874  10.870  0.481  3.270      10.854  10.864  0.471  3.163
                       M5      4.701   4.699   0.154  1.058      4.694   4.697   0.150  1.019
Brother                M1      33.827  33.822  2.550  16.148     33.893  33.845  2.839  16.148
                       M2      21.138  21.143  0.857  5.580      21.093  21.141  0.864  5.580
                       M3      17.628  17.622  0.516  3.241      17.627  17.667  0.552  3.241
                       M4      11.899  11.920  0.257  1.561      11.897  11.921  0.272  1.561
                       M5      5.029   5.038   0.096  0.557      5.027   5.036   0.101  0.537

As shown above, when M1 is used, changes in the genetic relationship or the number of loci give the largest differences. On the other hand, M5, which assumes that the suspect might be a brother of the criminal, is the most insensitive to changes in the genetic relationship or the number of loci. That is, the sensitivity to these changes decreases in the order M1, M2, M3, M4 and M5. As for the number of loci, the analyst can compose the genetic locus system from a fixed number of loci and so control the variation of the LR values, but it is still not clear how to take account of the genetic relationship between the two compared persons. For this reason, when the analyst intends to select a method for calculating the match probability, the sensitivity to the genetic relationship might serve as a criterion for the selection. As shown above, the performance in 'family relationship_1' when the relationship is not
more distant than the third degree is almost the same as that in 'not related' and 'same subpopulation'. But when M3, which assumes 'family relationship_1', is used, the sensitivity to the genetic relationship and the variation of the LR value are smaller. In all situations, use of M1, M2 or M3 gives extremely large LR values, and use of M5 gives small LR values.

Fig. 3. Performance of the five methods for the same person in each genetic relationship (LLR = log10 LR, Freq. = P(log10 LR)).

5. Further considerations

In the process of evaluating DNA evidence, it is usually assumed that the two persons are not genetically related, and a constant decision rule is used regardless of the genetic locus system [11,12]. However, as shown above, a larger number of loci gives larger LR values, and when two related persons are compared, use of M1, which assumes that the two persons are not related, overestimates the LR value the most. In addition, the use of hypervariable loci such as STRs has lowered the expected value of false match, and thus there is a growing tendency to regard the suspect as the criminal merely because the two samples match in the comparison of genotypes. However, considering the seriousness of the error of falsely incriminating an innocent suspect, relying only on the fact that the two genotypes match, and ignoring the evidence even when a genetic relationship between the two persons is asserted, could be rather unfavourable to the suspect.

In this paper, we suggested calculating the match probability by using the coancestry coefficient when a family relationship is considered, and we compared this approach with some conventional approaches in simulated populations under various types of locus system and genetic relationship. When DNA evidence is interpreted in court, one of the most pervasive distortions is that the genetic relation of the two compared persons is ignored, and this has been studied [13]. However, as explained in the above sections, the genetic relation of the two compared persons, such as the population and the family relation of the two persons, should not be ignored in the process of evaluating DNA evidence.

References

[1] D.J. Balding, R.A. Nichols, DNA profile match probability calculation: how to allow for population stratification, relatedness, database selection and single bands, Forensic Sci. Int. 64 (1994) 125-140.
[2] T.R. Belin, D.W. Gjertson, M.-Y. Hu, Summarizing DNA evidence when relatives are possible suspects, J. Am. Statist. Assoc. 92 (1997) 706-716.
[3] I.W. Evett, Evaluating DNA profiles in a case where the defence is "It was my brother", J. Forensic Sci. Soc. 32 (1992) 5-14.
[4] J.F.Y. Brookfield, The effect of relatives on the likelihood ratio associated with DNA profile evidence, J. Forensic Sci. Soc. 34 (1994) 193-197.
[5] D.J. Balding, R.A. Nichols, Significant genetic correlations among Caucasians at forensic DNA loci, Heredity 78 (1997) 583-589.
[6] L.A. Foreman, A.F.M. Smith, I.W. Evett, Bayesian analysis of deoxyribonucleic acid profiling data in forensic identification applications, J. R. Statist. Soc., Ser. A 160 (1997) 429-469.
[7] K. Roeder, M. Escobar, J.B. Kadane, I. Balazs, Measuring heterogeneity in forensic databases using hierarchical Bayes models, Biometrika 85 (2) (1998) 269-287.
[8] L.A. Foreman, J.A. Lambert, I.W. Evett, Regional genetic variation in Caucasians, Forensic Sci. Int. 95 (1998) 27-37.
[9] J.W. Lee, H.S. Lee, M. Park, J.J. Hwang, Paternity probability when a relative of father is an alleged father, Sci. Justice 39 (4) (1999) 223-230.
[10] G.R. Han, Y.W. Lee, H.L. Lee, S.M. Kim, T.W. Koo, I.H. Kang, H.S. Lee, J.J. Hwang, A Korean population study of the 9 STR loci FGA, VWA, D3S1358, D18S51, D21S11, D8S1179, D7S820, D13S317 and D5S818, Int. J. Legal Med., 1999, in press.
[11] B. Devlin, N. Risch, K. Roeder, Forensic inference from DNA fingerprints, J. Am. Statist. Assoc. 87 (1992) 337-349.
[12] I.W. Evett, B.S. Weir, Interpreting DNA Evidence, Sinauer Associates, Inc., 1992.
[13] R. Lempert, Some caveats concerning DNA as criminal identification evidence: with thanks to the Reverend Bayes, Cardozo Law Rev. 13 (1991) 303, 308-309.