Forensic Science International: Genetics 2 (2008) 159–165 www.elsevier.com/locate/fsig
FaSTR DNA: A new expert system for forensic DNA analysis

Timothy Power, Brendan McCabe, Sally Ann Harbison *

Institute of Environmental Science and Research Ltd., Mt Albert Science Centre, Private Bag 92 021, Auckland, New Zealand

Received 2 August 2007; received in revised form 17 October 2007; accepted 15 November 2007
Abstract

The automation of DNA profile analysis of reference and crime samples continues to gain pace, driven in part by a realisation by the criminal justice system of the positive impact DNA technology can have in aiding the solution of crime and the apprehension of suspects. Expert systems to automate the profile analysis component of the process are beginning to be developed. In this paper, we report the validation of a new expert system, FaSTR DNA, suitable for the analysis of DNA profiles from single source reference samples and from crime samples. We compare the performance of FaSTR DNA with that of other equivalent systems, GeneMapper™ ID v3.2 (Applied Biosystems, Foster City, CA) and FSS-i3 v4 (The Forensic Science Service® DNA Expert System Suite FSS-i3, Forensic Science Service, Birmingham, UK), and with that of GeneScan® Analysis v3.7/Genotyper® v3.7 software (Applied Biosystems, Foster City, CA, USA) with manual review. We have shown that FaSTR DNA provides an alternative solution to automating DNA profile analysis and is appropriate for implementation into forensic laboratories. The FaSTR DNA system was demonstrated to be comparable in performance to GeneMapper™ ID v3.2 and superior to FSS-i3 v4 for the analysis of DNA profiles from crime samples.

© 2007 Elsevier Ireland Ltd. All rights reserved.

Keywords: Automation; Expert system; DNA profile
1. Introduction

The processing of forensic DNA samples and the interpretation of DNA profile data require significant resources, both in terms of equipment and in highly trained personnel. The development of robotic equipment to automate the extraction of DNA from forensic samples [1–3], quantitation of the samples [4] and amplification [5], together with multi-capillary electrophoresis instrumentation, has shifted this emphasis to the data analysis stage. Typically, DNA profile analysis is undertaken by at least two scientists using GeneScan®/Genotyper® software (Applied Biosystems, Foster City, CA, USA) and includes a manual review step. More recently, this software has been supplemented by GeneMapper™ ID v3.2 (Applied Biosystems, Foster City, CA, USA), an expert system that reduces scientist intervention. Any discrepancies can then be reviewed. The process of determining a DNA profile from a set of electrophoretic data includes direct measurements such as the number of relative fluorescence units (rfu) associated with a
* Corresponding author. Tel.: +64 9 8153969; fax: +64 9 8496046. E-mail address:
[email protected] (S.A. Harbison). 1872-4973/$ – see front matter © 2007 Elsevier Ireland Ltd. All rights reserved. doi:10.1016/j.fsigen.2007.11.007
given data point as well as the application of empirically determined rules or guidelines such as stutter proportions [6,7] and homozygote peak determinations for each laboratory system [8]. This complex process is therefore amenable to software-driven automation, provided that the rule sets have been determined. An expert system is defined in the National DNA Index System (NDIS) DNA data acceptance standards appendix B(3) (2004) [9] as a software program or set of programs designed to replace one or both of the manual review processes. Certain criteria need to be met: an expert system should identify peaks/bands and assign alleles without human intervention, ensure data meets laboratory-defined quality checks, describe the rationale behind decisions and not make incorrect allele calls. Expert systems have been developed in order to automate this part of the DNA process as much as possible [10–12], thus reducing the amount of time taken to analyse a large number of DNA profiles. We have developed an alternative system, FaSTR DNA, that meets the criteria for an expert system. In this paper we describe the configuration and validation of FaSTR DNA in relation to the criteria described. We compare the performance of FaSTR DNA v2.1 with that of FSS-i3 v4 (The Forensic Science Service® DNA Expert System Suite FSS-i3,
T. Power et al. / Forensic Science International: Genetics 2 (2008) 159–165
Forensic Science Service, Birmingham, UK) and GeneMapper™ ID v3.2 (Applied Biosystems, Foster City, CA, USA). All profiles were also analysed with GeneScan® Analysis v3.7/Genotyper® v3.7 with accompanying manual review.

2. Materials and methods

Samples for this study were selected to encompass a range of profiles (from “good” to “poor”) that might be encountered in forensic work. Seventy-one single source reference samples and 73 crime sample DNA profiles were used to determine the optimum analysis settings for FaSTR DNA and FSS-i3. A total of 1013 DNA profiles (810 single source reference samples and 203 crime samples) representing a range of profiles and allele types were selected for subsequent evaluation. The single source reference samples were buccal samples transferred to FTA card on collection. The crime samples comprised 63 blood stains, 79 trace samples (cellular material on clothing or touched items), 25 semen stains, 16 cigarette butts, 16 single hairs and 4 bone and tissue samples.

The DNA profiles included full single source DNA profiles, full single source DNA profiles showing low levels of a second contributor, DNA profiles showing degradation and/or inhibition, low level DNA profiles, DNA profiles of two, three and four person mixtures at various ratios, DNA profiles showing pronounced peak imbalance, DNA profiles exhibiting enhanced stutter, DNA profiles that were over-amplified and DNA profiles with rare or off-ladder alleles. The rare and off-ladder alleles included two instances of primer binding site mutations at the D8S1179 locus, two somatic mutations at the D18S51 locus, one somatic mutation at the vWA locus, three examples of off-ladder D3S1358 20/20.1 alleles, one TH01 8.3 allele, two D19S433 12.1 alleles and one example of a peak that fell between the D2S1338 and D16S539 loci and could not be unambiguously assigned.
Samples were extracted using standard forensic extraction methods [13–15], quantitated using the QuantiBlot® human DNA quantification kit (Applied Biosystems, Foster City, CA, USA) and amplified using AMPFlSTR® SGM Plus™ PCR amplification kits [16] (Applied Biosystems, Foster City, CA, USA). Samples of amplified DNA were separated using an ABI Prism® 3130 Genetic Analyser with 16 capillaries in a 96-well plate format (Applied Biosystems, Foster City, CA, USA).

3. DNA profile analysis

Samples were analysed using GeneScan® Analysis v3.7/Genotyper® v3.7 and GeneMapper™ ID v3.2 according to the manufacturer’s recommendations. A 50-rfu minimum peak height and a 20% peak filter were used in GeneScan® Analysis v3.7/Genotyper® v3.7. Appropriate allelic peaks not labelled because of the 20% filter, such as those present as a result of a low level mixture, were manually added. Other labels, such as those applied to peaks determined to be a result of “pull-up”, were removed. Other analysis parameters, such as stutter ratios, were as described in the FaSTR DNA analysis description shown below.
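The two thresholds quoted above (a 50-rfu minimum peak height and a 20% peak filter) can be illustrated with a minimal sketch. It assumes that the 20% filter is taken relative to the tallest peak at the same locus; the paper does not spell out the reference peak, and the `Peak` type and `apply_thresholds` function are hypothetical names, not part of GeneScan, Genotyper or FaSTR DNA.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    locus: str
    size_bp: float
    height_rfu: float


def apply_thresholds(peaks, min_height=50.0, filter_fraction=0.20):
    """Split peaks into (kept, filtered) using the two thresholds.

    Assumption: the percentage filter is measured against the tallest
    peak observed at the same locus.
    """
    tallest = {}
    for p in peaks:
        tallest[p.locus] = max(tallest.get(p.locus, 0.0), p.height_rfu)

    kept, filtered = [], []
    for p in peaks:
        passes = (p.height_rfu >= min_height
                  and p.height_rfu >= filter_fraction * tallest[p.locus])
        (kept if passes else filtered).append(p)
    return kept, filtered
```

Peaks in the `filtered` list are the ones an analyst would inspect by hand, for example a genuine minor-contributor allele at 15% of the major peak that the 20% filter leaves unlabelled.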
In each of the expert systems described below, rules are activated, or fired, when a DNA profile is not from a single source or when the quality of the profile is substandard. Examples include mixed DNA profiles, profiles exhibiting preferential amplification and DNA profiles containing rare alleles. When such rules are fired, the expert system alerts the analyst to manually review the data and accept or reject the designation made.

Samples were also analysed using FSS-i3 v4. In this study only the first application, i-STRess, was used and is referred to as FSS-i3. GeneMapper™ ID v3.2 is required to pre-process the raw data from the ABI Prism® 3130 genetic analysers in order to assign quantitative information (base pair size, peak height and peak area) to all detected peaks before export to FSS-i3. FSS-i3 was used as directed by the supplier and was optimised for reference samples using a set of 71 single source reference samples. In order to minimise the number of stutter and pull-up peaks being recognised as allelic peaks, the stutter ratio and stutter distance settings were applied to all data exported from GeneMapper™ ID v3.2. Seventy-three crime stain DNA profiles were analysed using the reference sample parameters as a starting point, including the application of stutter ratio and stutter distance settings to data exported from GeneMapper™ ID v3.2 (15% height, 3.25–4.75 bp tolerance) and a 10% main peak filter. Amongst the samples were a number from mixtures of two or more people.

The FaSTR DNA expert system differs significantly from FSS-i3 in that it takes raw data directly from the ABI Prism® genetic analysers. The FaSTR DNA software peak detection rule recognises peaks according to a minimal number of directional changes within the data points of each dye set, based on changes in slope relative to the baseline.
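The stutter settings quoted above (15% height, 3.25–4.75 bp tolerance) amount to a positional filter, which the following sketch illustrates for the peaks of a single locus. It is not the FSS-i3, GeneMapper or FaSTR DNA implementation; the function name is hypothetical, and treating a ratio of exactly 15% as stutter is an assumption.

```python
def flag_stutter(peaks, max_ratio=0.15, min_offset_bp=3.25, max_offset_bp=4.75):
    """Return one boolean per peak: True if the peak looks like stutter.

    `peaks` is a list of (size_bp, height_rfu) tuples from one locus.
    A peak is flagged when some other peak lies 3.25-4.75 bp above it
    in size and the candidate is no taller than 15% of that peak.
    """
    flags = []
    for size, height in peaks:
        is_stutter = any(
            min_offset_bp <= parent_size - size <= max_offset_bp
            and height <= max_ratio * parent_height
            for parent_size, parent_height in peaks
        )
        flags.append(is_stutter)
    return flags
```

A small peak one repeat unit below a tall allele is flagged, whereas two peaks of similar height one repeat apart are both left as candidate alleles.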
FaSTR DNA assigns peak labels (allele designations) based on a comparison to the allelic ladder within the batch and the internal size standard, and then applies a user-determined rule set. The rule sets can be generic to the multiplex or specific to each locus. FaSTR DNA determines which samples require manual intervention by a scientist and which can be automatically accepted as full single source profiles requiring no further review. This adjustable rule set can be divided into peak detection and profile quality rules. In all cases, the software allows the analyst to intervene and assign alleles if required.

When a sample is run on an ABI Prism® genetic analyser, fluorescence data present below the minimum peak height threshold (e.g. in the range 1–49 rfu) that is superfluous to profile determination can be removed by the peak detection filter. Before analysis begins, the amplification kit relating to the samples is selected. The software identifies the first major peak within the identified ladder at each locus and labels the rest of the ladder. The approximate parameters for that first peak are part of the dataset relating to the ladder. This comprises the ladder peak labelling rule. The minimum peak height allowed can be defined (typically 50 rfu). Split peaks are identified based upon slope changes within a single peak: where the slope moves back down but does not reach the baseline and then moves up again, a split peak rule activates.
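The split peak description above (the slope turns down, does not reach the baseline, then rises again) can be sketched on a list of fluorescence values. This is only an illustration of the idea under stated assumptions: the trace is already baseline-corrected, the function name and parameters are hypothetical, and no smoothing is applied, so a noisy trace would need filtering first. It is not the FaSTR DNA algorithm.

```python
def find_split_peaks(signal, baseline=0.0, min_height=50.0):
    """Return (start, end) index ranges that contain a split peak.

    A region is a run of consecutive samples above `baseline`. It is
    reported when the trace turns downward at two or more points of at
    least `min_height` rfu without returning to the baseline in between.
    """
    values = list(signal)
    regions, start = [], None
    for i, value in enumerate(values + [baseline]):  # sentinel closes a trailing run
        if value > baseline and start is None:
            start = i
        elif value <= baseline and start is not None:
            regions.append((start, i - 1))
            start = None

    split = []
    for lo, hi in regions:
        maxima, rising = 0, True  # the trace enters each region on an upward slope
        for i in range(lo + 1, hi + 2):
            prev = values[i - 1]
            nxt = values[i] if i <= hi else baseline
            if nxt < prev and rising:
                if prev >= min_height:
                    maxima += 1
                rising = False
            elif nxt > prev:
                rising = True
        if maxima >= 2:
            split.append((lo, hi))
    return split
```

Counting turning points per above-baseline region keeps the sketch close to the wording of the rule; a production peak detector would normally also require a minimum dip depth before calling a split.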
Rare alleles are determined according to a programmed list of allowable alleles per locus. The rare alleles are given the putative designation of ‘R’, but the calculated actual size is visible in the data. Peaks 0.5–1.5 base pairs less than another peak are called n-peaks. The ratio of this peak to the following linked peak can be specified and was set to 20% for this validation. Peaks in a 3.25–4.75 base pair position less than a larger peak and of a user-defined peak height ratio are viewed as stutter. The stutter ratio used for this validation was 15%. The range can be varied per locus to include 3 bp and 5 bp stutter sometimes observed with Y STR data [7]. A stutter ratio can be defined for each locus separately or more generally for the multiplex used. The range in which a peak may be flagged as a possible pull-up peak can be set by scan number (the default is 3 scans) and by peak height. The default setting is 50% of the height of the overlapping larger peak. The height ratio of the larger to the possible pull-up peak can be adjusted for the whole data set or by locus.

Rules and settings that relate to profile quality include a peak imbalance ratio rule that allows the user to set the allowable peak height ratio between heterozygous peaks, a minimum homozygote peak height rule that determines at what point allelic drop out may occur and a modified minimum homozygote peak height rule, which is activated when peak imbalance is seen at another locus in the same sample. For example, the minimum homozygous peak height rule may be set at 200 rfu, but if preferential amplification is observed at any other locus in the same sample then the minimum homozygous peak height rule for that particular sample could be raised. When more than two peaks are detected at a locus a minimum
number of peaks detected rule is activated at each locus indicating that manual interpretation is required. This will of course apply to any mixed DNA profile. Analysed profiles are visualised as electropherograms (EPG) that can be printed with or without an associated table of peak information (Fig. 1) in a number of different formats. All changes made by the operator are auditable and the data can be exported in several formats including .xls and .xml. Once the rule set has been applied and the data analysed, an option can be selected which allows only those profiles requiring intervention to be displayed. This will separate those samples requiring manual review. The scientist is automatically taken to the site of the rule firing in need of manual review and may agree with the allelic designation for a peak, ‘‘zero’’ the peak or remove the label (by zeroing the peak the analyst is querying the quality of the peak but flagging its presence). Once the dataset has been reviewed a table of results can be generated. Each DNA profile analysis system was tested to determine whether it met the requirements of the NDIS DNA data acceptance standards [9]. Our main criterion for acceptance was that each expert system correctly identified all peaks and that the expert system should automate DNA analysis as much as possible whilst bringing to the attention of the operator those profiles which need interpretation. This means that all mixtures and partial profiles should be identified for review, whilst the number of full profiles flagged for review will depend on the number of rules these profiles activate, which will in turn depend on the quality of the DNA profiles. In order to aid our evaluation of the performance of each system, we defined a number of measurable variables based on
Fig. 1. An example of a FaSTR DNA analysis window.
the above criteria. These included the number of samples flagged for manual intervention, the number of rules activated per locus and the number of loci flagged for intervention. A 4-way comparison macro tool was developed which allowed the final output from all of the previously analysed single source and crime sample sets of profiles to be compared and a list of any discrepancies arising between them to be generated. Every discrepancy found was investigated and the number of loci where discrepancies occurred was recorded.

4. Results

Initial optimisation of FSS-i3 and FaSTR DNA using 71 single source reference samples and 73 crime sample DNA profiles ensured that an optimum ratio between data correctly passed without review and user intervention was achieved and no substandard profiles were missed. In all cases the results were compared with those obtained using GeneMapper™ ID v3.2 and GeneScan® Analysis v3.7/Genotyper® v3.7. No incorrect allelic designations were observed and a similar number of samples required review in both systems.

A number of issues came to light regarding the functionality of FSS-i3 which precluded further optimisation with respect to crime sample DNA profile analysis. The main issue identified was the filtering out of many low level components of mixtures typically encountered in crime sample profiles after application of the main peak filter. It was difficult to obtain an appropriate balance between automation and the need to flag all profiles requiring review. Furthermore, if the main peak filter was turned off or reduced, standard non-allelic profile characteristics required checking even with the relevant rule activated. In addition, unlike FaSTR DNA, FSS-i3 does not provide morphology-based images of the DNA profile. The spikograms displayed are less informative and a major change from the current system. Artefacts cannot be visually discriminated from peaks and mixture interpretation is more difficult.
All mixtures required review using GeneMapper™ ID v3.2. As the FSS-i3 system could not output more than four alleles at a locus, the results from mixtures of more than two individuals could not be reported electronically. Three display options are available to the operator: major, minor and all. In order to visualise all of the designated peaks in a particular crime sample mixture, “all” is the only viable option. However, if this option is used all of the peaks are designated, even those which activate rules that identify them as artefacts such as stutter and pull-up. These peak designations cannot be removed from the spikogram by the operator. For these reasons, it was decided that FSS-i3 in its current form was unsuitable for crime sample applications and it was not investigated further for this sample type.

5. Concordance phase

Following optimisation, a concordance study was carried out in which 810 single source reference samples were processed using FSS-i3, FaSTR DNA, GeneScan® Analysis v3.7/Genotyper® v3.7 and GeneMapper™ ID v3.2. The number of profiles which required analyst intervention following
GeneMapperTM ID v3.2, FSS-i3 or FaSTR DNA analysis were recorded and the DNA profiling results obtained after intervention were compared using the 4-way comparison tool. Every discrepancy between the analysis systems was investigated in order to determine whether the differences were caused by a wrong or missed call by an expert system or by operator error during the manual review step. The total number of discrepancies found was comparable in this sample set. For GeneMapperTM ID v3.2, 125 discrepancies were recorded, for FaSTR DNA, 127 discrepancies and for FSS-i3, 123 discrepancies were recorded. All three of the expert systems missed some peaks which had been called using GeneScan1 Analysis v3.7/Genotyper1 v3.7 by a scientist. On close examination of the data, it was concluded that this was due to the calculation of the relative baseline, which was higher in each of the three expert systems used causing peaks marginally above the threshold in GeneScan1 Analysis v3.7/ Genotyper1 v3.7 to be missed. In this sample set, FSS-i3 missed 30 peaks, GeneMapperTM ID v3.2 missed 46 and FaSTR DNA missed 37 such peaks. Not only did this affect samples at or close to minimum peak threshold (50 rfu) mainly in the 50–60 rfu range, but those containing stutter (15%) and low homozygotes (200 rfu) close to the threshold values. These missed peaks accounted for less than 0.5% of the total number of loci examined in each system, and fewer than 0.1% of these loci were not flagged for review. Where the missed peaks were not flagged for review (3 for GeneMapperTM ID v3.2, 11 for FaSTR DNA and 10 for FSS-i3), they occurred at loci with no alleles detected at all or where one of two alleles at a locus had fallen below threshold. Such partial or low level DNA profiles containing small amounts of DNA would not normally be acceptable for single source reference samples and would be reworked as part of a typical quality procedure. 
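The baseline explanation above can be made concrete with a small worked example. The peak heights and the 5-rfu baseline difference below are invented for illustration only (the paper does not report the size of the baseline difference), and `called_peaks` is a hypothetical helper.

```python
def called_peaks(raw_heights, baseline_offset, threshold=50.0):
    """Return the raw heights that still reach `threshold` once the
    system's baseline estimate is subtracted."""
    return [h for h in raw_heights if h - baseline_offset >= threshold]


# Hypothetical raw peak heights (rfu), several close to the 50-rfu threshold.
raw = [48.0, 52.0, 57.0, 61.0, 240.0]

lower_baseline = called_peaks(raw, baseline_offset=0.0)
higher_baseline = called_peaks(raw, baseline_offset=5.0)
missed = [h for h in lower_baseline if h not in higher_baseline]
```

With these made-up numbers only the 52-rfu peak changes status, which mirrors the observation that the affected peaks sat mainly in the 50–60 rfu range. The same reasoning applies to any ratio or height rule evaluated close to its cut-off.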
Differences due to operator error were also considered and can be divided into two types; those caused by morphological differences (more a subjective judgement call than a ‘‘true’’ error) and operator error when editing data. Differences in allele call of this type were observed 79 times for GeneMapperTM ID v3.2, 90 times for FaSTR DNA and 93 times for FSS-i3 for single source reference samples. These discrepancies are typically resolved using a comparison tool. In conclusion, in this sample set, no incorrect allelic designations were observed by the expert systems, all actions were auditable and all rule firings were explained as required by the NDIS DNA data acceptance standards [9]. The ability of each expert system to correctly identify those profiles in the sample set that required review was then assessed, for full profiles, partial and mixed profiles, samples giving no DNA profile and samples failing to analyse due to failed size standards. Table 1 shows the number of each type of profile (first figure) and those that required intervention (second figure). The number of full profiles obtained is significantly lower in GeneMapperTM ID v3.2. This is because GeneMapperTM ID v3.2 calculates the baseline at a slightly higher threshold and some peaks approaching minimum detection threshold were not designated.
Table 1
The number of samples in each profile category followed by the number of those samples requiring intervention in each analysis system

Analysis system   Full profiles   Partial profiles   Mixed profiles   No result   Failed to analyse   Need to view samples
GeneMapper™       693/145         94/94              1/1              13/13       9/0                 253 (31%)
FSS-i3            713/188         79/78              1/1              13/3        4/0                 270 (33%)
FaSTR DNA         709/220         80/73              2/2              11/2        8/1                 298 (37%)

A total of 810 single source reference samples were used for the comparison.
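The totals and percentages in Table 1 follow directly from the per-category counts, as the short check below shows. The dictionary layout and the `summarise` helper are illustrative choices; the numbers are those of Table 1.

```python
# (samples in category, samples requiring intervention), from Table 1
table1 = {
    "GeneMapper ID v3.2": {"full": (693, 145), "partial": (94, 94),
                           "mixed": (1, 1), "no_result": (13, 13),
                           "failed": (9, 0)},
    "FSS-i3": {"full": (713, 188), "partial": (79, 78),
               "mixed": (1, 1), "no_result": (13, 3),
               "failed": (4, 0)},
    "FaSTR DNA": {"full": (709, 220), "partial": (80, 73),
                  "mixed": (2, 2), "no_result": (11, 2),
                  "failed": (8, 1)},
}


def summarise(rows):
    """Return (total samples, samples to view, % to view, % of full profiles flagged)."""
    total = sum(n for n, _ in rows.values())
    flagged = sum(f for _, f in rows.values())
    full_n, full_flagged = rows["full"]
    return (total, flagged,
            round(100 * flagged / total),
            round(100 * full_flagged / full_n))


summary = {system: summarise(rows) for system, rows in table1.items()}
```

Each system accounts for all 810 samples, and the share of full profiles flagged for review works out at 21% for GeneMapper™ ID v3.2, 26% for FSS-i3 and 31% for FaSTR DNA.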
The number of full profiles identified for intervention ranged from 21% using GeneMapperTM ID v3.2 to 31% using FaSTR DNA. More profiles were identified for review by FaSTR DNA. This is because FaSTR DNA identifies overlapping, but otherwise genuine alleles as requiring review. All of the profiles designated as mixtures by each system were correctly flagged for review. The difference between FaSTR DNA with two mixtures reported, and GeneMapperTM ID v3.2 and FSS-i3 with only one mixture reported was examined. FaSTR DNA correctly identified and labelled low level peaks from a second individual, while the GeneMapperTM ID v3.2 and FSS-i3 systems had identified extra peaks and these labels had been subsequently removed by the operator. Therefore this difference is due to user intervention. The large difference in the number of interventions where no result was obtained can be attributed to a rule which is activated in GeneMapperTM ID v3.2 when no result is obtained at a locus. This rule is present but was not applied to FaSTR DNA and FSS-i3 data. This also accounts for the less than 100% figure obtained for the interpretation of partial profiles. In FaSTR DNA, partial profiles that meet all other criteria are not identified as requiring attention. The number of samples that failed to analyse due to poor quality size standards differed for each system, with FSS-i3 having the least number of samples failing. As FSS-i3 uses data exported from GeneMapper TM ID v3.2 it might be expected that the number of samples which fall into this category might be the same for both of these systems. However as mentioned previously, in order to ensure the maximum amount of data was imported from GeneMapperTM ID v3.2 to FSS-i3, analysis methods were designed for both single source and crime sample optimisation where thresholds were set deliberately low compared to default crime sample/reference settings. 
This included GeneMapper™ ID v3.2 quality values which were decreased for data imported into FSS-i3, in effect meaning that size standards that failed for samples analysed using GeneMapper™ ID v3.2 were passed using FSS-i3. The data was then examined at the locus level. A significant difference in the number of loci identified for intervention (Table 2) was found. Expert systems identify loci for review if
one or more rules are activated at that locus. The expert system with the most loci identified for review was GeneMapper™ ID v3.2 (927 loci out of a total of 8910) and the least was FaSTR DNA (464 loci out of a total of 8910). From Table 2, it can be seen that an elevated proportion of D8S1179 alleles were flagged for interpretation using GeneMapper™ ID v3.2. For FSS-i3, an elevated proportion of FGA alleles were flagged, and for FaSTR DNA a high proportion of TH01 alleles were flagged, accounting for 37% of the loci identified. This was found to relate to pull-up and is discussed further below. The average number of rules activated for each profile identified as requiring review was 5.4 rules per profile for GeneMapper™ ID v3.2, 3.9 rules for FSS-i3 and 1.8 rules for FaSTR DNA.

Why were rules fired?

The single source reference samples profiled in this study were amplified without prior quantitation and this led to a significant proportion of the DNA profiles from these samples exhibiting pull-up and spurious background artefacts, characteristic of over-amplified samples. This resulted in a higher number of rule activations than was expected in all three systems tested: 104 out of the 810 samples used in this study accounted for 36%, 31% and 28% of the rules activated in GeneMapper™ ID v3.2, FSS-i3 and FaSTR DNA, respectively. For GeneMapper™ ID v3.2, 23% of the rules activated occurred because no DNA profile result was present for the sample. For example, this rule would be activated 11 times for no result in SGM Plus™, once for each locus. This same rule is also activated a significant number of times in partial DNA profiles. Artefacts resulting from over-amplified samples accounted for a significant percentage of the remaining rules activated (42%). When using FSS-i3, artefacts present as a result of over-amplified samples accounted for half of the rules (48%) activated in this system.
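The locus-level figures quoted above can be reproduced from the per-locus counts in Table 2. In the sketch below the variable names are illustrative; the counts are those of Table 2, and the FSS-i3 total is derived here by summation rather than quoted from the text.

```python
LOCI = ["D3", "vWA", "D16", "D2", "Am", "D8", "D21", "D18", "D19", "TH01", "FGA"]

# Number of times each locus was flagged (Table 2), in LOCI order.
flagged = {
    "GeneMapper ID v3.2": [97, 99, 97, 64, 56, 114, 71, 75, 72, 87, 95],
    "FSS-i3": [66, 44, 25, 25, 42, 89, 40, 52, 50, 59, 83],
    "FaSTR DNA": [64, 11, 19, 18, 4, 38, 19, 30, 64, 172, 25],
}

TOTAL_LOCI = 810 * len(LOCI)  # 810 samples x 11 loci = 8910

totals = {system: sum(counts) for system, counts in flagged.items()}
percent_flagged = {system: round(100 * n / TOTAL_LOCI, 1)
                   for system, n in totals.items()}

# Share of FaSTR DNA's flagged loci that were TH01.
th01 = flagged["FaSTR DNA"][LOCI.index("TH01")]
th01_share = round(100 * th01 / totals["FaSTR DNA"])
```

The sums give 927, 575 and 464 flagged loci, that is 10.4%, 6.5% and 5.2% of the 8910 loci examined, and TH01 makes up 37% of the loci flagged by FaSTR DNA.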
Furthermore, peaks present as a result of pull-up not only activated this rule but two further rules, “extra peak” and “peak morphology”, further exacerbating the number of rules activated. For FaSTR DNA, the pull-up rule accounted for a significant proportion (53%) of the rules being activated, mainly in the
Table 2
The number of times individual loci were flagged for interpretation in the sample set of 810 single source reference samples

Analysis system   D3    vWA   D16   D2    Am    D8    D21   D18   D19   TH01   FGA
GeneMapper™       97    99    97    64    56    114   71    75    72    87     95
FSS-i3            66    44    25    25    42    89    40    52    50    59     83
FaSTR DNA         64    11    19    18    4     38    19    30    64    172    25

Multiple rules may have activated at some loci. The total number of loci analysed was 8910.
Table 3
Number of crime samples analysed in each category and the number and percentage of those samples requiring intervention (N = 203)

Analysis system   Full profiles   Partial profiles   Mixed profiles   No result   Failed to analyse   Need to view profiles
GeneMapper™       70/11           35/35              55/55            43/43       0                   144 (71%)
FaSTR DNA         59/29           45/38              57/57            41/5        1                   129 (64%)
yellow dye set. Fifty percent of the pull-up rule activations were at the TH01 and D19S433 loci and accounted for 97% and 88% of the total rules activated for TH01 and D19S433, respectively, in these samples. This is particularly evident if the TH01 or D19S433 loci were heterozygous and is a reflection of the lower peak height of the yellow dye set in this multiplex.

In conclusion, comparable results were obtained for each expert system. All three systems encountered the same problem with over-amplified samples, resulting in a significant proportion of the rules being activated. When using GeneMapper™ ID v3.2, 69% of profiles were passed without requiring review (equal to 89% of loci). For FSS-i3, 67% of profiles (93.5% of loci) were passed without review and for FaSTR DNA, 63% of profiles (94.8% of loci) were passed.

Having successfully demonstrated the performance of FaSTR DNA in comparison to both GeneMapper™ and FSS-i3, our attention turned to the analysis of DNA profiles from 203 crime samples. For the reasons discussed previously, FSS-i3 was not used for the analysis of these samples. First, discrepancies between the expert systems and GeneScan® Analysis v3.7/Genotyper® v3.7 analysis were investigated in order to ensure that no incorrect allele calls had been made. GeneMapper™ ID v3.2 missed 101 peaks that fell below the baseline or applied thresholds and FaSTR DNA missed 100 such peaks, approximately 4.5% of the loci scored. In all cases, the overall quality of the DNA profile ensured that rules were triggered at other loci in the sample and the profiles were reviewed. Lowering the analysis threshold was considered but not implemented, as this would cause the same problem but in the opposite direction. Whilst this presents more of a problem in the analysis of crime samples, where low level and close to threshold peaks are often seen, it does not represent truly missed calls and could only be solved if two expert systems had identical baseline properties.
A dual analysis system should detect these occurrences. Allele differences due to operator error were also investigated and were observed 68 times for GeneMapper™ ID v3.2 and 66 times for FaSTR DNA. These discrepancies were a result of subjective differences or simple operator error and would typically be resolved using a comparison tool. Use of an appropriate expert system should reduce such discrepancies.
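A comparison tool of the kind mentioned above can be sketched as a function that lines up the final allele calls from several analysis systems and lists every sample and locus where they disagree. The data layout and the `find_discrepancies` name are assumptions made for illustration; this is not the 4-way comparison tool used in the study.

```python
def find_discrepancies(calls_by_system):
    """List every (sample, locus) where the systems disagree.

    `calls_by_system` maps a system name to a dict of
    {(sample_id, locus): iterable of allele labels}. A missing entry is
    treated as "no alleles called". Allele order is ignored.
    """
    keys = set()
    for calls in calls_by_system.values():
        keys.update(calls)

    discrepancies = []
    for key in sorted(keys):
        observed = {
            system: tuple(sorted(calls.get(key, ())))
            for system, calls in calls_by_system.items()
        }
        if len(set(observed.values())) > 1:
            discrepancies.append((key, observed))
    return discrepancies
```

The function accepts any number of systems, so the same sketch covers a dual comparison and a 4-way one. Each reported entry keeps the calls from every system side by side, which is what a reviewer needs in order to decide whether a difference is a missed peak or an editing error.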
In conclusion, no incorrect calls were made by either GeneMapperTM ID v3.2 or FaSTR DNA. The DNA profiles identified for user intervention were then investigated. Table 3 shows the number of each type of profile analysed (first figure) and the number of those profiles that required intervention (second figure). As expected, a considerable proportion of the samples required manual intervention. This was not surprising given that the samples were chosen to contain a high proportion of mixed samples (approximately 25% of the total) and partial profiles, where imbalance and low homozygous peaks were prevalent, all of which should require manual intervention. Thirty-two samples contained significant pull-up detected by FaSTR DNA and this caused the significant difference in the number of full profiles flagged for intervention from 16% using GeneMapperTM ID v3.2 to 49% using FaSTR DNA. There was minimal difference between the systems for samples that needed to be reviewed. The difference in the number of interventions for partial profiles and no profiles was again caused by the rule activated in GeneMapperTM ID v3.2 when no results are present. The number of samples and loci requiring review was compared to the number of rules fired. For GeneMapperTM ID v3.2, 144 profiles required review, equating to 1116 loci and corresponding to 1387 rules fired. For FaSTR DNA, 129 profiles required review, equating to 614 loci and corresponded to 923 rules fired. The results for each locus are shown separately in Table 4. Because the crime sample derived DNA was quantitated prior to amplification, fewer over-amplified samples were included and the over-representation of TH01 rule activations seen for the single source reference samples in Table 2 was not evident. 
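The review workload figures quoted above can be turned into per-profile and per-locus ratios, which makes the two systems easier to compare. The counts are taken from the text and Table 3; the `review_load` helper and dictionary keys are illustrative names.

```python
crime = {
    # profiles needing review, loci flagged, rules fired, (full profiles, full profiles flagged)
    "GeneMapper ID v3.2": {"profiles": 144, "loci": 1116, "rules": 1387, "full": (70, 11)},
    "FaSTR DNA": {"profiles": 129, "loci": 614, "rules": 923, "full": (59, 29)},
}


def review_load(stats):
    """Derive simple workload ratios from the raw counts."""
    n_full, flagged_full = stats["full"]
    return {
        "loci_per_profile": stats["loci"] / stats["profiles"],
        "rules_per_profile": stats["rules"] / stats["profiles"],
        "rules_per_locus": stats["rules"] / stats["loci"],
        "full_flagged_pct": 100 * flagged_full / n_full,
    }


load = {system: review_load(stats) for system, stats in crime.items()}
```

On these numbers a reviewed profile involves about 7.8 flagged loci and 9.6 rule firings in GeneMapper™ ID v3.2, against about 4.8 loci and 7.2 rule firings in FaSTR DNA, while the share of full profiles flagged is roughly 16% and 49%, respectively.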
When the samples for which no results were obtained were removed from consideration, the proportion of profiles which required no intervention was equivalent between the two systems and was 71% for GeneMapper™ ID v3.2 and 72.5% for FaSTR DNA. Examination of the data showed that the number of rules activated was independent of sample source. Mixed DNA profiles caused the most rules to activate, as was expected.

In conclusion, the function of an expert system is to automate DNA profile analysis as much as possible, restricting operator input to only those samples which require review by a
Table 4
Number of times each locus was flagged for interpretation

Analysis system   D3    vWA   D16   D2    Am    D8    D21   D18   D19   TH01   FGA
GeneMapper™       91    108   95    111   80    98    107   109   104   102    111
FaSTR DNA         59    65    59    51    50    59    46    44    76    62     43

Multiple rules may have been triggered at a locus. The total number of loci examined was 2233.
T. Power et al. / Forensic Science International: Genetics 2 (2008) 159–165
trained scientist. Each system must identify all data that requires review whilst allowing the remaining DNA profiles to be called automatically and without intervention. The performance of FaSTR DNA was compared to the performance of GeneMapperTM and FSS-i3 using a range of both single source reference samples and crime samples typically encountered in casework. The samples were chosen to include problematic data so as to challenge each system. We found that each system correctly identified allelic peaks and there were no instances of incorrect allele calls or rules that were incorrectly activated for either reference or crime samples. All three systems missed low level peaks previously scored with GeneScan1 Analysis v3.7/Genotyper1 v3.7 attributed to the calculation of the relative baseline and affected peaks in the 50–60 rfu range and peaks designated as stutters or homozygotes close to the threshold values. The number of samples passed without need for review for both single source reference samples and crime samples was comparable between systems. Whilst GeneMapperTM ID v3.2 and FSS-i3 flag the smallest number of profiles for review, they activate the largest number of rules per sample. Many of these rules, especially in GeneMapperTM ID v3.2, are very quick to resolve. When analysing crime samples, the overall analysis time for a batch of samples was comparable for GeneMapperTM ID v3.2 and FaSTR DNA with FaSTR DNA being slightly quicker based on less rules being activated and fewer user interventions required. For all expert systems, there was a significant saving of scientists’ time. The purpose of our investigation was to determine whether FaSTR DNA was an effective expert system for the analysis of single source reference and crime samples in comparison to GeneMapperTM ID v3.2 and FSS-i3. For single source reference samples, we found that FaSTR DNA was equal in performance to GeneMapperTM ID v3.2 and FSS-i3. 
For crime sample profiles, FaSTR DNA was comparable to GeneMapper™ ID v3.2 and had a number of operational advantages. Use of these two systems in parallel enables two independent “expert” analyses which will increase consistency and save analysis time. FaSTR DNA is used operationally at ESR and is available for purchase on a case-by-case basis. A demonstration version can be obtained from ESR on request.

Acknowledgements

We acknowledge the combined resources of the Forensic Biology Team at ESR for their patience whilst we carried out this study and for providing the original data. We are particularly grateful to Jo-Anne Bright, Anna Seccombe, Judi Cullen and Delia Moss. This work would not have been possible without the Science Information Management Services Team at ESR who undertook the software development. Finally, we thank Johanna Veth and Jo-Anne Bright for their review of this work and their helpful suggestions.

References

[1] S.A. Greenspoon, J.D. Ban, K. Sykes, E.J. Ballard, S.S. Edler, M. Baisden, B.L. Covington, Application of the Biomek Laboratory Automation Workstation and the DNA IQ™ System to the extraction of forensic crime samples, J. Forensic Sci. 49 (2004) 29–39.
[2] S.A. Montpetit, I.T. Fitch, T.P. O’Donnell, A simple automated instrument for DNA extraction in forensic crime sample, J. Forensic Sci. 50 (2005) 555.
[3] M. Rechsteiner, Applying revolutionary technologies to DNA extraction for forensic studies, Forensic Magazine, April/May 2006, http://www.forensicmag.com.
[4] R.L. Green, I.C. Roinstad, C. Boland, L.K. Hennessy, Developmental validation of the Quantifiler™ real time PCR kits for the quantification of human nuclear DNA samples, J. Forensic Sci. 50 (2005) 1–17.
[5] J.M. Butler, Forensic DNA Typing: Biology, Technology and Genetics of STR Markers, 2nd ed., Elsevier Academic Press, New York, 2005.
[6] P. Gill, R. Sparkes, C. Kimpton, Development of guidelines to designate alleles using an STR multiplex marker system, Forensic Sci. Int. 89 (1997) 185–197.
[7] B.E. Krenke, L. Viculis, M.L. Richard, M. Prinz, S.C. Milne, C. Ladd, A.M. Gross, T. Gornall, J.R.H. Frappier, A.J. Eisenberg, C. Barna, X.G. Aranda, M.S. Adamowicz, B. Budowle, Validation of a male-specific, 12-locus fluorescent short tandem repeat (STR) multiplex, Forensic Sci. Int. 148 (2005) 1–14.
[8] J.S. Buckleton, C.M. Triggs, S.J. Walsh, Forensic Evidence Interpretation, CRC Press, 2005 (Chapter 1), pp. 1–25.
[9] National DNA Index System (NDIS), DNA Data Acceptance Standards Operational Procedures, Appendix B, Guidelines for Submitting Requests for Approval of an Expert System for Review of Offender Samples, revised May 19, 2004.
[10] M. Bill, C. Knox, FSS-i3 expert systems, Profiles DNA 8 (2005) 8–10.
[11] GeneMapper® ID Software Version 3.1 Human Identification Analysis: User Guide, Rev. A, Applied Biosystems, Foster City, CA, US, 2004.
[12] K. Kadash, B.E. Kozlowski, L.A. Biega, B.W. Duceman, Validation study of the true allele automated data review system, J. Forensic Sci. 49 (2004) 660–667.
[13] J. Bright, S.F. Petricevic, Recovery of trace DNA and its application to DNA profiling of shoe insoles, Forensic Sci. Int. 145 (1) (2004) 7–12.
[14] P.S. Walsh, D. Metzge, R. Higuchi, Chelex® 100 as a medium for simple extraction of DNA for PCR-based typing from forensic material, Biotechniques 10 (1991) 506–513.
[15] D.M. Moss, S.A. Harbison, D.J. Saul, An easily automatable, closed tube forensic DNA extraction using a thermostable proteinase, Int. J. Leg. Med. 117 (2003) 340–349.
[16] E.A. Cotton, R.F. Allsop, J.L. Guest, R.R.E. Farzier, P. Koumi, I.P. Callow, A. Seagar, R.L. Sparkes, Validation of the AmpFlSTR® SGM Plus™ System for use in forensic casework, Forensic Sci. Int. 112 (2000) 151–161.