Forensic validation of a SNP and INDEL panel for individualisation of timber from bigleaf maple (Acer macrophyllum Pursch)


Journal Pre-proof

E.E. Dormontt (Conceptualization) (Methodology) (Validation) (Formal analysis) (Investigation) (Writing - original draft) (Writing - review and editing) (Visualization) (Project administration), D.I. Jardine (Conceptualization) (Methodology) (Validation) (Investigation) (Writing - review and editing), K.-J. van Dijk (Conceptualization) (Methodology) (Investigation) (Writing - review and editing), B.F. Dunker (Methodology) (Validation) (Investigation) (Writing - review and editing), R.R.M. Dixon (Methodology) (Validation) (Investigation) (Writing - review and editing), V.D. Hipkins (Conceptualization) (Methodology) (Investigation) (Resources) (Writing - review and editing), S. Tobe (Methodology) (Formal analysis) (Writing - review and editing), A. Linacre (Methodology) (Writing - review and editing), A.J. Lowe (Conceptualization) (Methodology) (Resources) (Writing - review and editing) (Supervision) (Project administration) (Funding acquisition)

PII: S1872-4973(20)30023-5

DOI: https://doi.org/10.1016/j.fsigen.2020.102252

Reference: FSIGEN 102252

To appear in: Forensic Science International: Genetics

Received Date: 22 October 2019

Revised Date: 22 December 2019

Accepted Date: 19 January 2020

Please cite this article as: Dormontt EE, Jardine DI, van Dijk K-J, Dunker BF, Dixon RRM, Hipkins VD, Tobe S, Linacre A, Lowe AJ, Forensic validation of a SNP and INDEL panel for individualisation of timber from bigleaf maple (Acer macrophyllum Pursch), Forensic Science International: Genetics (2020), doi: https://doi.org/10.1016/j.fsigen.2020.102252

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2019 Published by Elsevier.

Title: Forensic validation of a SNP and INDEL panel for individualisation of timber from bigleaf maple (Acer macrophyllum Pursch)

Authors: Dormontt, E. E.1, Jardine, D. I.1, van Dijk, K.-J.1, Dunker, B. F.1, Dixon, R. R. M.1, Hipkins, V. D.2, Tobe, S.3, Linacre, A.4, Lowe, A. J.1

1. School of Biological Science, University of Adelaide, Adelaide, SA 5005, Australia
2. USDA Forest Service, National Forest Genetics Laboratory, Placerville, CA 95667, USA
3. Medical, Molecular and Forensic Sciences, Murdoch University, Murdoch, WA 6150, Australia
4. College of Science & Engineering, Flinders University, Adelaide, SA 5042, Australia

*Corresponding author: Dr. E. E. Dormontt, University of Adelaide, School of Biological Science, Adelaide, SA 5005, Australia, E-mail: [email protected]

Highlights:



• Illegal logging is one of the largest illicit trades in the world, with low risks of detection and prosecution, in part due to the lack of forensic identification tests for timber
• We report on the development of a DNA assay for bigleaf maple (Acer macrophyllum Pursch) that can be used to assign timber materials to their tree of origin (e.g. the stumps remaining in the forest after felling)
• SNP and INDEL typing used 131 genetic markers to screen an extensive database (n = 394) allowing for high levels of confidence of association (FST-corrected PID = 1.785 × 10⁻²⁵)
• A full forensic validation process was conducted by adapting for timber the SWGDAM guidelines developed for human identification assays

Abstract:

Illegal logging is one of the largest illicit trades in the world, with high profits and generally low risks of detection and prosecution. Timber identification presents problems for law enforcement as traditionally used forensic methods such as wood anatomy and dendrochronology are often unable to confidently match wood evidence to the remains of illegally felled trees. Here we have developed and validated a set of genetic markers for individualisation in bigleaf maple (Acer macrophyllum), a high value timber species often felled illegally in the USA. Using 128 single nucleotide polymorphisms and three insertion/deletion markers developed through massively parallel sequencing, 394 individuals were genotyped on the MassARRAY® iPLEX™ platform (Agena Bioscience™, San Diego, USA) to produce a population reference database for the species. We demonstrate that the resulting DNA assay is reliable, species specific, effective at low DNA concentrations (<1 ng/μL) and suitable for application to timber samples. The PID for the most common profile, calculated using an overall dataset level FST-correction factor, was 1.785 × 10⁻²⁵ and PID-SIB across all individuals (treated as a single population) was 2.496 × 10⁻²². The further development of forensic identification assays for timber species has the potential to deliver robust tools for improved detection and prosecution of illegal logging crimes as well as for the verification of legality in reputable supply chains.

Keywords:


Wildlife forensic science; Forensic botany; Non-human DNA identification

Introduction:


Illegal logging is a pernicious threat to biodiversity, to the protection and sustainable use of forests, and to the communities who rely upon legal utilisation of forest products for their livelihoods. Globally, illegal logging has an estimated worth beyond US$30 billion annually and in some countries can constitute up to 90% of timber traded [1], although estimates of the extent of the problem are complex and confounded by differences in definitions, data sources and methodologies [2]. Whilst often considered to be predominantly a tropical forest issue, illegal logging is also a significant problem in the US, with forestry economists estimating an annual value of US$1 billion in 2003 [3].

Until relatively recently, there has been little incentive for timber traders to actively seek to ensure the legality of their products; indeed, trade in timber illegally sourced from overseas was legal in most countries (with the notable exception of Canada). However, in 2008 the United States amended the Lacey Act to outlaw both international and domestic trade in illegally harvested wood, and the European Union followed quickly with their Timber Regulations (EUTR) in 2010 and Australia with their Illegal Logging Prohibition Act, which was passed in 2012 [4]. Most recently, Japan’s Clean Wood Act came into force in 2017.

Enforcement of illegal logging laws is challenging due to the lack of appropriate methods for identification of timber [5, 6]; traditionally used forensic methods such as wood anatomy and dendrochronology are often unable to confidently match wood evidence to the remains of illegally felled trees. Despite recent interest in the development of appropriate methods [7-12], the forensic validation studies required to demonstrate suitability for legal casework [13, 14] are generally lacking in the published literature for timber, hindering broader uptake for law enforcement purposes.

DNA profiling provides a promising prospect for timber identification, as genetic analysis has the potential to identify species, geographic origin and individual trees, each of which can be important in determining whether a crime has been committed [4, 5]. However, the application of genetic markers to timber presents several challenges. Wood is a very dense tissue with relatively little DNA; that which can be extracted is generally of low quality, and degradation increases with age [15]. Traditional forensic markers such as STRs are often not reliably amplified from timber extracted DNA, as the sequence length required is too long [16]. Further to these technical challenges, there is also ambiguity around how appropriate forensic validation may be conducted for genetic identification of timber. Existing guidelines [17] were developed predominantly for human identification and are not straightforwardly adapted to non-humans. The discipline of wildlife forensics has considered this problem in some depth for the validation of identification assays for animals [18, 19], but has so far neglected the same consideration for the identification of plants [20].

Across the Pacific Northwest, bigleaf maple (Acer macrophyllum Pursch) is regularly stolen from national forests (Fig. 1a-d) and finds its way into supply chains for the music wood industry. We aimed to develop a DNA assay that would allow timber evidence, such as sawmill off-cuts, to be compared to the remains of felled bigleaf maple trees in the forest to determine a probability of identity. This paper presents the first publication of a developmentally validated individualisation assay for a timber species. In the course of this work we have interpreted and adapted the SWGDAM validation guidelines for DNA analysis methods [17] specifically for timber, and provide a framework for future robust developmental validation studies on other timber species (Supplementary Document S1).

Methods:

Sampling

Field sampling of Acer macrophyllum was conducted in 2014, with a focus on locations in the state of Washington around the Gifford Pinchot National Forest, and on the Olympic Peninsula, where United States Department of Agriculture (USDA) Forest Service agents have observed evidence of extensive illegal logging (Anne Minden, personal communication, 2013). More distant locations were sampled from northern Washington, southern Canada, Oregon and California (Fig. 1, Supplementary Table S1). Field collections were taxonomically verified with two voucher specimens submitted to the University of Washington Herbarium in Seattle (accession numbers WTU403124 and WTU403125).

Each tree was sampled for cambium using a hollow leather punch and a mallet. Several additional samples were used to facilitate the forensic validation (summarised in Supplementary Table S2). These were leaves, sawn timber and mature seeds collected from the ground around the base of mother trees. Initially it had been planned that seeds would be collected directly from the mother trees to ensure correct maternity assignments; however, logistical constraints limited access to the field collection sites until after the trees had dropped most of their seeds for the year. The cambium samples and seeds were delayed in customs between the USA and Australia for some time before germination/extraction. All cambium, leaf and wood/timber samples were stored in silica gel prior to DNA extraction. Seeds were stored in paper bags prior to germination.

DNA extractions

DNA was extracted from cambium, leaves and sawn timber in 2014, and from germinated seedlings in 2016, using either the Nucleospin Plant II Kit (Macherey-Nagel, Düren, Germany) with the PL2/PL3 buffer system (for cambium, seedlings and leaf samples), undertaken at the Australian Genome Research Facility (AGRF) in Adelaide, or a patented timber extraction method [21] (for sawn timber), undertaken at the University of Adelaide. Leaf and cambium samples for extraction were arranged into 96 well plates, each containing one well with tissue from an identical sample (sample ID: AM_043_0439), to ensure correct orientation of the plate through the subsequent genotyping procedures, and to act as a positive control for between-plate differences in genotyping results. Separate reagent blanks (negative controls) for the DNA extraction and genotyping procedures were also analysed. Ninety-five pairs of technical replicates were included (two DNA extractions with one PCR per extraction). In 2019, twenty-one cambium samples (plus two negative controls) were re-extracted using the BR24 (Omni International, Kennesaw, USA) for homogenisation, and the DNeasy® Plant Mini kit run on the QIAcube® (Qiagen Inc., Valencia, USA) for extraction, undertaken at the University of Adelaide.

Genotyping

Of the 199 single nucleotide polymorphism (SNP) markers and five insertion/deletion (INDEL) markers from previously published work on A. macrophyllum [22], 183 were successfully remultiplexed into five groups for the MassARRAY® iPLEX™ platform (Agena Bioscience™, San Diego, USA) using Assay Designer 4.0 (Agena Bioscience™, San Diego, USA). Amplification primers and extension primers were supplied by IDT (Integrated DNA Technologies, Coralville, USA), and contained the 5’ 10mer tag (ACGTTGGATG) to remove them from the observed mass window and provide stability in a multiplex PCR. Genotyping was performed on the MassARRAY® iPLEX™ platform using iPLEX™ GOLD chemistry (Agena Bioscience™, San Diego, USA) according to manufacturer’s specifications in 384 well format at AGRF in Brisbane.

Samples were amplified in a 5 µL PCR multiplex reaction composed of 1 × PCR buffer, 2 mM MgCl2, 500 µM deoxynucleotide triphosphates (dNTPs), 0.1 µM each PCR primer, 1 U of PCR enzyme, and 2 µL DNA (DNA concentration was 10 ng/µL where available; lower concentrations were used when DNA extractions failed to yield ≥10 ng/µL or where dilutions were used to test assay sensitivity). The thermal cycling conditions consisted of a first denaturation step at 94 °C for 2 minutes, followed by 45 cycles of denaturation at 94 °C for 30 seconds, annealing at 56 °C for 30 seconds, and extension at 72 °C for 1 minute, with a final extension step at 72 °C for 5 minutes. To neutralize unincorporated dNTPs, PCR products were treated with 0.5 U shrimp alkaline phosphatase by incubation at 37 °C for 40 minutes, followed by enzyme inactivation by heating at 85 °C for 5 minutes. The amplification products were used without further purification in a 9 µL primer extension reaction composed of 0.222 × iPLEX buffer, 0.222 × iPLEX termination mix, 1.35 U iPLEX enzyme, and the extension primer mix (5-15 µM). The iPLEX extension cycling conditions were 94 °C for 30 seconds, followed by 40 cycles of denaturation at 94 °C for 5 seconds, each with 5 nested cycles of annealing at 52 °C for 5 seconds and extension at 80 °C for 5 seconds, and a final extension step at 72 °C for 3 minutes.

After desalting of the products using SpectroCLEAN resins (Agena Bioscience™, San Diego, USA) following the manufacturer’s protocol, cleaned extension products were dispensed onto a 384 SpectroCHIP array using an RS1000 Nanodispenser. SpectroCHIPs were fired in a Compact mass spectrometer. Spectra were acquired using SpectroAcquire software (Agena Bioscience™, San Diego, USA), and data analysis, including automated allele calling, was completed using MassARRAY Typer software, version 4.0.5 (Agena Bioscience™, San Diego, USA). After automated allele calling, all data were manually reviewed to ensure appropriate calling decisions were made (e.g. observation of degree of data skew, i.e. ‘allele balance’ sensu [23]) (see Methods: Quality controls).

Genetic marker selection

Where multiple samples from the same individuals were genotyped, if both profiles passed QC thresholds, a consensus profile was produced, filling in any available missing data from the replicate profiles for inclusion in the reference data set. Any loci which consistently failed to amplify were excluded. Conformity to Hardy-Weinberg equilibrium (HWE) expectations and linkage disequilibrium between loci were assessed with Genepop [24] using Fisher’s exact tests on the reference data set. Observed heterozygosity was calculated using GenoDive across all sub-populations (collection sites) [25]. Markers were validated according to the guidelines provided by the Scientific Working Group on DNA Analysis Methods [17], which we interpreted specifically for individualisation in timber (Supplementary Document S1). The probability of identity (PID) was calculated incorporating FST-correction factors as per [26]. FST was calculated per locus for the most common profile using Genepop [24] and was also calculated as a standard value for the dataset by taking the mean of the per locus values plus three standard deviations. The more conservative probability of identity for siblings (PID-SIB) [27] was calculated using GenAlEx [28, 29] both within each subpopulation and across all samples (treated as a single population).

Quality controls

Quality control (QC) thresholds were implemented at two different stages to mitigate the risk of false positive DNA profile inclusion. The first stage was during the allele calling from the MassARRAY® iPLEX™ platform; the second stage was during subsequent data analysis. The following thresholds were applied during manual review of the automated allele calling completed using MassARRAY Typer software, version 4.0.5 (Agena Bioscience™, San Diego, USA): average signal intensity (peak height) greater than 2; allele signal intensity (peak height) greater than 0.5; genotype yield greater than 0.5 (genotype yield is a ratio of unextended primer (UEP) peak signal intensity to allele signal intensity, which measures the percentage assay conversion from UEP starting material to the allele products); loci-fail rate less than 40% (except in the case of negative controls, where no loci-fail rate threshold was applied; any samples with more than 40% of failed loci were deemed to have failed to produce a profile at all). The following thresholds were determined during subsequent data analysis and applied to the reference data set: the loci-pass rate, which is the minimum number of loci that produced an allele call in a profile (this threshold was applied after the loci-fail rate threshold applied during allele calling from the MassARRAY® iPLEX™ platform); minimum DNA concentration used for genotyping. Thresholds were chosen by examination of the negative controls, technical replicates and results of the sensitivity analyses (see Results: Sensitivity and species specificity).
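The two-stage QC logic described above can be sketched in a few lines of Python. This is only an illustrative sketch: the class, field and function names are hypothetical, it assumes that all three intensity-based thresholds are evaluated per locus, and the 95% loci-pass rate and 0.625 ng/μL values are those reported in the Results section rather than values stated here.

```python
from dataclasses import dataclass

# Stage 1 thresholds as stated in the text (names are hypothetical).
MIN_AVG_INTENSITY = 2.0
MIN_ALLELE_INTENSITY = 0.5
MIN_GENOTYPE_YIELD = 0.5
MAX_LOCI_FAIL_RATE = 0.40
# Stage 2 thresholds as reported in the Results section.
MIN_LOCI_PASS_RATE = 0.95
MIN_DNA_CONC_NG_PER_UL = 0.625


@dataclass
class LocusCall:
    avg_intensity: float
    allele_intensity: float
    genotype_yield: float


def locus_passes(call: LocusCall) -> bool:
    """Assumption: each threshold is applied to every locus individually."""
    return (call.avg_intensity > MIN_AVG_INTENSITY
            and call.allele_intensity > MIN_ALLELE_INTENSITY
            and call.genotype_yield > MIN_GENOTYPE_YIELD)


def profile_status(calls, is_negative_control=False):
    """Return (loci_fail_rate, produced_profile) for one sample."""
    failed = sum(not locus_passes(c) for c in calls)
    fail_rate = failed / len(calls)
    if is_negative_control:
        # No loci-fail rate threshold is applied to negative controls.
        return fail_rate, True
    return fail_rate, fail_rate < MAX_LOCI_FAIL_RATE


def passes_reference_qc(loci_pass_rate, dna_conc_ng_per_ul):
    """Stage 2: thresholds applied to the reference data set."""
    return (loci_pass_rate >= MIN_LOCI_PASS_RATE
            and dna_conc_ng_per_ul >= MIN_DNA_CONC_NG_PER_UL)
```

Keeping the two stages as separate functions mirrors the text, where the loci-pass rate threshold is applied only after the loci-fail rate threshold used during allele calling.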

Inheritance assessment

The inheritance mode of the genetic markers used was assessed by observing the profiles of mother-trees and their offspring. DNA was extracted from the cambium of the mother trees as described previously. Offspring profiles were assessed through germination of the seeds collected from around the mother trees and subsequent DNA extraction and genotyping of the resulting seedlings. Prior to germination, the seeds were dried and stored in paper bags at room temperature. Germination was achieved using a combination of stratification and gibberellic acid as described below.

The ‘wing’ of each seed was removed. Seeds were soaked in 4% bleach solution for five minutes before rinsing thoroughly with distilled water to protect against fungal infection. Seeds from each mother tree were germinated separately by placing eight seeds into a 90 mm diameter petri dish on top of two filter papers. Filter papers were soaked with a 200 mg/L gibberellic acid potassium salt solution [30] and dishes were incubated in a closed box for one month within a 4 °C fridge. For three mother-trees, an additional 16 seeds were germinated in the same way. Seeds were checked every 4–6 days, maintaining moist filter papers with the gibberellic acid solution and changing filter papers if a large amount of brown exudate was present. Seedling embryos were dissected from the pericarp and seed coat using a scalpel and tweezers (Supplementary Fig. S1) and DNA extracted as described previously.

Non-linked loci conforming to Hardy-Weinberg expectations and with an observed heterozygosity >0.1 were assessed in the profiles of mother-trees and seedlings which passed QC thresholds. Each locus was scored as concordant when the seedling and mother-tree profiles shared at least one allele, according to Mendelian inheritance expectations. If the seedling and mother-tree profiles did not share any alleles, the locus was scored as discordant.

The winged samaras of A. macrophyllum are adapted to disperse seeds away from the mother-tree, so collections on the ground around the base of trees may include seeds from different trees. To identify seedlings incorrectly assigned to a mother-tree, a simulation approach was employed using Resampling Stats for Excel v4.0 (Statistics.com). This approach was based on the premise that discordance associated with incorrect parentage assumptions will be non-randomly distributed throughout the individual seedling profiles, i.e. if inheritance patterns of a particular locus are inconsistent, we would expect to see discordance in comparisons between offspring and mother trees at that locus, across many individuals; alternatively, if a seed has been erroneously assumed to be from a particular mother tree, we would expect to see discordance between the profiles of that particular seedling and its putative mother tree across many loci. To test this, ten thousand simulated seedling profiles were generated containing the same number of loci as genotyped in the real samples. Loci were assigned as concordant or discordant between mother tree and seedling at each locus, using the data observed in the real mother-tree and seedling comparisons and selected without replacement. Numbers of discordant loci in the simulated seedlings were calculated and their distributions observed (Supplementary Fig. S2). In the real seedling and mother-tree data set, any seedling containing more discordant loci than observed in 99% of the simulated seedlings was removed. Loci where discordance was observed in the remaining seedling and mother-tree comparisons were removed from the final data set to maximise conservativeness in loci selection.

Sensitivity

The MassARRAY® iPLEX™ platform protocol recommends DNA concentrations be approximately 10 ng/μL where possible. The sensitivity of the assay (the ability to obtain reliable results from a range of DNA quantities) was assessed using a range of DNA concentrations in two individuals. The concentrations were 5, 2.5, 1.25 and 0.625 ng/μL (along with the recommended 10 ng/µL).

Species specificity

To assess species specificity, loci were amplified from DNA from the following non-target species at a concentration above the lower limit detected in the sensitivity analysis. Species were chosen to represent a range of increasingly distantly related species from A. macrophyllum. The species were Acer circinatum (vine maple), same genus (Acer); Pericopsis elata (African teak), same superorder (Rosanae); Austrostipa sp. (grass), same class (Magnoliopsida); Schizymenia dubyi (seaweed), same kingdom (Plantae); Pteropus conspicillatus (fruit bat), different kingdom (Animalia).

Precision, accuracy and case-type samples

To test repeatability, profiles from the technical replicate pairs which passed QC requirements were checked for concordance. The minimum number of loci that were compared was 119 out of 131 (which would occur if both profiles only just met the minimum loci-pass rate threshold and had failures in completely different loci from one another). To test reproducibility, concordance of results between different extraction methods and operators was compared using five paired cambium and sawn timber samples. Samples of cambium and timber were extracted using different protocols and by different operators in different laboratories (see Methods: DNA extractions). The ability to obtain results from DNA recovered from different tissue types was evaluated. Specifically, profiles generated from DNA extracted from leaves and cambium of the same trees were compared (seven trees in total), as were profiles from DNA extracted from the cambium and sawn timber of the same tree (five trees in total). Fourteen additional (unpaired) sawn timber samples were genotyped to further assess the reliability of the assay when applied to DNA from case-type samples (see Supplementary Table S2 for a summary of samples). Three years after the initial extractions, DNA was re-extracted from the cambium of twenty-one A. macrophyllum individuals randomly selected from across the DNA plates originally genotyped. DNA from these samples was re-genotyped to verify the reliability of the original profile results, due to the detection of potential contamination in a negative control sample (see Results: Quality control thresholds). These samples were extracted using a third protocol (see Methods: DNA extractions), further contributing to the assessment of reproducibility as well as effects of tissue storage time.
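As a worked check of the minimum of 119 comparable loci quoted above, and as an illustration of how a pair of replicate profiles might be compared, the following Python sketch may be useful. The function names and the profile representation (a dictionary mapping each locus name to a pair of alleles, or None for a failed locus) are assumptions made for illustration only; the 95% loci-pass rate is the threshold reported in the Results section.

```python
import math


def max_failed_loci(n_loci, min_pass_rate):
    """Largest number of failed loci that still meets the loci-pass rate."""
    min_called = math.ceil(n_loci * min_pass_rate)
    return n_loci - min_called


def min_comparable_loci(n_loci, min_pass_rate):
    """Worst case: both profiles just pass and fail at entirely different loci."""
    return n_loci - 2 * max_failed_loci(n_loci, min_pass_rate)


def compare_profiles(profile_a, profile_b):
    """Return (loci_compared, loci_discordant) for two genotype profiles.

    Loci with a missing call in either profile are skipped, and allele
    order within a genotype is ignored.
    """
    compared = discordant = 0
    for locus in profile_a.keys() & profile_b.keys():
        genotype_a, genotype_b = profile_a[locus], profile_b[locus]
        if genotype_a is None or genotype_b is None:
            continue
        compared += 1
        if sorted(genotype_a) != sorted(genotype_b):
            discordant += 1
    return compared, discordant
```

With 131 loci and a 95% loci-pass rate, each profile can have at most six failed loci, so `min_comparable_loci(131, 0.95)` evaluates to 131 - 12 = 119.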

Results:

A set of 131 genetic markers (Supplementary Table S3) was forensically validated for genetic individualisation in the bigleaf maple (Acer macrophyllum). A reference database of 394 individual A. macrophyllum trees from 43 separate collection sites (Fig. 1, Supplementary Table S1) was successfully genotyped on the MassARRAY® iPLEX™ platform (Agena Bioscience™, San Diego, USA).

Quality control thresholds

A minimum loci-pass rate of 95% was applied to all profiles (Fig. 2). This threshold was determined through examination of all A. macrophyllum samples and negative control (reagent blank) profiles, and selected to maximise the inclusion of informative profiles whilst mitigating the risk of false positive profile inclusion (Fig. 2). A minimum DNA concentration threshold of 0.625 ng/μL was also applied to all samples (see Results: Sensitivity and species specificity). One negative control sample showed amplification in 74% of loci. The generated profile was below the pass-rate threshold of 95% and DNA concentration was below the minimum of 0.625 ng/μL. However, as this result is very different from the other negative controls (Fig. 2), we conclude it represents a contamination event. The generated profile was not a match to any other samples in the data set. No other potential sources of Acer contamination could be identified. To mitigate the risk that contamination could have impacted the resulting profiles of other samples in the reference data set, cambium from a subset of 22 individuals across the extraction plates was re-extracted (see Methods: DNA extractions) along with new negative control reagent blanks. All sample profiles generated from these re-extractions passed QC thresholds. Both negative controls failed to meet QC thresholds for samples (as expected), with 2% and 4% loci-pass rates and undetectable DNA concentrations. Nineteen of the 21 comparisons to the original profiles in the reference data base showed 100% concordance. Two comparisons showed discordance in one allele (at different loci in each comparison). Overall, the error rate between the re-extracted profiles and the original profiles from the reference data base was 0.1% per locus.

Final genetic marker selection

Of the 183 genotyped loci, 135 were selected that conformed to within-subpopulation Hardy-Weinberg equilibrium expectations, were not significantly linked to other loci and had an observed heterozygosity of more than 0.1. A further four loci were excluded after inheritance analysis (see Results: Mendelian inheritance), giving a final set of 131 loci (Supplementary Table S3).

Mendelian inheritance

Twelve of the 20 mother trees genotyped passed QC thresholds so were suitable for further analysis. Profiles of 98 seedlings germinated from seeds collected from around these mother trees also passed QC so were suitable for comparison. In total 13,172 loci comparisons were made between mother trees and seedlings. Of these, 221 were discordant (profiles of mother trees and seedlings did not share at least one allele). Simulations to identify seeds incorrectly assigned to a mother-tree (see Methods: Inheritance assessment) led to the exclusion of 23 seedlings from comparisons (Fig. S2). For the final assessment of the inheritance of the markers, 10,074 comparisons were made between 75 seedlings and 10 mother-trees. Discordance was observed in five comparisons across four loci. Overall the error rate between the seedlings and the mother trees was 0.05% per locus; however, to maintain conservativeness, the four loci where discordance was observed were subsequently excluded from further analysis.
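The resampling procedure used to flag seedlings assigned to the wrong mother tree can be approximated in Python as follows. This is a sketch under stated assumptions, not a reproduction of the Resampling Stats for Excel analysis: the function names are hypothetical, the pool of outcomes is built from the total counts of concordant and discordant comparisons, and "without replacement" is interpreted as applying within each simulated seedling.

```python
import math
import random


def shares_allele(mother_genotype, seedling_genotype):
    """A locus is concordant when mother and seedling share at least one allele."""
    return bool(set(mother_genotype) & set(seedling_genotype))


def simulate_discordance_counts(n_concordant, n_discordant, loci_per_seedling,
                                n_sim=10_000, seed=0):
    """Number of discordant loci in each simulated seedling profile."""
    rng = random.Random(seed)
    pool = [1] * n_discordant + [0] * n_concordant  # 1 marks a discordant outcome
    return [sum(rng.sample(pool, loci_per_seedling)) for _ in range(n_sim)]


def cutoff_at(counts, quantile=0.99):
    """Smallest count that is not exceeded by the given share of simulations."""
    ordered = sorted(counts)
    index = math.ceil(quantile * len(ordered)) - 1
    return ordered[index]


def flag_seedlings(real_counts, cutoff):
    """Seedlings with more discordant loci than the simulated cutoff."""
    return [sid for sid, n in real_counts.items() if n > cutoff]
```

For example, the pool could be built from the 221 discordant and 12,951 concordant comparisons reported above, with `loci_per_seedling` set to the number of loci genotyped per seedling; seedlings returned by `flag_seedlings` would then be candidates for exclusion.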

Sensitivity and species specificity

All DNA concentrations assessed returned the expected profile and were above the QC threshold for minimum number of loci passed (95%). The validated limits of DNA concentration are therefore 0.625-10 ng/µL. In the specificity analysis, all samples from different species failed to produce profiles that passed QC thresholds. The congener Acer circinatum did return allele calls in 76% of loci; all other species tested did not amplify at any loci.

Precision, accuracy and case-type samples

Of the seven paired leaf and cambium samples, DNA from one cambium and one leaf sample failed QC. The five remaining pairs showed 100% concordance between the leaf and cambium tissue types at all loci. Of the five paired cambium and sawn timber samples extracted using different methods in different laboratories by different operators, DNA from one cambium sample and its paired sawn timber sample failed QC. The four remaining pairs showed 100% concordance between the cambium and sawn timber tissue types at all loci. Of the 19 total sawn timber samples (five paired with cambium samples, 14 unpaired), the overall number of samples passing QC was nine, representing an approximate success rate for profile generation from DNA from sawn timber of 47%. The error rate was assessed by comparing the allele calls of 87 pairs of technical replicates, giving a total of 11,360 loci comparisons. No discordant loci were observed, giving an error rate per locus of <0.009%.

Probability of identity

The probability of identity (PID) of the most common profile was 6.182 × 10⁻³⁰, and the PID of the rarest was 1.227 × 10⁻¹²⁰, calculated using locus specific FST-correction factors. When an overall dataset level FST-correction factor was used, the PID of the most common profile was 1.785 × 10⁻²⁵, and the PID of the rarest was 2.616 × 10⁻⁸¹. The probability of identity for siblings (PID-SIB) across all individuals (treated as a single population) was 2.496 × 10⁻²², and the most conservative (i.e. the highest) PID-SIB within individual subpopulations was 7.513 × 10⁻¹⁷, found in subpopulation 38 located in northern Washington (Fig. 1).

Discussion:

This paper represents the first publication of the development and forensic validation of individualisation markers for a timber species, in this case the bigleaf maple (Acer macrophyllum). A set of 128 SNP and 3 INDEL markers (Supplementary Table S3) was forensically validated for genetic individualisation on the MassARRAY® iPLEX™ platform using a reference data base of 394 individual trees from 43 collection sites (Fig. 1, Table S1). The discriminatory power afforded by these markers is extremely high; the most conservative estimate found that the chance of two siblings from within the same subpopulation sharing the same DNA profile (PID-SIB in subpopulation 38 in northern Washington, Fig. 1) was 7.513 × 10⁻¹⁷. The most conservative estimate of the chance of unrelated individuals sharing the same DNA profile (PID for the most common profile, calculated using an overall dataset level FST-correction factor) was 1.785 × 10⁻²⁵.
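To illustrate how very small multi-locus values such as the PID-SIB quoted above arise, the sketch below multiplies per-locus sibling identity probabilities across independent loci. It assumes the widely used per-locus formula PID-SIB = 0.25 + 0.5Σp² + 0.5(Σp²)² − 0.25Σp⁴, where p are the allele frequencies at a locus; that this matches the exact calculation performed by the software used here is an assumption. The dataset-level FST helper (mean plus three standard deviations) uses the sample standard deviation, which is also an assumption, and all function names are hypothetical.

```python
import math
import statistics


def pid_sib_locus(allele_freqs):
    """Per-locus probability that two full siblings share a genotype."""
    s2 = sum(p ** 2 for p in allele_freqs)
    s4 = sum(p ** 4 for p in allele_freqs)
    return 0.25 + 0.5 * s2 + 0.5 * s2 ** 2 - 0.25 * s4


def log10_pid_sib(loci_allele_freqs):
    """log10 of the multi-locus PID-SIB, assuming independent loci.

    Working in log space avoids numerical underflow for large panels.
    """
    return sum(math.log10(pid_sib_locus(f)) for f in loci_allele_freqs)


def conservative_fst(per_locus_fst):
    """Dataset-level FST: mean of per-locus values plus three standard deviations."""
    return statistics.mean(per_locus_fst) + 3 * statistics.stdev(per_locus_fst)
```

A biallelic locus with equal allele frequencies gives a per-locus value of 0.59375, so even maximally informative SNPs contribute only modestly one at a time; the discriminatory power comes from combining many independent loci.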

Progress on the use of DNA methods for forensic identification of timber has lagged behind that for humans and wildlife [20]. However, both wildlife and timber crimes are now receiving increasing international attention, with a growing awareness of the fragility of natural ecosystems in the face of rampant poaching [6, 31]. Legislation such as the United States amended Lacey Act 2008, the European Union’s EUTR 2010, Australia’s Illegal Logging Prohibition Act 2012 [4] and most recently Japan’s Clean Wood Act 2017, along with more timber species listed on the appendices to CITES, provides the opportunity to stop illegal trade but inevitably requires parallel investment in forensic tools that facilitate identification and prosecution of these crimes [5].

Identification of timber using DNA also presents some technical challenges. The low quantity and quality of extractable DNA require the use of genetic markers and a genotyping platform robust to these limitations. Single nucleotide polymorphisms (SNPs) and small insertions/deletions (INDELs) are superior in this respect to short tandem repeats (STRs), the traditional genetic marker of forensic DNA identification, because they do not require such large intact fragments of DNA to provide an accurate result. Use of the MassARRAY® iPLEX™ platform also allowed accurate genotyping from very low concentrations of DNA. Despite these approaches, overall DNA extraction and genotyping success from timber material was relatively low (47%). This success rate is likely the result of low DNA quantity and quality in the wood to begin with, combined with co-extraction of inhibitory compounds that negatively affect PCR success. Opportunities to increase the success rate of the assay for timber samples will likely centre on improvements in the extraction method, such as modified lysis and wash buffers to better remove co-extracting inhibitors.

Very sensitive assays are prone to contamination, as we experienced in this project. Despite undetectable DNA concentrations, substantial amplification was observed in a negative control reagent blank. Although it fell below the pass-rate threshold imposed as part of QC (Fig. 2) and produced a partial profile not seen in any other individual in the reference database, this result indicated that contamination had occurred. To assess the reliability of the original profiles generated from samples processed alongside this negative control, a subset of the original samples was re-extracted using a different protocol with additional negative control reagent blanks, and subsequently genotyped. All profiles matched those generated from the original DNA extractions, verifying the reliability of the original profiles.
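The pass-rate threshold referred to above (Fig. 2) can be pictured as a simple filter on the fraction of loci that amplified in each profile. The sketch below is a minimal illustration under stated assumptions: the 95% cut-off comes from the Fig. 2 legend, but the data layout (a list of calls per profile, with `None` for a failed locus), the inclusive comparison, and the function names are all hypothetical and are not taken from the study's analysis pipeline.

```python
def call_rate(profile):
    """Fraction of loci with a genotype call; None marks a failed locus."""
    if not profile:
        return 0.0
    return sum(call is not None for call in profile) / len(profile)


def passes_qc(profile, threshold=0.95):
    """True if the profile meets the pass-rate threshold (assumed inclusive)."""
    return call_rate(profile) >= threshold


def flag_blanks(blanks):
    """Return the names of reagent blanks that show any amplification."""
    return [name for name, profile in blanks.items() if call_rate(profile) > 0.0]


if __name__ == "__main__":
    n_loci = 131  # size of the validated panel
    good = ["AG"] * 125 + [None] * 6        # 125 of 131 loci called
    marginal = ["AG"] * 124 + [None] * 7    # 124 of 131 loci called
    blanks = {
        "blank_1": [None] * n_loci,              # clean reagent blank
        "blank_2": ["AG"] * 60 + [None] * 71,    # partial profile in a blank
    }
    print(passes_qc(good), passes_qc(marginal))
    print(flag_blanks(blanks))
```

Keeping the blank check separate from the sample filter reflects the situation described above: a contaminated blank can fall below the pass-rate threshold and still need to be reported and investigated.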

The contamination in the negative control is thought to have occurred during the DNA extraction procedure at AGRF Adelaide, where samples in 96-well plates were homogenised in a bead mill and fine particles of powdered material could have become airborne when tubes were opened, eventually landing in other wells. Where another sample is present in the well, this contamination has had no discernible effect, as the contaminant particles represent a tiny fraction of the material present in the tube and hence cannot effectively compete in the PCR and are not detected in the amplified products. However, when the well is empty, as in a negative control reagent blank, the contaminant material is the only signal present and hence is amplified. The novel partial profile observed is likely a result of contaminant particles from multiple sample wells entering the negative control well during this homogenisation step. This contamination risk presents potential problems for the application of the assay in casework, and hence we recommend that all DNA extractions for casework be undertaken in single tubes as opposed to plates. For example, our re-extractions were completed in single tubes, homogenised in the BR24 (Omni International, Kennesaw, USA) and extracted using the DNeasy® Plant Mini kit run on the QIAcube® (Qiagen Inc., Valencia, USA), and the reagent blank negative controls for these extractions were clear of contamination.

A final hurdle for the development of timber identification genetic markers appropriate for use in legal cases has been the lack of a clear interpretation of validation guidelines. The current standard guidelines [17] were developed with human DNA identification in mind. Here we have worked through the guidelines, interpreted them specifically for timber identification and made recommendations to provide a framework for future studies (Supplementary Document S1). Our paper also represents the first publication (of which we are aware) that demonstrates the appropriate use of an FST-correction factor in the calculation of PID when used with SNP data. This correction is particularly important to ensure that the estimate is conservative and not inflated by strong subpopulation differentiation.

Broader application of the developed DNA assay could provide a means for legitimate traders in bigleaf maple to demonstrate their compliance with the law and with sustainability practices, by linking products to their original felling sites and so providing supply chain verification [4, 16]. Further work on the species should examine the potential to use these markers for geographic assignment [32], to link products back to a region of origin without the requirement for an additional sample from the original tree [33-35]. Similar methods have been successfully developed for identification of the origin of ivory [36] and for verification in the seafood sector [37].

Conclusion:

The present study reports on the development and forensic validation of a DNA assay for the identification of bigleaf maple, Acer macrophyllum, a high-value timber species often illegally felled in the USA. The assay was forensically validated, including characterisation of the inheritance mode, sensitivity, specificity, precision and accuracy, to ensure suitability for legal applications. It could be used to support law enforcement efforts to detect and deter illegal logging activity, as well as to support industry compliance efforts through supply chain verification of legality and sustainability.

CRediT Author Statement

Eleanor Dormontt: Conceptualization, Methodology, Validation, Formal Analysis, Investigation, Writing – Original Draft, Writing – Review & Editing, Visualization, Project Administration
Duncan Jardine: Conceptualization, Methodology, Validation, Investigation, Writing – Review & Editing
Kor-jent van Dijk: Conceptualization, Methodology, Investigation, Writing – Review & Editing
Bianca Dunker: Methodology, Validation, Investigation, Writing – Review & Editing
Rainbo Dixon: Methodology, Validation, Investigation, Writing – Review & Editing
Valerie Hipkins: Conceptualization, Methodology, Investigation, Resources, Writing – Review & Editing
Shane Tobe: Methodology, Formal Analysis, Writing – Review & Editing
Adrian Linacre: Methodology, Writing – Review & Editing
Andrew Lowe: Conceptualization, Methodology, Resources, Writing – Review & Editing, Supervision, Project Administration, Funding Acquisition

Conflict of Interests

None

Acknowledgements:

We thank the World Resources Institute (WRI) and Double Helix Tracking Technologies for funding this work, the USDA Forest Service Agents for bringing bigleaf maple theft to our attention, Anne Minden and Jason O'Brien for their sampling efforts, Richard Cronn and Edgard Espinoza for their advice on project design, Marlee Crawford and Dona Kireta for their assistance in the laboratory, and finally David Hawkes and Trent Peters for their expertise and advice on MassARRAY analyses.

References:

[1] C. Nellemann, INTERPOL Environmental Crime Programme, Green carbon, black trade: illegal logging, tax fraud and laundering in the world's tropical forests. A rapid response assessment, United Nations Environment Programme, GRID-Arendal, Birkeland Trykkeri AS, Norway, 2012.
[2] G. Jianbang, P. Cerutti, M. Masiero, D. Pettenella, N. Andrighetto, T. Dawson, Quantifying illegal logging and related timber trade, International Union of Forest Research Organizations (IUFRO), Vienna, Austria, 2016.
[3] M. Mendoza, Losing ground to timber thieves: illegal logging chips away at forests, but one court puts foot down, Seattle Post-Intelligencer, Associated Press, Seattle, 2003.
[4] A.J. Lowe, E.E. Dormontt, M.J. Bowie, B. Degen, S. Gardner, D. Thomas, C. Clarke, A. Rimbawanto, A. Wiedenhoeft, Y. Yin, Opportunities for improved transparency in the timber trade through scientific verification, BioScience 66(11) (2016) 990-998. https://doi.org/10.1093/biosci/biw129
[5] E.E. Dormontt, M. Boner, B. Braun, G. Breulmann, B. Degen, E. Espinoza, S. Gardner, P. Guillery, J.C. Hermanson, G. Koch, S.L. Lee, M. Kanashiro, A. Rimbawanto, D. Thomas, A.C. Wiedenhoeft, Y. Yin, J. Zahnen, A.J. Lowe, Forensic timber identification: it’s time to integrate disciplines to combat illegal logging, Biol. Conserv. 191 (2015) 790-798. https://doi.org/10.1016/j.biocon.2015.06.038
[6] United Nations Office on Drugs and Crime, Best practice guide for forensic timber identification, Vienna, Austria, 2016.
[7] K.K.S. Ng, S.L. Lee, L.H. Tnah, Z. Nurul-Farhanah, C.H. Ng, C.T. Lee, N. Tani, B. Diway, P.S. Lai, E. Khoo, Forensic timber identification: a case study of a CITES listed species, Gonystylus bancanus (Thymelaeaceae), Forensic Science International: Genetics 23 (2016) 197-209. https://doi.org/10.1016/j.fsigen.2016.05.002
[8] S. Hassold, P.P. Lowry, II, M.R. Bauert, A. Razafintsalama, L. Ramamonjisoa, A. Widmer, DNA barcoding of Malagasy rosewoods: towards a molecular identification of CITES-listed Dalbergia species, PLoS One 11(6) (2016) e0157881. https://doi.org/10.1371/journal.pone.0157881
[9] G. Koch, V. Haag, I. Heinz, H. Richter, U. Schmitt, Control of internationally traded timber - the role of macroscopic and microscopic wood identification against illegal logging, Journal of Forensic Research 6(6) (2015) 317. http://dx.doi.org/10.4172/2157-7145.1000317
[10] R.A. Musah, E.O. Espinoza, R.B. Cody, A.D. Lesiak, E.D. Christensen, H.E. Moore, S. Maleknia, F.P. Drijfhout, A high throughput ambient mass spectrometric approach to species identification and classification from chemical fingerprint signatures, Scientific Reports 5 (2015) 11520. https://doi.org/10.1038/srep11520
[11] M.C. Bergo, T.C. Pastore, V.T. Coradin, A.C. Wiedenhoeft, J.W. Braga, NIRS identification of Swietenia macrophylla is robust across specimens from 27 countries, IAWA Journal 37(3) (2016) 420-430. https://doi.org/10.1163/22941932-20160144
[12] M. Vlam, G.A. de Groot, A. Boom, P. Copini, I. Laros, K. Veldhuijzen, D. Zakamdi, P.A. Zuidema, Developing forensic tools for an African timber: Regional origin is revealed by genetic characteristics, but not by isotopic signature, Biol. Conserv. 220 (2018) 262-271. https://doi.org/10.1016/j.biocon.2018.01.031
[13] F.T. Peters, O.H. Drummer, F. Musshoff, Validation of new methods, Forensic Science International 165(2-3) (2007) 216-224. https://doi.org/10.1016/j.forsciint.2006.05.021
[14] R. Ogden, N. Dawnay, R. McEwing, Wildlife DNA forensics—bridging the gap between conservation genetics and law enforcement, Endangered Species Research 9(3) (2009) 179-195. https://doi.org/10.3354/esr00144
[15] L. Jiao, X. Liu, X. Jiang, Y. Yin, Extraction and amplification of DNA from aged and archaeological Populus euphratica wood for species identification, Holzforschung 69(8) (2015) 925-931. https://doi.org/10.1515/hf-2014-0224
[16] A. Lowe, K. Wong, Y. Tiong, S. Iyerh, F. Chew, A DNA method to verify the integrity of timber supply chains; confirming the legal sourcing of merbau timber from logging concession to sawmill, Silvae Genetica 59(6) (2010) 263. https://doi.org/10.1515/sg-2010-0037

[17] Scientific Working Group on DNA Analysis Methods, SWGDAM validation guidelines for DNA analysis methods, (2016).
[18] B. Budowle, P. Garofano, A. Hellman, M. Ketchum, S. Kanthaswamy, W. Parson, W. van Haeringen, S. Fain, T. Broad, Recommendations for animal DNA forensic and identity testing, International Journal of Legal Medicine 119(5) (2005) 295-302. https://doi.org/10.1007/s00414-005-0545-9
[19] A. Linacre, L. Gusmao, W. Hecht, A.P. Hellmann, W.R. Mayr, W. Parson, M. Prinz, P.M. Schneider, N. Morling, ISFG: Recommendations regarding the use of non-human (animal) DNA in forensic genetic investigations, Forensic Science International: Genetics 5(5) (2011) 501-5. https://doi.org/10.1016/j.fsigen.2010.10.017
[20] A. Iyengar, Forensic DNA analysis for animal protection and biodiversity conservation: A review, Journal for Nature Conservation 22(3) (2014) 195-205. https://doi.org/10.1016/j.jnc.2013.12.001
[21] A.J. Lowe, D.I. Jardine, H.B. Cross, B. Degen, L. Schindler, A.M. Holtken, A method of extracting plant nucleic acids from lignified plant tissue, President and fellows of Harvard College; The Minister for sustainability, environment and conservation, for and on behalf of the State of South Australia; Adelaide Research & Innovation PTY LTD, 2015.
[22] D.I. Jardine, E.E. Dormontt, K.J. van Dijk, R.R.M. Dixon, B. Dunker, A.J. Lowe, A set of 204 SNP and INDEL markers for Bigleaf maple (Acer macrophyllum Pursch), Conservation Genet Resour 7(4) (2015) 797-801. https://doi.org/10.1007/s12686-015-0486-7
[23] P. Johansen, J. Andersen, C. Børsting, N. Morling, Evaluation of the iPLEX® Sample ID Plus Panel designed for the Sequenom MassARRAY® system. A SNP typing assay developed for human identification and sample tracking based on the SNPforID panel, Forensic Science International: Genetics 7(5) (2013) 482-487. http://dx.doi.org/10.1016/j.fsigen.2013.04.009
[24] F. Rousset, Genepop’007: a complete re‐implementation of the genepop software for Windows and Linux, Molecular Ecology Resources 8(1) (2008) 103-106. https://doi.org/10.1111/j.1471-8286.2007.01931.x
[25] P.G. Meirmans, P.H. Van Tienderen, GENOTYPE and GENODIVE: two programs for the analysis of genetic diversity of asexual organisms, Molecular Ecology Notes 4(4) (2004) 792-794. https://doi.org/10.1111/j.1471-8286.2004.00770.x
[26] D.J. Balding, R.A. Nichols, DNA profile match probability calculation: how to allow for population stratification, relatedness, database selection and single bands, Forensic Science International 64(2-3) (1994) 125-140. https://doi.org/10.1016/0379-0738(94)90222-4
[27] L.P. Waits, G. Luikart, P. Taberlet, Estimating the probability of identity among genotypes in natural populations: cautions and guidelines, Molecular Ecology 10(1) (2001) 249-256. https://doi.org/10.1046/j.1365-294X.2001.01185.x
[28] R. Peakall, P.E. Smouse, GENALEX 6: genetic analysis in Excel. Population genetic software for teaching and research, Molecular Ecology Notes 6(1) (2006) 288-295. https://doi.org/10.1111/j.1471-8286.2005.01155.x
[29] R. Peakall, P.E. Smouse, GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research—an update, Bioinformatics 28(19) (2012) 2537-2539. https://doi.org/10.1111/j.1471-8286.2005.01155.x
[30] J. Stejskalová, I. Kupka, S. Miltner, Effect of gibberellic acid on germination capacity and emergence rate of Sycamore maple (Acer pseudoplatanus L.) seeds, J. For. Sci. 61(8) (2015) 325-331. https://doi.org/10.17221/22/2015-JFS
[31] United Nations Office on Drugs and Crime, Guidelines on methods and procedures for ivory sampling and laboratory analysis, Vienna, Austria, 2014.
[32] R. Ogden, A. Linacre, Wildlife forensic science: a review of genetic geographic origin assignment, Forensic Science International: Genetics 18 (2015) 152-159. https://doi.org/10.1016/j.fsigen.2015.02.008

[33] C. Jolivet, B. Degen, Use of DNA fingerprints to control the origin of sapelli timber (Entandrophragma cylindricum) at the forest concession level in Cameroon, Forensic Science International: Genetics 6(4) (2012) 487-493.
[34] L.H. Tnah, S.L. Lee, K.K.S. Ng, N. Tani, S. Bhassu, R.Y. Othman, Geographical traceability of an important tropical timber (Neobalanocarpus heimii) inferred from chloroplast DNA, Forest Ecology and Management 258(9) (2009) 1918-1923. https://doi.org/10.1016/j.foreco.2009.07.029
[35] B. Degen, S.E. Ward, M.R. Lemes, C. Navarro, S. Cavers, A.M. Sebbenn, Verifying the geographic origin of mahogany (Swietenia macrophylla King) with DNA-fingerprints, Forensic Science International: Genetics 7(1) (2013) 55-62. https://doi.org/10.1016/j.fsigen.2012.06.003
[36] S. Wasser, L. Brown, C. Mailand, S. Mondol, W. Clark, C. Laurie, B. Weir, Genetic assignment of large seizures of elephant ivory reveals Africa’s major poaching hotspots, Science 349(6243) (2015) 84-87. https://doi.org/10.1126/science.aaa2457
[37] E.E. Nielsen, A. Cariani, E.M. Aoidh, G.E. Maes, I. Milano, R. Ogden, M. Taylor, J. Hemmer-Hansen, M. Babbucci, L. Bargelloni, D. Bekkevold, E. Diopere, L. Grenfell, S. Helyar, M.T. Limborg, J.T. Martinsohn, R. McEwing, F. Panitz, T. Patarnello, F. Tinti, J.K.J. Van Houdt, F.A.M. Volckaert, R.S. Waples, c. FishPopTrace, G.R. Carvalho, Gene-associated markers provide tools for tackling illegal fishing and false eco-certification, Nat Commun 3 (2012) 851. https://doi.org/10.1038/ncomms1845

Figure and Table Legends:

Fig. 1. Bigleaf maple illegal logging modus operandi and sampling locations. a, Trees are selected based on ‘figured’ patterning, observed through removal of the bark in a small area, ‘quilted’ figuring shown here. b, Once felled, the stump is often covered in moss to evade detection. c, Bark is removed from the felled log revealing the extent of figured patterning. The log is often then cut into blocks. d, In some cases little attempt is made to conceal the crime scene, only the most valuable (and portable) blocks are removed with the remaining timber left to rot in the forest. e, Western North America showing National Parks and Forests (green). f, Total study area with sampling locations (red circles). g, Area of most extensive sampling. Photo credits: Anne Minden.

Fig. 2. Loci pass-rate threshold. The percentage of loci which amplified in each profile (excluding those where 0% of loci amplified) with profiles ordered by decreasing loci amplification success. Profiles from genotyped samples are presented (black circles), along with reagent blank negative controls (red squares). The dashed line represents the 95% pass-rate threshold applied to samples to mitigate the risk of false positive DNA profile inclusion in subsequent analyses.

Supplementary Table S1. Location and number of individuals of Acer macrophyllum successfully genotyped at each collection site.

Supplementary Table S2. Summary of all samples collected in the study.

Supplementary Table S3. List of 131 loci used for individualisation in bigleaf maple, a subset of those developed by Jardine et al. [22]. HO, observed heterozygosity per locus; FST, Wright’s fixation index per locus.

Supplementary Fig. S1. Dissection of germinating Acer macrophyllum seeds. (a) After cold stratification, germinating seeds were dissected to reveal the seed coat and separate it from the pericarp. (b) The seed coat was then removed and the growing cotyledons used for DNA extraction.

Supplementary Fig. S2. Distribution of discordant allele calls in 10,000 simulated seedlings.

Supplementary Document S1. Framework for the interpretation of SWGDAM validation guidelines for DNA analysis methods specifically for the developmental validation of individualisation assays for timber species.