Improving PDQ database search strategies to enhance investigative lead information for automotive paints


Microchemical Journal 117 (2014) 133–137


Microchemical Journal journal homepage: www.elsevier.com/locate/microc

B.K. Lavine a,⁎, A. Fasasi a, M. Sandercock b

a Department of Chemistry, Oklahoma State University, Stillwater, OK 74078, United States
b National Center for Forensic Sciences, RCMP, 15707-118th Ave., Edmonton, Alberta T5V 1B7, Canada

Article info

Article history: Received 24 May 2014; Accepted 6 June 2014; Available online 13 June 2014

Keywords: Search prefilters; IR spectral library searching; Forensic automotive paint analysis; Data fusion; Pattern recognition

Abstract

Modern automotive paints have a thin color coat which, on a microscopic fragment, is often too thin to yield accurate chemical information. The small size of the fragment also makes it difficult to compare it accurately with the manufacturer's paint color standards. Because adhesion between paint layers is usually very strong, both primer layers are often transferred during a collision if the clear coat and color coat layers are also transferred. By applying low-level data fusion techniques, in which spectra from multiple sources (e.g., IR spectra of clear coat and primer paint layers) are combined before class membership information is extracted, search prefilters have been developed to determine the assembly plant of the vehicle from which an unknown paint sample originated. Even in challenging trials where the clear coat and undercoat layers evaluated were all from the same make (General Motors) within a limited production year range, the respective assembly plants of the vehicles could be identified using only chemical information. The development of search prefilters for the PDQ database that exploit multiple sources of IR data is needed to extract investigative lead information from clear coat and primer paint layer smears. © 2014 Elsevier B.V. All rights reserved.

1. Introduction

Automotive vehicles can be identified from paint fragments transferred onto the clothing of a pedestrian involved in a hit-and-run accident by comparing the color, layer sequence, and chemical composition of each individual layer of the paint [1,2]. To make these comparisons possible, the Royal Canadian Mounted Police have developed a comprehensive automotive paint database known as the paint data query (PDQ) database, as well as the means of searching and retrieving information from it [3,4]. Currently, PDQ contains over 21,000 samples (street samples and factory panels) that correspond to over 84,000 individual paint layers, representing the paint systems used in most domestic and foreign vehicles marketed in North America. PDQ is a database of the physical attributes, the chemical composition and the infrared (IR) spectrum of each layer of the original manufacturer's automotive paint system. If the original automotive paint layers are present in the recovered (i.e., unknown) paint fragment, PDQ can assist in identifying the make and model of the automotive vehicle. The uniqueness of the PDQ database, and the support it has received from other forensic science laboratories around the world, have made PDQ a world-wide standard.

Automotive paint of the original factory applied system has a typical layer sequence of primer, surfacer, color coat and clear coat [5,6].

⁎ Corresponding author at: Department of Chemistry, 107 Physical Sciences I, Oklahoma State University, Stillwater, OK 74078, United States.

http://dx.doi.org/10.1016/j.microc.2014.06.007 0026-265X/© 2014 Elsevier B.V. All rights reserved.

Modern automotive paint systems possess a thin color coat which, on a microscopic fragment, is often too thin to yield accurate chemical information. The small size of the fragment also makes it difficult to compare it accurately with the manufacturer's paint color standards. Because adhesion between paint layers is usually very strong, both primer layers are often transferred during a collision if both the clear coat and color coat layers are also transferred. As the primer and clear coat layers are often unique to the assembly plant where they were applied, combining chemical information obtained from the infrared (IR) spectra of the two primer layers and the clear coat layer should make it possible to rapidly and accurately identify the make and model of the vehicle, and to specify certain years of manufacture, from its paint system alone. Applying low-level data fusion techniques [7], search prefilters for IR spectral library matching have been developed to differentiate between similar but nonidentical FTIR paint spectra, and to determine the assembly plant (and hence the make and model) of the vehicle from which an unknown paint sample originated. Even in challenging trials where the clear coat and undercoat paint layers evaluated were all from the same make (General Motors) within a limited production year range (2000–2006), the respective assembly plants were correctly identified using only chemical information from IR spectra. The development of search prefilters for the PDQ database from IR spectra of the clear coat and undercoat automotive paint layers using pattern recognition techniques, which is the focus of this study, is necessary to extract investigative lead information from automotive paint smears that are often left at the crime scene in a hit-and-run accident where death or injury to a pedestrian has occurred.

Search prefilters, which avoid a complete spectral comparison between the full IR spectrum of an unknown and each member of the spectral library, address many of the problems encountered when commercial library search algorithms are applied to forensic paint analysis, including their tendency to ignore bands of low intensity and peak shoulders, which may be highly informative. Search prefilters developed in this study also address a major problem encountered with the PDQ database, which is the use of text to code the chemistry of each automotive paint layer. Searches of the PDQ database require users to code the IR spectrum of the recovered paint sample according to the guidelines set out in the database, and to search these codes against the codes in the database. The coding used in PDQ is generic and can lead to nonspecific search criteria, which result in a large number of hits that a scientist must then work through and eliminate. The accuracy of a PDQ search is therefore impaired by the conversion of spectral information to generic coded text. Using search prefilters, a preliminary identification of the paint fragment can be obtained which obviates all of the aforementioned problems.

2. Experimental

A total of 240 IR spectra of clear coat and undercoat paint smears (see Table 1) applied to the metal surfaces of automobiles assembled at 13 General Motors (GM) assembly plants were collected using two Thermo-Nicolet 6700 FTIR spectrometers. Both spectrometers were equipped with DTGS detectors, and all samples were collected in transmission mode using high pressure diamond anvil cells. A Harrick 6x beam condenser was used in each Thermo-Nicolet 6700 instrument. Of the 240 IR spectra, 80 were clear coats, 80 were e-coats and 80 were primer layers of automobile paint samples. Representative spectra of the clear coat, e-coat, and primer paint layers are shown in Fig. 1.
Data preprocessing was crucial to ensure a successful analysis of the data. IR spectra of the clear coats and undercoat paint layers were not properly aligned along their x- or y-axes because these spectra were collected on two different spectrometers whose He–Ne laser frequencies were set to 15798.0 and 15798.3 cm−1, respectively. Differences in the He–Ne laser frequencies also led to slight changes in the number of data points in each spectrum. Discrimination among assembly plants required finding small differences in similar IR spectra, so spectral alignment was crucial to developing the search prefilters. The first step in this study was to align all spectra (along the wavelength axis) using OMNIC, ensuring that the He–Ne laser frequency was set to 15798.0 cm−1 for all IR spectra. For the measured absorbance to be very nearly equal to the true absorbance, all interferograms were multiplied by the Norton–Beer medium apodization function before application of the Fourier transform [8]. Although all IR spectra were

Table 1. Training set and validation set samples.

Assembly plant    Number of training set samples    Number of validation set samples
1 (ARL)           10                                2
8 (FOR)           5                                 0
9 (FRE)           6                                 0
12 (JAN)          6                                 1
14 (LAN)          11                                2
16 (LIN)          5                                 0
17 (LRD)          5                                 0
18 (MOR)          6                                 2
21 (ORI)          3                                 0
23 (PON)          4                                 0
24 (RAM)          4                                 0
26 (SIL)          4                                 0
31 (ELI)          11                                2
Total             80                                9

Fig. 1. Representative IR spectra of the clear coat (A), e-coat (B), and primer (C) paint layers, plotted as % transmittance versus wavenumber (cm−1). Spectral range used for study: 600 cm−1 to 1500 cm−1.

measured at a nominal 4 cm−1 resolution, the number of points collected per spectrum on the two Thermo-Nicolet instruments varied from 1878 to 1958. To remedy this problem, each IR spectrum was normalized to the helium–neon laser frequency of 15798.0 cm−1. The laser frequency value was set to that measured at the aperture setting to make the peak positions independent of aperture setting. This ensured wavelength alignment along the entire x-axis for all paint samples. After this preprocessing, each IR spectrum consisted of 1869 points for the entire mid-IR range of 400 cm−1 to 4000 cm−1. Although differences between the two instruments were large enough to require alignment, these differences could not be detected visually. To verify wavelength alignment along the x-axis for all paint spectra used in this study, IR spectra of the same paint samples collected on the two Thermo-Nicolet instruments were subtracted before and after performing the alignment procedure. The subtraction yielded a nonzero response at each wavelength before alignment but zero at each point after alignment. For alignment along the y-axis, we ensured that each spectrum started at the same transmittance value for the spectral range investigated.

2.1. Pattern recognition analysis

The training set (see Table 1) consisted of 80 paint samples (i.e., 240 IR spectra) and the validation set consisted of 9 paint samples (see Table 1). Each IR spectrum was normalized to unit length. All spectral features were then autoscaled so that each measurement had a mean of zero and a standard deviation of one. Autoscaling removed any inadvertent weighting of the data that would otherwise occur due to differences in magnitude among the measurement variables comprising the data set.
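The preprocessing steps described above can be illustrated with a short sketch. It is not the OMNIC workflow used in this study: resampling by linear interpolation onto a shared wavenumber grid is an assumed stand-in for normalization to a common laser frequency, and the function names (`resample_to_grid`, `unit_length`, `autoscale`) and the synthetic spectra are hypothetical.

```python
import numpy as np

def resample_to_grid(wavenumbers, values, grid):
    """Linearly interpolate one spectrum onto a shared wavenumber grid.
    Assumed stand-in for the instrument-side alignment described in the text."""
    order = np.argsort(wavenumbers)            # np.interp needs ascending x
    return np.interp(grid, wavenumbers[order], values[order])

def unit_length(X):
    """Scale each spectrum (row) to Euclidean norm 1."""
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def autoscale(X):
    """Give each wavelength (column) zero mean and unit standard deviation."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std = np.where(std == 0, 1.0, std)         # guard constant columns
    return (X - mean) / std, mean, std

def band(x):
    """Synthetic transmittance spectrum with one absorption band (illustrative)."""
    return 100.0 - 60.0 * np.exp(-((x - 1200.0) / 40.0) ** 2)

# The same synthetic band sampled on two axes with different point counts.
grid = np.linspace(400.0, 4000.0, 1869)        # point count taken from the text
axis_a = np.linspace(400.0, 4000.0, 1878)
axis_b = np.linspace(400.0, 4000.0, 1958)

spec_a = resample_to_grid(axis_a, band(axis_a), grid)
spec_b = resample_to_grid(axis_b, band(axis_b), grid)
max_residual = np.max(np.abs(spec_a - spec_b)) # subtraction check after alignment

# Unit-length normalization per spectrum, then autoscaling per wavelength.
X = np.vstack([spec_a, spec_b, 0.5 * spec_a + 3.0])
X_unit = unit_length(X)
X_auto, col_mean, col_std = autoscale(X_unit)
```

In this sketch the subtraction check leaves only a small interpolation residual rather than exactly zero, which is one way the stand-in differs from the alignment procedure reported above.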
To develop the search prefilters, chemical information from the FTIR spectra of the clear coat paint layer and the two primer paint layers was analyzed both individually and in combination. Specific wavelengths in each FTIR spectrum that are characteristic of the assembly plant of the vehicle were identified using a genetic algorithm (GA) for pattern recognition [9–16]. Both supervised learning and unsupervised learning were used to identify wavelengths that optimized the separation of the IR spectra by assembly plant in a plot of the two or three largest principal components of the data. Because principal components maximize variance, the bulk of the information encoded by the wavelengths selected by the pattern recognition GA was about differences between the classes (assembly plants) in the database. A principal component plot that shows separation of the data by class can only be generated using spectral features whose variance or information is primarily about the differences between the assembly plants. This fitness criterion dramatically reduces

Fig. 2. Fusion of the fingerprint regions of the clear coat and the two undercoat paint layers (e-coat and primer) is shown. The first 487 elements of the fused data vector represent the fingerprint region of the clear coat layer, the next 487 elements represent the first undercoat layer (e-coat) and the last 487 elements represent the second undercoat layer. Panels: Clear Coat, E-Coat, Primer, and Data Fusion.

the size of the search space since it limits the search to these types of spectral feature subsets. In addition, as it trains, the pattern recognition GA focuses on those classes and/or samples that are difficult to classify by boosting the relative importance of assembly plants and specific samples that consistently score poorly. Over time, the algorithm learns its optimal parameters in a manner similar to a neural network. The pattern recognition GA integrates aspects of artificial intelligence and evolutionary computation to yield a “smart” one-pass procedure for wavelength selection and plant classification.
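The idea of scoring wavelength subsets by how well their principal component plot separates the classes can be sketched as follows. This is a deliberately simplified toy GA and not the pattern recognition GA of Refs. [9–16]: it omits the boosting of poorly scoring classes and samples, and the fitness function (nearest-neighbor agreement in the two-dimensional score plot), the genetic operators, all parameter values, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def pc_scores(X, n_components=2):
    """Scores on the largest principal components of column-centered X."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

def fitness(X, y, mask):
    """Fraction of samples whose nearest neighbor in the 2-D PC score plot
    shares their class label (an assumed, simplified fitness function)."""
    if mask.sum() < 2:
        return 0.0
    T = pc_scores(X[:, mask])
    d = np.linalg.norm(T[:, None, :] - T[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                # a sample is not its own neighbor
    return float(np.mean(y[d.argmin(axis=1)] == y))

def select_features(X, y, n_keep=4, pop_size=20, generations=30, seed=0):
    """Toy GA: truncation selection, uniform crossover, swap mutation."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]

    def random_mask():
        mask = np.zeros(n_features, dtype=bool)
        mask[rng.choice(n_features, n_keep, replace=False)] = True
        return mask

    population = [random_mask() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda m: fitness(X, y, m), reverse=True)
        parents = ranked[: pop_size // 2]      # keep the fitter half (elitism)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.choice(len(parents), 2, replace=False)
            child = np.where(rng.random(n_features) < 0.5, parents[a], parents[b])
            if rng.random() < 0.3 and child.any() and not child.all():
                # mutation: swap one selected feature for an unselected one
                on = rng.choice(np.flatnonzero(child))
                off = rng.choice(np.flatnonzero(~child))
                child[on], child[off] = False, True
            children.append(child)
        population = parents + children
    best = max(population, key=lambda m: fitness(X, y, m))
    return best, fitness(X, y, best)

# Synthetic data: 3 classes, 24 features, only the first 8 carry class information.
data_rng = np.random.default_rng(1)
y = np.repeat([0, 1, 2], 10)
X = data_rng.normal(size=(30, 24))
X[:, :8] += 10.0 * y[:, None]
best_mask, best_score = select_features(X, y)
```

Because the fitter half of each generation is carried over unchanged, the best score in this sketch never decreases from one generation to the next.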

2.2. Data fusion

As the fingerprint region of each layer is the most informative, only the spectral range of 1500 cm−1 to 600 cm−1 was used in this study for each layer (see Fig. 1). Although the fingerprint region goes below 600 cm−1, we chose to limit the range to 600 cm−1 as the region below 600 cm−1 is noisy for most of the IR spectra. The range from 1500 cm−1 to 600 cm−1 corresponded to 487 points in each IR spectrum. To combine the chemical information obtained from the clear coat and the two undercoat paint layers (e-coat and primer), the first 487 elements of the data vector represent the fingerprint region of the clear coat layer, the next 487 elements represent the first undercoat layer (e-coat) and the last 487 elements represent the second undercoat layer. This process is shown in Fig. 2. The pattern recognition GA will then identify the components of this data vector (i.e., specific wavelengths for each paint layer) that are correlated to the make and model of the automobile from which the paint sample was obtained.

3. Results and discussion

Successful mining of multivariate chemical data requires the user to combine empirical data with careful analysis and prior knowledge and reasoning. Supervised learning represents a systematic approach to this problem, which may be defined as the search for significant structure in data. The pattern recognition GA used in this study is designed to search for significant structure in multivariate data. Applying the pattern recognition GA to the FTIR spectra of the clear coat, e-coat, and primer paint layers, wavelengths characteristic of the assembly plant that produced the vehicles (and hence the model of

Fig. 3. Principal component plot (PC 2 versus PC 1) of the fingerprint region of the clear coats after wavelength selection by the pattern recognition GA. T-1 = Arlington, T-8 = Fort Wayne, T-9 = Fremont, T-12 = Janesville, T-14 = Lansing, T-16 = Linden, T-17 = Lordstown, T-18 = Moraine, T-21 = Orion Township, T-23 = Pontiac, T-24 = Ramos Arizpe, T-26 = Silao, Mexico and T-31 = Elizabeth.

Fig. 4. Principal component plot (PC 2 versus PC 1) of the fingerprint region of the e-coats after wavelength selection by the pattern recognition GA. T-1 = Arlington, T-8 = Fort Wayne, T-9 = Fremont, T-12 = Janesville, T-14 = Lansing, T-16 = Linden, T-17 = Lordstown, T-18 = Moraine, T-21 = Orion Township, T-23 = Pontiac, T-24 = Ramos Arizpe, T-26 = Silao, Mexico and T-31 = Elizabeth.

the vehicle) were identified for 80 automobiles and trucks from 13 assembly plants. The paint samples used in this study were obtained from metallic automotive substrates: the hood, roof, door, or trunk of the vehicle. Figs. 3 through 5 show plots of the two largest principal components of the wavelengths identified by the pattern recognition GA as being informative for the clear coat, e-coat, and primer paint layers, respectively. From an examination of these plots, it is evident that the clear coat and primer paint layers contain more information about the assembly plant than the e-coat. Furthermore, the primer paint layer separated the FTIR spectra by assembly plant better than the clear coat paint layer. For the clear coats (see Fig. 3), five assembly plants could be recognized by the search prefilter: Fremont, California (T-9), Janesville, Wisconsin (T-12), Linden, New Jersey (T-16), Lordstown (T-17), and Elizabeth, New Jersey (T-31). As for the e-coats (see Fig. 4), only two assembly plants were recognized: Arlington, Texas (T-1) and Linden, New Jersey (T-16). The primer paint layer (see Fig. 5) identified 9 of the 13 assembly plants: Arlington (T-1), Fremont (T-9), Janesville (T-12), Lansing (T-14), Lordstown (T-17), Orion Township (T-21), Ramos Arizpe (T-24), Silao, Mexico (T-26) and Elizabeth (T-31). The next step was to combine the chemical information contained in the clear coat and the two primer paint layers by fusing the IR spectra of the individual paint layers. Fig. 6 shows a principal component plot of the fused IR spectral data before feature selection.
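The fusion step mentioned at the end of the preceding paragraph amounts to concatenating the three 487-point fingerprint regions of each sample into one 1461-element vector. A minimal sketch follows; the function names `fuse_layers` and `layer_of` are hypothetical, and random numbers stand in for the measured spectra.

```python
import numpy as np

N_POINTS = 487   # points per layer in the 1500-600 cm^-1 window (from the text)

def fuse_layers(clear_coat, e_coat, primer):
    """Concatenate per-layer spectra (n_samples x 487 each) into fused vectors."""
    layers = [np.asarray(a, dtype=float) for a in (clear_coat, e_coat, primer)]
    for a in layers:
        if a.ndim != 2 or a.shape[1] != N_POINTS:
            raise ValueError("each layer must have shape (n_samples, 487)")
    return np.hstack(layers)

def layer_of(index):
    """Map a fused-vector index back to (layer name, index within that layer)."""
    names = ("clear coat", "e-coat", "primer")
    return names[index // N_POINTS], index % N_POINTS

# Placeholder data: 80 samples per layer, random values instead of real spectra.
rng = np.random.default_rng(0)
clear, ecoat, primer = (rng.random((80, N_POINTS)) for _ in range(3))
fused = fuse_layers(clear, ecoat, primer)
```

A helper such as `layer_of` makes it possible to report which paint layer a wavelength selected from the fused vector belongs to.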
The genetic algorithm for pattern recognition was used to identify spectral features characteristic of the profile of each assembly plant. The pattern recognition GA identifies features by sampling key feature subsets, scoring their PC plots, and tracking those paint samples or plants that were difficult to classify. The boosting routine used this information to steer the population to an optimal solution. After 200 generations, the GA identified 16 spectral features (i.e., transmittance values at 16 specified wavelengths) whose principal component plot showed clustering of the data on the basis of assembly plant (see Fig. 7). From this plot, it is evident that paint samples from all 13 General Motors assembly plants could be successfully discriminated using the fused data. As paint samples from the Fort Wayne (T-8) and Pontiac assembly plants

Fig. 6. Principal component plot (PC 2 versus PC 1) of the fused IR spectral data before wavelength selection. T-1 = Arlington, T-8 = Fort Wayne, T-9 = Fremont, T-12 = Janesville, T-14 = Lansing, T-16 = Linden, T-17 = Lordstown, T-18 = Moraine, T-21 = Orion Township, T-23 = Pontiac, T-24 = Ramos Arizpe, T-26 = Silao, Mexico and T-31 = Elizabeth.

(T-23) could not be differentiated by the clear coat, e-coat, or primer paint layers alone, one must conclude that a synergistic effect has occurred when FTIR spectra from these three paint layers were fused. To validate these results, an external prediction set of 9 paint samples was employed. IR spectra from the clear coat, e-coat and primer paint layers of these samples were fused, and this set of 9 fused spectra was mapped directly onto the principal component score plot defined by the 80 fused IR spectra of the training set and the 16 wavelengths identified by the pattern recognition GA (see Fig. 8). Each projected validation sample lies in a region of the map with IR spectra that have the same class label. Evidently, the pattern recognition GA can identify wavelengths from the fused spectra that are characteristic of the assembly plant responsible for assembling the vehicle. Thus, search prefilters with high selectivity and high recognition can be developed when chemical information from the 3 paint layers is combined into a single classifier.
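Mapping validation samples onto a score plot defined by the training set means that the scaling parameters and principal component loadings are computed from the training spectra only and then reused unchanged. The sketch below illustrates this under stated assumptions: the data are synthetic (16 features standing in for the 16 selected wavelengths), the function names are hypothetical, and the nearest-neighbor labeling is an assumed substitute for the visual inspection of the plot described above.

```python
import numpy as np

def fit_pc_map(X_train, n_components=2):
    """Learn autoscaling parameters and PC loadings from the training set only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    std = np.where(std == 0, 1.0, std)           # guard constant columns
    Z = (X_train - mean) / std
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return mean, std, Vt[:n_components].T        # loadings: features x PCs

def project(X, mean, std, loadings):
    """Map spectra onto the training-set PC plot without refitting anything."""
    return ((X - mean) / std) @ loadings

def nearest_class(scores, train_scores, train_labels):
    """Label each projected sample by its nearest training neighbor."""
    d = np.linalg.norm(scores[:, None, :] - train_scores[None, :, :], axis=2)
    return train_labels[d.argmin(axis=1)]

# Synthetic example: 3 classes, 10 training samples each, 16 features.
rng = np.random.default_rng(0)
centers = np.zeros((3, 16))
centers[1, :8] = 5.0
centers[2, 8:] = 5.0
y_train = np.repeat([0, 1, 2], 10)
X_train = centers[y_train] + rng.normal(scale=0.3, size=(30, 16))
y_val = np.array([0, 1, 2])
X_val = centers[y_val] + rng.normal(scale=0.3, size=(3, 16))

mean, std, loadings = fit_pc_map(X_train)
train_scores = project(X_train, mean, std, loadings)
val_scores = project(X_val, mean, std, loadings)
predicted = nearest_class(val_scores, train_scores, y_train)
```

Reusing the training mean, standard deviation and loadings keeps the validation samples from influencing the map onto which they are projected.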

Fig. 5. Principal component plot (PC 2 versus PC 1) of the fingerprint region of the primer paint layer after wavelength selection by the pattern recognition GA. T-1 = Arlington, T-8 = Fort Wayne, T-9 = Fremont, T-12 = Janesville, T-14 = Lansing, T-16 = Linden, T-17 = Lordstown, T-18 = Moraine, T-21 = Orion Township, T-23 = Pontiac, T-24 = Ramos Arizpe, T-26 = Silao, Mexico, and T-31 = Elizabeth.

Fig. 7. Principal component plot (PC 2 versus PC 1) of the fused IR spectral data after wavelength selection. T-1 = Arlington, T-8 = Fort Wayne, T-9 = Fremont, T-12 = Janesville, T-14 = Lansing, T-16 = Linden, T-17 = Lordstown, T-18 = Moraine, T-21 = Orion Township, T-23 = Pontiac, T-24 = Ramos Arizpe, T-26 = Silao, Mexico and T-31 = Elizabeth.

Fig. 8. Validation set samples projected onto the principal component plot (PC 2 versus PC 1) of the fused IR spectra comprising the training set after wavelength selection. For the training set, T-1 = Arlington, T-8 = Fort Wayne, T-9 = Fremont, T-12 = Janesville, T-14 = Lansing, T-16 = Linden, T-17 = Lordstown, T-18 = Moraine, T-21 = Orion Township, T-23 = Pontiac, T-24 = Ramos Arizpe, T-26 = Silao, Mexico and T-31 = Elizabeth. For the validation set, P-1 = Arlington, P-12 = Janesville, P-14 = Lansing, P-18 = Moraine, and P-31 = Elizabeth.

4. Conclusions

Pattern recognition analysis of the fused IR spectra of three automotive paint layers can provide information about the automotive vehicle from a paint fragment recovered at a crime scene. The pattern recognition GA was able to identify fingerprint patterns in the IR spectra of paints that are characteristic of the manufacturing plant. Search prefilters developed from IR spectra of the PDQ database will simplify library searching. When combined with search algorithms that are not only more powerful but also more computationally intensive than the Euclidean distance, similarity searching could become feasible.

Acknowledgments

The authors wish to express appreciation to Tamara Hodgins, Collin White and Nuwan Perera for their assistance. Financial support from the National Institute of Justice (2012-DN-BX-K059) is appreciated. The opinions, findings, and conclusions or recommendations expressed in this publication/program/exhibition are those of the author(s) and do not necessarily reflect those of the Department of Justice.

References

[1] A. Beveridge, T. Fung, D. MacDougall, Use of infrared spectroscopy for the characterization of paint fragments, in: B. Caddy (Ed.), Forensic Examination of Glass and Paint Analysis and Interpretation, Taylor and Francis, NY, 2001, pp. 220–233.
[2] A. Hobbs, Sifting through the layers: the application of forensic databases to tape and paint analyses, Trace Evidence Symposium, August 13–16, Clearwater Beach, FL, 2007.
[3] J.L. Buckle, D.A. MacDougal, R.R. Grant, A computerized system for the identification of suspect vehicles involved in hit and run accidents, Can. Soc. Forens. Sci. J. 30 (1997) 199–212.
[4] P.G. Rodgers, R. Cameron, N.S. Cartwright, W.H. Clark, J.S. Deak, E.W. Norman, The classification of automotive paint by diamond window infrared spectrophotometry, part I: binders and pigments, Can. Soc. Forensic Sci. J. 9 (1976) 1–14.
[5] G.A. Bishea, J.L. Buckle, S.G. Ryland, International Forensic Automotive Paint Database, Proceedings: Investigation and Forensic Science Technologies, International Society of Optical Engineering (SPIE), 3576, February 1999, pp. 73–80.
[6] G. Fettis, Automotive Paints and Coatings, VCH Publications, New York, 1995.
[7] J. Llinas, D.L. Hall, An Introduction to Multi-Sensor Data Fusion, Proceedings of the 1998 IEEE International Symposium on Circuits and Systems 1998, ISCAS '98, vol. 536, 1998, pp. 537–540.
[8] C. Zhu, P.R. Griffiths, Extending the range of Beer's law in FT-IR spectrometry. Part I: theoretical studies of Norton–Beer apodization functions, Appl. Spectrosc. 52 (1998) 1403–1408.
[9] B.K. Lavine, A. Fasasi, N. Mirjankar, M. Sandercock, S.D. Brown, Search prefilters for mid-IR spectra of clear coat automotive paint smears using stacked and linear classifiers, J. Chemom. 28 (2014) 385–394.
[10] B.K. Lavine, A. Fasasi, N. Mirjankar, C. White, Search prefilters for library matching of infrared spectra in the PDQ database using the autocorrelation transformation, Microchem. J. 113 (2014) 30–35.
[11] B.K. Lavine, A. Fasasi, N. Mirjankar, M. Sandercock, Development of search prefilters for infrared library searching of clear coat paint smears, Talanta 119 (2014) 331–340.
[12] B.K. Lavine, K. Nuguru, N. Mirjankar, J. Workman, Pattern recognition assisted infrared library searching, Appl. Spectrosc. 66 (2012) 917–925.
[13] B.K. Lavine, C. White, C. Matthew Sundling, C. Breneman, Odor–structure relationship studies of tetralin and indan musks, Chem. Senses 37 (2012) 723–736.
[14] B.K. Lavine, K. Nuguru, N. Mirjankar, J. Workman, Development of carboxylic acid search prefilters for spectral library matching, Microchem. J. 103 (2012) 21–36.
[15] B.K. Lavine, N. Mirjankar, S. Ryland, M. Sandercock, Wavelets and genetic algorithms applied to search prefilters for spectral library matching in forensics, Talanta 87 (2011) 46–52.
[16] B.K. Lavine, K. Nuguru, N. Mirjankar, One stop shopping: feature selection, classification, and prediction in a single step, J. Chemom. 25 (2011) 116–129.