Analytica Chimica Acta 618 (2008) 196–203
Non-targeted detection of chemical contamination in carbonated soft drinks using NMR spectroscopy, variable selection and chemometrics
Adrian J. Charlton ∗, Paul Robb, James A. Donarski, John Godward
Department for Environment, Food & Rural Affairs, Central Science Laboratory, Sand Hutton, York YO41 1LZ, United Kingdom
Article info
Article history: Received 13 December 2007; received in revised form 18 April 2008; accepted 23 April 2008; published online 2 May 2008
Keywords: Nuclear magnetic resonance; Contaminant; Chemometrics; Variable selection
Abstract
An efficient method for detecting malicious and accidental contamination of foods has been developed using a combined 1H nuclear magnetic resonance (NMR) and chemometrics approach. The method has been demonstrated, using a commercially available carbonated soft drink, to be capable of identifying atypical products and contaminant resonances. Soft-independent modelling of class analogy (SIMCA) was used to compare 1H NMR profiles of genuine products (obtained from the manufacturer) against retail products spiked in the laboratory with impurities. The benefits of using feature selection for extracting contaminant NMR frequencies were also assessed. Using example impurities (paraquat, p-cresol and glyphosate), NMR spectra were analysed using multivariate methods, resulting in detection limits of approximately 0.075, 0.2 and 0.06 mM for p-cresol, paraquat and glyphosate, respectively. The detection limit for paraquat is shown to be approximately 100-fold lower than the concentration corresponding to its minimum lethal dose. The methodology presented here is used to assess the composition of complex matrices for the presence of contaminating molecules without a priori knowledge of the nature of potential contaminants. The ability to detect whether a sample does not fit the expected profile, without recourse to multiple targeted analyses, is a valuable tool for incident detection and forensic applications. Crown Copyright © 2008 Published by Elsevier B.V. All rights reserved.
1.
Introduction
Identifying malicious or accidental contamination of food and drink is an issue of interest to regulators, the food industry and consumers alike. A wide range of targeted methods is already used to demonstrate that the food we consume meets regulatory limits for a number of predefined substances, but there is also a need to be able to identify when an unknown or unexpected contaminant is present. High-resolution proton nuclear magnetic resonance (1H NMR) spectroscopy is particularly well suited to the analysis of complex mixtures, providing quantitative signals from every proton-containing compound present in solution. This unbiased measurement allows investigators to examine samples without prior assumptions about the nature of any contamination that may be present. A compound’s NMR spectrum will usually possess a unique combination of properties such as J-couplings, chemical shifts, NOEs and diffusion rates, potentially facilitating automated molecular characterisation. In recent years, rapid developments in instrument design have resulted in significant improvements to both resolution and sensitivity. The correlation between sensitivity and measurement time leads to a compromise between the desired detection limit and the time taken to acquire the NMR spectrum. Recent improvements in NMR sensitivity can therefore be used to obtain more rapid measurements and thus improve sample throughput.
∗ Corresponding author. Tel.: +44 1904 462513; fax: +44 1904 462133. E-mail address: [email protected] (A.J. Charlton). 0003-2670/$ – see front matter. Crown Copyright © 2008 Published by Elsevier B.V. All rights reserved. doi:10.1016/j.aca.2008.04.050
In order to extract key information from the NMR spectra of complex mixtures, a range of chemometric techniques has been used in combination with NMR spectroscopic data to, for example, detect differences in the metabolites produced in response to outside stresses [1] and identify differences between the chemical composition of biological extracts from different genotypes [2,3]. Characteristic differences in the chemical composition of foodstuffs taken from different geographical origins or subjected to different production methods have also been identified using chemometric techniques [4]. A wide range of products has been studied in this way, including beverages such as tea [5], coffee [6] and fruit juices [7]. Detection of small quantities of unknown compounds in complex mixtures has also been demonstrated in the search for biomarkers of disease [8].
Chemometric methods to extract information from large datasets are becoming increasingly widespread [9]. One of the most common chemometric methods is principal components analysis (PCA) [10,11]. This method performs a coordinate transformation on multivariate data so that they are represented as a number of principal component scores on new coordinate axes (principal components). The initial principal components capture the most significant sources of variation, and the variation captured decreases for each subsequent component. PCA is an unsupervised method in which no prior information on experimental groupings is used in the transformation. It therefore avoids the need for the extensive cross-validation required in supervised methods, but can be less efficient in finding differences between experimental groupings when there is a large degree of natural variation.
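As a rough illustration of the coordinate transformation described above, the sketch below extracts the first principal component of a small data table by power iteration. The table values, the function names and the use of power iteration are illustrative assumptions; they are not the data, software or algorithm used in this study.

```python
import math

def centre_and_scale(rows):
    """Mean-centre every column and scale it to unit variance."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    sds = [math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1))
           for col, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def first_component(X, iterations=500):
    """Estimate the first loading vector and its scores by power iteration."""
    n_vars = len(X[0])
    loading = [1.0 / math.sqrt(n_vars)] * n_vars  # arbitrary unit-length start
    for _ in range(iterations):
        scores = [sum(x * p for x, p in zip(row, loading)) for row in X]
        # Multiplying by X^T pulls the direction towards maximum variance.
        direction = [sum(t * row[j] for t, row in zip(scores, X))
                     for j in range(n_vars)]
        norm = math.sqrt(sum(d * d for d in direction))
        loading = [d / norm for d in direction]
    scores = [sum(x * p for x, p in zip(row, loading)) for row in X]
    return loading, scores

def explained_fraction(X, scores):
    """Fraction of the total sum of squares captured by one component."""
    total = sum(v * v for row in X for v in row)
    return sum(t * t for t in scores) / total

# Hypothetical table: 5 samples x 3 variables; variables 1 and 2 rise together.
table = [
    [1.0, 2.1, 0.50],
    [2.0, 3.9, 0.40],
    [3.0, 6.2, 0.60],
    [4.0, 7.8, 0.50],
    [5.0, 10.1, 0.45],
]
X = centre_and_scale(table)
loading, scores = first_component(X)
fraction = explained_fraction(X, scores)
print([round(p, 3) for p in loading], round(fraction, 3))
```

In this toy table the two correlated variables dominate the first loading vector, so a single component captures roughly two thirds of the scaled variance, while the third, unrelated variable is left for later components.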
Soft-independent modelling of class analogy (SIMCA) [12] is a supervised multivariate statistical method. SIMCA is used to compute scores and residuals from or within a component, plane or hyperplane of a principal components analysis. Critical distances from (residuals) and within (scores) the plane of the model are then used to determine thresholds for class membership. SIMCA models allow predictions to be made, using test data, about the membership (or otherwise) of the modelled class. In judging class membership two types of outlier are defined: moderate and strong outliers [13]. Strong outliers have a great leverage on a PCA and are thus detected by the PC scores [14]. Threshold values for strong outliers can be set by the determination of a probability ellipsoid, which is often calculated using Hotelling’s T2 statistic [15], a multivariate generalisation of Student’s t statistic. Moderate outliers are detected by consideration of the model residuals (e.g. DModX in SIMCA-P) [16]. DModX is the distance from an observation to the plane of the model and is also known as the residual error or the residual standard deviation (RSD) [14].
This paper describes the investigation of a combined NMR and chemometric data analysis approach to detect contaminants in a carbonated soft drink matrix. A database of the spectra of uncontaminated samples was constructed to describe the normal range of product variation. PCA of these data was then used to construct a SIMCA model of the uncontaminated matrix, against which samples containing different types and levels of contamination were tested. A variable selection algorithm was used to detect regions of the spectrum containing peaks that are characteristic of contamination; this maximised the efficiency of the approach and lowered the detection limits.
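The two kinds of distance described above can be sketched for a toy model as follows. The one-component model, its loading vector and the three samples are hypothetical, and the residual distance shown is only a simplified analogue of DModX: the exact normalisation applied by SIMCA-P is not reproduced here.

```python
import math

def scores_and_residual(sample, mean, loadings):
    """Project a sample onto the model; return its scores and residual vector."""
    centred = [x - m for x, m in zip(sample, mean)]
    scores = [sum(c * p for c, p in zip(centred, load)) for load in loadings]
    fitted = [sum(t * load[j] for t, load in zip(scores, loadings))
              for j in range(len(centred))]
    residual = [c - f for c, f in zip(centred, fitted)]
    return scores, residual

def hotelling_t2(scores, score_variances):
    """Distance within the model, each score weighted by its variance."""
    return sum(t * t / v for t, v in zip(scores, score_variances))

def residual_distance(residual, n_components):
    """Distance to the model (a simplified analogue of DModX)."""
    degrees_of_freedom = len(residual) - n_components
    return math.sqrt(sum(e * e for e in residual) / degrees_of_freedom)

# Hypothetical one-component model of a three-variable class.
mean = [0.0, 0.0, 0.0]
loadings = [[1 / math.sqrt(2), 1 / math.sqrt(2), 0.0]]  # orthonormal loadings
score_variances = [1.0]

samples = {
    "typical": [1.0, 1.0, 0.0],
    "far along the model": [5.0, 5.0, 0.0],
    "off the model": [0.0, 0.0, 1.0],
}
results = {}
for name, sample in samples.items():
    t, e = scores_and_residual(sample, mean, loadings)
    results[name] = (hotelling_t2(t, score_variances),
                     residual_distance(e, len(loadings)))
    print(name, [round(v, 3) for v in results[name]])
```

The second sample behaves like a strong outlier (large distance within the model, no residual), whereas the third behaves like a moderate outlier (no score distance, but a residual that the model cannot explain).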
2.
Materials and methods
2.1.
Materials
Thirty-eight samples of the same branded carbonated soft drink were obtained from a commercial supplier, and 17 of these samples were used to construct a database of the variability of the matrix (DB). Five of these samples came from the same production batch and 12 were obtained at roughly bi-weekly intervals from the same production line. Twenty-one samples were supplied for blind testing of the database (BD); this subset contained samples produced at the same factory that were collected and stored for a range of unknown time intervals up to the shelf-life of the product. These samples were known to be uncontaminated and representative of the retail soft drink product whilst being sourced directly from the production origin. To minimise any packaging-related issues, all samples were obtained in 500 mL PET bottles.
Three compounds were used as contaminants: glyphosate (120 g L−1), as the commercial product Round-Up (Monsanto, Cambridge, UK); paraquat (200 g L−1), as the commercial product Gramoxone (Syngenta Crop Protection UK Ltd., Cambridge, UK); and p-cresol (Aldrich, Dorset, UK). All reagents were of analytical grade (≥98% purity). Deuterated water (D2O) was obtained from Goss Scientific Instruments Ltd. (Cambridge, UK). Trimethylsilylpropionic-2,2,3,3-d4 acid sodium salt (TSP) was obtained from Avocado (Morecambe, UK). Potassium dihydrogen phosphate (KH2PO4) and dipotassium hydrogen phosphate (K2HPO4) were obtained from BDH (Poole, UK).
2.2.
Preparation of samples for NMR measurement
A concentrated K2HPO4/KH2PO4 stock buffer (1.2 M, pH 7) was used to avoid pH-related shifts in the NMR resonances that would complicate subsequent mathematical analyses. TSP was added to this solution at a concentration of 10 mM for use as an NMR chemical shift reference. The solution was made up in D2O to provide an NMR lock signal. Stock solutions of the three contaminants were prepared at 10 times the concentration required in the final sample, which was between 0.02 and 50 mM. The paraquat formulation contained a green dye that precipitated upon dilution, so all paraquat stock solutions were centrifuged to remove insoluble matter before use. Portions of each soft drink were degassed by sparging with N2 for 2 h. In further development work (not described here) degassing is performed by sonication for 5 min. An aliquot of the degassed soft drink sample (480 µL) was added to the concentrated buffer solution (60 µL), together with 60 µL of either contaminant stock solution (spiked samples) or deionised water (unspiked samples). The resultant solution was placed in a 5 mm NMR tube and homogenised using a vortex mixer prior to NMR analysis.
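The volumes quoted above fix the final concentrations in the NMR tube. The short calculation below is a sketch of that arithmetic; the variable and function names are illustrative, and the only inputs are the volumes and stock concentrations given in this section.

```python
DRINK_UL = 480.0    # degassed soft drink
BUFFER_UL = 60.0    # concentrated phosphate buffer containing TSP
SPIKE_UL = 60.0     # contaminant stock solution or deionised water
TOTAL_UL = DRINK_UL + BUFFER_UL + SPIKE_UL

def diluted(stock_concentration, added_ul, total_ul=TOTAL_UL):
    """Concentration after diluting `added_ul` of stock to `total_ul`."""
    return stock_concentration * added_ul / total_ul

buffer_final_M = diluted(1.2, BUFFER_UL)    # phosphate buffer, M
tsp_final_mM = diluted(10.0, BUFFER_UL)     # TSP reference, mM
dilution_factor = TOTAL_UL / SPIKE_UL       # stock-to-sample dilution
drink_fraction = DRINK_UL / TOTAL_UL        # share of the tube that is soft drink

def stock_needed_mM(target_mM):
    """Stock concentration needed for a target final concentration."""
    return target_mM * dilution_factor

print(TOTAL_UL, buffer_final_M, tsp_final_mM, dilution_factor, drink_fraction)
```

The tenfold dilution of the 60 µL spike into 600 µL is consistent with preparing the stocks at 10 times the required final concentration; the same mixing leaves the buffer at about 0.12 M, the TSP at about 1 mM and the soft drink itself at 80% of its original strength.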
2.3.
1H NMR spectroscopy
A one-dimensional 1H NMR spectrum was acquired for each soft drink sample, both unspiked and spiked with contaminants. Spectra were acquired on a Bruker ARX-500 spectrometer using a 5 mm broadband probe tuned to detect 1H resonances at 500.13 MHz. Data were collected at 300 K, without sample rotation, as 32,768 complex points using a 30° pulse, with pre-saturation to remove the residual water signal. A 3.5 s relaxation delay was found to be sufficient for the acquisition of quantitative data for all resonances of both the soft drink matrix and the three contaminants. Four hundred scans were acquired with a spectral width of 20 ppm, giving a total experiment time per sample of 35 min 45 s. The data were processed using FELIX software (Accelrys, San Diego, CA, USA). A sine-bell-shaped window function phase shifted by 90° was applied over the first 16,000 data points prior to Fourier transformation, phase correction and baseline correction.
2.4.
Multivariate mathematical methods and their application
The 1H NMR spectra were analysed by PCA, and modelling of the authentic soft drink matrix was undertaken using SIMCA and the SIMCA-P software package (Umetrics, Umeå, Sweden). The data were set to unit variance prior to multivariate analysis. PCA summarised the variance in the dataset into 30 principal components. The number of components used for modelling was determined from the predictive ability of the model upon internal cross-validation, an approach that largely avoids overfitting of the model. SIMCA models were constructed using the first eight principal components, which represented 99.8% of the variance in the dataset. PCA scores from the 1H NMR spectra of the 17 DB soft drinks were used to construct a SIMCA model. The BD and the contaminant-spiked sample data were then compared to the model to test for membership of the DB group, using Hotelling’s T2 statistic to test for strong outliers and DModX to test for moderate outliers. This approach was applied both to the full spectral data set and to a reduced set of frequencies identified by the application of a variable selection approach.
2.5.
Variable selection
Multivariate approaches such as PCA and partial least squares (PLS) regression describe the variance within a dataset by weighting each variable according to its absolute variance (PCA) or the variance that is correlated with the categorical variables (PLS). In both cases all of the data in a multivariate dataset are used, and this often limits the sensitivity of the techniques to detect trends in the data. In this study a univariate variable selection procedure was used prior to multivariate analysis to improve the limit of detection for each of the contaminants. The procedure was used to select those regions of the NMR spectrum that were indicative of contamination, and the selected variables were subjected to further multivariate analysis. The mean and standard deviation of the intensity were calculated at each NMR frequency (data point) using the DB spectra. Eq. (1) was then used to describe the similarity of any test spectrum to the spectral database derived from the DB samples.
$$X_{\mathrm{diff}} = \frac{\left| X_i - \bar{X}_i \right|}{\sigma_i} \qquad (1)$$
where $X_i$ is the NMR intensity of the test spectrum at frequency $i$, $\bar{X}_i$ is the mean intensity of the DB spectra at that frequency and $\sigma_i$ is the corresponding standard deviation. This equation results in mean and variance standardised data, and a similar application of this equation is often used to detect outliers (e.g. Grubbs test), where the resultant value is referred to as Z [17]. The magnitude of Xdiff is indicative of the certainty that a particular point does not fall within the range described by the database samples. However, as the data will not follow a Gaussian distribution in the presence of a contaminated sample, the absolute value of Xdiff is not interpreted here and is used only to rank the NMR frequencies for feature selection. Variable selection was implemented using Matlab (The Mathworks, Natick, MA, USA). The results obtained were used to rank the frequencies within each spiked NMR spectrum, highlighting those resonances that exhibited variance outside of that defined by the DB data. These regions were then compared to the NMR spectra of the contaminants used to spike the soft drinks to assess the success of Eq. (1) in identifying the chemical shifts of the contaminating compounds.
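A minimal sketch of the ranking step built on Eq. (1) is given below. It is written in Python rather than the Matlab used in the study, the five-point spectra are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not state which estimator was applied.

```python
import statistics

def x_diff(test_spectrum, db_spectra):
    """Eq. (1) at every data point: |X_i - mean_i| / sd_i over the DB spectra."""
    values = []
    for i, intensity in enumerate(test_spectrum):
        column = [spectrum[i] for spectrum in db_spectra]
        mean = statistics.mean(column)
        sd = statistics.stdev(column)  # assumption: sample standard deviation
        values.append(abs(intensity - mean) / sd)
    return values

def top_ranked(values, n):
    """Indices of the n data points with the largest Xdiff, highest first."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return order[:n]

# Hypothetical DB spectra (4 samples x 5 data points) and one spiked spectrum
# with an extra signal at index 3.
db = [
    [1.0, 5.0, 2.0, 0.10, 3.0],
    [1.1, 5.2, 2.1, 0.12, 3.1],
    [0.9, 4.8, 1.9, 0.08, 2.9],
    [1.0, 5.0, 2.0, 0.10, 3.0],
]
spiked = [1.0, 5.1, 2.0, 0.60, 3.0]

values = x_diff(spiked, db)
selected = top_ranked(values, 2)
print(selected)
```

A data point whose intensity is constant across all DB spectra has zero standard deviation and would cause a division by zero here; real spectra contain noise at every point, but a production implementation would need to guard against that case.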
3.
Results and discussion
3.1.
Appearance of soft drink spectra
The NMR spectra were reproducible with good resolution and line-shape characteristics. Major constituents of the soft drink matrix included sucrose, glucose, fructose and citric acid and these were readily assigned in the NMR spectra by reference to previously acquired spectra (Fig. 1). Other constituents, such as benzoic acid, were also identified from their characteristic multiplicity and the chemical shift values of their resonances. Some minor constituents were not specifically assigned and were attributed to flavourings.
3.2.
Construction of SIMCA model of the soft drink
Following acquisition of all experimental data, principal component scores were calculated from the 17 DB samples and a SIMCA model was constructed. The loadings from this analysis were used to calculate PC scores for the BD data, which were used as a prediction (test) set to validate the SIMCA model. The scores from the first two principal components of the DB samples are plotted in Fig. 2. The outer ellipse indicates the 95% confidence limit defined from the SIMCA model of the PCA data using Hotelling’s T2 statistic calculated from the DB data. PCA scores from the BD samples were superimposed on the SIMCA model from the DB samples. Fig. 2 shows that 95% (20 out of 21) of the BD soft drink samples fall within the 95% confidence limit of the DB SIMCA model.
Fig. 1 – The one-dimensional 1H NMR spectrum of (A) a soft beverage, (B) sucrose, (C) glucose and (D) citric acid. The region of the spectra between 2.3 and 5.8 ppm is shown.
The variation in the composition of the genuine soft drink matrix is reflected in the scatter of the PC scores in the SIMCA model and thus the position of the 95% confidence limit. The date of production was a major source of variance in the authentic drinks, as the five samples from the same production batch were found to cluster together (Fig. 2, inner ellipse), and all samples that were collected in one month occupy a position in the upper or lower quadrant of the right-hand side of Fig. 2. The variation in sample composition, demonstrated by PCA, was also investigated by inspection of the NMR spectra. For example, the inversion process resulting in the hydrolysis of sucrose into glucose and fructose is reflected in the intensity of the sucrose doublet at 5.413 ppm. The intensity of this peak is much higher in the fresh samples than in those that have been stored for several months. The reverse relationship exists between storage time and the intensity of the well-resolved glucose and fructose peaks at 5.241 and 4.119 ppm.
Fig. 2 – Principal component analysis scores plot (PC 1 vs. PC 2) calculated from the DB samples and the BD samples (●). The solid line represents the 95% confidence limit determined from Hotelling’s T2 statistic and the broken line encloses the data points from the freshly produced samples. The variance for each PC score is shown in parentheses on the respective axes.
3.3.
Appearance of contaminated soft drink spectra
Three compounds were selected as contaminants for study: p-cresol, paraquat and glyphosate. Paraquat and glyphosate were chosen because they are toxic and readily available as commercial formulations (such as those used here), whilst p-cresol was chosen because it gives soft drinks an off taste. The resonances from each active ingredient were assigned using analytical grade standards prior to commencement of this study. The glyphosate and paraquat resonances in the commercial formulations were also assigned; however, coformulants were not assigned.
The PC scores derived from all deliberately contaminated soft drink samples revealed that both paraquat and glyphosate spiked samples could be separated from the unadulterated samples using the first two principal components within the plane of the SIMCA model (i.e. as strong outliers). p-Cresol spiked beverages did not fall outside of the 95% confidence limits until the model was clearly overfitted (above PC 26). In all three cases a high concentration (>10 mM) was needed before the data from the contaminated samples were consistently rejected from the SIMCA model as strong outliers.
For the purposes of detecting contamination, the identification of moderate outliers using the DModX parameter from the SIMCA model is particularly suitable. Fig. 3 shows DModX for data acquired from each contaminated sample and the BD samples. The values from an eight principal component SIMCA model are presented. The number of components used was determined from the predictive ability of the model upon cross-validation using the autofit function in SIMCA-P. The mean value of DModX for the BD samples was 1.709, with a maximum value for the 21 samples of 2.104 and a standard deviation of 0.159. The maximum value of DModX for these samples was taken as the threshold above which a test sample could be considered suspect. Across the range of p-cresol concentrations tested (Fig. 3B), DModX consistently exceeds that for all BD samples when the concentration is 0.75 mM (DModX = 2.132) or higher. DModX for paraquat (Fig. 3C) and glyphosate (Fig. 3D) consistently exceeds that for the BD samples at spike concentrations greater than or equal to 0.7 mM (DModX = 3.220) and 1 mM (DModX = 2.992), respectively. Therefore, it is clearly possible to detect these contaminants without knowledge of their identity at concentrations of 1 mM and above. Inspection of the NMR data indicated that it should be possible to detect the contaminants at concentrations below this level, and therefore the variable selection procedure described previously was applied.
Fig. 3 – Histogram of DModX for the NMR spectra of the (A) DB samples, (B) p-cresol, (C) paraquat and (D) glyphosate spiked samples, respectively.
Reference NMR spectra for each of the contaminants considered showed characteristic chemical shifts and multiplicity patterns, with resonances at spectral positions not occupied in the uncontaminated soft drink spectra. This is a function of the chemical structure of the contaminants and, in the case of glyphosate and paraquat, of coformulants that are present in the commercial products. Fig. 4 shows selected regions of the contaminated soft drink NMR spectra with different concentrations of the three compounds under investigation. Both paraquat and p-cresol are visible in the NMR spectrum at concentrations of approximately 0.2 mM. Glyphosate can be seen at a concentration of 0.1 mM, although it is unlikely that these resonances would be easily identified by eye without prior knowledge of their location in the NMR spectrum. Chemical shift assignment was completed on the standards of paraquat, p-cresol and glyphosate. The 1H NMR spectrum of paraquat contained two aromatic resonances (centred at 9.060 (d) and 8.531 (d) ppm) and one methyl resonance (centred at 4.510 (s) ppm). The 1H NMR spectrum of p-cresol contained two aromatic resonances (centred at 7.148 (d) and 6.833 (d) ppm) and one methyl resonance (centred at 2.262 (s) ppm). The 1H NMR spectrum of glyphosate contained two alkyl resonances (centred at 3.018 (d) and 3.731 (s) ppm).
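The threshold rule used in this section (flag a sample when its DModX exceeds the maximum observed for the BD samples, and report the lowest concentration from which detection is consistent) can be sketched as follows. The function name is illustrative and the concentration series is hypothetical apart from the BD maximum (2.104) and the 0.75 mM p-cresol value (2.132) quoted above.

```python
def detection_limit(spiked, threshold):
    """Lowest concentration at and above which every sample exceeds the threshold.

    `spiked` is a list of (concentration_mM, dmodx) pairs; returns None if even
    the highest concentration is not flagged.
    """
    limit = None
    for concentration, dmodx in sorted(spiked, reverse=True):
        if dmodx <= threshold:
            break          # first failure ends the run of consistent detections
        limit = concentration
    return limit

bd_max_dmodx = 2.104   # maximum DModX of the BD samples, from the text
# Hypothetical series, apart from the 0.75 mM value (2.132) quoted in the text.
p_cresol_series = [(0.1, 1.90), (0.25, 2.20), (0.5, 2.00),
                   (0.75, 2.132), (1.0, 2.60), (5.0, 6.00)]
limit = detection_limit(p_cresol_series, bd_max_dmodx)
print(limit)
```

Working downwards from the highest concentration means that an isolated exceedance at a low concentration (0.25 mM in this made-up series) does not count as consistent detection, which mirrors the wording "consistently exceeds" used above.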
3.4.
SIMCA modelling after feature selection
Eq. (1) was used to identify the resonances from glyphosate, paraquat (and their coformulants) and p-cresol when present in the soft drink matrix. The Xdiff parameter (Eq. (1)) was used to place the variables from the NMR spectrum in rank order, and those with the highest 50 values of Xdiff were chosen for further multivariate modelling. The number of points chosen was determined by inspecting the ranked data to determine whether the Xdiff value changed significantly. A steep increase in the Xdiff value was found immediately before the highest 50 values. This is related to the number of data points that are occupied by the contaminant signals. The digital resolution of the acquired 1H NMR spectra was 0.305 Hz and the line width of a typical NMR resonance at half-height was approximately 1.5 Hz. Each line within the NMR spectrum was therefore described by at least four data points, so a minimum of 20 points was required to characterise paraquat (two doublets and a singlet, five lines in total) and likewise p-cresol (two doublets and a singlet). Fifty data points were therefore sufficient to describe the small molecules used in this study and would also include the most significant resonance peaks from larger or more complex molecules. A ranked plot of the 50 highest values of Xdiff calculated from the NMR spectra of a soft beverage contaminated at a concentration of 0.5 mM is shown for the three model contaminants in Fig. 5. A sample containing 0.5 mM of each contaminant was used for feature selection. Application of this method in an emergency scenario would require the uncharacterised test sample to be used for variable selection.
Fig. 4 – Selected regions of the NMR spectra of a soft drink showing resonances relating to (A) paraquat, (B) p-cresol and (C) glyphosate at increasing concentrations close to the limit of detection.
Fig. 5 – A ranked plot of the highest 50 values of Xdiff calculated from the NMR spectrum of a soft beverage contaminated with 0.5 mM p-cresol (- - -), paraquat (—) and glyphosate (· · ·). The value of Xdiff is presented as a percentage of the maximum value.
The feature selection method successfully identified 39, 37 and 12 data points directly corresponding to contaminant resonances in the p-cresol, paraquat and glyphosate adulterated samples, respectively, for which the value of Xdiff was within the highest 50 values for the 16,384 data points of the NMR spectrum. For the glyphosate adulterated sample, a further 28 of the selected data points corresponded to non-glyphosate components of the Round-Up formulation and therefore provided further evidence of a contamination event.
A further reduction in the limits of detection was obtained when the variables that were identified as correlating with contaminant status were used to construct a SIMCA model. SIMCA models were constructed using the principal components calculated from the selected variables for each contaminant using the DB samples. The SIMCA models used seven or eight principal components, capturing between 90 and 94% of the total variance. DModX values for the test data (BD and spiked samples), which were not used to construct the model, are shown in Fig. 6. The mean, standard deviation and maximum DModX obtained from the BD samples for each contaminant model were: p-cresol, 2.197, 0.306 and 2.785; paraquat, 1.483, 0.347 and 2.696; glyphosate, 1.980, 0.380 and 2.858. This relates to robust detection limits of 0.075 mM (DModX = 3.246), 0.2 mM (DModX = 3.389) and 0.06 mM (DModX = 3.417) for p-cresol, paraquat and glyphosate, respectively. DModX values were found to increase proportionately with the concentration of the contaminants, which would allow the concentration of each contaminant in an uncharacterised sample to be estimated using multivariate calibration methods.
Fig. 6 – Histogram of DModX after variable selection for (A) p-cresol, (B) paraquat and (C) glyphosate. The left hand panel shows data from the BD samples and the right hand panel shows data from the spiked samples.
This automated procedure has therefore identified the presence of irregularities within the spectral profiles of three sets of uncharacterised, but contaminated, samples at concentrations that are consistent with the NMR data shown in Fig. 4. The DModX parameter was a very sensitive measure of whether the soft drink was contaminated. Using the top 50 results from Eq. (1) to create a variable selected SIMCA model, and measuring the DModX values for the test spectra, reduced the limits of detection significantly compared with omitting variable selection. The variable selection stage also allows the chemical shifts of the contaminant resonances to be accurately determined, which facilitates automated detection and characterisation of contaminants in foods and beverages from NMR spectra.
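The point-count argument used in this section to justify selecting 50 variables can be checked with a few lines of arithmetic. The sketch assumes that the quoted digital resolution corresponds to the 20 ppm spectral width at 500.13 MHz divided over 32,768 points; this interpretation is not stated explicitly in the text, but it reproduces the 0.305 Hz figure.

```python
import math

SPECTROMETER_MHZ = 500.13   # 1H frequency from the acquisition parameters
SPECTRAL_WIDTH_PPM = 20.0
ACQUIRED_POINTS = 32768     # assumption: resolution = spectral width / 32,768
LINE_WIDTH_HZ = 1.5         # typical line width at half-height
POINTS_SELECTED = 50

spectral_width_hz = SPECTRAL_WIDTH_PPM * SPECTROMETER_MHZ   # 1 ppm = 500.13 Hz
digital_resolution_hz = spectral_width_hz / ACQUIRED_POINTS
points_per_line = math.floor(LINE_WIDTH_HZ / digital_resolution_hz)

def minimum_points(doublets, singlets):
    """Data points needed if every line of every multiplet is sampled."""
    lines = 2 * doublets + singlets
    return lines * points_per_line

paraquat_points = minimum_points(doublets=2, singlets=1)
p_cresol_points = minimum_points(doublets=2, singlets=1)
print(round(digital_resolution_hz, 3), points_per_line, paraquat_points)
```

With five lines per compound and at least four points per line, 20 points are needed for paraquat or p-cresol, which leaves headroom within the 50 selected variables for coformulant signals or for larger molecules.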
4.
Conclusions
This paper has demonstrated that the combination of 1H NMR spectroscopy and chemometrics provides an effective method for the detection of contamination in foods. This approach does not require a priori knowledge of the nature of the contaminant, making it well suited to rapid detection of unknown contamination. The method has been successfully demonstrated on three chemically distinct potential contaminants, obtaining limits of detection for p-cresol, paraquat and glyphosate that are consistent with those that would be expected if the NMR data were to be painstakingly explored by an experienced NMR operator. The development of the methodology presented here was primarily focused on the need for unbiased methods for the detection of chemical contamination incidents. The achievable limits of detection were such that contamination events were detected at levels below that at which a serious threat to public health may ensue. For example, the minimum lethal dose (LDLo) by oral ingestion in humans of the most toxic compound studied here, paraquat, is 35 mg kg−1 [18]. Using a typical beverage volume of 500 mL, this relates to a required paraquat concentration of 19.05 mM in a soft drink to produce a lethal effect. Clearly the detection limit of 0.2 mM for paraquat is relevant in this context, being almost 100-fold lower than the concentration required to cause mortality.
The methods described here can potentially detect any contaminant containing non-exchangeable protons, thus covering a wide range of organic contaminants. This can be contrasted with the widespread use of targeted approaches for contaminant detection, such as liquid chromatography and gas chromatography–mass spectrometry, where the methodologies and the sample preparation procedures used are inherently selective for subsets of compounds. Maximum efficiency of the method described is obtained when NMR signals are resolved from background signals. There is a probability that some of the signals from potential contaminants will fall in regions of the NMR spectra that are populated by native signals from the food matrix. However, the nature of toxic contaminants is such that they are unlikely to occur as soft beverage ingredients, and their NMR resonances are unlikely to overlap completely with existing signals. In particular, those compounds that show extreme toxicity often possess chemical structures that are significantly different to those of compounds that are found in foods (e.g. organophosphates). This provides the basis for the detection of contaminants within the soft beverage using the chemical shift of functional groups that are resolved from the residual signals. Contaminant signals and matrix signals are additive, so overlapping signals can also potentially be detected using these methods. The methodology described herein has been implemented within our laboratory for the unbiased detection of contaminants in soft beverages.
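The paraquat comparison made earlier in this section can be reproduced with a short calculation. The body mass (70 kg) and the molar mass (paraquat dichloride, about 257.16 g/mol) are assumptions introduced here, not values given in the text; they are labelled as such because together they reproduce the quoted 19.05 mM.

```python
LDLO_MG_PER_KG = 35.0        # minimum lethal oral dose, from the text [18]
BEVERAGE_VOLUME_L = 0.5      # typical beverage volume, from the text
DETECTION_LIMIT_MM = 0.2     # paraquat detection limit, from the text

BODY_MASS_KG = 70.0          # assumption: adult body mass, not stated in the text
MOLAR_MASS_G_PER_MOL = 257.16  # assumption: paraquat dichloride

lethal_dose_mg = LDLO_MG_PER_KG * BODY_MASS_KG
lethal_dose_mmol = lethal_dose_mg / MOLAR_MASS_G_PER_MOL   # mg / (g/mol) = mmol
lethal_concentration_mM = lethal_dose_mmol / BEVERAGE_VOLUME_L
margin = lethal_concentration_mM / DETECTION_LIMIT_MM

print(round(lethal_concentration_mM, 2), round(margin))
```

Under these assumptions the lethal concentration is about 95 times the detection limit, in line with the "almost 100-fold" margin stated above; a lower body mass would narrow the margin proportionally.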
The multivariate models described in this manuscript have been challenged with additional NMR spectra (data not shown) from soft beverages containing contaminants not included in this study. For example, phenol, methanol and sodium fluoroacetate have been clearly detected and characterised using this approach, at concentrations not exceeding 1 mM and without further optimisation of the system. The limits of detection for NMR measurements are largely determined by the signal-to-noise level of the spectrum, which can be improved by longer acquisition times or improved instrumentation such as the cryoprobe. Cryoprobe NMR spectroscopy has previously been shown to result in detection limits for contaminants in water of approximately 1 µM [19]. Simple sample preconcentration techniques such as rotary evaporation or solid phase extraction (SPE) could also be employed, but with the associated risk of analyte losses that are usual for targeted analyses. Significant improvements in throughput have recently been achieved using state-of-the-art NMR instrumentation, and equivalent sensitivity to that shown here has been achieved with acquisition times of less than 5 min (data not shown). The statistical approach that has been taken here is also applicable to the detection of contaminants using other techniques such as liquid and gas chromatography mass spectrometry; however, the reproducibility and data comparability of these techniques may be limiting at present. A key element of the method is the requirement for an accurate database to describe the genuine soft drink matrix. A larger database that covered a full range of production sites and sample ages would ensure that false positive results did not occur due to fluctuations in product formulation. This would ideally be supported by a substantial database of spectra from potential contaminants and product formulants to ensure correct interpretation of the mathematical models. This approach would also permit the rapid identification of unknown pollutants and thus facilitate quantitative determination of the concentration of the contaminant. The current need for methods to detect when a sample is atypical of its kind is clear for the detection of both deliberate and accidental contamination events.
References
[1] N.J.C. Bailey, M. Oven, E. Holmes, M.H. Zenk, J.K. Nicholson, Spectroscopy 18 (2004) 279.
[2] A. Charlton, T. Allnutt, S. Holmes, J. Chisholm, S. Bean, N. Ellis, P. Mullineaux, S. Oehlschlager, Plant Biotech. J. 2 (2004) 27.
[3] H.K. Choi, Y.H. Choi, M. Verberne, A.W.M. Lefeber, C. Erkelens, R. Verpoorte, Phytochemistry 65 (2004) 857.
[4] N. Ogrinc, I.J. Kosir, J.E. Spangenberg, J. Kidric, Anal. Bioanal. Chem. 376 (2003) 424.
[5] G. Le Gall, I.J. Colquhoun, M. Defernez, J. Agric. Food Chem. 52 (2004) 692.
[6] A.J. Charlton, W.H. Farrington, P. Brereton, J. Agric. Food Chem. 50 (2002) 3098.
[7] G. Le Gall, M. Puaud, I.J. Colquhoun, J. Agric. Food Chem. 49 (2001) 580.
[8] J.C. Lindon, E. Holmes, J.K. Nicholson, Expert Rev. Mol. Diagn. 4 (2004) 189.
[9] E. Holmes, H. Antti, Analyst 127 (2002) 1549.
[10] H. Wold, in: P.R. Krishnaiah (Ed.), Multivariate Analysis, Academic Press, New York, 1966, p. 391.
[11] H. Wold, in: H.M. Blalock Jr., F.M. Borodkin, R. Boudon, V. Capecchi (Eds.), Quantitative Sociology: International Perspectives on Mathematical and Statistical Model Building, Academic Press, New York, 1975, p. 307.
[12] S. Wold, Lect. Notes Comput. Sci. 8 (1976) 127.
[13] C. Wikstrom, C. Albano, L. Eriksson, H. Friden, E. Johansson, A. Nordahl, S. Rannar, M. Sandberg, N. Kettaneh-Wold, S. Wold, Chemometr. Intell. Lab. 42 (1998) 233.
[14] L. Eriksson, E. Johansson, N. Kettaneh-Wold, S. Wold, Multi- and Megavariate Data Analysis: Principles and Applications, Umetrics AB, Umeå, Sweden, 2001.
[15] H. Hotelling, Ann. Math. Stat. 2 (1931) 360.
[16] L. Eriksson, H. Antti, J. Gottfries, E. Holmes, E. Johansson, F. Lindgren, I. Long, T. Lundstedt, J. Trygg, S. Wold, Anal. Bioanal. Chem. 380 (2004) 419.
[17] F. Grubbs, Technometrics 11 (1969) 1.
[18] S.L. Wagner, Clinical Toxicology of Agricultural Chemicals, Noyes Data Corporation, 1983.
[19] A.J. Charlton, J.A. Donarski, S.A. Jones, B.D. May, K.C. Thompson, J. Environ. Monitor. 8 (2006) 1106.