Chemometrics and intelligent laboratory systems ELSEVIER
Chemometrics and Intelligent Laboratory Systems 35 (1996) 199-211
Correction for drift in multivariate systems using the Kalman filter Sarah C. Rutan a,1 Eric Bouveresse a Kevin N. Andrew b, Paul J. Worsfold b D.L. Massart a,* a ChemoAC, Pharmaceutical Institute, Vrije Universiteit Brussel, Laarbeeklaan 103, B-1090 Brussel, Belgium b Department of Environmental Sciences, University of Plymouth, Drake Circus, Plymouth PL4 8AA, UK Received 22 March 1996; revised 11 July 1996; accepted 15 July 1996
Abstract Many analytical instruments can now be used effectively to monitor industrial and environmental processes on-line. In the large majority of cases, however, the instrument must be first calibrated, and then recalibrated if the instrument characteristics change with time. Here an approach for detection and correction of time-dependent drift in multivariate calibration parameters is described. Kalman filters can be devised that correct for drift in the baseline and drift in sensitivities, either through purely random processes, or a combination of linear and random drift. Some examples of implementations of this filter are demonstrated using synthetic, near-infrared, and UV-visible spectrophotometric data. The application of the Kalman filter to the inverse calibration problem is also demonstrated. Keywords: Kalman filter; Multivariate systems; Environmental analysis
1. Introduction Calibration, with subsequent prediction of the concentrations in unknown samples, using instruments placed on-line has become increasingly popular over the past few years. Near-infrared (NIR) spectrometers are one type of instrument where complex mixtures are often used to calibrate the instrumental response, for subsequent determination of un-
* Corresponding author. 1 Permanent address: Box 842006, Department of Chemistry, Virginia Commonwealth University, Richmond, VA 23284-2006, USA.
known samples [1,2]. Other instruments that are used for calibration and for the determinations of complex samples include UV-visible diode array spectrometers, often used as detectors in flow injection analysis [3-6], and for liquid chromatographs [7]. A potential problem is that the instrumental response may not be stable in time, and thus may require periodic recalibration. Instrumental characteristics that may be affected by time-dependent drift include the baseline, the sensitivity, the signal-to-noise ratio, and the wavelength alignment [8]. Many recent articles have addressed the problem of calibration updating, but frequently this is to bring two different states of the instruments into agreement with one another [9-12]. Often a change is made to a new instrument, and a
0169-7439/96/$15.00 Copyright © 1996 Elsevier Science B.V. All rights reserved. Pll S 0 1 6 9 - 7 4 3 9 ( 9 6 ) 0 0 0 5 2 - 4
200
S.C. Rutan et al. / Chemometrics and Intelligent Laboratory Systems 35 (1996) 199-211
calibration transfer algorithm is required [9-12]. Another type of error that can affect the reliability of the calibration parameters, and the subsequent concentration determinations, is a low frequency random drift, often called flicker noise or 1/f noise [13]. In this case, frequent recalibration may be required to ensure that reliable concentration predictions for the unknown samples are obtained. This is the problem, for baseline and sensitivity drift, that is addressed in the present work. The Kalman filter is a recursive, digital filter that is useful for measuring parameters in drifting measurement systems. Thijssen and co-workers, in a series of four papers [14-17], described an application of the Kalman filter for monitoring drift in the baseline and the sensitivity for a flow injection analysis system. The precision of the calibration parameters was monitored on-line, so that a decision as to when recalibration was necessary could be made in realtime. An application of this approach to atomic absorption measurements has been reported [18]. A limitation of this method was that this system was developed for univariate data, i.e., measurements made at a single wavelength, whereas in near-infrared spectroscopic determinations, diode arraybased measurements, and in liquid chromatography, calibration and determination are often based on multivariate methods. Brown has described an approach for correcting for drift in the prediction stage
for multivariate calibration problems, and treated both proportional and linear drift in the predicted unknown concentration, but did not estimate the drift during the calibration phase [19]. The goal of the work described here is the extension of the approach described by Thijssen et al. [14-17] and Wienke et al. [18] to a variety of different multivariate systems.
2.
Theory
The Kalman filter is based on two models, rather than a single model, and it is this characteristic that provides the method with its flexibility [20,21]. The system model is expressed as
x ( k l k - 1) = F ( k , k - 1 ) ' x ( k -
(1) and the measurement model is expressed as
z(k) = h T ( k ) "x(klk- 1) + v(k)
[si] SBi
F(k, k -
1) = I ((n + 1) × (n + 1))
i
zi(k) = A(A i, k) ri(k) = s2(A(hl, k))
P = I . 104 ((n + 1) × (n + 1)) Q ( k ) = 0 ( ( n + 1) × (n + 1))
Q(k) =
qa
0
0
...
or
0
q8
0
...
0
0
0
qc
...
0
0
0
0
"'"
qb
h~k)=[ca(k)
cB(k)
cc(X) ...
(2)
The vector x contains the parameters of interest, and is called the state or parameter vector. For the classical (direct) calibration problem, this vector is comprised of estimates for the sensitivities for each component in the mixture, as well as an estimate for the baseline contribution. The scalar z can then be defined as a measurement of the spectral intensity, in which case hT(k) contains the concentrations of each
Table 1 Classical model information - kth standard
x(k)= Sci
l l k - 1) +w(k)
S.C. Rutan et al. / Chemometrics and Intelligent Laboratory Systems 35 (1996) 199-211
component in the kth calibration standard, and v(k) is a noise contribution. The index k is incremented for each calibration standard. The system model describes how the sensitivities and baseline evolve in time, where F(k, k - 1) is called the system transition matrix, and w ( k ) is a noise contribution. Here, x ( k l k - 1) is interpreted as 'the estimate for the sensitivities at the time that the kth standard is measured, based on information obtained from all measurements up to and including the most recent ((k 1)th) standard.' In the ideal case, the sensitivities are time-invariant, and not affected by additional noise processes beyond those represented by the measurement noise, v ( k ) , so the system model can be reduced to the following form: x(klk-
1) = I . x ( k - l l k - 1 )
Initialize x, P
(3)
where I represents the identity matrix. The model represented by Eq. (2) only accounts for the sensitivities at one wavelength, where clearly the sensitivities at all wavelengths must be determined in order to make use of the multivariate data. This can be accomplished by setting up m different parameter vectors, each with the same form as is shown in Table 1. In this way, by cycling through the m wavelengths and p standards, m final estimates for x are obtained, containing the sensitivities for every component, and a baseline contribution, if necessary, for each of the m wavelengths. A flow chart for this approach is shown in Fig. 1. Note that this is n o t the same as setting up m independent univariate models. The Kalman filter algorithm equations are shown in Table 2, and are implemented recursively, cycling
I Set x, P, Equations (4)- (9)
I Save x, P for ith wavelength, increment i
1 I Incrementk Fig. 1. Flow chart for multivariate, on-line Kalman filter.
Table 2 Kaiman filter algorithm State estimate extrapolation 1)=F(k, k - 1 ) . x ( k - I l k - 1) Covariance estimate extrapolation P ( k l k - 1) = F(k, k - 1 ) . x ( k - I l k - 1). FT(k, k - 1) + Q(k) Weighting factor, Kalman gain g ( k ) = P(klk - 1). h ( k ) . ( 1 / ( h T ( k ) • P(klk - 1). h ( k ) + r(k)) Innovations sequence x(klk-
v(k) = z(k) - (hT(k) . x ( k l k -
1))
(4) (5) (6) (7)
Parameter update x(klk) = x ( k l k - 1 ) + g ( k ) . u(k)
(8)
Covariance matrix P(klk) = (I - g ( k ) . hT(k)) • P(klk - 1). (I - g ( k ) . hT(k))v + g ( k ) . r(k) •gT(k)
(9)
201
202
S. C. Rutan et al. / Chemometrics and Intelligent Laboratory Systems 35 (1996) 199-211
through the data for each standard sequentially. The following information is required to implement the filter: (1) The values comprising the h vector, here the concentrations of each component present in each of the calibration standards; (2) an estimate for r(k), the variance of the noise contribution; (3) an estimate for Q(k), the covariance of the noise contribution w(k) (assumed to be zero for the ideal case mentioned above); (4) an initial estimate for x(0[0) (in this case the sensitivities) to start the first cycle of the filter (often taken as zeros), and (5) an initial estimate for the covariance, P(010), describing the covariance of these initial estimates (often taken as I . 10 4, or a similarly large number). In order for the filter to be able to process the data in a way amenable to on-line analysis, all m wavelengths for the first standard must be processed before incrementing to the next standard, as shown in Fig. 1. An example of how the matrices should be structured for any given wavelength, Ag, is shown in Table 1. The index k indicates the standard number, and the index i refers to the wavelength number. The letters A, B, and C represent the different chemical constituents of the mixture that are to be determined, and n is the number of chemical constituents in the samples (all subsequent models shown here are for n --- 3). The h vector stays the same, as long as the data for the first standard sample are being evalu-
ated, and the parameter vectors, x;, and covariance matrices, Pi, are initialized to zero vectors and identity matrices scaled by 104 , respectively. After cycling through the algorithm equations shown in Table 2, once for each wavelength, the values for xg(lll) and Pg(l[1) are stored, and recalled when the sample is incremented to standard number two (k = 2), and data for the ith wavelength are to be processed. Once the p standards have been evaluated, the resulting m x vectors contain the least-squares optimal results for the sensitivities and the baseline contribution for each wavelength, and the corresponding m P matrices describe the covariance of these estimates. In this form, the Kalman filter provides estimates that are identical to those obtained using classical multivariate least-squares calibration, however, the data are processed standard by standard, so that estimates for the sensitivities may be obtained in real-time. There are several modifications to the model that can be made that allow the filter to compensate for the presence of drift in the analytical system. Some examples of the possible variations are described here. First, a non-zero Q matrix can be used to account for random processes affecting the baseline and or/sensitivities, beyond those reflected in the noise structure of a single spectrum (which provides the estimate for ri(k) = s2(A(Ag))). This modification is represented in Table 1 by the option of having a
Table 3 Modified model information
1[ooooo SBi
x(k)= Sci
F(k,k-1)=
bi
0l O1 00 0 0 0 0 0 0
P(OIO)= I . 104 ((n + 3) × (n + 3))
zi(k) = A(Ai)
0 0 0 0 Q(k)= 0 0 0 0 0 0 0
0 0 0 0 0
ri(k) = $2( A()ti))
hr(k)=[ca(k)
cB(k)
0 0 0 0
0 0 0 q,~ 0 qt3J cc(k)
1 0
0]
0 0 1 0 1 0 0 0 0
11 i~ ] 0 1 0 1 0
203
S.C. Rutan et aL / Chemometrics and Intelligent Laboratory Systems 35 (1996) 199-211
Table 4 Model variations (A) Linear, random drift in baseline only
sAi1
sBi x(k) = Sci
[ ooo ] 0
F(k, k - 1)=
1
0
0
1 0
0
o
o,
0
0
o
q~ ~ o
(B) Linear, random drift with independent drift affectingeach component's sensitivity and baseline drift SA i
x(k)= biA
10000 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0
- 1 0 0 0 0 1 0 0 0 0 1 0
SBi Sci
F(k,k_l)=
a~ I ac
0 0 0 0
0 0 0 0
0 0 0 0
0
0
0
0 0 1 0 0 0 1
qaA' qaB, qac' q# 4=0
ISAil
(C) Linear, random drift with independent drift affectingeach component's sensitivity, no baseline or baseline drift
x(k)=
SBi Sci
aa aB
F(k,k-1)=
ac
li°° 1°il1 0
1
0
0
0 0 0 0
1 0 0 0 1 0 0 0 1 0 0 0
0
1
0
qaA, qaB, qac -4=0
non-zero Q matrix. A second change that can be implemented is to allow for linear drift with random noise superimposed, as proposed by Thijssen [14]. The addition of the random noise component to the model means that the drift is considered to be linear for short time periods (a reasonable approximation for 1 / f , or flicker noise), but allows the magnitude and direction of the drift to change gradually over time. The model information for this situation is shown in Table 3 (for a three-component system). The new parameter a represents the linear drift contribution to the sensitivities (i.e., multiplicative drift), which is assumed to be the same for each component in this case, and the parameter fl represents the linear drift component in the baseline (i.e., additive drift). The drift processes are presumed to affect the parameters,
rather than the measurements themselves; this reduces the dimensionality of the problem. To more clearly illustrate this point, the system model is represented by the product of F(k, k - 1). x ( k - I l k 1) as follows:
-SAi(k- 1) + x(kLk-
1) =
a(k-
1)
s B i ( k - - 1) + a ( k -
1)
Sci(k-
1) + a ( k -
1)
b,(k-
1) + f l ( k -
1)
a(k-
(lO)
1)
/ 3 ( k - 1) The noise contribution, w, is assumed to con-
204
S.C. Rutan et al. / Chemometrics and Intelligent Laboratory Systems 35 (1996) 199-211
tribute only to the drift parameters a and fl so that the full system model is
=SAi(k- l)
-~- o ~ ( k - 1)
sBi(k- 1) + a(k-- 1)
sci(k- 1) + a(k- 1) x(klk-1)= bi(k_l)+fl(k_l)
(11)
t (k- 1) This is the same approach that was used by Thijssen [14-17] for the univariate case. The term a models multiplicative drift, as it is scaled by the concentration of the components when implemented in Eq. (2). This results in greater corrections at higher intensities, which is a characteristic of the multiplicative variations observed in NIR data. Since, however, the contribution is scaled by the concentration rather than the sensitivities directly, this may not result in the best model, if the background component has wavelengths with large intensities. If this is the case, the drift contribution should be more appropriately scaled by the actual intensity. In addition, this correction, with the assumption of constant a , will be most effective if the molar absorptivities (sensitivities) for the different components are similar. Correction via the standard normal variate (SNV) detrending approach [22], prior to Kalman filter analysis, may compensate for this difficulty to some extent, although this results in spectra which do not follow the ideal Beer's Law relationship, and these data are
usually analyzed using principal components regression (PCR) or partial-least-squares (PLS) methods using more than one latent variable. However, these approaches should be able to be coupled to the Kalman filter drift correction, where the parameters are not the sensitivities, but the component loadings. The model shown above can be implemented with both drift parameters incorporated, as shown in Table 3, or with only the baseline drift. If no sensitivity drift is present, this is a more appropriate way of treating the data, since it reduces the number of degrees of freedom in the model. This model is shown in Table 4A. Alternatively, a separate drift parameter (t~A, c~B, ... ) can be assigned to the sensitivity for each component, which may be more appropriate for components with significantly different sensitivities. Several variations of the system model for various types of drift are also shown in Table 4. Note that a random drift is likely to affect the baseline (additive drift) and the sensitivities (multiplicative drift) in NIR measurements, due to particle size differences and packing variability which cause variations in the scattering efficiency. In some cases, it is desirable to process the data according to an inverse model. Especially for the case of NIR spectra, where the concentrations of all components present in the mixtures are not available, an inverse approach must be used. Instead of solving A = C- • (often represented as Y = X. b) for • and then having to find the pseudo inverse of • to determine Cunk for a n Aunk, the equation C = A. 4) is solved for ¢, and Cunk c a n be solved for directly by
Table 5 Inverse model information - kth standard
F(k,k-1)=I(m+l×m+l)
zi(k) = CA(k) ri(k) = S 2 (CA (k))
P = I . 104 ( m + 1 × m + 1) Q(k)=0(m+l×m+l)
hW(k)=[al(k)
az(k)
A3(k)
A~(k)
iI
S.C. Rutan et al. / Chemometrics and Intelligent Laboratory Systems 35 (1996) 199-211
205
0.45
multiplying ~b by Aun k. The Kalman filter setup for the inverse calibration approach is shown in Table 5 for component A. Similar equations can be set up if additional components are to be determined.
0.4 0.35 0,3 0.25 0.2
co 0.15
3. Experimental
0.1 0.05
All simulations and data fitting were done using programs written in the MATLAB environment (Mathworks, Inc., S. Natick, MA), and run on 486 type personal computers.
Eighteen mixtures with the relative concentrations shown in Table 6 were simulated based on two 100 point Gaussian peaks positioned at point 40 and point 60, with a peak width corresponding to a standard deviation of 20. One set of synthetic data was ideal (no drift), and was affected only by normally distributed noise with a variance of 2.5 × 10 -s. These data are shown in Fig. 2. The second data set was modified by adding a random drift (random walk)
Table 6 Concentration design for simulated data
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
,'o
8'0
=0
100
Wavelength Number
Fig. 2. Two component system with no random drift; r = 2.5 X 10-5; 18 samples.
3.1. Computer simulations
Sample #
0 -0.0.=
Component 1 concentration
Component 2 concentration
0.10 0.20 0.30 0.40 0.20 0.20 0.20 0.40 0.40 0.10 0.20 0.30 0.40 0.20 0.20 0.20 0.40 O.40
0.40 0.30 0.20 0.10 0.20 0.40 0.30 0.20 0.30 0.40 0.30 0.20 0.10 0.20 0.40 0.30 0.20 0.3O
component to the data using the following series of MATLAB commands. d(1) = 0 ; for i = 2:18
d(i) = d ( i -
1) + 0.01 * randn;
end
(12)
The actual drifting baseline values generated by these MATLAB commands are shown in Table 7, and the complete, drifting data set is shown in Fig. 3.
Table 7 Random drift vector ( d ) Standard No.
Drift contribution
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
0 0.0116 0.0179 0.0187 0.0222 0.0152 0.0322 0.0328 0.0507 0.0534 0.0621 0.0476 0.0406 0.0531 0.0467 0.0525 0.0489 0.0475
S.C. Rutan et al. / Chemometrics and Intelligent Laboratory Systems 35 (1996) 199-211
206
3.2. Near-infrared measurements Near-infrared data were obtained on a Bran + Luebbe (Elmsford, NY) Infraalyzer 500 instrument. Three different standards, polystyrene, a high absorbance halon standard and a low absorbance halon standard, supplied with the manufacturer's reference cup kit accessory [23] were used in this study. Two different data sets were evaluated here. The first data set (NIR1) consisted of 16 spectra of the strongly absorbing reference. In this study, the sample cup position was deliberately varied systematically. The data were therefore affected by fluctuations relating to the positioning of the sample cup. This effect gives rise to errors greater than those inherent in the noise in a single spectrum. The second data set (NIR2) consisted of measurements of all three reference cup standards, that were measured periodically, in replicates, over a 5 week period, to characterize the stability of the instrument. In total, 147 spectra of the three standards were obtained at 701 wavelengths.
3.3. UV-visible measurements UV-visible data were obtained from a Hewlett Packard 8451A diode array spectrophotometer, as described previously [6,24].
Table 8 Results for Kalman analysis of 18 sample system - No drift correction, model shown in Table 1, q = 0 Calibration Calibration Prediction (18 samples) (9 samples) (9 samples) SEC(1) SEC(2) SEC(1) SEC(2) SEP(1) SEP(2) Data w/odrift 0.65% 0.56% 0.71% 0.58% 0.75% 0.63% Data w/drift 3.24% 3.19% 2.09% 1.53% 5.40% 4.77%
filter approach with no drift compensation are summarized in Table 8. The data were treated in two ways. First, all 18 samples were treated as known standards and used to determine the SEC (standard error of calibration), expressed as a percentage of the average concentration of each component in all the standards. In the second approach, the first nine samples were treated as a calibration set, and the last nine were treated as a prediction set, so both the SEC and the standard error of prediction (SEP) are reported in Table 8. These results demonstrate that the drift affected data result in much higher errors for both the SEP and SEC for both fitting approaches, while the drift-free data give precise and accurate calibrations and predictions, as expected.
4.1. Compensation for drift 4. Results and discussion The results for fitting both the drift-free and drift affected synthetic data sets using the classical Kalman 0.5 0.4 0.3 0.2 QAI 0
"°'
2'0
,o
;0
WavelengthNumber
8;
,00
Fig. 3. Two component system with random drift added; r = 2.5
× 10-5; 18 samples.
The next series of experiments were designed to determine if a nonzero q element (qb = E ( W b " Wb), where E( ) represents the expectation value) assigned to the baseline component could compensate for the drift added to the synthetic data. A plot of the values for SEC and SEP for each of the other two components is shown in Fig. 4. A relatively wide range of qb values (10 -4 tO 10) gives acceptable values for the percent error for both calibration and prediction. The value 1 × 10 -4 describes the variance of the random component originally added to the data to simulate the drift in the baseline (see Eq. (12)). A comparison to the results shown in Table 8 shows that, in particular, the values for the SEP for the two components can be dramatically reduced by taking into account the random drift. As qb increases further, eventually the SEC values become quite small, and the SEP values quite large, which is typically observed in cases of overfitting. The smallest SEPs
S.C. Rutan et al. / Chemometric s and Intelligent Laboratory Systems 35 (1996) 199-211 I0
(a) (b)
',,
/ .
(c)
207
be considered reasonable in that some of the variance in the drift describes a linear trend (see Table 7), and the remainder is the random contribution. For both drift correction models described above, when the appropriate q value is chosen, the SEP percentages are much improved over those obtained when no drift correction is employed (see Table 8). Even though this is a relatively simple baseline drift, these results demonstrate the feasibility of the approach for more complex drift effects.
(d) 4.2. Inverse calibration
1o-
,b
/o"
#
1o,
Syltem Nolle Va~nce (q)
Fig. 4. SEC and SEP for components 1 and 2 in the mixture spectra depicted in Fig. 2; correction for random drift in baseline (model shown in Table 1, qb ~ 0). (a) SEP component 1; (b) SEP component 2; (c) SEC component 1; (d) SEC component 2.
are found when qb = 10, which is larger than the value corresponding to the variance of the random variable added to the drift component ( q b = 1 × 1 0 - 4 ) . It is not clear why this occurs. An alternate model can be used to fit this data, where the offset is presumed to be affected by linear drift, as well as by a random component. This situation corresponds to that presented in Table 4A. The results for the SEC and SEP as a function of the value used for qt3 is shown in Fig. 5. Here the best SEP values are obtained for q~ = 1 X 10 -5, which could
~ ,( d )
(b)
(c)
Inverse calibration with the Kalman filter has not been proposed previously, but has some important characteristics. First, by using a smaller initial value for P (which implies that some a priori information is available), problems due to ill-conditioning when too many wavelengths are used can be avoided. For example, approximate values of the sensitivities (at least the order of magnitude) are often available. The matrix P will then represent the uncertainty in these initial estimates. Normally for inverse regression, only a limited number of wavelengths can be used, since the rank of the concentration matrix is only p or m (whichever is smaller). By providing smaller initial values for P, implying the availability of prior information, more wavelengths can be used, and this difficulty can be avoided. However, since the F and P matrices have dimensions of m X m in this case (m is the number of wavelengths), computer memory limitations may come into play for larger numbers of wavelengths. Still, a much larger number of wavelengths can normally be used than for regular inverse regression, potentially resulting in greater precision of the calibration parameters. The results given in Table 9 were obtained using an initial value of P = I . 10 -3 ; Table 9 Results for inverse Kalman analysis of 18 sample system, model shown in Table 5 a Inverse calibration
10o
calibration (18 samples)
102
System Noise Variance (q)
Fig. 5. SEC and SEP for components 1 and 2 in the mixture spectra depicted in Fig. 2; correction for linear drift and random drift in baseline (model shown in Table 4A, qt3 ~ 0). (a) SEP component 2; (b) SEP component 1; (c) SEC component 2; (d) SEC component 1.
calibration (9 samples)
prediction (9 samples)
SEC(1) SEC(2) SEC(1) SEC(2) SEP(1) SEP(2) Data w / o d r i f t 0.75% 0.73% 1.07% Data w/drift 1.21% 1.21% 2.23% a p = 10-3.
1.08% 0.82% 1.56% 1.92% 6.81% 6.02%
208
S.C. Rutan et al. / Chemometrics and Intelligent Laboratory Systems 35 (1996) 199-211
Table 10 Results for Kalman analysisNIR standards (A) Model shown in Table 4C, data set NIR2 calibration
prediction
no correction q = 0 Polystyrene 0.17% High std. 0.15% Low std. 0.21%
q=
10 -14
0.13% 0.13% 0.13% 0.13% 0.064% 0.064%
q=
10 -12
0.14% 0.12% 0.075%
q = 10-1° no correction q = 0 0.16% 0.13% 0.31%
0.28% 0.39% 2.2%
q=
10 -14
0.67% 0.22% 0.55% 0.38% 3.7% 1.9%
q=
10 -12
q= 10-1°
0.21% 0.37% 1.7%
0.67% 0.55% 3.7%
q = 10 -12
q= 10-1°
0.093% 0.16% 0.26%
0.37% 0.17% 0.16%
(B) Model shown in Table 4B, data set NIR2 calibration
prediction
no correction q = 0 Polystyrene a High std. a LOWstd. a
q=
10 -14
0.12% 0.17% 0.13% 0.13% 0.065% 0.071%
q = 10 -12
q = 10-l° no correction q = 0
0.12% 0.12% 0.073%
0.14% 0.14% 0.29%
a a a
q=
10 -14
0.11% 0 . 1 2 % 0.16% 0.16% 0.36% 0.32%
a Results indicatedmodel colinearity.
values larger than this tended to result in over fitting - - very small values for SEC, but very noisy calibration sensitivities. Values of P that were smaller than this resulted in larger values of the SEC, and increasing deviations between the true component spectra and the calibration sensitivities returned by the algorithm. The values obtained using P = I . 10 -3 were judged as acceptable due to the similarities to the SEC's found using the classical Kalman filter regression results. 4.3. Analysis o f N I R standards
The ability of the baseline drift algorithms to compensate for errors in NIR spectra was also examined. First the NIR1 data set was analyzed, which consisted of a series of 16 spectra of the strongly absorbing reference. The concentration of each standard was arbitrarily assigned a value of one. Here the variance of the noise in the spectra was estimated at 1 X 1 0 - 7 ( r ( k ) ) , and the estimate for the variance of the random baseline drift was 1 X 10 -5 (qb) (model shown in Table 1). In this case the SEP for the last 8 spectra, used as a prediction set, was reduced from 0.10% to 0.058%. If the process affecting the baseline is assumed to be a combination of a linear drift and random drift (model shown in Table 4A), the SEP was 0.078% with an assumed variance of the random drift contribution of 1 X 10 -8. As was the case for the
synthetic data, the system variance decreases when a portion of the fluctuations are assumed to arise from a linear process. In this particular case, the fluctuations affecting the data appeared to be approximately sinusoidal (not linear), and so the approach that did not incorporate a correction for linear drift gave a better result for the SEP. Next, the complete series of standard spectra (data set NIR2) was evaluated by a Kalman filter, where the concentrations were again arbitrarily set to one for each component, so that the Y matrix ( h T vectors in Kalman filter notation) had the following form: - 100100 , . .
010 010 001 001 with the pattern depending on the sequence in which the standards were measured. These spectra are shown in Fig. 6, and it can be seen that the spectra for each standard are quite reproducible, with little response drift apparent at this plot scale. Here, no offset was included in the model, and each of the three components was assigned to a separate drift component, but with equal variances, qc~m q~B = =
S.C. Rutan et al. / Chemometrics and Intelligent Laboratory Systems 35 (1996) 199-211 2.5
2
15
b
i 1500
1000
i 20(]0 WIl~h
(nm)
Fig. 6. Three standard NIR spectra with replicates used to test the Kalman filter drift algorithm. (a) polystyrene; (b) high absorbance halon standard; (c) low absorbance halon standard.
q~c (see Table 4C). The results, where the first 102 samples were used for calibration, and the final 45 were used for prediction, are shown in Table 10A. A slight improvement in the prediction errors can be seen, with an optimal value with q~, set equal to 1 × 10 -12. This small value for q, and the slight improvement in the prediction error, is consistent with the high stability of this instrument, and the care taken during measurement to minimize systematic drift. The low absorbance standard shows much higher prediction errors compared to the calibration errors, and higher prediction errors than for the other standards. This may be due to a drift in the instrument baseline, which was not included in this model, and would probably affect the results for the low absorbance standard more than the other two standards. To check this, an offset was added to the model, and the SEC and SEP values were recalculated. These results are shown in Table 10B (the corresponding
209
model used is shown in Table 4B). These results seem to confirm this hypothesis, as the SEC and SEP values are much more consistent, and the SEP's for the low absorbance standard are reduced by almost an order of magnitude. While the drift correction algorithm provided good results in this case, the regular Kalman filter did not, probably because of the multi-colinearity between the high and low absorbance standard when combined with an offset. It is not clear why the model with drift correction included was more stable. These data can now serve as a reference data set, where additional drift in wavelength positioning, sensitivity, and baseline can be added, in order to better characterize the performance of this and other algorithms for detection and correction for drift in NIR systems [8].
4.4. Analysis of UV-visible spectra of transition metals As another test of the capabilities of the Kalman filter drift correction approaches described here, a set of UV-visible spectra of transition metals, shown in Fig. 7, were studied [24]. The data consisted of 8 calibration spectra, and 19 test spectra, and the calibration and prediction errors for these data, with no drift correction, are shown in Table 11. By using the drift correction approach represented in Table 4A, somewhat improved results were obtained, also shown in Table 11. In addition, the errors for each of the three components are more uniform, probably due to the use of equal q values for all three components. Once a q value has been established in this way for a particular instrument, future data can be handled using the same value. Using this approach of comparing
Table 11 Results for Kalman analysis of transition metal UV-visible DAD spectra Calibration (8 samples)
Prediction (19 samples)
a
SEC(Co 2+ )
SEC(Cu 2 + )
SEC(Ni 2 + )
SEP(Co 2 + )
SEP(Cu 2 + )
SEP(Ni 2 + )
b
1.78% 1.41%
0.38% 1.16%
2.41% 2.10%
2.91% 1.76%
0.52% 1.49%
3.62% 2.25%
No drift correction applied; classical calibration with offset (model in Table 1, q = 0). b Drift correction according to model in Table 4A, with q,~ = q~ = 1 × 10 -2. a
210
S.C. Rutan et al. / Chemometrics and Intelligent Laboratory Systems 35 (1996) 199-211 0.35
sources of calibration drift in analytical instruments based on variations in model spectra, and to implement a drift correction approach based on PLS or PCR latent variables for real calibration standards and unknowns. Helland et al. have described a recursive approach for updating PLS latent variables [25], however, this approach uses an empirical weighting scheme and does not explicitly model instrument drift. The Kalman filter update approach described here would allow more appropriate weighting factors to be used and would permit drift parameters to be measured directly.
0.3
0.25
0.1
0,05
O,
300
350
i
i
400
450
i
i
i
500 550 600 Wavelength (nm)
650
i
i
700
750
800
Fig. 7. Calibration and prediction UV-visible spectra for mixtures containing three transition metals used to test the Kalman filter drift algorithm.
several different models allows the best model to be chosen for the particular application. This data set was also used to investigate the ability of the algorithm to detect and correct for the presence of a purely linear drift in the baseline. A linear drift in increments of 0.01 absorbance units was added to these data, and the model shown in Table 4A was used to estimate the drift component. Since the drift is a change relative to the previous spectrum, the estimates for drift are only accurate once the data from several spectra have been evaluated. Once the first spectrum has been fit, however, the estimate for the drift was accurately determined. The best resuits were obtained for a q value of zero, since there was no random component to the drift. 4.5. Conclusions In this paper we have attempted to demonstrate that the Kalman filter can be used to follow and correct for drift in the parameters of calibration systems. This was demonstrated with data from standard samples that were relatively consistent, and affected by only minor contributions from drift. From the studies with the synthetic data, it can be seen that these approaches can be used to correct for a substantial amount of drift. Several model variations have been given, and each instrument type may benefit from the use of a different model. Subsequent studies will be focused on using the Kalman filter to understand the
Acknowledgements S.C.R. would like to acknowledge grant support from the U.S. Department of Energy, and a fellowship from the Nationaal Fonds von Wetenschappelijk Onderzoek, Belgium.
References [1] F.K. Hildrum, T. Isaksson, T. Naes and A. Tandberg, Near Infrared Spectroscopy (Ellis Horwood, New York, 1992). [2] B.G. Osborne, T. Fearn and P.H. Hindle, Practical NIR Spectroscopy, 2nd Ed. (Longman Scientific and Technical, New York, 1993). [3] K.N. Andrew, N.J. Blundell, D. Price and P.J. Worsfold, Anal. Chem. 66 (1994) 917A. [4] P. MacLaurin and P.J. Worsfold, Microchem. J. 45 (1992) 178. [5] P. MacLaurin, P.J. Worsfold, P. Norman and M. Crane, Analyst 118 (1993) 617. [6] K.N. Andrew and P.J. Worsfold, Analyst 119 (1994) 1541. [7] L. Huber and S.A. George, Eds., Diode Array Detection in HPLC (Marcel Dekker, Inc., New York, 1993). [8] E. Bouveresse, S.C. Rutan, Y. Vander Heyden, W. Penninckx and D.L. Massart, Anal. Chim. Acta (1996), submitted. [9] E. Bouveresse, D.L. Massart and P. Dardenne, Anal. Chim. Acta 297 (1994) 405. [10] E. Bouveresse, D.L. Massart and P. Dardenne, Anal. Chem. 67 (1995) 1381. [11] O.E. de Noord, Chemom. Intell. Lab. Syst. 25 (1994) 85. [12] E. Bouveresse and D.L. Massart, Vibr. Spectrosc. 11 (1996) 3. [13] H.A. Strobel, Chemical Instrumentation: A Systematic Approach, 2nd Ed. (Addison-Wesley, London, 1973) p. 191. [14] P.C. Thijssen, S.M. Wolfrum, G. Kateman and H.C. Smit, Anal. Chim. Acta 156 (1984) 87.
S.C. Rutan et al. / Chemometrics and Intelligent Laboratory Systems 35 (1996) 199-211
[15] P.C. Thijssen, Anal. Chim. Acta 162 (1984) 253. [16] P.C. Thijssen, G. Kateman and H.C. Smit, Anal. Chim. Acta 173 (1985) 265. [17] P.C. Thijssen, L.T.M. Prop, G. Kateman and H.C. Smit, Anal. Chim. Acta 174 (1985) 27. [18] D. Wienke, T. Vijn and L. Buydens, Anal. Chem. 66 (1994) 841. [19] S.D. Brown, Chemom. Intell. Lab. Syst. 10 (1991) 87-105. [20] S.C. Rutan, Chemom. Intell. Lab. Syst. 6 (1989) 191. [21] S.C. Rutan, J. Chemom. 4 (1990) 103.
211
[22] R.J. Barnes, M.S. Dhanoa and S.J. Lister, Appl. Spectrosc. 43 (1989) 772. [23] Instructions for the Infraalyzer Reference Cup Kit (PN-189B315-01) (Bran + Luehbe Analyzing Technologies, Elmsford, NY, 1987). [24] K.N. Andrew, Ph.D. Thesis, University of Plymouth, Plymouth (1996). [25] K. Helland, H.E. Bemtsen, O.S. Borgen and H. Martens, Chemom. Intell. Lab. Sys. 14 (1991) 129.