Simplex optimization of the adaptive Kalman filter


Analytica Chimica Acta, 167 (1985) 39-50 Elsevier Science Publishers B.V.. Amsterdam - Printed in The Netherlands

SIMPLEX OPTIMIZATION OF THE ADAPTIVE KALMAN FILTER

SARAH C. RUTANa and STEVEN D. BROWN*

Department of Chemistry, Washington State University, Pullman, WA 99164-4630 (U.S.A.)

(Received 23rd July 1984)

SUMMARY

A method for determining concentrations from overlapped spectral data when a complete model is not available is described. This approach combines simplex optimization with the adaptive Kalman filter to yield a method in which initial guesses for the adaptive filter are generated by the simplex algorithm. The performance of the method is demonstrated by deconvoluting overlapped synthetic data and spectral data.

The determination of species from responses that are overlapped is of interest in many areas of analytical chemistry. Spectroscopic, chromatographic and electrochemical methods often yield responses in which the signal arising from the analyte of interest overlaps with signals from known or unknown interfering species. If the responses for all the species that contribute to the mixture response can be described empirically or theoretically, accurate estimates for the concentrations can be obtained with a variety of techniques based on least-squares regression. If not all of the responses for the species which contribute to the overall mixture response are known, the concentrations that are estimated will be inaccurate. Recently, a technique called adaptive Kalman filtering has been proposed as a method for estimating concentrations when responses for all the species which contribute to the overall response may not be known [1].

The adaptive Kalman filtering approach for compensation of model errors is complicated by the requirement that the filter needs to be restarted several times with different initial guesses until an optimal fit is obtained. If the procedure for choosing new initial guesses could be automated, this procedure would be simplified. The adaptive filter was found to yield optimal results when the diagonal elements of the covariance matrix reached minimum values [1]. An appropriate measure of the performance of the adaptive filter is therefore related to the trace of the covariance matrix. If some suitable function of the diagonal elements of the covariance matrix is monitored as a function of the initial guesses for the various component concentrations, a variance surface is obtained. When this surface is relatively noise-free, the results from previous fits can be used to predict new initial guesses which should yield improved results.

Several algorithms have been developed which allow surfaces that are not easily described by a model to be searched for optimal values. Gradient search techniques and evolutionary operation methods are examples of such techniques [2-5]. One type of evolutionary optimization approach, simplex optimization, has been applied in many areas of chemistry [2], including optimization of chromatographic conditions [6, 7], fitting data to nonlinear expressions that are not easily fitted by using other methods [8], and optimization of experimental conditions [9].

In this paper, examples of the variance surface for the adaptive filter are plotted for two-component systems, yielding three-dimensional surfaces that are relatively easily visualized. These surfaces can then be searched by using the simplex algorithm; the applicability of this approach is demonstrated by deconvoluting the u.v.-visible spectral responses of mixtures of metal ions in solution.

aPresent address: Department of Chemistry, Virginia Commonwealth University, Richmond, VA 23284-0001, U.S.A.

0003-2670/85/$03.30 © 1985 Elsevier Science Publishers B.V.

THEORY

Adaptive Kalman filter

The adaptive filter algorithm described previously [1] is prone to roundoff errors, especially when the measurements are very precise. In similar cases, a square-root filter algorithm has been found to alleviate this difficulty [10, 11]. The algorithm involves expressing the covariance matrix, P, in terms of a covariance square root, S: $P = S S^T$, where S is a lower triangular matrix. The equations for the square-root filter are given in Table 1. The adaptive filter estimate for the measurement variance, R, is calculated by using the current estimate for P [1]; the square-root algorithm equivalent for the adaptive estimate of R is

$$R(k) = \frac{1}{m}\sum_{j=1}^{m} v(k-j)\,v(k-j) \;-\; \left[H^T(k)\,S(k|k-1)\right]\left[H^T(k)\,S(k|k-1)\right]^T \qquad (9)$$
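The recursion of Table 1 and the adaptive estimate of Eq. (9) can be sketched in Python with NumPy as follows. This is a minimal illustration and not the program used in this work: a scalar measurement is assumed at each point, F = I as stated in Table 1, the function names and the numerical values in the example are hypothetical, and the small positive floor on R is an added assumption for the case in which the subtraction in Eq. (9) gives a non-positive value.

```python
import numpy as np

def sqrt_kalman_update(x, S, h, z, R):
    """One measurement update, Eqs. (3)-(8) of Table 1 (scalar measurement, F = I)."""
    G = S.T @ h                               # Eq. (4): G = S^T H
    a = 1.0 / (G @ G + R)                     # Eq. (5): 1/a = G^T G + R
    d = 1.0 / (1.0 + np.sqrt(a * R))          # Eq. (6)
    K = a * (S @ G)                           # Eq. (3): Kalman gain
    v = z - h @ x                             # innovation, Z - H^T X
    x_new = x + K * v                         # Eq. (7): state estimate update
    S_new = S - a * d * np.outer(S @ G, G)    # Eq. (8): covariance square root update
    return x_new, S_new, v

def adaptive_R(recent_v, S, h, floor=1e-12):
    """Eq. (9): mean squared innovation over the last m points minus G^T G."""
    G = S.T @ h
    R = np.mean(np.square(recent_v)) - G @ G
    return max(float(R), floor)               # floor is an assumption, keeps R > 0

# Hypothetical two-component example: one measurement processed
x0 = np.array([30.0, 30.0])                   # initial guesses for X
S0 = np.sqrt(20.0) * np.eye(2)                # square root of P = 20 I
h = np.array([0.8, 0.3])                      # model responses at this point
x1, S1, v = sqrt_kalman_update(x0, S0, h, z=110.0, R=100.0)
P1 = S1 @ S1.T                                # covariance recovered from its square root
```

Multiplying the updated square root by its transpose should reproduce the conventional covariance update, which is a convenient check on any implementation of Eq. (8).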

TABLE 1

Algorithm equations for the square-root Kalman filter

State estimate extrapolation:
$$X(k|k-1) = F(k,k-1)\,X(k-1|k-1) \qquad (1)$$

Covariance square root extrapolation:
$$S(k|k-1) = F(k,k-1)\,S(k-1|k-1)\,F^T(k,k-1) \qquad (2)$$
where $F(k,k-1) = I$

Kalman gain:
$$K(k) = a\,S(k|k-1)\,G(k) \qquad (3)$$
where
$$G(k) = S^T(k|k-1)\,H(k) \qquad (4)$$
$$1/a = G^T(k)\,G(k) + R(k) \qquad (5)$$
$$d = 1/\{1 + [a\,R(k)]^{1/2}\} \qquad (6)$$

State estimate update:
$$X(k|k) = X(k|k-1) + K(k)\,[Z(k) - H^T(k)\,X(k|k-1)] \qquad (7)$$

Covariance square root update:
$$S(k|k) = S(k|k-1) - a\,d\,S(k|k-1)\,G(k)\,G^T(k) \qquad (8)$$

Simplex optimization

The theory of simplex optimization as it has been applied in chemistry has been discussed by several authors [5, 6, 8, 9]. Several articles have discussed refinements to this basic procedure [9, 12, 13]. In order for this technique to be successful, two requirements must be met. First, an appropriate measure of the “goodness of fit”, or in this case, a measure of the adaptive filter performance, must be available. An appropriate measure for these studies is $Y = \sum_i \log P_{ii}$. A plot of Y as a function of the initial guesses for the parameters yields a variance surface on which the simplex routine can search for an optimum value (in this case, the minimum Y). The second requirement is that this variance surface must show a clear optimum which coincides with an optimal, or near optimal, result. The optimal result for the adaptive filter corresponds to the situation for which the accuracies of the resulting concentration estimates are high.

In order to verify that these two requirements are met, it is valuable to compare a plot of the accuracy of the concentration estimate(s) vs. the initial guesses (the error surface) to a plot of the measure, Y, vs. the initial guesses (the variance surface). If the minimal values on the variance surface coincide with the minimal values on the error surface, and if the variance surface is not extremely noisy, it should be possible to use simplex optimization to determine the initial guesses which yield the best parameters.

EXPERIMENTAL

Ultraviolet-visible spectrometry

A Hewlett-Packard 5841A diode-array spectrophotometer was used to measure the spectra of all mixtures and the spectra of the corresponding pure components. Stock solutions were prepared at concentrations which yielded peak absorbances of approximately 2.5. The following reagent-grade chemicals were used to prepare solutions at the indicated concentrations: Cu(NO₃)₂·3H₂O, 0.194 M; Co(NO₃)₂·6H₂O, 0.501 M; Ni(NO₃)₂·6H₂O, 0.504 M; UO₂(NO₃)₂·6H₂O, 0.313 M; picric acid (2,4,6-trinitrophenol), 0.1 mM. These stock solutions were chosen to parallel closely those used by Jochum and Schrott [14] in their study of a non-negative least-squares multicomponent analysis algorithm, so that their technique could be compared with the Kalman filter. The concentrations of these five substances used in a series of five mixtures are shown in Table 2. Spectra of these five mixtures, as well as of the pure stock solutions, were obtained with the spectrophotometer and transferred to an LSI-11/23 computer [1] through an RS-232 interface to a serial port.

TABLE 2

Spectral deconvolution with the Kalman filter

Mixture^a   Cu²⁺                    Co²⁺                      Ni²⁺                    UO₂²⁺                     Picric acid               Variance
            Conc. (M)   Error (%)   Conc. (M)   Error (%)     Conc. (M)   Error (%)   Conc. (M)   Error (%)     Conc. (M)   Error (%)     of fit^b
1           0.03874     3.6         0.1001      0.5           0.1007      0.1         0.06262     -1.6          0.2000      1.0           3.7 × 10^[?]
2           0.01250     7.1         0.1615      [illegible]   0.2275      0.4         0.01010     12.0          0.01290     -0.8          2.8 × 10^-5
3           0.1761      0.0         0.0         [illegible]^c 0.04579     4.0         0.0         [illegible]^c 0.0         (0.00002)^c   1.4 × 10^[?]
3^e         0.1761      1.1         0.0         (0.00229)^c   0.04579     -1.5        ^d          ^d            ^d          ^d            4.9 × 10^[?]
4           0.01490     3.6         0.03851     4.2           0.03871     9.5         0.1204      -9.0          0.03846     -2.6          8.1 × 10^-5
5           0.01384     8.1         0.03575     1.7           0.3598      -0.1        0.02236     7.6           0.00714     -4.6          1.4 × 10^-5

^a Mixtures 1-3 resemble those mixtures reported by Jochum and Schrott. Resolution is 2 nm/point, collected over the wavelength range 350-820 nm, except where noted. ^b Deterministic variance of the fit. ^c Concentration estimated by the filter when the species was not present in solution. ^d Not estimated. ^e Fit over a limited range, 620-820 nm.

Synthetic data

The overlapped gaussian peaks that were generated by the computer in the previous study [1] to characterize the performance of the adaptive filter were also used in this study to generate variance surfaces and error surfaces for the adaptive filter as a function of the initial guesses. These peaks, along with the mixture spectrum generated by adding these responses together, are shown in Fig. 1. Four different models were used in attempts to obtain the relative contributions of these peaks to the mixture spectrum by using the simplex-optimized adaptive Kalman filter; these models are summarized in Table 3.

Fig. 1. Synthetic spectra: (A) mixture spectrum; (B) component peak centered at 565 nm; (C) component peak centered at 575 nm; (D) component peak centered at 585 nm.

TABLE 3

Generation of variance and error surfaces

Model^a   Peak 1   Peak 2   R(1)^b           P(i,i)^c   X(i)^d
A         565.0    575.0    1.0 × 10^[?]     20.0       5.0
B         565.0    585.0    100.0            20.0       5.0
C         575.0    585.0    100.0            20.0       5.0
D         575.0    585.0    100.0            0.01       1.0

^a Component peaks included in the model, see Fig. 1. ^b Initial value for the measurement variance. ^c Initial value for the diagonal elements of the covariance matrix. ^d Increment in the initial guesses for X used to generate the variance and error surfaces.

Computer programs

The computer programs described in the previous study for the adaptive Kalman filter were used here, with the exception that the standard filter algorithm was replaced by the square-root algorithm. This algorithm has been presented by Kaminski et al. [10] and implemented by Brown et al. [11]. Another program was written which generated an n-dimensional array of values for the measure, Y, as a function of the initial guesses chosen at regular intervals. In addition, the total percent deviations of the results obtained with the adaptive filter from the true values for the concentrations were calculated for synthetic data. These calculations yield the data which comprise the variance surface and the error surface, respectively. These results were calculated on the LSI-11/23 and were subsequently transferred to an Amdahl V8 mainframe computer, where contour graphics routines were available. Variance surfaces and error surfaces were plotted by using these routines for the synthetic data described above.

The program for the simplex optimization of the adaptive Kalman filter used the Nelder-Mead simplex algorithm [15], as described by O’Neill [16]. This modified simplex algorithm was implemented on the LSI-11/23 previously [17]. A flow chart which summarizes the simplex-optimized adaptive Kalman filter is shown in Fig. 2.

Fig. 2. Simplex-optimized adaptive Kalman filter.

RESULTS AND DISCUSSION

The implementation of the square-root filter affected the final values of the covariance matrix to some extent, especially for highly precise data with initial guesses close to the “correct” values. This effect, attributed to roundoff errors affecting the traditional algorithm, did not cause any significant change in the overall accuracy of the adaptive filter, except for the case in which measurements with large model errors were processed initially (Models C and D in Table 4). In this case, there was a substantial improvement in the accuracy of the final estimates for the concentrations, when compared to the results obtained in the previous study [1].

Synthetic data

The synthetic data shown in Fig. 1 were used to verify that the variance surfaces and error surfaces meet the requirements for the successful implementation of the simplex optimization algorithm. These plots are shown in Figs. 3-6 for two of the models given in Table 3. Although the variance surfaces in some cases show several optimum locations, in each case the optimum locations correspond to locations on the error surface where the total deviation of the results from the true values is within a few percent. This was true for all the surfaces generated, provided that the basic restriction that the model be accurate for some relatively important region of the spectrum is satisfied. The corresponding results for fits with the simplex-optimized adaptive Kalman filter for these surfaces are summarized in Table 4.

In most cases, the simplex-based algorithm is able to identify a pair of initial guesses which yield accurate estimates of concentration. An exception to this is observed when the initial covariance elements are small and the initial guesses used to start the simplex are not very accurate. In this case, the variance surface is relatively flat, except in the region very close to the optimum guesses, as seen in Fig. 5, and the simplex “gets lost” in the noise on the flat portion of the surface, as occurred for the fit described on the last line of Table 4. The best procedure for finding the optimum fit with the adaptive Kalman filter in this case is to start the simplex with random guesses and relatively large initial variances to locate an optimum region on the variance surface. When the correct region has been located, a new simplex optimization is started with smaller initial variances, and this much narrower optimum region is searched to yield estimates of concentration. This can be seen by comparing the contour plots shown in Fig. 3 and Fig. 5, where the use of small initial variances yields a narrower optimal region (Fig. 5), while the use of larger initial variances yields a much broader optimal region (Fig. 3).

TABLE 4

Simplex-optimized adaptive Kalman filter for synthetic spectra

                              Component 1                                  Component 2
Model^a   Initial     Best^c     Result^d   Variance^e        Best^c     Result^d   Variance^e        Error     Y         Iter.^g
          guesses^b   X(1)       X(1)       P(1,1)            X(2)       X(2)       P(2,2)            (%)^f
A         30          64.502     99.993     1.2 × 10^[?]      108.611    100.304    7.3 × 10^[?]      0.3       -12.07    79
A         100         141.086    99.986     2.3 × 10^[?]      95.131     100.395    5.8 × 10^[?]      0.4       -11.88    65
A         170         133.172    99.990     3.8 × 10^[?]      131.677    100.380    6.3 × 10^[?]      0.4       -12.63    77
B         30          75.772     100.050    2.5 × 10^-4       55.526     99.751     6.5 × 10^-4       0.3       -6.79     72
B         100         123.197    99.755     7.1 × 10^-6       65.323     99.768     6.5 × 10^-4       0.5       -8.33     109
B         170         127.079    100.015    7.0 × 10^-6       62.196     99.775     6.5 × 10^-4       0.2       -8.34     58
C         30          58.560     100.298    3.5 × 10^[?]      85.014     99.991     6.2 × 10^[?]      0.3       -11.67    57
C         100         56.081     101.567    1.9 × 10^[?]      116.118    99.971     1.4 × 10^[?]      1.6       -11.59    84
C         170         58.121     100.304    3.5 × 10^[?]      85.642     99.990     6.3 × 10^[?]      0.3       -11.67    122
D         30          100.050    100.465    2.3 × 10^[?]      99.968     99.985     1.1 × 10^[?]      0.5       -10.57    101
D         100         100.061    100.462    2.4 × 10^[?]      99.967     99.985     1.1 × 10^[?]      0.5       -10.57    54
D         170         99.188     98.157     9.8 × 10^-3       145.026    144.957    1.0 × 10^-2       47.0      -4.01     202

^a Models are described in Table 3. ^b Initial values for the elements of X used to start the simplex. ^c Best initial guess determined by simplex optimization. ^d Final result from the adaptive filter for the best initial guess. ^e Covariance matrix element estimated by the filter. ^f Total percent deviation of both results from the true values. ^g Number of simplex iterations required.

Fig. 3. Variance surface for model A in Table 3. Contours represent the values for Y; (+) represent the best guesses estimated by the simplex algorithm.

Fig. 4. Error surface for model A in Table 3. Contours represent the values for the total percent deviation; (+) represent the best guesses estimated by the simplex algorithm.

Fig. 5. Variance surface for model D in Table 3. Contours represent the values for Y; (+) represent the best guesses estimated by the simplex algorithm.

Fig. 6. Error surface for model D in Table 3. Contours represent the values for the total percent deviation; (+) represent the best guesses estimated by the simplex algorithm.

Ultraviolet-visible spectra

The u.v.-visible spectra were deconvoluted with both the ordinary filter and with the adaptive filter that was optimized by using the simplex algorithm. This allows a comparison to be made between the results obtained when the model is known relatively accurately and the results obtained when there are components missing from the model. The five mixture spectra summarized in Table 2 were deconvoluted with the ordinary Kalman filter initially in order to compare the performance of the Kalman filter to the non-negative least-squares deconvolution technique described by Jochum and Schrott [14]. The first three mixture spectra closely resemble the mixture spectra used in that study. The results for the Kalman filter deconvolution of these spectra are given in Table 2; an example of a fit is shown in Fig. 7, and the individual component spectra are shown in Fig. 8. The quality of the results in general is comparable to that obtained by Jochum and Schrott. The most noticeable difference between these results and the results of Jochum and Schrott was the quality of the estimate for uranyl ion in the second mixture. The concentration of this minor component was estimated by Jochum and Schrott with a 38% error compared to a 12% error when the Kalman filter was used. The results reported here for the concentration of copper(II) ions tend to be high; this could be due to the somewhat poorer performance of the diode-array spectrophotometer with respect to stray light at the long wavelength end of the diode array, causing deviations from Beer’s law at the high peak absorbance used here.

Fig. 7. Typical Kalman filter fit for Mixture 1 in Table 2: (A) original spectrum; (B) Kalman filter fit; (C) fit residuals. Horizontal axis: wavelength (nm).

Fig. 8. Individual component spectra: (A) Cu²⁺; (B) Co²⁺; (C) Ni²⁺; (D) UO₂²⁺; (E) picric acid. Horizontal axis: wavelength (nm).

To demonstrate the performance of the simplex-optimized adaptive filter for quantifying one or several components in the presence of one or more unknown components, the adaptive Kalman filter was also applied to the spectra described above. A typical case occurs when the spectral responses for known components overlap a spectral response arising from one or more unknown contaminants. In this case, the use of a measurement at the wavelength where the maximum response occurs to quantify the component can yield results that are biased by the presence of the interfering species. Another option would be to choose a wavelength which is not “contaminated” by the interference; however, this option can lower sensitivity substantially. The adaptive filter offers an alternative to these approaches. In this study, three cases were investigated to establish the applicability of the adaptive filter for alleviating these difficulties. These are as follows: the quantitation of UO₂²⁺, Ni²⁺, Co²⁺, and picric acid in the presence of “unknown” Cu²⁺; the quantitation of Cu²⁺, Co²⁺, and Ni²⁺ in the presence of “unknown” UO₂²⁺ and picric acid; and the quantitation of Co²⁺ in the presence of “unknown” Cu²⁺, Ni²⁺, UO₂²⁺, and picric acid. A summary of the results is given in Table 5. While these concentration estimates tend to be slightly less accurate than those obtained with a valid model using the ordinary filter, in most cases the errors for the known components are less than 10%.
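As a rough illustration of the procedure applied to these mixtures, the sketch below searches the initial guesses for the minimum of Y with a Nelder-Mead simplex, for a synthetic mixture in which one overlapping component is left out of the model. It is not the program used in this work: the peak width, the window length m, the floor on R, the starting values and the use of `scipy.optimize.minimize` in place of the O’Neill implementation are all assumptions made for the example, and the function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def gaussian(wl, center, width=8.0):
    """Unit-height Gaussian peak (the width is a hypothetical choice)."""
    return np.exp(-0.5 * ((wl - center) / width) ** 2)

def variance_measure(x0, H, z, p0=20.0, R0=100.0, m=5, floor=1e-12):
    """Run a square-root adaptive filter from guess x0; return (Y, final estimate)."""
    x = np.array(x0, dtype=float)
    S = np.sqrt(p0) * np.eye(x.size)          # initial covariance square root
    innovations = []
    for h, zk in zip(H, z):
        G = S.T @ h                           # Eq. (4)
        if len(innovations) >= m:             # Eq. (9) once m innovations exist
            R = max(float(np.mean(np.square(innovations[-m:])) - G @ G), floor)
        else:
            R = R0                            # initial measurement variance
        a = 1.0 / (G @ G + R)                 # Eq. (5)
        d = 1.0 / (1.0 + np.sqrt(a * R))      # Eq. (6)
        v = zk - h @ x                        # innovation
        x = x + a * (S @ G) * v               # Eqs. (3) and (7)
        S = S - a * d * np.outer(S @ G, G)    # Eq. (8)
        innovations.append(v)
    Y = float(np.sum(np.log10(np.sum(S * S, axis=1))))   # Y = sum of log P_ii
    return Y, x

# Synthetic mixture: peaks at 565 and 575 nm are modelled, the 585 nm peak is "unknown"
wl = np.arange(520.0, 632.0, 2.0)
known = np.column_stack([gaussian(wl, 565.0), gaussian(wl, 575.0)])
z = 100.0 * (known.sum(axis=1) + gaussian(wl, 585.0))

start = np.array([30.0, 30.0])                # initial guesses used to start the simplex
Y_start, _ = variance_measure(start, known, z)
result = minimize(lambda g: variance_measure(g, known, z)[0], start,
                  method="Nelder-Mead", options={"maxiter": 200})
Y_best, x_best = variance_measure(result.x, known, z)
print("best guesses:", result.x, "estimates:", x_best, "Y:", Y_best)
```

The search variables here are the initial guesses rather than the concentrations themselves; the concentration estimates are whatever the filter returns when started from the best guesses found.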
TABLE 5

Simplex-optimized adaptive Kalman filter - mixture spectra

Fit   Mixture^a   Cu²⁺                     Co²⁺                     Ni²⁺                     UO₂²⁺                    Picric acid
                  Conc.^b (M)  Error^c (%) Conc.^b (M)  Error^c (%) Conc.^b (M)  Error^c (%) Conc.^b (M)  Error^c (%) Conc.^b (M)  Error^c (%)
1     1           0.04060      4.8         0.1025       2.4         0.09907      -1.6        ^d           ^d          ^d           ^d
2     2           0.01372      9.8         0.1671       1.3         0.2226       -2.2        ^d           ^d          ^d           ^d
3     4           0.01367      -6.3        0.04547      18.0        0.04241      9.4         ^d           ^d          ^d           ^d
4     5           0.01505      8.7         0.03844      7.5         0.3571       -6.8        ^d           ^d          ^d           ^d
5     1           ^d           ^d          0.1029       2.8         ^d           ^d          ^d           ^d          ^d           ^d
6     2           ^d           ^d          0.1669       3.3         ^d           ^d          ^d           ^d          ^d           ^d
7     4           ^d           ^d          0.04116      6.9         ^d           ^d          ^d           ^d          ^d           ^d
8     5           ^d           ^d          0.04074      14.0        ^d           ^d          ^d           ^d          ^d           ^d
9     1           ^d           ^d          0.1006       0.5         0.09725      -3.4        0.06312      0.8         0.01981      -1.0
10    2           ^d           ^d          0.1657       2.6         0.2327       2.3         0.01060      5.0         0.01226      -5.0
11    4           ^d           ^d          0.04033      4.7         0.04014      3.7         0.1116       -7.3        0.03701      -3.8
12    5           ^d           ^d          0.03739      4.6         0.3562       -0.4        0.02225      -0.5        0.00773      8.2

^a Mixtures are described in Table 2. ^b Concentrations estimated by the simplex-optimized adaptive filter. ^c Percent deviation from the true concentration, reported in Table 2. ^d Response for this species was omitted from the model.

An exception is for the estimation of Co²⁺ (fit 3); this mixture contained five-fold excess amounts of each of the “unknown” uranyl ion and picric acid species, and the Co²⁺ concentration was overestimated by 18%. The other exception is for the estimation of Co²⁺ (fit 8), where all other components were “unknown”, and the mixture contained a ten-fold excess concentration of Ni²⁺. In this case, the Co²⁺ concentration was overestimated by 14%. These relatively large errors arise from the fact that the underlying assumptions for the adaptive filter have not been met for those mixtures and models. In the first case, the five-fold excess concentration of the uranyl ion and picric acid means that these species will have small, nonzero absorbances throughout most of the Co²⁺ absorption region, and the model restrictions for the adaptive filter are no longer strictly satisfied. In the second case, the “unknown” Ni²⁺ can be seen to absorb substantially throughout the spectral range used here (see Fig. 8), so that the estimation of the Co²⁺ concentration in the presence of a large concentration of “unknown” Ni²⁺ is not reliable.

For the studies described here, the identities of the “unknown” species were known, so that it was possible to predict under what conditions the adaptive filter would be successful. For most applications, this would not be the case; however, a knowledge of the types of species involved could aid in predicting the mixture responses that can be reliably deconvoluted by using the adaptive filter. In addition, if the model is not correct for some significant portion of the spectrum where each of the known species gives rise to a significant absorbance, a poor fit can result. A poor fit, as indicated by a large deterministic variance of fit, is therefore an indication that the limiting assumption is not valid. The simplex-optimized adaptive Kalman filter should be applicable to any analytical method, provided that the relationship between detector response and concentration is linear.

This work was supported by the Graduate School, Washington State University, through a Grant-in-Aid. Additional funding was provided by the Department of Energy, through Grant (DE-FG06-84ER13202). We thank Mr. Steve Sibley, of Hewlett-Packard Instruments, for the loan of the spectrometer.

REFERENCES

1 S. C. Rutan and S. D. Brown, Anal. Chim. Acta, 160 (1984) 99.
2 S. N. Deming and S. L. Morgan, Anal. Chem., 45 (1973) 278A.
3 G. R. Walsh, Methods of Optimization, Wiley, New York, 1975.
4 W. Spendley, G. R. Hext and F. R. Himsworth, Technometrics, 4 (1962) 441.
5 G. E. P. Box, W. G. Hunter and J. S. Hunter, Statistics for Experimenters, Wiley, New York, 1978.
6 J. C. Berridge, Analyst (London), 109 (1984) 291.
7 S. L. Morgan and S. N. Deming, J. Chromatogr., 112 (1975) 267.
8 S. N. Deming and L. R. Parker, CRC Crit. Rev. Anal. Chem., 7 (1978) 187.
9 S. L. Morgan and S. N. Deming, Anal. Chem., 46 (1974) 1170.
10 P. G. Kaminski, A. E. Bryson, Jr. and S. F. Schmidt, IEEE Trans. Auto Control, AC-16 (1971) 727.
11 T. F. Brown, D. M. Caster and S. D. Brown, Anal. Chem., 56 (1984) 1214.
12 L. A. Yarbro and S. N. Deming, Anal. Chim. Acta, 73 (1974) 391.
13 A. D. Brookes, J. J. Leary and D. W. Golightly, Anal. Chem., 53 (1981) 720.
14 P. Jochum and E. L. Schrott, Anal. Chim. Acta, 157 (1984) 211.
15 J. A. Nelder and R. Mead, Comput. J., 7 (1965) 303.
16 R. O’Neill, Appl. Stat., 13 (1971) 338.
17 J. J. Toman and S. D. Brown, Anal. Chem., 53 (1981) 1497.