Simultaneous determination of cobalt and nickel by flow injection analysis and partial least squares regression with outlier detection


Original Research Paper


Chemometrics and Intelligent Laboratory Systems, 14 (1992) 297-303
Elsevier Science Publishers B.V., Amsterdam

Carsten Ridder * and Lars Nørgaard
Chemistry Department A, Building 207, The Technical University of Denmark, DK-2800 Lyngby (Denmark)

(Received 30 May 1991; accepted 7 October 1991)

Abstract

Ridder, C. and Nørgaard, L., 1992. Simultaneous determination of cobalt and nickel by flow injection analysis and partial least squares regression with outlier detection. Chemometrics and Intelligent Laboratory Systems, 14: 297-303.

A flow injection analysis system with photodiode array detection is used for the simultaneous determination of Co(II) and Ni(II) in the concentration range 0-1 ppm. A prediction error of 3-4% is achieved using partial least squares or principal component regression, which give comparable results. The method presented exploits the fact that the absorbance spectra of the complexes between the metal ions and 4-(2-pyridylazo)-resorcinol are slightly different, and that multivariate calibration techniques are able to extract such differences in the form of useful analytical information. A mathematical outlier method which detects samples containing interferences is developed and demonstrated using Zn(II) and Cu(II) as examples.

0169-7439/92/$05.00 © 1992 - Elsevier Science Publishers B.V. All rights reserved

INTRODUCTION

The simultaneous determination of chemical species is often based on differences in the reaction rate with a common reagent, and on monitoring the reactions at one wavelength. Fernandez et al. [1] and Betteridge and Fields [2] have developed flow injection analysis (FIA) systems in which the sample plug is split in two in order to achieve two different reaction times in the same analytical system. In a well-designed FIA system, the two parts of the resulting double peak contain different amounts of the ions under investigation - in this case Co(II) and Ni(II). The quantification of the ions in the above-mentioned papers is accomplished by assuming additivity of absorbances, and performing separate univariate linear calibrations for each peak. Since this simple mathematical model leaves no degrees of freedom after predicting the Co(II) and Ni(II) concentrations, there are no possibilities for mathematical outlier detection. The problem with interferent ions is, however, partly solved by the use of EDTA, which masks some ions while others are left unaffected.

Dado and Rosenthal [3] describe the use of multivariate linear regression for the simultaneous determination of high concentrations of Co(II), Ni(II) and Cu(II). In the concentration range examined the analytes are themselves coloured, which permits direct absorbance measurements without preceding chemical reaction.

In this paper we present the combination of a simple flow injection system [4] with photodiode array (PDA) detection (Fig. 1). Cobalt and nickel form different complexes with 4-(2-pyridylazo)-resorcinol (PAR), the absorbance spectra of which are pH-dependent [5,6]. At pH = 9.0 only one Co-PAR complex and only one Ni-PAR complex exist, with absorption maxima at 510 nm and 495 nm, respectively. There is no single wavelength, however, where the absorbance is due to only one of the complexes, and the discrimination of Co and Ni is exclusively based on the small, but significant, spectral differences of the complexes (Fig. 2). The loss of chemical selectivity is compensated by measuring the absorbances at 51 different wavelengths and 61 different times, and using multivariate calibration [7].

The predictability of the simple two-line FIA system is investigated. Three calibrations were performed: two on the same day and a third one month later. The first two calibration runs were performed to compare the prediction errors when using samples spanning the whole calibration space and samples located near the centre of the calibration space. In the third calibration run a model for outlier detection is studied and tested on two interfering ions, namely Zn(II) and Cu(II).

Fig. 1. FIA manifold used for the analysis of Co and Ni. C = carrier stream (distilled water); R = reagent (0.1 mM PAR in 0.1 M phosphate buffer at pH = 9); D = photodiode array detector. The sample volume is 70 µl.

EXPERIMENTAL

Apparatus

A Hewlett-Packard HP 8452A photodiode array spectrophotometer equipped with an 8 µl flow cell was used. To propel the two streams, ABU 80 autoburets from Radiometer A/S were employed. The automatic injection valve and the valve control program were made in this laboratory.

Programs

The calibrations were performed by Unscrambler Version 2.2 (CAMO A/S, Trondheim, Norway). The PDA spectrophotometer was connected to an IBM-compatible PC through HP 89531A UV/VIS Operating software (Hewlett-Packard). Programs for data manipulation and outlier detection were written in Turbo Pascal Version 5.5 (Borland International).

Reagents


Stock solutions (2000 ppm) of Co(II), Ni(II), Zn(II) and Cu(II) chlorides were prepared in 0.005 M H2SO4. The reagent stream was a 1.0 mM PAR (Merck) solution in a 0.1 M phosphate buffer (pH = 9.0). The carrier stream was distilled water.

Standard solutions A


Fig. 2. Absorption spectra at time 30 s from two different injections of standards of 0.5 ppm Co and 0.5 ppm Ni.

Both calibration and prediction samples were made according to a full 3² design, with the concentration levels 0, 0.50 and 1.00 ppm for both Co(II) and Ni(II), i.e. eighteen samples in total.
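As an illustration of the design for standard solutions A, the following minimal Python sketch enumerates the nine Co/Ni combinations of a full two-factor, three-level design and counts the calibration and prediction samples together. The function name and the ordering of the pairs (Co varying fastest) are assumptions made for this sketch only.

```python
# Concentration levels in ppm, as stated for standard solutions A.
LEVELS_PPM = (0.0, 0.50, 1.00)


def full_factorial_two_metals(levels):
    """Return all (Co, Ni) level pairs of a full two-factor design.

    Co varies fastest; this ordering is an assumption of the sketch.
    """
    return [(co, ni) for ni in levels for co in levels]


# One design for the calibration set and an identical one for the prediction set.
calibration_set = full_factorial_two_metals(LEVELS_PPM)
prediction_set = full_factorial_two_metals(LEVELS_PPM)
total_samples = len(calibration_set) + len(prediction_set)

for number, (co, ni) in enumerate(calibration_set, start=1):
    print(f"standard {number}: Co = {co:.2f} ppm, Ni = {ni:.2f} ppm")
print("samples in total:", total_samples)
```

Three levels for each of two metals give 3 x 3 = 9 standards per set, hence the eighteen samples.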


Standard solutions B

The calibration samples were made as in series A, but for the prediction samples the concentration levels were chosen as 0.25, 0.50 and 0.75 ppm for both Co(II) and Ni(II).

Procedure


The scanning period was initiated at the moment of sample injection, and both the PDA spectrophotometer and the pneumatic injection valve were controlled by a computer. The scanning was continued for 60 s with one scan per second over the wavelength range 490-590 nm in 2-nm steps. This resolution was not chosen arbitrarily, but was an intrinsic result of the resolution of the instrument. Below 490 nm the absorption of PAR itself increases and the measurements become noisy.

RESULTS AND DISCUSSION

Variable selection / data pretreatment

The output from the PDA spectrophotometer after one injection is a matrix - denoted X - consisting of the absorbances at 51 wavelengths (rows) and 61 times (columns), i.e. a total of 3111 absorbance values. In recognition of the redundancy in the data, a smaller number of model variables was selected:

(a) a subset of 51 variables equivalent to the raw absorbance spectrum measured at the time corresponding to the peak maximum at t = 30 s (column No. 31 in the data matrix);

(b) the ‘eigenspectrum’ extracted by principal component analysis [8,9] of the non-centered 3111-element wavelength-time data matrix, using the first principal component (score values) as variables.

Because of the close resemblance of the spectra of Co-PAR and Ni-PAR, the PCA results in only one significant factor describing at least 99.9% of the original variance in the data. It should be noted that the 51 variables selected in this way contain information from all the original 3111 variables.
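The two variable sets can be illustrated with a short Python sketch. It is not the authors' Turbo Pascal program: the data are synthetic (a hypothetical Gaussian absorption band multiplied by a hypothetical Gaussian FIA peak), and the first principal component of the non-centered matrix is found by plain power iteration, chosen here only to keep the example free of external libraries.

```python
import math

# Measurement grid from the Procedure: 490-590 nm in 2-nm steps, one scan per second for 60 s.
wavelengths = list(range(490, 591, 2))   # 51 wavelengths (rows)
times = list(range(0, 61))               # 61 time points (columns)

# Synthetic, purely illustrative data: band shape and peak profile are assumptions.
spectrum = [math.exp(-((w - 510) / 40.0) ** 2) for w in wavelengths]
profile = [math.exp(-((s - 30) / 8.0) ** 2) for s in times]
X = [[a * p for p in profile] for a in spectrum]


def first_pc_scores(X, n_iter=50):
    """Scores of the first principal component of the non-centered matrix X.

    Power iteration on X^T X gives the dominant time loading v; the scores
    t = X v hold one value per wavelength (the 'eigenspectrum').
    """
    n_rows, n_cols = len(X), len(X[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    for _ in range(n_iter):
        t = [sum(X[i][k] * v[k] for k in range(n_cols)) for i in range(n_rows)]
        w = [sum(X[i][k] * t[i] for i in range(n_rows)) for k in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    t = [sum(X[i][k] * v[k] for k in range(n_cols)) for i in range(n_rows)]
    total = sum(x * x for row in X for x in row)
    explained = sum(x * x for x in t) / total   # fraction of the non-centered variance
    return t, explained


# Variable set (a): spectrum at the peak maximum, column No. 31 (index 30).
spectrum_at_peak = [row[30] for row in X]

# Variable set (b): eigenspectrum from the first principal component.
eigenspectrum, explained = first_pc_scores(X)

print(len(spectrum_at_peak), len(eigenspectrum), round(explained, 4))
```

Because the synthetic matrix has rank one, the first component explains essentially all of the variance; with measured data the paper reports at least 99.9%.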


Fig. 3. Loading vectors for the first three factors.

Calibration run 1

The standard solution series A was used in this run, and randomized injection of the calibration set was followed by randomized injection of the prediction set. For a thorough treatment of partial least squares regression (PLSR) and principal component regression (PCR) calibration see e.g. refs. 7 and 10-12. The following comments concerning the multivariate modeling apply when using either the first PC or the spectrum at t = 30 s as variables.

The number of factors chosen to enter the PLSR model is based on the investigation of the loading plot (Fig. 3) and the test set standard error of prediction (SEP) (Fig. 4). The loading of the third factor is very noisy as compared to the loadings of the first and second factor; the gain in prediction error is not significant in going from two to three factors. Furthermore, the residuals after two and three factors are comparable (not shown). This means that the model should be based on two factors, indicating the number of chemical components in the samples; this was further confirmed by cross-validation.

The 3² design is reflected in the scores plot (Fig. 5) and it is obvious that both factors are necessary to describe a single analyte. The slight systematic displacement of the prediction samples compared to the calibration samples reveals the sequence of sample injection.

The results of the calibration run are compiled in Table 1. The results and conclusions are exactly the same when using PCR modeling. The reason for this similarity between PLSR and PCR in this case is that all variance in the X matrix is relevant for the description of the chemical variation.

Fig. 4. $SEP = \sqrt{\sum_{i=1}^{N} (C_i^{predicted} - C_i^{true})^2 / N}$ as a function of the number of factors; N is the number of test set objects and C is concentration.

Fig. 5. Plot of score vector 1 versus score vector 2. Numbers 1-9 are calibration objects and 10-18 are prediction objects. The metal concentrations can be found in Fig. 6, where the numbering is from lower left (1 and 10) to the upper right (9 and 18); standard 3 (and 12) being 1.0 ppm Co/0.0 ppm Ni and standard 4 (and 13) being 0.0 ppm Co/0.5 ppm Ni.

TABLE 1
RSEP and bias for the two-factor PLSR models based on calibration run 1. RSEP is calculated as $100 \cdot SEP / C_{mean}$, where $C_{mean} = \sum_{i=1}^{N} C_i / N$. Bias is calculated as $\sum_{i=1}^{N} (C_i^{predicted} - C_i^{true})$ and reflects the sequence of injections (see text)

                   RSEP (%)          Bias
                   Co      Ni        Co       Ni
First PC           7.5     5.4       0.32     -0.21
Single spectrum    6.3     4.8       0.26     -0.18

Calibration run 2

The prediction samples used in calibration run 1 spanned the whole calibration space. This is not normal in calibration situations, as the predictive capability is best at the centre of the calibration space. To test the predictive ability of the FIA system with samples located around this centre, the series B standard solutions were applied. The results of PLSR modeling with the first PC as variable are shown in Table 2. As can be seen, a slight - and expected - improvement of RSEP compared to calibration run 1 is encountered. It is not possible, however, to test the statistical significance of this improvement, as these results represent only one realization in each sample space. The concentration range is lower and the prediction error obtained in this calibration run is in general better than previously published results. Thus, Betteridge and Fields [2] report RSEP_Co = 6.8% and RSEP_Ni = 9.9%, whereas Fernandez et al. [1] report RSEP_Co = 5.1% and RSEP_Ni = 2.7%; these last results are based on triplicate injections.

TABLE 2
True and predicted concentrations of Co and Ni samples from calibration run 2

True (ppm)        Predicted (ppm)     Relative error (%)
Co      Ni        Co      Ni          Co       Ni
0.250   0.250     0.224   0.255       -10.4    2.0
0.500   0.250     0.492   0.250       -1.6     1.2
0.750   0.250     0.729   0.250       -2.0     1.2
0.250   0.500     0.238   0.489       -4.8     -2.2
0.500   0.500     0.467   0.507       -6.6     1.4
0.750   0.500     0.727   0.472       -3.1     -5.6
0.250   0.750     0.232   0.750       -7.2     0.0
0.500   0.750     0.525   0.702       5.0      -2.4
0.750   0.750     0.727   0.726       -3.1     3.2

                              Co       Ni
Average relative error (%)    5.0      2.1
RSEP (%)                      4.4      2.9
Bias (ppm)                    -0.14    -0.06
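The figures of merit used in Tables 1 and 2 can be checked with a few lines of Python. The sketch below is an illustration only: it reads SEP as the root mean squared prediction error, RSEP as 100 times SEP divided by the mean true concentration, and bias as the plain (undivided) sum of the prediction errors, which is an interpretation of the printed formulas. The true and predicted Co values are those of Table 2; the function names are chosen for this example.

```python
import math


def sep(true, predicted):
    """Standard error of prediction: root mean squared prediction error."""
    n = len(true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(true, predicted)) / n)


def rsep(true, predicted):
    """Relative SEP in percent: 100 * SEP / mean true concentration."""
    c_mean = sum(true) / len(true)
    return 100.0 * sep(true, predicted) / c_mean


def bias(true, predicted):
    """Sum of (predicted - true); read here as an undivided sum (assumption)."""
    return sum(p - t for t, p in zip(true, predicted))


# Co values from Table 2 (calibration run 2), in ppm.
co_true = [0.250, 0.500, 0.750, 0.250, 0.500, 0.750, 0.250, 0.500, 0.750]
co_pred = [0.224, 0.492, 0.729, 0.238, 0.467, 0.727, 0.232, 0.525, 0.727]

co_rsep = rsep(co_true, co_pred)
co_bias = bias(co_true, co_pred)
print(f"RSEP(Co) = {co_rsep:.1f} %, bias(Co) = {co_bias:.2f} ppm")
```

With this reading the Co column reproduces the RSEP of 4.4% and the bias of -0.14 ppm listed at the foot of Table 2.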

Outlier detection

Model

The outlier detection is based on the assumption that the residuals $x_{ij}$ in the calibration X matrix for the ith object and the jth variable are distributed independently according to $x_{ij} \in N(0, \sigma_j^2)$, j = 1-51. It is also assumed that there are no outliers in the objects from which the model residuals originate. By use of these calibration residuals a confidence interval is calculated for each variable. The formula for the interval is

$[\mu + s_j \, t(n)_{\alpha/2}; \ \mu + s_j \, t(n)_{1-\alpha/2}]$

where $\mu = 0$, $\alpha$ is the significance level, n is the number of calibration objects used in generating the interval and $s_j^2 = (1/n) \sum_{i=1}^{n} x_{ij}^2$. For a new sample the number of variables with a residual outside the confidence interval is recorded. If this number is greater than e.g. 5% of the number of variables, the sample is tagged as an outlier. The demand for a theoretically correct number of degrees of freedom in the above formulas, and exact knowledge of the actual distribution of the residuals, are, however, not essential, due to this subjective way of establishing a rejection criterion. The suggested method thus provides a practical working frame for the detection of outliers.

Calibration run 3 with outlier detection

Test of model

A calibration run using standard solutions A was used to build the PLSR model with the first PC as variables. By investigating the same kind of plots as aforementioned, it was concluded that a PLSR model with three factors should be used. The explanation for this could be a non-zero blank signal due to a higher temperature in the laboratory (30°C compared to 22°C in the previous experiments) and the use of dated solutions of PAR and carrier (one month old compared to freshly prepared in the previous experiments).

The model is tested with two interfering ions in the samples, namely Zn(II) and Cu(II). Standards with two concentration levels of these ions were prepared as shown in Fig. 6. In Fig. 7 the resemblance of the absorption spectra of the Zn-PAR, Cu-PAR, Co-PAR and Ni-PAR complexes is illustrated. The 95% confidence interval for each variable is shown as a confidence band in Fig. 8; the residuals of standards 14 and 22 are also plotted to show the difference between a ‘good’ sample and an outlier sample. In Fig. 9 the number of variables outside this band is shown for the test set objects (10-18) and the outlier objects (19-30). It is seen that the samples with interfering ions are clearly tagged as outliers and no prediction of Co and Ni in these samples should be performed. Alternatively, one could calibrate for the interfering ions by including them in the calibration set.

Fig. 6. Scheme for the preparation of outlier samples (all concentrations in ppm). The bold-faced numbers in the cells are the concentration of interferent ion added to the series A Co/Ni standards. Object numbers are written in italics: objects 19-24 correspond to Zn(II) samples and 25-30 correspond to Cu(II) samples. E.g. sample 23 contains 0.5 ppm Ni(II), 1.0 ppm Co(II) and 0.5 ppm Zn(II).

Fig. 7. Absorption spectra at time 30 s from four different injections of standards of 0.5 ppm Zn, 0.5 ppm Cu, 0.5 ppm Co and 0.5 ppm Ni.

Fig. 8. 95% confidence band and residuals of a good sample (#14) and an outlier (#22).

CONCLUSIONS

The concentration range is lower and the prediction error obtained is generally better than previously published results. Due to the multivariate approach used in this work, it was furthermore possible to develop an outlier detection method, which effectively indicates samples containing interferences. The FIA system described has the advantage of being very simple. This simplicity is compensated by the use of a multiple detection system in combination with a computer. The combination of FIA and multivariate calibration has promising possibilities especially in analysing components with severely overlapping absorbance spectra.
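The outlier rule described in the Model subsection can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' Turbo Pascal program: the residuals are synthetic, the function names are invented for the example, and the critical t value is passed in by the caller (2.262 is the two-sided 95% value for 9 degrees of freedom, used here because nine calibration objects are assumed). Since the band is centred on zero and the t distribution is symmetric, the interval is handled as a half-width per variable.

```python
import math


def confidence_band(calibration_residuals, t_crit):
    """Half-width s_j * t_crit per variable, with s_j^2 = (1/n) * sum_i x_ij^2."""
    n = len(calibration_residuals)
    n_vars = len(calibration_residuals[0])
    band = []
    for j in range(n_vars):
        s_j = math.sqrt(sum(row[j] ** 2 for row in calibration_residuals) / n)
        band.append(s_j * t_crit)
    return band


def count_outside(residuals, band):
    """Number of variables whose residual falls outside the band around zero."""
    return sum(1 for r, h in zip(residuals, band) if abs(r) > h)


def is_outlier(residuals, band, max_fraction=0.05):
    """Tag the sample when more than max_fraction of the variables are outside."""
    return count_outside(residuals, band) > max_fraction * len(residuals)


# Synthetic calibration residuals: 9 objects x 51 variables, magnitude 0.001 (assumption).
calibration_residuals = [[0.001 * (-1) ** (i + j) for j in range(51)] for i in range(9)]
band = confidence_band(calibration_residuals, t_crit=2.262)

# A 'good' sample with small residuals, and a sample with ten strongly deviating variables.
good_sample = [0.0005 * (-1) ** j for j in range(51)]
bad_sample = [0.01 if 20 <= j < 30 else 0.0005 for j in range(51)]

print(count_outside(good_sample, band), is_outlier(good_sample, band))
print(count_outside(bad_sample, band), is_outlier(bad_sample, band))
```

With 51 variables the 5% criterion corresponds to more than 2.55 variables, i.e. a sample is tagged once three or more residuals leave the band.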

ACKNOWLEDGEMENT

The Danish Council for Industrial and Scientific Research is acknowledged for financial support.

Fig. 9. Number of variables outside the 95% confidence band; test objects (10-18) and outlier objects (19-30).

REFERENCES

1 A. Fernandez, M.D. Luque de Castro and M. Valcárcel, Comparison of flow injection analysis configuration for differential kinetic determination of cobalt and nickel, Analytical Chemistry, 56 (1984) 1146-1151.
2 D. Betteridge and B. Fields, Two point kinetic simultaneous determination of cobalt(II) and nickel(II) in aqueous solution using flow injection analysis (FIA), Fresenius' Zeitschrift für Analytische Chemie, 314 (1983) 386-390.
3 G. Dado and J. Rosenthal, Simultaneous determination of cobalt, copper, and nickel by multivariate linear regression, Journal of Chemical Education, 67 (1990) 797-800.
4 J. Ruzicka and E.H. Hansen, Flow Injection Analysis, Wiley, New York, 2nd edn., 1988.
5 D. Nonova and B. Evtimova, Complex formation of nickel(II) and cobalt(II) with 4-(2-pyridylazo)-resorcinol, Analytica Chimica Acta, 62 (1972) 456-461.
6 M. Tanaka, S. Funahashi and K. Shirai, Chemical analysis of mixtures of some heavy metals by means of differential reaction rates of ligand substitution reactions, Analytica Chimica Acta, 39 (1967) 437-445.
7 H. Martens and T. Næs, Multivariate Calibration, Wiley, New York, 1989.
8 T.W. Anderson, An Introduction to Multivariate Statistical Analysis, Wiley, New York, 2nd edn., 1984, Ch. 11.
9 S. Wold, K. Esbensen and P. Geladi, Principal component analysis, Chemometrics and Intelligent Laboratory Systems, 2 (1987) 37-52.
10 P. Geladi and B.R. Kowalski, Partial least squares regression: A tutorial, Analytica Chimica Acta, 185 (1986) 1-17.
11 P. Geladi and B.R. Kowalski, An example of 2-block predictive partial least squares regression with simulated data, Analytica Chimica Acta, 185 (1986) 19-32.
12 A. Höskuldsson, PLS regression methods, Journal of Chemometrics, 2 (1988) 211-228.