Talanta 46 (1998) 129 – 138
Modelling the background current with partial least squares regression and transference of the calibration models in the simultaneous determination of Tl and Pb by stripping voltammetry Ana Herrero *, M. Cruz Ortiz Dpto. Quı´mica, Fac. de CyTA y C. Quı´micas, Uni6ersidad de Burgos, Pza. Misael Ban˜uelos s/n, 09001 Burgos, Spain Received 14 May 1997; received in revised form 24 July 1997; accepted 5 August 1997
Abstract With the aim of carrying out a calibration transfer for routine analysis, partial least squares (PLS) regression was successfully applied to simultaneously determine thallium and lead by stripping voltammetry when an interfering background current is present. The presence of a significant blank signal that overlaps the thallium peak, together with the overlapping thallium and lead signals were both suitably modelled by this multivariate regression technique. Moreover, once the PLS models are built, the piecewise direct standardization (PDS) method can be used to transfer these models over time in such a way that the number of calibration samples that will be needed in future determinations is reduced from 25 to 9, without a loss of quality in the analyses. The mean of the relative errors (in absolute values) obtained for thallium and lead is below 4.94% and 3.19%, respectively. © 1998 Elsevier Science B.V. All rights reserved. Keywords: Calibration transfer; Partial least squares regression; Background current; Overlapping signals; Multianalyte determination; Stripping voltammetry
1. Introduction The simultaneous determination of thallium and lead is commonly performed by stripping voltammetry, where the current recorded during the stripping step is given by the sum of both the current corresponding to the metal of interest and the background current, unrelated to the reaction of interest. The latter can also be divided into the * Corresponding author.
charging current and the background component related to redox reactions of impurities, supporting electrolytes, solvents or reactions of the electrode itself. The accuracy of the stripping voltammetric determinations depends on the discrimination that can be achieved against the different contributions to the total current, i.e. on the uncertainty in measuring the current of the reaction of interest. Effective approaches for compensating the charging current have been developed, perhaps the
0039-9140/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII S 0 0 3 9 - 9 1 4 0 ( 9 7 ) 0 0 2 6 9 - 5
130
A. Herrero, M.C. Ortiz / Talanta 46 (1998) 129–138
most widely used being the differential pulse stripping voltammetry [1] which in addition, gives a gain in the overall signal-to-background ratio. However, it should be noted that the differential pulse mode discriminates only against the nonfaradaic charging current whereas it does not correct the various faradaic components of the background current. To also discriminate this latter contribution some subtractive procedures have been developed with the aim of measuring the background current and subtracting it from the recorded current. Among these, the subtractive stripping voltammetry [1] presents the disadvantage of needing two identical electrodes (with the same surface area, morphology, etc.), which implies that, since two electrode surfaces are not perfectly matched, the background current is not always completely compensated. The present-day electrochemical instrumentation and Chemometrics [2] provide wide information about a system under study and useful mathematical tools respectively, that allows one to approach the background current problem from a different point of view. The use of multivariate regression techniques [3], such as the partial least squares (PLS) regression [4,5], implies the extraction of the most relevant information related to the analyte of interest and possibly a way to discriminate the background current, from the whole voltammetric signal (not only from the peak current). In this paper, PLS regression has been used to simultaneously determine thallium and lead by differential pulse anodic stripping voltammetry (DPASV) when the experimental blank also gives a voltammetric peak that seriously interferes with the thallium peak. The determination of these two metals in most electrolytes is difficult because of the similar potentials at which both are oxidized, which results in the overlapping of their voltammetric peaks and, hence, in an analytical error. In fact, several chemical and mathematical approaches have been proposed to solve these two overlapping peaks that include the use of extraction techniques [6], suitable electrolytes [7,8] or surfactants [9], medium-exchange [10,11], and others such as: derived voltammetry [12], Kalman filter [13,14], PLS [15] or continuum regressions [16], etc.
Once the PLS models had been built, a calibration transfer procedure was carried out to reduce the experimental effort that would be needed in similar analyses in the future carried out in a routine way. Among the several standardization procedures [17–20] that have been developed, the piecewise direct standardization (PDS) method [17,21] has provided very successful results when it has been used with NIR spectroscopic data, but it can also be applied to polarographic [22] and voltammetric data [23]. This method is based on calculating a matrix, the transformation matrix F, which relates a subset of signals measured for instance on a specific instrument or a set day to other signals measured on another instrument or a different day. Then, the PLS model built with the whole primary calibration set can be applied to the signals of the new situation corrected by means of the transformation matrix, in such a way that only a small subset of calibration samples has been measured in the new situation. In this way, the number of samples needed for simultaneously determining thallium and lead in future analyses by means of the PLS regression has been reduced from 25 to 9, without a loss of quality in the determinations.
2. Experimental
2.1. Apparatus The voltammetric measurements were carried out using a Metrohm 646 VA processor with a 647 VA stand in conjunction with a Metrohm multimode electrode (MME) used in the hanging mercury drop electrode (HMDE) mode. The three-electrode system was completed by means of a platinum auxiliary electrode and an Ag AgCl KCl (3 mol dm − 3) reference electrode. A Crison micro pH-2002 pH-meter was used for pH measurements. The analysis of data was done with PARVUS [24] and MATLAB [25].
2.2. Reagents Analytical-reagent grade (Merck) chemicals were used without further purification. All the
A. Herrero, M.C. Ortiz / Talanta 46 (1998) 129–138
131
solutions were prepared with deionised water obtained in a Barnstead NANO Pure II system. The voltammetric measurements were carried out in an oxalic acid (0.1 mol dm − 3) and hydrochloric acid (0.1 mol dm − 3) supporting electrolyte medium (pH=1.037). Nitrogen (99.997%) was used to remove dissolved oxygen.
2.3. Procedure The solution was placed in the voltammetric cell and purged with nitrogen for 10 min. Once the solution had been deoxygenated, a deposition potential of −0.600 V was applied to the Table 1 Experimental design used in calibrates A and B Sample
1 t1 2 3 4 5 6 7 8 9 10 t3 11 12 13 14 15 16 17 t2 18 19 20 t4 21 22 23 24 25
Experimental design level
Concentration (M×107)
Thallium
Lead
Thallium
Lead
1 1 1 1 1 1 2 2 2 2 2 2.5 3 3 3 3 3 4 4 4 4 4 4 4.5 5 5 5 5 5
1 1.5 2 3 4 5 5 1 2 3 4 4 1 2 3 4 5 1 2 2.5 3 4 5 5 1 2 3 4 5
3.2659 3.2594 3.2530 3.2402 3.2275 3.2148 6.4047 6.5059 6.4803 6.4549 6.4297 8.0214 9.7205 9.6824 9.6445 9.6070 9.5698 12.9098 12.8594 12.8343 12.8093 12.7597 12.7104 14.2717 16.0742 16.0117 15.9496 15.8880 15.8269
2.2909 3.4295 4.5636 6.8185 9.0557 11.2754 11.2315 2.2818 4.5457 6.7918 9.0203 9.0027 2.2728 4.5278 6.7652 8.9852 11.1880 2.2639 4.5102 5.6267 6.7389 8.9504 11.1448 11.1233 2.2551 4.4926 6.7128 8.9158 11.1019
t1 to t4 are the test set samples.
Fig. 1. Stripping voltammograms recorded for calibrates A (a) and B (b). The peaks correspond to thallium (left) and lead (right).
mercury electrode during 30 s. Next, the stirrer was switched off and, after 30 s of rest period, an anodic potential scan was initiated from − 0.558 V to − 0.282 V in the differential pulse mode. Other instrumental parameters were: modulation amplitude, 50 mV; pulse repetition time, 0.6 s; nominal area, 0.40 mm2; scan rate, 10 mV s − 1; stirring rate, 1290 rev. min − 1. The solution was stirred and deoxygenated for 15 s after each addition.
3. Results and discussion The major overlap of thallium and lead peaks is the most significant obstacle found in the
A. Herrero, M.C. Ortiz / Talanta 46 (1998) 129–138
132
Fig. 2. Stripping voltammograms corresponding to the experimental blank (dotted line) and to sample 33 of calibrate A with (dash-dot line) and without (solid line) subtracting the blank signal.
simultaneous determination of both metals by stripping voltammetry. Previous analyses demonstrate that this interference is so important that if an insufficient experimental design is used (eight training samples distributed in a central composite design), then the information provided to the PLS model is too limited and, consequently, the regression model built becomes useless (explained variance below 96%). Therefore, the experimental design showed in Table 1 has been used, which exhaustively covers the considered concentration ranges; the training set is formed of 25 calibration samples whereas the test set is composed of four samples.
A whole calibration set was recorded by applying the experimental procedure to the samples of the experimental design, the corresponding voltammograms being shown in Fig. 1a, that henceforth will be denoted as calibrate A. As expected, the presence of two overlapping voltammetric peaks, that can be analysed by means of multivariate techniques [15,16], is observed; the first (left) assigned to thallium and the second (right) to lead. However, the supporting electrolyte used in this paper also gives a voltammetric peak which, as can be seen in Fig. 2, overlaps the thallium peak. This signal, considered background current, leads to a significant increase in the recorded signal (solid line) with regard to the net signal (dash–dot line) obtained when the blank signal is subtracted. Likewise, a shifting of the thallium peak (the left peak) is observed since the blank signal and the thallium peak potential do not coincide exactly, which is more evident by comparing the voltammograms in Fig. 2. In Fig. 1a the thallium peak is being shifted to more positive potentials when the concentration of thallium is increased, i.e. when the thallium contribution to the peak is greater. Whereas in Fig. 2, when the blank signal has been subtracted, the peak potential is about − 0.435 V in all the voltammograms recorded for the different levels of thallium considered. Usually, if a blank gives a voltammetric signal in the potential range where the peaks of the metals to be determined appear, this signal is
Table 2 Explained and cross-validated variance of the PLS models built for calibrate A L. V.
Thallium
Lead
Y block
1 2 3 4 5 6
X block
Y block
X block
Explained variance
Cross-validated vari- Explained variance ance
Explained variance
Cross-validated vari- Explained variance ance
47.15 94.31 99.68 99.71 99.72 99.72
48.93 94.62 99.65 99.67 99.64 99.53
54.43 99.28 99.36 99.58 99.58 99.59
55.71 99.32 99.40 99.57 99.52 99.49
81.00 98.10 99.33 99.52 99.72 99.86
L.V. is the number of the considered latent variables.
81.00 98.10 99.27 99.57 99.75 99.86
A. Herrero, M.C. Ortiz / Talanta 46 (1998) 129–138
133
recorded at 47 potential values as predictor variables for each sample, the concentration of metal being the response variable in each case. The data have not been autoscaled. The full cross-validation method [26,27] has been used with the aim of evaluating the stability and prediction ability of the PLS models built. The high explained and cross-validated variance values, showed in Table 2, indicate that the interference due to the blank signal has been suitably modelled by the PLS regression, despite the significance of the phenomenon, principally on the thallium calibration. The models built for both metals consist of four latent variables, whose
Fig. 3. Loadings for the 1st (a), 2nd (b) and 3rd (d) latent variables corresponding to the PLS model built for thallium (calibrate A).
subtracted from the calibration and test set voltammograms alike to avoid undesirable analytical errors. But, it is not always possible to have the experimental blank signal and so, a method able to model this interference would be of great interest. So, to evaluate the ability of the PLS regression to model this kind of interference, which in this case is added to the above indicated strong overlap of thallium and lead peaks, the original signal has been used in the analysis, i.e. the voltammograms showed in Fig. 1 where the blank signal has not be subtracted.
3.1. Multi6ariate regression models A PLS model has been built independently for both thallium and lead, taking the current
Fig. 4. Scores for the 1st – 2nd (a) and 1st – 3rd (b) latent variables corresponding to the PLS model built for thallium (calibrate A). The first digit indicates the concentration level of thallium, the second is the level of lead.
134
A. Herrero, M.C. Ortiz / Talanta 46 (1998) 129–138
Fig. 5. Standardization subsets (samples signalled with a circle) used in the calibration transfer procedures. On ordinates the level of thallium and on abscissae that of lead.
chemical interpretation has been analysed through the corresponding loadings and scores, Fig. 3 and Fig. 4, respectively. The shape of the loadings calculated for the first latent variable, showed in Fig. 3a for thallium, corresponds to a ‘mean voltammogram’, which indicates that it is a size factor, as was expected. However, the loadings corresponding to the second latent variable, Fig. 3b, establish a difference between the two ‘evident’ overlapping peaks of the signal, extracting that information related to the metal of interest in each case and adequately correcting the interference due to the other overlapping peak. Thus, those potentials related to the metal of interest have positive val-
ues, whereas the contribution of the rest to the signal is subtracted through the negative sign of the corresponding loadings. In fact, the distribution of the scores in the plane formed by these first two latent variables reproduces the experimental design used (Fig. 4a) where the samples are arranged along the abscissae axis in increasing concentration order (from sample 11 to 55). The models built with both latent variables have high explained and cross-validated variance (Table 2) but the need for more variables in the PLS models is evident, especially for thallium since the interference of the blank signal is critical in this case. On the other hand, the loadings obtained for the third latent variable in the model built for
A. Herrero, M.C. Ortiz / Talanta 46 (1998) 129–138
135
Table 3 Effect of modelling the blank interference on the concentration calculated with the PLS models for the test set samples of calibrate A. True values are found in Table 1 Sample
t1 t2 t3 t4
Concentration (M×107) calculated with the two latent variables PLS models (without modelling interferences)
Concentration (M×107) calculated with the four latent variables in PLS model (modelling interferences)
Thallium
Lead
Thallium
Lead
5.3312 13.1027 8.5162 12.9128
3.3714 5.8917 10.0103 11.6537
3.2613 13.3710 8.0696 14.0989
3.5082 5.7825 9.4880 11.0425
thallium (Fig. 3c) are related principally to the left side of the voltammogram, establishing the difference between those potentials related to the thal-
lium peak and those corresponding to the blank signal, which is subtracted. Moreover, the linear relation that exists between the scores of the first (size factor) and third latent variables (Fig. 5b) indicates that the phenomenon modelled by the third latent variable is intimately related to the predictor variables themselves, i.e. to something referred to the global signal and not to the concentration of the metals. This was confirmed since another two PLS models built by subtracting the blank signal to the training set voltammograms had similar explained and cross-validated variance values, but no latent variable presented this characteristic structure (the scores in Fig. 4b seem to ‘put in order’ the samples in the plane of the 1st and 3rd latent variables taking into account the sum of the concentrations or levels of both analytes). With regard to the fourth latent variable, which is included in the models of both metals, it seems to be related rather to specific samples or objects than to more general phenomena. Table 4 Relative errors (%) for the test set samples of calibrate B calculated without and with calibration transfer Sample
t1 t2 t3 t4 Fig. 6. SEP contour lines of the response surfaces corresponding to thallium (a) and lead (b).
With the full calibration matrix B
Using calibration transfer
Thallium
Lead
Thallium
Lead
7.01 −2.39 −2.20 −2.85
7.17 −2.79 −4.47 −3.40
5.07 −3.13 −4.72 −6.77
0.71 −2.95 −5.24 −3.38
The parameters used in the calibration transfer were: no=7 and w=47 for thallium and no = 7 and w =21 for lead.
136
A. Herrero, M.C. Ortiz / Talanta 46 (1998) 129–138
The third and four latent variables are then related to the interference and some irregularities so, if one does not consider these two latent variables, it is possible to evaluate the error due to the interference. In the training set, considering only two latent variables, the mean relative error (in absolute value) is 3.11% versus 2.24% when the concentration of lead is calculated with four latent variables. In the case of thallium the effect is much more important, 14.48% versus 1.51% with two and four latent variables, respectively. This high discrepancy is due to the fact that the interference shows an overlapping signal in the range of potentials where the thallium peak appears. Detailed errors for test samples are shown in Table 3. Mean errors obtained are 5.59% versus 2.80% for lead and 20.34% versus 1.51% for thallium. The loadings and scores shown in Fig. 3 and Fig. 4 for thallium were similar to those obtained in the PLS model built for lead, so a parallel analysis was made.
3.2. Calibration transfer As the PLS regression technique has proved to be useful in the determination carried out, a calibration transfer procedure, the PDS method, has been applied for minimizing the experimental effort that would be necessary in later routine analyses. For this purpose, a new calibration set, henceforth named calibrate B, was recorded some days later following the above experimental design. This new training set (Fig. 1b) is very similar to the former, corresponding to the calibrate A showed in Fig. 1a, but they are not identical. The PDS method [17,21] consists of relating the voltammograms measured in two different situations, in this case on different days, through a transformation matrix F which is calculated from a small subset (standardization subset) of voltammograms recorded in both situations (X( A and X( B for calibrates A and B, respectively). X( A =X( BF The standardization subset should be large enough in such a way that predictor variables sufficiently cover the whole experimental range. In
this way, each predictor variable (xi ) from the first day is related to a moving window (Zi ) of variables measured on the second day by means of a local multivariate regression, the corresponding regression vectors being arranged along the main diagonal of the transformation matrix and the rest of the elements being zero, so the size of F is p ×p, p being the number of predictor variables measured. Zi = [xB,i − j, xB,i − j + 1,…, xB,i + k − 1, xB,i + k ] xA,i = Zibi Then, F can be used to correct any voltammogram measured on the second day to the format which would have been obtained if it had been measured on the first day. In this way, the previously built and analysed PLS models can be applied, through their corresponding closed forms [28], to the corrected signals for determining the concentration of thallium and lead in the samples measured on the second day. The m-function pdsgen, implemented in the PLS Toolbox by Wise [29], has been used to calculate the transformation matrix F. The input parameters for this function are the two training sets (XA and XB), the window size and the standardization subset. With the aim of optimizing the standardization procedure, several values have been used for the last two parameters. The window size (w) has taken the following values: 3, 5, 7,…, 47; i.e. 23 different values in all, whereas the different standardization subsets (no) used are those shown in Fig. 5. The standard error of prediction (SEP), calculated from the test set samples, has been used to evaluate the quality of the calibration transfer procedures carried out, although those errors related to the training set samples have also been taken into account. SEP is given by the square root of % (yj − yˆB, j )2 s2 =
j
J
where yj and yˆB, j, are the true concentrations and those computed by the PLS model built for calibrate B respectively, and J is the number of
A. Herrero, M.C. Ortiz / Talanta 46 (1998) 129–138
objects in the test set. Fig. 6 shows, through the contour lines of the response surfaces, the values of SEP obtained for both metals as a function of the different values of w and no used in the calibration transfer (a total of 207 independent calibration transfers have been done). As can be seen, the best results are related to the standardization subsets 1 and 7; whereas the worst were obtained with the standardization subset 6, whose samples correspond to one of the diagonals of the design (Fig. 6) and for this reason, this subset provides highly correlated voltammograms that lead to a significant structural defect in this standardization subset [30]. Since no=1 implies all the training set samples in the standardization procedure, i.e. it corresponds to a full recalibration (25 training set samples), and no =7 requires only the measurement of nine standardization samples in the new situation, it is evident that the latter is the optimum standardization subset for simultaneously determining thallium and lead by DPASV through the calibration transfer procedure proposed. The use of only one standardization subset for both metals is a desirable feature of the calibration transfer because, in this way, no additional experimental effort is necessary to determine one of the metals. The influence of the different window sizes used on the SEP values is less than that of the standardization subsets. For no = 7, the minimum SEP corresponds to w =47 for thallium and w= 21 for lead. The larger window size for thallium is due to the fact that the blank signal seriously interferes with the thallium peak, which makes necessary the use of some potentials where this interference does not exist to adequately carry out the calibration transfer. So, the interference of the background current is also detected in the calibration transfer step. The relative errors corresponding to test set samples of calibrate B calculated by means of the calibration transfer and the PLS models built with the full recalibration are shown in Table 4. The errors obtained with the calibration transfer for both test and training sets are similar to those corresponding to the full recalibration, i.e. to those calculated by the PLS models built with all
137
the calibration samples of calibrate B, which indicates the success of the standardization approach. To confirm the ability of the instrumental standardization to predict the concentrations of thallium and lead, the procedure has been applied to the 16 samples of calibrate B which do not belong to transfer set no=7; these samples are independent of the standardization procedure and their concentrations are known. The mean of the relative errors (in absolute values) obtained for thallium and lead is 4.94% and 3.19% respectively. These errors are in the range expected when this stripping technique is used, but the number of standard calibration samples that will be needed in later routine determinations of both metals has been reduced to 64%.
Acknowledgements This work has been partially supported by Consejerı´a de Educacio´n y Cultura de la Junta de Castilla y Leo´n under Project BU22/96.
References [1] J. Wang, Stripping Analysis: Principles, Instrumentation and Applications, VCH Publishers, Derfield Beach, 1985. [2] S.D. Brown, R.S. Bear, Crit. Rev. Anal. Chem. 24 (1993) 99. [3] H. Martens, T. Næs, Multivariate Calibration, 2nd edn., Wiley, New York, 1992. [4] R.G. Brereton, Chemometrics, Applications of Mathematics and Statistics to Laboratory Systems, Ellis Horwood, Chichester, UK, 1990. [5] M. Meloun, J. Militky´, M. Forina, Chemometrics for Analytical Chemistry, Vol. 2, Ellis Horwood, New York, 1994. [6] Y.A. Karbainov, A.G. Stromberg, Izv. Tomsk. Politekhn. Inst. 151 (1966) 7. [7] M.A. Allus, R.G. Brereton, Analyst 117 (1992) 1075. [8] Z. Lukaszewski, W. Zembrzuski, A. Piela, Anal. Chim. Acta 318 (1996) 159. [9] Z. Lukaszewski, M.K. Pawlak, A. Cizewski, Talanta 27 (1980) 181. [10] L. Anderson, D. Jagner, M. Josefson, Anal. Chem. 54 (1982) 1371. [11] R. Neeb, I.Z. Kiehnast, Fresenius Z. Anal. Chem. 226 (1967) 153. [12] J.J. Berzas, J. Rodrı´guez, Fresenius Z. Anal. Chem. 342 (1992) 273.
138
A. Herrero, M.C. Ortiz / Talanta 46 (1998) 129–138
[13] D.P. Binkley, R.E. Dessy, Anal. Chem. 52 (1980) 1335. [14] C.A. Scolari, S.D. Brown, Anal. Chim. Acta 166 (1984) 253. [15] A. Henrion, R. Henrion, G. Henrion, F. Scholz, Electroanalysis 2 (1990) 309. [16] M.C. Ortiz, M.J. Arcos, L.A. Sarabia, Chemometr. Intell. Lab. Syst. 34 (1996) 245. [17] Y. Wang, D.J. Veltkamp, B.R. Kowalski, Anal. Chem. 63 (1991) 2750. [18] O.E. de Noord, Chemometr. Intell. Lab. Syst. 25 (1994) 85. [19] M. Forina, G. Drava, C. Armanino, R. Boggia, S. Lanteri, R. Leardi, P. Corti, P. Conti, R. Giangiacomo, C. Galliena, R. Bigoni, I. Quartari, C. Serra, D. Ferri, O. Leoni, L. Lazzeri, Chemometr. Intell. Lab. Syst. 27 (1995) 189. [20] B. Walczak, E. Bouveresse, D.L. Massart, Chemometr. Intell. Lab. Syst. 36 (1997) 41. [21] Y. Wang, M.J. Lysaght, B.R. Kowalski, Anal. Chem. 64 (1992) 562.
.
[22] A. Herrero, M.C. Ortiz, Anal. Chim. Acta, in press. [23] A. Herrero, Ana´lisis Multivariante y Transferencia de Calibrado en Voltamperometrı´a. Ph.D. thesis, University of Burgos, 1996. [24] M. Forina, R. Leardi, C. Armanino, S. Lanteri, PARVUS: An Extendable Package of Programs for Data Exploration, Classification and Correlation, Ver. 1.1, Elsevier Scientific Software, 1990. [25] Matlab, The MathWorks, Natick, Mass., 1992. [26] M. Stone, J. R. Stat. Soc. Ser. B 36 (1974) 111. [27] S. Lanteri, Chemometr. Intell. Lab. Syst. 15 (1992) 159. [28] E. Marengo, R. Todeschini, Chemometr. Intell. Lab. Syst. 12 (1991) 117. [29] B.M. Wise, PLS Toolbox for use with MATLAB, ver. 1.3, available from the author. [30] J.H. Kalivas, P.M. Lang, Mathematical Analysis of Spectral Orthogonality, in: Practical Spectroscopy series, Vol. 17, Marcel Dekker, New York, 1994.
.