Analytica Chimica Acta 420 (2000) 109–121
Simultaneous determination of phenol isomers in binary mixtures by differential pulse voltammetry using carbon fibre electrode and neural network with pruning as a multivariate calibration tool Rosangela M. de Carvalho, Cesar Mello1 , Lauro T. Kubota∗ Institute of Chemistry, Unicamp, P.O. Box 6154, 13083-970 Campinas, SP, Brazil Received 28 July 1999; received in revised form 1 May 2000; accepted 8 May 2000
Abstract Neural networks with pruning were applied to model the overlapped peaks obtained by differential pulse voltammetry (DPV) of binary mixtures of catechol and hydroquinone at a carbon fibre electrode modified with TiO2. The best electrochemical response was obtained with 0.05 mol l−1 Tris–HCl buffer at pH 6.0 and a T-800 sized carbon fibre electrode. The voltammograms were first processed with a Fourier transform filter for noise reduction and with principal component analysis (PCA) for data compression. The scores of the principal components were the inputs to the neural network, and the optimal brain surgeon (OBS) procedure was used to prune it. For hydroquinone, the results obtained with pruning were slightly better than those of PLS1 and PLS2; for catechol, the PLS and neural network models gave similar errors. Using neural networks with pruning, it was possible to determine catechol and hydroquinone by DPV with a carbon fibre electrode in the concentration range 1.0×10−4 to 6.0×10−4 mol l−1, with relative root mean square errors of prediction (%RMSEP) of 7.42 and 8.02, respectively. These results show that the proposed methodology is a good alternative for the simultaneous determination of catechol and hydroquinone in binary mixtures. © 2000 Elsevier Science B.V. All rights reserved. Keywords: Simultaneous determination of catechol and hydroquinone; Electroanalysis; Neural networks; Pruning; Optimal brain surgeon; Carbon fibre electrode; Differential pulse voltammetry
1. Introduction The determination of phenolic compounds is of interest in many fields, such as environmental control [1], neurochemistry [2–5], pharmaceutical [6] and clinical chemistry [7].
∗ Corresponding author. Tel.: +55-19-788-3127; fax: +55-19-788-3023. E-mail address:
[email protected] (L.T. Kubota). 1 Present address: University of Franca, 14404-600 Franca, Brazil.
Most of the methods described in the literature for the determination of catechol and hydroquinone [8] require a prior chemical or physical separation. This strategy is important, but such analyses generally consume considerable time and reagents and generate waste. Thus, the development of new methods that allow the simultaneous determination of these compounds without prior separation is a relevant subject. Phenolic compounds such as hydroquinone (1,4-dihydroxybenzene) and catechol (1,2-dihydroxybenzene) have basic quinone structures that might
0003-2670/00/$ – see front matter © 2000 Elsevier Science B.V. All rights reserved. PII: S 0 0 0 3 - 2 6 7 0 ( 0 0 ) 0 1 0 1 2 - 6
be electrochemically oxidised at platinum or carbon electrodes [9]. Carbon fibres offer advantages owing to their small dimensions (5–15 µm). Consequently, when a fibre is operated as a single microdisc electrode, diffusion perpendicular to the surface is only a small component of the analyte flux and radial mass transport becomes more significant, leading to high sensitivity. The response of these electrodes is very fast (less than 1 s), allowing fast scanning [10–12]. The ohmic drop associated with the use of these electrodes in resistive media remains minimal [13]. These facts are of great practical importance because they make cylindrical microelectrodes useful for direct electrochemical measurements even in resistive media and small volumes, e.g. to monitor and quantify neurotransmitters in vesicles localised in neurones [13–18] or for in vivo experiments [19,20]. A very reasonable experimental approach to the simultaneous determination of these phenolic isomers in binary mixtures, without prior separation, is differential pulse voltammetry (DPV) at a carbon fibre electrode [21]. The chemical principle of this method is the electrochemical redox reaction, at the surface of a carbon fibre electrode modified with TiO2, of the quinone structures present in catechol (1,2-dihydroxybenzene) and hydroquinone (1,4-dihydroxybenzene) [9]. These phenolic compounds can be determined without complex modification of the carbon fibre surface; only a simple routine pre-treatment is needed to increase repeatability and sensitivity. The pre-treatment increases the sensitivity by improving electron transfer [22] and probably minimises the electrode passivation caused by the oxidised phenolic compounds. It also makes the carbon fibre surface flatter, which avoids adsorption of the oxidised products.
The voltammetric peaks (obtained with carbon electrodes or carbon fibre electrodes) corresponding to the oxidation/reduction of the two phenol isomers are, in many cases, highly overlapped. Moreover, competition between the phenolic isomers for the electrode surface makes the relationship between the voltammetric response and the isomer concentrations in the mixtures non-linear. Acting together, these two aspects make it very difficult, or impossible, to model these systems by linear univariate calibration methods, which show considerable generalisation errors. An alternative to solve
these problems is the application of multivariate calibration methods, in which the full voltammogram is considered and not only the characteristic peak current at the oxidation or reduction potential of each phenolic isomer in the mixture [23]. Among the multivariate calibration methods applicable in these cases are PLS [24], neural networks [25] and, more recently, neural networks with pruning [26]. In this work, PLS and neural networks were applied to determine catechol and hydroquinone simultaneously by DPV using a carbon fibre electrode modified with TiO2.

1.1. Theory

In recent years, the number of neural network applications to the modelling of analytical chemistry data has increased and diversified considerably [27,28]. They are found in the modelling of non-linear calibration curves [29,30], pattern recognition [31], multi-component analysis [32] and on-line process control in real time [33,34]. This is possibly due to the ability of artificial neural networks to model linear and non-linear relations between dependent and independent variables equally well. Artificial neural networks are composed of basic information-processing units called neurones [35], which are arranged in layers. An artificial neural network always has an input layer and an output layer and, between them, a variable number of hidden layers. The arrangement of the layers and the number of neurones per layer are called the neural network architecture. The input can be any multivariate signal, such as the current measured at different potentials of a voltammogram or the absorbance at different wavelengths. The outputs, or neural network responses, are the predicted variables, such as concentrations, for which the neural network is trained in a typical calibration process. The response or output, $\hat{Y}$, of a neural network with $n_i$ input neurones, one hidden layer with $n_h$ neurones and one output neurone, to an input vector x, can be written as

$$\hat{Y} = \sum_{j=1}^{n_h} W_j\, f\!\left( \sum_{i=1}^{n_i} w_{ji} x_i + b_j \right) + B \qquad (1)$$
where f is a linear, sigmoid or hyperbolic tangent transfer function, $b_j$ and $B$ are the biases of the model, and $w_{ji}$ and $W_j$ are the weights of the hidden and output layers, respectively. In this study, the input vectors x are the scores of the voltammograms and the outputs are the concentrations of catechol and hydroquinone in the binary mixtures. Once the neural network output has been obtained, the calibration error (E), defined as the sum of the squared differences between the value estimated by the network (output) and the expected value, can be calculated using Eq. (2):

$$E(W) = \sum_{i=1}^{m} (Y - \hat{Y})^{2} \qquad (2)$$
where Y is the real value, $\hat{Y}$ the neural network output value and m the number of samples used to train the artificial neural network. After the output has been obtained, the next step is to correct the weights of all layers until the calibration error (E) is minimised, which can be done by the error backpropagation method [36] or by the Marquardt–Levenberg method [37,38]. In the backpropagation method, the weight correction starts in the last layer and proceeds towards the first layer, hence the name. The weight correction can be written as

$$\Delta W_{ji}^{l} = \eta\, \delta_j^{l}\, \mathrm{out}^{l-1} + \mu\, \Delta W_{ji}^{\mathrm{previous}} \qquad (3)$$
According to Eq. (3), the weight correction ($\Delta W_{ji}^{l}$) is the sum of two terms that act together to minimise the prediction error. The first term ($\eta\,\delta_j^{l}\,\mathrm{out}^{l-1}$) is essentially the steepest descent method applied to networks. The essence of this method is to make corrections in the direction opposite to that indicated by the gradient ($\delta_j^{l}$) of the error surface. The parameter η, called the training speed, is introduced to weight the corrections so as to avoid large corrections, mainly at the end of the iterative process, when most of the weights have already been corrected and a large correction might undo the corrections already made. The parameter $\mathrm{out}^{l-1}$ is the output of the previous layer. In the second term ($\mu\,\Delta W_{ji}^{\mathrm{previous}}$), the constant µ, called the momentum, shifts the 'optimum' weight obtained from the first term, forcing the error to be evaluated with this new weight.
This term reduces the possibility of convergence to a local minimum: a higher error indicates that the minimum obtained with the first term was really global; otherwise, the minimum was local and the search for the minimum goes on. In summary, the second term acts as a safety device for the first term, avoiding local minima. In the second training method, the weights are optimised through a variant of the Gauss–Newton method [37,38] known as the Marquardt–Levenberg method. The weight adjustment in this case is made through Eq. (4):

$$\Delta W = (J^{T} J + \lambda I)^{-1} J^{T} \nabla E \qquad (4)$$
where J is the Jacobian matrix of the error with respect to each of the weights, λ a non-negative scalar, I the identity matrix and E the error vector given by Eq. (2). The Marquardt–Levenberg method can be seen as intermediate between the Gauss–Newton method and the gradient descent method: when λ assumes large values (λ → ∞) the gradient descent method is obtained, and when λ assumes small values (λ → 0) the Gauss–Newton method is obtained. In this work the Marquardt–Levenberg method was used, since it converges faster and is more robust. The aim of the training procedure is to find an optimal set of weights and biases with which the network produces predictions ($\hat{Y}$) as similar as possible to the known outputs (Y), minimising the error function E(W, w). The correction of the weights ends when the error (E) reaches the previously established convergence criterion. At this point the artificial neural network is completely trained, and its generalisation properties can be evaluated with another group of samples, a validation set, containing data different from those used in the calibration. However, in some cases a trained artificial neural network shows low errors for the calibration set and high errors for the validation set, that is, overfitting due to an excessive number of neurones in the hidden layer. The number of neurones in the hidden layer plays a role similar to that of the number of principal components in principal component regression (PCR) [39] or of the polynomial order in polynomial regression [40]. Therefore, to avoid overfitting it is fundamental to optimise the number of neurones in the hidden layer. The proper number can be determined by training the neural network with different numbers of neurones in the hidden layer and then calculating the %S.E.C. and %S.E.P. The configuration that presents the lowest calibration and prediction errors can then be chosen. Another very useful method for architecture optimisation is network pruning.
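The network response of Eq. (1), the calibration error of Eq. (2) and a momentum-based weight correction in the spirit of Eq. (3) can be illustrated with a minimal pure-Python sketch. It is only an illustration: the authors trained their networks with the Marquardt–Levenberg method in a MatLab toolbox, whereas this sketch uses per-sample gradient descent with momentum, and the toy data, initial weights and the values of `eta`, `mu` and `epochs` are assumptions chosen for the example.

```python
import math

def forward(x, w, b, W, B):
    """Eq. (1): tanh hidden layer followed by one linear output neurone."""
    hidden = [math.tanh(sum(wji * xi for wji, xi in zip(w[j], x)) + b[j])
              for j in range(len(W))]
    return sum(Wj * hj for Wj, hj in zip(W, hidden)) + B, hidden

def sse(X, Y, params):
    """Eq. (2): sum of squared differences over the training samples."""
    return sum((y - forward(x, *params)[0]) ** 2 for x, y in zip(X, Y))

def train(X, Y, params, eta=0.05, mu=0.5, epochs=200):
    """Gradient descent with momentum (illustrative values of eta and mu)."""
    w, b, W, B = params
    nh, ni = len(W), len(w[0])
    # Previous corrections, one per adjustable parameter, all starting at zero.
    dw = [[0.0] * ni for _ in range(nh)]
    db, dW, dB = [0.0] * nh, [0.0] * nh, 0.0
    for _ in range(epochs):
        for x, y in zip(X, Y):
            y_hat, h = forward(x, w, b, W, B)
            err = y - y_hat                      # output delta (linear output)
            # Hidden-layer deltas, computed before any weight is changed.
            deltas = [err * W[j] * (1.0 - h[j] ** 2) for j in range(nh)]
            for j in range(nh):
                dW[j] = eta * err * h[j] + mu * dW[j]
                db[j] = eta * deltas[j] + mu * db[j]
                W[j] += dW[j]
                b[j] += db[j]
                for i in range(ni):
                    dw[j][i] = eta * deltas[j] * x[i] + mu * dw[j][i]
                    w[j][i] += dw[j][i]
            dB = eta * err + mu * dB
            B += dB
    return w, b, W, B

# Toy data (hypothetical): two inputs, one output.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
Y = [0.8 * x1 - 0.4 * x2 for x1, x2 in X]
params = ([[0.1, -0.2], [0.3, 0.1]], [0.0, 0.0], [0.2, -0.1], 0.0)
e_before = sse(X, Y, params)
params = train(X, Y, params)
e_after = sse(X, Y, params)
print(e_before, e_after)
```

The factor 2 arising from the derivative of the squared error is absorbed into `eta`, which is a common convention rather than something stated in the text.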
1.2. Network pruning

The basic idea of this method is to start with a neural network that has an excessive number of neurones in the hidden layer during the training step and to cut all weights (W) that have only a slight influence on the error E. Neurones with all their connections cut are deleted, so that only those useful for modelling are kept. The pruning technique [41] reduces the complexity of neural networks and improves their predictive ability because it avoids over-parameterised models (an excessive number of neurones). There are two methods of pruning neural networks: optimal brain damage (OBD) [42] and optimal brain surgeon (OBS) [43]. In both methods the connections (or weights) are cut and the corresponding variation in the error E, called the saliency [44], is evaluated. In the first method, the cuts in the connections occur during the training step and the neural network is not trained again. In the OBS method, in contrast, the connections are retrained after each cut, allowing new cuts to be made. Therefore, in the OBS method the neural network is retrained, fitting the training errors by a quadratic function to warrant a possible minimum. In other words, in the OBS method a regularisation term [45] is added to the error function E(w) given by Eq. (2), yielding the cost function C(w), which can be written in generic form as

$$C(w) = E(w) + D w^{2} \qquad (5)$$

The regularisation term induces the pruning process and assures the numerical stability of the method by simply penalising weights with low values through the regularisation matrix D. The next step is to expand the cost function in a Taylor series up to second-order terms around a possible minimum weight value $w_0$:

$$C(w) \cong C(w_0) + (\nabla E(w) + 2Dw)(w - w_0) + \tfrac{1}{2}(\nabla^{2} E(w) + D)(w - w_0)^{2} \qquad (6)$$

Since the expansion is quadratic around a possible minimum, the second term of Eq. (6) vanishes, and the cost function can be written as

$$C(w) = C(w_0) + \tfrac{1}{2}(\nabla^{2} E(w) + D)(w - w_0)^{2} \qquad (7)$$

or, equivalently, as

$$C(w) = C(w_0) + \tfrac{1}{2}\,\delta w^{T} H\, \delta w \qquad (8)$$
where δW = W − W0 and H = (A + D) is the Hessian matrix [46] with the regularisation term; A is the matrix containing all second-order derivatives of the training error (∂²E/∂w²). After this phase, the weights must be eliminated and the cost function minimised. The elimination of the jth weight can be expressed as

$$\delta W^{T} e_j = -W_0^{T} e_j \qquad (9)$$
where $e_j$ is the jth unit vector. Weight elimination and simultaneous minimisation of the cost function can be obtained by applying the Lagrange multiplier method [47]:

$$\tilde{C}_j(W) = C(W) + \lambda (\delta W + W_0)^{T} e_j \qquad (10)$$
where λ is the Lagrange multiplier. Solving the above equation gives the variation of the weights that eliminates the jth weight while minimising the cost function:

$$\delta W = -\lambda H^{-1} e_j \qquad (11)$$
From this step it is possible to return to Eq. (5), make E(W) explicit and obtain the variation of the error as a function of the weights, as shown in Eq. (12):

$$\delta E(w) = A\,\delta w^{2} - D w_0\, \delta w \qquad (12)$$
At this point δW is replaced by the value obtained in Eq. (11), giving the following equation for the saliency:

$$\delta E_j(W) = \lambda W_0 D H^{-1} e_j + \tfrac{1}{2}\lambda^{2} e_j^{T} H^{-1} (\nabla^{2} E + D) H^{-1} e_j \qquad (13)$$
If D is zero, Eq. (13) reduces to the equation obtained by Hassibi and Stork for the saliency calculation. The saliency equation (Eq. (13)) provides a mathematical criterion for cutting connections with simultaneous error minimisation.
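For the unregularised case (D = 0), combining Eqs. (9) and (11) gives λ = w_j / [H⁻¹]_jj and a saliency of w_j² / (2 [H⁻¹]_jj), which is the Hassibi and Stork expression. The following minimal sketch applies one such pruning step to a hypothetical two-weight model; the Hessian, the weights and the function names are illustrative assumptions, not values or code from this work.

```python
def inverse_2x2(h):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = h
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def obs_step(w0, hessian):
    """One OBS pruning step with D = 0 for a two-weight model."""
    h_inv = inverse_2x2(hessian)
    # Saliency of each weight: w_j^2 / (2 [H^-1]_jj).
    saliencies = [w0[j] ** 2 / (2.0 * h_inv[j][j]) for j in range(2)]
    # Cut the weight whose removal increases the error least.
    j = min(range(2), key=lambda k: saliencies[k])
    lam = w0[j] / h_inv[j][j]              # Lagrange multiplier from Eq. (9)
    # Eq. (11): delta_w = -lambda * H^-1 * e_j (column j of H^-1).
    delta = [-lam * h_inv[k][j] for k in range(2)]
    w_new = [w0[k] + delta[k] for k in range(2)]
    return j, saliencies, w_new

# Hypothetical trained weights and Hessian of the training error.
w0 = [1.0, 0.1]
H = [[2.0, 0.5], [0.5, 1.0]]
cut, saliencies, w_new = obs_step(w0, H)
print(cut, saliencies, w_new)
```

In this example the second weight is cut (it becomes exactly zero) and the first weight is adjusted at the same time, which is the feature that distinguishes OBS from simply deleting small weights.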
2. Experimental

2.1. Reagent and materials

Hydroquinone (HQ) and catechol (C) were obtained from Aldrich (Milwaukee, WI, USA). Araldite® from Brascola LTDA, Leit-C conductive carbon cement from Neubauer Chemikalien, polyethylene tube and PAN-type T-800 sized carbon fibre with 8 µm diameter from Toray Industries Inc. (Tokyo, Japan) were employed in this work. All other reagents were of analytical grade.

2.2. Electrode preparation

The carbon fibre electrode was prepared with titanium tetrachloride, following the protocol described in the literature [22]. A bundle of about 50 fibres was inserted in a 3 mm i.d. polyethylene tube, leaving about 3 mm of length for sealing by heating. Electrical contact was provided by a copper wire through a conducting carbon paste cement dissolved in xylene, and the extremity of the electrode was sealed with epoxy resin (Araldite®). Fig. 1 shows a schematic diagram of the microelectrode.

Fig. 1. Schematic diagram of the microelectrode; 1: carbon fibre (2–3 mm length), 2: electrical contact with carbon cement, 3: polyethylene capillary tube (20–30 mm length), 4: epoxy resin and 5: copper wire.

2.3. Electrochemical measurements

Differential pulse voltammetry experiments were performed using an Autolab potentiostat (model PGSTAT20, the Netherlands) connected to a PC microcomputer for potential control and data acquisition. A conventional three-electrode system was employed, consisting of a carbon fibre as working electrode, a Pt wire as counter electrode and a saturated calomel electrode (SCE) as reference. The potential was scanned from 500 to −300 mV using 5 ml of solution containing the mentioned analytes.

2.4. Optimisation of pH and buffer concentration

In order to assure the best peak resolution, the influence of pH and buffer concentration was investigated. The influence of the buffer (Tris–HCl) concentration was studied in the range 0.05 to 0.25 mol l−1 and the influence of pH between 6.0 and 8.0.

2.5. Dataset

The samples used for the simultaneous determination of catechol and hydroquinone were synthetic and resulted from a factorial design [48]. The sample set was prepared in the concentration range 1.0×10−4–6.0×10−4 mol l−1, giving a total of 121 samples. However, the mixtures with high concentrations of catechol and hydroquinone saturated the electrode and did not present differences in their respective voltammograms. These samples were therefore removed and 81 samples remained, as shown in Fig. 2. The samples were prepared from stock solutions of catechol and hydroquinone in 0.05 mol l−1 Tris–HCl buffer at pH 6.0.

Fig. 2. Diagram for the experimental design for catechol and hydroquinone mixtures.
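The text does not list the concentration levels of the factorial design. As a purely illustrative assumption, 11 equally spaced levels per isomer (steps of 0.5×10−4 mol l−1 between 1.0×10−4 and 6.0×10−4 mol l−1) would give the 121 mixtures mentioned in Section 2.5. The sketch below builds such a grid; the rule used to discard the saturating mixtures is not given in the text and is therefore not reproduced.

```python
from itertools import product

# Assumed levels, in units of 1e-4 mol/l: 1.0, 1.5, ..., 6.0 (hypothetical).
levels = [round(1.0 + 0.5 * k, 1) for k in range(11)]

# Full two-factor design: every catechol level with every hydroquinone level.
design = list(product(levels, repeat=2))

print(len(design))           # number of (catechol, hydroquinone) mixtures
print(design[0], design[-1])
```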
Fig. 3. Plot of scores on PC1 vs. scores on PC2.
2.6. Samples selection

The dataset was split into two subsets, one for calibration and the other for validation of the model. The calibration set had 63 samples and the validation set 18 samples. The samples were assigned to the subsets according to a principal component analysis (PCA), by plotting the scores of the first principal component (PC1) versus those of the second component (PC2), which together represent 93% of the data variance, as shown in Fig. 3. Samples representing each of the four quadrants were placed in both the calibration and validation sets.

2.7. Data pre-processing

Before the calibration step the data were compressed by PCA and the noise was reduced by a fast Fourier transform filter [49,50]. In the model constructed with neural networks using pruning, 15 principal components were employed, which explained 95% of the data variance. The choice of 15 PCs is explained by the need for a larger number of PCs as inputs to the neural network: the extra PCs have their weights cut during pruning without significant variation in the calibration error.
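The compression step can be illustrated with a minimal sketch that extracts the first principal component of a mean-centred data matrix by power iteration and reports the fraction of variance it explains. This is not the PLS Toolbox routine used in this work; the function name, the toy data and the use of power iteration are assumptions made for the example, and only one component is extracted instead of the 15 used for the network.

```python
import math

def first_pc(data, iters=200):
    """First principal component (loading, scores, explained fraction)."""
    n, p = len(data), len(data[0])
    means = [sum(row[k] for row in data) / n for k in range(p)]
    centred = [[row[k] - means[k] for k in range(p)] for row in data]
    # Sample covariance matrix of the mean-centred data.
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(p)]
           for a in range(p)]
    v = [1.0] * p
    for _ in range(iters):                       # power iteration
        v = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p))
                     for a in range(p))
    total = sum(cov[a][a] for a in range(p))     # total variance
    scores = [sum(r[k] * v[k] for k in range(p)) for r in centred]
    return v, scores, eigenvalue / total

# Toy data lying exactly on a line, so one component explains everything.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
loading, scores, explained = first_pc(data)
print(loading, explained)
```

The scores returned here play the role of the network inputs described above; in practice further components would be obtained after deflating the data or with a full eigendecomposition.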
Noise in the voltammograms was minimised before the modelling methods were applied by means of the direct Fourier transform [51–53], Eq. (14), which gives the frequency spectrum F(w) of a voltammogram, i.e. its representation in the frequency domain (w):

$$F(w) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(E)\, e^{iEw}\, dE \qquad (14)$$
In Eq. (14), E represents the electric potential and f(E) the voltammogram. After the frequency spectrum of the analytical signal has been obtained, the high frequencies are eliminated, since most of them are associated with instrumental noise. Finally, the inverse Fourier transform, Eq. (15), is applied and the original signal (voltammogram) is recovered with minimised noise:

$$f(E) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(w)\, e^{-iEw}\, dw \qquad (15)$$
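The filtering procedure of Eqs. (14) and (15) can be sketched in discrete form: transform the sampled signal, set the high-frequency coefficients to zero and transform back. The sketch below uses a plain discrete Fourier transform written with the standard library; it is not the MatLab routine used in this work, the sign and normalisation conventions of the discrete transform differ from those of Eqs. (14) and (15), and the synthetic peak, the interfering sinusoid and the cut-off of 8 frequency bins are assumptions chosen for the example.

```python
import cmath
import math

def dft(signal, inverse=False):
    """Naive discrete Fourier transform (O(n^2)), adequate for short signals."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = [sum(signal[k] * cmath.exp(sign * 2j * math.pi * k * m / n)
               for k in range(n)) for m in range(n)]
    return [c / n for c in out] if inverse else out

def low_pass(signal, cutoff):
    """Zero every frequency bin above `cutoff`, then transform back."""
    n = len(signal)
    spectrum = dft(signal)
    # Bins m and n - m hold the same frequency, so both are tested.
    kept = [c if min(m, n - m) <= cutoff else 0.0
            for m, c in enumerate(spectrum)]
    return [c.real for c in dft(kept, inverse=True)]

# Synthetic "voltammogram": a smooth peak plus a high-frequency disturbance.
n = 64
clean = [math.exp(-((k - 32) / 6.0) ** 2) for k in range(n)]
noisy = [c + 0.05 * math.sin(2 * math.pi * 20 * k / n)
         for k, c in enumerate(clean)]
filtered = low_pass(noisy, cutoff=8)
max_err = max(abs(f - c) for f, c in zip(filtered, clean))
print(max_err)
```

The choice of cut-off is a compromise: too low a value distorts the peak shape, too high a value leaves noise in the signal.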
2.8. Evaluation of the performance of models

The relative performance of the different models for each isomer was evaluated in terms of the root mean square error of prediction (%RMSEP) relative to the standard deviation of the concentrations of the prediction set, defined as

$$\%\mathrm{RMSEP} = \frac{1}{S_p} \sqrt{\frac{\sum_{i=1}^{n_p} (y_i - \hat{y}_i)^{2}}{n_p}} \times 100 \qquad (16)$$

where $S_p$ and $n_p$ are the standard deviation of the concentrations and the number of samples in the validation set, respectively, $y_i$ is the real value and $\hat{y}_i$ the value predicted by the different models.

2.9. Computer programs

The programs used to prune the network were obtained from the neural network based system identification toolbox [51] for use with MatLab. The PLS and PCA calculations were performed with the PLS toolbox [52] for use with MatLab. The program for noise reduction by fast Fourier transform was implemented using sub-routines from MatLab 4.0.
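The figure of merit of Eq. (16) in Section 2.8 can be computed with a few lines of code. In the sketch below the standard deviation S_p is taken as the sample standard deviation (divisor n_p − 1), which is an assumption since the text does not specify it; the function name and the example values are hypothetical.

```python
import math

def percent_rmsep(y_true, y_pred):
    """Eq. (16): RMSEP relative to the standard deviation of y_true, in %."""
    n = len(y_true)
    mean = sum(y_true) / n
    # Sample standard deviation of the reference concentrations (assumed n - 1).
    s_p = math.sqrt(sum((y - mean) ** 2 for y in y_true) / (n - 1))
    rmsep = math.sqrt(sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred)) / n)
    return 100.0 * rmsep / s_p

# Hypothetical reference and predicted concentrations (arbitrary units).
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 3.1, 3.9, 5.1]
value = percent_rmsep(y_true, y_pred)
print(round(value, 2))
```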
2.10. PLS and linear regression

Besides the neural network and pruned neural network approaches, other calibration methods, namely PLS [53] and univariate linear regression, were evaluated. In the partial least squares (PLS1 and PLS2) methods the model was constructed from the training set and predictions were made for the validation set. The data were first mean centred and cross-validation was used to determine the optimal number of factors (latent variables). In the linear regression method the models were constructed by establishing the relationship between the current at the characteristic reduction potentials and the concentration of the isomers in the binary mixtures.

3. Results and discussion

The determination of hydroquinone derivatives with glassy carbon and carbon paste electrodes is usually made with an anodic sweep. In this work, electrochemical reduction was used to minimise passivation of the electrode surface. Although the sensitivity is slightly decreased, the resolution of the peaks and the stability of the signal are improved. Carbon fibre electrodes treated with TiCl4 show advantages over conventional electrodes (e.g. activated glassy carbon electrodes, on which catechol and hydroquinone are assumed to be strongly adsorbed [54]): they are very reproducible, and the i–E wave does not change upon successive scans. Also, it is not necessary to make the measurements in a controlled atmosphere (e.g. nitrogen), as it is with activated glassy carbon electrodes [54,55]. However, the electrode could not be used with high concentrations of phenolic compounds owing to passivation problems, even with treated carbon fibres. Thus, a concentration range between 1.0×10−4 and 6.0×10−4 mol l−1 was employed for both catechol and hydroquinone. Although this range is apparently narrow, it should be considered that the total concentration of phenolic compounds in a mixture extends to 1.2×10−3 mol l−1. Above this level the electrode can be affected by passivation.

3.1. Optimisation of the buffer concentration and pH
Changing the buffer concentration shifted the cathodic peak potential of the species and varied the signal intensity. In Fig. 4 the cathodic peak potential is plotted against the buffer concentration; the best peak separation, without loss of sensitivity, was obtained at 0.05 mol l−1. For concentrations lower than 0.05 mol l−1 the signal/noise ratio decreased, so 0.05 mol l−1 was chosen for further work. In the pH studies performed for each analyte, pH values higher than 8.0 did not show a linear relationship between peak current and concentration for either isomer, presumably owing to their low stability in alkaline media. For pH lower than 7.0, good linear relations were observed for both isomers in the studied concentration range. The redox potential is pH dependent, as can be seen in Fig. 5. The pH value at which both peak potentials are closest to zero was chosen. Based on these results, the best condition was defined as a buffer concentration of 0.05 mol l−1 at pH 6.0.
Fig. 4. Cathodic peak potential vs. buffer concentration of each individual isomer and mixture; (a) catechol 4.00×10−4 mol l−1 , (b) hydroquinone 4.00×10−4 mol l−1 and (c) mixture (1:1) 4.00×10−4 mol l−1 at pH 6.0.
Fig. 5. Cathodic peak potential vs. pH of (a) catechol 4.00×10−4 mol l−1 and (b) hydroquinone 4.00×10−4 mol l−1 , in 0.05 mol l−1 Tris–HCl buffer. Differential pulse voltammetry with scan rate of 10 mV s−1 and modulation amplitude of 25 mV.
Despite all the efforts to improve the peak resolution, a considerable overlap between the peaks is still present, as shown in Fig. 6, and the signal resulting from this overlap is not a simple sum of the individual signals of each isomer. This behaviour suggests that the determination of these compounds in the mixture is not trivial.

3.2. Calibration methods: construction and validation models

3.2.1. Linear regression

No correlation was found between the current intensities at the characteristic peak potentials (−13.6 mV for catechol and −79.0 mV for hydroquinone) and concentration. These results indicate that the linear univariate method is unable to model the system. This occurs basically because of the overlap of the voltammetric peaks and the non-linear response of the system, presumably due to competition between the analytes for the electrode surface.

3.2.2. Partial least squares 1 (PLS1)

Two PLS1 models were built separately, one for catechol and the other for hydroquinone. Initially the
Fig. 6. Differential pulse voltammetric signal provided by each of the individual isomer studied and signal of mixture in 0.05 mol l−1 Tris–HCl buffer. (a) Hydroquinone 1.00×10−4 mol l−1 , (b) catechol 1.00×10−4 mol l−1 and (c) mixture 1.00×10−4 mol l−1 catechol/4.00×10−4 mol l−1 hydroquinone. The scan rate was 10 mV s−1 and the modulation amplitude was 25 mV.
data were mean centred and the optimal number of factors (latent variables) used in the PLS1 models was obtained by cross-validation. The optimal number of factors was 8 for catechol and 9 for hydroquinone. This large number of factors was necessary, possibly, because of the complexity caused by the peak overlap, the shift of the reduction peak potentials with concentration and the non-linear response of the carbon fibre electrode due to competition between catechol and hydroquinone for the electrode surface. The results obtained for the prediction set are shown in Fig. 7(a and b) and Table 1. The %RMSEP obtained for this model was 8.04 for catechol and 12.33 for hydroquinone.

3.3. Partial least squares 2 (PLS2)

In this partial least squares method both analytes were modelled simultaneously. The optimal number of factors, obtained by cross-validation, was 9. This number is similar to those obtained for PLS1, indicating the same complexity of the system. The results obtained for the prediction set are shown in Fig. 8(a and b) and Table 1. The %RMSEP obtained for this model was 9.08 for catechol and 13.94 for hydroquinone.
3.4. Neural network with pruning

The inputs used to train the neural network were 15 principal components, which explained 97% of the data variance, scaled to the range −1 to 1. The initial network architecture (before pruning) and training parameters are summarised in Table 2. In Table 2, 'L' indicates a linear transfer function and 'H' a hyperbolic tangent transfer function. For example, the notation LHLHLH means that there are six neurones in the hidden layer, alternating between linear and hyperbolic tangent transfer functions. This procedure is very interesting because, in principle, it makes it possible to model both linear and non-linear systems. Moreover, it may be considered a diagnosis of non-linearity: if some non-linear neurones remain at the end of the pruning procedure, the system is non-linear. The initial neural network was trained using the Marquardt–Levenberg method. The convergence criterion (CC) measures closeness in terms of a mean square error, and the training was stopped when the CC function fell below
Table 1
Results obtained for the validation set modelled with PLS1, PLS2 and neural network, for catechol and hydroquinone, in the concentration range 1.0×10−4–6.0×10−4 mol l−1 in 0.05 mol l−1 Tris–HCl buffer at pH 6.0, with the respective errors

Catechol (×10−4 mol l−1)

Added    PLS-1       PLS-1       PLS-2       PLS-2       Neural network  Neural network
value    predicted   error (%)   predicted   error (%)   predicted       error (%)
5.26     5.37        −2.16       5.37        −2.15       5.09            3.19
5.34     5.11        4.87        5.08        4.87        5.25            1.61
5.74     5.63        3.31        5.55        3.31        5.45            4.91
3.27     3.50        −7.27       3.51        −7.27       3.44            −5.14
2.36     2.44        −2.74       2.42        −2.74       2.49            −5.53
2.42     2.21        7.17        2.25        7.17        2.49            −3.06
1.95     1.98        −4.19       2.03        −4.19       2.02            −3.37
0.962    0.967       3.03        0.932       3.07        0.988           −2.77
3.38     3.23        5.89        3.19        5.89        3.40            −0.48
2.44     2.29        6.13        2.29        6.13        2.41            1.39
1.42     1.31        9.81        1.28        9.81        1.33            6.11
1.47     1.59        −7.19       1.58        −7.19       1.57            −6.49
1.44     1.45        2.98        1.39        2.98        1.38            3.66
1.45     1.42        3.98        1.39        3.98        1.48            −1.86
1.94     2.10        −7.45       2.09        −7.45       2.09            −6.52
3.30     3.47        −5.09       3.47        −5.09       3.21            2.88
4.85     4.88        −1.82       4.94        −1.82       4.75            2.11
5.79     5.74        2.73        5.63        2.73        5.65            2.46

Hydroquinone (×10−4 mol l−1)

Added    PLS-1       PLS-1       PLS-2       PLS-2       Neural network  Neural network
value    predicted   error (%)   predicted   error (%)   predicted       error (%)
3.83     3.93        −2.72       3.91        −2.23       3.68            3.89
2.43     2.26        6.84        2.25        7.25        2.35            3.26
3.82     4.01        −4.86       4.04        −5.71       3.91            −2.32
3.27     3.4         −3.94       3.39        −3.88       3.15            3.71
3.31     3.24        2.06        3.15        4.70        3.16            4.41
5.82     5.47        6.01        5.44        6.50        5.88            −1.01
5.36     5.53        −3.11       5.52        −2.97       5.52            −2.85
2.88     2.95        −2.28       2.85        1.03        2.87            0.56
4.81     4.6         4.32        4.51        6.13        4.78            0.65
4.85     4.77        1.74        4.69        3.46        4.98            −2.70
3.79     3.59        5.31        3.62        4.41        3.62            4.52
4.90     4.94        −0.78       4.86        0.78        5.03            −2.59
2.87     2.76        3.86        2.85        0.59        2.90            −1.03
1.93     1.96        −1.43       1.99        −0.06       2.07            −7.04
0.971    0.980       −0.948      0.926       4.67        0.920           4.75
2.36     2.57        −8.97       2.56        −8.43       2.24            4.96
2.43     2.46        −0.86       2.50        −2.59       2.33            4.38
2.89     3.07        −6.02       3.13        −8.03       2.96            −2.14
Fig. 7. Results obtained for the prediction set modelled with PLS1 for (a) catechol and (b) hydroquinone, in the concentration range 1.0×10−4–6.0×10−4 mol l−1 in 0.05 mol l−1 Tris–HCl buffer at pH 6.0.

Fig. 8. Results obtained for the prediction set modelled with PLS2 for (a) catechol and (b) hydroquinone, in the concentration range 1.0×10−4–6.0×10−4 mol l−1 in 0.05 mol l−1 Tris–HCl buffer at pH 6.0.
a previously established value:

$$CC = \frac{1}{2N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^{2} \qquad (17)$$
where $y_i$ is the real value, $\hat{y}_i$ the value predicted by the neural network with pruning and N the number of samples used in the calibration set.
The optimisation of the neural network architecture was carried out using the optimal brain surgeon (OBS) algorithm. The optimal architecture obtained after both the training and pruning procedures is shown in Fig. 9. The results obtained for the prediction set are shown in Fig. 10(a and b) and Table 1. The %RMSEP obtained for this model was 7.42 for catechol and 8.02 for hydroquinone. These results indicate that the pruned neural network models the system slightly better than PLS.

Table 2
Parameters employed for the initial network architecture and training parameters (before pruning)

Architecture                      Value     Parameter                               Value
Input layer nodes                 15        Maximum number of iterations            500
Input layer transfer function     L         Stop criteria                           1.0×10−2
Hidden layer nodes                6         Weight decrement to hidden layer (D)    0.01
Hidden layer transfer function    LHLHLH
Output layer nodes                2
Output layer transfer function    LL

3.5. Selection of the best procedure method

The best procedure for modelling this system was evaluated by applying an F-test, at the 95% confidence level, to compare the RMSEP values of the tested methods (PLS1, PLS2 and pruning neural network) according to Eq. (18)
$$F(p_i, p_j) = \left( \frac{\mathrm{RMSEP}_{\mathrm{PLS1\ or\ PLS2}}}{\mathrm{RMSEP}_{\mathrm{network}}} \right)^{2} \tag{18}$$
where p is the number of degrees of freedom of the model. The PLS degrees of freedom should be equal to the number of validation samples minus the number of latent variables plus 2. The value 2 is for each mean that is subtracted from the blocks of data. For neural networks there is no method for deciding on the correct number of degrees of freedom lost; however, the number of hidden neurones would be a conservative estimate. The critical F-test values are F_critical,PLS1 = 2.53 for catechol and 2.57 for hydroquinone, and F_critical,PLS2 = 2.57 for both catechol and hydroquinone. Using this approach, the result obtained for hydroquinone with the pruning neural network was slightly better than that of PLS2 (F_PLS1,hydroquinone = 2.36 and F_PLS2,hydroquinone = 3.02) and statistically equal, at the 95% confidence level, to that of PLS1. The pruning neural network gave a slightly lower RMSEP value than PLS1 and PLS2 for catechol, but this difference in %RMSEP was not significant at the 95% confidence level (F_PLS1,catechol = 1.17 and F_PLS2,catechol = 1.49). However, an analysis of the linear coefficient and slope of Figs. 7, 8 and 10 showed that the PLS method presented a slightly better performance.

Fig. 9. Optimal architecture obtained after the training and pruning procedures.

Fig. 10. Results obtained for the prediction set modelled with the neural network, for (a) catechol and (b) hydroquinone, at a concentration range of 1.0×10−4–6.0×10−4 mol l−1 in 0.05 mol l−1 Tris–HCl buffer at pH 6.0.

4. Conclusions

Several conclusions can be drawn from the results of this study.

(1) Due to the peak overlap and non-linear response of the system, it is impossible to model the system by linear univariate methods.

(2) The OBS pruning gave a slightly better result than PLS2 for hydroquinone, showing that the use of extra factors in the PLS2 model was not sufficient to model the non-linear behaviour of the system. However, a comparison between the models built for catechol shows that the results are statistically the same.

(3) The application of a multivariate calibration procedure to model the non-linear response of the highly overlapping voltammetric peaks, obtained with carbon fibre electrodes modified with titanium oxide, for the phenol isomer mixture was a very useful alternative method for the simultaneous determination of catechol and hydroquinone without any previous chemical separation. The electrode showed great stability, indicating that the titanium oxide and the cathodic sweep contribute to minimising electrode passivation. A concentration range between 1.0×10−4 and 6.0×10−4 mol l−1 was employed for both catechol and hydroquinone.

(4) The results obtained point to the possibility of using this method for other mixtures of phenolic compounds and for real samples.
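The F-test comparison of Section 3.5 can be illustrated with a short calculation. The sketch below evaluates the ratio of Eq. (18) and compares the F values quoted in the text against the quoted critical values. The function names `f_ratio` and `is_significant` are hypothetical, and the decision rule (F greater than F critical) is the usual one-sided convention, assumed here rather than taken from the paper.

```python
def f_ratio(rmsep_pls: float, rmsep_network: float) -> float:
    """Eq. (18): squared ratio of the PLS and network RMSEP values."""
    if rmsep_network <= 0:
        raise ValueError("rmsep_network must be positive")
    return (rmsep_pls / rmsep_network) ** 2


def is_significant(f_value: float, f_critical: float) -> bool:
    """Assumed rule: the difference is significant when F exceeds F critical."""
    return f_value > f_critical


# (F value, critical F value) pairs as quoted in the text, 95% confidence level.
reported = {
    ("PLS1", "catechol"): (1.17, 2.53),
    ("PLS2", "catechol"): (1.49, 2.57),
    ("PLS1", "hydroquinone"): (2.36, 2.57),
    ("PLS2", "hydroquinone"): (3.02, 2.57),
}

verdicts = {
    key: is_significant(f_value, f_critical)
    for key, (f_value, f_critical) in reported.items()
}

for (model, analyte), significant in verdicts.items():
    label = "significant" if significant else "not significant"
    print(f"network vs {model}, {analyte}: {label}")
```

Under these assumptions only the PLS2 comparison for hydroquinone exceeds its critical value, in line with the conclusion that the pruned network is better than PLS2 for hydroquinone and statistically equivalent to PLS elsewhere.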
Acknowledgements

The authors thank FAPESP for financial support and RMC is indebted to CNPq for a fellowship.