Comparison of linear and nonlinear calibration models based on near infrared (NIR) spectroscopy data for gasoline properties prediction


Chemometrics and Intelligent Laboratory Systems 88 (2007) 183 – 188 www.elsevier.com/locate/chemolab

Roman M. Balabin a,⁎, Ravilya Z. Safieva a, Ekaterina I. Lomakina b

a Gubkin Russian State University of Oil and Gas, 119991 Moscow, Russia
b Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, 119992 Moscow, Russia

Received 12 March 2007; received in revised form 23 April 2007; accepted 26 April 2007
Available online 5 May 2007

Abstract

Six popular approaches to building «NIR spectrum–property» calibration models are compared in this work on the basis of gasoline spectral data. The approaches are: multiple linear regression (MLR), principal component regression (PCR), linear partial least squares regression (PLS), polynomial partial least squares regression (Poly-PLS), spline partial least squares regression (Spline-PLS) and artificial neural networks (ANN). The best preprocessing technique is found for each method, together with the optimal calibration parameters (number of principal components, ANN structure, etc.). The accuracy, computational complexity and ease of application of the methods are compared using the prediction of six important gasoline properties (density and fractional composition) as an example. The calibration errors of the different approaches are reported. The advantage of the neural network approach to the «NIR spectrum–gasoline property» problem is illustrated. An effective model for predicting gasoline properties from NIR data is built. © 2007 Elsevier B.V. All rights reserved.

Keywords: Gasoline; Principal component regression (PCR); Linear partial least squares regression (PLS); Polynomial partial least squares regression (Poly-PLS); Spline partial least squares regression (Spline-PLS); Artificial neural network (ANN)

1. Introduction

Multivariate calibration methods are increasingly used to extract relevant information from different types of spectral data in order to predict analyte concentrations or properties of complex samples [1–7]. The main problem for these methods is data nonlinearity [8]. Several strategies have been used to calibrate nonlinear data systems: data pretreatment (such as data transformation and variable selection); the use of linear methods (for slight nonlinearities only); local modeling; the addition of extra variables; and nonlinear calibration techniques [8–10]. Among these approaches, only the last should be able to produce a robust calibration model, since it has the potential to model the severe intrinsic nonlinearities found in natural «sophisticated» (e.g., multicomponent) systems [8,11–14].

⁎ Corresponding author. Tel.: +7 926 592 7920. E-mail address: [email protected] (R.M. Balabin).
0169-7439/$ - see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.chemolab.2007.04.006

The most important linear calibration method is partial least squares regression (PLS). The two most important nonlinear calibration methods are the nonlinear variants of PLS and artificial neural networks (ANN) [8]. Several comparative studies of these two techniques have been carried out on various data sets [11,15–17]. Some studies showed that neural networks perform better than PLS when the data are nonlinear [8,18]. In other situations, however, ANN and nonlinear PLS gave equally good results [15]. One can assume that the different conclusions of different studies result from differences in the nature of the nonlinearities [8].

In this paper, we compare the performance of linear (MLR, PCR, PLS) and nonlinear (polynomial and spline PLS, ANN) calibration techniques for two data sets of gasoline near infrared spectra (8000–14,000 cm−1). The models are built to predict six important gasoline properties: density, initial boiling point (IB), end boiling point 10% v/v (T10), end boiling point 50% v/v (T50), end boiling point 90% v/v (T90), and final boiling point (FB). We also explore the influence of preprocessing methods (normalization, magnitude normalization, linearization, differentiation, double differentiation, autoscaling, and range scaling) on the models' prediction ability.

2. Experimental

2.1. Materials

Two gasoline sample sets were used in this study. The first set consists of gasolines and gasoline fractions, both without additives, from the laboratories of Orsknefteorgsintez OAO, a subsidiary of Oil Company RussNeft OAO (46 items), and Kirishinefteorgsintez OAO (50 items) [19,20]. The second set consists of gasolines and gasoline fractions, both without additives, obtained from the same laboratories (44 items from Orsknefteorgsintez OAO and 60 items from Kirishinefteorgsintez OAO) [21]. All gasoline samples were stored at −4 °C for no more than 90 days from the date of receipt. The distribution of gasoline samples by properties is shown in Table 1.

Table 1
Distribution of gasoline samples by properties

Property                           Unit    Interval min   Interval max   Mean value   Standard deviation
Density at 20 °C                   kg/m3   640            800            712          40
Initial boiling point (IB)         °C      35             59             41           12
End boiling point 10% v/v (T10)    °C      58             117            101          19
End boiling point 50% v/v (T50)    °C      93             128            118          13
End boiling point 90% v/v (T90)    °C      121            175            141          20
Final boiling point (FB)           °C      178            205            191          10

Table 3
Experimental parameters

Parameter                              Unit    Value
Spectral range                         cm−1    14,000–8000
Optical path                           cm      10
Sample volume                          cm3     100
Resolution                             cm−1    16
Number of scans                        –       64
Cell material                          –       Quartz
Time of one measurement                min     2–3
Number of measurements (per sample)    –       3

2.2. Methods

Near infrared spectra were measured with an InfraLUM FT-10 near-IR FT spectrometer (LUMEX, Russia), whose parameters are given in Table 2. As is clear from Table 2, the instrument does not cover the whole NIR range (4000–14,000 cm−1) [22], but only a part of it: parts of the second and third overtone regions (see also [19–21]). The InfraLUM FT-10 (like most NIR spectrometers) has no thermostatic control. To compensate for this shortcoming, we recorded a background spectrum before and after the measurement of each sample spectrum; the averaged background spectrum was then subtracted from the sample spectrum. This yielded an analytical signal with satisfactory accuracy and precision. The instrument calibration (intensity axis) was performed using four pure hydrocarbons (toluene, hexane, benzene, and isooctane).

Table 2
NIR device (InfraLUM FT-10) parameters

Parameter               Unit      Value
Spectral range          cm−1      14,500–7500
Resolution              cm−1      1–16
Wave number accuracy    cm−1      0.01
Photometric accuracy    %         0.1
Radiation source        –         Halogen incandescent lamp
Detector                –         Silicon photodiode
Size and weight         mm; kg    580 × 515 × 295; 37

Experimental parameters are given in Table 3. Such a long optical path (100 mm) was required in order to record the spectrum with sufficient accuracy (see also [19–21]). Examples of gasoline NIR spectra are shown in Fig. 1. The absorbance in the region 11,300–11,600 cm−1 is due to the presence of aromatic hydrocarbons in the gasoline [19,22]. Because different gasolines were used (e.g., aromatic-rich gasolines from reforming and aromatic-poor straight-run gasolines), the spectra fall into two groups.

Fig. 1. Gasoline NIR spectra example. The absorbance in the region 11,300–11,600 cm−1 is due to the presence of aromatic hydrocarbons in the gasoline (see Ref. [22]). The maximal transmittance is 5 a.u.

2.3. Software and computing

The initial spectra were digitized using a dedicated software package created by one of the authors (R.M. Balabin). After digitization each spectrum was represented as a 1 × 285 vector. The length of the vector is determined by the spectrometer resolution. No spectrum averaging or smoothing was applied. MATLAB 7.0 (MathWorks Inc., Natick, MA) with the Statistics Toolbox and the Neural Network Toolbox was used extensively in designing and executing our procedures. Standard routines of the Statistics and Neural Network Toolboxes were modified and extended by R.M. Balabin.

2.4. Model efficiency estimation

To characterize the prediction capacity of a model, the root mean squared error of cross-validation (RMSECV; see Section 2.5) was used:

$$\mathrm{RMSECV} = \sqrt{\frac{\sum_{i=1}^{N} (t_i - y_i)^2}{N}} \qquad (1)$$

where $t_i$ and $y_i$ are the actual and predicted values for sample $i$, respectively, and $N$ is the number of samples in a validation set. A 10-fold cross-validation procedure was applied.

2.5. Model optimisation

To compare the methods, the efficiency of the best model of each type was determined. Because model efficiency depends on the calibration parameters, the following parameters were varied:

MLR: no parameters;
PCR: number of principal components (1–20);
PLS (or Linear-PLS): number of latent variables (1–20);
Poly-PLS: number of latent variables (1–20) and degree of polynomial (1–7; 10);
Spline-PLS: number of latent variables (1–20), number of knots (1–7; 10), and degree of polynomial (1–7; 10);
ANN: training parameters are shown in Table 4.

Thus, one MLR, 20 PCR, 20 PLS, 160 Poly-PLS, 1280 Spline-PLS, and 400 ANN models were checked (1881 parameter combinations in total). The 10-fold cross-validation procedure was conducted for each model.

Different spectra preprocessing methods were used: normalization, magnitude normalization, linearization (taking the logarithm), differentiation, double differentiation, autoscaling, and range scaling to different intervals ([0;1], [0.1;0.9], [0.2;0.8], [0.3;0.7]). For each calibration method and gasoline property the best preprocessing technique was found [21]. Thus, 18,810 different models (1881 parameter combinations × 10 preprocessing variants) were evaluated for each property to find the best one.

Table 4
Neural network (ANN) training parameters

Parameter                                          Value
Algorithm                                          Levenberg–Marquardt
Minimised error function                           Mean squared error (MSE)
Learning                                           Supervised
Initialisation method                              Nguyen–Widrow
Transfer function, input layer                     – a
Transfer function, hidden layer                    Hyperbolic tangent (tanh)
Transfer function, output layer                    Hyperbolic tangent (tanh)
Number of training iterations                      150
Maximal number of epochs                           5000
Number of input neurons (principal components)     1–20
Number of hidden neurons                           1–20

a No transfer function.

3. Results and discussion

3.1. Multiple linear regression (MLR)

Multiple linear regression (MLR) is the simplest approach to building a calibration model. It is based on the assumption of a linear «signal–property» relationship [23,24]. This approach can be regarded as the basic method of experimental data processing in analytical chemistry. The researcher's task reduces to selecting a signal for which linearity holds as precisely as possible over as wide a range as possible (e.g., a range of concentrations) [23]. In the simplest case, a single sample property is related to a single signal (e.g., a peak height) by one constant. For a number of «simple» systems the MLR approach has proved effective [23,24]; at the same time, its application to systems with a greater number of degrees of freedom (e.g., spectral data) is not always effective [10,25]. By varying a large number (up to several hundred) of free parameters, it is possible to reduce the calibration error; the cross-validation (prediction) error, however, remains high.

The results of applying the MLR method to our spectral data in order to predict gasoline properties are given in Table 5. The 285 optical density values at different wavelengths were used as variables. As is clear from Table 5, this method proved ineffective for building a calibration model. Despite the relatively low error of calibration itself (0.3 kg/m3 for density, and 0.7, 0.5, 0.4, 0.4, and 0.5 °C for IB, T10, T50, T90, and FB, respectively), the cross-validation error is very high.

Table 5
Results of multiple linear regression (MLR) approach for gasoline properties prediction

Property                           Unit    Best preprocessing method   RMSECV
Density at 20 °C                   kg/m3   Normalization               8.8
Initial boiling point (IB)         °C      Differentiation             5.1
End boiling point 10% v/v (T10)    °C      Normalization               6.2
End boiling point 50% v/v (T50)    °C      Logarithmation              4.7
End boiling point 90% v/v (T90)    °C      Double differentiation      6.1
Final boiling point (FB)           °C      Normalization               6.9
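The gap between calibration error and cross-validation error can be reproduced with a minimal sketch. The Python code below (NumPy, not the MATLAB routines used in this work) evaluates Eq. (1) by 10-fold cross-validation for an MLR model fitted to synthetic data with 100 samples and 285 correlated variables. The synthetic data, the function names, the minimum-norm least-squares solution and the pooling of held-out predictions over all folds are assumptions made for illustration only.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values, as in Eq. (1)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def fit_mlr(X, y):
    """Least-squares MLR with intercept (minimum-norm solution if underdetermined)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    return coef[0] + X @ coef[1:]

def rmsecv(X, y, n_folds=10, seed=0):
    """RMSE of the pooled held-out predictions of a k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    predicted = np.empty(len(y))
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        coef = fit_mlr(X[train_mask], y[train_mask])
        predicted[test_idx] = predict_mlr(coef, X[test_idx])
    return rmse(y, predicted)

# Synthetic stand-in for the spectra (assumption): 100 samples, 285 correlated variables
rng = np.random.default_rng(1)
n_samples, n_vars, n_factors = 100, 285, 5
scores = rng.normal(size=(n_samples, n_factors))
loadings = rng.normal(size=(n_factors, n_vars))
X = scores @ loadings + 0.05 * rng.normal(size=(n_samples, n_vars))
y = scores @ rng.normal(size=n_factors) + 0.1 * rng.normal(size=n_samples)

rmsec = rmse(y, predict_mlr(fit_mlr(X, y), X))   # error on the calibration set itself
cv_error = rmsecv(X, y)                          # error on held-out samples
print(f"calibration RMSE: {rmsec:.3g}, RMSECV: {cv_error:.3g}")
```

In this sketch the model has more free parameters than samples, so it reproduces the calibration set almost exactly while the held-out error stays far larger, which is the same qualitative behaviour as reported for MLR in Table 5.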


Since the effectiveness of any model depends on the «samples/free parameters» ratio, the MLR technique is not effective with only about a hundred samples in the calibration set. It may be concluded that MLR is not suitable for the practical implementation of a method for determining gasoline properties and quality indices on the basis of near infrared spectroscopy. It also means that more sophisticated methods of data analysis are needed.

3.2. Principal component regression (PCR)

Principal component regression (PCR) is a more advanced method for multivariate analysis of spectral data. Its greater prediction accuracy is achieved by reducing the number of variables (from several hundred to several tens) and by removing noise [10]. When the samples-to-variables ratio reaches 10, the calibration model can be assumed to be accurate and robust. This can hardly be achieved for MLR (thousands of samples would be needed) and is more easily achieved for PCR. At the same time, the PCR approach (like the MLR approach) rests on the assumption of a linear «signal–property» relation. It has been demonstrated that linearity of optical density (the Bouguer–Lambert–Beer law) does not hold over a wide range of component concentrations even for a number of simple systems [8]. It is difficult to expect such linearity in a multicomponent blend such as gasoline.

PCR results for our data are given in Table 6. The superiority of the PCR approach over the MLR approach (Table 5) is obvious. However, the cross-validation errors are substantially higher than the calibration errors (1.1 kg/m3 for density, and 1.4, 1.2, 1.4, 2.0, and 2.2 °C for IB, T10, T50, T90, and FB, respectively), which does not characterize this method positively. From Table 6 it may also be concluded that gasoline density is a nonlinear function of its spectrum (the regression error is high in comparison with the standard laboratory method).

Table 6
Results of principal component regression (PCR) approach for gasoline properties prediction

Property                           Unit    Best preprocessing method   PC a   RMSECV
Density at 20 °C                   kg/m3   Differentiation             5      3.3
Initial boiling point (IB)         °C      Differentiation             4      2.4
End boiling point 10% v/v (T10)    °C      Logarithmation              19     2.4
End boiling point 50% v/v (T50)    °C      Logarithmation              15     2.8
End boiling point 90% v/v (T90)    °C      Logarithmation              6      3.4
Final boiling point (FB)           °C      Differentiation             8      3.3

a PC is the number of principal components.

3.3. Linear, polynomial, and spline partial least squares regression (PLS, Poly-PLS, Spline-PLS)

PLS is currently one of the most popular chemometric algorithms for building calibration models. Owing to its simplicity and low computational cost, it is used in the analysis of many kinds of data [8,10]. PLS itself (like PCR) is a linear method of data analysis. However, there exist nonlinear modifications of it, such as Poly-PLS [26,27] and Spline-PLS [28]. These two algorithms differ from linear PLS in only one step, in which the linear function is replaced by a polynomial (Poly-PLS) or by a spline, i.e. a piecewise polynomial function (Spline-PLS). The polynomial and spline functions can have any order (see above). The algorithms are discussed in detail in Refs. [8,26–28].

These methods, while retaining the simplicity of their «ancestor» (Linear-PLS), are substantially more demanding computationally. Their run time is comparable to the training time of a neural network (ANN) [8,25].

Tables 7–9 present the results of calibration model building with the PLS, Spline-PLS and Poly-PLS methods, respectively. Our results are in line with previously obtained data [8]. In order of increasing effectiveness, the PLS methods can be arranged as follows: PLS, Spline-PLS, Poly-PLS. As is clear from Tables 7–9, the PLS regression methods outperform both MLR and PCR. The reason is their greater flexibility (the property matrix is used in the decomposition) and, except for linear PLS itself, their nonlinearity. Although polynomial and spline approximations of the «spectrum–property» relation are more advanced than a linear one, they still cannot be called universal. No theoretical basis for these methods is apparent.

Table 7
Results of partial least squares (PLS) approach for gasoline properties prediction

Property                           Unit    Best preprocessing method   LV a   RMSECV
Density at 20 °C                   kg/m3   Double differentiation      10     2.8
Initial boiling point (IB)         °C      Differentiation             10     2.0
End boiling point 10% v/v (T10)    °C      Differentiation             9      2.2
End boiling point 50% v/v (T50)    °C      Differentiation             12     2.4
End boiling point 90% v/v (T90)    °C      Differentiation             19     2.8
Final boiling point (FB)           °C      Logarithmation              18     2.8

a LV is the number of latent variables.

Table 8
Results of spline partial least squares (Spline-PLS) approach for gasoline properties prediction

Property                           Unit    Preprocessing method   LV–K–D a   RMSECV
Density at 20 °C                   kg/m3   Differentiation        10–4–3     2.4
Initial boiling point (IB)         °C      Logarithmation         15–4–3     1.6
End boiling point 10% v/v (T10)    °C      Logarithmation         9–5–2      1.8
End boiling point 50% v/v (T50)    °C      Differentiation        15–5–3     2.0
End boiling point 90% v/v (T90)    °C      Differentiation        15–4–3     2.3
Final boiling point (FB)           °C      Differentiation        17–4–3     2.2

a LV is the number of latent variables; K is the number of knots; D is the degree.

Table 9
Results of polynomial partial least squares (Poly-PLS) approach for gasoline properties prediction

Property                           Unit    Preprocessing method     LV–D a   RMSECV
Density at 20 °C                   kg/m3   Double differentiation   9–3      2.4
Initial boiling point (IB)         °C      Differentiation          15–5     1.6
End boiling point 10% v/v (T10)    °C      Differentiation          9–4      1.8
End boiling point 50% v/v (T50)    °C      Differentiation          14–3     1.9
End boiling point 90% v/v (T90)    °C      Differentiation          14–5     2.2
Final boiling point (FB)           °C      Differentiation          18–3     2.1

a LV is the number of latent variables; D is the degree.

Table 10
Results of neural network (ANN) approach for gasoline properties prediction

Property                           Unit    Best preprocessing method   Network structure a   RMSECV b
Density at 20 °C                   kg/m3   Differentiation             10–7–1                2.0
Initial boiling point (IB)         °C      Differentiation             16–8–1                1.3
End boiling point 10% v/v (T10)    °C      Differentiation             19–6–1                1.4
End boiling point 50% v/v (T50)    °C      Differentiation             15–9–1                1.6
End boiling point 90% v/v (T90)    °C      Differentiation             14–9–1                1.7
Final boiling point (FB)           °C      Differentiation             18–7–1                1.7

a Neural network structure is PC-NN-1, where PC is the number of principal components (input neurons) and NN is the number of hidden neurons.
b Average RMSECV for the 5 best neural networks is given.
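As a rough illustration of the single step that distinguishes Poly-PLS from linear PLS, the following Python sketch (NumPy, not the MATLAB toolboxes used in this work) fits a single-property PLS model by the NIPALS algorithm, with the inner relation between the scores t and the property taken as a polynomial of selectable degree; degree 1 gives ordinary linear PLS. This is a simplified variant: the function names, the synthetic one-factor data and the absence of any weight updating are assumptions made for illustration, and the published algorithms [26,27] may differ in these details.

```python
import numpy as np

def fit_pls1(X, y, n_components, degree=1):
    """PLS1 by NIPALS; the inner relation y ~ f(t) is a polynomial of the given degree."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean           # centred residuals, deflated below
    weights, loadings, inner = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w = w / np.linalg.norm(w)             # unit-length weight vector
        t = Xr @ w                            # scores of the current latent variable
        p = Xr.T @ t / (t @ t)                # X loadings
        coefs = np.polyfit(t, yr, degree)     # inner relation (linear if degree == 1)
        Xr = Xr - np.outer(t, p)              # deflate X
        yr = yr - np.polyval(coefs, t)        # deflate y
        weights.append(w)
        loadings.append(p)
        inner.append(coefs)
    return x_mean, y_mean, weights, loadings, inner

def predict_pls1(model, X):
    """Apply the stored weights, loadings and inner relations to new samples."""
    x_mean, y_mean, weights, loadings, inner = model
    Xr = X - x_mean
    y_hat = np.full(len(X), y_mean)
    for w, p, coefs in zip(weights, loadings, inner):
        t = Xr @ w
        y_hat = y_hat + np.polyval(coefs, t)
        Xr = Xr - np.outer(t, p)
    return y_hat

def rmse(actual, predicted):
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Synthetic data (assumption): one latent factor drives 50 variables,
# and the property depends nonlinearly on that factor.
rng = np.random.default_rng(0)
s = rng.normal(size=120)
X = np.outer(s, rng.normal(size=50)) + 0.05 * rng.normal(size=(120, 50))
y = s + 0.5 * s**2 + 0.05 * rng.normal(size=120)
X_train, y_train, X_test, y_test = X[:80], y[:80], X[80:], y[80:]

rmse_linear = rmse(y_test, predict_pls1(fit_pls1(X_train, y_train, 1, degree=1), X_test))
rmse_quadratic = rmse(y_test, predict_pls1(fit_pls1(X_train, y_train, 1, degree=2), X_test))
print(f"linear PLS: {rmse_linear:.3g}, quadratic PLS: {rmse_quadratic:.3g}")
```

Keeping the degree as a parameter mirrors the model grid of Section 2.5, where the number of latent variables and the polynomial degree are varied together; a spline inner relation would replace only the `np.polyfit`/`np.polyval` pair.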


3.4. Artificial neural network (ANN)

Neural networks (ANN) appeared in chemistry relatively recently, but they immediately attracted much attention [29]. Since neural networks (multilayer perceptrons) are universal approximators [30], they are free of the disadvantages of linear methods stated above. This «universality» allows neural networks to find application in different fields of science and technology [31]. The main disadvantages of the ANN approach are its computational complexity and its stochastic nature (the results of ANN training depend on the initial parameters).

Previously obtained results of applying neural networks to our spectral and reference data are reproduced in Table 10. These results were discussed previously [20,21]. Here we only highlight the clear superiority of the neural network approach over traditional methods of multivariate analysis (see also Fig. 2).

3.5. Comparison of methods

Fig. 2 summarizes the results obtained by the different methods. With respect to effectiveness they can be arranged in the following order:

ANN > Poly-PLS ≈ Spline-PLS > PLS > PCR ≫ MLR

One can see (Fig. 2) that the neural network approach is the most effective for creating an express method of gasoline analysis (on the basis of NIR spectroscopy). Its advantage is not as large as that obtained with synthetic data sets [8], though it is large in comparison with linear methods (MLR, PCR, PLS). According to the error ratios (RMSECV values in Tables 5, 7 and 10), the ANN errors are about 4 times smaller than those of MLR and about 1.5 times smaller than those of PLS.

It should be noted that, besides accuracy, a calibration method is also characterized by its ease of use (comprehensibility of the main algorithms, availability of software, etc.) and by the volume of required calculations (computing capacity needed, model building time, etc.) [32]. With respect to these parameters, the methods can be arranged in the following order (see also Ref. [8]):

Computation time: MLR ≈ PCR ≈ PLS < Poly-PLS ≪ ANN < Spline-PLS
Ease of use: MLR ≈ PCR ≈ PLS ≈ Poly-PLS > Spline-PLS ≫ ANN

ANN training is more than 10^3 times more time consuming than building an MLR or PLS model. These data should also be considered when deciding between methods for processing spectral and reference data. Note that these costs are of no importance to the end user (e.g., a refinery laboratory): no real difference is observed in the calculation time of the finished models.

Fig. 2. Root mean squared error of cross-validation (RMSECV) of different calibration techniques. Models: MLR, multiple linear regression; PCR, principal component regression; PLS, linear partial least squares regression; Poly-PLS, polynomial partial least squares regression; Spline-PLS, spline partial least squares regression; ANN, artificial neural network. Properties: IB, initial boiling point; T10, end boiling point 10% v/v; T50, end boiling point 50% v/v; T90, end boiling point 90% v/v; FB, final boiling point. The RMSECV confidence interval for the ANN method is shown for the five best neural networks.

4. Conclusions

Having analyzed the results of applying six methods of calibration model building to a number of spectral data sets (near infrared spectra), one can draw the following conclusions:

1. Nonlinear methods proved superior to linear ones, which points to the «nonlinear» character of the investigated object (gasoline).
2. Neural networks turned out to be the most suitable method for building a «near infrared spectrum–gasoline property» calibration model. This confirms the necessity and promise of the neural network approach for processing near infrared spectroscopy data for complex «nonlinear» objects (e.g., gasolines).
3. An effective model for predicting gasoline properties from near infrared data is built. The time of one analysis is reduced to 3 min (in comparison with 60 min for traditional methods).

We hope that our results will help both further chemometric investigations and investigations in the field of near infrared spectroscopy of multicomponent systems.

Acknowledgements

Roman Balabin is grateful to the ITERA International Group of companies for a scholarship. The authors acknowledge Lumex Ltd. for supplying the NIR device.

References

[1] L. Xu, J.-H. Jiang, H.-L. Wu, G.-L. Shen, R.-Q. Yu, Chem. Intell. Lab. Syst. 85 (2007) 140–143.

[2] S. Wold, H. Martens, H. Wold, The multivariate calibration method in chemistry solved by the PLS method, in: A. Ruhe, B. Kågström (Eds.), Proceedings on the Conference on Matrix Pencils, Lecture Notes in Mathematics, Springer-Verlag, Heidelberg, 1983, pp. 286–293.
[3] S. Wold, J. Cheney, N. Kettaneh, C. McCready, Chem. Intell. Lab. Syst. 84 (2006) 159–163.
[4] C.L. Stork, B.R. Kowalski, Chem. Intell. Lab. Syst. 48 (1999) 151–166.
[5] S.K. Sengupta, S.C. Cheeseman, S.D. Brown, H.C. Foley, Ind. Eng. Chem. Res. 31 (1992) 2003–2010.
[6] F. Dousseau, M. Pézolet, Biochemistry 29 (1990) 8771–8779.
[7] L. Eriksson, J. Gottfries, E. Johansson, S. Wold, Chem. Intell. Lab. Syst. 73 (2004) 73–84.
[8] H. Yang, P.R. Griffiths, J.D. Tate, Anal. Chim. Acta 489 (2003) 125–136.
[9] V. Centner, J. Verdú-Andrés, B. Walczak, D. Jouan-Rimbaud, F. Despagne, L. Pasti, R. Poppi, D.-L. Massart, O.E. de Noord, Appl. Spectrosc. 54 (2000) 608–623.
[10] T. Næs, T. Isaksson, T. Fearn, T. Davies, A User-Friendly Guide to Multivariate Calibration and Classification, NIR Publications, Chichester, UK, 2002.
[11] E. Bertran, M. Blanco, S. Maspoch, M.C. Ortiz, M.S. Sánchez, L.A. Sarabia, Chem. Intell. Lab. Syst. 49 (1999) 215–224.
[12] H. Swierenga, A.P. de Weijer, R.J. van Wijk, L.M.C. Buydens, Chem. Intell. Lab. Syst. 49 (1999) 1–17.
[13] H. Tan, X. Su, W. Wei, S. Yao, Chem. Intell. Lab. Syst. 48 (1999) 71–80.
[14] P. Geladi, E. Dåbakk, J. Near Infrared Spectrosc. 3 (1995) 119–132.
[15] S. Sekulic, M.B. Seasholtz, Z. Wang, B.R. Kowalski, Anal. Chem. 65 (1993) 835–845.
[16] P. Geladi, L. Hadjiiski, P. Hopke, Chem. Intell. Lab. Syst. 47 (1999) 165–173.
[17] L. Hadjiiski, P. Geladi, P. Hopke, Chem. Intell. Lab. Syst. 49 (1999) 91–103.
[18] Y. Li, C.W. Brown, S.-C. Lo, J. Near Infrared Spectrosc. 7 (1999) 55–62.
[19] Near-IR Absorption Bands (Analytical Spectral Devices, Inc.); http://www.asdi.com/nir-chart_grid_rev-3.pdf. See also references therein.
[20] R.M. Balabin, R.Z. Safieva, E.I. Lomakina, Neural Comput. Appl., submitted for publication.
[21] R.M. Balabin, R.Z. Safieva, E.I. Lomakina, Chem. Intell. Lab. Syst., submitted for publication.
[22] Guide for Infrared Spectroscopy (Bruker Optics, Inc.); http://www.brukeroptics.com.
[23] Y. Ni, L. Wang, S. Kokot, Anal. Chim. Acta 439 (2001) 159–168.
[24] K. Bessant, S. Saini, J. Electroanal. Chem. 489 (2000) 76–83.
[25] M.T. Hagan, H.B. Demuth, M.H. Beale, Neural Network Design, PWS Publishing, Boston, MA, 1996.
[26] I.E. Frank, Chem. Intell. Lab. Syst. 8 (1990) 109–119.
[27] S. Wold, N. Kettaneh-Wold, B. Skagerberg, Chem. Intell. Lab. Syst. 7 (1989) 53–65.
[28] S. Wold, Chem. Intell. Lab. Syst. 14 (1992) 71–84.
[29] G. Kateman, Chem. Intell. Lab. Syst. 19 (1993) 135–142.
[30] K. Hornik, Neural Netw. 4 (1991) 251–257.
[31] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd edition, Prentice Hall, 1998.
[32] J.T. Kent, J.M. Bibby, K.V. Mardia, Multivariate Analysis (Probability and Mathematical Statistics), Elsevier, 2006.