
Wavelet neural network (WNN) approach for calibration model building based on gasoline near infrared (NIR) spectra

Roman M. Balabin a,⁎, Ravilya Z. Safieva a, Ekaterina I. Lomakina b

a Gubkin Russian State University of Oil and Gas, 119991 Moscow, Russia
b Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, 119992 Moscow, Russia

ARTICLE INFO

Article history:
Received 21 March 2007
Received in revised form 7 April 2008
Accepted 9 April 2008
Available online 16 April 2008

Keywords:
Artificial neural network (ANN)
Wavelet transform (WT)
Multilayer perceptron (MLP)
Wavelet neural network (WNN)
Near infrared (NIR) spectroscopy
Gasoline
Ethanol–gasoline fuel

ABSTRACT

In this paper we compare the abilities of two types of artificial neural networks (ANN), the multilayer perceptron (MLP) and the wavelet neural network (WNN), for the prediction of three gasoline properties (density, benzene content and ethanol content). Three sets of near infrared (NIR) spectra (285, 285 and 375 gasoline spectra) were used to build the calibration models. Cross-validation errors and structures of the optimized MLP and WNN were compared for each sample set. Four different transfer functions (the Morlet wavelet and the Gaussian derivative for the WNN; the logistic and hyperbolic tangent functions for the MLP) were also compared. The wavelet neural network was found to be more effective and robust than the multilayer perceptron.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

Near infrared (NIR) spectroscopy, being a relatively inexpensive means of data collection, gives many industrialists and academics the opportunity to increase the experimental complexity of their research, which in turn yields more accurate and precise information about their area of interest [1]. One possible field of application of near infrared spectroscopy is the petroleum industry [2–4]. NIR spectroscopy is especially useful in complicated real processes, where on-line measurement is important for monitoring the quality of products (e.g., in gasoline production) [5]. NIR spectrometers satisfy the requirements of users who want quantitative product information in real time, because the NIR instrument provides this information promptly and easily. Multivariate statistical calibration techniques (linear and nonlinear), which process enormous amounts of experimental data, have boosted the use of NIR instruments [5,6].

⁎ Corresponding author. E-mail address: [email protected] (R.M. Balabin).
0169-7439/$ – see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.chemolab.2008.04.003

Artificial neural networks (ANN) as tools for constructing relationships between spectra and the properties of chemical compounds and multicomponent mixtures have a long history. The field of ANN applications in chemistry is vast, as shown by the hundreds of published works [7,8]. The whole variety of neural network applications can be found in Ref. [9].

The wavelet neural network (WNN) is a feed-forward neural network based on the wavelet transform (WT) [10–12]. The essence of the WNN is to find a family of wavelets in the characteristic space so that the complex functional relationship contained in the original signal can be exactly expressed [10–12]. The network combines the advantages of the wavelet transform (in denoising, background reduction and recovery of characteristic information) with the "universal" approximation capacity of a neural network [10]. Wavelet neural networks have demonstrated remarkable results in the prediction, classification and modeling of different nonlinear signals [13–16]. Among the reports of WNN applications in chemistry, the modeling and prediction of chemical properties is the main theme, covering the complexation equilibria of organic compounds with α-cyclodextrins [17], the chromatographic retention times of naphthas [18], QSPR relationships for the critical micelle concentration of surfactants [19], and model building for multianalyte quantification in an overlapped-signal voltammetric application [13]. The application of WNN in chemical process control has also been reported [13,20].

In this paper a wavelet neural network (WNN) is built to predict properties of gasoline samples (3 data sets) from their NIR spectra (8000–14,000 cm−1; 714–1250 nm). The WNN prediction abilities are compared with those of an ordinary neural network, the multilayer perceptron (MLP). For both networks, the best training parameters are found.

2. Theoretical remarks

In this section basic remarks about the methods used in our study are presented; references to more detailed papers are also included.

2.1. Artificial neural network (ANN)

Artificial neural networks (ANN) are computing systems made up of a number of simple, highly interconnected processing elements (called nodes or artificial neurons) that abstractly emulate the structure and operation of the biological nervous system [13]. There are many different types and architectures of neural networks. Details about ANN construction and usage can be found in Ref. [9].

2.2. Wavelet transform (WT)

The WT is an important tool for the analysis and processing of nonstationary signals (whose spectral components vary in time) because it provides an alternative to the classical analysis performed with the short-time Fourier transform (STFT) [13,21]. The advantage of the WT over the STFT is its good localization in both the time and frequency domains. The main idea of wavelet theory is to represent an arbitrary signal by means of a family of functions that are scaled and translated versions of a single main function known as the mother wavelet. For a detailed analysis one can consult Refs. [13,22,23]; see also Ref. [10] for a description of the algorithms.

2.3. Wavelet neural network (WNN)

The wavelet neural network (WNN) is a network based on wavelet transforms [24]. The architecture of the WNN follows that of the multilayer perceptron (MLP) [9,10,24]; in the WNN, however, a discrete wavelet function is used as the node activation function. Because wavelet space is used as the characteristic space for pattern recognition, feature extraction from the signal is realized as a weighted sum of inner products of the wavelet basis and the signal vector. Furthermore, because the WNN combines the time-frequency localization of the wavelet transform with the self-learning of a neural network, it offers both good approximation capacity and robustness. In this paper, a wavelet neural network with one hidden layer was designed (Fig. 1). Its node activation function was based on two types of wavelet basis functions: the Morlet wavelet [10,25] and the first derivative of the Gaussian function ("Gaussian derivative") [13].

Fig. 1. Structure of wavelet neural network (WNN): [Ψ1; ΨNN] — wavelet functions; NN — number of neurons in the hidden (wavelet) layer.
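For illustration, the forward pass of such a one-hidden-layer network (in the PC-NN-1 notation of Table 6) can be sketched in a few lines. This is a minimal NumPy sketch under our own assumptions; the calculations in this work were performed in MATLAB, so all names below are illustrative, not the authors' code. Each hidden node applies the mother wavelet to its net input after translation by b_j and dilation by a_j, and the output neuron is linear.

    import numpy as np

    def morlet(x):
        # Morlet mother wavelet: psi(x) = cos(1.75 x) * exp(-x^2 / 2) (Refs. [10,25])
        return np.cos(1.75 * x) * np.exp(-0.5 * x ** 2)

    def gaussian_derivative(x):
        # First derivative of the Gaussian exp(-x^2 / 2), up to sign (Ref. [13])
        return -x * np.exp(-0.5 * x ** 2)

    def wnn_forward(x, W, a, b, v, bias, psi=morlet):
        """Forward pass of a PC-NN-1 wavelet neural network (sketch).

        x    : (PC,)    input vector (e.g., PCA scores of one NIR spectrum)
        W    : (NN, PC) input-to-hidden weights
        a, b : (NN,)    dilation and translation of each wavelet node
        v    : (NN,)    hidden-to-output weights; bias: output bias
        """
        z = (W @ x - b) / a              # translated and dilated net inputs
        return float(v @ psi(z) + bias)  # linear output neuron

Training adjusts W, a, b, v and bias by gradient descent on the mean squared error (Section 3.4); replacing psi with a logistic or hyperbolic tangent function recovers the MLP used for comparison.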

3. Experimental

3.1. Sample sets

Three sample sets (Sets A–C) are used in our work to evaluate the efficiency of the wavelet neural network as a method of calibration model building. Information about them can be found in Table 1. Two types of objects are used: pure gasoline and ethanol–gasoline fuel blends. Table 1 also lists the properties to be predicted and the distribution of the samples over them. A neural network (MLP) analysis of sample Set A was already reported in Refs. [26,29].

Table 1. Sample sets

                           Set A a       Set B              Set C b
Object                     Gasoline      Gasoline           Gasoline + ethanol
Number of samples          95            57                 75
Property for calibration   Density       Benzene content    Ethanol content
Unit                       kg/m3         % w/w              % w/w
Property range             640–800       0–10.0             0–15.0
Mean value                 712           3.8                6.0
Standard deviation         40            3.1                5.1
Reference method error     0.5           0.10–0.25          0.05 c

a See also Ref. [26]. b See also Refs. [28,30]. c Samples were prepared in our laboratory.

3.2. Apparatus and experimental parameters

The spectra were acquired with a near-IR FT spectrometer InfraLUM FT-10 (LUMEX, Russia) fitted with a special sampler for liquids (Table 2). The spectra were acquired at room temperature (20–23 °C). The experimental parameters (for all data sets) are listed in Table 3. The InfraLUM FT-10 (like most NIR spectrometers) has no thermostatic control. To compensate for this shortcoming, the background spectrum was taken before and after the measurement of each spectrum; the averaged background spectrum was then subtracted from the sample spectrum. This allowed an analytical signal of satisfactory accuracy and precision to be obtained. The instrument calibration (wavelength and signal stability) was performed using four pure hydrocarbons (toluene, hexane, benzene and isooctane).

Table 2. NIR device (InfraLUM FT-10) parameters

Spectral range          14,500–7500 cm−1
Resolution              1–16 cm−1
Wave number accuracy    0.01 cm−1
Photometric accuracy    0.1%
Radiation source        Halogen incandescent lamp
Detector                Silicon photodiode
Size and weight         580 × 515 × 295 mm; 37 kg

Table 3. Experimental parameters

                                      Unit    Set A         Set B         Set C
Spectral range                        cm−1    14,000–8000   13,500–8500   13,500–8500
Sample volume                         cm3     100           100           100
Optical path                          cm      10            10            10
Spectral resolution                   cm−1    16            8             8
Number of scans                       –       64            72            64
Cell material                         –       Quartz        Quartz        Quartz
Time of one measurement               min     2–3           3–4           3–4
Number of measurements (per sample)   –       3             5             5



3.3. Software and computing

All network calculations were performed using standard MATLAB 6.5 software, as well as originally developed software (see also Refs. [26–30]). The MATLAB files are available by contacting the leading author. A personal computer (Intel Pentium III, 3.0 GHz, 512 Mb RAM) was used in our studies. No spectra preprocessing (except PCA) was applied.

3.4. Neural network training parameters

The training parameters for the MLP and WNN are listed in Table 4. The simplest training algorithm (with a constant learning rate) was applied, even though its ANN training time is not the lowest [9]. One should note that an "epoch" is one step of the learning procedure, while an "iteration" is a separate learning run with different initial parameters. Different network architectures (numbers of input and hidden neurons) were tested to find the best one (Table 4). In all cases principal component analysis (PCA) [33] was used for spectra reduction, so the number of input neurons is equal to the number of principal components (PC) (see Table 4).

Table 4. Training parameters of the neural networks — multilayer perceptron (MLP) and wavelet neural network (WNN) — for the different NIR data sets

                                                 Set A                      Set B                      Set C
Algorithm                                        Constant learning rate     Constant learning rate     Constant learning rate
Minimised error function                         Mean squared error (MSE)   Mean squared error (MSE)   Mean squared error (MSE)
Learning                                         Supervised                 Supervised                 Supervised
Initialisation method                            Random                     Random                     Random
Transfer functions                               Morlet wavelet; Gaussian derivative a; logistic; hyperbolic tangent (all sets)
Learning rate                                    0.01                       0.025                      0.01
Maximal number of epochs                         50,000                     30,000                     25,000
Early stopping                                   5-fold cross-validation    5-fold cross-validation    5-fold cross-validation
Number of training iterations                    150                        200                        175
Number of input neurons (principal components)   1–25                       1–20                       1–20
Number of hidden neurons                         1–25                       1–25                       1–25

a The first derivative of the Gaussian function.
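To make the vocabulary of Table 4 concrete, the training scheme can be sketched as follows. This is a hedged Python sketch, not the authors' MATLAB implementation: loss_fn, grad_fn and init_fn are assumed user-supplied callables, and the patience threshold is our assumption (the paper states only that 5-fold cross-validation was used for early stopping).

    import numpy as np

    def train_constant_lr(loss_fn, grad_fn, init_fn, lr=0.01, max_epochs=50_000,
                          n_iterations=150, patience=500, seed=0):
        """Gradient descent with a constant learning rate (Set A values of Table 4).

        loss_fn(theta) : validation MSE of parameter vector theta
        grad_fn(theta) : gradient of the training MSE at theta
        init_fn(rng)   : a fresh random parameter vector
        """
        rng = np.random.default_rng(seed)
        best_theta, best_val = None, np.inf
        for _ in range(n_iterations):              # independent learning runs
            theta = init_fn(rng)                   # random initialisation
            run_theta, run_best, wait = theta.copy(), np.inf, 0
            for _ in range(max_epochs):            # one epoch = one update step
                theta = theta - lr * grad_fn(theta)
                val = loss_fn(theta)
                if val < run_best:
                    run_theta, run_best, wait = theta.copy(), val, 0
                else:
                    wait += 1
                    if wait > patience:            # early stopping on validation error
                        break
            if run_best < best_val:                # keep the best run overall
                best_theta, best_val = run_theta, run_best
        return best_theta, best_val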

3.5. Model efficiency estimation

To characterize the prediction ability (efficiency) of a created model, the root mean squared error of cross-validation (RMSECV) was used:

RMSECV = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} (t_i - y_i)^2 }     (1)

where t_i and y_i refer to the actual and predicted values for sample i, respectively, and N is the number of samples in the validation set. 5-fold cross-validation was used to evaluate model efficiency.
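A minimal sketch of Eq. (1) combined with 5-fold cross-validation is given below; fit and predict stand for any model trainer and predictor (e.g., the WNN or MLP sketches of Section 2) and are assumed names, not the paper's software interface.

    import numpy as np

    def rmsecv(X, t, fit, predict, k=5, seed=0):
        """Eq. (1) evaluated by k-fold cross-validation (k = 5 in this work).

        X : (N, PC) array of inputs; t : (N,) array of reference values.
        fit(X_train, t_train) -> model; predict(model, X_val) -> predictions.
        """
        rng = np.random.default_rng(seed)
        folds = np.array_split(rng.permutation(len(X)), k)
        sq_err = []
        for i, val in enumerate(folds):
            train = np.concatenate([f for j, f in enumerate(folds) if j != i])
            model = fit(X[train], t[train])        # train on the other k-1 folds
            sq_err.append((t[val] - predict(model, X[val])) ** 2)
        return float(np.sqrt(np.mean(np.concatenate(sq_err))))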

4. Results and discussion

To evaluate the prediction capabilities of the different ANN types (MLP and WNN), their best architectures and effective training parameters should be found.

4.1. Number of principal components (PC) optimization

In the case of NIR spectroscopy, PCA is an important step in the creation of an effective calibration model. It leads to data reduction (from hundreds of variables to tens) and noise removal [31–33]. The first ability is very important for ANN building because the number of weights should not exceed the number of training samples (usually tens to hundreds). We have tried different numbers of principal components (PC) to find the best one; examples are given in Fig. 2a–b. The PC value that leads to the lowest cross-validation error is chosen as the optimal one.

Fig. 2. a–b. Dependence of the root mean squared error of cross-validation (RMSECV) on the number of neurons in the input layer (number of principal components). (a) Set B: WNN; NN = 3. (b) Set B: WNN; NN = 5.
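This optimization loop (and, with the obvious change of variable, the hidden-neuron scan of Section 4.2) amounts to a one-dimensional grid search. A sketch follows, assuming scikit-learn's PCA for spectra reduction and reusing the rmsecv helper above; the 1–25 range is taken from Table 4.

    from sklearn.decomposition import PCA

    def best_pc_count(spectra, t, fit, predict, pc_range=range(1, 26)):
        """Scan the number of principal components; pick the RMSECV minimum (Fig. 2)."""
        errors = {}
        for n_pc in pc_range:
            scores = PCA(n_components=n_pc).fit_transform(spectra)  # data reduction
            errors[n_pc] = rmsecv(scores, t, fit, predict)          # 5-fold, Eq. (1)
        return min(errors, key=errors.get), errors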


4.2. Number of hidden neurons (NN) optimization

According to the theory, the number of nodes in the hidden layer (NN) of the wavelet neural network is equal to the number of wavelet basis functions [10]. If this number is too small, the WNN may not reflect the complex functional relationship between the input data and the output value. On the contrary, too large a number may create such a complex network that a very large output error results from over-fitting of the training sample set. In this study we have checked various NN values to find the best one; examples are given in Fig. 3a–b. The NN value that leads to the lowest cross-validation error is chosen as the optimal one.

Fig. 3. a–b. Dependence of the root mean squared error of cross-validation (RMSECV) on the number of neurons in the hidden layer. (a) Set C: MLP; PC = 8. (b) Set C: MLP; PC = 10.

4.3. Efficiency of models

Table 5 shows the errors (calibration and cross-validation) for each data set and ANN type. From Table 5 it is evident that the WNN is more effective than the MLP for gasoline model building. One should note that the differences between transfer functions (within one ANN class) are not of great importance (±2%). The best WNN results are for Set C (ethanol–gasoline fuel): the RMSECV is 19% lower than for the MLP. For Set B (gasoline; benzene content) the WNN shows results quite similar to those of the MLP: the RMSECV is just 1% lower. The prediction results of both networks are acceptable in comparison with the reference method errors (see Table 1).

Table 5. Comparison of training results of the multilayer perceptron (MLP) and wavelet neural network (WNN)

                              RMSEC                                             RMSECV
                              WNN                    MLP                        WNN                    MLP
      Property                Morlet  Gauss. der. a  Logistic  Hyp. tangent    Morlet  Gauss. der.    Logistic  Hyp. tangent
Set A Density; kg/m3          1.6     1.6            1.5       1.7             1.8     1.8            2.1       2.1
Set B Benzene cont.; % w/w    0.40    0.54           0.43      0.42            0.58    0.59           0.58      0.60
Set C Ethanol cont.; % w/w    0.10    0.07           0.05      0.09            0.11    0.11           0.13      0.14

a The first derivative of the Gaussian function ("Gaussian derivative").

4.4. Architectures of networks

Table 6 presents the best architectures of the WNN and MLP for each data set. One can see that the number of input neurons (or PC) is mostly greater for the WNN. The situation is different for the number of hidden neurons (NN): for the WNN this number is mostly lower than for the MLP. The first fact could mean that the WNN extracts more information from the NIR spectra. The second fact means that the wavelet neural network is able to reproduce the non-linearity of NIR spectra with a less complex network structure, so the wavelet functions can more easily express the "spectra-property" relation. In all cases (Sets A–C) the total number of weights is lower for the wavelet neural network, so the WNN "samples/weights" ratio is greater. This fact makes the WNN more robust than the MLP.

The ANN architecture can also be regarded as a measure of object complexity. One can say that density (a function of the whole hydrocarbon content) is a "complex" property: more than one hundred free parameters are needed to create an effective model. Benzene content (the content of one hydrocarbon) is a "less complex" property: it can be estimated using seventy to eighty weights. Ethanol content (the content of a non-hydrocarbon) can be predicted with an ANN of approximately fifty weights. Of course, one should not take the word "complex" literally.

Table 6. Comparison of the multilayer perceptron (MLP) and wavelet neural network (WNN) optimal architectures a

        WNN            MLP
Set A   16-5-1 (113)   16-6-1 (129)
Set B   14-3-1 (71)    12-5-1 (85)
Set C   10-3-1 (51)    8-5-1 (57)

a The network architecture is given as PC-NN-1 (W), where PC is the number of input neurons (principal components), NN is the number of hidden neurons, and W = PC × (NN + 2) + 1 is the number of weights in the optimal network.
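As a quick arithmetic check of the footnote formula, the weight counts of Table 6 can be reproduced directly; the snippet below is purely illustrative.

    def n_weights(pc, nn):
        # W = PC * (NN + 2) + 1, the weight count from the footnote to Table 6
        return pc * (nn + 2) + 1

    # WNN: 16-5-1, 14-3-1, 10-3-1 -> 113, 71, 51 weights
    assert [n_weights(16, 5), n_weights(14, 3), n_weights(10, 3)] == [113, 71, 51]
    # MLP: 16-6-1, 12-5-1, 8-5-1 -> 129, 85, 57 weights
    assert [n_weights(16, 6), n_weights(12, 5), n_weights(8, 5)] == [129, 85, 57]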


Because the optimal structures of the MLP and WNN differ greatly, comparing the two networks at the same architecture is not the best approach. It is more appropriate to compare the lowest errors that each type of ANN can achieve.

5. Conclusions

We have described a wavelet neural network application aimed at the quantitative determination of the concentrations of two gasoline components (benzene and ethanol) and one gasoline property (density) from information obtained from near infrared spectra (8000–14,000 cm−1). We have found that the WNN, as a multivariate modeling tool for NIR data, performs better than an ordinary MLP network. The wavelet transform as a transfer function has proven its ability to capture the essential features of a NIR signal. One should note that the accuracy and stability of the WNN could be further improved by the implementation of other wavelet functions.

References

[1] D. Donald, D. Coomans, Y. Everingham, D. Cozzolino, M. Gishen, T. Hancock, Chemom. Intell. Lab. Syst. 82 (2006) 122.
[2] N. Pasadakis, V. Gaganis, C. Foteinopoulos, Fuel Process. Technol. 87 (2006) 505–509.
[3] K. Brudzewski, A. Kesik, K. Kołodziejczyk, U. Zborowska, J. Ulaczyk, Fuel 85 (2006) 553.
[4] K. Brudzewski, S. Osowski, T. Markiewicz, J. Ulaczyk, Sens. Actuators B: Chem. 113 (2006) 135.
[5] J. Yoon, B. Lee, C. Han, Chemom. Intell. Lab. Syst. 64 (2002) 1.

[6] D.A. Burns, E.W. Ciurczak, Handbook of Near-Infrared Analysis, CRC Press, 2001.
[7] V. Tchistiakov, C. Ruckebusch, L. Duponchel, J.-P. Huvenne, P. Legrand, Chemom. Intell. Lab. Syst. 54 (2000) 93.
[8] V. Kvasnicka, J. Pospichal, J. Mol. Struct.: Theochem 235 (1991) 227.
[9] S. Haykin, Neural Networks: A Comprehensive Foundation, Macmillan College Publishing Company, NY, 1994.
[10] H. Zhong, J. Zhang, M. Gao, J. Zheng, G. Li, L. Chen, Chemom. Intell. Lab. Syst. 59 (2001) 67.
[11] I. Daubechies, IEEE Trans. Inf. Theory 36 (5) (1990) 961.
[12] I. Daubechies, Commun. Pure Appl. Math. 41 (1988) 909.
[13] A. Gutés, F. Céspedes, R. Cartas, S. Alegret, M. del Valle, J.M. Gutierrez, R. Muñoz, Chemom. Intell. Lab. Syst. 83 (2006) 169.
[14] B.R. Bakshi, G. Stephanopoulos, AIChE J. 39 (1993) 57–81.
[15] H. Szu, B. Telfer, J. Garcia, Neural Netw. 9 (1996) 695–708.
[16] L. Cao, Y. Hong, H. Fang, G. He, Physica D 85 (1995) 225–238.
[17] Q.-X. Guo, L. Liu, W.-S. Cai, Y. Jiang, Y.-C. Liu, Chem. Phys. Lett. 290 (1998) 514–518.
[18] X. Zhang, J. Qi, R. Zhang, M. Liu, Z. Hu, H. Xue, B. Fan, Comput. Chem. 25 (2001) 125–133.
[19] Z. Kardanpour, B. Hemmateenejad, T. Khayamian, Anal. Chim. Acta 531 (2005) 285–291.
[20] J. Zhao, B. Chen, J. Shen, Comput. Chem. Eng. 23 (1998) 83–92.
[21] O. Rioul, M. Vetterli, IEEE Signal Process. Mag. 8 (1991) 14–38.
[22] G. Kaiser, A Friendly Guide to Wavelets, Birkhäuser, Boston, MA, 1994, p. 300.
[23] B. Walczak (Ed.), Wavelets in Chemistry, Elsevier, Amsterdam, 2000.
[24] Y.C. Pati, P.S. Krishnaprasad, IEEE Trans. Neural Netw. 4 (1993) 73–85.
[25] A. Subasi, A. Alkan, E. Koklukaya, M.K. Kiymik, Neural Netw. 18 (2005) 985.
[26] R.M. Balabin, R.Z. Safieva, E.I. Lomakina, Chemom. Intell. Lab. Syst. 88 (2007) 183.
[27] R.M. Balabin, R.Z. Safieva, J. Near Infrared Spectrosc. 15 (2007) 343.
[28] R.M. Balabin, R.Z. Syunyaev, S.A. Karpov, Fuel 86 (2007) 323.
[29] R.M. Balabin, R.Z. Syunyaev, J. Colloid Interface Sci. 318 (2008) 167.
[30] R.M. Balabin, R.Z. Syunyaev, S.A. Karpov, Energy Fuels 21 (2007) 2460.
[31] R.M. Balabin, R.Z. Safieva, Fuel 87 (2008) 1096.
[32] R.M. Balabin, R.Z. Safieva, Fuel 87 (2008), in press, doi:10.1016/j.fuel.2008.02.014.
[33] S. Wold, K. Esbensen, P. Geladi, Chemom. Intell. Lab. Syst. 2 (1987) 37.