Partial least squares and artificial neural networks modeling for predicting chlorophenol removal from aqueous solution

Chemometrics and Intelligent Laboratory Systems 99 (2009) 150–160

Kunwar P. Singh ⁎, Priyanka Ojha, Amrita Malik, Gunja Jain

Environmental Chemistry Division, Indian Institute of Toxicology Research, Council of Scientific & Industrial Research, Post Box 80, MG Marg, Lucknow-226 001, India

Article info

Article history: Received 17 March 2009; received in revised form 8 September 2009; accepted 22 September 2009; available online 2 October 2009

Keywords: Partial least squares; Radial basis function partial least squares; Artificial neural network; Feed-forward; Back-propagation; Removal efficiency

Abstract

Linear and nonlinear partial least squares (PLS) regression and three-layer feed-forward artificial neural network (ANN) models were constructed to predict the removal efficiency (RE%) of coconut fibers carbon (FC) for 2-chlorophenol (2-CP) from aqueous solutions, based on 800 experimental sets obtained in a laboratory batch study. The effects of the operational variables (pH, initial concentration of the adsorbate, contact time and operating temperature) were studied to optimize the conditions for maximum removal of 2-CP from water. The root mean square error of prediction (RMSEP), relative error of prediction (REP), coefficient of determination (R²), Nash–Sutcliffe coefficient of efficiency (Ef), and the accuracy factor (Af) were used as the modeling performance criteria. All three models predicted the removal efficiency of the studied adsorbate–adsorbent system satisfactorily. In prediction, the linear PLS, nonlinear PLS and ANN models yielded REP values of 10.19, 9.88 and 7.98, respectively. The correlation coefficient between the model-predicted and experimental values of the removal efficiency was 0.87, 0.88 and 0.96 for the linear PLS, nonlinear PLS and ANN models, respectively. The nonlinear PLS and ANN models performed relatively better than the linear PLS model because they can capture the non-linear relationships among the variables. All three models can be employed for predicting the adsorption capacity. © 2009 Elsevier B.V. All rights reserved.

1. Introduction Phenols and their substituted products are generally present in the effluents of coal tar, gasoline, plastic, rubber–proofing, disinfectant, pulp and paper, pharmaceutical and steel industries, domestic wastewaters, agricultural runoff and chemical spillage [1]. Disposal of such effluents on to the open land and into the surface water bodies in long run may contaminate these resources as well as the groundwater aquifers [2]. Some chlorinated organics are formed during the chlorination of water supplies [3]. Chlorophenols are of major concern due to their widespread contamination of soil and potable groundwater supplies and their harmful effects on man and animals [4]. Repeated exposure to low levels of phenol in water may cause liver damage, diarrhea, mouth ulcers, dark urine and hemolytic anemia. Phenols have been registered as priority pollutants by the US Environmental Protection Agency (USEPA) with a permissible limit of 0.1 mg L− 1 in wastewater [5]. The Bureau of Indian Standards (BIS)

☆ The work described in this manuscript has not been submitted elsewhere for publication, in whole or in part, and all the authors listed have approved the manuscript that is enclosed. ⁎ Corresponding author. Tel.: +91 522 2476091, +91 522 2436077; fax: +91 522 2628227. E-mail addresses: [email protected], [email protected] (K.P. Singh), [email protected] (A. Malik), [email protected] (G. Jain). 0169-7439/$ – see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.chemolab.2009.09.004

has prescribed a permissible limit of 1.0 µg L⁻¹ for phenol in drinking water [6]. Therefore, removal of phenols and their substituted products from water/wastewater is necessary. Although various physical [7–9], chemical [10], and biological [11,12] methods have been reported for the treatment of water/wastewater containing phenolic compounds, adsorption methods have been widely investigated because they offer an efficient and economically feasible technology for their removal from water/wastewater [13]. Adsorption studies are usually performed in batch and column modes. The batch studies are aimed at determining the kinetics and isotherm constants, while the column studies are performed for determining the break-through curves. The removal efficiency (RE) of the adsorbent for a particular adsorbate is the most significant output of the adsorption studies with respect to water/wastewater treatment. The variation of RE for an adsorbate–adsorbent system depends on several factors falling under both the dimensional and the dimensionless groups of variables. Promulgation of stringent regulations by the regulatory agencies for the control of contaminants in water/wastewater has compelled optimum management of water/wastewater and, hence, an understanding of new concepts involving efficient operation and design development. The nature of the sorption process largely depends on the physical and/or chemical characteristics of the adsorbent systems and also on the system conditions. The adsorption process is highly complex due to the interaction of a number of

K.P. Singh et al. / Chemometrics and Intelligent Laboratory Systems 99 (2009) 150–160

process variables, and is, thus, difficult to model and simulate using the conventional mathematical modeling approaches [14]. The adsorption processes irrespective of the adsorbate–adsorbent are usually modeled using the mechanistic or empirical kinetic models. Although, the kinetic models, such as external mass transfer model [15], intra-particle diffusion model [16], Lagergren first-order kinetics [17], and pseudo-second order kinetics [18,19] are excellent in representing the kinetics of sorption process, they can fit to the sorption data obtained under a particular operating condition. Moreover, none of the kinetic or empirical models could relate the uptake of adsorbate by the adsorbent with the operating variables, such as, particle size of the adsorbent, the initial adsorbate concentration, initial pH of the solution, agitation speed, contact time, and operating temperature [14]. Therefore, a high quality representative model will be needed to provide favorable solution in the process control and help to explain the real process performance with a view to develop a continuous control strategy for such decontamination technologies. The partial least squares (PLS) regression and artificial neural networks (ANNs) modeling approaches have the ability to relate the input and output variables without having any pre-knowledge on physics of the system provided an accurate and large amount of data on the system variables is available. The PLS method is a multivariate regression method that projects the input–output data down in to a latent space, extracting a number of principal factors with an orthogonal structure, while capturing most of the variance in the original data. PLS derives its usefulness from its ability to analyze data with strongly collinear, noisy and numerous variables in the predictor matrix X and responses Y [20,21]. 
Both the linear and non-linear PLS methods have successfully been applied to predict the dependent variable(s) through modeling the input–output relationship in the data [22,23]. The artificial neural networks (ANNs) due to their simplicity towards simulation, and prediction, and capability to capture the non-linear relationships existing between variables in complex systems have emerged as a promising modeling tool during the last few decades. The ANN approach has several advantages over traditional phenomenological or semi-empirical models, since they require known input data set without any assumptions [24]. The ANN develops a mapping of the input and output variables, which can subsequently be used to predict desired output as a function of suitable inputs [25]. A multi-layer neural network can approximate any smooth, measurable function between input and output vectors by selecting a suitable set of connecting weights and transfer functions [24]. ANN models could describe adsorption isotherms better than general rate models [26,27]. Hence, it is preferable to use a non-parametric technique such as feed-forward back propagation neural network modeling to represent such an equilibrium relationship [28]. In recent years, ANNs have found successful applications to the adsorption processes [29–32]. The present study describes the removal potential of the adsorbent derived from the coconut fibers for 2-chlorophenol (2-CP) from the aqueous solutions using the two different modeling approaches. On the basis of the batch adsorption experiments, we applied both the PLS and the ANN models to predict the 2-CP removal efficiency of the derived adsorbent in this study. The predicted results by the two modeling approaches (PLS and ANN) were discussed and compared with those obtained through the experiments.

2. Materials and methods

2.1. Adsorption study

2.1.1. Preparation of the adsorbent and adsorbate solutions

In this study, activated carbon derived from coconut fibers was assessed for its potential to remove 2-chlorophenol (2-CP) from aqueous solutions. Details of the methods employed for preparation and characterization of the adsorbent are available elsewhere [33]. In brief, coconut fibers carbon (FC) was prepared by treating one part of the coconut fibers with two parts (by weight) of concentrated sulfuric acid (36 N) and keeping the mixture in an oven maintained at 150–165 °C for a period of 24 h. The carbonized material, after washing well with double distilled water, was dried at 105–110 °C for 24 h. The activation was carried out under closely controlled conditions to obtain the optimum properties. The product so obtained was sieved to the desired particle sizes, and the carbon having 30–200 mesh size particles was used in the sorption study. The prepared adsorbent had the following chemical composition: ash content 7.22%; carbon (C) 76.38%; nitrogen (N) 0.38%; and hydrogen (H) 1.95%. The pH and the surface (BET) area of the adsorbent were 5.80 and 512 m² g⁻¹, respectively. The surface morphology of the adsorbent particles was characterized by scanning electron microscopy (SEM). The SEM image (1500×) of the adsorbent (Fig. 1) shows that the FC used in this study is irregular in shape and porous. Various forms of different constituents in the adsorbent (FC) were identified with the help of IR spectra (figure not shown for brevity). The IR spectrum of FC indicated weak and broad peaks in the region of 3812–763 cm⁻¹. Approximate FT-IR band assignment indicated the presence of carbonyls, carboxyls, lactones, and phenols. The 1800–1540 cm⁻¹ band is associated with the C=O stretching mode in carbonyls, carboxylic acids, and lactones, while the 1440–1000 cm⁻¹ band was assigned to the C–O stretching and O–H bending modes, as in phenols and carboxylic acids. The assignment of a specific wave number to a given functional group was not possible because the absorption bands of various functional groups overlap and shift, depending on their molecular structure and environment. Shifts in absorption positions may be caused by factors such as intra-molecular and intermolecular hydrogen bonding, steric effects, and degree of conjugation. For instance, within the given range, the position of the C=O stretching band (common to carbonyls, carboxylic acids, and lactones) is determined by many factors. These include the physical state, electronic and mass effects of neighboring substituents, conjugation, hydrogen bonding, and ring strain. The IR absorption bands of oxygen groups on the surface of the activated carbons are likely to be affected by some or all of the factors listed above. Although some inferences can be made about the surface functional groups from the IR spectra, the weak and broad bands do not provide any authentic information about the nature of the surface oxides.

A stock solution of 5 × 10⁻³ M of 2-CP was prepared by dissolving analytical grade chlorophenol (Merck) in double distilled water. Through subsequent dilutions, solutions of 5 × 10⁻⁵, 1 × 10⁻⁴, 5 × 10⁻⁴, and 1 × 10⁻³ M of 2-CP were prepared for the batch sorption studies. The pH of the test solutions was adjusted using dilute HCl (0.1 N) and NaOH


Fig. 1. Scanning electron micrographs (SEM) of the adsorbent (FC) at 1500×.


(0.1 N). The pH measurements were made using a pH meter (model 744, Metrohm).

2.1.2. Variables affecting the adsorption

The adsorbent dose, agitation rate, particle size (surface area) of the adsorbent, initial concentration of the adsorbate, initial pH of the solution, contact time, and operating temperature are the key factors which influence the adsorption process [14,32]. Among these, the effects of the last four variables on the adsorption process were investigated here. Subsequently, the experimental design for the adsorption of 2-CP using the adsorbent (FC) was developed by assigning four different experimental modes, keeping the first three variables (adsorbent dose, agitation rate, and particle size) constant throughout the experiments. The four modes of the experimental design were:

• Mode 1: The pH, initial concentration of 2-CP, and operating temperature were kept constant and the contact time was varied (1, 2, 4, 6, 18, 20, 22, 24, 26, and 28 h).
• Mode 2: Keeping the initial concentration, pH, and contact time constant, the operating temperature was varied (15 °C, 20 °C, 25 °C, 30 °C, and 40 °C).
• Mode 3: The initial concentration of 2-CP, contact time, and operating temperature were kept constant, while the pH was varied (2, 3, 4, and 6).
• Mode 4: The initial concentration of the adsorbate (5 × 10⁻⁵, 1 × 10⁻⁴, 5 × 10⁻⁴, and 1 × 10⁻³ M) was varied, keeping the other three variables (pH, contact time, operating temperature) constant.

Since there were four process variables in the designed experiments (Table 1), their factorial design yielded a total of 800 experimental data points and, subsequently, an equal number of adsorption efficiency values for the adsorbate (2-CP).

2.1.3. Batch adsorption studies

Batch adsorption experiments were conducted to determine the effect of the initial concentration of the adsorbate (2-CP), initial pH, operating temperature, and contact time on the adsorption performance of the carbon (FC). Throughout the experiments a single constant dose of the adsorbent (2 g L⁻¹) was maintained. Accordingly, 100 mg of the adsorbent was introduced into 100 mL stoppered Erlenmeyer conical flasks containing 50 mL of adsorbate (2-CP) solution at 5 × 10⁻⁵, 1 × 10⁻⁴, 5 × 10⁻⁴, or 1 × 10⁻³ M and maintained at four different pH values (2–6). The flasks were then placed in a thermo-controlled water bath shaker (model MSW 275) and agitated up to a total contact time of 28 h at a fixed agitation speed. Blank samples without adsorbent were run under the same conditions. Independent samples were withdrawn at ten predetermined time intervals (1, 2, 4, 6, 18, 20, 22, 24, 26, and 28 h) and processed for analysis of the adsorbate concentration left in the solution phase. Adsorption studies were performed at five different temperatures (15 °C, 20 °C, 25 °C, 30 °C, and 40 °C) and at optimum pH to obtain data on the extent of sorption. For determining the concentration of unadsorbed amount of the adsorbate, the solution

Table 1
Range of the operating variables taken in batch adsorption experiments.

Operating variable                           Range
Initial adsorbate concentration (mol L⁻¹)    5 × 10⁻⁵, 1 × 10⁻⁴, 5 × 10⁻⁴, and 1 × 10⁻³
Initial pH                                   2, 3, 4, and 6
Operating temperature (°C)                   15, 20, 25, 30, and 40
Contact time (h)                             1, 2, 4, 6, 18, 20, 22, 24, 26, and 28
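The count of 800 runs follows directly from crossing the levels in Table 1. The short sketch below enumerates the full factorial design; the variable names are illustrative and do not come from the paper.

```python
from itertools import product

# Levels of the four operating variables, as listed in Table 1.
concentrations_mol_per_l = [5e-5, 1e-4, 5e-4, 1e-3]
ph_levels = [2, 3, 4, 6]
temperatures_c = [15, 20, 25, 30, 40]
contact_times_h = [1, 2, 4, 6, 18, 20, 22, 24, 26, 28]

# Full factorial design: every combination of the four variables is one run.
design = list(product(concentrations_mol_per_l, ph_levels,
                      temperatures_c, contact_times_h))

print(len(design))  # 4 * 4 * 5 * 10 = 800
```

Each tuple in `design` corresponds to one row of the 800 × 4 predictor matrix used later for modeling.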

was filtered and 2-CP concentrations were determined spectrophotometrically at the corresponding λmax (273 nm). Absorbance measurements were made on a UV–visible spectrophotometer (model GBC Cintra 40). The spectrophotometer response time was 0.1 s and the instrument had a resolution of 0.1 nm. Absorbance values were recorded at the wavelength of maximum absorbance of 2-CP, i.e. 273 nm (λmax). The absorbance was measured with a 1-cm path-length cell, with an accuracy of ±0.004. Absorbance was found to vary linearly in the concentration range of 10⁻⁴–10⁻³ M. Here, the percentage of 2-CP removal was considered as a measure of the adsorption efficiency of the adsorbent. Percentage removal (RE%) of 2-CP from the aqueous solutions by the adsorbent was calculated as

$$\mathrm{RE}(\%) = \frac{C_0 - C_e}{C_0} \times 100 \qquad (1)$$

where $C_0$ and $C_e$ are the initial and the equilibrium concentrations of 2-CP in solution, respectively. In order to account for the effect of the initial concentration of the adsorbate ($C_0$), initial pH of the solution (pH), operating temperature (T), and contact time (t) between the adsorbate and the adsorbent, a total of 800 batch experiments were performed with a single dose of the adsorbent (FC).

3. Modeling

3.1. Partial least squares (PLS) modeling

Partial least squares (PLS) regression, a multivariate calibration technique, aims to find the relationship between a set of predictor (independent) data, X (m × n), and a set of responses (dependent), Y (m × l). Here, n and l are the numbers of independent and dependent variables, respectively, and m is the number of observation vectors. It differs from the multiple linear regression (MLR) technique mainly in that PLS is able to give stable predictions even when X contains highly correlated variables. Both the linear and non-linear PLS regression methods were applied here.

3.1.1. Linear PLS modeling

A detailed description of the PLS method and its algorithms can be found elsewhere [34]; in brief, it can be expressed as a bilinear decomposition of both X and Y as [22,35]

$$X = T W^{T} + E_X \qquad (2)$$

$$Y = U Q^{T} + E_Y \qquad (3)$$

such that the scores in X and the scores of the yet unexplained part of Y have maximum covariance. Here, T and W are the X scores and loadings (weights), U and Q are the Y scores and loadings, and $E_X$ and $E_Y$ are the X and Y residuals, respectively. The decomposition models of X and Y and the expression relating these models through regression constitute the linear PLS regression model. In the case of one Y-variable, y, the model (PLS1) can be expressed as a regression equation (y = Xb + E), where b is the vector of regression coefficients [22,35]. PLS modeling is performed in two stages: a set of calibration (training) samples is first used to construct the model, from which a set of regression coefficients ($b_{PLS}$) is computed. These coefficients are then used to predict the dependent variable ($y_{new}$) for a new (test) experimental set as

$$y_{new} = X_{new}\, b_{PLS} + E \qquad (4)$$

Here, the bPLS vector is derived from the model parameters. Here, we have used the linear PLS1 model to analyze our data set.
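As an illustration of the two-stage procedure (calibrate, then predict), the following is a minimal pure-Python sketch of PLS1 using the common NIPALS deflation scheme on mean-centred data. The function names `pls1_fit` and `pls1_predict` and the toy data are assumptions made for this example; the paper's own computations were carried out in MATLAB and are not reproduced here.

```python
def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS with deflation on mean-centred data (sketch)."""
    m, n = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / m for j in range(n)]
    y_mean = sum(y) / m
    Xc = [[row[j] - x_mean[j] for j in range(n)] for row in X]
    yc = [v - y_mean for v in y]
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X with maximum covariance with y.
        w = [sum(Xc[i][j] * yc[i] for i in range(m)) for j in range(n)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:
            break  # nothing left in y for X to explain
        w = [v / norm for v in w]
        t = [sum(Xc[i][j] * w[j] for j in range(n)) for i in range(m)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(Xc[i][j] * t[i] for i in range(m)) / tt for j in range(n)]  # X loadings
        q = sum(yc[i] * t[i] for i in range(m)) / tt  # y loading
        # Deflate X and y before extracting the next latent variable.
        Xc = [[Xc[i][j] - t[i] * p[j] for j in range(n)] for i in range(m)]
        yc = [yc[i] - q * t[i] for i in range(m)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def pls1_predict(model, x_new):
    """Predict y for one new observation by replaying the deflation steps."""
    x_mean, y_mean, components = model
    x = [x_new[j] - x_mean[j] for j in range(len(x_new))]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(xj * wj for xj, wj in zip(x, w))
        y_hat += q * t
        x = [xj - t * pj for xj, pj in zip(x, p)]
    return y_hat


# Toy calibration data (made up for illustration): y = 2*x1 + 3*x2 + 1.
X_cal = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 7]]
y_cal = [2 * a + 3 * b + 1 for a, b in X_cal]
model = pls1_fit(X_cal, y_cal, n_components=2)
prediction = pls1_predict(model, [6, 5])
```

With as many latent variables as X has independent columns, PLS1 coincides with ordinary least squares, so the toy prediction recovers the exact linear relation. In practice fewer latent variables are retained, and their number is chosen by cross-validation.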


3.1.2. Non-linear PLS modeling

The nonlinear PLS method is a nonlinear version of PLS. In this approach, a nonlinear inner relationship is adopted instead of the linear inner relation in PLS. The nonlinear inner relationship may be achieved using radial basis functions (RBFs). In radial basis function partial least squares (RBF-PLS), the RBFs are used to carry out a non-linear transformation of X to form an activation matrix, $X_A$, the elements of which are defined as [36]

$$a_{ij} = r_j(x_i), \quad i, j = 1, 2, \ldots, m \qquad (5)$$

where $a_{ij}$ is the element of $X_A$ at the ith row and the jth column, $r_j$ is the jth RBF, and $x_i$ is a vector consisting of the values of the independent variables taken from the ith observation. The Gaussian function is the most commonly used RBF, which takes the form

$$a_{ij} = \exp\left(-\frac{\lVert c_j - x_i \rVert^{2}}{\sigma_j^{2}}\right), \quad i, j = 1, 2, \ldots, m \qquad (6)$$

in which ‖·‖ denotes the Euclidean distance when the argument is a difference of two vectors, and $c_j$ and $\sigma_j$ are the two parameters of the jth Gaussian function, the center and the width. The center $c_j$ of the jth Gaussian function is given by

$$c_j = x_j, \quad j = 1, 2, \ldots, m \qquad (7)$$

and the elements of the width $\sigma_j$ of the jth Gaussian function are obtained as

$$\sigma_{j1} = \sigma_{j2} = \cdots = \sigma_{jm} = \frac{e}{m} \sum_{i=1}^{m} \lVert x_i - x_j \rVert \qquad (8)$$

where e is a constant with a value e > 0. Since each center coincides with an observation, the diagonal elements of the activation matrix $X_A$ have the value 1. The number of RBFs is m, and the RBFs themselves are vector functions, the dimensions of which are all n. In PLS1, the variance of the prediction is minimized while maximizing the covariance of $X_A$ and y, and the PLS1 model is set up as

$$y = T B + E = (X_A X_A^{T}) B + E \qquad (9)$$

where T is the low-dimensional score matrix of $X_A$ with the dimension m × $n_T$, B is the regression coefficient matrix with the dimension $n_T$ × l, $X_A^{T}$ is the transformation matrix of $X_A$ with the dimension n × $n_T$, and E is the residual matrix with the dimension m × l. The optimal value of $n_T$ used in the score matrix T can be determined by the leave-one-out cross-validation method. T is the linear combination of the Gaussian functions (row vectors of $X_A$) that maximizes the covariance between $X_A$ and y. Subsequently, the RBF-PLS model is obtained and used for prediction purposes. In general, prior to computing the activation matrix $X_A$, the variables of the observation vectors are always scaled to the range [0, 1]. For a new data set, $X_{new}$, which is not included in the RBF-PLS model, the dependent variable $y_{new}$ can be computed as

$$y_{new} = X_{A,new} X_A^{T} B \qquad (10)$$

where $X_{A,new}$ is the activation matrix of the new data set ($X_{new}$). $X_{new}$ is pre-processed in the same manner as the X of the data set used for modeling, and the new activation matrix ($X_{A,new}$) is calculated keeping the values of the centers and widths of the Gaussian functions.

3.1.3. Data arrangement for PLS modeling

X-block: The adsorption data were arranged in a two-way array by taking the 800 experimental runs as the rows and the 4 process variables as the columns, yielding an X-matrix of dimensions 800 × 4. The X-matrix data constituted the independent set of variables.
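The Gaussian activation matrix of the RBF-PLS step (Eqs. (5)–(8)) can be sketched in a few lines. The example below is only an illustration under stated assumptions: the helper names are hypothetical, each width is taken as e times the mean Euclidean distance from a center to all observations, and the observations are assumed to be distinct so that no width is zero.

```python
import math


def scale_01(X):
    """Scale every column of X to the range [0, 1] before the RBF transform."""
    n = len(X[0])
    mins = [min(row[j] for row in X) for j in range(n)]
    spans = [(max(row[j] for row in X) - mins[j]) or 1.0 for j in range(n)]
    return [[(row[j] - mins[j]) / spans[j] for j in range(n)] for row in X]


def euclid(a, b):
    """Euclidean distance between two observation vectors."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def rbf_activation(X, e=1.0):
    """Activation matrix with one Gaussian centred on every observation."""
    m = len(X)
    dist = [[euclid(X[i], X[j]) for j in range(m)] for i in range(m)]
    # Width of the j-th Gaussian: e times the mean distance to center j
    # (assumed reading of Eq. (8)).
    sigma = [e / m * sum(dist[i][j] for i in range(m)) for j in range(m)]
    return [[math.exp(-dist[i][j] ** 2 / sigma[j] ** 2) for j in range(m)]
            for i in range(m)]


# Three made-up observations with two variables each.
X_demo = scale_01([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
X_A = rbf_activation(X_demo, e=1.0)
```

The result is an m × m matrix with ones on the diagonal, which then replaces X as the predictor block of an ordinary PLS1 model.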


Y-block: In a similar way, the y-vector comprised the 800 experimental runs as rows and the single removal efficiency (RE%) variable as the column, thus having dimensions 800 × 1. The y-block data were the single dependent variable (RE%). For the purpose of regression analysis, the whole data set was divided into two subsets of equal size (X = 400 × 4; y = 400 × 1), namely the calibration and prediction data sets. The Kennard–Stone (K–S) approach was used to split the data. The K–S algorithm designs the model set in such a way that the objects are scattered uniformly around the calibration domain. Thus, all sources of the data variance are included in the calibration model [37].

3.1.4. Preprocessing of data for PLS modeling

Prior to PLS modeling, the data were preprocessed through column mean centering across the first mode. This removes any offsets in the data [38,39]. Since the data were pre-processed, they were transformed back to the original form prior to the post-modeling computations.

3.1.5. Validation and prediction

The leave-one-out cross-validation (LOO-CV) approach was adopted for model validation and for selecting the optimum number of PLS latent variables (LVs) [38]. To test the robustness of the model, each single experimental run was used in turn as the test set for the model built with the data corresponding to the remaining experimental runs. The number of significant LVs was determined through LOO-CV on 399 data points, and the resulting model was then used to predict the single left-out run (1 × 1). This was repeated so that the data point corresponding to each of the 400 runs was left out once and predicted, thus yielding the root mean square error of leave-one-out cross-validation (RMSECV-LOO) values for the experimental runs left out in turn.
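The Kennard–Stone split mentioned above can be sketched as follows. This is a plain, unoptimized rendering of the usual max-min selection rule, not the toolbox routine used in the study; the function name and the one-dimensional toy data are assumptions for illustration, and at least two points must be requested.

```python
def kennard_stone(X, n_select):
    """Return the indices of n_select rows picked by the Kennard-Stone rule."""
    m = len(X)

    def d2(a, b):
        # Squared Euclidean distance between rows a and b.
        return sum((u - v) ** 2 for u, v in zip(X[a], X[b]))

    # Start with the two mutually most distant observations.
    first, second = max(
        ((i, j) for i in range(m) for j in range(i + 1, m)),
        key=lambda pair: d2(*pair),
    )
    selected = [first, second]
    remaining = [k for k in range(m) if k not in selected]
    while len(selected) < n_select and remaining:
        # Add the candidate whose nearest selected point is farthest away.
        best = max(remaining, key=lambda k: min(d2(k, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected


# Made-up one-variable data: the rule picks the extremes first.
X_toy = [[0.0], [1.0], [2.0], [3.0], [10.0]]
calibration_idx = kennard_stone(X_toy, 3)
prediction_idx = [k for k in range(len(X_toy)) if k not in calibration_idx]
```

Rows not selected for calibration form the prediction set, which mirrors the equal-size split (400 and 400 runs) described in the text when `n_select` is half the number of rows.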
The RMSECV-LOO was computed as [22,35]

$$\mathrm{RMSECV\text{-}LOO} = \sqrt{\frac{\sum_{i=1}^{N} (y_{model,i} - y_{meas,i})^{2}}{N}} \qquad (11)$$

where N is the number of data points, and $y_{model,i}$ and $y_{meas,i}$ are the predicted and measured values of the variable (RE%), respectively. The test data set for prediction purposes comprised the left-out set of experimental runs. The X array comprised 400 runs × 4 variables (400 × 4), whereas the y consisted of 1 run × 1 variable (1 × 1). The PLS1 model built and calibrated on half of the data (experimental runs) was used to predict the single variable (RE%) for the prediction data set formed by the other half of the experimental runs, and the root mean square error of prediction (RMSEP) was computed.

3.2. Artificial neural networks modeling

An artificial neural network is a very powerful computational technique for modeling complex non-linear relationships, particularly in situations where the explicit form of the relation between the variables involved is unknown [40]. The basic structure of an ANN model usually comprises three distinct layers: the input layer, where the data are introduced to the model and the weighted sum of the inputs is computed; the hidden layer or layers, where the data are processed; and the output layer, where the results of the ANN are produced. Each layer consists of one or more basic elements called neurons or nodes. An artificial neuron in a typical ANN architecture receives a set of inputs (signals x with weights w), calculates their weighted sum z = ∑ xi · wi using the summation function, and then uses an activation function to produce an output a = f(z). The connections between the input layer and the hidden layer contain weights, which are usually determined through training of the system. The hidden layer sums the weighted inputs and uses the transfer function to create an output value. The transfer


function is a relationship between the internal activation level of the neurons and the outputs. A sigmoid function, which varies between −1 and +1 over a range of inputs, is generally used to model the nonlinear relationships. Supervised training is used to train the ANN so as to minimize the difference between the network output and the measured target. In the training process, the weights are adjusted to obtain a desirable outcome with minimum error. Back propagation (BP) is the most commonly used training algorithm in ANNs. This training algorithm helps distribute the error in order to arrive at a best fit or minimum error. After the information has gone through the network in a forward direction and the network has predicted an output, the back-propagation algorithm redistributes the error associated with this output back through the model, and the weights are adjusted accordingly. Minimization of the error is achieved through several iterations. Although traditional BP uses a gradient descent algorithm to determine the weights in the network, it computes rather slowly due to linear convergence. Hence, the Levenberg–Marquardt algorithm (LMA), which is much faster as it adopts an approximate second-derivative method, was used here [41]. The LMA is similar to the quasi-Newton method in which a simplified form of the Hessian matrix (second derivative) is used. One iteration of this algorithm can be written as [42]

$$x_{k+1} = x_k - [J^{T} J + \mu I]^{-1} J^{T} e \qquad (12)$$

in which J is the Jacobian matrix which contains first derivatives of the network errors with respect to the weights and biases, and e is a vector of network errors, µ is the learning rate and I is the identity matrix. During training the learning rate µ is incremented or decremented by a scale at weight updates. The LMA is reported to have the fastest convergence for the neural networks that contain up to few hundred neurons [43]. The learning rate was set to 0.001. The mean square error (MSE), used as the target error goal, is defined as [43,44];

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (y_{model,i} - y_{meas,i})^{2} \qquad (13)$$

where $y_{model,i}$ and $y_{meas,i}$ represent the predicted and measured values of the variable, and N represents the number of observations. The maximum number of epochs, the target error goal (MSE) and the minimum performance gradient were set to 300, 10⁻⁵, and 10⁻⁵, respectively. Training stops when any of these conditions occurs. The optimal architecture of the ANN model and its parameters were determined based on the minimum value of the mean squared error (MSE) of the training and validation sets. Through a trial-and-error approach, various combinations of the number of neurons in the hidden layer, back-propagation algorithms, and transfer functions (linear and sigmoid) were used. The lowest MSE for the training and validation sets was the criterion for selecting the best case. Here, a three-layer feed-forward neural network with back propagation (BP) learning was constructed for prediction of the removal efficiency of 2-CP from the aqueous solution using the adsorbent (FC). The initial concentration of the adsorbate, initial pH, operating temperature, and contact time were the four input variables, p (xpi, i = 1…4), whereas the removal efficiency (RE%) was the target output variable. All computations were performed using EXCEL 97 and MATLAB (MathWorks, Inc., Natick, MA). Nonlinear PLS was performed using the TOMCAT toolbox [37].

3.2.1. ANN input variables and data preprocessing

Four operational variables (pH, initial concentration of the adsorbate, operating temperature, and contact time) were used as the input

vectors and the percent removal of the adsorbate as the single output vector to the ANN model. The basic statistics of the aforesaid variables is presented in Table 1. Since, the ANNs are data intensive [45], preparation of the training, validation and test data sets is an important aspect. ANNs learn the underlying physics of the system from the training samples, which are basically the cause–effect samples. Therefore, the number of training samples significantly influences a network's predictive performance [45,46]. Increasing the number of training samples provides more information about the shape of the solution surface(s) and thus increases the potential level of accuracy that can be achieved by the network [47]. Too few samples in training set will lead to poor generalization [45]. An optimum number of training samples would be the one that fully represents the modeling domain. Since we have data for total 800 experimental runs for the adsorbent (FC), these were divided in to three sub-sets comprising of 65% as training, 17.5% as validation, and 17.5% as test set. The output variable (RE%) corresponding to the input variables belong to the same experimental run. In the validation and testing phase, the performance of the network is tested for unused experimental data obtained within the range of selected variables. In view of the requirements of the neural computation algorithm, the raw data of both the independent (input) and dependent (output) variables were normalized to an interval by transformation. The transformation modifies the distribution of the input variables so that it matches the distribution of the estimated outputs. Here, all the variables are transformed to the same ground-uniform distributions on −1, +1, as [48];

$$\hat{x}_{ij} = \frac{2\,(x_{ij} - \min_j)}{\max_j - \min_j} - 1 \qquad (14)$$

where x̂ij is the transformed variable, and minj and maxj are the minimal and maximal values of the jth variable, respectively. The ANNs were applied to provide a non-linear relationship between the inputs and the network output.
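The preprocessing described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB code: the function names are hypothetical, and the random shuffling of runs before the 65/17.5/17.5 split is an assumption, since the text does not say how runs were assigned to the subsets.

```python
import random

def scale_to_unit_interval(column):
    """Map raw values linearly onto [-1, +1], following Eq. (14)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        raise ValueError("column is constant; Eq. (14) is undefined")
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in column]

def split_runs(n_runs, train_frac=0.65, val_frac=0.175, seed=0):
    """Shuffle run indices (assumed) and split into training/validation/test."""
    indices = list(range(n_runs))
    random.Random(seed).shuffle(indices)
    n_train = round(train_frac * n_runs)
    n_val = round(val_frac * n_runs)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])

# Operating temperatures (deg C) reported in the study, scaled column-wise.
temperatures = [15, 20, 25, 30, 40]
scaled = scale_to_unit_interval(temperatures)

# 800 experimental runs split 65% / 17.5% / 17.5%.
train, val, test = split_runs(800)
```

With 800 runs, these fractions give 520 training, 140 validation and 140 test runs. The same scaling would be applied to each input variable and to RE%, so that every variable shares the [−1, +1] range.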

3.3. Modeling performance criteria

The performance of both the PLS and ANN models was evaluated using five different criteria: the root mean square error of prediction (RMSEP), the relative error of prediction in percent (REP), the coefficient of determination (R²) [49], the Nash–Sutcliffe coefficient of efficiency (Ef) [50,51], and the accuracy factor (Af). The RMSEP represents the error associated with the model and can be computed as [52];

$$\mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{N} (y_{\mathrm{model},i} - y_{\mathrm{meas},i})^{2}}{N}} \qquad (15)$$

where ymodel,i and ymeas,i represent the model computed and measured values of the variable, and N represents the number of observations. The RMSEP, a measure of the goodness-of-fit, describes the average error in predicting the dependent variable. However, it does not provide any information on phase differences. The relative error of prediction of the dependent variable in percent (REP) is calculated as [53];

$$\mathrm{REP} = \sqrt{\frac{\sum_{i=1}^{N} (y_{\mathrm{model},i} - y_{\mathrm{meas},i})^{2}}{\sum_{i=1}^{N} (y_{\mathrm{meas},i})^{2}}} \times 100 \qquad (16)$$
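As a hedged illustration of Eqs. (15) and (16), the two error measures can be computed as below. The function names and the three-point data set are hypothetical. The REP function assumes the reading of Eq. (16) in which the squared errors are normalized by the sum of squared measured values.

```python
import math

def rmsep(y_model, y_meas):
    """Root mean square error of prediction, Eq. (15)."""
    n = len(y_meas)
    sse = sum((m - o) ** 2 for m, o in zip(y_model, y_meas))
    return math.sqrt(sse / n)

def rep_percent(y_model, y_meas):
    """Relative error of prediction in percent, Eq. (16) as read here."""
    sse = sum((m - o) ** 2 for m, o in zip(y_model, y_meas))
    sum_sq_meas = sum(o ** 2 for o in y_meas)
    return 100.0 * math.sqrt(sse / sum_sq_meas)

# Hypothetical measured and predicted RE% values, for illustration only.
y_meas = [40.0, 50.0, 60.0]
y_model = [43.0, 50.0, 56.0]

error_abs = rmsep(y_model, y_meas)        # same units as RE%
error_rel = rep_percent(y_model, y_meas)  # dimensionless, in percent
```

RMSEP carries the units of the response, whereas REP is scale-free, which is why the two are reported side by side in the results tables.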


The coefficient of determination (R²) (square of the correlation coefficient) represents the percentage of variability that can be explained by the model and is calculated as [49];

$$R^{2} = \left[\frac{N\sum_{i=1}^{N} y_{\mathrm{meas},i}\,y_{\mathrm{model},i} - \left(\sum_{i=1}^{N} y_{\mathrm{meas},i}\right)\left(\sum_{i=1}^{N} y_{\mathrm{model},i}\right)}{\sqrt{\left[N\sum_{i=1}^{N} y_{\mathrm{meas},i}^{2} - \left(\sum_{i=1}^{N} y_{\mathrm{meas},i}\right)^{2}\right]\left[N\sum_{i=1}^{N} y_{\mathrm{model},i}^{2} - \left(\sum_{i=1}^{N} y_{\mathrm{model},i}\right)^{2}\right]}}\right]^{2} \qquad (17)$$

The Nash–Sutcliffe coefficient of efficiency (Ef) [50], an indicator of the model fit, is computed as;

$$E_f = 1 - \frac{\sum_{i=1}^{N} (y_{\mathrm{model},i} - y_{\mathrm{meas},i})^{2}}{\sum_{i=1}^{N} (y_{\mathrm{meas},i} - \bar{y}_{\mathrm{meas}})^{2}} \qquad (18)$$

where ȳmeas is the mean of the measured values. The Ef is a normalized measure (−∞ to 1) that compares the mean square error generated by a particular model simulation to the variance of the target output sequence. An Ef value of 1 indicates perfect model performance (the model perfectly simulates the target output), an Ef value of zero indicates that the model is, on average, performing only as well as using the mean target value as the prediction, and an Ef value < 0 indicates an altogether questionable choice of model [51,54]. The accuracy factor (Af), a simple multiplicative factor indicating the spread of results about the prediction, is computed as [55];

$$A_f = 10^{\left(\frac{\sum_{i=1}^{N} \left|\log\left(y_{\mathrm{model},i}/y_{\mathrm{meas},i}\right)\right|}{N}\right)} \qquad (19)$$

Fig. 2. Variation of the experimental adsorption efficiency (RE%) with the initial pH and concentration of the adsorbate (2-CP).
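The remaining three criteria, Eqs. (17) to (19), can be sketched in the same way. The function names and numbers are hypothetical, and the logarithm in Eq. (19) is taken as base 10 to match the base-10 exponent, which is an assumption about the intended notation.

```python
import math

def r_squared(y_model, y_meas):
    """Squared Pearson correlation between measured and modeled values, Eq. (17)."""
    n = len(y_meas)
    s_o, s_m = sum(y_meas), sum(y_model)
    s_om = sum(o * m for o, m in zip(y_meas, y_model))
    s_oo = sum(o * o for o in y_meas)
    s_mm = sum(m * m for m in y_model)
    numerator = n * s_om - s_o * s_m
    denominator = math.sqrt((n * s_oo - s_o ** 2) * (n * s_mm - s_m ** 2))
    return (numerator / denominator) ** 2

def nash_sutcliffe(y_model, y_meas):
    """Nash-Sutcliffe coefficient of efficiency, Eq. (18)."""
    mean_o = sum(y_meas) / len(y_meas)
    sse = sum((m - o) ** 2 for m, o in zip(y_model, y_meas))
    spread = sum((o - mean_o) ** 2 for o in y_meas)
    return 1.0 - sse / spread

def accuracy_factor(y_model, y_meas):
    """Accuracy factor, Eq. (19); requires strictly positive values."""
    n = len(y_meas)
    total = sum(abs(math.log10(m / o)) for m, o in zip(y_model, y_meas))
    return 10.0 ** (total / n)

# Hypothetical measured and predicted RE% values, for illustration only.
y_meas = [40.0, 50.0, 60.0]
y_model = [43.0, 50.0, 56.0]

r2 = r_squared(y_model, y_meas)
ef = nash_sutcliffe(y_model, y_meas)
af = accuracy_factor(y_model, y_meas)
```

Predicting the mean of the measured values for every run gives Ef = 0, and predictions that are off by a factor of two in either direction give Af = 2, which matches the interpretation of the two coefficients given in the text.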

The larger the value of Af, the less accurate is the average estimate. A value of one indicates perfect agreement between all the predicted and measured values. Each performance criterion described above conveys specific information regarding the predictive performance of a specific model. Goodness of fit of the selected PLS and ANN models was also checked through analysis of the residuals.

4. Results and discussion

The basic purpose of this study is to demonstrate the applicability of PLS and ANN models for the prediction of the adsorption efficiency of 2-CP using the adsorbent (FC) from aqueous solutions, taking into account the effect of four different operational variables (pH, adsorbate concentration, contact time, and operating temperature) simultaneously.

4.1. Experimental observations

Although neither the PLS nor the ANN modeling approach requires knowledge of the effects of the operating variables on adsorption efficiency, the experimental observations may be helpful in understanding the complexity and nature of the process, and hence the usefulness of the two modeling methods (PLS and ANNs). The observed experimental trends for the influence of the four process variables on the removal efficiency (RE%) of the adsorbent (FC) for 2-CP are presented in Figs. 2 and 3.

4.1.1. Effect of pH and adsorbate concentration

The experimental adsorption efficiency (RE%) of the adsorbent (FC) is plotted (Fig. 2) as a function of the initial pH of the solution and the initial concentration of the adsorbate (2-CP). The clusters in the

Fig. 3. Variation of the experimental adsorption efficiency (RE%) with the contact time and operating temperature.

plot represent the four different pH values (2–6) and the four initial concentrations (5 × 10⁻⁵, 1 × 10⁻⁴, 5 × 10⁻⁴, and 1 × 10⁻³ M) of the adsorbate (2-CP). The trends in RE% indicate that the adsorption process is highly sensitive to changes in both variables. It is evident that the adsorption efficiency is highest at the lowest pH and decreases with increasing pH of the solution. At higher pH the phenols dissociate, forming phenolate anions, whereas surface functional groups may be either neutral or negatively charged. The electrostatic repulsion between the like charges lowers the adsorption capacity. Being a weak acid, 2-CP (pKa = 8.3) is associated with the electron withdrawing effect of the aromatic ring. At lower pH, the functional groups on the carbon surface are in the protonated form and the high electron density of the solute molecules would lead to higher adsorption [56,57]. For the initial adsorbate concentration, the adsorption efficiency is low at the lowest concentration, rises to its highest value at the next higher concentration, and then decreases with further increases in concentration. The overall trends indicate non-linear behavior for both variables.

4.1.2. Effect of contact time and operating temperature

Fig. 3 shows the variation of the adsorption efficiency as a function of the contact time (1, 2, 4, 6, 18, 20, 22, 24, 26, and 28 h) and the operating temperature (15, 20, 25, 30, and 40 °C). It is evident that the adsorption efficiency of the studied adsorbent increases non-linearly


Table 2. The linear and nonlinear PLS1 model results for calibration and prediction data sets.

| Parameter | Linear PLS, calibration | Linear PLS, prediction | Nonlinear PLS, calibration | Nonlinear PLS, prediction |
|---|---|---|---|---|
| RMSECV | 5.29 | – | 3.23 | – |
| RMS | 5.48 | – | 4.36 | – |
| RMSEP | – | 5.14 | – | 4.98 |
| REP | 10.87 | 10.19 | 8.65 | 9.88 |
| R²* | 0.86 | 0.87 | 0.91 | 0.88 |
| Ef | 0.86 | 0.87 | 0.91 | 0.88 |
| Af | 1.10 | 1.09 | 1.08 | 1.09 |
| R²** | 0.00 | 0.00 | 0.00 | 0.00 |

* Between experimental and modeled values of y.
** Between residuals and modeled values of y.

with increasing temperature. Thus, a higher temperature favors the adsorption of 2-CP from the aqueous solution, which suggests that the process is endothermic. Costa et al. [54,58] attributed the increase in phenol adsorption with temperature to the high percentage contribution of micropores to the porosity of the activated carbon used. They suggested that an important fraction of these micropores could have a size similar to that of the phenol molecule, so that access to these pores would be possible only at temperatures at which the phenol molecule could penetrate and diffuse. A similar pattern is also visible for the variation of adsorption efficiency with contact time: it increases with increasing contact time, but in a non-linear manner.

Fig. 5. Plot of the measured and linear PLS1 model predicted values (prediction mode) of the adsorption efficiency (RE%) of 2-CP from aqueous solution by FC.

4.2. Performance of the linear PLS1 model

The PLS1 model calibration was performed using the leave-one-out cross-validation method. The number of latent variables (LVs) was selected on the basis of the minimum root mean square error of cross-validation (RMSECV). A minimum value of RMSECV was obtained for the four-LV model. Hence, the four-LV PLS1 model was selected and run for the complete calibration and prediction data sets. The model performance criteria are summarized in Table 2. The four-LV model, when run for the complete calibration data set, captured 100% and 86% of the variance in X and y, respectively, and yielded root mean square error (RMS), RMSECV and REP values of 5.48, 5.29 and 10.87, respectively (Table 2), with a correlation coefficient (R²) of 0.86 between the experimental and predicted removal efficiency values. Similarly, the calibrated four-component model applied to the test data set for the prediction of removal efficiency (RE%) yielded RMSEP and REP values of 5.14 and 10.19, respectively, with a correlation coefficient value of 0.87. The experimental and model predicted values of the removal efficiency (RE%) obtained for the calibration and prediction data sets are plotted in Figs. 4 and 5. The low model prediction error (RMSEP, REP) and the close pattern of variation between the experimental and model predicted values of RE% for both data sets suggest a precise predictive capability of the selected model. Further, the model-predicted values of removal efficiency (RE%) yielded Ef values of 0.86 and 0.87 and Af values of 1.10 and 1.09 for the calibration and prediction sets, respectively. The Ef and Af values close to unity, and the significantly high values of the correlation coefficient (R²) between the experimental and model predicted RE% for the two data

Fig. 4. Plot of the measured and linear PLS1 model predicted values (calibration mode) of the adsorption efficiency (RE%) of 2-CP from aqueous solution by FC.

Fig. 6. Plot of the residuals versus model predicted values of the adsorption efficiency (RE%) of 2-CP from aqueous solution by FC (a) calibration (b) validation set using linear PLS1 model.


sets (Table 2) suggest the adequacy of the selected linear PLS1 model in predicting the removal efficiency of the adsorbent for the studied adsorbate from aqueous solutions. Further, the plots of residuals versus Ymodel (model predicted values of the removal efficiency) for the calibration and prediction sets of the linear PLS1 model are shown in Fig. 6a,b. Such plots can be more informative regarding model fitting to a data set. If the residuals appear to behave randomly (low correlation), it suggests that the model fits the data well. On the other hand, if a non-random distribution is evident in the residuals, the model does not fit the data adequately [59]. Fig. 6a,b shows a random pattern in the distribution of the residuals with a very low correlation (R²**) between the Ymodel and residual values (Table 2), which suggests that the PLS1 model fits all the data points appropriately. In terms of the model performance criteria considered here, the linear PLS1 model fitted the experimental data set well and yielded satisfactory results, suggesting its suitability for predicting the adsorption efficiency of the studied adsorbate–adsorbent system with all four process variables considered simultaneously. However, since PLS is a bilinear method and the adsorption process variables may have some degree of nonlinear relationship, we also treated the adsorption data set using nonlinear modeling approaches and made a comparative study.


Fig. 8. Plot of the measured and nonlinear PLS1 model predicted values (prediction mode) of the adsorption efficiency (RE%) of 2-CP from aqueous solution by FC.

4.3. Performance of the nonlinear PLS1 model

The calibration of the four-LV RBF-PLS1 model yielded minimum RMSECV, RMS and REP values of 3.23, 4.36 and 8.65, respectively, with a significantly high correlation coefficient (R²) of 0.91 between the experimental and predicted removal efficiency values. Similarly, the calibrated model applied to the test data set for the prediction of removal efficiency (RE%) yielded RMSEP and REP values of 4.98 and 9.88, respectively, with a correlation coefficient value of 0.88 (Table 2). The relatively low values of the model prediction errors (RMSEP, REP), and the Ef and Af values close to unity, suggest that the nonlinear PLS1 model fitted the adsorption data well. A close pattern of variation between the model predicted and experimental values of removal efficiency (Figs. 7 and 8) and a random distribution of the residuals (Fig. 9a,b), with very low correlation (R²**) between ymodel (model predicted values of RE) and the residuals (Table 2), further support the adequacy of the selected nonlinear PLS1 model in predicting the dependent variable. From the model performance criteria obtained for the linear and nonlinear PLS1 models (Table 2), it is evident that the nonlinear model performed relatively better in predicting the adsorption efficiency of the adsorbent. This may be attributed to the fact that the adsorption process variables may have some nonlinear relationships. Therefore, we further analyzed the data using the artificial neural network (ANN) model.

4.4. Performance of the ANN model

Several different ANN models were constructed and tested in order to determine the optimum number of nodes in the hidden layer and the type of transfer function. Selection of an appropriate number of nodes in the hidden layer is a very important aspect, as a larger number of these may result in over-fitting, while a smaller number of

Fig. 7. Plot of the measured and nonlinear PLS1 model predicted values (calibration mode) of the adsorption efficiency (RE%) of 2-CP from aqueous solution by FC.

Fig. 9. Plot of the residuals versus model predicted values of the adsorption efficiency (RE%) of 2-CP from aqueous solution by FC (a) calibration (b) validation set using nonlinear PLS1 model.


Table 3. Performance parameters of the artificial neural network model for prediction of the adsorption efficiency (RE%) of the studied adsorbate–adsorbent system.

| ANN structure (I–H–O) | Data set | R²* | RMSEP | REP (%) | Af | Ef | R²** |
|---|---|---|---|---|---|---|---|
| 4–8–1 | Calibration | 0.96 | 2.86 | 5.73 | 1.06 | 0.96 | 0.0004 |
| 4–8–1 | Validation | 0.96 | 4.07 | 7.98 | 1.08 | 0.91 | 0.063 |
| 4–8–1 | Test | 0.96 | 3.25 | 6.24 | 1.06 | 0.94 | 0.087 |

I: input nodes; H: hidden nodes; O: output nodes.
* Between experimental and modeled values of y.
** Between residuals and modeled values of y.

nodes may not capture the information adequately. The number of nodes in the hidden layer (M) in this model was chosen between I and 2I + 1 [60], where I is the number of input nodes. As a guide, M should not be less than the maximum of I/3 and the number of output nodes. The optimum value of M was determined by trial and error. Subsequently, a network was identified for the prediction of the adsorption efficiency (RE%) of the adsorbent for 2-CP. The network was trained using the training data set and then validated with the validation data set. The optimal network size was the one that resulted in the minimum mean square error (MSE) for the training and validation data sets. The model features for the selected ANN are given in Table 3. The architecture of the best ANN model for the prediction of adsorption efficiency of the studied adsorbate–adsorbent system is shown in Fig. 10. The ANN model comprises an input layer with four input nodes, a single hidden layer with eight nodes, and an output layer with a single output variable. The constructed ANN model (4–8–1) was trained using the Levenberg–Marquardt algorithm (LMA). A nonlinear transfer function (tan-sigmoid) was used in both the hidden and the output layers. The selected model (Fig. 10) yielded minimum MSE values of 0.004 and 0.028 for the training and validation sets, respectively, and converged after six epochs. The RMSEP, coefficient of determination (R²), the Nash–Sutcliffe coefficient of efficiency (Ef), and the accuracy factor (Af) computed for the training, validation and test data sets are presented in Table 3. Fig. 11a–c shows the plots of the measured versus the model predicted values of RE% for the training, validation and test sets, respectively. The selected ANN provided the best-fit model for RE% prediction in all three sets. For the RE% values predicted by the model, the coefficient of determination (R²) values (p < 0.001) were 0.96 for each of the

Fig. 10. A general conceptual diagram of the three-layer artificial neural network (ANN) model for the adsorption efficiency (RE%) of 2-CP from aqueous solution by FC.
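The 4–8–1 architecture in Fig. 10 can be illustrated with a short pure-Python sketch. It is not the authors' MATLAB implementation: the function names are hypothetical, the weights are random placeholders rather than trained values, and Levenberg–Marquardt training is deliberately left out. The block only shows the hidden-layer size range quoted in the text (between I and 2I + 1, and at least max(I/3, O)) and a forward pass with the tan-sigmoid (tanh) transfer function in both layers.

```python
import math
import random

def candidate_hidden_sizes(n_inputs, n_outputs):
    """Hidden-layer sizes to try: from I to 2I + 1, never below max(I/3, O)."""
    lower = max(n_inputs, math.ceil(n_inputs / 3), n_outputs)
    return list(range(lower, 2 * n_inputs + 2))

def init_network(n_inputs, n_hidden, n_outputs, seed=0):
    """Random placeholder weights; the last entry of each row is the bias."""
    rng = random.Random(seed)
    def layer(n_out, n_in):
        return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_out)]
    return layer(n_hidden, n_inputs), layer(n_outputs, n_hidden)

def forward(network, x):
    """Forward pass with tanh in both the hidden and the output layer."""
    hidden_w, output_w = network
    hidden = [math.tanh(sum(w * v for w, v in zip(row[:-1], x)) + row[-1])
              for row in hidden_w]
    return [math.tanh(sum(w * h for w, h in zip(row[:-1], hidden)) + row[-1])
            for row in output_w]

sizes = candidate_hidden_sizes(4, 1)   # sizes explored by trial and error
network = init_network(4, 8, 1)        # the selected 4-8-1 structure
# One hypothetical run: concentration, pH, temperature, contact time,
# each already scaled to [-1, +1].
prediction = forward(network, [-1.0, 0.2, -0.6, 1.0])
```

Because the output layer also uses tanh, the network output stays within (−1, +1), which is consistent with scaling RE% to the same interval before training; the prediction would be mapped back to percent by inverting that scaling.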

training, validation and test sets. ANN predictions are precise if R² values are close to unity. The respective values of RMSEP for the three data sets are 2.86 for training, 4.07 for validation, and 3.25 for testing. The Nash–Sutcliffe coefficient of efficiency (Ef) values were 0.96 for the training, 0.91 for the validation, and 0.94 for the test set, and the corresponding accuracy factor (Af) values were 1.06, 1.08, and 1.06 (Table 3). A closely followed pattern of variation between the measured and model predicted RE% values (Fig. 11a–c), significantly high correlation (R²), considerably low RMSEP, and Ef and Af values close to unity suggest the appropriateness of the model for prediction of the removal efficiency. The model-predicted RE values and the residuals corresponding to the training, validation and test sets are plotted in Fig. 12a–c. The observed relationship between the residuals and the model predicted values of RE for all three sets shows almost complete independence and random distribution, with negligibly small correlations between them (Table 3). For the predicted values of RE, the respective correlations (R²) with their residuals (0.0004 for training; 0.063 for validation; and 0.087 for test) are small. Further, Fig. 12a–c shows that the distribution of residuals is random, as the points are distributed on both sides of the horizontal line of zero ordinate representing the average of the residuals. Residuals versus predicted value plots can be more informative regarding model fitting to a data set. If the residuals appear to behave randomly, it suggests that the model fits the data well. On the other hand, if a non-random distribution is evident in the residuals, the model does not fit the data adequately [59]. From the model performance criteria parameter values for the selected linear and nonlinear PLS1 (Table 2) and the ANN (Table 3), it

Fig. 11. Plot of the measured and model predicted values of the adsorption efficiency (RE%) of 2-CP from aqueous solution by FC (a) calibration, (b) validation, and (c) test set using ANN model.


is evident that the predictive performance of all three models is satisfactory. However, the performance of the nonlinear models (RBF-PLS and ANN) was relatively better in predicting the adsorption efficiency (RE%) of the studied adsorbate–adsorbent system. This may be attributed to the fact that PLS is a bilinear model and hence could not capture the nonlinear relationships among the adsorption process variables, whereas the RBF-PLS and ANN modeling approaches are capable of capturing the nonlinearity in the data.

Fig. 12. Plot of the residuals versus model predicted values of the adsorption efficiency (RE%) of 2-CP from aqueous solution by FC (a) training, (b) validation, and (c) test set using ANN model.

5. Conclusions

In this study, on the basis of batch adsorption experiments performed with four different process variables (pH, initial concentration of adsorbate, contact time, and operating temperature), an important objective was to obtain a model that could make reliable predictions of the percent removal of 2-chlorophenol from aqueous solutions using coconut fiber carbon. Linear and nonlinear partial least squares (PLS) regression models and a three-layer back-propagation ANN model with tangent sigmoid transfer functions in both the hidden and output layers were constructed for predicting the removal efficiency of the adsorbent. The PLS1 regression models were validated using the leave-one-out cross-validation method and the ANN model was trained using the Levenberg–Marquardt algorithm. The performance of all the selected models was evaluated using the relative error of prediction (REP), root mean square error of prediction (RMSEP), coefficient of determination (R²), Nash–Sutcliffe coefficient of efficiency (Ef), the accuracy factor (Af), and analysis of the residuals. All three models predicted the adsorption efficiency of the derived adsorbent for 2-chlorophenol from aqueous solutions satisfactorily. However, the performance of the ANN was relatively better than that of the PLS1 models. All three models can be used as tools for predicting the adsorption efficiencies for various combinations of the process variables and thus for process optimization and control.

Acknowledgements

The authors thank the Director, Indian Institute of Toxicology Research, Lucknow (India) for his keen interest in the work. Financial assistance from CSIR, New Delhi is thankfully acknowledged.

References

[1] V.K. Gupta, S. Sharma, I.S. Yadav, D. Mohan, J. Chem. Technol. Biotechnol. 71 (1998) 180.
[2] R. Spandre, G. Dellomonaco, J. Environ. Hydrol. 4 (1996) 1.
[3] T. Esplugas, P.I. Yue, M.I. Pervez, Water Res. 28 (1994) 1323.
[4] R.R. Perez, G.G. Benito, M.P. Miranda, Bioresour. Technol. 60 (1997) 207.
[5] USEPA, Technical Support Document for Water Quality Based Toxics Control, EPA/440/485032, United States Environmental Protection Agency, Washington, DC, USA, 1985.
[6] BIS, Tolerance Limit for Industrial Effluents Discharged into Inland Surface Waters: Coke Oven, S 2490 (Part 1), Bureau of Indian Standards, New Delhi, 1974.
[7] I.Z. Shirgaonkar, A.B. Pandit, Ultrason. Sonochem. 5 (1998) 53.
[8] A.B. Pandit, P.R. Gogate, S. Mujumdar, Ultrason. Sonochem. 8 (2001) 227.
[9] A. Agrios, K. Gray, E. Weitz, Langmuir 19 (2003) 1402.
[10] E. Leyva, E. Moctezuma, M.G. Ruiz, L. Torresmartinez, Catal. Today 40 (1998) 367.
[11] C. Barbeau, L. Deschenes, D. Karamanev, Y. Comeau, R. Samson, Appl. Microbiol. Biotechnol. 48 (1997) 745.
[12] I.D. Buchanan, J.A. Micell, Biotechnol. Bioeng. 54 (1997) 251.
[13] M.C. Burleigh, M.A. Markowitz, M.S. Spector, B.P. Gaber, Environ. Sci. Technol. 36 (2002) 2515.
[14] V.K. Kumar, K. Porkodi, R.L. Avila Rondom, F. Rocha, Ind. Eng. Chem. Res. 47 (2007) 486.
[15] T. Furusava, J.M. Smith, Ind. Eng. Chem. Fundam. 12 (1973) 197.
[16] W.J. Weber Jr, J.C. Morris, J. Sanit. Eng. Div., Am. Soc. Civ. Eng. 89 (1963) 31.
[17] S. Lagergren, K. Sven. Vetenskapsakad. Handl. 24 (1898) 1.
[18] G. Blanchard, M. Maunaye, G. Martin, Water Res. 18 (1984) 1501.
[19] Y.S. Ho, Adsorption of Heavy Metals from Waste Streams by Peat, Ph.D. Thesis, The University of Birmingham, UK, 1995.
[20] S. Wold, A. Rube, H. Wold, W.J. Dunn, SIAM J. Sci. Stat. Comput. 3 (1984) 735.
[21] S. Wold, M. Sjostrom, L. Ericksson, Chemom. Intell. Lab. Syst. 58 (2001) 109.
[22] K.P. Singh, A. Malik, N. Basant, P. Saxena, Anal. Chim. Acta 584 (2007) 385.
[23] T. Kimura, Y. Miyashita, K. Funatsu, S. Sasaki, J. Chem. Inf. Comput. Sci. 36 (1996) 185.
[24] M.W. Gardner, S.R. Dorling, Atmos. Environ. 32 (1998) 2627.
[25] R. Schalkoff, Pattern Recognition: Statistical, Structural and Neural Approaches, Wiley, NY, 1992.
[26] X. Du, Q. Yuan, J. Zhao, Y. Li, J. Chromatogr. A23 (2007) 165.
[27] W. Gao, S. Engell, Comput. Chem. Eng. 29 (2005) 2242.
[28] J.S.J. van Deventer, S.P. Liebenberg, L. Lorenzen, C. Aldrich, Miner. Eng. 8 (1995) 1489.
[29] C. Acharya, S. Mohanty, L.B. Sukla, V.N. Misra, Prediction of sulphur removal with Acidithiobacillus sp. using artificial neural networks, Ecol. Model. 190 (2006) 223.


[30] S. Aber, N. Daneshvar, S.M. Soroureddin, A. Chabok, K. Adaspour-Zeynali, Desalination 211 (2007) 87.
[31] N. Prakash, S.A. Manikandan, L. Govindarajan, V. Vijaygopal, J. Hazard. Mater. 152 (2008) 1268.
[32] K. Yetilmezsoy, S. Demirel, J. Hazard. Mater. 153 (2008) 1288.
[33] D. Mohan, K.P. Singh, D. Ghosh, Environ. Sci. Technol. 39 (2005) 5076.
[34] M. Martens, T. Naes, Multivariate Calibration, Wiley, Chichester, 1989.
[35] C. Durante, M. Cocchi, M. Grandi, A. Marchetti, R. Bro, Chemom. Intell. Lab. Syst. 83 (2006) 54.
[36] X.F. Yan, D.Z. Chen, S.X. Hu, Comput. Chem. Eng. 27 (2003) 1393.
[37] M. Daszykowski, S. Semeels, K. Kaczmarck, P. VanEspen, C. Croux, B. Walczak, Chemom. Intell. Lab. Syst. 85 (2007) 269.
[38] A. Smilde, R. Bro, P. Geladi, Multi-way Analysis, Application in the Chemical Sciences, John Wiley & Sons Ltd., England, 2004.
[39] R. Henrion, C.A. Anderson, Chemom. Intell. Lab. Syst. 47 (1999) 189.
[40] M. Smith, Neural Networks for Statistical Modelling, Van Nostrand Reinhold, NY, 1994, p. 235.
[41] Q.H. Wang, Journal of Qinghai University 22 (2004) 82.
[42] A.P. Dedecker, P.L.M. Goethals, W. Gabriels, N. De Pauw, Ecol. Model. 174 (2004) 161.
[43] C. Karul, S. Soyupak, A.F. Cilesiz, N. Akbay, E. German, Ecol. Model. 134 (2000) 145.
[44] M.T. Hagan, H.P. Demuth, M. Beale, Neural Networks Design, PWS Publishing, Boston, MA, USA, 1996.
[45] ASCE Task Committee, J. Hydrologic. Eng. 5 (2000) 124.
[46] I. Flood, N. Kartam, Neural networks in civil engineering I: principles and understanding, J. Comput. Civil Eng. 8 (1994) 131.
[47] C.J. Bowden, H.R. Maier, G.C. Dandy, Water Resour. Res. 38 (2002) 2.1.
[48] V. Tomenko, S. Ahmed, V. Popov, Modelling constructed wetland treatment system performance, Ecol. Model. 205 (2007) 355.
[49] J.F. Chenard, D. Caissie, Hydrol. Process. (2008) DOI: 10.1002/hyp.6928.
[50] J.E. Nash, I.V. Sutcliffe, J. Hydrol. 10 (1970) 282.
[51] S. Palani, S. Liong, P. Tkalich, Mar. Pollut. Bull. 56 (2008) 1586.
[52] D.R. Legates, G.J. McCabe Jr., Wat. Resour. Res. 35 (1999) 233.
[53] S. Platikanov, X. Puig, J. Martin, R. Tauler, Water Res. 41 (2007) 3394.
[54] B. Schaefli, H.V. Gupta, Hydrol. Process. 21 (2007) 2075.
[55] T. Ross, J. Appl. Biotechnol. 81 (1996) 501.
[56] K. Laszlo, E. Tombacz, P. Kerepesi, Colloids Surf., A Physicochem. Eng. Asp. 13 (2004) 230.
[57] K.P. Singh, A. Malik, S. Sinha, P. Ojha, J. Hazard. Mater. 150 (2008) 626.
[58] E. Costa, G. Callega, L. Marijuan, Adsorp. Sci. Technol. 5 (1989) 213.
[59] NIST/SEMATECH e-Handbook of Statistical Methods, 2006, http://www.itl.nist.gov/div898/handbook.
[60] R. Hecht-Nielsen, Kolmogorov's mapping neural network existence theorem, Proceedings of 1st IEEE International Joint Conference of Neural Networks, Institute of Electrical and Electronics Engineers, New York, NY, 1987.