Chemical Engineering and Processing 48 (2009) 1371–1381
Contents lists available at ScienceDirect
Chemical Engineering and Processing: Process Intensification journal homepage: www.elsevier.com/locate/cep
Neural network modeling of Pb2+ removal from wastewater using electrodialysis Mohtada Sadrzadeh a , Toraj Mohammadi a,∗ , Javad Ivakpour b , Norollah Kasiri b a b
Research Centre for Membrane Separation Processes, Department of Chemical Engineering, Iran University of Science and Technology (IUST), Narmak, Tehran, Iran Computer Aided Process Engineering (CAPE) Lab, Department of Chemical Engineering, Iran University of Science and Technology (IUST), Narmak, Tehran, Iran
a r t i c l e
i n f o
Article history: Received 15 July 2008 Received in revised form 1 July 2009 Accepted 2 July 2009 Available online 10 July 2009 Keywords: Electrodialysis Neural network Wastewater treatment Metal ions
a b s t r a c t Artificial neural network (ANN) was applied to predict separation percent (SP) of lead ions from wastewater using electrodialysis (ED). The aim was to predict SP of Pb2+ as a function of concentration, temperature, flow rate and voltage. Optimum numbers of hidden layers and nodes in each layer were determined. The selected structure (4:6:2:1) was used for prediction of SP of lead ions as well as current efficiency (CE) of ED cell for different inputs in the domain of training data. The modeling results showed that there is an excellent agreement between the experimental data and the predicted values. © 2009 Elsevier B.V. All rights reserved.
1. Introduction Many industrial wastewaters produced by metal plating, metal finishing, mining, automotive, aerospace, battery and general chemical plants, often contain high concentration of heavy metals [1]. Lead is a highly toxic heavy metal. It impairs hemoglobin synthesis, particularly in children, and may cause neurological disorders. Wastes that include lead are found in paints, pipes, batteries and in some petrol types [2]. The conventional heavy metal removal processes such as chemical precipitation [3] coagulation, complexing, solvent extraction [4–6], ion exchange [7,8], biosorption [9–13] and electro-membrane processes [14–16] and ion exchange/adsorption on solid surfaces [17] has some inherent shortcomings such as requiring a large area of land, a sludge dewatering facility, skillful operators, high capital and regeneration costs and multiple basin configurations [3]. Membranes can also be used to obtain effluents without metallic contaminants. The main disadvantage of membrane processes for treatment of effluents with heavy metals is ionic size of dissolved metallic salts. These ions, as hydrated ions or as low molecular weight complexes, pass easily through all membranes with the exception of reverse osmosis membranes. However, reverse osmosis membranes are not selective, because interesting metallic ions are retained together with alkaline and alkaline-earth ions [1]. Electrodialysis (ED) is an electrochemical process for separation of ions across charged membranes from one solution to another
∗ Corresponding author. Tel.: +98 21 77240496; fax: +98 21 77240496. E-mail address:
[email protected] (T. Mohammadi). 0255-2701/$ – see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.cep.2009.07.001
under the influence of an electrical potential difference used as a driving force. This process has been widely used for production of drinking and process water from brackish water and seawater, treatment of industrial effluents, recovery of useful materials from effluents and salt production. The basic principles and applications of ED were reviewed in the literature [18–20]. Artificial neural network (ANN) utilizes interconnected mathematical nodes or neurons to form a network that can model complex functional relationships [21]. Its development started in the 1940s to help cognitive scientists to understand the complexity of the nervous system. It has been evolved steadily and was adopted in many areas of science. Basically, ANNs are numerical structures inspired by the learning process in the human brain. They are constructed and used as alternative mathematical tools to solve a diversity of problems in the fields of system identification, forecasting, pattern recognition, classification, process control and many others [22]. In recent years, ANNs have been used as a powerful modeling tool in various membrane processes such as Membrane Filtration (FT) [23,24], Microfiltration (MF) [25–29], Ultrafiltration (UF) [30–45], Nanofiltration (NF) [46–49], Reverse Osmosis (RO) [50–55], Gas Separation (GS) [56,57], Membrane BioReactors (MBR) [58,59] and Fuel Cells (FC) [60,61]. Key features of previous studies are summarized in Table 1. According to this table, it can be concluded that: 1. RO and MF were the earliest membrane processes modeled and simulated by ANN. 2. Researchers have recently focused on modeling of GS and MBRs using ANN. However, ongoing works have been also accomplished on traditional membrane processes, lately.
1372
M. Sadrzadeh et al. / Chemical Engineering and Processing 48 (2009) 1371–1381
Table 1 Summary of researchers’ studies in modeling of membrane processes by ANN. Reference
Year
Membrane process
ANN type
Training method
ANN application
Dornier et al. [25] Niemi et al. [50] Al-Shayji and Liu [53,54]
1995 1995 1997, 2002
MF RO RO
FFNNa -MLPb FFNN-MLP FFNN-MLP
SLc -BPd SL-BP SL-BP
Bowen et al. [32,33,36] Delgrange et al. [30,31]
1998, 2001 1998–2002
UF UF
FFNN-MLP FFNN-MLP
Hamachi et al. [27] Teodosiu et al. [34] Bowen et al. Jafar and Zilouchian [55] Cabassud et al. [38]
1999 2000 2002 2001 2002
MF UF NF RO UF
RNf FFNN-MLP FFNN-MLP FFNN-MLP & RBFj RN
SL-BP SL-BPQNLAe USLg SL-BP CGMh -SCGi SL-BP USL
Bhattacharjee and Singh [37] Razavi et al. [40,41] Shetty et al. [47]
2002 2003 2003
UF UF NF
FFNN-MLP FFNN-MLP FFNN-MLP
Lee et al. [60] Aydiner et al. [28] Chellam [29]
2004 2005 2005
FC MF MF
FFNN-MLP FFNN-MLP FFNN-MLP
SL-BP SL-BP SL-BPLMTk SL-BP SL-BP SL-BP-LMT
Zhao et al. [51] Abbas and Al-Bastaki [52] Rai et al. [43] Geissler et al. [58] Ou et al. [61] Chen and Kim [24] Cinar et al. [59] Curcio et al. [45] Wang et al. [57] Sahoo and Ray [23] Shahsavand and Pourafshari Chenar [56] Al-Zoubi et al. [49]
2005 2005 2005 2005 2005 2006 2006 2006 2006 2006 2007
RO & NF RO UF MBR FC FT MBR UF GS FT GS
FFNN-MLP & NRBFl FFNN-MLP FFNN-MLP SRNm (ENn ) FFNN-MLP & RBF FFNN-MLP & RBF CFNNo -MLP FFNN-MLP FFNN-RBF FFNN-RBF FFNN-MLP & RBF
SL-BP SL-BP-LMT SL-BP SL-BP SL-BP SL-BP-LMT SL-BP SL-BP SL-BP SL-BP-LMT SL-BP
2007
NF
FFNN-MLP
SL-BP
Modeling of crossflow MF Simulation of RO membrane separation Modeling and optimization of large-scale commercial water desalination plants Prediction of the rate of crossflow membrane UF of colloids Modeling of UF fouling, prediction of UF transmembrane pressure in drinking water production Modeling of crossflow MF of bentonite suspension Predicting the time dependence of flux evolution in UF Predicting salt rejections at NF Modeling of an RO water desalination Predictive control algorithm to improve the productivity of an UF Modeling of a continuous stirred UF process Prediction of milk UF performance Prediction of contaminant removal and membrane fouling during municipal drinking water NF Modeling of polymer electrolyte membrane FC performance Modeling of flux decline in crossflow MF Modeling of transient crossflow MF of polydispersed suspensions Predicting RO/NF water quality Modeling of an RO water desalination Modeling batch UF of synthetic fruit juice and mosambi juice Modeling of capillary modules in MBR Modeling of proton exchange membranes Prediction of permeate flux decline in crossflow membrane FT Modeling of submerged MBR treating cheese whey wastewater Reduction and control of flux decline in crossflow UF Modeling of hydrogen recovery from refinery gases Prediction of flux decline in crossflow membranes Modeling the separation of CO2 from CH4 using hollow fiber module Modeling the rejection of sulphate and potassium salts by NF
a b c d e f g h i j k l m n o
Feed Forward Neural network. Multi-Layer Perceptron. Supervised Learning. Back Propagation. Quasi Newton Learning Algorithm. Recurrent Network. Unsupervised Learning. Conjugate Gradient Method. Scaled Conjugate Gradient. Radial Basis Function. Levenberg–Marquardt Technique. Normalized Radial Basis Function. Simple Recurrent Network. Elman Network. Cascade Forward Neural Network.
3. ANN modeling of UF was most frequently observed in the literature. 4. The most popular form of ANN in use was feedforward neural network (FFNN) modeling. 5. Of FFNN, Multilayer perceptron (MLP) & Radial Basis Function (RBF) were by far the most widely used in membrane processes. 6. Almost all the researchers applied Back Propagation (BP) training method for their ANN. 7. RO and NF water desalination processes was modeled by ANN previously. 8. No record for modeling of ED desalination by ANN was found in the literature. The objective of this paper is to develop a MLP neural network model in order to predict separation percent (SP) of lead ions during ED of wastewater. 81 experimental data were initially distributed to three subsets of training (50), validation (21) and testing (10). Then, the appropriate neurons’ transfer functions, ANN structure,
training algorithm and initial weights were selected. After that current efficiency of ED cell was predicted using the selected network. Finally, generalization of the selected structure was tested and optimum values of operating parameters were selected based on both SP and CE. The results showed that ANN modeling could give excellent agreement with experimental results, with average errors less than 1%.
2. Theory The objective of a neural network is to compute output values from input values by some internal calculations [31]. Basic component of a neural network is the neuron, also called “node”. Fig. 1 illustrates a single node of a neural network. The inputs are represented by a1 , a2 and an , and the output by Oj . There can be many input signals to a node. The node manipulates these inputs to give a single output signal [62].
M. Sadrzadeh et al. / Chemical Engineering and Processing 48 (2009) 1371–1381
Fig. 1. Single node anatomy.
Fig. 2. Schematic view of an ED cell.
The values w1j , w2j , and wnj , are weight factors associated with the inputs to the node. Weights are adaptive coefficients within the network that determine the intensity of the input signal. Every input (a1 , a2 , . . ., an ) is multiplied by its corresponding weight factor (w1j , w2j , . . . , wnj ), and the node uses summation of these weighted inputs (w1j a1 , w2j a2 , . . . , wnj an ) to perform further calculations. In the initial setup of a neural network, weight factors usually are chosen randomly according to a specified statistical distribution. The other input to the node, bj , is the node’s internal threshold, also called bias. This is a randomly chosen value that governs the node’s net input through the following equation: ui =
n
(wij ai ) − bj
(1)
i=1
The node’s output is determined using a mathematical operation on the node’s net input. This operation is called a transfer function. The transfer function can transform the node’s net input in a linear or non-linear manner. In the present work Sigmoid transfer function was applied: f (x) =
1 1 + e−x
0 ≤ f (x) ≤ 1
1373
(2)
The neuron’s output, Oj , is found by performing one of these functions on the neuron’s net input, uj . Neural networks are made of several neurons that perform in parallel or in sequence. The weight factors and thresholds are adjusted in training process. Training is the process by which the weights are adjusted systematically so that the network can predict the correct outputs for a given set of inputs. There are many different types of training algorithms. One of the most common classes of training algorithms for FFNNs is called BP. Many other methods have been developed based on BP [62,63]. Mathematical aspects of all training algorithms are comprehensively described in the literature [64–67]. Three subsets of data are used to build a model: training, validation and testing subsets. Training data set is used in training procedure. Validation data set is used to check the generality of network. If the network is trained sufficiently, the network output will differ only slightly from the actual output data. In initial steps of training, errors of both training and validation data are reduced. After several steps, the error of training data decreases while that of validation data increases. As a result, the network is overtrained and its generality decreases. Hence, the training process must be continued until the validation data error decreases. Testing data set is used to test the trained network for unseen patterns (whose results are known for researcher but not used in training procedure). The network generalizes well when it sensibly interpolates these new patterns.
3. Materials and methods 3.1. Materials An analytical grade salt (99.9% lead nitrate supplied by Merck) and deionized water were used in all experiments to produce solutions with wastewater qualities. The purpose of these experiments was to study the effects of voltage, flow rate, temperature and feed concentration on the ED cell performance. 3.2. Cell and membranes The ED cell was packed with a pair of cation and anion exchange membranes (CEM and AEM) and a pair of platinum electrodes (anode and cathode). Both electrodes were made of pure platinum. Area of each electrode was 4.2 cm × 4.2 cm. Thickness of dilution cell (center) is 4 mm and thickness of each concentrate cell (left and right) was 3 mm. Schematic view of the applied ED cell is presented in Fig. 2. Lead nitrate solution is introduced into the three compartments. When a DC potential is applied between two electrodes, positively charged lead ions move toward the cathode, pass through the negatively charged CEM and are retained by the positively charged AEM. On the other hand, nitrate ions move toward the anode, pass through the AEM and are retained by the CEM. At the end, ion concentration increases in the side compartments with a simultaneous decrease of ion concentration in the middle compartments. Electrical potential is the driving force for transport of ions in ED. This potential is applied from an external power supply through electrodes situated on either end of the ED cell. Current flow in the circuit external to the stack is electronic, i.e., electrons flow through metallic conductors. However, current flow within the stack is electrolytic, i.e., ions flow through solutions. The transfer of electrical charge from electronic to electrolytic conduction is accomplished via electrode reactions [68]: Anodereaction Cathodereaction
H2 O → 1/2O2 (g) + 2H+ + 2e− 2H2 O + 2e− → H2 (g) + 2OH−
O2 and H2 gas bubbles produced by the anode and the cathode reactions can blanket the electrodes and increase the resistance of the ED cell [69]. High values of pH at cathode can cause precipitation of any pH sensitive materials (Pb2+ ions in this study reacts with hydroxide ions and produce Pb(OH)2 precipitate) on CEM, which can raise resistance and even make the ED cell inoperative [69]. At lower flow rates, precipitation is more important. The precipitate was removed by cleaning in place (CIP) using distilled water after each run [20].
1374
M. Sadrzadeh et al. / Chemical Engineering and Processing 48 (2009) 1371–1381
Fig. 3. Schematic view of ED setup.
AR204SXR412 and CR67, MK111 anion and cation exchange membranes supplied by Arak petrochemical complex and made by Ionics incorporated were used in all experiments. Effective area of each membrane was 6 cm × 6.5 cm. IECs of anion and cation exchange membranes were 2.8 and 2.4 meq/g dry membrane, respectively. 3.3. ED setup
3.4. Experimental design Many parameters affect performance of the ED cell. According to our previous studies [2,20,70], four parameters were selected. It is believed that they have the greatest effect on SP: feed concentration, dilute solution flow rate, voltage and feed temperature. Four factors with their levels were studied based on the full factorial design: Temperature (T): 25, 40 and 60 ◦ C. Concentration (C): 100, 500 and 1000 ppm. Flow rate (Qf ): 0.07, 0.7 and 1.2 mL/s. Voltage (V): 10, 20 and 30 V.
SP defined as follows was used as a criterion of the cell performance: SP =
C0 − C × 100 C0
CE =
(3)
where C0 and C are feed and dilute concentrations, respectively. Each experiment was lasted for about 15 min to reach steady state condition. Three samples were taken every 5 min and the average value was reported. Another important parameter that determines the optimum range of applicability is CE. CE is a measure of how effective ions are transported across the ion exchange membranes for a given applied
zFQf (C0 − C) NI
(4)
By substitution of SP in Eq. (4), the following equation for CE is derived: CE =
ED setup consists of a feed tank (TK-01) where waste water is stored, two pumps (P-01 and P-02), a rectifier (DC-01) and two globe valves (GB-01 and GB-02) to control feed flow rate in three compartments of a self-designed ED cell. A simplified diagram of the setup is shown in Fig. 3.
• • • •
current. It is calculated using the following equation:
zFQf C0 (SP) 100 NI
(5)
where z is the charge of ion, F is Faraday constant (96,485 A s/mol), Qf is the dilute flow rate (m3 /s), N is the number of cell pairs (1 in this study) and I is the current (A). CE was calculated using data presented in this study and our previous studies [2,70]. Substituting z = 2, F = 96,485 A s/mol, C0 = 0.33, 1.65 and 3.30 mol/m3 , N = 1, Qf = 0.07 × 10−6 , 0.7 × 10−6 and 1.2 × 10−6 m3 /s, in Eq. (5) CE values calculated to be in the range of 0.60–45.14. It should be noted that I (in Eq. (5)) was measured in each run using basic electrochemistry rules [70]. Typical experimental results are presented in Table 2. SP in this table was calculated using Eq. (3). CE was calculated using either Eqs. (4) and (5). Low CEs, as in this study, indicate water splitting in the dilute or concentrate streams, non-perfect permselectivity of membranes, shunt or stray currents between the electrodes running specially in the non-active cell area and physical leakage or back-diffusion of ions from the concentrate to the dilute compartments (leading to impurities in the products). These can be minimized by cell design features, and by limiting the cell voltage, as well as the conductivity ratio between the dilute and concentrate streams. In the present work, water splitting in the concentrate and the dilute compartments due to the high voltages (10–30 V) and the low feed concentrations (100–1000 ppm) as well as the non-active cell area (The area of electrodes, 17.6, is smaller than that of the membranes, 39.0 cm2 .) led to low values of CE.
3.5. Analytical method Concentration of cations (Pb2+ ) only in the dilute compartment was measured at various operating conditions. In all experiments, atomic absorption (Shimadzu, AA-670) was used to measure the amount of lead ions in water.
M. Sadrzadeh et al. / Chemical Engineering and Processing 48 (2009) 1371–1381
1375
Table 2 Typical experimental results indicating SP and CE calculation using feed (C0 ) and dilute concentration (C). Operating parameters ◦
Responses
T ( C)
C0 (ppm)
V (V)
I (mA)
Q (mL/s)
C (dilute concentration)
SP
CE
298.15 298.15 298.15 298.15 298.15 298.15 298.15 298.15 298.15 313.15 313.15 313.15 313.15 313.15 313.15 313.15 313.15 313.15 333.15 333.15 333.15 333.15 333.15 333.15 333.15 333.15 333.15
100 100 100 100 100 100 100 100 100 500 500 500 500 500 500 500 500 500 1000 1000 1000 1000 1000 1000 1000 1000 1000
10 20 30 10 20 30 10 20 30 10 20 30 10 20 30 10 20 30 10 20 30 10 20 30 10 20 30
0.08 0.15 0.23 0.10 0.19 0.29 0.10 0.19 0.29 0.46 0.82 1.21 0.62 1.13 1.68 0.63 1.14 1.70 1.23 2.15 3.19 1.66 3.02 4.48 1.68 3.06 4.59
0.07 0.07 0.07 0.70 0.70 0.70 1.20 1.20 1.20 0.07 0.07 0.07 0.70 0.70 0.70 1.20 1.20 1.20 0.07 0.07 0.07 0.70 0.70 0.70 1.20 1.20 1.20
74.92 58.01 56.31 99.74 98.99 98.61 99.75 99.24 99.05 209.35 124.78 116.28 450.41 365.84 357.33 460.85 376.28 367.78 353.98 184.85 167.83 836.10 666.97 649.95 856.98 687.85 689.00
25.08 41.99 43.70 0.26 1.01 1.39 0.25 0.76 0.95 58.13 75.04 76.75 9.92 26.83 28.53 7.83 24.74 26.45 64.60 81.52 83.22 16.39 33.30 35.01 14.30 31.22 31.10
7.53 6.92 4.85 0.70 1.35 1.24 1.15 1.73 1.45 18.64 13.37 9.22 25.13 36.59 26.15 33.73 57.30 41.15 18.41 12.77 8.78 38.25 41.28 29.12 56.83 65.81 43.69
3.6. ANN modeling of ED
4. Results and discussion
ANN input parameters were carefully selected to only include physically meaningful and easily to measurable membrane operational and feed water quality variables as cell voltage and feed flow rate, temperature and concentration. Totally 81 experimental data are collected and used for ANN modeling of ED for training/validation/testing subsets. In this work, feedforward multilayer neural network with one or two hidden layers were employed for modeling of ED. They were used to transform input data (concentration, temperature, flow rate and voltage) into a desired response (SP or CE). With the aid of hidden layers, they can approximate virtually any input–output map. The well trained neural network can be used for prediction purpose. Fig. 4 illustrates the structure of a typical ANN used for modeling of ED. In this study, structures with two hidden layers were also investigated in most of the cases.
Five important aspects that must be determined in design procedure of ANN are as follows: • Data distribution in three subsets (training, validation and testing). • Selection of neurons’ transfer functions. • Selection of ANN structure. • Selection of training algorithm and its parameters. • Selection of initial weights. • Testing the ANN generalization. 4.1. Training, validation and testing data As mentioned, MLP feedforward model was used in this study. The total 81 experimental data were randomly divided into three
Fig. 4. Structure of the multilayer ANN used for modeling of ED.
1376
M. Sadrzadeh et al. / Chemical Engineering and Processing 48 (2009) 1371–1381
solution, the Levenberg–Marquardt training method was selected as a proper training algorithm in agreement with the literature [23,24,29,46,47,52,63]. Another important factor in ANN design is the type of transfer functions. ANNs owe their non-linear capability to the use of non-linear transfer functions [48]. Different transfer functions can be used for neurons in the different layers. Different transfer functions were examined in each layer, separately and with respect to the mean squared error (MSE) of testing data, the proper transfer functions were chosen. MSE is calculated as follows:
MSE =
N
(SPcal − SPexp )2 N
(6)
where subscripts cal and exp denote calculated and experimental values of SP, respectively. N is the number of validation and training data. Among different transfer functions available in Matlab, log sigmoid function was selected for all neurons due to its better prediction performance than other transfer functions. The log sigmoid function is bounded between 0 and 1, so the input and output data should be normalized to the same range as the transfer function used. In other words, the logarithmic sigmoid transfer function gives scaled outputs (SP) in this range (0–1). 4.3. ANN structure
Fig. 5. Distribution of (a) training, (b) validation, and (c) testing data subsets.
subsets of training, validation and testing for developing ANN model. Distribution of these data is shown in Fig. 5. 50 training data were used to update the network weights and biases. In order to check the generality of network prediction and to prevent the data overfitting, 21 validation data were applied. The rest of data was used to test the neural network. During the initial phase of training, validation and training errors normally decrease. However, when the network begins to overfit the data, the validation set error typically increases. Termination of training procedure at a proper time, i.e., when the minimum validation error is achieved, results in a generalized predictor network. 4.2. Training algorithm and transfer function The MLP networks were created in the neural network toolbox of Matlab with newff function. Performances of different training algorithms were studied for a specified network with four layers (1 input layer/2 hidden layer/1 output layer). Due to the convergence speed and the performance of network to find better
Network structure has significant effects on the predicted results. The number of input and output nodes, as mentioned before, is equivalent to the number of input and output data, respectively (4 and 1 in this work). However, the optimal number of hidden layers and the optimal number of nodes in each layer, are case dependent and there is no straightforward method for determination of them. Hornik showed that MLP feedforward networks with one hidden layer and sufficiently large neurons can map any input to each output to an arbitrary degree of accuracy [71]. However, Flood and Kartam reported that many functions are difficult to approximate well with one hidden layer [72]. They revealed that use of more than one hidden layer provides greater flexibility and enables the approximation of complex functions with fewer neurons. Baughman and Liu found out that adding a second hidden layer improves the network prediction capability significantly without having any detrimental effects on the generalization of the testing data set [64]. However, it was observed that adding a third hidden layer results in a prediction capability similar to that of two hidden layer network, but it requires longer training times due to its more complex structure. In this study, optimum number of hidden layers and optimum number of nodes in each layer were determined using two nested for/next loop, one for setting the number of neurons in the first hidden layer (1–7 neurons) and another for neurons in the second hidden layer (0–7 neurons). Regarding the number of training data, only structures with number of weights lower than 50 were investigated. Another important factor that affects the performance of networks is selection of the initial weights. Typically, the weight factors are set randomly using either normal or Gaussian distribution. The wrong choice of initial weights can lead to the local minimum values and therefore bad performance of the networks. In order to prevent these phenomena, 20 runs were performed for each network structure using different random values of initial weights. Then, the best trained network was selected as a trained network for that structure. The optimum structure was selected based on the minimum MSE of validation data acquired during 5000 epochs for each run. Fig. 6 illustrates the performances of different networks. The number of neurons in each hidden layer can be evaluated from
M. Sadrzadeh et al. / Chemical Engineering and Processing 48 (2009) 1371–1381
1377
Fig. 6. Performances of various applied networks structures with the number of weights lower than the number of training data.
the structure number as follows: Structure number = j × 7 + i
(7)
where i and j are the number of neurons in the first and second layers, respectively. Obviously, the networks with structure numbers of 1–7 (j = 0) have only one hidden layer. Between these networks, the 4:5:1 network had the best performance regarding the MSE values of validation data set. The 20th structure (6 neurons in the first hidden layer and 2 neurons in the second) demonstrated the best performance. The peaks in Fig. 6 represent structures with one neuron at the first hidden layer. As observed, these structures do not fit training and validation data sets precisely. As illustrated in this figure, MSE of training data decreases with increasing the number of neurons in the second hidden layer at initial steps and reaches to a minimum value at the moderate numbers. Further increase in number of neurons overfits the training data and enhances MSE of validation data. For example, the 4:3:6:1 network had an
Fig. 8. Performance of 4:6:2:1 network at the minimum MSE of CE validation data.
acceptable MSE of training data set but its performance for the validation data set was poor. The performance of this structure was compared with the optimum structure as presented in the next section. 4.4. ANN predictability The experimental results versus neural network predictions for three subsets of data are plotted in Fig. 7. Performance of the selected network (4:6:2:1) is depicted in Fig. 7a. This network is
Fig. 7. Performance of (a) 4:6:2:1 network at the minimum MSE of SP validation data, (b) 4:5:1 network at the minimum MSE of SP validation data, (c) 4:3:6:1 network at the minimum MSE of SP validation data, and (d) 4:3:6:1 network at the minimum MSE of SP training data.
1378
M. Sadrzadeh et al. / Chemical Engineering and Processing 48 (2009) 1371–1381
Fig. 9. Generalization performances of optimal ANN, effects of (a) flow rate and temperature at 500 ppm and 30 V, (b) flow rate and concentration at 30 V and 60 ◦ C, (c) voltage and temperature at 500 ppm and 0.07 mL/s, (d) voltage and flow rate at 500 ppm and 60 ◦ C, (e) voltage and concentration at 0.07 mL/s and 60 ◦ C, and (f) concentration and temperature at 0.07 mL/s and 30 V on SP. Bold symbols are experimental data and star signs are experimental data at intermediate levels.
compared with two other networks (4:5:1 and 4:3:6:1) as shown in Fig. 7b and c. Fig. 7c and d shows performances of 4:3:6:1 network at minimum MSE of validation and training data, respectively. According to the prediction results for testing data, shown in Fig. 7a–c, it is found that performance of the selected network (4:6:2:1) is better than that of the two other networks. Taking a closer look to Fig. 7a and b, it is concluded that two hidden layer structure performs better than one hidden layer structure. Considering Fig. 7a and c, it was found out that (for the structure with relatively same number of neurons and also same number of weights) having higher number of neurons in the first hidden layer is more efficient than having higher number of neurons in the second hidden layer. Fig. 7d exhibits overfitting problem in 4:3:6:1 network at higher number of training epochs, the overfitted network predicts training data very well, while its predictions for validation and testing data are
relatively poor. As a result, the better performance of the 4:6:2:1 network is confirmed. The experimental CE values versus 4:6:2:1 ANN predictions for three subsets of data are plotted in Fig. 8. As can be seen, the selected network is able to predict CE of ED cell excellently. According to Eq. (5), CE is a dependent variable of SP. Hence, after selecting the optimum network, CE values can also be calculated using predicted values of SP. 4.5. ANN generalization After selection of a proper network, it can be used for prediction of SP and CE for different inputs in the domain of training data. The aim was to investigate ED behavior with operating parameters and find optimum parameters which provide the best SP along with CE. Having determined the SP values at different inputs by the network,
M. Sadrzadeh et al. / Chemical Engineering and Processing 48 (2009) 1371–1381
1379
Fig. 10. Generalization performances of optimal ANN, effects of (a) flow rate and voltage at 500 ppm and 40 ◦ C, (b) voltage and concentration at 20 V and 40 ◦ C, (c) temperature and flow rate at 500 ppm and 20 V on CE, and (d) concentration and temperature at 20 V and 0.07 mL/s. Lines are ANN predictions and symbols are experimental data.
the predicted values can be shown as the mesh plot. In Fig. 9, SP was plotted versus operating parameters in 3D plots. In these figures, the 4:6:2:1 network predictions are indicated by the surfaces and experimental data indicated by the bold points for comparison. Furthermore, a number of experimental data at intermediate levels of inputs variables were collected and indicated in these figures by star signs. Regarding these points, it can be found that the selected network can predict the SP accurately even in the intermediate levels. As can be seen, increasing temperature (Fig. 9a, c, and f), concentration (Fig. 9b, e, and f) and voltage (Fig. 9c, d, and e) increases SP values. It is obvious due to the fact that increasing temperature and concentration decreases the electrical resistance of solution, while increasing voltage increases the driving force. At higher flow rates, SP values decreases because the more flow rate means the less residence time, and thus, ions that are between the membranes do not have enough time to transfer through them (Fig. 9a, b, and d) [2,20,70]. ANN predictions and experimental data concerning to the effect of operating parameters on CE is shown in Fig. 10. Two other parameters, in all figures, are set at their medium levels, i.e., 313.15 K, 20 V, 0.7 × 10−6 m3 /s and 500 ppm. According to Fig. 10a, increasing flow rate increases CE at all cell voltages. However, CE increases more slightly with increasing flow rate at lower levels of cell voltage. Similar results were obtained for flow rate versus CE at various temperatures and feed concentrations. At lower flow rates, optimum CE was achieved at lower values of temperature and cell voltage. These results can be rationalized by Eq. (5). SP, in this equation is itself dependent on cell voltage (or current intensity), flow rate, temperature and feed concentration. All these parameters have a dual effect on SP and CE. For example, increasing current intensity in the denominator of Eq. (5) decreases CE in one hand but increases it through a simultaneous increase in SP, on the other hand. For the case of other parameters, a similar behavior was observed. Hence, it cannot determinedly concluded
that increasing flow rate and feed concentration or decreasing voltage and temperature enhances the values of CE at all operating conditions. In the case of feed flow rate, increasing flow rate in the numerator of Eq. (5) increases CE but tends to decrease it via reduction of SP. As indicated in Fig. 9a, b, and d, SP approaches to a constant minimum value at higher flow rates. Hence, CE increases continually with increasing flow rate. According to Fig. 10b and c, increasing cell voltage and temperature increase CE up to a maximum value then decrease it. Fig. 9 shows that increasing these parameters enhance SP up to constant maximum values. Dual effect of increasing these parameters on SP and CE (increase of SP and reduction of CE) results in graphs with maximum values of CE at medium levels of parameters, as shown in Fig. 10b and c. As indicated in Fig. 10d, increasing feed concentration increases CE at all temperatures. It is due to the fact that both C0 and SP in the numerator of Eq. (5) increase, simultaneously. However, since SP reaches to an almost constant value at higher feed concentration, CE increases slowly between 500 and 1000 ppm. At various cell voltages and flow rates, similar results were obtained which were not reported here for brevity. Figs. 9 and 10 confirm that, operating conditions at which optimum values of CE are obtained are not exactly those where optimum values of SP are achieved. CE, as presented in Eq. (5), is directly dependent on feed concentration (Ci ), SP and flow rate of handled solution (Qf ) and inversely dependent on cell voltage. When an ED cell treats a great deal of concentrated aqueous solution at ambient temperature using minimum cell voltage (minimum energy consumption) and this results in a maximum SP, CE reaches to its maximum value. Hence, it may be concluded that maximum value of CE must be achieved at cell voltage of 10 V, feed concentration of 1000 ppm, flow rate of 1.2 × 10−6 m3 /s and temperature of 298.15 K. This result is correct to some extent. However, dual effect of temperature and cell voltage on SP and CE should be taken into account. It should be noted that, in spite of the fact that conducting experi-
1380
M. Sadrzadeh et al. / Chemical Engineering and Processing 48 (2009) 1371–1381
ments at 10 V cell voltage and 298.15 K feed temperature minimizes energy consumption, however does not increase the value of SP significantly. Facing a trade-off between low energy consumption (at lower cell voltages and temperatures) and high SP (at higher cell voltages and temperatures), medium levels of cell voltage and temperature (20 V and 313.15 K, respectively) and higher levels of feed concentration and flow rate (1000 ppm and 1.2 × 10−6 m3 /s, respectively) rationally lead to the optimum values of CE. Taking a closer look to Fig. 9, it was found that the differences between SP values regarding medium and high levels of parameters are negligible comparing to those regarding low and medium levels, i.e., at higher values of parameters, almost constant values of SP are achieved. As a result, regarding both CE and SP as the performance indicators of ED cell, medium values of all parameters were found out to be optimum values. The corresponding generalization performances of the selected structure (4:6:2:1 network), as shown in Figs. 9 and 10, show no oscillations, and this confirms an excellent prediction performance of ANN. ANN predictions can also be used for optimization purposes for the applied ED setup in the range of operating parameters considered in the present work. 5. Conclusion ANN was employed as an interesting method for modeling of ED; a wastewater treatment process. A multilayer network (FFNNMLP), with two hidden layers (4:6:2:1), was applied to predict SP of Pb2+ ions in the dilute compartment and CE of a laboratory scale ED cell. Levenberg–Marquardt algorithm was used to train the network. ANN successfully tracked the non-linear behavior of SP and CE versus temperature, voltage, concentration and flow rate with standard deviation not more than 1%. For almost all experiments, the ANN was confirmed to be an adequate interpolation tool, where good prediction was obtained. ANN modeling technique has many favorable features such as efficiency, generalization and simplicity, which make it an attractive choice for modeling of complex systems, such as wastewater treatment processes. References [1] P. Canizares, A. Perez, R. Camarillo, Recovery of heavy metals by means of ultrafiltration with water-soluble polymers: calculation of design parameters, Desalination 144 (2002) 279–285. [2] T. Mohammadi, A. Razmi, M. Sadrzadeh, Effect of operating parameters on Pb2+ separation from wastewater using electrodialysis, Desalination 167 (2004) 379–385. [3] P. Zhou, J.C. Huang, A.W.F. Li, S. Wei, Heavy metal removal from wastewater in fluidized bed reactor, Water Res. 33 (8) (1999) 1918–1924. [4] S.R. E1-Hasani, S.M. AI-Dhaheri, M.S. EI-Maazawi, M.M. Kamal, Polarographic and voltammetric determination of some toxic heavy metal ions in the treated wastewater at Abu-Dhabi, UAE, Water Sci. Technol. 40 (7) (1999) 67–74. [5] E. Rodryguez de San Miguel, J.C. Aguilar, M.T.J. Rodryguez, J. De Gyves, Solvent extraction of Ga (III), Cd (II), Fe (III), Zn (II), Cu (II), and Pb (II) with ADOGEN 364 dissolved in kerosene from 1–4 tool dmy3 HCI media, Hydrometallurgy 57 (2000) 151–165. [6] A.R.K. Dapaah, N. Takano, A. Ayame, Solvent extraction of Pb(II) from acid medium with zinc hexamamethylenedithiocarbamate followed by backextraction and subsequent determination by FAAS, Anal. Chim. Acta 386 (1999) 281–286. [7] V.J. Inglezakis, M.D. Loizidou, H.P. Grgorooulou, Equilibrium and kinetic ion exchange studies of Pb2+ , Cr3+ , Fe3+ and Cu2+ on natural clinoptilolite, Water Res. 36 (11) (2002) 2784–2792. [8] M.V. Mier, R.L.P. Callejas, R. Gehr, B.E.J. Cisneros, P.J.J. Alvarez, Heavy metal removal with Mexican clinoptilolite: multi-component ionic exchange, Water Res. 35 (2) (2001) 373–378. [9] A. Vecchio, C. Finoli, D. Di Simine, V. Andreoni, Heavy metal biosorption by bacterial cells, Fresenius, J. Anal. Chem. 361 (4) (1998) 338–342. [10] M.L. Merroun, N. Ben Omar, M.T. Gonzalez Munoz, J.M. Arias, Removal of lead from aqueous solutions by Myxococcus xanthus, Int. Biodet. Biodeg. 37 (3-4) (1996) 241. [11] M. Bustard, A.P. McHale, Biosorption of heavy metals by distillery-derived biomass, Bioprocess Eng. 19 (5) (1998) 351–353.
[12] A.A. Pradhan, A.D. Levine, Microbial biosorption of copper and lead from aqueous systems, Sci. Total Environ. 170 (1995) 209–220. [13] J.M. Brady, J.M. Tobin, Binding of hard and soft metal ions to Rhizopus arrhizus biomass, Enzyme Microb. Technol. 17 (1995) 791–796. [14] V. Gopal, G.C. April, V.N. Schrodt, Selective lead ion recovery from multiple cation waste streams using the membrane–electrode process, Sep. Purif. Technol. 14 (1998) 85–93. [15] K. Basta, A. Aliane, A. Lounis, R. Sandeaux, J. Sandeaux, C. Gavach, Electroextraction of Pb2+ ions from dilute solutions by a process combining ion-exchange textiles and membranes, Desalination 120 (1998) 175–184. [16] R.S. Juang, S.W. Wang, Metal recovery and EDTA recycling from simulated washing effluents of metal contaminated soils, Water Res. 34 (15) (2001) 3795–3803. [17] G. Banerjee, S. Sarker, The role of salvinia rotundifolia in scavenging aquatic Pb(II) pollution: a case study, Bioprocess Eng. 17 (1997) 295–300. [18] W.S. Winston Ho, K.K. Sirkar, Membrane Handbook, Champan & Hall, 1992. [19] T. Mohammadi, A. Kaviani, Water shortage and seawater desalination by electrodialysis, Desalination 158 (2003) 267–270. [20] T. Mohammadi, A. Moheb, M. Sadrzadeh, A. Razmi, Separation of copper ions by electrodialysis using Taguchi experimental design, Desalination 169 (2004) 21–31. [21] W. Sha, K.L. Edwards, The use of artificial neural networks in materials science based research, Mater. Des. 28 (2007) 1747–1752. [22] F.S. Mjalli, S. Al-Asheh, H.E. Alfadala, Use of artificial neural network black-box modeling for the prediction of wastewater treatment plants performance, J. Environ. Manage. 83 (2007) 329–338. [23] G.B. Sahoo, C. Ray, Predicting flux decline in cross-flow membranes using artificial neural networks and genetic algorithms, J. Membr. Sci. 283 (2006) 147–157. [24] H. Chen, A.S. Kim, Prediction of permeate flux decline in cross-flow membrane filtration of colloidal suspension: a radial basis function neural network approach, Desalination 192 (2006) 415–428. [25] M. Dornier, M. Decloux, G. Trystram, A. Lebert, Dynamic modeling of crossflow microfiltration using neural networks, J. Membr. Sci. 98 (1995) 263–273. [26] E. Piron, E. Latrille, F. Rene, Application of artificial neural networks for crossflow microfiltration modelling: “black-box” and semiphysical approaches, Comp. Chem. Eng. 21 (9) (1997) 1021–1030. [27] M. Hamachi, M. Cabassud, A. Davin, M.M. Peuchot, Dynamic modeling of crossflow microfiltration of bentonite suspension using recurrent neural networks, Chem. Eng. Process. 38 (1999) 203–210. [28] C. Aydiner, I. Demir, E. Yildiz, Modeling of flux decline in crossflow microfiltration using neural networks: the case of phosphate removal, J. Membr. Sci. 248 (2005) 53–62. [29] S. Chellam, Artificial neural network model for transient crossflow microfiltration of polydispersed suspensions, J. Membr. Sci (1–2) (2005) 35–42. [30] N. Delgrange, C. Cabassud, M. Cabassud, L. Durand-Bourli, J.M. Laine, Modeling of ultrafiltration fouling by neural network, Desalination 118 (1998) 213–227. [31] N. Delgrange, C. Cabassud, M. Cabassud, L. Durand-Boulier, J.M. Laine, Neural networks for prediction of ultrafiltration transmembrane pressure—application to drinking water production, J. Membr. Sci. 150 (1998) 111–123. [32] W.R. Bowen, M.G. Jones, H.N.S. Yousef, Dynamic ultrafiltration of proteins—a neural network approach, J. Membr. Sci. 146 (1998) 225–235. [33] W.R. Bowen, M.G. Jones, H.N.S. Yousef, Prediction of the rate of crossflow membrane ultrafiltration of colloids: a neural network approach, Chem. Eng. Sci. 53 (22) (1998) 3793–3802. [34] C. Teodosiu, O. Pastravanu, M. Macoveanu, Neural network models for ultrafiltration and backwashing, Water Res. 34 (18) (2000) 4371–4380. [35] N. Delgrange-Vincent, C. Cabassud, M. Cabassud, L. Durand-Bourlier, J.M. Laine, Neural networks for long term prediction of fouling and backwash efficiency in ultrafiltration for drinking water production, Desalination 131 (2000) 353– 362. [36] W.R. Bowen, H.N.S. Yousef, J.I. Calvo, Dynamic crossflow ultrafiltration of colloids: a deposition probability cake filtration approach, Sep. Purif. Technol. 24 (2001) 297–308. [37] C. Bhattacharjee, M. Singh, Studies on the applicability of artificial neural network (ANN) in continuous stirred ultrafiltration, Chem. Eng. Technol. 25 (12) (2002) 1187–1192. [38] M. Cabassud, N. Delgrange-Vincent, C. Cabassud, L. Durand-Bourlier, J.M. Laine, Neural networks: a tool to improve UF plant productivity, Desalination 145 (2002) 223–231. [39] S. Curcio, V. Calabro, G. Iorio, Monitoring and control of TMP and feed flow rate pulsatile operations during ultrafiltration in a membrane module, Desalination 145 (1–3) (2002) 217–222. [40] S.M.A. Razavi, S.M. Mousavi, S.A. Mortazavi, Dynamic prediction of milk ultrafiltration performance: a neural network approach, Chem. Eng. Sci. 58 (2003) 4185–4195. [41] S.M.A. Razavi, S.M. Mousavi, S.A. Mortazavi, Dynamic modeling of milk ultrafiltration by artificial neural network, J. Membr. Sci. 220 (18) (2003) 47–58. [42] S. Curcio, V. Calabro, G. Iorio, Integration of Matlab and LabWiew to design innovative control systems applicable to membrane processes, in: Proceedings of the EuroMembrane 2004, Hamburg, Germany, September 28–October 01, 2004, 2004. [43] P. Rai, G.C. Majumdar, S. DasGupta, S. De, Modeling the performance of batch ultrafiltration of synthetic fruit juice and mosambi juice using artificial neural network, J. Food Eng. 71 (3) (2005) 273–281. [44] S. Curcio, G. Scilingo, V. Calabro, G. Iorio, Ultrafiltration of BSA in pulsating conditions: an artificial neural networks approach, J. Membr. Sci. 246 (2) (2005) 235–247.
M. Sadrzadeh et al. / Chemical Engineering and Processing 48 (2009) 1371–1381 [45] S. Curcio, V. Calabro, G. Iorio, Reduction and control of flux decline in crossflow membrane processes modeled by artificial neural networks, J. Membr. Sci. 286 (2006) 125–132. [46] G.R. Shetty, S. Chellam, Predicting membrane fouling during municipal drinking water nanofiltration using artificial neural networks, J. Membr. Sci. 217 (2003) 69–86. [47] G.R. Shetty, H. Malki, S. Chellam, Predicting contaminant removal during municipal drinking water nanofiltration using artificial neural networks, J. Membr. Sci. 212 (2003) 99–112. [48] W.R. Bowen, M.G. Jones, J.S. Webfoot, H.N.S. Yousef, Predicting salt rejection at nanofiltration membranes using artificial neural networks, Desalination 129 (2000) 147–162. [49] H. Al-Zoubi, N. Hilal, N.A. Darwish, A.W. Mohammad, Rejection and modeling of sulphate and potassium salts by nanofiltration membranes: neural network and Spiegler–Kedem model, Desalination 206 (2007) 42–60. [50] H. Niemi, A. Bulsari, S. Palosaari, Simulation of membrane separation by neural networks, J. Membr. Sci. 102 (1995) 185–191. [51] Y. Zhao, J.S. Taylor, S. Chellam, Predicting RO/NF water quality by modified solution diffusion model and artificial neural networks, J. Membr. Sci. 263 (2005) 38–46. [52] A. Abbas, N. Al-Bastaki, Modeling of an RO water desalination unit using neural networks, Chem. Eng. J. 114 (2005) 139–143. [53] K. Al-Shayji, Y.A. Liu, Neural networks for predictive modeling and optimization of large-scale commercial water desalination plants, in: Proc. IDA World Congress Desalination Water Science, vol. 1, 1997, pp. 1–15. [54] K. Al-Shayji, Y.A. Liu, Predictive modeling of large-scale commercial water desalination plants: data-based neural network and model-based process simulation, Ind. Eng. Chem. Res. 41 (25) (2002) 6460–6474. [55] M.M. Jafar, A. Zilouchian, Adaptive receptive fields for radial basis functions, Desalination 135 (2001) 83–91. [56] A. Shahsavand, M. Pourafshari Chenar, Neural networks modeling of hollow fiber membrane processes, J. Membr. Sci. 297 (2007) 59– 73. [57] L. Wang, C. Shao, H. Wang, H. Wu, Radial basis function neural networks based modeling of the membrane separation process: hydrogen recovery from refinery gases, J. Nat. Gas Chem. 15 (2006) 230–234.
1381
[58] S. Geissler, T. Wintgens, T. Melin, K. Vossenkaul, C. Kullmann, Modeling approaches for filtration processes with novel submerged capillary modules in membrane bioreactors for wastewater treatment, Desalination 178 (2005) 125–134. [59] O. Cinar, H. Hasar, C. Kinaci, Modeling of submerged membrane bioreactor treating cheese whey wastewater by artificial neural network, J. Biotechnol. 123 (2006) 204–209. [60] W.Y. Lee, G.G. Park, T.H. Yang, Y.G. Yoon, C.S. Kim, Empirical modeling of polymer electrolyte membrane fuel cell performance using artificial neural networks, Int. J. Hydrogen Energy 29 (2004) 961–966. [61] S. Ou, L.E.K. Achenie, A hybrid neural network model for PEM fuel cells, J. Power Sources 140 (2) (2005) 319–330. [62] K.A. Al-Shayji, Modeling, Simulation and Optimization of Large-scale Commercial Desalination Plant, PhD thesis, Virginia Polytechnic Institute and State University, Virginia, USA, 1998. [63] H. Demuth, M. Beale, Neural Network Toolbox: For Use With MATLAB (Version 4.0), The Math Works Inc., 2004. [64] D.R. Baughman, Y.A. Liu, Neural Network in Bioprocessing and Chemical Engineering, Academic Press, San Diego, CA, USA, 1995. [65] J.A. Freeman, D.M. Skapura, Neural Networks Algorithms, Applications, and Programming Techniques, Addison-Wesley, Reading, MA, USA, 1992. [66] S. Haykin, Neural Networks: A Comprehensive Foundation, Macmillan College Co., New York, USA, 1994. [67] P. Mehra, B.W. Wah, Artificial Neural Networks: Concepts and Theories, IEEE Press, Los Alamitos, CA, USA, 1992. [68] M.C. Porter, Handbook of Industrial Membrane Technology, Noyes Publications, New Jersey, 1990. [69] R.W. Rousseau, Handbook of Separation Process Technology, John Wiley and Sons Inc., New York, 1987. [70] T. Mohammadi, A. Moheb, M. Sadrzadeh, A. Razmi, Modeling of metal ion removal from wastewater by electrodialysis, Sep. Purif. Technol. 41 (2005) 73–82. [71] K. Hornik, M. Stinchombe, H. White, Multi-layer feedforward networks are universal approximations, Neural Networks 2 (1989) 359–366. [72] I. Flood, N. Kartam, Neural networks in civil engineering. I. Principles and understanding, J. Comp. Civil Eng. 8 (1994) 131–148.