Energy Conversion and Management 49 (2008) 1989–1998
Estimation of furnace exit gas temperature (FEGT) using optimized radial basis and back-propagation neural networks

J.S. Chandok a, I.N. Kar b, Suneet Tuli c

a Energy Technologies, NTPC Limited, New Delhi 110003, India
b Department of Electrical Engineering, Indian Institute of Technology Delhi, New Delhi 110016, India
c Center for Applied Research in Electronics, Indian Institute of Technology Delhi, New Delhi 110016, India
Article history: Received 30 October 2006; received in revised form 25 July 2007; accepted 5 March 2008; available online 28 April 2008.

Keywords: Boiler; Furnace exit gas temperature (FEGT); Prediction; Neural network; RBF
Abstract

The boiler is a very important component of a thermal power plant, and its efficient operation requires continuous online information on various relevant parameters. Furnace exit gas temperature (FEGT) is one such important design/operating parameter. Knowledge of FEGT is not only useful for the design of the convective heating surfaces but also helpful for operating actions and decision making, and its online availability improves the economic benefit of the power plant; its non-availability at the operator desk greatly limits efficient operation. In this study, a novel method of estimating FEGT using neural networks is presented. The training data are first generated by calculating FEGT using heat balances through the various heat exchangers. Prediction accuracy and fast response are the major advantages of using a neural network to estimate FEGT for operator information. Two types of feed forward neural modeling networks, the radial basis function network and the back-propagation network, were applied and compared based on their network simplicity, model building and prediction accuracy. The results are verified on practical data obtained from a 210 MW boiler of a thermal power plant.

© 2008 Elsevier Ltd. All rights reserved.
1. Introduction

In conventional two-pass pulverized fuel (PF) boilers, combustion takes place in the furnace, and the circulating fluid inside the evaporative furnace tubes absorbs less than half of the total fuel heat. Furnace tubes viewed by the flame are treated as radiant heat transfer surfaces, whereas the other boiler tubes downstream of the furnace are assumed to be convective or a combination of the two types [1]. At the furnace exit, the flue gases produced by combustion attain a certain temperature, termed the furnace exit gas temperature (FEGT). The FEGT is an important design and operating parameter and determines the ratio of heat absorption by the radiant heating surfaces in the furnace to that by the convective heating surfaces downstream of the furnace [2]. A high value of FEGT makes the furnace compact but the convective sections larger. The FEGT is chosen to be below the ash deformation temperature, to avoid severe fouling of the back-pass tubes by molten ash. Similarly, special provision of over-fire air is made in some large-furnace boilers to reduce the peak furnace temperature and CO formation and to improve furnace safety. Further, firing inferior coal leads to excessive formation of soot and ash deposits on
the furnace water wall tubes, causing a high FEGT value. Knowledge of FEGT is useful for the design of convective heating surfaces and for plant operating actions, which improves the economic benefit of the power plant; its non-availability at the operator desk is a major limitation on efficient operation. In a PF boiler, the predominant design and operating considerations that govern FEGT are the size of the convective sections, the ash deformation temperature, NOx formation and pollution control.

Conventionally, thermocouples, furnace temperature probes, etc., mounted on the left and right sides of the furnace are employed to measure FEGT, but their performance is compromised by their inability to sustain such high temperatures. Recently, in a few power plants, methods such as acoustic and optical pyrometry have been used, but these have not become popular owing to their limitations of high cost, lens fouling, combustion noise and frequent calibration. Alternatively, FEGT can be calculated analytically by carrying out heat balances across the various heat exchangers in the flue gas path. Since FEGT depends on various direct and indirect boiler parameters, it is difficult to build a precise mathematical model.

A novel approach to estimating FEGT using an artificial neural network is proposed in this paper. A neural network can be used as a model-free estimator, capable of learning from available historic data, called training data, consisting of both inputs and output. In this study, FEGT, which is not directly measurable, is the output of the neural network.
Nomenclature

hgo     flue gas enthalpy at economizer outlet (BTU/lb)
Tgo     flue gas temperature at economizer outlet (°F)
Tgi     flue gas temperature at economizer inlet (°F)
hgi     flue gas enthalpy at economizer inlet (BTU/lb)
hfwi    feed water enthalpy at economizer inlet
hfwo    feed water enthalpy at economizer outlet
yj(k)   output of neuron j
dj(k)   desired value for neuron j
ej(k)   error function of neuron j
w_ij^L  connecting weight for the ith neuron in the Lth layer
E(k)    instantaneous sum squared error
η       learning rate
φ′      derivative of the activation function
y_1^L   final output of the first neuron in the Lth layer
cj      center of an RBF neuron in the hidden layer
xi      training data
σ       width of the Gaussian function
λ       regularization parameter
Hence, to obtain this output, an analytical calculation of FEGT is first carried out using a heat balance across each heat exchanger in the flue gas path. This requires knowledge of the pressure, temperature and flow of the working fluid at the inlet and outlet of each heat exchanger. A total of nine input parameters (Section 3), selected on the basis of operator experience and theoretical studies, serve as inputs to the neural network. These inputs may or may not have a direct mathematical relation with FEGT, but they certainly have a great influence on it. Prediction accuracy and fast response are the major concerns in using a neural network to provide a reliable and meaningful value of FEGT for operator information.

This work also evaluates the prediction ability of two important neural networks: the radial basis function (RBF) network and the multilayer perceptron (MLP) trained by back-propagation. Measures such as the correlation coefficient, prediction error bar charts and speed are adopted to compare the performance of these two networks. Neural network modeling methods are becoming very popular because of their applications in various disciplines, e.g. prediction of coal ash fusion temperature [11], turbogenerator control [14], solar array modeling and maximum power point prediction [15], and power plant condenser performance [18]. Amoudi and Zhang [15] compared BP and RBF neural networks and concluded that BP networks take longer to train but require less information than RBF networks. Optimized design of neural networks, depending on their application, has also been reported: Prieto et al. [18] applied an artificial neural network to a power plant condenser and showed that the network design could be enhanced by knowledge of the physical variables. FEGT, though a very important measurement for plant operation, has received little attention in the literature; no reference is available in the open literature in which FEGT is estimated using an ANN.

This paper first describes an analytical calculation of FEGT using convective heat transfer in the backward path. Important considerations for the selection of plant parameters as inputs to the neural network are then elaborated. The architecture and training of feed forward networks, both MLP and RBF, are explained, and the results and performance of the two networks are discussed, comparing their advantages and disadvantages.
2. Process description and FEGT calculation

Heat transfer in a steam generator is a complex and inexact phenomenon owing to its geometry, absorbing gas volumes, furnace walls, etc., and thus there are varied opinions regarding which correlation to apply in a particular situation. There are three mechanisms of heat transfer (conduction, convection and radiation), similar in some aspects but quite different in others. When fuel burns in the boiler it releases a large amount of energy, which heats the product of combustion (flue gas) to a very high temperature, ranging from 1500 °C to 1600 °C in the flame core. Though the flue gas is cooled by the evaporator and the platen superheater in the furnace, the temperature at the exit of the furnace is still in the range of 1000–1250 °C. Since the flue gas flows through the furnace at a low velocity, only a small fraction of the total heat transferred to the walls is through convection. After leaving the furnace, the flue gas enters the convective section of the boiler at the furnace exit gas temperature, as shown in Fig. 1, where it cools further by transferring heat to water, steam and, in some cases, combustion air. Because the principal mode of heat transfer here is forced convection, this section is called the convection section. The gas enters at the FEGT and leaves slightly above the stack temperature [2]. The heat exchangers located in the convective section include the reheater (RHTR), final superheater (FNSH), low temperature superheater (LTSH), economizer and air preheater (APH), all placed in series.

2.1. FEGT calculation

The method employed in the present work is a lumped analysis approach, in which each heat exchanger is described by average characteristics. The analytical method is based on a series of heat balances beginning with the average flue gas temperature measured by left and right thermocouples at the economizer outlet. Working upstream toward the furnace outlet, the average gas temperature entering each tube bank is determined using a series of heat transfer calculations: the heat gained by the working fluid (water or steam) equals the heat lost by the flue gas. The last heat transfer section in this series of calculations is the reheater, and its inlet gas temperature is the FEGT [3].
Fig. 1. Overview of boiler and location of furnace exit gas plane.
The flue gas temperature at the inlet of any heat exchanger is calculated by first computing the flue gas enthalpy from the known gas temperature at the outlet of that heat exchanger, using the quadratic relation given below [4]. The flue gas enthalpy at the economizer outlet is given by

h_go = a T_go^2 + b T_go + c    (1)

where h_go is in BTU/lb and T_go is the flue gas temperature at the economizer outlet in °F. The values of the scalar constants a, b and c in Eq. (1) for the relevant temperature range are

a = 1.725460 × 10^-5, b = 0.2336275, c = -18.58662

The total energy of the flue gas at the inlet of the economizer, h_gi, is the sum of the flue gas energy at the outlet of the economizer, calculated from Eq. (1), and the heat energy gained by the feed water from the flue gas:

h_gi = h_go + (h_fwo - h_fwi)    (2)

where h_fwi and h_fwo are the specific enthalpies of feed water at the economizer inlet and outlet, respectively. With the value of h_gi known, Eq. (1) can be used again, this time rearranged as the quadratic a T_gi^2 + b T_gi + (c - h_gi) = 0, to calculate the economizer inlet gas temperature:

T_gi = [-b + sqrt(b^2 - 4a(c - h_gi))] / (2a)    (3)

This economizer inlet flue gas temperature is then taken as the average outlet temperature of the low temperature superheater (LTSH), the next element of the series in the backward path, and the temperature at the inlet of the LTSH is calculated; this in turn gives the average temperature at the final superheater inlet. Likewise, the final superheater inlet temperature is taken as the average reheater outlet temperature and is finally used to calculate the reheater inlet temperature, i.e. the FEGT. Fig. 2 shows the variation of calculated FEGT, load and feed water (FW) flow. Figs. 3 and 4 show the convective heat transfer and the flue gas temperature at the inlet of the various boiler components at full load and part load, respectively. They confirm that the heat transfer in the reheater is the maximum among the considered components: the reheater, being nearest to the furnace, is subjected to very high temperature, so some radiative heat transfer is also present. Further, as per the design, the reheater is a single component, whereas the superheater is divided into three subcomponents, each sharing some of the heat transfer; the total heat transfer in the superheater (platen + LTSH + final SH) is much more than that in the reheater. These results also give the flue gas temperatures at the inlet of the various components, which are useful for knowing the intermediate process conditions.
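To make the backward chain of Eqs. (1)–(3) concrete, the sketch below steps the heat balance upstream from the measured economizer outlet temperature to the reheater inlet. This is a minimal illustration, not the authors' implementation: the function names and the per-bank heat duties are hypothetical placeholders; only the constants a, b and c come from Eq. (1).

```python
import math

# Scalar constants of the flue gas enthalpy quadratic, Eq. (1)
A_C = 1.725460e-5
B_C = 0.2336275
C_C = -18.58662

def gas_enthalpy(t_gas_f):
    """Eq. (1): flue gas enthalpy (BTU/lb) from gas temperature (deg F)."""
    return A_C * t_gas_f**2 + B_C * t_gas_f + C_C

def gas_temperature(h_gas):
    """Eq. (3): positive root of a*T^2 + b*T + (c - h) = 0."""
    disc = B_C**2 - 4.0 * A_C * (C_C - h_gas)
    return (-B_C + math.sqrt(disc)) / (2.0 * A_C)

def inlet_gas_temperature(t_gas_out_f, q_fluid_btu_lb):
    """One heat-balance step: Eq. (2) followed by Eq. (3).

    q_fluid_btu_lb is the heat picked up by the working fluid per lb of
    flue gas across the bank (e.g. h_fwo - h_fwi scaled by flow ratio).
    """
    h_in = gas_enthalpy(t_gas_out_f) + q_fluid_btu_lb   # Eq. (2)
    return gas_temperature(h_in)                        # Eq. (3)

# Working upstream: economizer -> LTSH -> final SH -> reheater; the
# reheater inlet gas temperature is the FEGT. Duties are illustrative.
t_gas = 650.0  # measured economizer outlet gas temperature, deg F
for duty in [80.0, 95.0, 110.0, 150.0]:   # per-bank heat pickup, BTU/lb
    t_gas = inlet_gas_temperature(t_gas, duty)
print(f"Estimated FEGT: {t_gas:.0f} deg F")
```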
Fig. 2. Variation of load, FW flow and calculated FEGT for a set of operation.

3. Inputs for neural network modeling

The main objective of this work is to exploit the potential of neural networks to estimate the FEGT. This requires proper selection of input parameters for the NN model, based on the rationale that all parameters that have a direct or indirect effect on FEGT ought to be included. As shown in Fig. 5, the important input parameters selected for FEGT neural network modeling are:

1. Feed water flow.
2. Total coal flow.
3. Secondary airflow.
4. Secondary air temperature.
5. Primary air to coal flow ratio.
6. Flue gas O2%.
7. Burner tilt.
8. Mill combination.
9. Cleanliness factor.
Fig. 3. Heat transfer and temperature in various component of boiler (full load).
Fig. 4. Heat transfer and temperature in various component of boiler (part load).
Of these parameters, feed water flow, coal flow, secondary airflow and secondary air temperature are basic input conditions to the boiler; their variations directly reflect the various plant load conditions and hence different FEGT values. The remaining parameters, i.e. PA/coal flow ratio, O2%, burner tilt, mill combination and cleanliness factor, though not mathematically related to FEGT, have a great influence on it at a particular operating load.

The velocity of the primary stream must exceed the speed of flame propagation so as to avoid flashback, and on leaving the burner the velocity of the mixture must also be low enough for stable ignition [5]. The primary air to coal ratio therefore leads to variation in the actual combustion and in turn in FEGT; generally the air–coal ratio of the primary stream is maintained at 2:1. Flue gas O2%, CO2 and CO are products of combustion and characterize the quality of combustion and hence FEGT. As only the O2% measurement was available in the chosen plant, it alone was taken as an input parameter for FEGT estimation. Burner tilt also has a great impact on FEGT: the temperature rises when the tilt is in the upward position and falls when it is in the downward position.

The last two inputs, mill combination and cleanliness factor, are calculated parameters; unlike the other seven, they are not taken directly from the plant data communication system. There are six coal mills in the selected 210 MW plant, and at any time generally four mills are in service. The combination of the four running mills (lower, middle or upper) affects FEGT at a particular load condition. A representative value for each combination is obtained by taking the weighted average of the mills in service; upper mills are given more weight than lower mills, as FEGT is higher when the upper mills are in service. The mill combination input is computed by the weighted average method as

Mill combination = [0.2 × (Mill A) + 0.4 × (Mill B) + 0.6 × (Mill C) + 0.8 × (Mill D) + 1.0 × (Mill E) + 1.2 × (Mill F)] / 4.2
where (Mill X) is taken as '1' if the mill is in service and '0' otherwise. Similarly, the cleanliness factor input, which is the ratio of the actual operating heat of combustion to the design heat of combustion [6,7], also influences FEGT at a particular load condition. A high cleanliness factor indicates a clean boiler, leading to better heat transfer and thus a comparatively low temperature at the furnace exit:

Cleanliness factor = Operating heat of combustion / Design heat of combustion

where the heat of combustion is given by the sum (main steam heat energy + reheater steam heat energy + blow down water heat energy - SH spray heat energy + RH spray heat energy).
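A minimal sketch of these two calculated inputs follows; the function names and example values are ours, not from the plant system, and the sign convention on the spray terms is taken exactly as given above.

```python
# Hypothetical helpers for the two calculated inputs of Section 3.

MILL_WEIGHTS = {"A": 0.2, "B": 0.4, "C": 0.6, "D": 0.8, "E": 1.0, "F": 1.2}

def mill_combination(mills_in_service):
    """Weighted average of the running mills; upper mills weigh more."""
    return sum(MILL_WEIGHTS[m] for m in mills_in_service) / 4.2

def cleanliness_factor(main_steam_q, rh_steam_q, blowdown_q,
                       sh_spray_q, rh_spray_q, design_heat):
    """Operating / design heat of combustion [6,7]."""
    operating = (main_steam_q + rh_steam_q + blowdown_q
                 - sh_spray_q + rh_spray_q)   # sign convention as in the text
    return operating / design_heat

print(mill_combination(["C", "D", "E", "F"]))  # upper mills in service: ~0.857
print(mill_combination(["A", "B", "C", "D"]))  # lower mills in service: ~0.476
```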
4. Neural network implementation

Fig. 5. Neural network input/output model.

A feed forward network is a special architecture of neural network in which the neurons are organized in layers. In this work, two types of feed forward network, namely the multilayer perceptron and the radial basis function network, are adopted to estimate FEGT. The general architecture of a feed forward network is depicted in Fig. 6, a standard architecture with one hidden layer. MLP and RBF networks mainly differ in the type of neuron employed in the hidden layer. The input layer consists of the descriptors, and the output layer employed in this study for both networks uses a linear output function. The goal in neural modeling is to generate the weights, w, and biases, b, which model the data and act as coefficients to predict accurate output values [8–10].
Fig. 6. General architecture of single hidden layer feed forward network.
4.1. Training data generation

Input data samples used for training and testing of the neural network model were collected from a 210 MW thermal power plant. All required data points are averaged over one minute in the communication system itself. In this way, around 3200 samples (known as exemplars) spanning three different days were collected. The inputs directly available from the plant and the other calculated parameters, along with the target FEGT, are extracted from the whole data set for neural network training. To avoid duplication of input data, all data points with no variation in the parameter values between successive samples are removed from the training data. Data containing extreme peculiarities are also filtered out; otherwise the network ends up memorizing these peculiarities and its generalization capability becomes poor. In this manner, a total of 1489 data samples are selected for training, testing and validation.

Neural network training can be made more efficient if certain preprocessing steps are performed on the training data set. The input data applied to the network and the target data for training and testing are normalized to the range of the activation function. Care must also be taken that the normalized input and target values do not fall in the saturation regions of the activation function characteristic curve, to avoid unrealistic network response [11]. Hence, all data samples are normalized to the range -0.9 to +0.9, as the range of the tan-sigmoid activation function is -1 to +1. For a variable with maximum and minimum values V_max and V_min, respectively, each value V is scaled to its normalized value A using

A = 1.8 (V - V_min) / (V_max - V_min) - 0.9    (4)
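As a simple illustration of this preprocessing step, the sketch below applies Eq. (4) column-wise to an exemplar matrix and provides the inverse mapping, used later to recover the predicted FEGT in engineering units; the sample values are invented.

```python
import numpy as np

def normalize(v, v_min, v_max):
    """Eq. (4): scale v into [-0.9, +0.9], clear of tan-sigmoid saturation."""
    return 1.8 * (v - v_min) / (v_max - v_min) - 0.9

def denormalize(a, v_min, v_max):
    """Inverse of Eq. (4), used in Section 5 to recover engineering units."""
    return (a + 0.9) * (v_max - v_min) / 1.8 + v_min

# Column-wise normalization of an exemplar matrix (rows = samples);
# the numbers are purely illustrative.
X = np.array([[210.0, 640.0],
              [150.0, 520.0],
              [180.0, 580.0]])
X_min, X_max = X.min(axis=0), X.max(axis=0)
X_norm = normalize(X, X_min, X_max)   # every column now lies in [-0.9, 0.9]
```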
4.2. MLP network with back-propagation algorithm

MLPs have been applied successfully to solve some difficult and diverse problems by training them in a supervised manner with the back-propagation algorithm, which is based on the error-correction learning rule. Error back-propagation learning consists of two passes through the different layers of the network: a forward pass and a backward pass. In the forward pass, an input vector is applied to the sensory nodes of the network and its effect propagates through the network, layer by layer, finally producing a set of outputs as the actual response of the network. During the forward pass all the synaptic weights of the network are fixed, while during the backward pass the synaptic weights are adjusted in accordance with the error correction rule [11]. The error signal at the output of neuron j at iteration k is defined by

e_j(k) = d_j(k) - y_j(k)    (5)

where y_j(k) is the output of neuron j and d_j(k) is the desired value for neuron j. The instantaneous sum squared error E(k) is given by

E(k) = (1/2) Σ_j e_j^2(k)    (6)

and the mean squared error (MSE) is obtained by averaging E(k) over the data set of size N:

MSE = E_av = (1/N) Σ_{k=1}^{N} E(k)

For layers l = 1, 2, ..., L, y_1^L is the final output of the first neuron in the output layer, which is a function of the net input to this neuron. The net input to the ith neuron in the Lth layer is given by

net_i^L = Σ_{j=1}^{n_{L-1}} (w_ij^L y_j^{L-1} + w_i0^L)    (7)

y_i^L = φ(net_i^L)    (8)

The nonlinear activation function φ(·) employed in this study is the tan-sigmoid (hyperbolic tangent) function

φ(net) = tanh(net/2) = (1 - e^{-net}) / (1 + e^{-net})    (9)

The weights w_ij^l are the connecting weights, and their update rule is given by

w_ij^l(t + Δt) = w_ij^l(t) + η [ (1/P) Σ_{p=1}^{P} (d_ip - y_ip^l) φ′(net_ip) ] y_j^{l-1}    (10)
where η is the learning rate and φ′ is the derivative of the activation function. The iteration process continues until the total number of iterations (epochs) is reached or the specified error level for training is attained.

The speed of training a network depends largely on the method of updating the weights w and biases b in the hidden layer, and also on the size of the training data matrix. In standard back-propagation, w and b are updated by gradient descent, with w and b moved in the direction opposite to the error gradient e. Each step down the gradient results in smaller errors until the error minimum is reached. Normally, momentum and learning rate terms are incorporated in the training scheme, which makes the changes proportional to a running average of the gradient, thereby speeding up convergence and reducing the risk of being trapped in a local minimum. In this work, the Levenberg–Marquardt approximation [12] is employed, which is several orders of magnitude faster than the standard gradient descent method. The Levenberg–Marquardt rule states that the change in weights ΔW is

ΔW = (J^T J + μI)^{-1} J^T e    (11)

where J is the Jacobian matrix of derivatives of the errors with respect to each weight, μ is a scalar, e is the error vector and I is the identity matrix [9,12].

4.2.1. Initialization of the MLP network

The back-propagation learning algorithm is sensitive to initial conditions. If the synaptic weights are assigned large initial values, the activation function of the neurons (the sigmoid function) will very possibly reach saturation and the whole MLP network will get stuck in local minima [11]. On the other hand, if the synaptic weights are assigned small initial values, the back-propagation algorithm may operate on a very flat region around the origin of the error surface. Many initialization methods have been put forward for the back-propagation learning algorithm. In this work, a layer's weights and biases are initialized according to the Nguyen–Widrow initialization algorithm [12]. This algorithm chooses values that distribute the active region of each neuron evenly across the layer's input space. Compared with purely random weights and biases, it has the advantages that fewer neurons are wasted (since the active regions of all the neurons lie in the input space) and training is faster (since each area of the input space is covered by neurons).

4.2.2. Selection of MLP neural network architecture

The total of 1489 preprocessed data samples used in the present work was divided suitably into three parts: around 900 samples are kept for training, 200 for validation and 300 for testing.
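For illustration, a simplified version of the Nguyen–Widrow idea and of the data split described above might look as follows. This is a sketch under stated assumptions, not the MATLAB toolbox implementation [12], which differs in details such as how the biases are spread.

```python
import numpy as np

rng = np.random.default_rng(1)

def nguyen_widrow_init(n_inputs, n_hidden):
    """Simplified Nguyen-Widrow-style initialization (cf. [12]): random
    weight directions rescaled so each neuron's active region helps tile
    the input space."""
    beta = 0.7 * n_hidden ** (1.0 / n_inputs)
    W = rng.uniform(-0.5, 0.5, (n_hidden, n_inputs))
    W *= beta / np.linalg.norm(W, axis=1, keepdims=True)
    b = rng.uniform(-beta, beta, n_hidden)
    return W, b

W1, b1 = nguyen_widrow_init(9, 5)   # the 9-input, 5-hidden-neuron MLP

# Shuffled split of the 1489 exemplars, roughly as in Section 4.2.2
# (~900 training, ~200 validation, ~300 testing; remainder unused here).
idx = rng.permutation(1489)
train_idx, val_idx, test_idx = idx[:900], idx[900:1100], idx[1100:1400]
```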
Fig. 7. Comparison of energy value for the MLP with different hidden neurons.
In principle, it has been proved that a neural network model with only one hidden layer can uniformly approximate any continuous function [8,11]. In the present work, the FEGT approximation with a nine dimensional input space is carried out with a single hidden layer network whose number of neurons is varied. MLP networks with different numbers of hidden neurons are trained with 900 patterns over 500 iterations. The energy function (MSE) of the different models on the 900 training patterns is plotted against the number of hidden neurons in Fig. 7. It indicates substantial improvement in the performance function up to 5 neurons in the hidden layer, but only marginal improvement as the number of neurons increases further from 6 to 10. In an actual engineering system, the training data are usually erroneous, so too high a training precision will overfit the training patterns and impede generalization of the MLP network [11,13]. From this experiment and analysis, five neurons are finally employed in the MLP network.

4.2.3. Selection of training precision

To avoid overfitting due to the large capacity of the network, and for good generalization, the cross-validation method with early stopping [8,11] is used in training the network. Early stopping ensures that the network generalizes well to unseen data. To apply cross-validation, the total data set is first partitioned into a training set and a testing set, and the training set is further partitioned into two disjoint subsets: an estimation subset, used to select the model, and a validation subset, used to validate the model. The training subset is used to adapt the weights by Eqs. (5)–(10). After each epoch, the network is queried with the input vectors in the validation subset and the mean squared validation error is calculated. The energy function for the training patterns usually decreases as training progresses, while that for the validation patterns decreases in the initial stage of training and then increases as training continues, as shown in Fig. 8. The opinion that the smaller the preset training precision, the better the generalization of the MLP network, is not always true, especially in actual applications, because data from almost all engineering systems are inevitably corrupted by noise. If the validation error increases with continued training, training is terminated because of the potential for overfitting.
Fig. 8. MSE vs epoch for training and validation data set.
If the validation error remains the same for more than 10 successive epochs, it is assumed that the network has converged. It can be seen from Fig. 8 that the training stopped after only 18 iterations because the validation error increased, which means that training beyond this point would impede the generalization capability of the network.

4.3. Radial basis function network

The design of a supervised neural network may be pursued in a variety of ways. The back-propagation algorithm for the design of a multilayer perceptron, as described in the previous section, may be viewed as the application of a recursive technique; another approach, the radial basis function network, views the design of a neural network as a curve fitting problem in a high dimensional space. The RBF network has a number of advantages over the MLP with regard to training and locality of approximation, making it an attractive alternative for online applications [14].

4.3.1. Basic features of RBF network

The proposed RBF network for FEGT estimation is shown in Fig. 6. It comprises three layers: the input layer, the hidden layer and the output layer. The input layer consists of a nine dimensional vector, and the output layer has only one element, the FEGT. The hidden layer is composed of m RBFs φ_j (j = 1, ..., m) that are connected directly to all the elements in the input layer. For a data set consisting of N input vectors together with the corresponding output FEGT, there are N such hidden units, each corresponding to one data point. The activation function of a hidden unit is symmetric in the input space, and the output of each hidden unit depends only on the radial distance between the input vector and the center of that hidden unit. The output of hidden unit j (j = 1, 2, ..., m) is given by

h_j(x_i) = φ(||x_i - c_j||)

where φ(·) is the Gaussian activation function

φ_j(x_i) = exp[-||x_i - c_j||^2 / (2σ^2)]

with x_i the training data, c_j the center of the neuron in the hidden layer, σ the width of the Gaussian function and ||·|| the Euclidean norm.

An RBF network is nonlinear if the basis functions can move or change size, or if there is more than one hidden layer. Here, we focus on a single hidden layer network with functions that are fixed in position and size. When applied to supervised learning with this linear model, the least squares principle leads to a particularly easy optimization problem. If the model is

f(x) = Σ_{j=1}^{m} w_j h_j(x)    (12)

and the training set is {(x_i, ŷ_i)}_{i=1}^{p}, then the least squares recipe is to minimize the sum squared error

S = Σ_{i=1}^{p} (ŷ_i - f(x_i))^2    (13)

with respect to the weights of the model. The minimization of the above cost function leads to a set of m simultaneous linear equations in the m unknown weights, which can be solved to obtain the optimal weight vector

ŵ = A^{-1} H^T ŷ    (14)

where the design matrix H is

H = [ h_1(x_1)  h_2(x_1)  h_3(x_1)  ...  h_m(x_1)
      h_1(x_2)  h_2(x_2)  h_3(x_2)  ...  h_m(x_2)
      ...
      h_1(x_p)  h_2(x_p)  h_3(x_p)  ...  h_m(x_p) ]

and A = H^T H. In practical situations, training based on the available data (which is also contaminated by noise) is an ill-posed problem [12], in that a large data set may contain a surprisingly small amount of information about the desired solution, and no unique solution exists. In such situations it is necessary to supply extra information (or assumptions); the mathematical technique for this is known as regularization [10]. In the linear model (12), model complexity can be controlled by the addition of a penalty term. When the combined error

E = Σ_{i=1}^{p} (y_i - f(x_i))^2 + λ Σ_{j=1}^{m} w_j^2    (15)

is optimized, large components in the weight vector w are inhibited. This kind of penalty is known as ridge regression or weight decay, and the parameter λ, which controls the amount of penalty, is known as the regularization parameter. A small value of λ means the data can be fit tightly without causing a large penalty; a large value of λ means a tight fit has to be sacrificed if it requires large weights.

4.3.2. Selection of RBF network architecture

The selection of the RBF network architecture consists of selecting its centers, spread and optimal weights. An intractable problem often met in RBF network applications is the choice of centers, which greatly affects the complexity and performance of the network. If too few centers are used, the network may not be capable of generating a good approximation to the target function; with too many centers, the network may overfit the data, fitting misleading variations due to imprecise or noisy data. Different learning strategies can be followed in the design of an RBF network, depending on how the centers of the radial basis functions are specified. The learning method employed in the present work uses the forward selection approach to determine the centers of the RBF functions. Forward selection [10,16] is a direct approach to control model complexity by selecting a subset of centers from a larger set consisting of all the input samples; the input data vectors in the training set are used as the candidate centers. The method starts with an empty network and adds one neuron at a time to the hidden layer, an incremental operation: at each step, the candidate unit that most decreases the sum squared error (SSE) and has not already been selected is added to the current network. In a conventional RBF network, the procedure of adding hidden neurons stops when the error of the network output reaches a pre-set error goal. With the improved method used here, generalized cross-validation (GCV) is used as the model selection criterion (MSC) to calculate the prediction error during the training procedure and decide when to stop adding further neurons:

σ̂_GCV^2 = p ŷ^T P^2 ŷ / (trace(P))^2    (16)

where P is the projection matrix and p is the number of data samples used in training the network. This quantity estimates how well the trained network will perform in the future.
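The pieces of Eqs. (12)–(16) translate into a short sketch: build the Gaussian design matrix, solve the ridge regression problem of Eq. (15), and score candidate center sets with the GCV criterion of Eq. (16). This is an illustrative, unoptimized rendering with helper names of our own; it follows the paper's recipe of adding the candidate that most reduces the SSE and stopping when GCV no longer improves.

```python
import numpy as np

def design_matrix(X, centers, sigma):
    """H of Eq. (14): H[i, j] = phi_j(x_i), Gaussian basis functions."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def ridge_weights(H, y, lam):
    """Minimizer of Eq. (15): w = (H^T H + lam*I)^(-1) H^T y."""
    A = H.T @ H + lam * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ y)

def gcv_score(H, y, lam):
    """Eq. (16) with projection matrix P = I - H (H^T H + lam*I)^(-1) H^T."""
    p = len(y)
    P = np.eye(p) - H @ np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T)
    return p * float(y @ (P @ (P @ y))) / np.trace(P) ** 2

def forward_selection(X, y, sigma=7.0, lam=5e-6):
    """Grow the center set greedily by SSE; stop when GCV stops improving."""
    candidates = list(range(len(X)))
    selected, best_gcv = [], np.inf
    while candidates:
        def sse(c):
            H = design_matrix(X, X[selected + [c]], sigma)
            r = y - H @ ridge_weights(H, y, lam)
            return float(r @ r)
        c = min(candidates, key=sse)           # candidate that most reduces SSE
        H = design_matrix(X, X[selected + [c]], sigma)
        score = gcv_score(H, y, lam)
        if score >= best_gcv:
            break                              # adding c would not generalize better
        best_gcv = score
        selected.append(c)
        candidates.remove(c)
    H = design_matrix(X, X[selected], sigma)
    return np.array(selected), ridge_weights(H, y, lam)
```

The defaults sigma = 7 and lam = 5e-6 mirror the values finally adopted in this application (Table 1); the double loop over candidates makes this O(N^2) per step, which is acceptable only as a sketch.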
Table 1. Selection of the radius of the basis function (λ = 5 × 10^-6)

Serial no. | Radius selected | Number of basis functions selected | MSE on test data
1          | 2               | 67                                  | 0.1988
2          | 3               | 24                                  | 0.1356
3          | 4               | 20                                  | 0.1278
4          | 5               | 16                                  | 0.0322
5          | 6               | 16                                  | 0.0968
6          | 6.5             | 25                                  | 0.19
7          | 7               | 12                                  | 0.0021
8          | 7.5             | 11                                  | 0.0024
9          | 8               | 17                                  | 0.073
10         | 9               | 20                                  | 0.073
The learning process undertaken by the RBF network consists of supervised adjustment of the nonlinear parameters, i.e. the centers and the shape of the receptive fields, as well as the linear parameters, i.e. the weights of the output layer. A total of 900 data samples were used for training and 300 were kept for testing the network. Initially, training was performed without any subset selection, i.e. with a simple RBF network, and the MSE was found to be very high. With forward selection and a small regularization parameter λ = 5 × 10^-6, a much smaller mean squared error of 0.0021 was obtained, with only 12 basis functions selected.

The next step in the training process is the computation of the Gaussian function width (radius). The radius reflects the size of an RBF unit and thus directly affects the response of the network to an input. The effect of the radius on RBF network performance was investigated in the present work. With small radii, the network responds weakly to all samples because the hidden neurons are so narrow that most samples lie far from the centers, while with relatively larger radii the network gives strong responses to some of the samples in order to differentiate them. There are various heuristics for calculating the radius of an RBF; in the present work, the MSE is used as the criterion for investigating the effect of the radius of the hidden neurons. The MSE does not decrease or increase monotonically with increasing radius; rather, there is a value or range of radii at which the MSE is minimum [17]. It can be seen from Table 1 that the mean squared error on the test data fluctuates in the radius range of 2–6.5, whereas in the range of 7–9 it falls near its minimum. From this analysis, 12 radial basis functions of radius 7, with the regularization parameter equal to 5 × 10^-6, are selected in this application. With this network configuration, the optimal weights are obtained by minimizing the performance function given by Eq. (15).

5. Results and discussion

The FEGT is predicted for the 300 test data using the MLP and RBF type neural networks implemented in MATLAB. The test data are first normalized between -0.9 and +0.9 and later denormalized back to engineering units for comparison with the actual FEGT. In the back-propagation case, a neural network with nine inputs, one hidden layer of five neurons and one output neuron is employed for predicting FEGT. The tan-sigmoid activation function is selected for the hidden neurons and a linear function for the output neuron. Back-propagation training using the Levenberg–Marquardt algorithm with a learning rate of 0.01 is used to train the network. The MLP network predicted value and the actual value of FEGT are plotted for the set of 300 data in Fig. 9.
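The test-set comparison reported below rests on two metrics, the MSE on normalized values and the correlation coefficient between actual and predicted FEGT; a minimal sketch of their computation, with a function name of our own, is:

```python
import numpy as np

def test_metrics(fegt_actual, fegt_pred):
    """MSE (on normalized values) and correlation coefficient, Section 5.1."""
    err = np.asarray(fegt_actual) - np.asarray(fegt_pred)
    mse = float(np.mean(err ** 2))                         # e.g. 0.0030 / 0.0021
    r = float(np.corrcoef(fegt_actual, fegt_pred)[0, 1])   # e.g. 0.982 / 0.985
    return mse, r
```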
Fig. 9. Predicted FEGT using back-propagation algorithm.

Fig. 10. Predicted FEGT using radial basis function network.
Similarly, in the case of the RBF network, the number of basis functions selected by forward selection supervised learning is 12. The
radius of the basis functions and the regularization parameter are important for RBF performance and were selected experimentally as 7 and 5 × 10^-6, respectively. The RBF network predicted value and the actual value of FEGT are plotted for the set of 300 data in Fig. 10.

5.1. Comparison of the two networks
Fig. 11. Correlation of actual and predicted FEGT using MLP network.
Fig. 12. Correlation of actual and predicted FEGT using RBF.
As MLP and RBF type neural networks initialize and train in different ways, a direct comparison is not straightforward. However, they can be compared on ease of model building, prediction accuracy and network simplicity. The major parameters that influence the training and prediction performance of MLP networks include the training time and the number of neurons in the hidden layer; the choice of these two parameters is crucial and complex in MLP model building. Methods such as cross-validation have been used to monitor and reduce the training time by examining the mean squared error of prediction (MSEP), and models with minimum MSEP are considered adequate. Also, because of the random initialization of weights and biases, MLP networks will not necessarily yield the same result when training is repeated, and it is very difficult to conclude which configuration is better. In contrast, initialization of weights in the RBF network with forward selection by minimizing the MSEP always results in the same performance for a particular set of parameters. Model building in the RBF network is therefore easier than in the MLP network in this case.

The prediction accuracy of the two feed forward networks is compared by computing the mean squared error (MSE) on the test data set. The predicted FEGT using the MLP network and the RBF network, shown in Figs. 9 and 10, results in MSEs of 0.0030 and 0.0021, respectively. The correlation coefficient is also calculated for both cases and found to be 0.982 and 0.985 for the MLP and RBF networks, respectively, as shown in Figs. 11 and 12. This indicates that the RBF network is comparatively better at approximating the actual FEGT. The performance of the RBF and MLP networks was also compared by plotting FEGT prediction errors against counts: the bar charts of Figs. 13 and 14 indicate that the FEGT prediction error is within -7 to +7 for almost all counts in the case of the RBF network, whereas for the MLP network it is wider, in the range of -10 to +10. In terms of network simplicity, the number of processing units in the RBF network is more than twice that of the MLP network.
Fig. 13. FEGT prediction error using MLP network with back-propagation.
Fig. 14. FEGT error using radial basis function network.
However, the RBF network trains much more rapidly than the MLP network. A proper selection of the spread parameter is crucial to generating a good global model for the type of data used.

6. Conclusions

This study presents a novel neural network based approach to estimating FEGT for operator information. The major emphasis is on how to make the FEGT estimate more reliable and useful to the operator. Various important parameters, which are directly or indirectly related to FEGT, have been logically selected as input variables based on operator experience and process knowledge to ensure reliable estimation. Prediction accuracy and fast response are the major concerns in using a neural network to estimate FEGT for useful operator information; therefore, two types of feed forward neural modeling networks, the radial basis function network and the back-propagation network, were applied and compared based on their network simplicity, model building and prediction accuracy.

It has been shown that, for this application, RBF networks train very rapidly (about 10 times faster than MLP networks using the back-propagation algorithm). A proper selection of the spread parameter is crucial to the generation of a good global model for the type of data used in this study, but both the spread parameter and the number of neurons in the hidden layer were easily optimized by the forward selection method. In general, the FEGT predictions of the MLP and RBF networks are very similar, i.e. both networks converged to the same performance in terms of prediction output, provided care was taken to ensure that the network was optimized.

The inferences and conclusions drawn from this study will contribute to the development of useful soft-sensors for FEGT. This online measurement of FEGT can be directly linked with existing distributed digital control and information systems, for control and information in existing power plants. The FEGT available by this means will also be useful for various boiler optimization software packages, viz. boiler plant optimization systems, intelligent soot blowing systems and heat rate improvement studies.
References

[1] British Electricity International. Modern power station practice – boiler and ancillary plant. Oxford: Pergamon Press; 1991. p. 1–75.
[2] Basu P, Cen K, Jestin L. Boilers and burners. New York: Springer-Verlag; 2000.
[3] Babcock & Wilcox, R&D Division. Full scale demonstration of low-NOx cell burner retrofit.
[4] Babcock & Wilcox Co. Steam – its generation and use. 37th ed.; 1995. p. 9-17–27.
[5] Lawn CJ. Principles of combustion engineering for boilers. Academic Press; 1987. p. 9–15 and 260–2.
[6] Davidson I, Carter HR. A fully intelligent sootblowing system. In: International conference on thermal power generation: best practices and future technologies, part I. NTPC and USAID, session II, New Delhi, India; 2003. p. 17–26.
[7] Nasal RJ, Richard RD, Deaver R. Expert system support of a heat transfer model to optimize soot blowing – a case study at Delmarva's Edge Moor unit #5. In: Proceedings of the heat rate improvement conference, EPRI TR-106529; May 1996. p. 23-1–14.
[8] Haykin S. Neural networks: a comprehensive foundation. 2nd ed. Prentice-Hall; 1999.
[9] Jang JSR, Sun CT, Mizutani E. Neuro-fuzzy and soft computing. Pearson Education; 2004.
[10] Orr MJ. Regularization in the selection of RBF centres. Neural Comput 1995;7(3):606–23.
[11] Yin C, Luo Z, Ni M, Cen K. Predicting coal ash fusion temperature with a back-propagation neural network model. Fuel 1998;77(15):1777–82.
[12] Demuth H, Beale M. Neural network toolbox for use with MATLAB: user's guide. The MathWorks Inc.; 2002.
[13] Nasr GE, Badr EA, Joun C. Backpropagation neural networks for modeling gasoline consumption. Energy Convers Manage 2003;44:893–905.
[14] Flynn D, McLoone S, Irwin GW, Brown MD, Swidenbank E, Hogg BW. Neural control of turbogenerator systems. Automatica 1997;33(11):1961–73.
[15] Amoudi A, Zhang L. Application of radial basis function networks for solar-array modelling and maximum power-point prediction. IEE Proc Gener Transm Distrib 2000;147(5).
[16] Chang F-J, Liang J-M, Chen Y-C. Flood forecasting using radial basis function neural networks. IEEE Trans Syst Man Cybern C 2001;31(4):530–5.
[17] Zhang Z, Wang D, Harrington P de B, Voorhees KJ, Rees J. Forward selection radial basis function network applied to bacterial classification based on MALDI-TOF-MS. Talanta 2004;63(3):527–32.
[18] Prieto MM, Montañés E, Menéndez O. Power plant condenser performance forecasting using a non-fully connected artificial neural network. Energy 2001;26:65–79.