The use of Artificial Neural Network models for CO2 capture plants


Applied Energy 88 (2011) 2368–2376 Contents lists available at ScienceDirect Applied Energy journal homepage: www.elsevier.com/locate/apenergy The ...



Nikolett Sipöcz a,⇑, Finn Andrew Tobiesen b, Mohsen Assadi a

a Department of Mechanical & Structural Engineering & Material Science, University of Stavanger, N-4036 Stavanger, Norway
b SINTEF Materials and Chemistry, Department of Process Technology, N-7465 Trondheim, Norway

Article info

Article history: Received 5 January 2010; received in revised form 4 January 2011; accepted 6 January 2011

Keywords: ANN; CO2 capture; Chemical absorption; Levenberg–Marquardt; Scaled Conjugate Gradient

Abstract

Artificial Neural Networks (ANN) are multifaceted tools that can be used to model and predict various complex and highly non-linear processes. This paper presents the development and validation of an ANN model of a CO2 capture plant. The usefulness of the ANN model is evaluated, and its feasibility for further integration into a conventional heat and mass balance programme is discussed. It is shown that the trained ANN model can reproduce the results of a rigorous process simulator in a fraction of the simulation time. A multilayer feed-forward ANN was used to capture and model the non-linear relationship between the inputs and outputs of the CO2 capture process. The data used for training and validation of the ANN were obtained using the process simulator CO2SIM. The ANN model was trained on data from fully automatic batch simulations performed with CO2SIM over the entire operating range of an amine-based absorption plant. The trained model was then used to find the optimum operation of the example plant with respect to the lowest possible specific steam duty and the maximum CO2 capture rate. Two different algorithms were used and compared for training the ANN, and a sensitivity analysis was carried out to find the minimum number of input parameters needed while maintaining sufficient model accuracy. The reproduction error is less than 0.2% for the closed-loop absorber/desorber plant. The results of this study show that trained ANN models are very useful for fast simulation of complex steady-state processes, with high reproducibility of the rigorous model.

© 2011 Elsevier Ltd. All rights reserved.

⇑ Corresponding author. E-mail address: [email protected] (N. Sipöcz).
0306-2619/$ - see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.apenergy.2011.01.013

1. Introduction

Increased competitiveness due to deregulation of the electricity market, together with more stringent environmental protection laws, has driven technology development in the power sector over the last two decades. This development has also led to a focus on CO2 abatement. It is of great importance to find an optimal technology for capturing CO2 and to integrate and optimize the capture technology with the power plant. Today most researchers believe that sequestration of CO2 from fossil fuels is necessary, so there is great emphasis on viable CO2 capture technologies for a future low-carbon economy. Capture of CO2 from flue gases using chemical solvents, such as amines, is a proven technology, although not yet at the scales required for full-scale capture. Nevertheless, it is currently the most viable option for CO2 removal from exhaust gases; two commercialised processes exist and others are under development [1,2]. However, the high energy requirement for solvent regeneration will reduce the effectiveness of the power plant by about 10% for a traditional MEA amine solvent, as well as increase power plant operational costs [3–7].

These effects have been studied in various techno-economic evaluations using commercially available simulation tools such as Aspen Plus, Hysys and gPROMS. Although many commercial simulators offer advanced features such as customizing component models for the application at hand, the possibility of carrying out sophisticated process simulations by means of flexible user-defined component models is limited. This restriction is mainly due to the limited access to the underlying sub-models, which gives rise to uncertainty about the underlying theory and the assumptions made. The simulators thus act as “black boxes” without revealing the theory they are based upon [8]. Nevertheless, it is important that simulation tools used for research purposes are flexible enough to allow specific component characteristics to be incorporated, so that evaluation studies for design and optimization can be performed.

Previous work from our research group includes the development of power plant component models for different power plant applications such as analysis, monitoring and optimization at both design and off-design conditions [9–13]. These models, which rely on the laws of chemistry, fluid mechanics and thermodynamics, both require and provide extensive knowledge of the underlying physics of the process. However, processes such as those in gas turbines and the absorption and desorption processes in CO2 removal plants involve non-ideal characteristics, and accurate modeling of these systems is necessary in order to obtain sufficient detail about the particular system, such as column heights. These systems are solved by iterative procedures which, for purposes such as operating point analysis and optimization studies, are very complex and computationally very time consuming. For this reason, Artificial Neural Networks (ANN) have been considered as a valuable alternative modeling approach that replicates the rigorous model while retaining the same level of detail.

In this work an ANN model of a CO2 capture plant based on a chemical absorption process utilizing monoethanolamine (MEA) has been developed for a feasibility study. The data for the ANN model have been generated by the process simulation tool CO2SIM [8], which models CO2 absorption processes using rate-based modeling [8,14,15]. The objective of this article is to present the application of the developed ANN to a chemical absorption capture system, by capturing and finding relationships among the inputs and outputs represented by the data sets of the absorption and desorption cycles. In a follow-up study the developed ANN model representing the CO2 capture plant will be implemented in a simulator developed specifically for power plant modeling. In this way, the simulator best suited to each section of the power plant will be used, giving the best predictive power for a techno-economic study of power plants with CO2 removal.

2. Brief description of Artificial Neural Networks

ANN technology is a non-parametric statistical modeling tool and does not require any pre-assumption about the input–output relationship. It has also been shown that an ANN can approximate any non-linear system with high interpolation capacity [16]. ANNs are especially suited for high-dimensional modeling since the number of free parameters increases linearly with the dimension, compared to e.g. polynomial fitting, where the number of free parameters increases exponentially [17], which makes the latter unsuitable for this kind of function approximation. Several studies have presented ANNs as a very feasible option for modeling power plant processes at component level [11,13,18] as well as at system level [13,19,20].

Artificial Neural Networks are mathematical models built up of a number of parallel, interconnected units called neurons, in which incoming data are distributed and processed. ANNs are characterized by not putting any constraint on the modelled system, i.e. both linear and non-linear systems can be modelled, and they do not require any pre-assumption about the deterministic relationships between inputs and outputs. The ability to approximate non-linear systems effectively is due to the presence of one or more hidden layers and non-linear transfer functions in the neurons of the hidden layer. Their main characteristic in regression modeling is the ability to learn the relationship among selected inputs and outputs, given at least one hidden layer and a sufficient number of neurons with non-linear transfer functions [16]. Because ANNs are adaptive, external data must be presented to the network in order for it to recognize the process at hand. This means that ANNs must be trained, a procedure during which the experience is stored in the so-called network weights according to a learning algorithm.

The learning algorithm is based on a cost or error function, which is a measure of the deviation between the calculated output and the desired output. This cost function is normally the mean square error (MSE), which is minimized in order to find the optimal solution. The MSE has two advantages: it is the most appropriate cost function to combine with the learning algorithm, and it provides fast learning. The MSE is given by:

$$\mathrm{MSE} = \frac{1}{n_{op} \cdot n_{dp}} \sum_{j=1}^{n_{op}} \sum_{i=1}^{n_{dp}} \left( t_{ij} - y_{ij} \right)^2 \qquad (1)$$

where $n_{op}$ is the number of output parameters, $n_{dp}$ is the number of data patterns, $t_{ij}$ are the target values and $y_{ij}$ are the network outputs.

In this work the network architecture called multilayer feed-forward network, also known as multilayer perceptron (MLP), has been utilized. The MLP configuration has been used extensively for static regression applications [10–13,19] and consists of one input layer, one or more hidden layers and one output layer. However, Cybenko [16] proved that a network with one hidden layer is sufficient to represent any type of multidimensional non-linear function, given a sufficient number of hidden neurons.

During training the network parameters (weights and biases) are adjusted in order to map a given set of input patterns to the desired outputs. The adjustment of the weights and biases is determined by a training algorithm. The back-propagation algorithm, first invented in 1974 by Werbos [21] but rediscovered and popularized in 1986 by Rumelhart et al. [22], is the most commonly used training algorithm. Its main function is to compute the steepest-descent gradient of the cost function with respect to each weight and to update the weights accordingly:

$$\Delta w_{ij} = -\eta \, \frac{\partial E}{\partial w_{ij}} \qquad (2)$$

The term $w_{ij}$ denotes the weight from node $i$ to node $j$, $\eta$ is a constant greater than zero known as the learning rate, and $\partial E / \partial w_{ij}$ is the derivative of the error $E$ with respect to the weight $w_{ij}$. The network weights are initialized with random numbers close to zero and the training algorithm modifies the weights according to the procedure presented above. However, the weight adjustment is an optimization problem for which different algorithms are available, e.g. the conjugate-gradient method, momentum learning, etc. The reason for using alternative optimization techniques is to avoid becoming trapped in local minima or saturation points: an MLP network with one hidden layer, and thus two layers of weights, combined with non-linear transfer functions has a performance surface on which convergence to a global minimum cannot be guaranteed.
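The cost function of Eq. (1) and the update rule of Eq. (2) can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the one-weight linear model and the toy data are assumptions chosen only to keep the example short.

```python
from typing import List


def mse(targets: List[List[float]], outputs: List[List[float]]) -> float:
    """Mean square error of Eq. (1); rows are data patterns, columns are outputs."""
    n_dp = len(targets)       # number of data patterns
    n_op = len(targets[0])    # number of output parameters
    total = 0.0
    for t_row, y_row in zip(targets, outputs):
        for t, y in zip(t_row, y_row):
            total += (t - y) ** 2
    return total / (n_op * n_dp)


def gradient_step(weights: List[float], grads: List[float], eta: float) -> List[float]:
    """Steepest-descent update of Eq. (2): w <- w - eta * dE/dw."""
    return [w - eta * g for w, g in zip(weights, grads)]


# Hypothetical toy problem: fit y = w * x to targets t = 2 * x.
xs = [0.0, 1.0, 2.0, 3.0]
ts = [[2.0 * x] for x in xs]
w = [0.0]
for _ in range(200):
    ys = [[w[0] * x] for x in xs]
    # Analytical derivative of the MSE with respect to the single weight.
    grad = -2.0 / len(xs) * sum((t[0] - y[0]) * x for t, y, x in zip(ts, ys, xs))
    w = gradient_step(w, [grad], eta=0.05)

final_error = mse(ts, [[w[0] * x] for x in xs])
```

In a real MLP the derivative would be obtained by back-propagation through the hidden layer rather than from a closed-form expression as in this toy case.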

3. Data generation

3.1. Description of a CO2 absorption capture plant

The CO2 removal system used in this work is a conventional amine absorption process, as illustrated in Fig. 1. The process consists of two main units, the absorber and desorber columns, which are both of the packed-bed type. The flue gas from the power plant flows counter-currently to the lean MEA solution through the absorber. The absorbent, MEA, reacts chemically with the CO2 in the flue gas. The treated gas stream, with its lower CO2 content, is further treated in a water wash section at the top of the absorber before it is released to the atmosphere, in order to recover water and MEA and thus minimize the loss of solvent. The rich MEA solution at the bottom is pumped via a heat exchanger, in which heat from the lean solvent is recovered, to the top of the desorber, where the solvent is regenerated by heat supplied to the stripper through the reboiler using low-pressure steam. The regenerated solvent is returned via the lean/rich heat exchanger to the absorber column. The product gas at the top of the desorber is cooled in a condenser and the water is fed back to the desorber, while the lean CO2 gas leaving the condenser is sent for additional conditioning before transport.


Fig. 1. Flow sheet of the simulated process.

3.2. Simulation of synthetic data and definition of CO2 capture plant

The CO2SIM simulator is a rate-based model that includes detailed underlying mass transfer models and thermodynamics. Owing to the nature of the rate-based framework, scaling up the models from laboratory-scale systems to industrial-scale units should in theory not require any additional parameter fitting. The hydraulic parameters are recalculated internally for the new, larger systems and adjusted accordingly. The process simulator is described in detail in Tobiesen et al. [8,14,15]. The flowsheet shown in Fig. 1 was built and simulated in CO2SIM with the process conditions given in Table 1. The columns were simulated using hydraulic correlations representing Sulzer Mellapak 250.

Table 1. Base case conditions for the CO2SIM CO2 capture model.

| Parameter | Value | Unit |
|---|---|---|
| Absorber/desorber height | 23.9/22.3 | m |
| Absorber/desorber diameter | 11.3/5.5 | m |
| Absorber/desorber pressure | 110/190 | kPa |
| Solvent | MEA | 30 wt% |
| Lean solvent cooler temperature | 37 | °C |
| Lean/rich HTX temperature approach (a) | 10 | °C |
| Reboiler temperature (b) | 118–122 | °C |

(a) The size of the heat exchanger “Rich/Lean HeatX” is given by specifying the temperature approach on the low temperature side. This is set to 10 °C.
(b) The reboiler duty was varied to reach a CO2 removal efficiency of 92% CO2 at all simulated conditions.

In the development of a good ANN model, the selection of the input and output data used for training and validation of the model is important. The choice has to be based on the objective of the model as well as on the physics of the process. Thus, experience and understanding of the underlying relations between parameters are crucial for capturing the process behavior. The absorption process has been simplified to emphasize the energy required for capturing CO2. The water wash sections required in the CO2 absorber and desorber were therefore not included in the simulations, as the effect of these units on the overall reboiler duty is negligible (see Fig. 1). The absorption and desorption processes are independent of this part, and its exclusion also made the generation of data considerably easier. For the same reason, the ANN treats the process as a “black-box” model, monitoring only the essential input and output variables. The boundary conditions for the CO2 capture process modelled in CO2SIM are, as noted, presented in Table 1.

A total of 4282 data patterns were simulated by varying five variables: inlet flue gas flow, inlet flue gas temperature, CO2 concentration in the flue gas, reboiler duty and solvent circulation rate. The five variables were selected on the basis of the specific application, i.e. to integrate the final ANN model into IPSEpro for detailed performance and optimization studies of power plants with CO2 capture, based on optimization of the CO2 removal plant with respect to minimizing the reboiler steam requirement. In this way, the ANN CO2SIM model can be incorporated into a turbine modeling simulator for real-time use. Hence, some parameters of the CO2 capture process were considered necessary for the ANN model in order to obtain compatibility with the high-level programming language used in IPSEpro. The variation interval for each of the five parameters was decided based on typical operating conditions for a standard 400 MW natural gas-fuelled power plant. The reboiler duty was varied to reach a CO2 removal efficiency of 92% CO2 at all simulated conditions. The five variables and their ranges of variation are presented in Table 2.

Table 2. Parameters and their range of variation.

| Variable | Interval | Unit |
|---|---|---|
| Mass% CO2 in inlet flue gas | 4.0–10.65 | % |
| Temperature of inlet flue gas | 46–50 | °C |
| Mass flow inlet flue gas | 585–630 | kg/s |
| Reboiler duty | 105–343 | MW |
| Solvent circulation rate | 573–665 | kg/s |

4. Development of the ANN model

4.1. Selection of input and output parameters

Before training of the network is initiated, the input and output parameters have to be selected. As mentioned, special attention was paid to including not only parameters important for reflecting the behavior of the actual CO2 capture process, but also those necessary for inclusion in the power plant simulator, in this case IPSEpro. The flue gas mass flow, flue gas temperature, CO2 removal efficiency and amount of CO2 in the flue gas correspond to the definition of the power plant. The pressure of the flue gas entering the absorber was not taken into consideration in this definition, since the blower downstream of the heat recovery steam generator is assumed to keep the pressure constant. The CO2 removal efficiency is defined as:

$$\eta_{\mathrm{CO_2\ removal}} = 1 - \frac{\dot{m}_{\mathrm{CO_2\ captured\ at\ stripper\ outlet}}}{\dot{m}_{\mathrm{CO_2\ in\ flue\ gas\ at\ absorber\ inlet}}} \qquad (3)$$

The term $\eta$ is the efficiency and $\dot{m}$ is the mass flow rate in kg/s. In addition, two other parameters were selected as inputs, namely lean load and amine circulation rate, since these properties influence the amount of CO2 captured. The lean load describes the mole ratio of CO2 in the lean amine solution entering the absorber. In the simulations described in Section 3.2, the lean load varies between 0.08 and 0.28. A higher value of the lean loading implies the possibility of reducing the specific reboiler duty by increasing either the solvent circulation rate or the rich loading [8]. The solvent circulation rate is primarily determined by the partial pressure of CO2 in the feed stream. Sufficient solvent must be available in order to absorb most of the CO2 in the feed gas and to provide a driving force for absorption at the bottom of the column. In the simulations the solvent circulation rate is within the range 573–665 kg/s.

Furthermore, specific duty and solvent rich load were selected as the desired outputs to be predicted by the model. The rich load is defined as the mole ratio of CO2 in the liquid solvent at the absorber bottom, and the specific duty is determined according to Eq. (4):

$$E_{\mathrm{specific}} = \frac{E_{\mathrm{reboiler}}}{\dot{m}_{\mathrm{CO_2\ captured\ at\ stripper\ outlet}}} \qquad (4)$$

where $E_{\mathrm{reboiler}}$ denotes the duty of the reboiler in kJ/s and $\dot{m}$ the mass flow rate of CO2 at the top of the stripper in kg/s. The rich load varies between 0.27 and 0.50 and the specific duty between 3.37 and 12.40 GJ/tonCO2. An overview of the selected inlet and outlet parameters is presented in Table 3. However, as an ANN is a data-driven model in which no physical relations between the inputs and the desired outputs are implemented, the sensitivity of the ANN prediction to each input parameter has been examined in order to determine the number of input parameters needed to represent the CO2 removal process with sufficient prediction accuracy (Sections 5.3–5.4). The result of the sensitivity test was intended to be indicative only, since some of the input parameters cannot be removed due to the special application of the model.

4.2. Network structure and training

The Artificial Neural Network used in this study is a fully connected feed-forward network with one hidden layer and a back-propagation learning algorithm, as schematically illustrated in Fig. 2. The number of neurons in the hidden layer is determined by the complexity of the interrelationship between the input and output parameters and is optimized by trial and error. The ANN was trained with the commercially available program NeuroSolutions. The data produced by the simulator were randomly divided into three sets (training data, cross-validation data and test data) in order to ensure that the ANN's performance is accurate. Of the entire data set, 60% was used for training, 15% for cross-validation and the remaining 25% for the test set. The training data set is used to optimize the weight values according to the cost function with the learning algorithm, while the cross-validation set is used to validate the learning error. A decreasing error in the training set should be accompanied by a similar decrease in the cross-validation set to ensure generalization capability. Since both the training and cross-validation data sets are used by the network during training, the test data set is used to evaluate the generalization capability of the network by introducing unseen data. If a limited amount of data is used together with a high number of neurons in the hidden layer, i.e. many free parameters to adjust, the network is prone to learning the actual data patterns instead of the underlying mechanism and thereby generalizes poorly. Good network generalization is achieved, and the phenomenon called overtraining is avoided, by using cross-validation and a representative training set, and by making sure that the number of free parameters (number of weights) is much smaller than the number of training patterns. Training is terminated when the MSE in the cross-validation data set starts to increase.
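The random 60/15/25 division and the stopping rule described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the random seed and the rounding of the set sizes are hypothetical, and the paper does not describe how NeuroSolutions implements either step internally.

```python
import random
from typing import List, Sequence, Tuple


def split_patterns(n_patterns: int, seed: int = 0) -> Tuple[List[int], List[int], List[int]]:
    """Randomly split pattern indices 60/15/25 into training, cross-validation and test sets."""
    indices = list(range(n_patterns))
    random.Random(seed).shuffle(indices)
    n_train = int(0.60 * n_patterns)
    n_cv = int(0.15 * n_patterns)
    train = indices[:n_train]
    cv = indices[n_train:n_train + n_cv]
    test = indices[n_train + n_cv:]   # the remainder, roughly 25%
    return train, cv, test


def early_stopping_epoch(cv_errors: Sequence[float]) -> int:
    """Return the last epoch before the cross-validation MSE starts to increase."""
    for epoch in range(1, len(cv_errors)):
        if cv_errors[epoch] > cv_errors[epoch - 1]:
            return epoch - 1
    return len(cv_errors) - 1


# The 4282 simulated data patterns mentioned in Section 3.2.
train, cv, test = split_patterns(4282)
```

Stopping at the first increase mirrors the rule stated in the text; practical implementations often tolerate a few non-improving epochs before stopping, which is a design choice not discussed in the paper.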
To find the desirable number of hidden neurons, a parameter variation is performed, after which the best network configuration is selected. The chosen ANN model should not only have a relatively low prediction error; this error should also be similar for the three data sets, i.e. the training, cross-validation and test sets. The training is repeated twice with the same randomized data, with small changes in the initial weight values, to minimize the probability of convergence to a local minimum. As mentioned earlier, the performance surface is multimodal and multiple solutions exist. In order to overcome the main shortcomings of the standard back-propagation (gradient descent) method, such as slow convergence and the possibility of falling into local minima, the standard back-propagation training algorithm has in this work been combined with two second-order algorithms, the Scaled Conjugate Gradient (SCG) algorithm and the Levenberg–Marquardt (LM) algorithm. The two training algorithms have been compared based on prediction accuracy and the best network has been selected. To ensure that all input parameters are non-redundant, the dependence of the outputs on each input has been evaluated.

Since the input and output variables of the ANN differ significantly in value, the data must be rescaled (normalized) to avoid convergence problems and to fit the transfer function used in the neurons, in this case the hyperbolic tangent function. The hyperbolic tangent belongs to the sigmoid category of functions and ranges between −1 and 1 in an S-shaped manner. The data are scaled between −0.8 and 0.8 to include both the linear and non-linear behavior of the transfer function while at the same time allowing for extrapolation capability. The network was trained in batch mode, meaning that any changes to the weights and biases take place after all training patterns have been presented once.

Table 3. Inlet and outlet parameters to the ANN model.

| Inputs | Range of variation | Outputs | Range of variation |
|---|---|---|---|
| Temperature, inlet flue gas | 46–60 °C | Mass flow CO2 captured | 20–56 kg/s |
| Mass flow, inlet flue gas | 585–630 kg/s | Rich load | 0.27–0.50 |
| Mass frac. CO2, inlet flue gas | 4.0–10.65% | Specific duty | 3.37–12.40 GJ/tonCO2 |
| Solvent lean load | 0.08–0.28 | | |
| Solvent circulation rate | 573–665 kg/s | | |
| Removal efficiency | 60–92% | | |

Fig. 2. Initial model for the ANN for process identification of the CO2 capture plant.

4.2.1. Scaled Conjugate Gradient algorithm

The Scaled Conjugate Gradient method is a variant of the conventional Conjugate Gradient (CG) algorithm, introduced by Møller [23] to avoid the complicated line search procedure associated with the CG algorithm. In conjugate gradient algorithms a search is performed along conjugate directions, which generally produces faster convergence than searching along steepest descent directions. Standard gradient descent algorithms use the local approximation of the slope of the performance surface (error versus weights) to determine the best direction in which to move the weights in order to lower the error. In second-order methods like the CG algorithm, the weight update is determined by an approximation of the second derivatives of the performance surface, which captures the curvature and not only the slope of the performance surface. However, to determine the step size along the conjugate gradient direction, a search has to be performed to find the optimal distance to move along the current search direction. The next search direction is then determined so that it is conjugate to the previous search direction. The general procedure for determining the new search direction is to combine the new steepest descent direction with the previous search direction, as governed by the following equations.

$$w_{k+1} = w_k + \alpha_k p_k \qquad (5)$$

$$p_0 = -g_0 \qquad (6)$$

$$p_{k+1} = -g_{k+1} + \beta_k p_k \qquad (7)$$

The term $w$ denotes the weights, $p$ is the current direction of weight movement, $g$ is the gradient, and $\beta$ is the parameter determining how much of the past direction is combined with the gradient to form the new conjugate direction. $\alpha$ is the line search term, which is calculated for each iteration $k$ to find the minimum MSE along the direction $p$. The calculation of $\alpha$ is the main drawback of the CG algorithm, as it usually requires a lot of computational time. In the SCG method the estimation of $\alpha$ is avoided and instead the Hessian matrix is approximated by

$$E''(w_k)\, p_k = \frac{E'(w_k + \sigma_k p_k) - E'(w_k)}{\sigma_k} + \lambda_k p_k \qquad (8)$$

where $E'$ and $E''$ are the first and second derivatives of the global error function $E$. $\sigma_k$ is the parameter controlling the second-derivative approximation and $\lambda_k$ is the parameter regulating the indefiniteness of the Hessian. To ensure a good quadratic approximation of $E$, a mechanism to raise and lower $\lambda_k$ is needed when the Hessian is positive definite, which means that the global error is still decreasing. A detailed description of this regulation and of the SCG algorithm can be found in Møller [23].

4.2.2. The Levenberg–Marquardt algorithm

The LM algorithm belongs to the group of learning algorithms called pseudo-second-order methods and is a compromise between the steepest descent algorithm and the Gauss–Newton method. The LM algorithm was designed for minimizing functions that are sums of squares of other non-linear functions without having to compute the Hessian matrix [24]. Since the performance function for feed-forward networks is normally the MSE, which is based on the L2 norm, the Hessian can be approximated as

$$H \approx J^T J \qquad (9)$$

and the gradient can be computed as

$$g = J^T e \qquad (10)$$

where the Jacobian matrix $J$ contains the first derivatives of the network errors with respect to the weights and biases, and $e$ is a vector of network errors. The Jacobian matrix is computed by a variation of the back-propagation technique that is much less complex than computing the Hessian matrix. In the LM method the weights are updated using the Gauss–Newton approximation defined as

$$W_{i+1} = W_i - \left( J_i^T J_i \right)^{-1} J_i^T e_i \qquad (11)$$

However, one main problem with the Gauss–Newton method is that the matrix $J^T J$ may not be invertible (positive definite). This is handled by adding a term to the Hessian approximation that keeps it positive definite:

$$H \approx J^T J + \mu I \qquad (12)$$

Thus the weight update becomes

$$W_{i+1} = W_i - \left( J_i^T J_i + \mu I \right)^{-1} J_i^T e_i \qquad (13)$$

One of the main features of the LM algorithm is that as $\mu$ is increased the above expression approximates the steepest descent algorithm with a small learning rate, while as $\mu$ is decreased to zero the algorithm approximates Gauss–Newton. The adjustment of $\mu$ is performed similarly to the modification of the adaptive learning rate in the back-propagation algorithm. If a step does not result in a smaller error than the previous weight vector, $\mu$ is multiplied by a constant $\vartheta > 1$ to approach the gradient descent algorithm and thus obtain more stability. If a step results in a smaller error, $\mu$ is divided by $\vartheta$ for the next step, which moves the algorithm towards the Gauss–Newton algorithm and thereby provides faster convergence. The stopping criterion for the LM algorithm is similar to that of the back-propagation method.

The main drawback of the LM algorithm is its large memory requirement, since the optimization involves storing the approximated Hessian matrix $H \approx J^T J$, which is an $N \times N$ matrix, where $N$ is the number of parameters (weights and biases) in the network. The algorithm could therefore be impractical if the number of parameters is very large. However, the LM approach has been shown to be the fastest higher-order adaptive algorithm for minimizing the MSE of a neural network with a moderate number of parameters.
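The damped update of Eq. (13) and the adjustment of $\mu$ can be illustrated with a minimal sketch. It is not the NeuroSolutions implementation: the function `lm_fit`, its default values for `mu` and `theta`, and the restriction to two parameters (so that the damped system can be solved by Cramer's rule without a linear algebra library) are assumptions made for this example, and the model fitted below is a hypothetical straight line rather than a neural network.

```python
from typing import Callable, List, Tuple

Vec = List[float]


def lm_fit(residuals: Callable[[Vec], Vec],
           jacobian: Callable[[Vec], List[Vec]],
           w: Vec, mu: float = 0.01, theta: float = 10.0,
           n_iter: int = 50) -> Tuple[Vec, float]:
    """Levenberg-Marquardt iteration for a two-parameter least-squares problem."""
    e = residuals(w)
    sse = sum(r * r for r in e)
    for _ in range(n_iter):
        J = jacobian(w)
        n = len(e)
        # Gradient g = J^T e (Eq. 10) and Hessian approximation J^T J (Eq. 9).
        g = [sum(J[k][p] * e[k] for k in range(n)) for p in range(2)]
        h = [[sum(J[k][p] * J[k][q] for k in range(n)) for q in range(2)]
             for p in range(2)]
        # Solve (J^T J + mu I) step = g by Cramer's rule (Eq. 12).
        a, b, c, d = h[0][0] + mu, h[0][1], h[1][0], h[1][1] + mu
        det = a * d - b * c
        step = [(d * g[0] - b * g[1]) / det, (a * g[1] - c * g[0]) / det]
        w_new = [w[0] - step[0], w[1] - step[1]]   # weight update of Eq. (13)
        e_new = residuals(w_new)
        sse_new = sum(r * r for r in e_new)
        if sse_new < sse:
            w, e, sse = w_new, e_new, sse_new
            mu /= theta    # smaller error: move towards Gauss-Newton
        else:
            mu *= theta    # no improvement: move towards gradient descent
    return w, sse


# Hypothetical toy data generated from t = 2 * x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ts = [1.0, 3.0, 5.0, 7.0]
fit_w, fit_sse = lm_fit(
    residuals=lambda w: [w[0] * x + w[1] - t for x, t in zip(xs, ts)],
    jacobian=lambda w: [[x, 1.0] for x in xs],
    w=[0.0, 0.0],
)
```

For a network with N weights and biases the 2 × 2 system above becomes an N × N system, which is the memory cost referred to in the text.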

5. Results and discussion

5.1. Comparison of Scaled Conjugate Gradient and Levenberg–Marquardt algorithms

Training with the SCG algorithm was carried out with 12–30 neurons and 40,000 epochs. The best solution converged at 18 neurons and 40,000 epochs. Training with additional epochs did not improve the results. Training with the LM algorithm was carried out with the same variation of neurons as with the SCG algorithm, but the number of epochs was limited to 5000. Training with a higher number of epochs was not performed because of limitations in computational memory as well as time. When using the LM algorithm there is no reason to continue training if the network error is decreasing very slowly. In these cases the final error would only be slightly lower, while the required training time would amount to several days. The best solution for the LM network was found at 30 neurons and 3422 epochs. The performance of the two networks is shown in Figs. 3 and 4, which present the error distribution for the prediction of the three outputs: rich load, amount of CO2 captured and specific duty. The figures clearly show that the network trained with the LM algorithm has better prediction accuracy for all three outputs. The difference is most notable for the prediction of the amount of CO2 captured, for which the error does not exceed 0.18% for the network trained with the LM algorithm, while several predictions of the same output are above 0.20%, with a maximum of 1.4%, for the SCG-trained network. Nevertheless, due to the poor extrapolation capability of empirical models, the results are only valid in the region of parameter space used for the model training.

Fig. 3. Distribution of the error of the outputs for the network trained with SCG algorithm.

5.2. Sensitivity analysis

A sensitivity analysis was performed to determine how strongly the output parameters depend on each assumed input parameter. If the prediction accuracy is not affected by removing an input parameter, that input is redundant and can be omitted. It is therefore of interest to find the minimum number of parameters required. If, on the other hand, the prediction accuracy decreases when an input parameter is removed, the dependence of the outputs on that input is confirmed. As previously stated, the purpose of the final ANN model limits the removal of certain parameters; this evaluation was therefore made to point out the influence of each input parameter on the outputs and to determine the number of input parameters actually needed to represent the modelled CO2 capture process with sufficient prediction accuracy. The sensitivity analysis was carried out by

N. Sipöcz et al. / Applied Energy 88 (2011) 2368–2376

Fig. 5. Average- and maximum error respectively for the predicted outputs with the ANN model with all inputs (purple) and with the ANN model with the inputs inlet flue gas temperature and solvent circulation rate removed (blue). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 4. Distribution of the error of the outputs for the network trained with LM algorithm.

removing each input parameter one at a time while keeping the remaining five. The ANN model was then retrained using the same data sets, including the same randomization, and the prediction accuracy of each of these models was compared with that of the original model. It was concluded, as shown in Table 4, that all inputs except inlet flue gas temperature have a major impact on the predicted specific duty, while for the prediction of rich load, in addition to inlet flue gas temperature, either removal efficiency or inlet flue gas mass flow is required. The predicted amount of CO2 captured showed sensitivity to all inputs except solvent circulation rate; however, the sensitivity to inlet flue gas temperature could be considered to be outside the range of significance. The removal of the two inputs inlet flue gas temperature and inlet flue gas mass flow showed less significance than expected. A possible explanation applicable to both of these inputs is that the

Table 4. Sensitivity of prediction of the outputs for each input.

| Input | Error (%) | Amount CO2 captured | Rich load | Spec. duty |
|---|---|---|---|---|
| All inputs | Average | 0.02 | 0.06 | 0.20 |
| | Maximum | 0.17 | 2.77 | 3.06 |
| w/o inlet flue gas temperature | Average | 0.04 | 0.08 | 0.20 |
| | Maximum | 0.31 | 2.51 | 3.15 |
| w/o inlet flue gas flow | Average | 0.20 | 0.09 | 0.25 |
| | Maximum | 3.49 | 2.82 | 3.75 |
| w/o CO2 mass fraction | Average | 0.73 | 0.45 | 0.89 |
| | Maximum | 44.04 | 29.79 | 29.48 |
| w/o lean load | Average | 0.05 | 0.12 | 0.66 |
| | Maximum | 0.27 | 5.10 | 49.42 |
| w/o solvent circulation rate | Average | 0.02 | 0.10 | 0.24 |
| | Maximum | 0.16 | 3.07 | 7.38 |
| w/o CO2 removal efficiency | Average | 0.09 | 0.07 | 0.18 |
| | Maximum | 4.07 | 2.27 | 4.00 |
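The leave-one-input-out procedure behind Table 4 can be sketched as follows. This is only an illustration under stated assumptions: a nearest-neighbour predictor stands in for the retrained ANN, the data are synthetic (the target depends on `x0` and `x1` but not on `x2`), and all names are hypothetical. The one feature carried over from the text is that every reduced model reuses the same data split as the full model.

```python
import random


def nearest_neighbour_predict(train_x, train_y, x):
    """Stand-in model (not an ANN): target of the closest training point."""
    best = min(range(len(train_x)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)))
    return train_y[best]


def mean_abs_error(train_x, train_y, test_x, test_y):
    """Mean absolute prediction error on the test patterns."""
    preds = [nearest_neighbour_predict(train_x, train_y, x) for x in test_x]
    return sum(abs(p - t) for p, t in zip(preds, test_y)) / len(test_y)


def drop_column(rows, index):
    """Remove one input from every pattern."""
    return [row[:index] + row[index + 1:] for row in rows]


def leave_one_input_out(names, train_x, train_y, test_x, test_y):
    """Error with all inputs, then with each input removed in turn."""
    results = {"all": mean_abs_error(train_x, train_y, test_x, test_y)}
    for i, name in enumerate(names):
        results[name] = mean_abs_error(drop_column(train_x, i), train_y,
                                       drop_column(test_x, i), test_y)
    return results


# Synthetic data with a fixed seed, so every variant sees the same split.
rng = random.Random(0)
names = ["x0", "x1", "x2"]
data_x = [[rng.random() for _ in names] for _ in range(300)]
data_y = [2.0 * x[0] + x[1] for x in data_x]  # x2 carries no information
train_x, test_x = data_x[:200], data_x[200:]
train_y, test_y = data_y[:200], data_y[200:]

errors = leave_one_input_out(names, train_x, train_y, test_x, test_y)
for key, value in errors.items():
    print(f"without {key}: {value:.3f}" if key != "all" else f"all inputs: {value:.3f}")
```

An input whose removal leaves the error close to the "all inputs" value is a candidate for omission, whereas a clear increase indicates that the outputs depend on it.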


information about them is carried by one or more of the other inputs. The second explanation, which applies only to the inlet flue gas temperature, is that the variation in this parameter among the 4280 data points is not sufficient for the ANN to learn the entire relationship between it and the output. To confirm the unexpected results obtained above when removing flue gas temperature and solvent circulation rate, a last model was trained with both of these inputs removed. The results, presented and compared with the original model in Figs. 5 and 6, show that the ANN model could not predict the outputs as accurately without these two parameters. The average error for all three predicted outputs increases by 100% or more. The increase in maximum error is most notable for the prediction of specific duty, for which it is almost twice as high for the network with the reduced number of inputs. The model with only inlet flue gas temperature removed could be considered to have as good a prediction accuracy as the original ANN model; nevertheless, because of the ultimate application, the original model was chosen as the final one.

Fig. 6. Error distribution for prediction of specific duty with the ANN model with all inputs (blue) and with ANN model with the inputs inlet flue gas temperature and solvent circulation rate removed (purple). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

5.3. Model validation

Since data on operating CO2 capture plants of the type and size considered in this work are not available at present, validation of the best ANN model was carried out with data provided by the simulator. A random selection of 1070 data patterns out of the 4280, in terms of input values (previously unseen by the model), was presented to the model, and the output prediction from the model was compared with the simulated data to appraise its performance. The result of the validation is shown in Fig. 7 for the prediction of the most significant output, specific reboiler duty. The numbers on the X-axis denote the data pattern number, where each point represents one set of input values.

6. Conclusions

Artificial Neural Networks are found to be useful tools for predicting complex processes such as CO2 capture processes, for which simulation of the closed process network yields a challenging solution pathway that is computationally demanding and difficult to handle with traditional methods. A trained ANN can replicate such complex processes without losing accuracy. This is demonstrated by the development of two ANN models of a CO2 capture plant for which high prediction accuracy was achieved. Data used for training and validation of the ANN models were generated using a rate-based process simulator for an amine-based post-combustion CO2 capture process. The average value of the errors for the prediction of specific reboiler duty is well below 0.2% and the maximum error does not exceed 3.1%. The predictions of solvent rich load and amount of CO2 captured are even better, with maximum errors below 2.8% and 0.17%, respectively.

In addition to their high congruence with the physical model, the ANN models are much faster and, once trained, easier to use: the computational time for the ANN model is less than 0.1% of that needed for the rigorous simulator, since the output from the ANN is not an iterative solution. These features make them suitable for implementation into process simulators for detailed simulation of larger processes, such as integration with power plant simulation or techno-economic assessments of power plants with CO2 removal. The trained ANN will thus replicate the rigorous simulator for CO2 capture, so the predictive power is maintained when simulations of the rest of the power plant are included. The strength of ANN models for pre-design studies is that, once the most suitable CO2 capture process has been established for a specific power plant, incorporating the ANN model into whole-plant studies such as optimization and evaluation of improved heat utilization is simplified by the fast response and simplicity of the ANN.
Furthermore, since the number of inputs for an ANN model could be reduced through sensitivity analysis without losing model accuracy, the number of sensors in future commercial plants could be minimized using these types of models. This knowledge could also be valuable for future diagnostic purposes related to CO2 capture plants of this kind. Lastly, the results of this study have clearly shown the viability of an ANN-driven CO2 capture simulator performing with minimum time demand and high accuracy. ANN

Fig. 7. Validation of the ANN model with unseen data.


models for CO2 capture plants trained with simulated data could provide useful insight not only into pre-design modelling studies of power plants with CO2 capture, for which high-accuracy component models are desirable, but also into future monitoring as these plants become a reality and are taken into operation.
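The remark that the ANN output is not an iterative solution can be illustrated with a minimal sketch of a feed-forward evaluation: the outputs follow from a fixed number of multiply-add operations and activation evaluations, with no convergence loop. The sketch assumes one hidden layer with a tanh activation and a linear output layer; the six inputs and three outputs mirror the model described in this work, but the two hidden neurons, the weights and the function name are hypothetical and do not correspond to the trained networks.

```python
import math


def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Single pass through one tanh hidden layer and a linear output layer."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w_out, b_out)]


# Hypothetical weights: 6 inputs -> 2 hidden neurons -> 3 outputs.
w_hidden = [[0.1] * 6, [-0.2] * 6]
b_hidden = [0.0, 0.1]
w_out = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
b_out = [0.0, 0.0, 1.0]

# Six scaled input values, all zero here for simplicity.
outputs = forward([0.0] * 6, w_hidden, b_hidden, w_out, b_out)
print(outputs)
```

The cost of one evaluation is fixed by the network size, whereas a rigorous simulator of the closed absorber/desorber loop has to iterate to convergence for every operating point.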


7. Future work

This work has evaluated ANN as a tool for modelling amine-based CO2 capture processes. The original ANN model described in this work has been integrated externally into our existing heat and mass balance component library in IPSEpro using the programming language C++ [25]. These external functions are introduced into the IPSEpro framework based on the Dynamic Link Library (DLL) concept. The description of the implementation and further analysis of the ANN model performance in IPSEpro compared to real power plant performance results will be presented in a following paper.


Acknowledgement

The authors would like to thank Thomas Palmé for his technical support and profound knowledge about ANN modelling.

References

[1] Reddy S, Scherffius J, Freguia S, Roberts C. Fluor's econamine FG PlusSM technology: an enhanced amine-based CO2 capture process. Presented at the second national conference on carbon sequestration, Alexandria, USA; May 5–8, 2003.
[2] Kishimoto S, Hirata T, Iijima M, Ohishi T, Higaki K, Mitchell R. Current status of MHI's CO2 recovery technology and optimization of CO2 recovery plant with a PC fired power plant. Presented at the 9th international conference on greenhouse gas control technologies, Washington DC, USA; November 16–20, 2008.
[3] Desideri U, Paolucci A. Performance modelling of a carbon dioxide removal system for power plants. Energy Convers Manage 1999;40:1899–915.
[4] Singh D, Croiset E, Douglas PL, Douglas MA. Techno-economic study of CO2 capture from an existing coal-fired power plant: MEA scrubbing vs. O2/CO2 recycle combustion. Energy Convers Manage 2003;44:3073–91.
[5] Abu-Zahra MRM, Schneiders LHJ, Niederer JPM, Feron PHM, Versteeg GF. CO2 capture from power plants. Part I. A parametric study of the technical performance based on monoethanolamine. Int J Greenhouse Gas Control 2007;1:37–46.
[6] Abu-Zahra MRM, Schneiders LHJ, Niederer JPM, Feron PHM, Versteeg GF. CO2 capture from power plants. Part II. A parametric study of the economical performance based on mono-ethanolamine. Int J Greenhouse Gas Control 2007;1:135–42.
[7] Romeo LM, Espatolero S, Bolea I. Integration of power plant and amine scrubbing to reduce CO2 capture costs. Appl Therm Eng 2008;28(June):1039–46.
[8] Tobiesen FA. Modelling and experimental study of CO2 absorption and desorption. Ph.D. thesis, Norwegian University of Science and Technology; 2006.
[9] Palmé T, Fast M, Assadi M, Pike A, Breuhaus P. Different condition monitoring models for gas turbines by means of artificial neural networks. Paper GT2009-59364, ASME Turbo Expo, Orlando, Florida, USA; June 2009.
[10] Fast M, Assadi M, De S. Development and multi-utility of an ANN model for an industrial gas turbine. Appl Energy 2009;86:9–17.
[11] Olausson P, Häggståhl D, Arriagada J, Dahlquist E, Assadi M. Hybrid model of an evaporative gas turbine power plant utilizing physical models and artificial neural networks. Paper GT2003-38116, ASME Turbo Expo, Atlanta, Georgia, USA; June 2003.
[12] Arriagada J, Olausson P, Selimovic A. Artificial neural network simulator for SOFC performance prediction. J Power Sources 2002;112:54–60.
[13] Arriagada J, Costantini M, Olausson P, Assadi M, Torisson T. Artificial neural network model for a biomass-fuelled boiler. Paper GT2003-38070, ASME Turbo Expo, Atlanta, Georgia, USA; June 2003.
[14] Tobiesen FA, Juliussen O, Svendsen HF. Experimental validation of a rigorous absorber model for CO2 postcombustion capture. AIChE J 2007;53(4):846–65.
[15] Tobiesen FA, Juliussen O, Svendsen HF. Experimental validation of a rigorous desorber model for CO2 postcombustion capture. Chem Eng Sci 2008;63(10):2641–56.
[16] Cybenko G. Approximation by superpositions of a sigmoidal function. Math Control Signals Syst (MCSS) 1989;2(4):303–14.
[17] Bishop CM. Pattern recognition and machine learning. Springer; 2006.
[18] Guo B, Li D, Cheng C, Lu Z, Shen Y. Simulation of biomass gasification with a hybrid neural network model. Bioresour Technol 2001;76(2):77–83.
[19] Boccaletti C, Cerri G, Seyedan B. A neural network simulator of a gas turbine with a waste heat recovery section. J Eng Gas Turbine Power 2001;123:371–6.
[20] Fast M, Assadi M, Smrekar J. Application of neural network to the condition monitoring and diagnosis of a CHP plant. In: Proceedings of ECOS 2008, Cracow-Gliwice, Poland; June 2008.
[21] Werbos PJ. Beyond regression: new tools for prediction and analysis in the behavioural sciences. Ph.D. thesis, Harvard University, Cambridge, MA; 1974.
[22] Rumelhart DE, Hinton GE, Williams RJ. Learning internal representations by error propagation. In: Rumelhart DE, McClelland JL, editors. Parallel distributed processing: explorations in the microstructure of cognition, vol. 1. Cambridge, MA: MIT Press; 1986. p. 318–62.
[23] Møller MF. A scaled conjugate gradient algorithm for fast supervised learning. Neural Networks 1993;6:525–33.
[24] Hagan MT, Demuth HB, Beale M. Neural network design. Berlin: Springer; 1996. pp. 712.
[25] SimTech Simulation Technology, IPSEpro.