System Identification of an Adsorption Process Using Neural Networks


Copyright © IFAC Advanced Control of Chemical Processes, Kyoto, Japan, 1994

SYSTEM IDENTIFICATION OF AN ADSORPTION PROCESS USING NEURAL NETWORKS

A. BULSARI, J. LEWANDOWSKI and S. PALOSAARI

Department of Chemical Technology, Lappeenranta University of Technology, Box 20, 53851 Lappeenranta, Finland

Abstract

The aim of this work is to investigate the feasibility of using neural networks for system identification of the dynamics of a distributed parameter system, an adsorption column for the treatment of wastewater containing toxic chemicals. System identification of this process is done from simulated data. Feed-forward neural networks have been used for most of the work reported here. The inputs to the networks are the state of the column at a given point in time and the system input, the velocity. From these inputs the network predicts the change in the state over a period of time. Networks of various sizes were trained using data from four simulation runs, and were then tested on data which was not used for training. The network with the configuration (11,6,10) gave the best results in the absence of added noise. Recurrent networks were found to be capable of simulating the whole operation of the column from an initial state of zero concentrations throughout the column, and thus of predicting the complete breakthrough curves.

Keywords

system identification, adsorption, distributed parameter system, breakthrough curves, process dynamics

1. Introduction

Artificial neural networks have been used for a number of applications in process engineering. Much of the work deals with the dynamics of processes, but the processes considered are only lumped parameter systems. Our earlier work has dealt with system identification, state variable estimation, disturbance estimation, filtering, smoothing, sensor fault detection, and control of lumped parameter systems, exemplified by a biochemical process of fermentation by Saccharomyces cerevisiae [1-6] and multivariable control of a linear system [7].

The application in this paper considers system identification of a distributed parameter system. Here, in addition to the variation of state variables over time, the variation over location is also to be considered. The state is distributed over space and is discretised in the space dimension as well, to avoid dealing with infinite dimensional systems. Some kind of computer simulation is necessary in the design of adsorption equipment, since experiments are exceedingly time consuming. There are fundamental models for adsorption, but process parameters such as the effective diffusion coefficient inside the particle and the true void fraction in the bed are difficult to obtain. Traditional parameter fitting to the fundamental equations, based on experimental breakthrough information, does not easily produce reliable values of the system parameters, as discussed by Kaipainen, Reunanen and Palosaari [8] in a work where the moment method was used in the simulation of the separation of sugars by adsorption chromatography.

System identification may therefore be useful. It is not the same as modelling. Modelling means that the variables are related in any convenient way. Dynamic system identification is a systematic way of extracting the dynamics of a system in terms of the first derivatives of its state variables. Distributed parameter systems, which we intend to study here, are complicated because their state variables are distributed.

2. The adsorption process for wastewater treatment

Adsorption is an important separation technique [14] used by the process industry for several purposes, an important one being environmental pollution control. Other major applications include the separation of fine chemicals and biochemical products. Adsorption is essentially a surface phenomenon, in which a component in the bulk phase attaches itself physically to the surface of the adsorbent.

The dynamics of an adsorption column are described by the following equations. Here a single component A is adsorbed from the fluid onto the adsorbent particles as the fluid flows through the column.

$$\frac{\partial C}{\partial t} = -v \frac{\partial C}{\partial z} - \frac{1-\varepsilon}{\varepsilon} R_A$$

$$R_A = \frac{\partial C_s}{\partial t} = k_s a \,(C_s^* - C_s) = k_L a \,(C - C_i)$$

$$C_s^* = f(C_i) \quad \text{(equilibrium relation)}$$

$$k_s = \frac{10 D_s}{d_p}$$

where $C$ is the concentration of the sorbate A in the bulk liquid phase, $C_i$ is the concentration in the liquid at the particle surface, $C_s$ is the mean concentration of sorbate A on a solid volume basis, $C_s^*$ is this concentration in the solid at the surface of the particle, $R_A$ is the rate of transfer of A from the liquid phase to the adsorbent per unit time and volume, $v$ is the superficial velocity of the liquid phase, and $k_s$ and $k_L$ are the mass transfer coefficients in the solid and liquid phases respectively. The effective diffusion coefficient in the particle is $D_s$ and the particle diameter is $d_p$. The distance along the column is $z$ and $t$ is the time. The void fraction of the bed is $\varepsilon$. In the above equations, axial dispersion has not been taken into account and plug flow is assumed. Diffusion inside the particle has been approximated by the linear driving force model.

The boundary conditions are

$$C(z,t) = 0 \quad \text{at } t = 0$$

$$C_s(z,t) = 0 \quad \text{at } t = 0$$

$$C(z,t) = 1 \quad \text{at } z = 0$$

These equations were used to generate data for training the neural network. Concentrations vary along the column and also with time. It is this nature of variation that system identification should try to grasp.

3. Artificial neural networks and their training

A feed-forward neural network (Fig. 1) has an input layer which simply transmits the input variables, without any processing, to the next layer. The bias is a weighted unit input to each node, and thus adds a constant term to the net input of the node. Two, one or no hidden layers are usually considered. Each node in the upper layers (hidden or output layers) receives weighted inputs from each of the nodes in the layer below it. These weighted inputs are added to get the net input of the node, and the output $x_j$ (or the activation) is calculated by an activation function of the net input $a_j$ as

$$a_j = w_{j0} + \sum_{i=1}^{N} w_{ji} x_i, \qquad x_j = \frac{1}{1 + e^{-\beta_j a_j}}$$

where $N$ is the number of nodes in a hidden layer or the number of input nodes, and $w_{j0}$ is the bias of node $j$. $\beta_j$ is referred to as the gain term and is usually set to 1.

The Levenberg-Marquardt method [9,10,12,13] was used to determine the weights in the neural networks. Network training aims at minimising the sum of squares of errors, the errors being measured as the difference between the calculated output and the desired output. Back propagation by the generalised delta rule, a kind of gradient descent method, is one popular method [11] for training feed-forward neural networks. The program ANNAT was used to train the neural networks [3].
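The node computation described in Section 3 can be illustrated with a minimal sketch of one forward pass. It assumes sigmoidal hidden nodes and linear output nodes, the arrangement reported in Section 4, and an (11,6,10) layout as named in the abstract. The function names and the random weights are hypothetical placeholders, not the trained values and not the ANNAT program.

```python
import math
import random

def sigmoid(a, gain=1.0):
    """Logistic activation x = 1 / (1 + exp(-gain * a)); the gain is usually 1."""
    return 1.0 / (1.0 + math.exp(-gain * a))

def layer(inputs, weights, activation):
    """Net input a_j = w_j0 + sum_i w_ji * x_i, followed by the activation.

    Each row of `weights` is [w_j0, w_j1, ..., w_jN]; w_j0 is the bias.
    """
    outputs = []
    for row in weights:
        a_j = row[0] + sum(w * x for w, x in zip(row[1:], inputs))
        outputs.append(activation(a_j))
    return outputs

def forward(x, w_hidden, w_output):
    """One pass through a network with a single hidden layer."""
    hidden = layer(x, w_hidden, sigmoid)            # sigmoidal hidden layer
    return layer(hidden, w_output, lambda a: a)     # linear (identity) output layer

def random_weights(n_nodes, n_inputs, rng, scale=0.1):
    """Hypothetical small random weights, one row per node (bias first)."""
    return [[rng.uniform(-scale, scale) for _ in range(n_inputs + 1)]
            for _ in range(n_nodes)]

rng = random.Random(0)
w_hidden = random_weights(6, 11, rng)     # 11 inputs -> 6 hidden nodes
w_output = random_weights(10, 6, rng)     # 6 hidden nodes -> 10 outputs
x = [0.614e-3] + [0.0] * 10               # velocity followed by 10 concentrations
y = forward(x, w_hidden, w_output)
print(len(y))
```

Training would then adjust the entries of `w_hidden` and `w_output` to minimise the sum of squared errors; that step is not shown here.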

Figure 1. Feed-forward neural network (input layer, possible hidden layers, output layer).

Table 1. Typical parameters and constants used for simulations

  Liquid feed concentration, kg/m³                   1.0
  Void fraction in the adsorbent bed                 0.422
  Diffusivity in the fluid, m²/s                     0.927 x 10^-9
  Diameter of the adsorbent particle, m              0.0011
  Diffusivity in the adsorbent particle, m²/s        1.195 x 10^-11
  Bulk density of the adsorbent, kg/m³               724.1
  Liquid density, kg/m³                              1000
  Liquid dynamic viscosity, kg/(m s)                 0.001
  Liquid superficial velocity, m/s                   0.614 x 10^-3
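To make the space discretisation concrete, the following sketch evaluates the right-hand side of the model of Section 2 at five equally spaced points, using values from Table 1. It is only an illustration under stated assumptions, not the numerical method of [16]: the convection term uses a first-order upwind difference, the liquid-film resistance is neglected (so the surface concentration equals the bulk concentration), and the specific surface is taken as a = 6/d_p for spherical particles. The bed length comes from the caption of Fig. 2 and the isotherm from Section 4.

```python
# Values from Table 1; the bed length is taken from the caption of Fig. 2.
VOID_FRACTION = 0.422        # void fraction of the bed
D_PARTICLE = 0.0011          # particle diameter d_p, m
DIFF_SOLID = 1.195e-11       # diffusivity in the particle D_s, m^2/s
FEED_CONC = 1.0              # liquid feed concentration, kg/m^3
BED_LENGTH = 0.5             # column length, m

K_S = 10.0 * DIFF_SOLID / D_PARTICLE   # solid-phase coefficient k_s = 10 D_s / d_p
AREA = 6.0 / D_PARTICLE                # ASSUMPTION: a = 6 / d_p (spherical particles)

def isotherm(c):
    """Equilibrium relation of Section 4: C_s = 1920 * 153 C / (1 + 153 C)."""
    return 1920.0 * 153.0 * c / (1.0 + 153.0 * c)

def rhs(c, cs, velocity, n_points=5):
    """Time derivatives (dC/dt, dCs/dt) at n_points equally spaced positions.

    ASSUMPTIONS: first-order upwind difference for dC/dz and negligible
    liquid-film resistance, so that R_A = k_s a (f(C) - C_s).
    """
    dz = BED_LENGTH / n_points
    ratio = (1.0 - VOID_FRACTION) / VOID_FRACTION
    dc_dt, dcs_dt = [], []
    for k in range(n_points):
        upstream = FEED_CONC if k == 0 else c[k - 1]    # inlet condition at z = 0
        rate = K_S * AREA * (isotherm(c[k]) - cs[k])    # R_A at this position
        dc_dt.append(-velocity * (c[k] - upstream) / dz - ratio * rate)
        dcs_dt.append(rate)
    return dc_dt, dcs_dt

# Clean column at t = 0: only the first point sees the incoming feed.
dc, dcs = rhs([0.0] * 5, [0.0] * 5, velocity=0.614e-3)
print(dc[0], dcs[0])
```

The ten returned derivatives correspond to the ten state variables whose changes the network is trained to predict. An actual simulator would likely need a finer grid and a time integrator suited to the fast mass transfer.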

4. Results

The system used for simulations was a typical wastewater treatment process, for which the physical parameters and operating conditions are given in Table 1. The adsorption isotherm (the equilibrium relation) was taken to be

$$C_s = 1920 \, \frac{153\,C}{1 + 153\,C}$$

Four simulations were carried out with different flow rates (velocities), resulting in 34 training instances. The numerical method has been tested against analytical solutions in the applicable ranges, and against experimental breakthrough curve information, as shown by Reunanen [14,16]. The liquid phase and solid phase concentrations are thus obtained at 5 points in the column, making a total of 10 variables to be monitored for their dynamics. The inputs to the neural networks are these 10 variables and the velocity. The outputs are the changes in these 10 variables over a period of 10 000 seconds.

A linear correlation results in an error square sum of 0.112, which confirms the non-linearity of the relation between the input and the output variables. A (11,5,10) network with sigmoidal activation functions in the hidden layer and linear (identity) activation functions in the output layer had a minimum error square sum of 0.0350 on the 34 training instances, or an rms (root mean square) error of 0.0101. The error square sum on 10 test instances at a different velocity was 0.0207, or an rms error of 0.0144. A (11,6,10) network resulted in an error square sum of 0.00885 (rms error of 0.0051) on the training instances and 0.0024 (rms error of 0.0049) on the 10 test instances. A (11,7,10) network resulted in an error square sum of 0.00734 (rms error of 0.0046) on the training instances and 0.0550 (rms error of 0.0234) on the 10 test instances. A (11,8,10) network resulted in an error square sum of 0.00192 (rms error of 0.0024) on the training instances and 0.0410 (rms error of 0.0203) on the 10 test instances.

The (11,6,10) network had the lowest error on the test instances and thus gave the best results. The predicted changes in the liquid phase concentrations are plotted in Fig. 2, and the predicted liquid phase concentrations are shown in Fig. 3. The results are good and the error is within the accuracy limits of measurements.
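The error square sums and rms errors quoted above are consistent with taking the root mean square over every output of every instance, that is, over 34 x 10 values for training and 10 x 10 values for testing. This normalisation is inferred from the reported numbers rather than stated in the text; the short check below makes the arithmetic explicit.

```python
import math

def rms_error(error_square_sum, n_instances, n_outputs=10):
    """Root mean square error over all outputs of all instances."""
    return math.sqrt(error_square_sum / (n_instances * n_outputs))

# (training error square sum, test error square sum) for the first two networks
reported = {
    "(11,5,10)": (0.0350, 0.0207),
    "(11,6,10)": (0.00885, 0.0024),
}

for name, (train_sse, test_sse) in reported.items():
    train_rms = rms_error(train_sse, n_instances=34)
    test_rms = rms_error(test_sse, n_instances=10)
    print(name, round(train_rms, 4), round(test_rms, 4))
```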

The solid phase concentrations are not plotted because the mass transfer is relatively fast and the concentrations in solid phase are simply a multiple of the concentrations in the liquid phase, i.e., the curves look more or less the same. This is also a reason why a network with just six nodes in the hidden layer is sufficient. More nodes would have been required if the mass transfer were slower.
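A rough count of adjustable weights helps to put the hidden layer sizes tried here in context. The sketch below assumes one bias per hidden and output node, as described in Section 3, and compares the count with the 34 x 10 target values available for training. It is only a back-of-the-envelope illustration and not an analysis made in this work.

```python
def n_weights(n_in, n_hidden, n_out):
    """Number of weights, including one bias per hidden and output node."""
    return (n_in + 1) * n_hidden + (n_hidden + 1) * n_out

n_targets = 34 * 10    # training instances times outputs per instance

for hidden in (5, 6, 7, 8):
    print(hidden, n_weights(11, hidden, 10), n_targets)
```

Each additional hidden node adds 22 weights, so the larger networks have more freedom to fit the 340 training targets closely, which may be one reason why their training errors fall while their test errors rise.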

[Plot: change in liquid phase concentration against time at 0.1, 0.2, 0.3, 0.4 and 0.5 m; points: neural network, line: training data]

Figure 2. Changes in liquid phase concentrations at the 5 sampling points predicted using the (11,6,10) network. The changes are during the previous 10 000 seconds. The sampling points are measured from the entrance of the 0.5 m long column.

[Plot: liquid phase concentration against time; points: neural network, line: training data]

Figure 3. Liquid phase concentrations predicted using the (11,6,10) network.

5. Recurrent networks for full trajectory prediction

The previous section aimed at understanding how the concentrations change over a single time step. One can extend this to predicting a complete trajectory of a variable, in other words, predicting the complete breakthrough curves at different sampling points. Recurrent networks can be used for this purpose. Feed-forward neural networks with 6 nodes in the hidden layer were trained again, this time with sigmoidal activation functions in the output layer as well as in the hidden layer. Sigmoids were used in the output layer because the predictions diverged more when linear activation functions were used. The recurrent links were then set from the ten output nodes to the last ten input nodes, leaving aside the first input node for velocity. The concentrations were recalculated after every forward pass through the network based on the predicted changes in the variables, i.e., the outputs of the network were calculated again using the concentrations from the previous calculation of the outputs of the network. The variation of the liquid phase concentrations with time (sampling number) is shown in Fig. 4.

The difference between Figs. 3 and 4 is that only the next step was predicted for Fig. 3, whereas Fig. 4 is generated by predicting the whole breakthrough curve from one initial condition. The initial condition is zero concentrations throughout the column. A ten step ahead prediction results in very reasonable values, and almost the whole breakthrough curves at the first four sampling points are predicted quite well, starting with zero concentrations throughout the column. The last curve has accumulated errors from all the previous sampling points, and hence its accuracy is lower. However, the results are qualitatively correct for the last point as well.

6. Discussion and Conclusions

The method applied in this work may appear rather cumbersome compared with a simple neural network that maps input data directly to the concentrations in the fluid leaving the adsorption column. In the method used here, the system dynamics are incorporated explicitly, so that the neural network can be treated as a dynamic simulator, a system with its own dynamics, which is supposed to match that of the real system. The model is still empirical but has explicit dynamics in it. Besides, this approach makes it possible to build such a simulator from experimental information and then to use the simulator in parameter fitting to obtain important process information such as the effective diffusivity of the adsorbent and the void fraction of the bed. It is difficult to say whether doing the parameter fitting in this way has an advantage over fitting the parameters directly to the fundamental equations.

This work illustrates the successful use of feed-forward neural networks for system identification of the adsorption process. Unlike most other system identification problems reported in the neural network literature, the process considered here is a distributed parameter system. The methodology used results in fairly accurate predictions of the changes in variables over time. The problem considered here has non-linear characteristics, which justify the use of feed-forward neural networks. A (11,6,10) network gave the best results.

The feasibility of system identification of this process has been established using synthetic data, which indicates that the same can be performed from experimental data when all the required measurements are available. Complete breakthrough curves were also successfully predicted using only the initial state with a recurrent network.

[Plot: liquid phase concentration against time; points: neural network, line: data]

Figure 4. Liquid phase concentrations (breakthrough curves) predicted using the (11,6,10) recurrent network with 10 feedback links.
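The recurrent scheme of Section 5, in which the ten predicted changes are added to the current concentrations and fed back as the last ten inputs, can be sketched as a simple loop. The function `rollout` and the stand-in model `toy_model` are hypothetical names used only for illustration; a trained (11,6,10) network would take the place of `toy_model`.

```python
def rollout(predict_change, velocity, initial_state, n_steps):
    """Iterate a one-step model: state <- state + predicted change."""
    state = list(initial_state)
    trajectory = [list(state)]
    for _ in range(n_steps):
        # First input is the velocity, the other ten are fed back from the outputs.
        change = predict_change([velocity] + state)
        state = [s + d for s, d in zip(state, change)]
        trajectory.append(list(state))
    return trajectory

def toy_model(inputs):
    """Hypothetical stand-in for a trained network: move each variable 30 % of the way to 1."""
    return [0.3 * (1.0 - s) for s in inputs[1:]]

# Start from zero concentrations throughout the column and predict 12 steps ahead.
curves = rollout(toy_model, velocity=0.614e-3, initial_state=[0.0] * 10, n_steps=12)
print(len(curves), len(curves[0]))
```

Because each step starts from the previous prediction rather than from simulated data, errors made early in the trajectory are carried forward, in contrast to the single step predictions of Fig. 3.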

References

1. Bulsari, A. and H. Saxen, "System identification of a biochemical process using feed-forward neural networks", Neurocomputing - An International Journal, Vol. 3 (1991) 125-133.
2. Bulsari, A., A. Medvedev and H. Saxen, "An algorithm for sensor fault detection using state vector estimator and feed-forward neural networks applied to a biochemical process", Acta Polytechnica Scandinavica, Chemical Technology and Metallurgy Series, No. 199 (1991) 1-20.
3. Bulsari, A., B. Saxen and H. Saxen, "Programmen ANNAT och EVAL för framåtkopplade neurala nätverk", Report 91-2, Värmeteknik, Åbo Akademi, April 1991.
4. Bulsari, A. and H. Saxen, "Estimation of state and disturbance in a biochemical process using feed-forward neural networks", Report 33, Chemical Technology, Lappeenranta University of Technology, May 1992.
5. Bulsari, A. and H. Saxen, "Filtering, smoothing and prediction in a biochemical process using feed-forward neural networks", Report 92-4, Värmeteknik, Åbo Akademi, July 1992.
6. Bulsari, A., B. Saxen and H. Saxen, "Control of a fed-batch bioreactor using feed-forward neural networks", Report 92-7, Värmeteknik, Åbo Akademi, August 1992.
7. Bulsari, A. and H. Saxen, "A study on multivariable control using feed-forward neural networks", Report 92-5, Värmeteknik, Åbo Akademi, July 1992.
8. Kaipainen, E., J. Reunanen and S. Palosaari, "The use of moment method in the simulation of the separation of the fructose-glucose mixture by adsorption chromatography", Lappeenranta Univ. of Technology, Department of Chemical Technology, Publication No 41 (1993) 1-20.
9. Fletcher, R., "Practical methods of optimization. Vol. 1, Unconstrained optimization", John Wiley and Sons, Chichester, England (1980) 82-88.
10. Gill, P. E., W. Murray and M. H. Wright, "Practical Optimization", Academic Press, London (1981) 136-140.
11. Jones, W. P. and J. Hoskins, "Back-Propagation: A generalized delta learning rule", Byte (October 1987) 155-162.
12. Levenberg, K., "A method for the solution of certain nonlinear problems in least squares", Quart. Appl. Math., 2 (1944) 164-168.
13. Marquardt, D. W., "An algorithm for least-squares estimation of nonlinear parameters", J. Soc. Indust. Appl. Math., 11 (June 1963) 431-441.
14. Reunanen, J., "Calculation of the breakthrough curve of adsorption column" (in Finnish), Licentiate of Technology Thesis, Lappeenranta University of Technology, Finland, 1992.
15. Suzuki, M., "Adsorption Engineering", Chemical Engineering Monographs, Vol. 25, Elsevier Science Publishers, Amsterdam, 1990.
16. Reunanen, J., S. Palosaari, M. Miyahara and M. Okazaki, "Reliable numerical calculation of the breakthrough curve of an adsorption column", to be published in Chemical Engineering and Processing.