Copyright © IFAC Advanced Control of Chemical Processes, Banff, Canada, 1997
HYBRID NEURAL NETWORK-FIRST PRINCIPLES MODELS APPLIED TO A FOOD PROCESS
S. Albert, G.A. Montague
Department of Chemical and Process Engineering, University of Newcastle, Newcastle upon Tyne, NE1 7RU, UK. E-mail: gary.montague@ncl.ac.uk
Abstract: In recent years hybrid modelling approaches combining neural networks and first principles models have attracted interest. The main reasons are that a black-box neural network model is not physically interpretable and is not as extensive as a first principles model, while the development of an accurate mechanistic model for a specific plant is costly and difficult. In this paper a number of hybrid approaches are described and applied to develop models of a continuous food process plant. The results presented indicate that the hybrid approaches generate accurate predictions of food product quality.
Keywords: hybrid models, neural networks, food processing
1. INTRODUCTION
Food processes are poorly understood, and the relationships between system variables are complex and highly non-linear. Furthermore, the actual system variables can be difficult to quantify; in some circumstances only a qualitative measure such as 'taste' may be available.
In today's economic climate there is increasing pressure for chemical plants to operate under the best operating conditions. For a control system to provide high performance, the model must represent the process accurately. This is particularly apparent when using advanced control schemes which are explicitly model based. There are two basic model forms: mechanistic and empirical.
At present there is very strong interest in the field of neural computation, which offers the promise of a general, cost-effective modelling methodology with the potential for tackling such problems. The major attraction of neural network models is that they can be used to approximate any continuous non-linear function (Hornik et al., 1989). They appear to require a reduced skill level of the user compared to mechanistic modelling techniques, and it is possible to adapt the neural network to process changes on-line. On the other hand, neural networks have also been criticised because of some limitations associated with their use. The functional relationships developed between inputs and outputs are strongly dependent on the training data presented, and therefore generalisation ability beyond this data set is poor. Physical interpretation of neural network models is also difficult.
Mechanistic models generally consist of a set of equations developed from physical and chemical laws, including system variables, parameters, constants and various mathematical relations. A number of unknown parameters need to be estimated before the model equations can be solved. This is often achieved by collecting data from the process and finding parameters which provide the best fit, or they might be estimated from a pilot plant. Full mechanistic models for real industrial processes are often large and complex, and due to process changes, model parameters may need to be adjusted. These problems are more severe when considering food processing.
One possible solution is the incorporation of prior knowledge into the neural networks by creating a hybrid model through the integration of a neural network and a first principle model.
2. PROCESS AND MODEL STRUCTURE
Several workers have studied this methodology over the last few years. Johansen and Foss (1992) developed a hybrid model that takes advantage of the fact that part of the process is well understood; they used neural networks only in operating regimes where the mechanistic model is not accurate enough. They created a model validity function indicating the accuracy as a function of operating point, and modified the existing model or added local models in case of plant-model mismatch. Thompson and Kramer (1994) gave a general discussion of both serial and parallel hybrid structures used for integrating different kinds of prior knowledge with radial basis function networks. They proposed a parallel structure consisting of a default model and a radial basis function network trained to learn the residual between the output of the default model and the target. A different approach was taken by Psichogios and Ungar (1992), who used a feedforward neural network to estimate parameters in a dynamic first principles model of a fed-batch bioreactor. Su et al. (1992) integrated a recurrent neural network with a first principles model to learn unmodelled process dynamics. Schubert et al. (1994) developed a hybrid model for state estimation and feedrate optimisation of a fed-batch fermentation process. A fuzzy expert system, a feedforward neural network and dynamic differential equations were integrated to provide better generalisation ability. Becraft and Lee (1993) built neural networks and expert systems for fault detection and diagnosis. Mavrovouniotis and Chang (1992) presented a hierarchical neural network consisting of loosely coupled small neural networks, each representing a subset of the input variables relating to a particular process unit, which results in easier interpretation. Schaffranietz and Röck (1994) applied a hybrid fuzzy controller incorporating a mathematical model and an extended Kalman filter to control nitrification in waste water treatment. Reichl et al. (1994) described a hybrid system for speech recognition consisting of a neural network and a hidden Markov model, with a new training algorithm for such classifier networks.
It is the intention of this paper to take the various hybrid methods, essentially developed using simulations, and compare their behaviour when applied to a real industrial modelling problem.
Figure 1. Scheme of the production line of the food process studied
The process under consideration is a food process plant operating in continuous mode to produce breakfast cereal. A basic schematic of the process is shown in Figure 1. Cereal is processed using several unit operations to arrive at a packaged product. Seven quality measurements are made on the finished product to ensure customer satisfaction. Unfortunately, it is not possible to make these measurements on-line, so periodic off-line sampling is carried out. The objective of this research is to develop inferential models that provide on-line estimates of the quality variables using on-line process information. Twenty-eight process variables are measured directly, including feedrates, residence times and temperatures. In this paper we restrict the study to four quality variables, referred to as Q1 to Q4.
In the case of Q1 and Q2, the available process knowledge is limited, as the process involves a complex series of biochemical reactions. However, it is possible to make simplifying assumptions (i.e. first order kinetics) to describe the progression of these variables with time. The simplification provides a reasonable approximation but lacks the fine detail required for estimation purposes. In the case of Q3 and Q4, it was possible to develop a detailed mechanistic model from a series of heat and mass balance equations for several process units. The model, developed by factory experts, includes a number of unknown parameters which needed to be estimated before the model could be solved. The final unit operation could not be described accurately, as little sensory information was available.
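As an illustration of the first order simplification mentioned for Q1 and Q2, one common way of writing such a model is given below. The paper does not state its equations, so the symbols (rate constant k, pre-exponential factor A, activation energy E_a, gas constant R, temperature T, initial value Q_0) and the choice of a decaying rather than a growing quantity are assumptions made purely for the sketch.

```latex
\frac{dQ}{dt} = -k(T)\,Q,
\qquad
k(T) = A \exp\!\left(-\frac{E_a}{R\,T}\right)
\quad\Longrightarrow\quad
Q(t) = Q_0 \exp\!\bigl(-k(T)\,t\bigr)
\quad \text{for constant } T
```

In a form like this, A and E_a would be among the unknown parameters that have to be fitted to plant data.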
The available mechanistic models were thus either not complete or failed to describe the process with the required degree of accuracy. In a situation like this, a logical approach might be to combine the available mechanistic understanding with empirical modelling techniques. There are several ways of including prior knowledge into empirical models.
2.1. Hybrid(I) type model
Figure 3a. Combined model building
Figure 2 shows an approach referred to as the Hybrid(I) structure, where the neural network (NN) is in tandem with the first principles model (FP), which has a fixed structure derived from a mechanistic understanding but with parametric uncertainty. The neural network component estimates the unknown parameters.
Figure 3b. Combined model usage
2.3. Hybrid(III) type model
This approach differs from the pure neural network model only in that the output of the mechanistic model is fed as an additional process input to the neural network.
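The input augmentation behind the Hybrid(III) structure can be sketched in a few lines. The mechanistic model `fp_model`, the helper `augment` and the sample values are all hypothetical; the sketch only shows how the network's input vector would be built, not how the authors' network was trained.

```python
import math

def fp_model(x):
    """Hypothetical approximate mechanistic model (first-order decay in x[0])."""
    return 2.0 * math.exp(-x[0])

def augment(x, mechanistic):
    """Append the mechanistic prediction to the raw process inputs."""
    return list(x) + [mechanistic(x)]

# Made-up process samples: [time-like variable, temperature-like variable]
process_inputs = [[0.0, 80.0], [1.0, 85.0], [2.0, 90.0]]
network_inputs = [augment(x, fp_model) for x in process_inputs]
```

Any feedforward network can then be trained on `network_inputs` in the usual way; the only structural change is the extra input column.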
Figure 2. Hybrid(I) type: sequential hybrid model
In the hybrid(I) approach the neural network outputs are unknown process parameters which are fed into the mechanistic equations that subsequently provide an estimate for the quality variables. The difference between this estimate and the actual quality measurements is minimised during training.
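A minimal sketch of this serial training scheme follows, under strong simplifying assumptions: the "network" is a single softplus neuron, the first principles part is a hypothetical first-order decay `q = q0 * exp(-k * t)`, and the data are synthetic. None of these choices come from the paper; the point is only that the prediction error is propagated back through the mechanistic equation to the parameter estimator.

```python
import math
import random

def softplus(z):
    return math.log1p(math.exp(z))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fp_model(k, t, q0=1.0):
    """Assumed first-principles part: first-order decay with rate constant k."""
    return q0 * math.exp(-k * t)

def hybrid1_predict(w, b, x, t):
    """NN part (one neuron here) estimates k; FP part maps k to quality."""
    return fp_model(softplus(w * x + b), t)

def train_hybrid1(data, lr=2.0, epochs=1000):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        gw = gb = 0.0
        for x, t, q in data:
            z = w * x + b
            q_hat = fp_model(softplus(z), t)
            # Chain rule through the FP equation:
            # dq_hat/dk = -t * q_hat and dk/dz = sigmoid(z)
            dz = 2.0 * (q_hat - q) * (-t * q_hat) * sigmoid(z)
            gw += dz * x
            gb += dz
        w -= lr * gw / len(data)
        b -= lr * gb / len(data)
    return w, b

rng = random.Random(0)
data = []
for _ in range(200):
    x = rng.uniform(0.0, 1.0)          # process input
    t = rng.uniform(0.5, 3.0)          # elapsed time
    k_true = 0.5 + 0.3 * x             # hidden parameter, never measured directly
    data.append((x, t, fp_model(k_true, t)))

w, b = train_hybrid1(data)
mse = sum((hybrid1_predict(w, b, x, t) - q) ** 2 for x, t, q in data) / len(data)
k_at_0, k_at_1 = softplus(b), softplus(w + b)
```

Only quality values are used as training targets, yet `k_at_0` and `k_at_1` give estimates of the rate constant at the two ends of the input range, which is the interpretability benefit of this structure.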
Figure 4. Hybrid(III) model
2.2. Hybrid(II) type model
In this second type of hybrid model the plant inputs are fed both to the neural network and the mechanistic model. The neural network is trained off-line to learn the difference between the prediction of the mechanistic model and the actual plant data. The neural network in fact learns unmodelled information left in the residuals and thus compensates for inaccuracy of the mechanistic model. Figures 3a and 3b illustrate the proposed structure.
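The parallel structure can be illustrated with a small synthetic example. Everything specific here is an assumption for the sketch: the mechanistic model `fp_model`, the made-up `plant` function with an unmodelled quadratic term, the six-unit tanh network and the plain stochastic gradient descent settings.

```python
import math
import random

def fp_model(x):
    """Hypothetical approximate mechanistic model: first-order decay only."""
    return 2.0 * math.exp(-x)

def plant(x):
    """Made-up 'true' process: the FP behaviour plus an unmodelled effect."""
    return fp_model(x) + 0.25 * x * x

def make_net(n_hidden, rng):
    """One hidden tanh layer with a linear output; single input."""
    return {"w1": [rng.uniform(-1, 1) for _ in range(n_hidden)],
            "b1": [0.0] * n_hidden,
            "w2": [rng.uniform(-1, 1) for _ in range(n_hidden)],
            "b2": 0.0}

def net_out(net, x):
    h = [math.tanh(w * x + b) for w, b in zip(net["w1"], net["b1"])]
    return sum(w * hj for w, hj in zip(net["w2"], h)) + net["b2"], h

def sgd_step(net, x, target, lr):
    y, h = net_out(net, x)
    err = y - target
    for j, hj in enumerate(h):
        grad_pre = err * net["w2"][j] * (1.0 - hj * hj)  # backprop through tanh
        net["w2"][j] -= lr * err * hj
        net["w1"][j] -= lr * grad_pre * x
        net["b1"][j] -= lr * grad_pre
    net["b2"] -= lr * err

rng = random.Random(0)
xs = [2.0 * i / 99 for i in range(100)]
# Model building: train the network on the residual (plant data minus FP prediction).
residuals = [(x, plant(x) - fp_model(x)) for x in xs]
net = make_net(6, rng)
for _ in range(300):
    rng.shuffle(residuals)
    for x, r in residuals:
        sgd_step(net, x, r, lr=0.05)

def hybrid2(x):
    """Model usage: mechanistic prediction plus the learned correction."""
    return fp_model(x) + net_out(net, x)[0]

def rmse(model):
    return math.sqrt(sum((model(x) - plant(x)) ** 2 for x in xs) / len(xs))

rmse_fp, rmse_hybrid = rmse(fp_model), rmse(hybrid2)
```

The two stages mirror Figures 3a and 3b: the residual targets are formed during model building, and in use the network output is simply added to the mechanistic prediction.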
2.4. Hybrid(IV) type model
Where there are parts of the process which are well understood but other sections are difficult to describe, the serial structure shown in Figure 5 could be applied.
Figure 5. Hybrid(IV) model: different modelling techniques in series
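The idea of chaining unit models and replacing only the poorly understood unit with an empirical model can be sketched as follows. The two mechanistic units, their coefficients and the hidden behaviour of the final unit are invented for illustration, and a closed-form least-squares line stands in for the neural network to keep the example short.

```python
import math

def unit1_heater(feed_temp, air_temp):
    """Hypothetical heat-balance unit: product temperature leaving unit 1."""
    return feed_temp + 0.5 * (air_temp - feed_temp)

def unit2_dryer(temp, residence_min):
    """Hypothetical mass-balance unit: moisture leaving unit 2."""
    return 0.30 * math.exp(-0.002 * temp * residence_min)

def upstream(feed_temp, air_temp, residence_min):
    """Well-understood units kept in mechanistic form and run in series."""
    return unit2_dryer(unit1_heater(feed_temp, air_temp), residence_min)

def measured_quality(moisture):
    """Made-up plant behaviour of the final unit, unknown to the modeller."""
    return 12.0 * moisture + 1.5

def fit_line(xs, ys):
    """Least-squares line: a simple stand-in for the empirical (NN) unit."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Logged operating conditions: (feed temperature, air temperature, residence time)
conditions = [(20.0, air, rt) for air in (100.0, 120.0, 140.0)
              for rt in (5.0, 10.0, 15.0)]
moistures = [upstream(*c) for c in conditions]
qualities = [measured_quality(m) for m in moistures]
slope, intercept = fit_line(moistures, qualities)

def hybrid4(feed_temp, air_temp, residence_min):
    """Mechanistic units first, then the fitted empirical unit."""
    return slope * upstream(feed_temp, air_temp, residence_min) + intercept
```

The empirical unit is fitted on the intermediate variable computed by the mechanistic units, so the structure of the rest of the process model is left untouched.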
The neural networks were used either as pure neural network models or as part of the hybrid models described in the previous section. The root mean square (RMS) errors of the best mechanistic models are presented in Table 1 for comparison with the other models.
3. MODEL INVESTIGATION
The application of hybrid models is illustrated on two data sets logged at the plant. A period of two years elapsed between collecting the two sets, and as a result raw material variations and process changes (e.g. sensor changes) could be expected. The process variables were logged with a sample time of one minute, while quality information was available every fifteen minutes. A cubic spline interpolation routine was used to equalise the sample rates. The models are based upon linking cause and effect; it was therefore necessary to time-shift the data using estimated residence times.
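The two preprocessing steps described above, resampling and time-shifting, can be sketched as follows. Linear interpolation is used here as a simpler stand-in for the cubic spline routine of the paper, and the sample values, the 10-minute residence time and the function names are assumptions for illustration only.

```python
from bisect import bisect_right

def interp_linear(ts, ys, t):
    """Linearly interpolate the series (ts, ys) at time t (ts sorted, t in range)."""
    i = min(max(bisect_right(ts, t) - 1, 0), len(ts) - 2)
    frac = (t - ts[i]) / (ts[i + 1] - ts[i])
    return ys[i] + frac * (ys[i + 1] - ys[i])

def align(process, quality_1min, residence_min):
    """Pair the process value at minute t - residence_min with quality at minute t."""
    return [(process[t - residence_min], quality_1min[t])
            for t in range(residence_min, len(quality_1min))]

# Quality is sampled every fifteen minutes (made-up values).
q_times = [0, 15, 30]
q_vals = [1.0, 4.0, 1.0]
# Equalise sample rates: bring quality onto the one-minute process grid.
quality_1min = [interp_linear(q_times, q_vals, t) for t in range(31)]
# Process variable logged every minute (made-up values).
process = [float(t) for t in range(31)]
# Link cause and effect: shift by an assumed residence time of 10 minutes.
pairs = align(process, quality_1min, residence_min=10)
```

Each element of `pairs` then holds a process measurement together with the quality value it is assumed to have caused, which is the form needed for training the inferential models.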
In the case of Q1 it is clear that the hybrid(II) approach results in a significantly smaller RMS error than the mechanistic model, and including mechanistic information has improved the predictions over a pure neural network model. The quality of predictions for hybrid models II and III is comparable, while the serial hybrid(I) approach degraded predictive ability. There are several possible reasons why this could be the case. One likely scenario is that the mechanistic model is oversimplified, which constrains the approximation ability of the neural network component. An advantage of the hybrid(I) type model is that, apart from the prediction, estimates of the process parameters are obtained, since the nature of the input-output relationship is known. To demonstrate the quality of fit, Figures 6 and 7 show the predictions for both Q1 and Q2 using the best hybrid models on data set II.
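To put numbers on these comparisons, the Q1 entries for data set 1 in Table 1 can be turned into relative changes in RMS error. The helper name `percent_reduction` is an illustrative choice; the RMS values themselves are taken from Table 1.

```python
# RMS errors for Q1, data set 1, as listed in Table 1.
rmse_q1_data1 = {
    "Mechanistic": 1.2647,
    "NN": 0.9031,
    "Hybrid(I)": 0.9425,
    "Hybrid(II)": 0.6751,
    "Hybrid(III)": 0.7662,
}

def percent_reduction(reference, candidate):
    """Positive values mean the candidate has a smaller RMS error."""
    return 100.0 * (1.0 - candidate / reference)

vs_mechanistic = percent_reduction(rmse_q1_data1["Mechanistic"],
                                   rmse_q1_data1["Hybrid(II)"])
vs_nn = percent_reduction(rmse_q1_data1["NN"], rmse_q1_data1["Hybrid(II)"])
serial_vs_nn = percent_reduction(rmse_q1_data1["NN"], rmse_q1_data1["Hybrid(I)"])
```

On these figures the hybrid(II) model reduces the RMS error by roughly 47% relative to the mechanistic model and by roughly 25% relative to the pure neural network, while the negative value of `serial_vs_nn` reflects the degradation seen with the serial hybrid(I) model.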
For all models the data were partitioned into three sets: training, testing and validation. The training data were used to adapt the model parameters, the testing data were used to terminate the training procedure, and the validation data were used to assess the quality of the resulting model. For both data sets the validation data represented a different period of operation, so they are suitable for checking the extrapolation ability of the models.
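The three-way partition and the use of the testing set to terminate training can be sketched as below. The 60/20/20 split, the straight-line model, the patience rule and the synthetic data are all assumptions for the sketch; the paper does not state the proportions or the stopping criterion it used.

```python
import math
import random

def split_chronological(samples, pct_train=60, pct_test=20):
    """Keep time order so the last block comes from a different operating period."""
    a = len(samples) * pct_train // 100
    b = a + len(samples) * pct_test // 100
    return samples[:a], samples[a:b], samples[b:]

def rmse(params, data):
    w, b = params
    return math.sqrt(sum((w * x + b - y) ** 2 for x, y in data) / len(data))

def gradient_step(params, data, lr):
    w, b = params
    gw = sum(2.0 * (w * x + b - y) * x for x, y in data) / len(data)
    gb = sum(2.0 * (w * x + b - y) for x, y in data) / len(data)
    return (w - lr * gw, b - lr * gb)

def train_early_stopping(train, test, lr=0.1, max_epochs=5000, patience=25):
    """Adapt parameters on train; stop when the test error stops improving."""
    params = best = (0.0, 0.0)
    best_err, wait = rmse(best, test), 0
    for _ in range(max_epochs):
        params = gradient_step(params, train, lr)
        err = rmse(params, test)
        if err < best_err:
            best, best_err, wait = params, err, 0
        else:
            wait += 1
            if wait >= patience:
                break
    return best

rng = random.Random(0)
# Synthetic time-ordered data: (input, noisy output)
samples = [(i / 100.0, 2.0 * (i / 100.0) + 1.0 + rng.gauss(0.0, 0.1))
           for i in range(100)]
train, test, valid = split_chronological(samples)
model = train_early_stopping(train, test)
valid_rmse = rmse(model, valid)
```

Because the split is chronological, `valid_rmse` is measured on inputs outside the training range, which loosely mirrors the extrapolation check described above.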
In the case of Q1 and Q2 the approximate mechanistic model, based upon first-order Arrhenius kinetics, results in a non-linear function with several unknown parameters. These parameters have been optimised to give the best fit to the data in the least-squares sense. As in any non-linear optimisation problem, a globally optimal solution is not guaranteed and the procedure is time-consuming. Static feedforward neural networks trained by backpropagation have been applied.
Table 1. Root Mean Square Error on validation data
Model      Mechanistic  NN      Hybrid(I)  Hybrid(II)  Hybrid(III)
Q1 Data1   1.2647       0.9031  0.9425     0.6751      0.7662
Q2 Data1   1.3398       0.6791  0.1856     0.5001      0.3498
Q1 Data2   0.5980       0.4377  1.3397     0.4104      0.4330
Q2 Data2   0.1681       0.1898  0.4225     0.1723      0.1859
Figure 6. Prediction of Q1 by Hybrid II model on validation data (legend: Actual, Hybrid(II))
Figure 7. Prediction of Q2 by Hybrid II model on validation data set (legend: Actual, Hybrid(II))
Q3 and Q4 are described by a mechanistic model which fits, on average, the data collected during a certain time period but is strongly biased for data belonging to any other operating region. These results suggest that the detailed mechanistic model is not appropriate. There could be several reasons for model mismatch: noise, incorrect model structure and inaccurate parameter values. Process noise is always present when dealing with industrial data and does not cause such significant bias. However, it is difficult to decide without a sufficient level of process knowledge whether the cause of the problem is incorrect model structure or parameter uncertainty. Studies on the mechanistic model revealed that one particular unit, operating at the end of the production line, had a poorly defined model structure and as a result the overall model was inappropriate. Work is currently being carried out to improve this mechanistic model. Given that the mechanistic model is inaccurate, it might be expected that the predictions of the hybrid models would be worse than a pure neural network fit to the data. This was indeed the case, as shown in Table 2 for hybrid models II and III. An excessive number of unknown parameters precluded the use of hybrid model type I.
Table 2. Root Mean Square Error on validation data
Model        Mechanistic  Neural Network  Hybrid(II)  Hybrid(III)  Hybrid(IV)
Q3 (Data1)   5.5041       1.012           1.231       1.114        1.011
Q4 (Data1)   13.7615      9.1936          32.3916     13.3334      7.4800
The fact that the process mechanistic model has been split into unit operations allows the substitution of empirical models for problematic parts. The hybrid(IV) model follows this philosophy and simply replaces the critical unit with a neural network structure in the overall model. The superior prediction of this model over both the pure neural network and the first principles model indicates that one can place confidence in the other parts of the mechanistic model, and that it is worth modelling only the poorly understood parts with the empirical approach. Such a model shows improved extrapolation ability as well as being easier to interpret. The predictions for Q3 and Q4 of several models are shown in Figure 8.
Figure 8. Hybrid(IV) model predictions (---) of product quality
4. CONCLUSIONS
In this paper a number of hybrid modelling approaches have been investigated and applied to modelling quality variables on an industrial continuous food process. It is shown that the hybrid modelling approaches can offer more accurate predictions. In particular, when limited process knowledge is available, a parallel-structured hybrid model allows the neural network component to capture the mismatch between process data and the predictions of a first principles model. The hybrid modelling technique was also shown to be useful when a model of only a subset of unit operations is available. The serial approach to hybrid modelling proved to be ineffective in this particular case. The success of the modelling has led the industrial collaborator to carry out an on-line trial to assess the estimation capabilities.
5. ACKNOWLEDGEMENTS
The authors would like to thank the industrial collaborator for its support and for providing the data, and to acknowledge the support of the UK Department of Trade and Industry and the Department of Chemical and Process Engineering.
REFERENCES
Becraft, W.R. and P.L. Lee (1993). An Integrated Neural Network/Expert System Approach for Fault Diagnosis. Computers Chem. Eng., 17 (10), 1001-1014.
Hornik, K., M. Stinchcombe and H. White (1989). Multilayer Feedforward Networks are Universal Approximators. Neural Networks, 2, 359-366.
Johansen, T.A. and B. Foss (1992). Representing and Learning Unmodelled Dynamics with Neural Network Memories. Proc. Amer. Control Conf., 3037-3043.
Mavrovouniotis, M.L. and S. Chang (1992). Hierarchical Neural Networks. Computers Chem. Eng., 16 (4), 347-369.
Psichogios, D.C. and L.H. Ungar (1992). A Hybrid Neural Network-First Principles Approach to Process Modelling. AIChE Journal, 38 (10), 1449-1511.
Reichl, W., P. Caspary and G. Ruske (1994). A New Model-Discriminant Training Algorithm for Hybrid NN-HMM Systems. Int. Conf. on Acoustics, Speech, and Signal Processing, Proceedings, 2, 677-680.
Schaffranietz, U. and H. Röck (1994). A Hybrid Controller Combining Model- and Knowledge-Based Methods Applied to a Bioprocess. Proc. of the 3rd IEEE Conf. on Control Applications, 1-3, 1695-1700.
Schubert, J., R. Simutis, M. Dors, I. Havlik and A. Lübbert (1994). Bioprocess Optimisation and Control: Application of Hybrid Modelling. Journal of Biotechnology, 35, 51-68.
Su, H-T., N. Bhat, P.A. Minderman and T.J. McAvoy (1992). Integrating Neural Networks with First Principles Models for Dynamic Modelling. IFAC Symp. on Dynamics and Control of Chemical Reactors, Distillation Columns and Batch Processes (DYCORD+'92), 77-82.
Thompson, M.L. and M.A. Kramer (1994). Modelling Chemical Processes Using Prior Knowledge and Neural Networks. AIChE Journal, 40 (8), 1328-1340.