Copyright © IFAC Computer Applications in Biotechnology, Garmisch-Partenkirchen, Germany, 1995
BIOREACTOR MODELING AND CONTROL BY PRINCIPAL COMPONENT BASED NEURAL NETWORKS

Ž. Kurtanjek

Faculty of Food Technology and Biotechnology, University of Zagreb, Pierottijeva 6, Zagreb, CROATIA
In this work, structures for training and prediction by neural networks based on principal component analysis of input/output patterns are developed. The structures are composed of functional modules dedicated to specific tasks in view of process engineering applications. The structures are serial connections of the following modules: ARMA, for approximation of process dynamics and process or transport delays by autoregressive moving averages; SP, for statistical preprocessing, detection of gross errors, and rejection of redundant patterns; PCA, for decomposition of patterns into principal components aimed at data compression and random noise elimination; and a SISO-NN subsystem with a parallel connection of separate single-input single-output neural networks. Each NN submodule is structured with a single hidden layer with static neurons and feedforward progression. The parameters of the NN modules are determined by Polak-Ribière-Powell conjugate gradient optimization. The number of principal components and the number of elements in a hidden layer are optimized for a minimal predicted sum of squares (PRESS) on an untrained set of patterns. The models are applied to data from industrial production of baker's yeast in a deep jet bioreactor.

Keywords: neural network models, decomposition, statistical inference, process control, biotechnology
1. INTRODUCTION

Artificial neural networks can be viewed as general models for mapping between high-dimension spaces of input and output patterns (Bhat and McAvoy, 1990; Ydstie, 1990). They have a very high potential for use in control engineering problems, especially in biotechnology (Cooney et al., 1991), due to the multitude of variables and the complexity of interaction of macroscopic variables at the reactor level with intracellular reaction mechanisms, which is difficult to model by an analytical approach. Results with computer models (Thibault et al., 1990) and industrial data (Massimo et al., 1992; Zhang et al., 1994) confirm important advantages of NN models over extended Kalman filters for on-line estimation and long-horizon prediction of unmeasured biological variables. Besides having better predictivity, they are inherently stable and, once trained, require very little CPU time. NN models dedicated to control of bioreactors need to meet several specific requirements: 1) tested extrapolation power, 2) a simple (modular) and transparent structure, 3) compensation for process dynamics, and 4) rejection of process and measurement noise. Most of these requirements need to be assured at the design stage of the NN, i.e. by appropriate definition of input and output sets, data preprocessing, selection of training and testing patterns, choice of structure, and use of a model verification algorithm.
1.1 NN structure and algorithms
Fig. 1 presents the modular structures for training and prediction of the NN systems. The systems are composed of several subsystems dedicated to specific tasks. The first subsystem, ARMA, provides approximation of derivatives of state variables by finite differences and accounts for the process time delay T. The sets of input and output patterns are:

X = {y(k), y(k−δ), ..., u(k−T), u(k−δ−T), ...},   Y = {y(k+δ)}     (1)

where δ = a·Δt is a multiple of the sampling interval Δt.
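As a minimal sketch (not the paper's code), patterns of the form (1) can be assembled from uniformly sampled records of the output y and a delayed process input u; the function name and keyword arguments are illustrative.

```python
import numpy as np

def arma_patterns(y, u, delay, lags=2, step=1):
    """Build ARMA-style input/output patterns as in Eq. (1):
    each input row holds past outputs y(k), y(k-step), ... and
    delayed inputs u(k-delay), u(k-delay-step), ...;
    the target is the future output y(k+step)."""
    X, Y = [], []
    start = delay + (lags - 1) * step          # first k with a full history
    for k in range(start, len(y) - step):
        past_y = [y[k - j * step] for j in range(lags)]
        past_u = [u[k - delay - j * step] for j in range(lags)]
        X.append(past_y + past_u)
        Y.append(y[k + step])
    return np.array(X), np.array(Y)
```

Each row of X then passes through the statistical preprocessing and projection steps described below before reaching the networks.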
Fig. 1. Schematic diagrams of the structures: ARMA - autoregressive moving averages, SP - statistical preprocessing, PLS - partial least squares, NN SISO - neural network, PCA - principal component analysis; t and u are principal components, and β are parameters.

In the training procedure, the outputs X, Y from ARMA are inputs to SP for first-step statistical preprocessing. It has to eliminate 1) contradictory input-output associations and 2) repetitive patterns in the training set leading to bias. For the spaces of input and output patterns the confidence intervals are defined:

ε_X = ||X − X̄||,   ε_Y = ||Y − Ȳ||     (2)

Two sets of patterns are considered contradictory when the inputs X₁ and X₂ belong to the same interval ε_X while the corresponding outputs Y₁ and Y₂ are not inside the same ε_Y. Contradictions may result from gross measurement errors, or could be due to inappropriate definitions of input and output patterns under specific dynamic conditions. They cannot be resolved by the NN algorithm and therefore need to be eliminated from the training set, and their causes need to be analyzed separately. At this point redundant patterns belonging to the same input and output intervals are also eliminated. Such patterns are usually the result of almost steady-state situations, and their presence in the training set would bias the adaptation of the NN at the expense of dynamic patterns.

PLS is the second statistical preprocessing step, by which collinearity between patterns and random errors are eliminated and the patterns are compressed to a low-dimension space of principal components. In the predictive structure, the SP and PLS subsystems are replaced by PCA, where the principal directions determined on the training set by PLS are used for projections of untrained input patterns X. A model with r principal components of the input set of patterns is obtained from the reduced number of directions which "explain" the covariance between the output Y and the input X:

Y^T X = Σ_{i=1}^{r} (Y^T X p_i) · p_i^T + E_r     (3)

where E_r is the r-th residual covariance matrix. The principal directions (vectors p_r) are determined by minimization of the norm of the residual matrices:

p_r = arg min_p ||E_r||²,   ||E_r|| = ||Y^T X − (Y^T X p) · p^T||     (4)

The input and output patterns are scaled so that E(x) = 0 and σ²(x) = 1. The residual in expression (4) is:

e = (Y^T X) − (Y^T X p) · p^T     (5)

The necessary condition for a minimum is:

∂(Σ_i e_i²)/∂p = 0     (6)

After differentiation, one obtains:

Σ_{i=1}^{n} [ (Y^T X)_i · p_i + (Y^T X p) · 1_i ] = 0     (7)

where the 1_i are the basis vectors of the input space X. Equations (7) yield the property of the principal vectors:

(Y^T X) · p = λ · p     (8)

i.e., they are eigenvectors of the covariance of the input onto the output patterns, Y^T X. The QR algorithm is applied, which in the first deflation step produces the principal component with maximum covariance. The output patterns are also decomposed into corresponding principal vectors q:

Y = Σ_{i=1}^{r} (Y q_i) · q_i^T + E_r^Y     (9)

by the minimization:

q = arg min ||Y − (Y q) · q^T||     (10)

During the training procedure the pairs of principal directions (p_i, q_i), i = 1, 2, ..., r << n, are determined; they form the basis of the reduced input and output spaces. The projections t_i of the input and u_i of the output patterns are calculated by the products X p_i and Y q_i.

The NN are structured with a single hidden layer but with a variable number of neurons. Each neuron is a static nonlinear activator (Rumelhart, 1989). Minimization of the batch criterion (11) is performed with application of the very efficient Polak-Ribière conjugate gradient algorithm (Storey, 1990; Powell, 1988):

ε_r(β) = (1/M) · Σ_{k=1}^{M} (y_k − ŷ_k(β))²     (11)

The critical step is the one-dimensional minimization along the conjugate direction d_i:

ε_r(α) = ε_r(β_i + α · d_i)     (12)

The conjugate directions are updated from the gradients g_i:

d_{i+1} = −g_{i+1} + γ_i · d_i     (13)

γ_i = g_{i+1}^T (g_{i+1} − g_i) / (g_i^T g_i)     (14)

A second-order approximation is applied: at each iteration step, consecutive gains α₁ < α₂ < α₃ are found until ε_r(α₂) < ε_r(α₁) and ε_r(α₂) < ε_r(α₃) is satisfied. The optimal value of the gain factor α is:

α* = (1/2) · [ (α₁² − α₃²)(ε₁ − ε₂) − (α₁² − α₂²)(ε₁ − ε₃) ] / [ (α₁ − α₃)(ε₁ − ε₂) − (α₁ − α₂)(ε₁ − ε₃) ]     (15)

where ε_j = ε_r(α_j). Due to the orthogonality between principal components, the MIMO-NN is decomposed into a series of SISO-NN modules, i.e.:

ŷ = NN(t₁, t₂, ..., t_r) = Σ_{i=1}^{r} NN_i(t_i)     (16)

2. MODEL OF BAKER'S YEAST PRODUCTION

The principles have been applied for modeling of industrial production of baker's yeast in a 40 m³ deep jet bioreactor (Kurtanjek, 1992, 1994). All feed rates, the partial pressure of EtOH, DO, pH, level, and the rpm of the pumps are measured on-line, and the biomass is determined off-line. The on-line variables are stored with a frequency of two readings per minute during the 15 h of a batch. The key process control variable is p_EtOH, measured in the exhaust gas, and classical PID loops are used. In order to provide training patterns with "rich" dynamics, an oscillatory case resulting from unmatched PID parameters in the loop for molasses feeding was selected (Fig. 2A). Sets of test patterns were chosen among a number of near-steady fermentation profiles (Fig. 2B) and also profiles with unpredicted variations. Two MISO-NN modules aimed at adaptive regulation of p_EtOH and pH are modeled. The modules are trained for prediction of the output variables (EtOH and pH), and inverse models for prediction of the corresponding manipulative variables (feed of molasses and NH₃), shown in Fig. 3. The process time delay T is determined from a step response and is assumed to be constant during the course of fed-batch operation. For the input patterns, the directions of the principal components are determined.

Fig. 2. Comparison of process P and NN model responses M of ethanol partial pressure. 2A is the result obtained with the training set of patterns, and 2B is the response with untrained patterns. Data are shifted by a constant to provide better distinction between process and model.
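The extraction of principal directions in Eqs. (3)-(8) can be sketched with power iteration and rank-one deflation on the cross-covariance matrix YᵀX; this is a minimal numpy illustration, not the paper's implementation (the paper's QR-based deflation is replaced here by explicit subtraction), and `pls_directions` is an illustrative name.

```python
import numpy as np

def pls_directions(X, Y, r):
    """Extract r principal directions p_i of the cross-covariance
    C = Y^T X: each p_i is the dominant right singular vector of the
    current residual (power iteration on C^T C, cf. Eq. (8)), and C
    is then deflated by its rank-one reconstruction, cf. Eq. (3)."""
    C = Y.T @ X
    rng = np.random.default_rng(0)
    P = []
    for _ in range(r):
        p = rng.standard_normal(C.shape[1])
        for _ in range(200):                 # power iteration on C^T C
            p = C.T @ (C @ p)
            p /= np.linalg.norm(p)
        P.append(p)
        C = C - np.outer(C @ p, p)           # rank-one deflation
    return np.column_stack(P)
```

The scores t_i = X p_i obtained from these directions are what feed the individual SISO networks of Eq. (16).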
A space of principal directions of dimension three was sufficient to account for 95% of the variance in the input data. A first-order ARMA approximation was found to be sufficient to account for the process dynamics. Prior to the learning procedure, all contradictory and redundant patterns were eliminated. NN parameters β were adapted to the training response until a minimum of the predicted sum of squares (PRESS) on untrained sets of patterns was obtained. It was found that for all hidden layers it was sufficient to use a number of neurons one higher than the number of nodes on the input layer. Due to significant data compression, noise elimination, and the modular structure, the training procedure was very efficient. After comparison with a number of untrained cases, it was found that the average relative errors between model predictions and measured data were in the range from 3-5%, which corresponds to the error levels of the applied instrumentation. The same conclusion is valid for prediction of the future course of fermentation up to a horizon of 30 minutes. In Fig. 4 the proposition of process control with inclusion of the NN inverse modules for regulation of p_EtOH and pH by the feeding rates of molasses and q_NH₃ is given. Other variables, such as temperature, are regulated with classical PID controllers without adaptation.
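The PRESS-based choice of the number of principal components described above can be sketched as a grid search over model sizes scored on a held-out (untrained) pattern set. In this sketch a linear least-squares readout on the PCA scores stands in for the SISO networks, inputs are assumed already standardized (E(x) = 0, σ²(x) = 1) as in the paper, and all names are illustrative.

```python
import numpy as np

def press(y_true, y_pred):
    """Predicted sum of squares on an untrained pattern set."""
    return float(np.sum((y_true - y_pred) ** 2))

def select_n_components(X_tr, y_tr, X_te, y_te, r_max):
    """Pick the number of principal components r minimizing PRESS:
    project onto the leading r principal directions of the training
    inputs and fit a least-squares readout on the scores t = X p."""
    _, _, Vt = np.linalg.svd(X_tr, full_matrices=False)
    best = None
    for r in range(1, r_max + 1):
        P = Vt[:r].T                       # leading principal directions
        T_tr, T_te = X_tr @ P, X_te @ P    # scores on train / held-out sets
        w, *_ = np.linalg.lstsq(T_tr, y_tr, rcond=None)
        score = press(y_te, T_te @ w)
        if best is None or score < best[1]:
            best = (r, score)
    return best                            # (r*, PRESS at r*)
```

The same loop extends naturally to a second knob, the number of hidden neurons, which the paper tunes by the same PRESS criterion.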
Fig. 3. Responses of the manipulative variables obtained by inverse trained neural networks: 3A is the profile of the molasses feed rate, and 3B is the feed rate of ammonia. The data are obtained with untrained patterns, and the curves are shifted by a constant to provide better distinction between process P and model M.

Fig. 4. Control structure with inverse neural network modules in a feedback loop for ethanol partial pressure (EtOH) and pH. Other variables are controlled with PID regulators.

Other input information relevant for NN performance is data about the "type" or "source" of molasses, data about other components present in the feed, and all other process data which are important for process control and were present in the training set.

3. CONCLUSIONS

A new efficient modular structure of NN networks is proposed and tested on data from industrial production. Procedures are applied for approximation of process dynamics by an ARMA model, statistical preprocessing for rejection of gross errors and elimination of redundancy, and data compression and random noise elimination by decomposition into principal components. Due to the orthogonality of principal components, the NN have a simple MISO structure giving simple and reliable training. Results are tested with numerous untrained data, and the average relative error of prediction is in the range from 3-5%.

Acknowledgment: This work was financially supported by the Ministry of Science and Technology of the Republic of Croatia, project 04-07-017.

REFERENCES
Bhat, N., McAvoy, T.J. (1990). "Use of neural nets for dynamic modeling and control of chemical process systems", Computers chem. Engng, 14, 573.
Cooney, C.L., Raju, G., O'Connor, G. (1991). "Expert systems and neural nets for bioprocess operation", EFB WP on bioprocess modeling and control, U. of Newcastle, 21 Jan., Newcastle, UK.
Kurtanjek, Ž. (1992). "Modelling and control by artificial N.N.", Automatika, 33 (3), 147.
Kurtanjek, Ž. (1994). "Modelling and control by artificial neural networks in biotechnology", Computers chem. Engng, 18, S627-S631.
Kurtanjek, Ž. (1994). "Structure of principal component based neural network models of dynamical systems", Proceedings of 16th ITI, p. 277, Pula, Croatia.
Massimo, C., Montague, G.A., Willis, M.J., Tham, M.T., Morris, A.J. (1992). "Towards improved penicillin fermentation by artificial neural networks", Computers chem. Engng, 16 (4), 283-291.
Powell, M.J.D. (1988). "Convergence properties of algorithms for nonlinear optimization", SIAM Review, 28, 487-500.
Rumelhart, D.E., McClelland, J.L. (1989). Parallel Distributed Processing, The MIT Press, Cambridge, MA, pp. 318-363.
Storey, C. (1990). "A new look at conjugate direction methods of optimization", 2nd North American/German Workshop on Chemical Engineering, Mathematics and Computation, 15-20 July, Göttingen, Germany.
Thibault, J., Van Breusegem, V., Chéruy, A. (1990). "On-line prediction of fermentation variables using neural networks", Biotechnol. Bioeng., 36, 1041-1048.
Zhang, Q., Reid, J.F., Litchfield, J.B., Ren, J., Chang, S.W. (1994). "A prototype neural network supervised control system for Bacillus thuringiensis fermentations", Biotechnol. Bioeng., 43, 483.
Ydstie, B.E. (1990). "Forecasting and control using adaptive connectionist networks", Computers chem. Engng, 14, 583-599.