European Symposium on Computer-Aided Process Engineering - 14
A. Barbosa-Póvoa and H. Matos (Editors)
© 2004 Elsevier B.V. All rights reserved.
Hybrid Modelling of a PHA Production Process Using Modular Neural Networks
J. Peres 1*, R. Oliveira 2, L. S. Serafim 2, P. Lemos 2, M. A. Reis 2, S. Feyo de Azevedo 1
1 - Department of Chemical Engineering - Institute for Systems and Robotics, Faculty of Engineering, University of Porto, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal
2 - REQUIMTE/CQFB - Department of Chemistry - Centre for Fine Chemistry and Biotechnology, Faculty of Sciences and Technology, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal
Abstract
A novel method for bioreactor hybrid modelling is presented that combines first-principles models with modular artificial neural networks trained with the Expectation Maximization (EM) algorithm. The use of modular networks is motivated by the nature of the 'cell system', which may be viewed as a highly complex network of metabolic reactions organised in modular pathways. The proposed hybrid modelling technique is validated experimentally with a laboratory-scale polyhydroxyalkanoates (PHAs) production process. The main results show that the embedded modular network, if trained with the EM algorithm, is able to organise itself into modules that correspond to the underlying biological pathways. In the particular case of the PHA process discussed here, the network learned to discriminate between acetate respiration and internal reserves respiration, with the smaller network modules developing expertise in describing the reaction kinetics of one or the other metabolic state.
Keywords: hybrid modelling, modular neural networks, fermentation processes
1. Introduction
Artificial Neural Networks (ANNs) such as the conventional Multi Layer Perceptron (MLP) and Radial Basis Function (RBF) networks have found wide application in bioprocess modelling involving biocatalysis with cellular systems (Schubert et al., 1994; Montague and Morris, 1994; Feyo de Azevedo et al., 1997). One important feature of cells is that they may process different substrates through different metabolic pathways. For example, the Activated Sludge process involves a consortium of (mainly) 3 populations of bacteria, each of them able to switch between metabolic mechanisms (Henze et al., 1999), which yields complex kinetic behaviour: nitrification/denitrification, aerobic/anaerobic and phosphorus accumulation/secretion states. Biological systems of this type have inherently non-linear, discontinuous growth kinetics because of this ability to switch between metabolic mechanisms. This raises several important issues concerning ANN kinetic modelling, namely that the popular MLP and RBF networks have some limitations in approximating discontinuous input-output systems: MLPs tend to exhibit erratic behaviour around discontinuities (Haykin, 1994), while RBFs are devoted to local mappings and are not well suited to the resolution of fine details.
Modular network architectures could, however, overcome such problems and may have some potential for modelling biokinetics. A modular network architecture consists of two or more (small) network modules mediated by an integration unit. This type of architecture performs task decomposition in the sense that it learns to partition a task into two or more functionally independent tasks and allocates distinct networks to learn each task (Jacobs et al., 1991). Hence a modular network structure is hypothetically highly compatible with the internal structure of the system 'cell reaction kinetics'. In this paper a hybrid model that combines material balance equations with a particular type of modular network, called Mixture of Experts (ME), is applied to a laboratory-scale polyhydroxyalkanoates (PHAs) production process. It was concluded that the ME network is very versatile in discriminating between different metabolic states, with the smaller network modules developing expertise in describing the reaction kinetics of the individual metabolic states.
* To whom correspondence should be addressed: [email protected]
2. Hybrid Modelling with ME Networks
In this paper the classical serial hybrid model approach, which combines material balance equations with artificial neural networks, is followed. The dynamics of a stirred tank bioreactor may be described by a set of mass balance equations that take the following general form, assuming perfect mixing:
$$\frac{dc}{dt} = r(c) - Dc + u \qquad (1)$$
where $c$ is a vector of $n$ concentrations, $r(c)$ a vector of $n$ volumetric reaction rates, $D$ the dilution rate, and $u$ a vector of volumetric input rates. In the classical serial approach the reaction rates are modelled with ANNs. In this work the reaction rates are modelled with the ME network (Jacobs and Jordan, 1991; Haykin, 1994), resulting in a hybrid structure that is simultaneously serial and parallel in the description of the reaction rates (Fig. 1). The ME architecture consists of a set of $K$ expert networks and an integration unit, also called the gating network (Fig. 1). Basically, the task of each expert $j$ is to approximate a function $f_j : c \to r_j$ over a region of the input space. The task of the integration unit is to assign an expert network to each input vector $c$. The final system output $r$ is a linear combination of the expert network outputs:
$$r = \sum_{j=1}^{K} g_j(c)\, r_j(c) \qquad (2)$$
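As an illustration of how Eqs. (1) and (2) fit together, the following minimal Python sketch combines the outputs of K expert rate functions with a gate and integrates the resulting material balance with an explicit Euler scheme. The two toy experts, the sigmoidal gate, the switching level `s_switch` and the step size are all hypothetical choices made only for this example; they are not the networks or parameters used in this work.

```python
import numpy as np

def me_rate(c, experts, gate):
    """Eq. (2): r(c) = sum_j g_j(c) * r_j(c)."""
    g = gate(c)                                     # gating weights, shape (K,), sum to 1
    r_experts = np.array([f(c) for f in experts])   # expert outputs, shape (K, n)
    return g @ r_experts                            # combined rate vector, shape (n,)

def simulate(c0, experts, gate, D=0.0, u=None, dt=0.01, steps=500):
    """Explicit Euler integration of Eq. (1): dc/dt = r(c) - D*c + u."""
    c = np.asarray(c0, dtype=float)
    u = np.zeros_like(c) if u is None else np.asarray(u, dtype=float)
    path = [c.copy()]
    for _ in range(steps):
        c = c + dt * (me_rate(c, experts, gate) - D * c + u)
        path.append(c.copy())
    return np.array(path)

# Hypothetical toy system: n = 2 states (substrate S, product P), K = 2 experts.
def expert_uptake(c):
    """Substrate is consumed and product is formed (illustrative kinetics)."""
    return np.array([-1.0 * c[0], 0.5 * c[0]])

def expert_reserve(c):
    """No substrate uptake; product is consumed (illustrative kinetics)."""
    return np.array([0.0, -0.2 * c[1]])

def toy_gate(c, s_switch=0.05, sharpness=200.0):
    """Hypothetical smooth two-way gate keyed on the substrate level."""
    g1 = 1.0 / (1.0 + np.exp(-sharpness * (c[0] - s_switch)))
    return np.array([g1, 1.0 - g1])

experts = [expert_uptake, expert_reserve]
path = simulate([1.0, 0.0], experts, toy_gate)      # batch case: D = 0, u = 0
```

The gate here is deliberately simple; in the ME network the weights g_j(c) come from the gating system, and the experts are trained networks rather than fixed rate laws.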
This architecture has strong statistical foundations. For non-linear regression problems, multilayer perceptron networks may be employed, with the hyperbolic tangent function in the hidden layers and linear functions in the output layer (Bishop, 1995):
$$r_j = w_{2,j} \tanh(w_{1,j}\, c + b_{1,j}) + b_{2,j} \qquad (3)$$
[Figure 1: block diagram. The concentration vector c is fed to the expert networks (Expert 1, ..., Expert K, with outputs r_1, ..., r_K) and to the Gaussian gating system (outputs g_1, ..., g_K); the resulting rate r enters the material balance dc/dt = r(c) - Dc + u.]
Fig. 1: General hybrid model combining Mixture of Experts networks and material balance equations.
In Eq. (3), $w_{1,j}$ and $w_{2,j}$ are the weight matrices of the connections between the nodes of layers 1 and 2, and of layers 2 and 3, respectively, whereas $b_{1,j}$ and $b_{2,j}$ are the associated bias vectors. Different forms of the integration unit have also been reported. The localised gating system (Xu et al., 1995; Ramamurti and Ghosh, 1999) provides flexible partitions of the input space and was adopted in this work. It can be formulated as follows:
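$$g_j(c, a) = \frac{\alpha_j\, P(c, m_j, \Sigma_j)}{\sum_{i=1}^{K} \alpha_i\, P(c, m_i, \Sigma_i)} \qquad (4a)$$
$$P(c, m_j, \Sigma_j) = (2\pi)^{-n/2}\, |\Sigma_j|^{-1/2} \exp\left(-\tfrac{1}{2}\,(c - m_j)^T\, \Sigma_j^{-1}\, (c - m_j)\right) \qquad (4b)$$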
Equation (4b) is a Gaussian distribution with centre $m_j$ and covariance matrix $\Sigma_j$ (usually only the diagonal is considered). Eq. (4a) establishes a normalised gating output scaled by the scalar parameters $\alpha_j$. In Eq. (4a) the variable $a$ is a vector representation of all gating system parameters, $a = \{\alpha_j, m_j, \Sigma_j\}$.
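The forward pass defined by the expert equation (3) and the gating equations (4a) and (4b) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the covariance is taken as diagonal, all numerical values (weights, centres, variances, input vector) are arbitrary placeholders, and the hidden layer size of 4 is an assumption, chosen because it gives 36 parameters per expert for 3 inputs and 4 outputs, which is consistent with the 72 expert parameters reported in Section 3 but is not stated in the text.

```python
import numpy as np

def gaussian_density(c, m, var):
    """Eq. (4b) with a diagonal covariance; `var` holds the diagonal of Sigma_j."""
    n = c.size
    norm = (2.0 * np.pi) ** (-n / 2.0) * np.prod(var) ** -0.5
    return norm * np.exp(-0.5 * np.sum((c - m) ** 2 / var))

def gating(c, alpha, centres, variances):
    """Eq. (4a): alpha-weighted Gaussian activations, normalised to sum to 1."""
    p = np.array([gaussian_density(c, m, v) for m, v in zip(centres, variances)])
    weighted = alpha * p
    return weighted / weighted.sum()

def expert(c, w1, b1, w2, b2):
    """Eq. (3): MLP with one tanh hidden layer and a linear output layer."""
    return w2 @ np.tanh(w1 @ c + b1) + b2

rng = np.random.default_rng(0)
n_in, n_hidden, n_out, K = 3, 4, 4, 2        # n_hidden = 4 is an assumption (see lead-in)

# Random placeholder weights: (w1, b1, w2, b2) for each of the K experts.
params = [(rng.normal(size=(n_hidden, n_in)), rng.normal(size=n_hidden),
           rng.normal(size=(n_out, n_hidden)), rng.normal(size=n_out))
          for _ in range(K)]

alpha = np.array([0.5, 0.5])                               # hypothetical gating scales
centres = np.array([[0.8, 0.5, 0.2], [0.0, 0.4, 0.6]])     # hypothetical centres m_j
variances = np.full((K, n_in), 0.05)                       # hypothetical diagonal of Sigma_j

c = np.array([0.7, 0.5, 0.3])                # hypothetical scaled input (HAc, NH4, PHB)
g = gating(c, alpha, centres, variances)     # gating weights g_j(c)
r = sum(g_j * expert(c, *p) for g_j, p in zip(g, params))  # Eq. (2) combination
```

Because the input lies much closer to the first centre than to the second, the first gating weight dominates, which is the mechanism by which a localised gate hands a region of the input space to one expert.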
2.1 The training algorithm
The nature of the network structure is tightly connected with the training algorithm. For the ME network several algorithms have been described. The simplest is the Gradient Descent (GD) algorithm, which minimises the mean square error and employs error backpropagation for the calculation of gradients (Haykin, 1994; Jacobs et al., 1991). Training has also been treated as a maximum likelihood parameter estimation problem by Jacobs et al. (1991). The Expectation Maximization (EM) algorithm (Dempster et al., 1977) was derived and applied to this structure by Jordan and Jacobs (1994). Rao et al. (1997) showed that when the objective function is defined as a mean square error the solutions tend to be more "cooperative", whereas the maximum likelihood formulation with the EM algorithm tends to produce more "competitive" solutions. Since we are particularly interested in competitive learning, the EM algorithm was adopted in this work. The reader is referred to the works of Jordan and Jacobs (1994) and Xu et al. (1995) for details of this technique.
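To make the structure of EM training for a localised-gating ME network more concrete, the following Python sketch applies it to a deliberately simplified case. It is not the implementation used in this work: the experts are linear functions of a single input rather than MLPs (so that the M-step reduces to weighted least squares), the gating is a one-dimensional Gaussian, and the data are synthetic, with a discontinuity at x = 0.5. The function and variable names are hypothetical.

```python
import numpy as np

def log_normal(z, mean, var):
    """Log of a univariate Gaussian density."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (z - mean) ** 2 / var)

def em_mixture_of_experts(x, y, K=2, iters=200):
    """EM for K linear experts y ~ w*x + b with Gaussian (localised) gating on x."""
    X = np.column_stack([x, np.ones_like(x)])       # design matrix [x, 1]
    alpha = np.full(K, 1.0 / K)                     # gating scale parameters
    m = np.quantile(x, (np.arange(K) + 0.5) / K)    # gating centres, deterministic start
    s2 = np.full(K, np.var(x))                      # gating variances
    W = np.tile([0.0, np.mean(y)], (K, 1))          # expert [slope, intercept]
    v = np.full(K, np.var(y))                       # expert noise variances
    loglik = []
    for _ in range(iters):
        # E-step: log of alpha_j * P(x | gate j) * P(y | x, expert j), per sample.
        log_joint = np.stack(
            [np.log(alpha[j]) + log_normal(x, m[j], s2[j])
             + log_normal(y, X @ W[j], v[j]) for j in range(K)], axis=1)
        log_total = np.logaddexp.reduce(log_joint, axis=1)
        loglik.append(log_total.sum())
        h = np.exp(log_joint - log_total[:, None])  # posterior responsibilities
        # M-step: closed-form updates for the gate, weighted least squares for experts.
        Nj = h.sum(axis=0)
        alpha = Nj / len(x)
        m = (h * x[:, None]).sum(axis=0) / Nj
        s2 = (h * (x[:, None] - m) ** 2).sum(axis=0) / Nj
        for j in range(K):
            sw = np.sqrt(h[:, j])
            W[j] = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
            v[j] = (h[:, j] * (y - X @ W[j]) ** 2).sum() / Nj[j]
    return alpha, m, s2, W, v, h, np.array(loglik)

# Synthetic data with a discontinuity at x = 0.5 (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = np.where(x < 0.5, 1.0 + 2.0 * x, -1.0 - x) + 0.02 * rng.normal(size=x.size)

alpha, m, s2, W, v, h, loglik = em_mixture_of_experts(x, y)
```

The responsibilities h play the "competitive" role mentioned above: each sample is softly assigned to the expert that explains it best, and each expert is then refitted mainly on the samples it is responsible for. With MLP experts the weighted least squares step would have to be replaced by an iterative weighted fit.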
3. Results and Discussion
Polyhydroxyalkanoates (PHAs) are biodegradable plastics synthesised by bacteria, with properties similar to those of polypropylene. The viability of employing activated sludge instead of pure cultures is currently being investigated, and it has so far been concluded that activated sludge submitted to transient conditions in the feeding of the carbon source is able to store large amounts of PHA as energy and carbon storage material. These transient conditions, however, need to be optimised. The main goal of this work is to develop an accurate dynamical model with predictive capacity, with subsequent optimisation studies in mind. In this respect the modelling of the kinetics of growth and PHA production is more problematic, since a consortium of 3 bacteria is involved in PHA production. Hence the kinetics of product formation are further complicated by the inherent interaction between cultures.
Experiments were performed in a Sequencing Batch Reactor (SBR) system with mixed cultures, and measurements were made of the most important quantities that define the state of the process, namely the concentrations of active biomass X, acetate HAc, ammonia NH4 and PHB (the experimental data concern one of the most common types of PHA, namely poly-beta-hydroxybutyrate (PHB)). The SBR is operated in cycles: at the beginning of each cycle pulses of fresh materials are added to the process, and the rest of the cycle is operated essentially as a batch. The dynamics of the batch phase of a cycle are defined by the material balance equations (5a)-(5d).
[Figure 2: four panels, each plotted against the number of points.]
Fig. 2: Results for four batches with real experimental data: measured values (o, dots) and hybrid mass balance - ME model values (-, solid line).
For the batch phase of a cycle the material balances read:
$$\frac{dX}{dt} = \mu X \qquad (5a)$$
$$\frac{dHAc}{dt} = -q_{HAc}\, X \qquad (5b)$$
$$\frac{dNH_4}{dt} = -q_{NH4}\, X \qquad (5c)$$
$$\frac{dPHB}{dt} = q_{PHB}\, X \qquad (5d)$$
where $\mu$, $q_{HAc}$, $q_{NH4}$ and $q_{PHB}$ are the specific kinetic rates of biomass, acetate, ammonia and PHB, respectively. These kinetic rates were modelled with a ME network with 3 inputs and 4 outputs. The inputs were HAc, NH4 and PHB, and the outputs were the specific uptake/production rates $q_{HAc}$, $q_{NH4}$, $q_{PHB}$ and $\mu$. The ME network was configured with two experts because it was known a priori that the process has two metabolic phases. Given the complexity and non-linearity of the kinetics, the expert networks were MLPs with 3 inputs and 4 outputs. The total number of parameters was 72 for the two experts and 10 for the gating system. Four batches of experimental data were used for training. The 'experimental' kinetic rates were first calculated from measurements of concentrations in time using the material balance equations (5a-d). The time derivatives of the concentrations were calculated by fitting cubic splines to the experimental data, followed by analytical differentiation of the spline functions. The ME network was trained on the training patterns so obtained using the EM algorithm described in Section 2.1. The results obtained with the hybrid mass balance - ME model are shown in Fig. 2. The ME network was able to model the kinetics of this process accurately. The mean square error (MSE) obtained was MSE = $3.29 \times 10^{-2}$, with concentrations scaled by their maximum values.
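The rate-estimation step described above (cubic splines fitted to the concentration data, analytical differentiation, then substitution into Eqs. 5a-d) could be sketched as follows. This is a minimal example under stated assumptions: it uses an interpolating spline from SciPy (`scipy.interpolate.CubicSpline`), whereas noisy measurements may call for a smoothing spline, and the synthetic time series and rate values are invented solely to check the calculation.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def specific_rates(t, X, HAc, NH4, PHB):
    """Estimate mu, q_HAc, q_NH4, q_PHB from Eqs. (5a-d) by spline differentiation."""
    series = {"X": X, "HAc": HAc, "NH4": NH4, "PHB": PHB}
    # Fit one cubic spline per state and evaluate its analytical derivative at t.
    d = {name: CubicSpline(t, values).derivative()(t) for name, values in series.items()}
    mu = d["X"] / X              # Eq. (5a): dX/dt   =  mu    * X
    q_hac = -d["HAc"] / X        # Eq. (5b): dHAc/dt = -q_HAc * X
    q_nh4 = -d["NH4"] / X        # Eq. (5c): dNH4/dt = -q_NH4 * X
    q_phb = d["PHB"] / X         # Eq. (5d): dPHB/dt =  q_PHB * X
    return mu, q_hac, q_nh4, q_phb

# Synthetic batch with constant specific rates (hypothetical values, for checking only).
t = np.linspace(0.0, 5.0, 26)
X = 2.0 * np.exp(0.1 * t)              # mu = 0.1
HAc = 5.0 - 3.0 * (X - 2.0)            # q_HAc = 0.3
NH4 = 1.0 - 0.2 * (X - 2.0)            # q_NH4 = 0.02
PHB = 0.5 + 1.5 * (X - 2.0)            # q_PHB = 0.15

mu, q_hac, q_nh4, q_phb = specific_rates(t, X, HAc, NH4, PHB)
```

The resulting rate vectors, paired with the corresponding concentrations, would form the input-output training patterns for the ME network.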
[Figure 3: gating outputs and concentrations plotted against the number of points.]
Fig. 3: Results for four batches with real experimental data: Gaussian gating network outputs g1 (...., dotted line) and g2 (-, solid line) versus concentrations of HAc (o, white dots) and PHB (., black dots).
More importantly, the trends observed in the experiments were fully captured by the ME model. An important and most interesting feature of the ME network was observed in this experiment and in many other simulation studies made so far: the ME network was able to detect the switch between the presence and absence of acetate, as illustrated in Fig. 3, and the two experts developed expertise in modelling the kinetics of one or the other metabolic state. This switch corresponds to major changes in the metabolism of the cells, namely the transition between normal acetate respiration (accompanied by PHB storage) and internal storage consumption, i.e. PHB consumption. This switch was incorporated in the ME network, and the experts developed expertise in describing one or the other state.
4. Conclusions
A novel method for bioreactor hybrid modelling was presented that combines first-principles models with Mixture of Experts networks trained with the Expectation Maximization method. The method was applied to a lab-scale polyhydroxyalkanoates production process. The model could accurately describe the dynamical behaviour of the process but, more interestingly, it was concluded that the ME network is capable of organising itself into modules that correspond to the underlying biological pathways. In vivo NMR data providing information on intracellular concentrations are currently being used to validate this technique further.
References
Bishop, C.M., 1995, Neural Networks for Pattern Recognition. Oxford University Press.
Dempster, A.P., N.M. Laird and D.B. Rubin, 1977, J. Roy. Stat. Soc. B, 39, 1-38.
Feyo de Azevedo, S., B. Dahm and F.R. Oliveira, 1997, Computers chem. Engng, 21, Suppl., 751-756.
Haykin, S., 1994, Neural Networks: A Comprehensive Foundation. Prentice Hall.
Henze, M., W. Gujer, T. Mino, T. Matsuo, M.C. Wentzel, G.V.R. Marais and M.C.M. Van Loosdrecht, 1999, Water Science and Technology, 39 (1), 165-182.
Jacobs, R.A. and M.I. Jordan, 1991, In: Advances in Neural Information Processing Systems 3, 767-773. Morgan Kaufmann, San Mateo, CA.
Jacobs, R.A., M.I. Jordan, S.J. Nowlan and G.E. Hinton, 1991, Neural Computation, 3, 79-87.
Jordan, M.I. and R.A. Jacobs, 1994, Neural Computation, 6, 181-214.
Montague, G. and J. Morris, 1994, Trends Biotechnol., 12, 312-324.
Ramamurti, V. and J. Ghosh, 1999, IEEE Transactions on Neural Networks, 10(1), 152-161.
Rao, A.V., D. Miller, K. Rose and A. Gersho, 1997, IEEE Trans. on Signal Processing, 45(11), 2811-2820.
Schubert, J., R. Simutis, M. Doors, I. Havlik and A. Lübbert, 1994, Chem. Eng. Technol., 17, 10-20.
Xu, L., M.I. Jordan and G.E. Hinton, 1995, In: Advances in Neural Information Processing Systems 7, 633-640. MIT Press.