Computers and Chemical Engineering 29 (2005) 1047–1057
On-line adaptation of neural networks for bioprocess control

Kapil G. Gadkar (a), Sarika Mehra (b), James Gomes (c,*)

(a) Department of Chemical Engineering, University of California at Santa Barbara, USA
(b) Department of Chemical Engineering & Material Science, University of Minnesota, USA
(c) Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology, Delhi, Hauz Khas, New Delhi 110016, India

* Corresponding author. Tel.: +91 11 2659 1013; fax: +91 11 2658 2282. E-mail address: [email protected] (J. Gomes).

Received 7 August 2003; received in revised form 14 October 2004; accepted 30 November 2004. Available online 12 January 2005.
Abstract

A recurrent neural network with intra-connections within the output layer is developed to track the dynamics of fed-batch yeast fermentation. The neural network is adapted on-line, using only the dissolved oxygen measurement, to account for varying operating conditions. The other states of the system, namely the substrate, ethanol and biomass concentrations, are not measured but are predicted by the adapted network. A neural network having a 10-8-4 architecture, with output layer feedback and intra-connections between the nodes of the output layer, has been studied in detail. A comparative study of its performance with and without on-line adaptation of weights is presented. Predictions based on on-line adaptation of weights were found to be superior to those obtained without adaptation. The network was implemented as an on-line state estimator facilitating the control of a yeast fermentation process. The results demonstrate that, with on-line adaptation of weights, it is possible to implement neural networks to control processes in a wide region outside their training domain.
© 2004 Elsevier Ltd. All rights reserved.

PACS: 84.35

Keywords: Recurrent network; On-line adaptation; Ethanol fermentation; Feed forward control
1. Introduction

The successful operation and control of bioprocesses require a method by which the process objectives can be achieved reliably and quickly. This in turn means that the process variables constituting the control law must be measured on-line. Direct on-line measurement of the primary process variables, such as the biomass, substrate and product concentrations, is usually not possible due to the dearth of suitable measuring devices or probes. The dynamic nature of bioprocesses results in varying growth, oxygen uptake and product formation rates under different operating conditions. This, coupled with the inherent non-linearity of bioprocesses, makes system identification difficult. Consequently, the trend in bioprocess control has been to develop identification strategies based on non-linear systems theory
(Holmberg & Ranta, 1982; Gomes & Menawat, 1992, 1998, 2000; Gomes, Roychoudhury, & Menawat, 1997; Kravaris & Soroush, 1990; Proll & Karim, 1994), fuzzy control (Shi & Shimizu, 1992; Inamdar & Chui, 1997), extended Kalman filtering (Stephanopoulos & San, 1984) and neural networks (Simutis & Lübbert, 1997; Ge & Lee, 1997; Szepesvári, Szabolcs, & Lorincz, 1997; Linko, Rajalathi, & Zhu, 1995; Montague & Morris, 1994). Ideally, methods that can provide adequate precision in estimating variables from incomplete information are desirable for the control of bioprocesses. Among the various techniques that have been applied successfully, neural networks have drawn much attention because they do not require any prior knowledge of the relationships that exist between the states of the system. These relationships are learned by the neural network during the training phase and stored as the weights of its inter-connections. There is now extensive literature showing how neural networks can be used for classification, estimation and prediction of bioprocesses (Narendra &
Parthasarathy, 1990; Chtourou, Najim, Roux, & Dahhou, 1993; Donat, Bhat, & McAvoy, 1991; Lee & Park, 1999). Neural network applications in bioprocesses are many (Breusegem, Thibault, & Cheruy, 1991; Su & McAvoy, 1989; Baughman & Liu, 1995; You & Nikolaou, 1993; Lennox, Rutherford, Montague, & Haughin, 1998; Ignova, Glassey, Montague, Paul, Ward, Thomas, & Karim, 1996; Potocnik & Grabec, 1999), but on-line process control implementations are few (Tian, Zhang, & Morris, 2002; Teissier, Perret, Latrille, Barillere, & Corrieu, 1997; Schubert, Simutis, Dors, Havlik, & Lubbert, 1994). This is because there are several difficulties with on-line implementation. To begin with, neural networks must be trained off-line because of the long computation time needed for training. Often, trained networks do not predict the process satisfactorily over a sufficiently broad region of the state space to make them suitable for control applications. The requirement from a process control perspective is that the neural network, after training, should be capable of predicting all possible process conditions. To achieve this goal, it would be necessary to update the weights of the neural network continuously so that it learns about new process events in real time. It would then be possible for the neural network to predict state variables over a wider range of conditions, including those it was not trained for, making it effective for on-line process control.

We have developed a neural network based method to address this problem. The network architecture contains output layer feedback and intra-connections among the nodes of the output layer. The weights of the trained network are adapted continuously on-line by an error correction scheme that uses dissolved oxygen concentration values, which are easily available from on-line measurements. The cultivation of Saccharomyces cerevisiae, commonly used for analysing process identification and control implementation problems, was selected as the system of study. In this paper, we present the theoretical development of the network architecture and the algorithm for updating the weights. The architecture is capable of providing on-line state estimates within a reasonable domain outside its training space. The neural network based estimator is further used in a control framework for tracking the biomass concentration along a desired profile. Details of the control experiments carried out in a 15 l reactor are also presented.

2. Model for simulation of fed-batch yeast cultivation

Yeast fermentation is one of the most widely studied systems for bioprocess control applications. The first quantitative study of aerobic yeast growth, by Lemoigne and co-workers (Lemoigne, Aubert, & Millet, 1954), showed diauxic growth on glucose. They observed that the growth was divided into two distinct phases. In the first phase, glucose is utilised, and biomass and ethanol are produced.
In the second phase, which begins after glucose is completely exhausted, the ethanol that was produced is used as the substrate for further growth. To describe this phenomenon, many models for yeast fermentation have appeared in the literature (Aiba, Shoda, & Nagatani, 1968; Cooney, Wang, & Wang, 1977; Sonnleitner & Kappeli, 1986; Jones & Kompala, 1999). The Jones–Kompala cybernetic model (1999) gives the most accurate theoretical representation of ethanol fermentation. However, since only a reasonably accurate data generator is required for training the neural network, whose weights would later be corrected on-line, the Sonnleitner–Kappeli model (1986) was selected for simulation of fed-batch yeast cultivation. It was chosen because it is easier to implement and fitted our preliminary experimental data satisfactorily (Mehra, 1999). Sonnleitner explains the diauxic growth by the respiratory capacity of the cells, which represents a bottleneck for oxidative substrate utilization. For substrate fluxes low enough to fit the bottleneck, purely oxidative metabolism is observed, with priority for glucose (Eq. (1)) over ethanol (Eq. (3)).

\mathrm{C_6H_{12}O_6} + \hat{a}\,\mathrm{O_2} + b\,N_X(\mathrm{NH_3}) \rightarrow b\,\mathrm{CH_{H_X}O_{O_X}N_{N_X}} + c\,\mathrm{CO_2} + \mathrm{H_2O}    (1)

\mathrm{C_6H_{12}O_6} + g\,N_X(\mathrm{NH_3}) \rightarrow g\,\mathrm{CH_{H_X}O_{O_X}N_{N_X}} + h\,\mathrm{CO_2} + i\,\mathrm{H_2O} + j\,\mathrm{C_2H_6O}    (2)

\mathrm{C_2H_6O} + \hat{k}\,\mathrm{O_2} + l\,N_X(\mathrm{NH_3}) \rightarrow l\,\mathrm{CH_{H_X}O_{O_X}N_{N_X}} + m\,\mathrm{CO_2} + n\,\mathrm{H_2O}    (3)
If the glucose flux exceeds the respiratory bottleneck, the part of the glucose saturating the respiratory bottleneck is metabolised oxidatively (Eq. (1)) while the rest is metabolised reductively (Eq. (2)). Ethanol utilization is strictly oxidative; hence, under this condition glucose is consumed preferentially over ethanol. Thus the respiratory capacity controls the metabolism of yeast. Using this platform, Sonnleitner's model accurately describes the intermediate regions of yeast fermentation where cell mass, glucose and ethanol concentrations are comparable. The model equations are written in terms of the specific uptake rates and are modified here to include the dilution terms for fed-batch yeast cultivation.

\frac{dx}{dt} = Y_{bio/glu}^{oxid}\,\frac{q_{O_2,glu,max}}{a}\,\frac{c_L}{c_L + K_o}\,x + Y_{bio/glu}^{red}\,(q_s - q_s^{oxid})\,x + Y_{bio/eth}\,q_{eth}\,x + D(t)(x_0 - x)    (4)

\frac{ds}{dt} = -q_s x + D(t)(s_F - s)    (5)

\frac{dp}{dt} = Y_{eth/glu}^{red}\,(q_s - q_s^{oxid})\,x - \frac{q_{O_2,eth,max}}{k}\,\frac{c_L}{c_L + K_o}\,x + D(t)(p_{in} - p)    (6)

\frac{dc_L}{dt} = k_L a\,(c_L^{*} - c_L) - q_{O_2,glu,max}\,\frac{c_L}{c_L + K_o}\,x - q_{O_2,eth,max}\,\frac{c_L}{c_L + K_o}\,x    (7)

\frac{dV}{dt} = F(t)    (8)

The specific uptake rates for glucose, q_s, and ethanol, q_{eth}, are given by

q_s = \frac{q_{s,max}\,s}{s + K_s}    (9)

q_{eth} = Y_{eth/bio}\,\mu_{max,eth}\,\frac{p}{p + K_e}\,\frac{K_i}{K_i + s}    (10)

The specific uptake rate for glucose, q_s, is the sum of the uptake rates in the oxidative (q_s^{oxid}) and reductive (q_s^{red}) pathways:

q_s^{oxid} = \frac{q_{O_2,glu,max}}{a}\,\frac{c_L}{c_L + K_o}    (11)

q_s^{red} = q_s - q_s^{oxid}    (12)
where x is the biomass concentration (g l−1), s the glucose concentration (g l−1), p the ethanol concentration (g l−1), c_L the oxygen concentration (mmol l−1), V the volume (l), F(t) the substrate feed rate (l h−1), D the dilution rate (h−1), k_L a the mass transfer coefficient (h−1), and s_F the concentration of glucose in the feed (g l−1). The description and values of the other parameters appearing in the model are given in Appendix A (Sonnleitner & Kappeli, 1986). An empirical relation is used to describe the variation of k_L a (Van't Riet, 1983):

k_L a = 2.0\times 10^{-3}\,\left(\frac{P}{V}\right)^{0.7} v_s^{0.2}    (13)

where P/V is expressed in W m−3, v_s in m s−1 and k_L a in s−1. This model was simulated using the parameter values given in Appendix A to train the network in a regime where both growth and ethanol production take place. In the next section, we describe the network architecture and the real-time weight-updating scheme.
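To make the data-generation step concrete, the following Python sketch shows one way Eqs. (4)–(12) can be integrated with SciPy to produce a training trajectory under the training condition of Table 1. It is an illustration, not the authors' code: the parameter dictionary (values from Appendix A, with K_s chosen within its stated range), the constant stand-in for k_L a in place of Eq. (13), the trapezoidal feed profile of Table 1 and the initial volume of 3 l are all assumptions.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Parameter values from Appendix A (Ks chosen within its stated 0.1-0.5 range);
# sF from Table 1; the constant kLa is an assumed stand-in for Eq. (13).
par = dict(qs_max=3.5, qO2_max=8.0, Y_bg_oxid=0.49, Y_bg_red=0.05, Y_be=0.72,
           mu_max_eth=0.17, Ks=0.2, Ko=0.1, Ke=0.1, Ki=0.1,
           a=12.83, k=35.09, Y_eg_red=1.87, cL_star=0.2187, sF=25.0, kLa=400.0)

def feed_rate(t, F_max=15.0):
    """Trapezoidal feed profile of Table 1 (l/h): ramp up until 3 h,
    constant until 11 h, ramp down to zero at 15 h."""
    if t < 3.0:
        return F_max * t / 3.0
    if t < 11.0:
        return F_max
    return max(F_max * (15.0 - t) / 4.0, 0.0)

def rhs(t, y):
    x, s, p, cL, V = y
    F = feed_rate(t)
    D = F / V
    qs = par['qs_max'] * s / (s + par['Ks'])                                   # Eq. (9)
    qeth = (1.0 / par['Y_be']) * par['mu_max_eth'] * p / (p + par['Ke']) \
           * par['Ki'] / (par['Ki'] + s)                                       # Eq. (10); Yeth/bio taken as 1/Ybio/eth (assumption)
    qO2_glu_max = min(qs * par['a'], par['qO2_max'])                           # Appendix A
    qO2_eth_max = min(qeth * par['k'], par['qO2_max'] - qO2_glu_max)           # Appendix A
    qs_oxid = (qO2_glu_max / par['a']) * cL / (cL + par['Ko'])                 # Eq. (11)
    qs_red = qs - qs_oxid                                                      # Eq. (12)
    mu = par['Y_bg_oxid'] * qs_oxid + par['Y_bg_red'] * qs_red + par['Y_be'] * qeth
    dx = mu * x - D * x                                                        # Eq. (4), no biomass in the feed
    ds = -qs * x + D * (par['sF'] - s)                                         # Eq. (5)
    dp = (par['Y_eg_red'] * qs_red * x
          - (qO2_eth_max / par['k']) * cL / (cL + par['Ko']) * x - D * p)      # Eq. (6), p_in = 0
    dcL = (par['kLa'] * (par['cL_star'] - cL)
           - (qO2_glu_max + qO2_eth_max) * cL / (cL + par['Ko']) * x)          # Eq. (7)
    dV = F                                                                     # Eq. (8)
    return [dx, ds, dp, dcL, dV]

# Training condition of Table 1; DO of 96% expressed as 0.96*cL_star; V0 = 3 l assumed.
y0 = [0.4, 12.0, 0.1, 0.96 * par['cL_star'], 3.0]
sol = solve_ivp(rhs, (0.0, 15.0), y0, t_eval=np.linspace(0.0, 15.0, 151), method='LSODA')
```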
3. Network architecture and on-line adaptation

The recurrent neural network developed in this work has a three-layer structure (Fig. 1). The input layer consists of 10 nodes. Six of these correspond to the external inputs, comprising the four state variables and the two input variables. The state variables are glucose (s), biomass (x), dissolved oxygen (c_L) and ethanol (p), and the inputs are the air flow rate (Q_a) and the substrate feed rate (F). These values are provided at the current instant of time. The feedback from the output state variables, x, s, c_L and p, comprises the other four inputs to the first layer. The intra-connections in the output layer are from the node representing the dissolved oxygen concentration, c_L, to the other three output nodes. The output from the c_L node is the prediction of the dissolved oxygen concentration by the network. Hence, the predicted dissolved oxygen value is conveyed to all the other nodes of the output layer. The number of nodes in the hidden layer is not fixed and needs to be optimised. This network architecture was chosen on the basis of an extensive study of different network architectures differentiated by their feedback loops and intra-connections. These included feedback loops from the hidden layer to the input layer, and from the output layer to the input layer. In addition, architectures with and without intra-connections were tested (Gadkar, 2000).

Fig. 1. Schematic diagram of the proposed neural network architecture.

With reference to Fig. 1, V_{ij} is the array of weights for the connections between the input layer and the hidden layer, W_{jk} is the array of weights for the connections from the hidden layer to the output layer, U_k are the weights of the intra-connections from the dissolved oxygen node to each node of the output layer, and ξ_j and ξ_k are the bias nodes (not shown in the figure). The set of inputs I to the network is given by

I_i = x_q(n), q ∈ A;  I_i = y_r(n-1), r ∈ B    (14)

where A denotes the set of indices q for which x_q(n) is an external input at sample time n, and B denotes the set of indices r for the feedback loops, where y_r(n-1) is the output from an output neuron at sample time n-1. The total internal activity v_j(n) of a neuron j in the hidden layer at sample time n is given by

v_j(n) = \sum_i V_{ij} I_i(n) + \xi_j    (15)
and the output O_{hj} of the hidden layer neuron j is given by

O_{hj}(n) = \varphi(v_j(n))    (16)

where φ is the hyperbolic tangent (tanh) transfer function. The total internal activity of a neuron k of the output layer can then be written as

u_k(n) = \sum_j W_{jk} O_{hj}(n) + U_k\, y_{O_2}(n-1) + \xi_k    (17)

with y_k, the output of neuron k in the output layer, given by

y_k(n) = \psi(u_k(n))    (18)

where ψ is the sigmoid transfer function.
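The forward pass defined by Eqs. (14)–(18) is summarised in the short NumPy sketch below. It is an illustrative reading of the architecture rather than the authors' implementation: the array shapes, the random initialisation in ±0.1, the ordering of the external inputs and the index of the dissolved-oxygen output node are assumptions, and in practice the states would be normalised to match the sigmoid output range.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 10, 8, 4                 # the 10-8-4 configuration
V = rng.uniform(-0.1, 0.1, (n_in, n_hid))     # input -> hidden weights V_ij
W = rng.uniform(-0.1, 0.1, (n_hid, n_out))    # hidden -> output weights W_jk
U = rng.uniform(-0.1, 0.1, n_out)             # intra-connection weights U_k (from the cL node)
xi_h = np.zeros(n_hid)                        # hidden-layer biases xi_j
xi_o = np.zeros(n_out)                        # output-layer biases xi_k

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def forward(ext_inputs, y_prev, iO2=2):
    """One forward pass. ext_inputs: the six external inputs [x, s, cL, p, Qa, F]
    at sample time n (assumed order); y_prev: the four outputs at time n-1;
    iO2: assumed index of the dissolved-oxygen output node."""
    I = np.concatenate([ext_inputs, y_prev])          # Eq. (14): 6 external + 4 fed-back inputs
    Oh = np.tanh(I @ V + xi_h)                        # Eqs. (15)-(16)
    u = Oh @ W + U * y_prev[iO2] + xi_o               # Eq. (17): intra-connection from the cL node
    return sigmoid(u), Oh                             # Eq. (18)

# Example call with arbitrary (already normalised) values:
y, Oh = forward(np.array([0.3, 0.6, 0.8, 0.1, 0.5, 0.2]), np.full(n_out, 0.5))
```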
When the neural network is trained off-line, the standard real-time temporal supervised learning algorithm is used (Haykins, 1999). However, when the neural network is used on-line, a new criterion needs to be defined for updating the weights retained by the neural network from its training. The criterion used to adapt the weights was to minimise the error between the predicted and the actual measurement of c_L. The deviation between the actual process and the prediction is transmitted back through the neural network for adapting the weights. The intra-connections added from the c_L node to the other nodes of the output layer bring in the dependence of the other state variables on dissolved oxygen. The instantaneous error is minimised by the real-time supervised algorithm, and the weights are adapted so that the error between the measured and predicted values of dissolved oxygen is acceptable. The instantaneous error E_{O_2} at any time is defined as

E_{O_2}(n) = \frac{1}{2}\, e_{O_2}^2(n)    (19)

where

e_{O_2}(n) = d_{O_2}(n) - y_{O_2}(n)    (20)

d_{O_2} is the actual DO value measured at sample time n and y_{O_2} is the predicted value. The objective here is to adapt the weights in such a way that the overall error function E_{O_2} is minimised. Therefore, when the neural network is implemented on-line, the new scheme for determining the incremental changes in the weights is given by

\Delta W_{jl}(n) = \eta\, e_{O_2}\,\frac{\partial y_{O_2}(n)}{\partial W_{jl}}    (21)
where ΔW_{jl}(n) is the incremental change in the weight W_{jl}(n) made at sample time n, and η is the learning rate. The algorithm for the weight update can be summarised in the following steps.

• Step 1. At time t(0), input the initial values of the bioprocess variables s, x, p, c_L, the air flow rate and the substrate feed rate to the neural network.
• Step 2. Predict the values of s, x, p and c_L at the next sample time (t + Δt).
• Step 3. At time t + Δt, measure the on-line dissolved oxygen value.
• Step 4. Compare this value with the predicted value and calculate the error. Based on this error, compute the incremental changes in the weights and adapt the weights.
• Step 5. Re-predict the bioprocess variables.
• Step 6. Execute Steps 4 and 5 until the error is reduced to the desired value. The weights obtained are set as the new weights of the neural network.
• Step 7. The predicted values of the state variables are set as the current values.
• Step 8. Go to Step 2.
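Building on the forward-pass sketch above, the fragment below illustrates Steps 1–8 together with the gradient-descent correction of Eq. (21). It is a simplified reading of the scheme: only the output-layer weights and bias feeding the dissolved-oxygen node are adapted here, whereas the paper transmits the error through the whole network, and the tolerance, iteration limit and measurement routine are assumptions.

```python
def adapt_step(ext_inputs, y_prev, dO2_meas, eta=0.05, tol=1e-4, max_iter=50, iO2=2):
    """Steps 2-7 for one sampling interval: predict, compare with the measured
    DO value, adapt the weights by Eq. (21) and re-predict until the error of
    Eq. (19) is acceptable. Reuses forward(), W and xi_o from the earlier sketch."""
    for _ in range(max_iter):
        y, Oh = forward(ext_inputs, y_prev, iO2)        # Steps 2 and 5: prediction
        e = dO2_meas - y[iO2]                           # Eq. (20)
        if 0.5 * e * e < tol:                           # Eq. (19) below the desired value
            break
        # Eq. (21): for the sigmoid output, dy_O2/dW_j,O2 = y_O2 (1 - y_O2) Oh_j
        g = y[iO2] * (1.0 - y[iO2])
        W[:, iO2] += eta * e * g * Oh                   # Step 4: adapt weights (in place)
        xi_o[iO2] += eta * e * g
    return y                                            # Step 7: predictions become current values

# One sampling interval (Steps 1-3), with a hypothetical on-line reading measure_DO():
# y_prev = adapt_step(current_inputs, y_prev, measure_DO())
```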
4. Training and optimization of network architecture

The model Eqs. (4)–(12) were integrated numerically to generate the training set using the initial conditions shown in Table 1. In view of the fact that the neural network would be implemented on-line, the process simulations were carried out assuming actual experimental conditions. The data were generated at a constant sampling interval of 6 min for 15 h, and the neural network was trained with this data set. Initial weights assigned randomly between ±0.1 worked best for this system. The performance of the neural network was examined with respect to the learning coefficient η, the number of hidden nodes, the number of iterations required to reduce the total error to an acceptable level, the recall error and the computation time required per iteration.

Table 1
Initial process conditions used for the training and validation of the networks

Variable                                    Training specimen   Cross validation
                                                                I      II     III    IV
Initial substrate concentration (g l−1)     12.0                9.0    15.0   11.0   13.0
Initial biomass concentration (g l−1)       0.4                 0.5    0.3    0.2    0.35
Initial ethanol concentration (g l−1)       0.1                 0.2    0.25   0.1    0.05
Initial dissolved oxygen (%)                96.0                80.0   90.0   75.0   95.0
Air flow rate (l s−1)                       2.0                 1.5    2.5    2.25   1.75
Maximum substrate feed rate (l h−1)*        15.0                20.0   12.0   13.0   17.0
Substrate concentration in feed (g l−1)     25.0                25.0   25.0   25.0   25.0

* The feed rate is linearly increased from zero to the maximum value until the third hour, after which it is kept constant until the 11th hour; it is then linearly decreased to zero by the 15th hour.

Fig. 2. Variation in the mean square error with the number of iterations for different numbers of hidden layer nodes.
The training was carried out using varying η values. Starting at a value of 0.001, the learning rate η was increased by a factor of 1.02 if the error decreased and reduced by a factor of 0.7 if the error increased. The learning rate η was varied in this way until a normalised mean square error (MSE) of 0.4 was obtained; the value of η at this point was held constant for the remainder of the training. This was done to avoid oscillations in the weights once they approached their optimum values. The training was carried out for network structures with different numbers of hidden layer nodes. The error profiles with respect to the number of iterations for these cases are presented in Fig. 2. It is observed that the error decreases smoothly without oscillations, indicating the convergence of the network to the global minimum. Although the initial value of η has a minor influence on the recall error, the error profile is free from any significant oscillation and is indicative of an acceptable recall profile and generalization. For the various cases studied, the final value of η obtained at the point where the MSE reaches 0.4 lies between 0.07 and 0.10. The time required per iteration was independent of the initial value of the learning coefficient but increased with the number of nodes in the hidden layer. A summary of the results is given in Table 2.

Table 2
Comparison of training for varying number of hidden nodes

No. of hidden nodes   No. of iterations   Total recall error   Time per iteration (s)
6                     35235               0.10419              0.750
8                     13038               0.01532              1.304
10                    13982               0.14558              2.142
12                    11368               0.02503              3.333
14                    *                   *                    6.000

* No convergence.

An identical training was also performed on networks without intra-connections, for varying numbers of hidden nodes, to verify whether the intra-connections actually improve the recall and generalization of the network. It was observed that the same configurations without intra-connections gave significantly higher recall errors in most cases of training. For example, with 10 hidden nodes the recall error was halved when intra-connections were used, while with 8 hidden nodes the recall error was reduced 20-fold. The objective of our training was to determine the set of weights that gives the best generalization for the network. In Fig. 3, the training and recall profiles are presented together for comparison. From our observations and an overall assessment of the performance of the neural network in terms of iterations, time required per iteration and recall profiles, a 10-8-4 configuration gave the best performance. This configuration was used for on-line adaptation.

Fig. 3. Recall profiles with 8 hidden nodes for the training data set.
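The adaptive learning-rate schedule described above can be captured in a few lines. The sketch below is an assumed reading of that rule; the epoch-training routine named in the usage comment is hypothetical.

```python
def update_learning_rate(eta, mse, mse_prev, frozen, target_mse=0.4):
    """Increase eta by 2% when the error falls, cut it to 70% when it rises,
    and freeze it once the normalised MSE reaches the 0.4 threshold."""
    if frozen or mse <= target_mse:
        return eta, True                      # hold eta constant hereafter
    return (eta * 1.02 if mse < mse_prev else eta * 0.7), False

# Assumed use inside an off-line training loop:
# eta, frozen, mse_prev = 0.001, False, float('inf')
# for epoch in range(n_epochs):
#     mse = train_one_epoch(eta)              # hypothetical routine
#     eta, frozen = update_learning_rate(eta, mse, mse_prev, frozen)
#     mse_prev = mse
```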
5. Performance of the neural network outside the training domain

For effective performance of the neural network outside its training regime, the weights obtained from training are continuously updated using the on-line adaptation algorithm described earlier. We have considered four different initial conditions, I–IV in Table 1, for validation of the neural network as an effective estimator. These initial conditions cover the possible deviations from the training set. In conditions I and II, the total amount of glucose added over the duration of the fermentation is nearly the same as that for the training set. However, in condition I the initial glucose concentration is low, and this results in lower ethanol production than in the training set. For condition II the situation is reversed: the initial glucose concentration is high, and a higher amount of ethanol is produced than in the training set. In condition III, the rate of addition of substrate is lower than in the training set because both the initial concentration and the feed rate are lower. Since the air flow rate in this case is high, the substrate is consumed more rapidly, and consequently the ethanol concentration obtained is significantly lower. The situation is reversed for condition IV. Conditions III and IV are such that both the substrate concentrations and the air flow rates cause the profiles to deviate more from the training set; these are therefore the cases where the performance of the networks is more rigorously tested. Here we also consider the cases with noise in the dissolved oxygen measurement (random, up to 2% dissolved oxygen) and without noise.

The recall errors for the process states are shown in Table 3. It is evident that there is always an improvement in the recall error when the weights are adapted on-line. This is true for situations both with and without noise. To rule out any improvement due to the presence of noise, it may be noted that the noise is biased in the direction opposite to the desired profile. Therefore, the on-line adaptation scheme alone is responsible for the improvement observed in the predicted state versus time profiles. Fig. 4 shows a comparison of the predicted state profiles, both with and without weight updates, with the actual profiles for condition I in the presence of noise.

Table 3
Recall errors for conditions outside the domain of training

                     With noise                             Without noise
Condition   State    With adaptation   Without adaptation   With adaptation   Without adaptation
I           s        0.062             0.319                0.082             0.325
            x        0.141             0.683                0.124             0.659
            p        0.352             1.564                0.442             1.533
II          s        0.067             0.478                0.114             0.481
            x        0.230             0.364                0.189             0.356
            p        1.115             1.158                0.867             1.154
III         s        0.512             2.572                1.357             2.575
            x        0.301             0.599                0.410             0.602
            p        0.241             1.150                0.410             1.152
IV          s        0.429             1.213                0.608             1.221
            x        0.132             0.324                0.120             0.312
            p        0.566             0.987                0.663             0.991

Fig. 4. Performance with on-line adaptation of weights when a 2% random biased noise is incorporated in DO measurements. A, actual state profile; B, profile with on-line adaptation of weights; C, profile without adaptation of weights.
6. On-line implementation for controlling yeast fermentation

6.1. Significance of dissolved oxygen measurement

Commercially available laboratory-scale bioreactors come with standard control features. Usually pH, temperature, agitation speed and dissolved oxygen are measured on-line and controlled with standard modules. Temperature and pH are normally kept constant using PID and on–off controllers, and oxygen levels are controlled with cascade controllers acting on the agitation speed and air flow rate. Substrate and cell mass concentrations may or may not be measurable on-line, depending on the kind of substrate and on the overall composition of the culture medium. Product concentrations are usually measured off-line and may even require several hours of analysis. Consequently, difficulties arise when the process must be controlled along a desired trajectory or when the controlled variable cannot be measured.

For the on-line adaptation of weights proposed in this work, new information about the process must come from the on-line measurements, so it is necessary to select a process measurement with sufficient information content. The dissolved oxygen concentration meets this requirement because it reflects the metabolic state of the process. Under conditions of sufficient oxygen availability and respiratory capacity, growth on both glucose and ethanol is possible, with preference for growth on glucose. On the other hand, under conditions of limited respiratory capacity, excess glucose is converted to ethanol via the reductive pathway. In addition, the dissolved oxygen concentration can be measured continuously and reliably.
Fig. 5. Substrate feeding profile for controlling the cell concentration along a linearly increasing profile x = 0.15 × time.
6.2. Control strategy

The neural network estimator developed here was implemented for process control. It was proposed to control the cell concentration along a linearly increasing profile given by x = ct + c1, where c and c1 are constants, t is the fermentation time and c1 is obtained from c1 = x|_{t=3 h} − 3c. To implement this control strategy, a feed forward law was derived using Eq. (4) as

D = Y_{bio/glu}^{oxid}\,\frac{q_{O_2,glu,max}}{a}\,\frac{c_L}{c_L + K_o} + Y_{bio/glu}^{red}\,(q_s - q_s^{oxid}) + Y_{bio/eth}\,q_{eth} - \frac{c}{x}    (22)

The neural network was used to obtain the estimates of the state variables required to evaluate the current feeding rate. The air flow rate for all control experiments was maintained at 6 l min−1. The neural network was updated based on the error between the current dissolved oxygen measured on-line and the predicted value. The updated network was then used to make state predictions one sample time ahead, and these predicted state variables were used in the control strategy.

A fermentation run was carried out using the D profile given by Eq. (22) with c = 0.15, held constant between sampling intervals (Fig. 5). Since the state variables required for determining D from Eq. (22) could not be measured on-line, these were estimated from Sonnleitner's model. The fermentation experiment was carried out using this substrate feeding profile and the conditions given in Table 4.
Table 4
Conditions of fermentation experiment carried out for generating data for implementation of the neural network with on-line adaptation for control

Initial substrate concentration (g l−1)        16.0
Initial ethanol concentration (g l−1)          2.40
Initial biomass concentration (g l−1)          1.64
Initial dissolved oxygen concentration (%)     86
Initial volume (l)                             3.0
Air flow rate (l min−1)                        6.0
Substrate feed concentration (g l−1)           20.0
Starting time for feed (h)                     4.0
Total fermentation time (h)                    11.0
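As an illustration of how the feed-forward law of Eq. (22) turns the network's state estimates into a pumping rate, the sketch below computes D and the corresponding feed rate F = D·V. It reuses the parameter dictionary from the simulation sketch in Section 2; the function name, argument order and the non-negativity clip are assumptions rather than details from the paper.

```python
def feedforward_dilution(x_hat, s_hat, p_hat, cL_hat, V, c, par):
    """Dilution rate D (h^-1) from Eq. (22) and the feed rate F = D*V (l/h),
    evaluated with the state estimates x_hat, s_hat, p_hat, cL_hat."""
    qs = par['qs_max'] * s_hat / (s_hat + par['Ks'])                           # Eq. (9)
    qeth = (1.0 / par['Y_be']) * par['mu_max_eth'] * p_hat / (p_hat + par['Ke']) \
           * par['Ki'] / (par['Ki'] + s_hat)                                   # Eq. (10)
    qO2_glu_max = min(qs * par['a'], par['qO2_max'])                           # Appendix A
    qs_oxid = (qO2_glu_max / par['a']) * cL_hat / (cL_hat + par['Ko'])         # Eq. (11)
    D = (par['Y_bg_oxid'] * qs_oxid                                            # Eq. (22)
         + par['Y_bg_red'] * (qs - qs_oxid)
         + par['Y_be'] * qeth
         - c / x_hat)
    D = max(D, 0.0)              # assumption: the feed pump cannot run backwards
    return D, D * V
```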
Fig. 6. Experimental data (used for training) obtained from yeast fermentation by implementing the feeding profile given in Fig. 5.
The data obtained from this experiment were used to train the neural network for implementing control with on-line adaptation. The state variables x, s, p and c_L were measured at 1 h intervals, and the profiles are shown in Fig. 6. The cell concentration profile thus obtained had a slope of 0.13; its difference from the desired slope of 0.15 may be attributed to process-model mismatch. The hourly data were then interpolated to obtain data at 15 min intervals for training of the neural network. A mean square error of 0.005 per data point for training and of 0.0112 per data point for recall was obtained. The set of weights obtained after this training was adapted on-line, based on dissolved oxygen measurements, when the neural network was implemented for controlling the actual fermentation process.

6.3. Process conditions and control implementation

The fermentation run was performed in a 15 l reactor (B-Braun Biostat C, Germany). A simple medium formulation consisting of 2% glucose, 2% peptone and 1% yeast extract was used. The temperature of the fermentation was controlled at 30 °C, the pH at 5 and the agitation speed at 300 rpm with the in-built B-Braun controllers. The fermentation began with an initial volume of 3 l and was run in batch mode for the first 4 h; from 4 to 11 h it was run in fed-batch mode. The final volume at the end of the experiment was about 11 l. The cell concentration x was measured from the optical density at 600 nm, glucose was measured by the DNS method, and ethanol by gas chromatography using a Chromosorb 50 column.

The feeding rate was controlled using the neural control algorithm based on the on-line adaptation of weights. The only process variable measured on-line was the dissolved oxygen concentration. The current error (c_L − ĉ_L) was computed and the new set of weights was determined by minimizing this error. The neural network was then used to predict the state variables x̂, ŝ, p̂ and ĉ_L one time step ahead, and these predicted states were used to compute the control action for the next
time instant using Eq. (22). The control action was translated into a signal between 0 and 10 V for controlling the feed pump through a PCL-726 DAC card (Dynalog Microsystems, India). Although the value of c was taken as 0.15 for training, the desired control trajectory for the on-line implementation was computed using a different value of c, equal to 0.2. This tested not only the generalization capability of the neural network, by considering a case outside the training set, but also its adaptation capability. The algorithm for the control implementation is shown in Fig. 7.

Fig. 7. Flow diagram for on-line implementation of the neural network for controlling yeast fermentation using a feed forward control law.
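One iteration of the loop in Fig. 7 can be pieced together from the earlier sketches as shown below. The composition is ours, not the authors' code: the ordering and scaling of the network outputs, the measure_DO routine named in the comment, the maximum pump rate F_max and the linear 0–10 V calibration are all assumptions (in practice the network outputs would first be rescaled to engineering units).

```python
def control_iteration(ext_inputs, y_prev, dO2_meas, V, par, c=0.2, F_max=1.0):
    """Measure DO, adapt the network, predict one sample ahead, evaluate
    Eq. (22) and translate the feed rate into a 0-10 V pump signal.
    Reuses adapt_step() and feedforward_dilution() from the earlier sketches."""
    y_new = adapt_step(ext_inputs, y_prev, dO2_meas)       # Steps 3-7 of the adaptation algorithm
    x_hat, s_hat, cL_hat, p_hat = y_new                    # assumed output order; rescaling omitted
    D, F = feedforward_dilution(x_hat, s_hat, p_hat, cL_hat, V, c, par)   # Eq. (22)
    pump_volts = 10.0 * min(max(F / F_max, 0.0), 1.0)      # linear 0-10 V calibration (assumed)
    return y_new, F, pump_volts

# Each sampling interval (with a hypothetical on-line reading measure_DO()):
# y_prev, F, volts = control_iteration(current_inputs, y_prev, measure_DO(), V, par)
```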
Fig. 8. On-line control implementation of neural network for controlling the cell concentration along the profile x = 0.2 × time when the network was trained for a profile of x = 0.15 × time. • Experimental value; prediction with online adaptation of weights; prediction without adaptation of weights.
The results of the control implementation of this neural architecture are presented in Fig. 8, which shows the profiles predicted by the neural network with and without adaptation of the weights, along with the actual experimental data. We observe that the predictions of the network with on-line updating of the weights follow the experimental data more closely. The initial weights of the neural network are those obtained by training it on the data of the previous experiment (Fig. 6). In this experiment (Fig. 8), the dynamics are different; for example, the substrate consumption rate between 0 and 4 h is faster than in the previous experiment. As seen in Fig. 8, the prediction of the network based on the weights obtained from the previous experiment follows a trend similar to that of the training data. This is most easily discernible from the ethanol profile, where the time to attain the peak is also the same. However, the actual process dynamics are different because of experimental variations (0 < t < 4 h) and the new control objective (t ≥ 4 h). The changes occurring in the process dynamics are conveyed to the neural network in real time by the on-line dissolved oxygen concentration measurements. The error (c_L − ĉ_L) is used to determine a new set of weights that describes the new process dynamics more accurately. Based on the new weights, D for the next time step is computed from Eq. (22). Consequently, the prediction of the states and the control action in this case differ from the training data shown in Fig. 6. The adapted neural network follows the actual trends observed in real time more closely. Moreover, the biomass concentration is observed to follow the desired profile within acceptable limits of error. If the slope of the cell concentration profile is determined by linear regression using the data after the control action begins (from 4 h onwards), it is found to be 0.173. If, however, the last data point is considered inaccurate and ignored, the recomputed slope of the cell concentration profile is 0.203. This is fairly accurate when compared with the control trajectory, which has a slope of 0.2.

The observed inaccuracy in the last data point may be explained as follows. There is an increased mismatch between the model prediction and the actual process at low glucose concentrations. The model equations predicted a glucose concentration of about 0.03 g l−1 beyond the 9th hour, whereas in the actual experiment the residual glucose concentration lies between 0.8 and 1.0 g l−1 from the 8th hour onwards. Consequently, the control action computed using Eq. (22) is higher than what is actually required beyond the 10th hour, resulting in a lower cell concentration. If this feed rate were lower (as computed during the simulation), the cell concentration would be higher, because there would have been sufficient ethanol for cell growth and no dilution effect. Thus the adapted neural network effectively tracks the desired trajectory of biomass concentration. The deteriorated performance at the end of the fed-batch is mainly due to plant-model mismatch and to the fact that only one state variable (c_L) is measured on-line. The biomass predicted by the network without the weight update is lower than the desired biomass values for the trajectory with c = 0.2. This shows that the on-line updating of the weights provides more accurate state estimates and makes the network suitable for on-line control implementation.
7. Conclusions

We have studied the performance of the neural network to evaluate its capability for on-line implementation in bioprocess control. Its performance in yeast fermentation control shows conclusively that it can be implemented for on-line control. However, it is important to note that the on-line adaptation scheme will work only if the available measurements adequately reflect the changing dynamics of the process. More importantly, a priori data from similar operational runs are needed for the initial training of the neural network. For the yeast fermentation process studied, it was necessary to obtain the dynamics by performing an experiment using a feeding strategy that gave the desired profile of the cell concentration. In industry, such information is usually available; therefore, a network as described in this paper can be trained for the desired input-output behaviour. With the on-line adaptation of weights proposed here, the neural network is able to track the changing dynamics of the process due to external disturbances, thereby predicting the process behaviour around the training domain accurately. Since the computation time for adaptation of the weights is very small (1–2 s), the method can be used comfortably for process control. We have also carried out simulations using more than one variable for adapting the weights (not reported) and observed that the predictions improve significantly, because more information is available to the neural network from on-line measurements. Hence, if more than one variable can be measured on-line, further improvement in control performance may be expected.
Appendix A

Description and values of model parameters

Parameter                                                           Value                                                      Dimension
q_{s,max}, maximal specific glucose uptake rate                     3.5                                                        g g−1 h−1
q_{O_2,max}, maximal specific oxygen uptake rate                    8.0                                                        mmol g−1 h−1
Y_{bio/glu}^{oxid}, yield for pathway (1)                           0.49                                                       g g−1
Y_{bio/glu}^{red}, yield for pathway (2)                            0.05                                                       g g−1
O_X, subscript for oxygen content in molecular formula              0.57                                                       mol mol−1
Y_{bio/eth}, yield for pathway (3)                                  0.72                                                       g g−1
µ_{max,eth}, maximal specific growth rate                           0.17                                                       h−1
K_s, saturation parameter for glucose uptake                        0.1–0.5                                                    g l−1
K_o, saturation parameter for oxygen uptake                         0.1                                                        mg l−1
K_e, saturation parameter for growth on ethanol                     0.1                                                        g l−1
K_i, parameter for glucose inhibition of ethanol uptake             0.1                                                        g l−1
C_X, subscript for carbon content in molecular formula              1.00                                                       mol mol−1
H_X, subscript for hydrogen content in molecular formula            1.79                                                       mol mol−1
N_X, subscript for nitrogen content in molecular formula            0.15                                                       mol mol−1
a, stoichiometric coefficient                                       12.83                                                      mmol O2/g glu
k, stoichiometric coefficient                                       35.09                                                      mmol O2/g eth
Y_{eth/glu}^{red}, product yield for pathway (2)                    1.87                                                       mol eth/mol glu
c_L^{*}, saturation value of oxygen                                 0.2187                                                     mmol l−1
q_{O_2,glu,max}, maximal specific oxygen uptake rate for glucose    minimum of q_s a and q_{O_2,max}                           mmol g−1 h−1
q_{O_2,eth,max}, maximal specific oxygen uptake rate for ethanol    minimum of q_{eth} k and (q_{O_2,max} − q_{O_2,glu,max})   mmol g−1 h−1
References

Aiba, S., Shoda, M., & Nagatani, M. (1968). Kinetics of product inhibition in alcohol fermentation. Biotechnology and Bioengineering, 10, 845–865.
Barford, J. P. (1981). A mathematical model for the aerobic growth of Saccharomyces cerevisiae with a saturated respiratory capacity. Biotechnology and Bioengineering, 23, 1735–1762.
Baughman, D. R., & Liu, Y. A. (1995). Neural networks in bioprocessing and chemical engineering. San Diego: Academic Press.
Breusegem, V., Thibault, J., & Cheruy, A. (1991). Adaptive neural models for on-line prediction in fermentation. Canadian Journal of Chemical Engineering, 69(2), 481–487.
Chtourou, M., Najim, K., Roux, G., & Dahhou, B. (1993). Control of a bioreactor using neural network. Bioprocess Engineering, 8, 251–254.
Cooney, C. L., Wang, H. Y., & Wang, D. I. C. (1977). Computer-aided material balancing for prediction of fermentation parameters. Biotechnology and Bioengineering, 19, 55–67.
Donat, J. S., Bhat, N., & McAvoy, T. J. (1991). Neural net based model predictive control. International Journal of Control, 54(6), 1453–1468.
Gadkar, K. (2000). Control of yeast fermentation using neural networks with on-line weight updating. M.Tech. thesis, Indian Institute of Technology, New Delhi.
Ge, S. S., & Lee, T. H. (1997). Robust adaptive neural network control for a class of non-linear systems. Proceedings of the Institution of Mechanical Engineers, 211(1), 171–181.
Gomes, J., & Menawat, A. S. (1992). Estimation of fermentation parameters using partial data. Biotechnology Progress, 8, 118–125.
Gomes, J., Roychoudhury, P. K., & Menawat, A. S. (1997). Effect of inputs on parameter estimation and prediction of unmeasured states. In Proceedings of the IIChE Golden Jubilee Congress (pp. 239–248).
Gomes, J., & Menawat, A. S. (1998). Fed-batch bioproduction of spectinomycin. Advances in Biochemical Engineering and Biotechnology, 59, 1–46.
Gomes, J., & Menawat, A. S. (2000). Precise control of dissolved oxygen in bioreactors: a model-based geometric algorithm. Chemical Engineering Science, 55(1), 67–78.
Haykins, S. (1999). Neural networks: a comprehensive foundation. New Jersey: Prentice-Hall Inc.
Holmberg, A., & Ranta, J. (1982). Procedures for parameter and state estimation of microbial growth process models. Automatica, 13, 181–193.
Hussain, M. A. (1999). Review of the applications of neural networks in chemical process control: simulation and online implementation. Artificial Intelligence in Engineering, 13(1), 55–68.
Ignova, M., Glassey, J., Montague, G. A., Paul, G. C., Ward, A. C., Thomas, C. R., & Karim, N. M. (1996). Towards intelligent process supervision: industrial penicillin fermentation case study. Computers and Chemical Engineering, 20(Suppl. 1), S545–S550.
Inamdar, S. R., & Chui, M. S. (1997). Fuzzy logic control of an unstable biological reactor. Chemical Engineering Technology, 20, 414–418.
Jones, K. D., & Kompala, D. S. (1999). Cybernetic model of the growth dynamics of Saccharomyces cerevisiae in batch and continuous cultures. Journal of Biotechnology, 71, 105–131.
Kravaris, C. (1988). Input/output linearization: a nonlinear analog of placing poles at process zeros. AIChE Journal, 34(11), 1803–1812.
Kravaris, C., & Soroush, M. (1990). Synthesis of multivariable nonlinear controllers by input/output linearization. AIChE Journal, 36(2), 249–264.
Lee, D. S., & Park, J. M. (1999). Neural network modelling for on-line estimation of nutrient dynamics in a sequentially-operated batch reactor. Journal of Biotechnology, 75, 229–239.
Lemoigne, M., Aubert, J. P., & Millet, J. (1954). Annales de l'Institut Pasteur (Paris), 87(4), 427–439.
Lennox, B., Rutherford, P., Montague, G. A., & Haughin, C. (1998). Case study investigating the application of neural networks for process modelling and condition monitoring. Computers and Chemical Engineering, 22, 1573–1579.
Linko, S., Rajalathi, T., & Zhu, Y.-H. (1995). Neural state estimation and prediction in amino acid fermentation. Biotechnology Techniques, 9(8), 607–612.
Mehra, S. (1999). Algorithm for the on-line updating of real-time recurrent network with partial state feedback for bioprocess control implementation. M.Tech. thesis, Indian Institute of Technology, New Delhi.
Montague, G. A., & Morris, J. (1994). Neural-network contributions in biotechnology. Trends in Biotechnology, 12, 312–324.
Narendra, K. S. (1996). Neural networks for control: theory and practice. Proceedings of the IEEE, 84(10), 1385–1406.
Narendra, K. S., & Parthasarathy, K. (1990). Identification and control of dynamical systems using neural networks. IEEE Transactions on Neural Networks, 1(1), 4–28.
Potocnik, P., & Grabec, I. (1999). Empirical modeling of antibiotic fermentation process using neural networks and genetic algorithms. Mathematics and Computers in Simulation, 49, 363–379.
Proll, T., & Karim, N. M. (1994). Nonlinear control of a bioreactor model using exact and I/O linearization. International Journal of Control, 60(4), 499–519.
Schubert, J., Simutis, R., Dors, M., Havlik, I., & Lubbert, A. (1994). Bioprocess optimisation and control: application of hybrid modelling. Journal of Biotechnology, 35(1), 51–68.
Shi, Z., & Shimizu, K. (1992). Neuro-fuzzy control of bioreactor systems with pattern recognition. Journal of Fermentation and Bioengineering, 74(1), 39–45.
Simutis, R., & Lübbert, A. (1997). Exploratory analysis of bioprocesses using artificial neural network-based methods. Biotechnology Progress, 13, 479–487.
Sonnleitner, B., & Kappeli, O. (1986). Growth of Saccharomyces cerevisiae is controlled by its limited respiratory capacity: formulation and verification of a hypothesis. Biotechnology and Bioengineering, 28, 927–937.
Stephanopoulos, G., & San, K. Y. (1984). Studies on on-line bioreactor identification. I. Theory. Biotechnology and Bioengineering, 26, 1176–1188.
Su, H. T., & McAvoy, T. J. (1989). Identification of chemical processes using recurrent networks. In Proceedings of the American Control Conference (pp. 2314–2319).
Szepesvári, C., Szabolcs, C., & Lorincz, A. (1997). Neurocontroller using dynamic state feedback for compensatory control. Neural Networks, 10(9), 1691–1708.
Teissier, P., Perret, B., Latrille, E., Barillere, J. M., & Corrieu, G. (1997). A hybrid recurrent neural network model for yeast production monitoring and control in a wine based medium. Journal of Biotechnology, 55(9), 157–169.
Tian, Y., Zhang, J., & Morris, J. (2002). Optimal control of a fed-batch bioreactor based upon an augmented recurrent neural network model. Neurocomputing, 48, 919–936.
Van't Riet, K. (1983). Mass transfer in fermentation. Trends in Biotechnology, 1(4), 113–116.
You, Y., & Nikolaou, M. (1993). Dynamic process modelling with recurrent neural networks. AIChE Journal, 39(10), 1654–1667.