Computers and Chemical Engineering 31 (2007) 1131–1140
Optimization of invertase production in a fed-batch bioreactor using simulation based dynamic programming coupled with a neural classifier
Catalina Valencia, Gabriela Espinosa, Jaume Giralt, Francesc Giralt ∗
Grup de Fenòmens de Transport, Departament d’Enginyeria Química, Universitat Rovira i Virgili, Campus Sescelades, Av. dels Països Catalans 26, 43007 Tarragona, Catalunya, Spain
Received 12 May 2005; received in revised form 4 October 2006; accepted 5 October 2006
Available online 13 November 2006
Abstract
A controller based on neuro-dynamic programming coupled with a fuzzy ARTMAP neural network was developed to produce cloned invertase in Saccharomyces cerevisiae yeast in a fed-batch bioreactor. The objective was to find the optimal glucose feed rate profile needed to achieve the highest fermentation profit in this reactive system, where the enzyme expression is repressed at high glucose concentrations. The controller updated the optimal control action over time to increase the profitability of the fed-batch bioreactor. The proposed neuro-dynamic programming (NDP) approach, coupled with the fuzzy ARTMAP classifier, utilized suboptimal control policies to start the optimization. The fuzzy ARTMAP algorithm was used to build a cost surface in the state space visited by the process, thus mitigating the curse of dimensionality and its associated high computational cost. Bellman’s iteration was used to improve the fuzzy ARTMAP approximation of the cost surface before its implementation into the control system. The controller was tested at different fermentation conditions for initial reactor volumes within the range 0.4–0.8 l and a final constant fermentation volume of 1.2 l. Profits were higher than those previously reported in the literature, with continuous and smooth glucose feed rate profiles that are easy to implement under production conditions. The control system was also tested when the substrate concentration changed unexpectedly. In this case the global performance of the controller was also better than that obtained with the best suboptimal policy and with previous methods.
© 2006 Elsevier Ltd. All rights reserved.
Keywords: Fed-batch; Optimization; Invertase production; Neural networks; Fermentation; Process control
1. Introduction

Many industrial fermentation processes involving production of antibiotics, enzymes and organic acids are carried out in a fed-batch mode of operation, where substrates are added continuously. Fed-batch bioreactors are particularly useful when growth and/or metabolite production is inhibited at certain substrate or end-product concentrations or by catabolite repression. In those cases, the controlled addition of substrate is essential to achieve maximum production of the desired product, i.e., it is necessary to determine the optimal substrate feed rate profile (Balsa-Canto, Banga, Alonso, & Vassiliadis, 2000; Georgieva, Hristozov, Pencheva, Tzonkov, & Hitzmann, 2003; Riascos & Pinto, 2004; Smets, Claes, November, Bastin, &
∗ Corresponding author. Tel.: +34 977 559 638; fax: +34 977 559 621. E-mail address: [email protected] (F. Giralt). URL: http://www.etseq.urv.es/personal/fgiralt/fgiralt.html.
0098-1354/$ – see front matter © 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.compchemeng.2006.10.002
van Impe, 2004; Stigter & Keesman, 2004; Zhang & Lennox, 2004). The problem of determining the optimal substrate feed rate profile is a singular control problem. The control variable, substrate feed rate, usually appears linearly coupled with the state equations that describe the process. Many optimization methods commonly used to solve the singular control problem do not work well in systems described by more than four differential equations. This is the case of the fermentation process for the cloned invertase expression in Saccharomyces cerevisiae yeast studied by Patkar and Seo (1992). These authors found that the enzyme expression was repressed when the substrate concentration was high. They also investigated the fed-batch operation of the bioreactor with the aim of increasing productivity. Patkar and Seo (1993) proposed later a bioreactor model that takes into account the respiratory and the fermentative fluxes for the substrate consumption. With the help of this model they used the conjugate gradient method to find the optimal feed rate profile for certain fermentation process conditions. Chaudhuri
C. Valencia et al. / Computers and Chemical Engineering 31 (2007) 1131–1140
and Modak (1998) incorporated a neural network model into the generalized reduced gradient method for the same productivity optimization. Another alternative is to use genetic algorithms for the optimization problem (Sarkar & Modak, 2004, 2005). The optimization methods applied previously by Patkar and Seo (1993) and Chaudhuri and Modak (1998) require the solution of a new and different optimization problem for each initial condition because the fermentation ending time has to be fixed before the optimization procedure is carried out. Thus, several fermentation ending times have to be tried to find the optimal fermentation time, and for each one of them the productivity has to be optimized; this involves many operations and is computationally demanding. In addition, if the fermentation process changes its state due to unknown disturbances, the fixed feed rate policies used previously cannot drive the system back towards an optimal final productivity value. Neuro-dynamic programming (NDP) is an optimization method that can be used to determine the optimal final fermentation time in fed-batch bioreactors as part of the optimization process when an objective function accounting for the ending time is adopted (Valencia, Kaisare, & Lee, 2005). Dynamic programming (Bellman, 1957) is an approach to model dynamic decision problems, to analyze the structural properties of these problems, and to solve them. The fermentation process under study is envisaged and modeled as a chain of consecutive transitions from one state to another. The modeled process always occupies a state at each point in time, i.e., it can be viewed as an infinite or finite time horizon problem depending on the number of time steps considered. The way each transition is completed depends on the control or decision variable and, in stochastic problems, on a transition probability function. Each action or decision made has an associated cost or reward.
The objective of dynamic programming is to minimize the total incurred cost, obtained from the sum (or product) of the costs of the transitions needed to reach the final desired process state from an initial process state. The set of all the decisions made is called a policy. An optimal cost is obtained through a series of optimal actions. Thus, an optimal cost has an associated optimal policy. The applicability of dynamic programming (Bellman & Dreyfus, 1962) to many important practical problems is limited by the enormous size of the underlying state space. This limitation was first pointed out by Bellman and is known as Bellman’s curse of dimensionality. Neural networks have been used to infer the state space from examples (Bertsekas & Tsitsiklis, 1996; Desai, Badhe, Tambe, & Kulkarni, 2006; Valencia et al., 2005) and overcome this limitation. The coupled utilization of neural networks with dynamic programming is called neuro-dynamic programming or reinforcement learning, which is the term used in the artificial intelligence literature. In the current study the dynamic programming approach is used in conjunction with a fuzzy ARTMAP neural system to solve the singular control problem of a fed-batch fermentation process for cloned invertase production in S. cerevisiae yeast. The aim is to find the optimal control action at any time during the fermentation process, i.e., finding the maximum productivity with the minimum total fermentation time for different initial bioreactor volumes. Section 2 introduces the NDP methodology,
the fuzzy ARTMAP architecture and the optimal control model. The methodology followed to optimize the production of invertase with the current model is explained in detail in Section 3. Finally, the optimal controller performance and conclusions are presented in Sections 4 and 5, respectively.

2. Algorithms and control model

The main objective of control systems is to influence the dynamics of a system, such as a bioreactor or some other process operation, in such a way that its performance is maintained at or close to the desired state. This is accomplished by adjusting input variables to calculated values so that one or several output variables are maintained close to target conditions, subject to physical limitations or constraints. Control systems can also be used to evaluate the optimum state of the overall process by formulating and solving for the best set of operating conditions for the overall process and its particular operation conditions (Groep, Gregory, Kershenbaum, & Bogle, 2000). Many high-level control strategies applied to chemical and biological processes are model-based, i.e., a mathematical model of the process is required to build the controller and to find an adequate control action at every time step. Inverse model control and internal model control are two examples of these control strategies that are most commonly used in process engineering. A traditional approach to develop a model-based control strategy is to find a set of mathematical equations from physical and chemical principles, and to determine the values of the model parameters from process data. However, this procedure is difficult to put into operation since the number of parameters may be high, data may be scarce, and the process may be too complex and too poorly understood to be adequately described by first principle models. An alternative is to build an experimental model by using neural networks (Desai et al., 2006; Hussain & Kershenbaum, 2000; Rallo, Arenas, Ferre-Gine, & Giralt, 2002).
Neural networks can be used both to estimate and optimize chemical and biochemical processes (Ramaswamy, Cutright, & Qammar, 2005). For example, Becker, Enders, and Delgado (2002) applied a feed forward neural network for the control and optimization of beer fermentation. Chiou and Wang (2001) used a hybrid differential evolution (HDE) algorithm as an approach to state estimation, while Ronen, Shabtai, and Guterman (2002) optimized the feeding profile for a fed-batch bioreactor with an evolutionary algorithm.

2.1. Neuro-dynamic programming

The objective of NDP is to find an optimal feed rate profile π that could adapt itself when disturbances arise. This objective can be written as

$$\pi = \arg\max_{u}\,[\text{productivity} - \lambda \cdot \text{final time}] \tag{1}$$
where u belongs to the set of all possible values of the manipulated variable, in this case the substrate feed rate and λ is a positive constant that penalizes the fermentation time. In this way, the final time (tf ) of the fermentation process is included
into the objective function and into the optimization problem. The main constraint of this optimization is the total bioreactor volume that has a fixed maximum final value. Eq. (1) is suitable for optimization by dynamic programming. Objective functions of the type [productivity/final time] were also considered to penalize larger fermentation times asymptotically and to cover realistically a broader range of possible tf . The linear time-penalty in Eq. (1) is adequate for the current problem since optimal final times will be relatively large and the range covered narrow, i.e., tf ∈ [11, 15] h. Different objective functions can also be defined to include other state variables in addition to product concentration and reactor volume. For example, Chaudhuri and Modak (1998) accounted for the substrate consumed and penalized the profit with a selling price of the product relative to the cost of the substrate. The objective function proposed by Dhir, Morrow, Rhinehart, and Wiesner (2000) maximizes the rate of production of live cells over a batch run. Eq. (1) incorporates a time-penalty, which is linear in time, with the purpose of simultaneously attaining maximum productivity with minimum batch time in the production rate optimization process. This is an improvement compared to the optimization of production for fixed reaction times carried out previously (Chaudhuri & Modak, 1998; Patkar & Seo, 1993) if the value of λ can be easily estimated a priori. The hyperbolic asymptotic objective function [productivity/tf ] and Eq. (1) yield approximately equal time-penalties over the intervals of productivities ∈ [3, 4] and tf ∈ [11, 15] h expected for the current problem when λ ≈ 0.3. Valencia et al. (2005) proposed the same time-penalty weight from heuristic considerations. In dynamic programming the optimization of the objective function corresponds to the minimization of incurred costs. 
To solve this minimization problem, the different costs incurred in the transitions from a given state associated to all possible control actions should be explored. Also, to find the optimal policy for a given initial state, one has to calculate an associated cost-to-go, or “desirability” of the next state, for all system states within the state space of the process. This task is extremely computationally demanding in problems where both the number of states to be explored and the dimension of the state vector are large. A plausible solution for this curse of dimensionality is to use near-optimal methods that approximate the cost-to-go $J^{*}$ of each state $x_k$ by a parametric function such as,

$$J^{*}(x_k) = \min_{u}\,[g(x_k, x_{k+1}, u) + \tilde{J}(x_{k+1}, r)] = \min_{u}\,[g(x_k, x_{k+1}, u) + \tilde{J}(f(x_k, u), r)] \tag{2}$$
In this equation $x_{k+1}$ is the next state, $g(x_k, x_{k+1}, u)$ the cost associated to the transition from the current process state $x_k$ to the next process state $x_{k+1}$, $u$ the decision or control action, $\tilde{J}$ a map of the state space to the near-optimal cost-to-go of each state, $f(x_k, u)$ a multidimensional function describing the dynamics of the process under consideration, expressed in terms of the state of the system $x_k$ and of the manipulated variable $u$ (the substrate feed rate in the current problem), and $r$ a vector with parameters of the system under consideration. Artificial neural networks (NN), which have been shown to be an excellent tool when dealing with the complexity of chemical processes, can be used to map the state space $\tilde{J}$ (Altissimi, Brambilla, Deidda, & Semino, 1998; Bhat & McAvoy, 1990; Nascimento, Giudici, & Guardani, 2000; Rallo et al., 2002), mainly due to their intrinsic universal function approximation property (Cybenko, 1989). Fuzzy ARTMAP is a powerful cognitive classifier suitable to model complex relationships over a broad range of applications, from the phenomena of turbulence (Giralt, Arenas, Ferre-Gine, Rallo, & Kopp, 2000) to the intricate relations between molecular structure and chemical properties or activity (Espinosa, Yaffe, Arenas, Cohen, & Giralt, 2001). Fuzzy ARTMAP is a neural network architecture specialized in multidimensional category maps. It performs incremental supervised learning of categories with fuzzy operations that classify inputs according to a fuzzy set of features (Carpenter, Grossberg, Markuzon, Reynolds, & Rosen, 1992). It can also classify analog patterns that are not necessarily interpreted as a fuzzy set. The main difference with other neural network architectures is that it learns each input as it is received on-line, rather than performing an off-line optimization of a performance criterion function. Another relevant feature of fuzzy ARTMAP is that it does not require the definition of the number of neurons or of the connections between them. In the current study, the state space (cost-to-go mapping) is built based on self-determined states and cost-to-go regions, each one representing a category in the fuzzy ARTMAP architecture. A set of examples is needed to find the near-optimal $\tilde{J}$ with NN, i.e., pairs of process state vectors (for this case, four state variables OD, I, G and V) and the associated optimal cost-to-go values. To obtain these data pairs different u policies must be calculated and an approximate value for the cost-to-go evaluated.
The calculated cost-to-go has to be improved by value iteration with the Bellman equation (Bellman & Dreyfus, 1962). The on-line implementation of a controller, built according to Eqs. (1) and (2), requires the convergence of the cost-to-go function. With the converged cost-to-go function the value of the control variable can be found by solving Eq. (2) at each sampled time.

2.2. Optimal control model

The fermentation kinetics of the system under study in fed-batch cultures was reported by Patkar and Seo (1992). They also reported experimental data for the following four process state variables: cell density, expressed as optical density (OD), glucose concentration (G), invertase activity (I) and volume (V), obtained with six different glucose-feeding strategies in a bioreactor of V = 1.2 l. Thus, the state space x of the system has a dimension of four. The productivity of the fermentation process at a given time t is given by

$$\text{productivity}(t) = I(t) \cdot OD(t) \cdot V(t) \tag{3}$$
The optimization problem consists in the maximization of profitability or the minimization of operation costs for a certain feed rate u, as stated by

$$\max_{u}\,\{I(t_f) \cdot OD(t_f) \cdot V(t_f) - \lambda \cdot t_f\} \tag{4}$$

$$\min_{u}\,\{\lambda \cdot t_f - I(t_f) \cdot OD(t_f) \cdot V(t_f)\} \tag{5}$$
The main constraint for the above optimization problem is the maximum (final) bioreactor volume, $V(t_f) \le 1.2$ l. The objective is to find the optimal feeding profile π,

$$\pi = \arg\min_{u}\,\{\lambda \cdot t_f - I(t_f) \cdot OD(t_f) \cdot V(t_f)\} \tag{6}$$
To solve this problem, the NDP approach described before was used,

$$J^{*}(x) = \min_{u}\,[\lambda\,\Delta t + \tilde{J}(f(x, u), r)] \tag{7}$$

where $J^{*}$ is the optimal cost-to-go, x = (OD, G, I, V) the vector of the state of the fermentation process, u the substrate feed rate, λ the weight that penalizes the fermentation time and Δt the time step between two consecutive states. Note that the transition cost given by the term λ Δt is only a function of time. In Eq. (7), $\tilde{J}$ is the map from the state space to the cost-to-go built by the fuzzy ARTMAP system, with r being the weight vectors associated to the fuzzy ART modules, and f(x, u) a multidimensional function describing the dynamics of the process for cloned invertase fermentation given in Appendix A. The optimal feeding profile determined with the NDP approach can be written as,

$$\pi = \arg\min_{u}\,[\lambda\,\Delta t + \tilde{J}(f(x, u), r)] \tag{8}$$
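The on-line use of Eq. (8) amounts to a one-step search over candidate feed rates. The following is a minimal sketch of that search, not the authors' controller: `toy_dynamics` and `toy_cost_to_go` are hypothetical stand-ins for the process model f(x, u) of Appendix A and for the fitted fuzzy ARTMAP map, and the candidate grid of feed rates is an assumption made for illustration.

```python
import numpy as np

LAMBDA = 0.3   # time-penalty weight quoted in the text
DT = 0.1       # time step between states (h), as used in the text


def toy_dynamics(x, u, dt=DT):
    """Hypothetical stand-in for f(x, u): only the volume balance dV/dt = u is modeled."""
    od, g, inv, v = x
    return np.array([od, g, inv, v + u * dt])


def toy_cost_to_go(x):
    """Hypothetical stand-in for the fitted map: lower cost the fuller the reactor."""
    return (1.2 - x[3]) ** 2


def greedy_feed_rate(x, cost_to_go, dynamics, u_grid, lam=LAMBDA, dt=DT):
    """Pick the u in u_grid that minimizes lam*dt + J~(f(x, u)), as in Eq. (8)."""
    costs = [lam * dt + cost_to_go(dynamics(x, u, dt)) for u in u_grid]
    return float(u_grid[int(np.argmin(costs))])


x_now = np.array([1.0, 1.0, 0.0, 0.6])     # (OD, G, I, V), illustrative values only
u_grid = np.linspace(0.0, 0.2722, 28)      # assumed candidate feed rates (l/h)
u_best = greedy_feed_rate(x_now, toy_cost_to_go, toy_dynamics, u_grid)
```

Because the transition cost λ Δt is the same for every candidate, the choice is driven entirely by the cost-to-go of the successor state, which is why the quality of the fitted map matters so much.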
Once the state space or cost-to-go map is obtained, Eq. (8) can be implemented on-line into a controller in such a way that the optimal policy (π) found could adapt itself to disturbances. Thus, the invertase production optimization problem modeled is a deterministic, finite horizon NDP problem, where the overall sum of transition costs is minimized.

3. Invertase production optimization

The first step in an NDP optimization is to obtain a suboptimal cost-to-go value for each possible state of the fermentation process. In the current study the feeding policies for the invertase fermentation process were modeled based upon the calculations carried out by Patkar and Seo (1993). A total of 36 different suboptimal feeding policies were calculated for three different initial fermentation volumes V0 = 0.4, 0.6 and 0.8 l. These modeled suboptimal policies or time-sequences of the feed flow rate profile, which were chosen following the shape of the optimal feeding strategies found by both Patkar and Seo (1993) and Chaudhuri and Modak (1998), can be expressed as,

$$u(t, t_i, b) = \begin{cases} 0, & \text{if } t < t_i \\ 0.02\,(1 + b\,(t - t_i)^{2.2}), & \text{otherwise} \end{cases} \tag{9}$$

The policies defined by Eq. (9) consider that feeding begins at time $t_i$ after starting the fermentation. From this instant the feeding flow rate increases until the total bioreactor volume is reached. The fermentation process continues until the system attains its maximum profitability value. The time when this occurs is considered the optimum final time $t_f^{*}$ for a given process trajectory. Eq. (9) permits the extrapolation of the feed rate profile obtained by Patkar and Seo (1993) to a wide range of initial conditions.
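The family of suboptimal policies can be sketched in a few lines. The block below assumes that Eq. (9) reads u = 0.02 (1 + b (t − t_i)^2.2) once feeding has started and that u is in l/h; both are readings of the printed formula rather than facts confirmed by the source. The b and t_i values are the ones the section reports, and `time_to_fill` is a hypothetical helper that integrates only the volume balance dV/dt = u.

```python
from itertools import product

import numpy as np


def feed_rate(t, t_i, b):
    """Eq. (9) as read here: no feed before t_i, then 0.02*(1 + b*(t - t_i)**2.2)."""
    t = np.asarray(t, dtype=float)
    tau = np.clip(t - t_i, 0.0, None)          # elapsed feeding time, never negative
    return np.where(t < t_i, 0.0, 0.02 * (1.0 + b * tau ** 2.2))


def time_to_fill(v0, t_i, b, v_max=1.2, dt=0.01, t_end=30.0):
    """Explicit Euler on dV/dt = u; returns the first time V reaches v_max, or None."""
    v, t = v0, 0.0
    while t < t_end:
        v += float(feed_rate(t, t_i, b)) * dt
        t += dt
        if v >= v_max:
            return t
    return None


B_VALUES = (0.05, 0.07, 0.10, 0.13)
T_I_VALUES = tuple(range(1, 10))               # feeding start times (h)
V0_VALUES = (0.4, 0.6, 0.8)                    # initial volumes (l)

policies = list(product(T_I_VALUES, B_VALUES))     # 9 x 4 = 36 policies
runs = list(product(V0_VALUES, policies))          # 3 x 36 = 108 simulations
t_full = time_to_fill(0.6, 3, 0.05)                # fill time for u(t, 3, 0.05)
```

Under this reading the reactor started at 0.6 l fills shortly before the final time of 12.7 h that the paper reports for u(t, 3, 0.05), which is at least consistent with the statement that fermentation continues after the maximum volume is reached.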
Fig. 1. Optimal policies for a bioreactor initial volume V0 = 0.6 l. (– – –) Examples of suboptimal policies calculated by Eq. (9) for initial times of 6 and 7 h; (—) Chaudhuri and Modak (1998); (– · –) Patkar and Seo (1993).
The rate of change of the feed flow rate with time is governed by the parameter b. The values of b and $t_i$ considered to generate different suboptimal policies were b = [0.05, 0.07, 0.10, 0.13] and $t_i$ = [1, 2, 3, 4, 5, 6, 7, 8, 9]. Fig. 1 shows examples of different policies for the initial volume of 0.6 l and feed flow starting at $t_i$ = 5 and 7 h. This figure also includes the optimal policies obtained by Patkar and Seo (1993) and Chaudhuri and Modak (1998). The response of the states of the fermentation process and its associated profit curve when the suboptimal policy u(t, 3, 0.05) was used are shown in Fig. 2, where the reaction pathway expressed in terms of substrate and product concentrations and the reactor volume are plotted versus time. This figure illustrates the evolution of the profit

$$\text{profit} = I(t_f) \cdot OD(t_f) \cdot V(t_f) - \lambda \cdot t_f \tag{10}$$
for λ = 0.3 and V0 = 0.6 l. A total of 9328 state points were generated through 108 simulations of the invertase fermentation process using the 36 suboptimal policies produced by Eq. (9). A suboptimal cost-to-go was calculated for each of the 9328 state points generated with the 36 suboptimal policies considered. A final optimum time $t_f^{*}$ was determined for each policy and cost-to-go values calculated with

$$J(x) = \lambda \cdot (t_f^{*} - t_x) - I(t_f^{*}) \cdot OD(t_f^{*}) \cdot V(t_f^{*}) \tag{11}$$
where $t_x$ is the time of the process associated to state x. Fuzzy ARTMAP was then applied to fit surfaces to the cost-to-go data obtained. A hypercube of the state space, limited by the hyperplanes defined by the maximum and minimum values of each of the state variables, was chosen. Those minimum and maximum values were obtained from the set of all states calculated for all suboptimal policies of the fermentation process. The neural system considered four input variables (the four process state variables OD, G, I and V) and one output variable (the cost-to-go value). Backpropagation feedforward neural networks were also applied in this study for comparison purposes, despite the fact that they did not yield smooth
Fig. 2. Fermentation dynamics calculated by Eq. (12) in terms of cell, glucose and invertase concentrations, and V0 = 0.6 l for a policy u(t, 3, 0.05). The associated profits were calculated by Eq. (10) with λ = 0.3.
and continuous optimal feed rate profiles in a previous study (Valencia et al., 2005). The NN approximations of the minimum cost-to-go values for each visited state x are identified by $\tilde{J}(x)$. All data were first preprocessed (normalized and complement coded) before fitting the cost-to-go surfaces with fuzzy ARTMAP for different state conditions. Training proceeded by presenting a vector with the four process state variables to the fuzzy ARTA module and the corresponding cost-to-go value to fuzzy ARTB. Fast learning was used and the baseline of the vigilance parameter for ARTA was initially set at 0.1. The vigilance parameter for ARTB and the map field were both set at 0.95. The set of corresponding input and output data was presented randomly to both fuzzy ART modules. The training process evolved in each fuzzy ART module according to the set of fuzzy rules of classification of the input and output patterns presented until stability of classes was reached. At this point fuzzy ARTA found 7991 categories among the 9328 state points and fuzzy ARTB classified the corresponding cost-to-go values in 3597 categories. The trained fuzzy ARTMAP fitted the cost-to-go data set with an acceptable relative mean error of 2.0%.
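The construction of the training pairs described above (a state vector and its suboptimal cost-to-go from Eq. (11), followed by normalization and complement coding) can be sketched as follows. The trajectory is synthetic and purely hypothetical; it stands in for one simulated run of the model in Appendix A. Complement coding is the standard fuzzy ART preprocessing step that maps a normalized vector a to (a, 1 − a).

```python
import numpy as np

LAMBDA = 0.3  # time-penalty weight quoted in the text


def cost_to_go_labels(t, od, inv, vol, lam=LAMBDA):
    """Eqs. (10)-(11): locate t_f* at the profit maximum, then label every earlier state."""
    profit = inv * od * vol - lam * t                  # Eq. (10) at every sample
    k_star = int(np.argmax(profit))                    # index of the optimum final time
    final_productivity = inv[k_star] * od[k_star] * vol[k_star]
    labels = lam * (t[k_star] - t[: k_star + 1]) - final_productivity   # Eq. (11)
    return k_star, labels


def complement_code(states, lo, hi):
    """Min-max normalize each column to [0, 1] and append the complement 1 - a."""
    a = (states - lo) / (hi - lo)
    return np.hstack([a, 1.0 - a])


# Hypothetical smooth trajectory sampled every 0.1 h (not data from the paper).
t = np.linspace(0.0, 15.0, 151)
od = 8.0 * (1.0 - np.exp(-t / 4.0))
g = np.exp(-t)
inv = 1.0 - np.exp(-t / 3.0)
vol = np.minimum(0.6 + 0.06 * t, 1.2)

k_star, labels = cost_to_go_labels(t, od, inv, vol)
states = np.column_stack([od, g, inv, vol])[: k_star + 1]      # x = (OD, G, I, V)
coded = complement_code(states, states.min(axis=0), states.max(axis=0))
```

Along one trajectory consecutive labels differ by exactly λ Δt, and every complement-coded row sums to the number of state variables, which is the property fuzzy ART relies on to keep input norms constant.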
Table 1
Fuzzy ARTMAP characteristics and categories for the fitted cost-to-go surfaces used in Bellman’s iteration

Surface            Fuzzy ARTA baseline     Fuzzy ARTB              Map field               Categories found
                   vigilance parameter     vigilance parameter     vigilance parameter     ARTA        ARTB
$\tilde{J}^{1}$    0.1                     0.95                    0.95                    7991        3597
$\tilde{J}^{2}$    0.1                     0.95                    0.95                    3161        1162
$\tilde{J}^{3}$    0.1                     0.95                    0.95                    2982        891
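The role of the vigilance parameters in Table 1 can be illustrated with a single unsupervised fuzzy ART module. This is a minimal sketch of the standard category choice, vigilance test and fast-learning update, not the full fuzzy ARTMAP used in the study: the map field and match tracking are omitted, and the function name, the choice parameter `alpha` and the five one-dimensional inputs are assumptions made for illustration.

```python
import numpy as np


def fuzzy_art_fit(inputs, rho, alpha=0.001):
    """Cluster complement-coded rows with one fuzzy ART module and fast learning."""
    weights, labels = [], []
    for i_vec in inputs:
        # Rank existing categories by the choice function |I ^ w| / (alpha + |w|).
        order = sorted(
            range(len(weights)),
            key=lambda j: -np.minimum(i_vec, weights[j]).sum() / (alpha + weights[j].sum()),
        )
        chosen = None
        for j in order:
            match = np.minimum(i_vec, weights[j]).sum() / i_vec.sum()
            if match >= rho:                                  # vigilance test passed
                weights[j] = np.minimum(i_vec, weights[j])    # fast learning (beta = 1)
                chosen = j
                break
        if chosen is None:                                    # no category resonates
            weights.append(i_vec.copy())
            chosen = len(weights) - 1
        labels.append(chosen)
    return weights, labels


a = np.array([[0.10], [0.12], [0.50], [0.52], [0.90]])   # already normalized to [0, 1]
coded = np.hstack([a, 1.0 - a])                          # complement coding

w_strict, labels_strict = fuzzy_art_fit(coded, rho=0.95)
w_loose, labels_loose = fuzzy_art_fit(coded, rho=0.10)
```

With the high vigilance only nearby inputs share a category, while the low vigilance lets one category absorb all five points. In the full ARTMAP, match tracking can raise the ARTA vigilance above its 0.1 baseline during training, which this sketch does not reproduce.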
The cost-to-go surface captured by fuzzy ARTMAP can be improved through Bellman iteration, which in terms of Eq. (2) can be written as

$$J^{i+1}(x_k) = \min_{u \in [0, u_{\max}]} [g(x_k, x_{k+1}, u) + \tilde{J}^{i}(f(x_k, u), r)] \tag{12}$$
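The iteration of Eq. (12) alternates a sweep over the sampled states with a refit of the approximator. The sketch below shows that loop on a deliberately reduced toy problem and is not the authors' implementation: the state is the volume alone, the only cost is λ Δt per step until the reactor is full, linear interpolation (`np.interp`) stands in for the fuzzy ARTMAP fit, and the upper bound on u is assumed to be the smaller of the pump limit and the rate that would fill the reactor in one step.

```python
import numpy as np

LAMBDA, DT, V_MAX, PUMP_MAX = 0.3, 0.1, 1.2, 0.2722


def u_upper(v):
    """Assumed bound on u: pump limit, or whatever fills the reactor within one step."""
    return min(PUMP_MAX, max(V_MAX - v, 0.0) / DT)


def bellman_sweep(states, j_values, n_u=15):
    """One pass of Eq. (12); np.interp over (states, j_values) plays the role of J~^i."""
    j_new = np.zeros_like(j_values)
    for k, v in enumerate(states):
        if v >= V_MAX:                      # full reactor: no further cost in this toy
            continue
        u_grid = np.linspace(0.0, u_upper(v), n_u)
        j_successors = np.interp(v + u_grid * DT, states, j_values)
        j_new[k] = LAMBDA * DT + j_successors.min()
    return j_new


states = np.linspace(0.6, V_MAX, 61)
j_values = LAMBDA * (V_MAX - states) / 0.05      # cost-to-go of a slow 0.05 l/h feed
j_start = j_values.copy()

for sweep in range(200):
    j_next = bellman_sweep(states, j_values)
    change = float(np.mean(np.abs(j_next - j_values)))   # mean absolute update
    j_values = j_next
    if change < 1e-6:
        break
```

Starting from the cost-to-go of a poor policy, each sweep lowers the estimates until they stop changing; the mean absolute update plays the same role as the convergence criterion described in the text, although the threshold used here is arbitrary.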
In Eq. (12), $u_{\max}$ is the maximum value between $(1.2 - V_k)/\Delta t$ and 0.2722 l/h, with $V_k$ being the current fermentation volume and Δt the time step between states k. This time step was kept constant and equal to 0.1 h. Note again that the transition cost from one state k to the next k + 1 is only a function of time and that it reflects the difference in the cost-to-go values associated to two consecutive states of one process trajectory. After each Bellman iteration i was completed, a new fuzzy ARTMAP network was fitted to the cost-to-go data by using the new improved cost-to-go values $\tilde{J}^{i}$ for each state. In the current study three iterations (i = 3) were needed to find a good cost-to-go approximation. The termination condition for the Bellman iteration procedure, i.e., the difference between successive cost-to-go approximations, was fixed to less than ε = 0.2. The values of the respective convergence criteria $\sum_{n=1}^{9328+n} |J^{i+1}(x) - \tilde{J}^{i}(x)|/N$ for the three Bellman iterations were 4.15, 3.69 and 4.38, respectively, with n being the number of new visited states. The characteristics of the best fuzzy ARTMAP architectures found at each iteration are listed in Table 1. Upon completion of the iteration procedure, the final cost-to-go approximation $\tilde{J}$ given by fuzzy ARTMAP was implemented on-line into a controller system consisting of an on-line implementation of the Bellman Eq. (12), resulting in a new feeding policy for the fermentation process. The results of this procedure are presented in the next section.

4. Results and discussion

The controller system was operated in such a way that a substrate feed rate profile could be determined from a cost-to-go policy evaluated by the trained fuzzy ARTMAP network and improved by the on-line implementation of Bellman’s Eq. (12). The training information included the cell (OD), glucose (G) and invertase (I) concentrations and the fermentation volume (V).
The performance of the optimal controller system was checked for known and unknown fermentation process dynamics, and for an unexpected disturbance, as indicated below.
(i) Known process dynamics: The fermentation process was started with an initial volume V0 = 0.6 l, and the initial state was one of the process trajectories used in the training of the cost-to-go fuzzy ARTMAP approximation.
(ii) Unknown process dynamics: The feed policy was evaluated for several initial fermentation volumes and profit was optimized at other different interpolated initial conditions.
(iii) Dynamics under an unexpected disturbance: The substrate feed flow rate was recalculated to maintain the profit at the highest value when a decrease of substrate concentration occurred in the middle of a fermentation batch.

Table 1 (continued). Mean relative training error (%): $\tilde{J}^{1}$, 2.0; $\tilde{J}^{2}$, 4.8; $\tilde{J}^{3}$, 1.4.

4.1. Performance for known process conditions

The profit and the production rate of a fermentation process trajectory calculated for an initial bioreactor volume of V0 = 0.6 l using the fuzzy ARTMAP-NDP method are shown in Table 2 together with the values reported in the literature or obtained with other optimization methods. The fermentation time of 12.7 h found in this study with the fuzzy ARTMAP-NDP method yielded the highest productivity of 7.52, with a relatively high profit of 3.71. This time is slightly higher than the experimental time of 12 h reported by Patkar and Seo (1993) (productivity = 7.33; profit = 3.74), which was used in the productivity optimization carried out later by Chaudhuri and Modak (1998) (productivity = 7.10; profit = 3.50). The backpropagation-NDP method yielded the shortest fermentation time of 11.5 h and, thus, the highest profit of 3.80 with a productivity of 7.25, which is on the lower side. The backpropagation results, which are close to the best suboptimal policy, confirm those reported by Valencia et al. (2005). The final fermentation time was part of both NDP optimizations. The optimal policies obtained in the current study for V0 = 0.6 l using both NN-NDP controllers are plotted in Fig. 3 together with those reported previously. This figure shows that while the policies of Patkar and Seo (1993) and the current fuzzy ARTMAP-NDP are smooth and continuous, those corresponding to Chaudhuri and Modak (1998) and to backpropagation-NDP are step-wise or discontinuous.
The highly discontinuous trajectory of the manipulated variable (feed rate) in the backpropagation-NDP policy makes this method unsuitable for practical purposes. When a filter was added to smooth the controller output, a profit of 3.12 instead of 3.80 was obtained.

Table 2
Invertase production optimization results for an initial fermentation volume V0 = 0.6 l

Policy                                 Profit    Productivity    Final time (h)
Patkar and Seo (1993)                  3.74      7.33            12
Chaudhuri and Modak (1998)             3.50      7.10            12
Backpropagation-NDP; u(t, 2, 0.05)     3.80      7.25            11.5
Fuzzy ARTMAP-NDP; u(t, 3, 0.05)        3.71      7.52            12.7
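The entries of Table 2 can be cross-checked against the profit definition of Eq. (10) with λ = 0.3. The short sketch below recomputes the profit from the reported productivity and final time; the dictionary layout and the function name are illustrative choices, while the numbers are those of Table 2.

```python
LAMBDA = 0.3  # time-penalty weight used in the paper

# policy: (reported profit, reported productivity, reported final time in h)
table2 = {
    "Patkar and Seo (1993)": (3.74, 7.33, 12.0),
    "Chaudhuri and Modak (1998)": (3.50, 7.10, 12.0),
    "Backpropagation-NDP": (3.80, 7.25, 11.5),
    "Fuzzy ARTMAP-NDP": (3.71, 7.52, 12.7),
}


def profit(productivity, t_final, lam=LAMBDA):
    """Eq. (10): productivity at the final time minus the linear time penalty."""
    return productivity - lam * t_final


# Difference between the recomputed and the reported profit for each policy.
residuals = {
    name: profit(prod, t_f) - reported
    for name, (reported, prod, t_f) in table2.items()
}
```

All four recomputed profits agree with the reported ones to within rounding of the second decimal, which also makes the trade-off explicit: the fuzzy ARTMAP-NDP policy gains 0.27 in productivity over backpropagation-NDP but pays 0.36 for the extra 1.2 h.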
Table 3
Productivities and profits obtained for known (V0 = 0.4, 0.6 and 0.8 l) and interpolated (V0 = 0.5 and 0.7 l) initial fermentation volumes (V0) when the fuzzy ARTMAP-NDP controller or the Patkar and Seo (1993) policy is applied

          Fuzzy ARTMAP-NDP                  Fuzzy ARTMAP-NDP                  Fuzzy ARTMAP-NDP                  Patkar and Seo (1993)
          u(t, 2, 0.05)                     u(t, 3, 0.05)                     u(t, 4, 0.05)
V0 (l)    tf (h)   Productivity (profit)    tf (h)   Productivity (profit)    tf (h)   Productivity (profit)    tf (h)   Productivity (profit)
0.4       12.9     8.00 (4.13)              13.9     8.46 (4.29)              14.9     8.69 (4.22)              12.5     6.64a (2.89)a
0.5       12.4     7.60 (3.88)              13.3     7.95 (3.96)              14.3     8.13 (3.84)              12.4     7.46a (3.74)a
0.6       11.7     7.23 (3.72)              12.7     7.52 (3.71)              13.7     7.68 (3.60)              12       7.33 (3.74)
0.7       11.1     6.91 (3.58)              12.1     7.17 (3.54)              13       7.33 (3.40)              11.5     7.03a (3.59)a
0.8       10.2     6.39 (3.33)              11.2     6.70 (3.34)              12.1     6.84 (3.48)              10.8     6.67a (3.43)a

a Values extrapolated (scaled) from the experimental policy determined by Patkar and Seo (1993) for V0 = 0.6 l by using Eq. (9).
The results given in Table 2 and Fig. 3 for V0 = 0.6 l illustrate that the fuzzy ARTMAP-NDP controller yields the maximum productivity of 7.52, with feeding starting 2 h after the beginning of the fermentation process. At this time t = 2 h the microorganism population has increased significantly at the expense of the initial glucose substrate concentration and the input of substrate is needed. The feed rate beyond this time increases at a nearly constant rate until the final productivity of 7.52 is achieved at t = 12.7 h, when the process stops. It should be noted that the backpropagation-NDP policy considers an earlier delta-like input of glucose followed by a sharp supply 5.8 h after the beginning of the fermentation process. At this time the glucose concentration is too low and the demand of the microorganisms too high. As a result, the controller choice is to saturate its output by opening the valve totally and suddenly at t = 5.8 h, followed by a decrease of the flow rate afterwards to approximately 0.13 l/min and to stop feeding at t = 8.2 h. After a short period of no substrate addition, this highly discontinuous feed strategy is repeated until the final process time t = 11.5 h is reached. The productivities given in Table 2 and those reported thereafter in the remainder of the manuscript can be considered as production rates, expressed in terms of daily units of enzyme produced, since reaction times span between 11 and 15 h and a schedule of one
Fig. 3. Optimal policies for an initial volume V0 = 0.6 l. (—) Fuzzy ARTMAPNDP; (- - -) backpropagation-NDP; (– ·· –) Chaudhuri and Modak (1998); (– – –) Patkar and Seo (1993).
batch/day is likely to be adopted. Thus, the productivities (production rates) given in Table 2 for V0 = 0.6 l confirm that the fuzzy ARTMAP-NDP controller yields the highest performance followed by the experimental model developed by Patkar and Seo (1993). This result together with the more realistic feed policy determined by fuzzy ARTAMP compared to that provided by the backpropagation algorithm in terms of fermentation control and operation (see Fig. 3), justifies the solely consideration of the fuzzy ARTMAP-NDP controller in the following subsections. 4.2. Performance for unknown process conditions NDP optimization offers the possibility to apply the controller built for certain bioreactor conditions to other different interpolated conditions of the same fermentation process since cost is a state variable and operational changes imply altering only initial process conditions. Other optimization procedures would require the solution of the respective additional optimal control problems. This would be computationally demanding given the nonlinear dynamics of the optimizations involved. To demonstrate the versatility of the current approach, the fuzzy ARTMAP-NDP controller was trained with several suboptimal policies u(t, ti , b) defined by Eq. (9) for the three initial volumes V0 = 0.4, 0.6 and 0.8 l, and tested for other different interpolated initial fermentation volumes V0 ∈ [0.4, 0.8]. Table 3 shows the optimization results obtained for the three training initial volumes V0 = 0.4, 0.6 and 0.8 l, for two of the interpolated initial volumes V0 = 0.5 and 07 l, as well as for those obtained by extrapolating (scaling) the policy for V0 = 0.6 l reported by Patkar and Seo (1993) to other initial volumes within the interval V0 ∈ [0.4, 0.8] l by means of Eq. (9). 
The feed rate policies determined for all interpolated initial volumes, including initial conditions not reported and discussed here, were smooth and consistent with the fermentation state space and process dynamics. The fuzzy ARTMAP-NDP interpolated policies yielded in most cases profit and productivity values higher than those obtained by extrapolation of the experimental policy for V0 = 0.6 l reported by Patkar and Seo (1993), as illustrated in Table 3. Table 3 indicates that maximum profits are obtained for V0 = 0.4 l with feeding policies starting at 2, 3 and 4 h, with the largest profit of 4.29 attained for u(t, 3, 0.05). The maximum productivity (production rate in enzyme units/day for one batch/day) of 8.69 is also attained for V0 = 0.4 l, but with u(t, 4, 0.05). Current best policies yield profits and production rates that are about 15% higher than those obtained with the experimental best policies determined by Patkar and Seo (1993), which peak at V0 = 0.6 and 0.5 l, respectively. It should be noted that the objective function given by Eq. (11) was chosen because it was the simplest one that optimizes profit by considering productivity and reaction time simultaneously, in contrast to previous studies where productivities were optimized for fixed reaction times. The state trajectories followed by the controlled fermentation process for V0 = 0.6 l and u(t, 3, 0.05) are plotted in Fig. 4, while optimal trajectories for V0 = 0.4, 0.6 and 0.8 l, and u(t, 2, 0.05) and u(t, 3, 0.05), are shown in Fig. 5. All feeding trajectories are smooth and adequate for implementation in a fermentation plant. The best suboptimal policy applied to make initial guesses at all reaction times for the three initial volumes V0 = 0.4, 0.6 and 0.8 l considered for training the NDP system was the single one parameterized by combinations of b = [0.05, 0.07, 0.1, 0.13] and ti = [1, 2, ..., 9] h. Fig. 5 shows that the feed policies u(t, 2, 0.05) found for V0 = 0.4, 0.6 and 0.8 l start feeding earlier than the feed policies obtained with u(t, 3, 0.05). Also, the feed rate for u(t, 2, 0.05) is greater than that for u(t, 3, 0.05) at any time, while the final time of the former is shorter for the above starting conditions. The same behavior was observed both for the interpolated initial volumes included in Table 3 and for other initial conditions not discussed here.

Fig. 4. State space representation of the optimal trajectories followed by the fermentation process for V0 = 0.6 l.

Fig. 5. Optimal policies u(t, ti, 0.05) obtained with the fuzzy ARTMAP-NDP based controller for V0 = 0.4, 0.6 and 0.8 l and ti = 2 and 3 h.

4.3. Performance under an unexpected process disturbance
To further explore the effectiveness of the proposed methodology, the performance of the fuzzy ARTMAP-NDP controller was tested under an unknown disturbance for V0 = 0.4 l. A 30% decrease in substrate concentration was imposed at the intermediate time t = 6 h, as shown in Fig. 6a and b for the Patkar and Seo (1993) and current feed policies, respectively. These figures summarize the time evolution of the four fermentation process variables and of the two glucose feed rate policies. It is clear that both controllers sense this abrupt change of state and quickly adapt to the new state. The controller performance is hindered when using the Patkar and Seo (1993) policy obtained with the conjugate gradient method, since its response towards a new optimal process trajectory in Fig. 6a is not as progressive and smooth as that observed for the fuzzy ARTMAP-NDP controller in Fig. 6b. In the former case the profit and productivity achieved are 2.77 and 6.52, respectively, compared with the significantly higher values of 4.06 and 8.23 attained with the current controller.
Fig. 6. Fermentation process behavior when an abrupt drop of glucose concentration occurs at time t = 6 h for V0 = 0.4 l. Evolution of the controlled process when using (a) Patkar and Seo model (1993) and (b) the fuzzy ARTMAP-NDP model.
Clearly, the fuzzy ARTMAP-NDP controller deals successfully with unexpected disturbances, primarily due to the feedback action implicitly implemented in Eq. (12), i.e., the actual process state is considered when deciding the optimal control action to follow.
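The role of feedback described above can be illustrated with a generic one-step lookahead rule: at every decision time the controller evaluates each candidate feed rate on the state it currently observes and picks the one whose successor state has the best estimated value. The Python sketch below is only an illustration of that idea and is not the authors' Eq. (12); the scalar toy model, the value function and all names (`greedy_action`, `toy_step`, `toy_value`, `FEEDS`) are hypothetical stand-ins for the fermentation simulator and the fuzzy ARTMAP approximation.

```python
from typing import Callable, Sequence


def greedy_action(
    state: float,
    candidates: Sequence[float],
    step: Callable[[float, float], float],
    value: Callable[[float], float],
) -> float:
    """Return the candidate action whose successor state has the highest value."""
    return max(candidates, key=lambda u: value(step(state, u)))


# Hypothetical toy problem: the state is a substrate level, each step
# consumes 0.5 units, and the value estimate prefers a level close to 1.0.
def toy_step(substrate: float, feed: float) -> float:
    return substrate + feed - 0.5


def toy_value(substrate: float) -> float:
    return -(substrate - 1.0) ** 2


FEEDS = [0.0, 0.25, 0.5, 0.75, 1.0]

nominal = greedy_action(1.0, FEEDS, toy_step, toy_value)
# Same rule applied after a 30% drop in the observed substrate level.
disturbed = greedy_action(0.7, FEEDS, toy_step, toy_value)
print(nominal, disturbed)
```

Because the rule is evaluated on the observed state, the selected feed rate changes as soon as the disturbance is seen, whereas a precomputed open-loop profile would not react.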
5. Conclusions

The study of the controlled manipulation of the glucose feed rate into a fed-batch bioreactor to produce cloned invertase in S. cerevisiae yeast has shown that neuro-dynamic programming coupled with a fuzzy ARTMAP classifier is a good alternative to standard optimization methods previously applied to fermentation processes. The fuzzy ARTMAP-NDP controller, which starts all optimizations from suboptimal feed rate control policies and results in incremental smooth changes in the feed rate profiles, yields profits that outperform those reported previously in the literature. The interpolation of the cost surface in the state space performed by the fuzzy ARTMAP algorithm, together with Bellman's iteration, mitigates the curse of dimensionality of the calculations and facilitates the development of a control system that is robust to disturbances such as a change of substrate concentration in the bioreactor at a given time. The feeding profiles obtained are similar in trend to those reported previously in the literature and implemented experimentally. The proposed fuzzy ARTMAP-NDP methodology could be readily applied to a wide range of processes by selecting objective functions that best capture the non-linear nature of each control problem.

Acknowledgements

The authors are grateful for the financial support received from the "Dirección General de Investigación Científica y Técnica", projects PPQ2000-1339 and PPQ2001-1519, and from the CIRIT "Programa de Grups de Recerca Consolidats de la Generalitat de Catalunya", projects 2000SGR-00103, 2001SGR-00324 and 2005SGR-00735. Catalina Valencia was the recipient of the Fellowship AP98-02611927 from the Ministerio de Educación y Ciencia of Spain. This work was also supported with funds from the Distinguished Researcher Award granted to Francesc Giralt by the Catalan Government.

Appendix A

Mass balance equations for cloned invertase production in a fed-batch bioreactor:

\[
\begin{aligned}
\frac{d(G \cdot V)}{dt} &= u\,G_F - R_t \cdot OD \cdot V; \\
\frac{d(OD \cdot V)}{dt} &= (R_r Y_{OD,r} + R_f Y_{OD,f}) \cdot OD \cdot V; \\
\frac{d(I \cdot OD \cdot V)}{dt} &= (\Phi - k_d I) \cdot OD \cdot V; \\
\frac{dV}{dt} &= u
\end{aligned}
\tag{a.1}
\]

In these equations u is the feed flow rate (l/min), GF the glucose feed concentration (g/l), OD the cell concentration (optical density), Rt the glucose uptake rate, Rr the respiratory flux of glucose, Rf the fermentative flux of glucose, and YOD,r and YOD,f the cell mass yields for the respiratory and fermentative fluxes, respectively. In the current and previous studies these yields were assumed constant and equal to YOD,r = 0.6 and YOD,f = 0.15 (OD/g glucose), with kd = 1.85.
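For numerical work it can help to rewrite the balances of Eq. (a.1) in terms of the individual state variables. The expansion below is a sketch obtained by applying the product rule to each balance and substituting dV/dt = u; the shorthand μ for the bracketed growth term is introduced here for convenience and is not a symbol used in the original equations.

```latex
\begin{aligned}
\mu &\equiv R_r Y_{OD,r} + R_f Y_{OD,f} \\
\frac{dG}{dt} &= \frac{u\,(G_F - G)}{V} - R_t \cdot OD \\
\frac{dOD}{dt} &= \left(\mu - \frac{u}{V}\right) OD \\
\frac{dI}{dt} &= \Phi - (k_d + \mu)\, I \\
\frac{dV}{dt} &= u
\end{aligned}
```

In this form the dilution term u/V appears in the glucose and cell balances but cancels in the invertase balance, because I is expressed per unit of cell mass.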
A.1. Glucose rates

The respiratory flux Rr in Eq. (a.1) is described by the following Monod-type equation:

\[
R_r = \frac{0.55\,G}{0.05 + G}
\tag{a.2}
\]

while the glucose uptake rate Rt and the rate Φ are given by

\[
R_t = \max\left[\frac{1.25\,G}{0.95 + G},\; R_r\right]; \qquad
\Phi = \frac{6.25\,G}{0.1 + G + 2G^2}
\tag{a.3}
\]

The fermentative flux of glucose Rf is inferred from the two fluxes above by

\[
R_f = R_t - R_r
\tag{a.4}
\]
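The rate expressions (a.2) to (a.4) and the balances of Eq. (a.1) translate into a few short functions. The Python sketch below is one possible transcription, not the authors' implementation. It assumes that the uptake rate uses 1.25G/(0.95 + G) and the rate Φ uses 6.25G/(0.1 + G + 2G²), and it writes the balances in expanded form (product rule with dV/dt = u). Function and constant names are illustrative. The feed concentration GF = 10.0 g/l is the value quoted with the initial conditions of this appendix.

```python
# Constants quoted in the appendix text.
Y_OD_R = 0.6    # cell mass yield on the respiratory flux (OD/g glucose)
Y_OD_F = 0.15   # cell mass yield on the fermentative flux (OD/g glucose)
K_D = 1.85      # kd in Eq. (a.1)
G_FEED = 10.0   # glucose feed concentration GF (g/l)


def respiratory_flux(G: float) -> float:
    """Rr, Eq. (a.2)."""
    return 0.55 * G / (0.05 + G)


def uptake_rate(G: float) -> float:
    """Rt, Eq. (a.3); assumed reading of the numerator (1.25 G)."""
    return max(1.25 * G / (0.95 + G), respiratory_flux(G))


def fermentative_flux(G: float) -> float:
    """Rf = Rt - Rr, Eq. (a.4)."""
    return uptake_rate(G) - respiratory_flux(G)


def expression_rate(G: float) -> float:
    """Phi, Eq. (a.3); assumed reading of the numerator (6.25 G)."""
    return 6.25 * G / (0.1 + G + 2.0 * G ** 2)


def derivatives(G: float, OD: float, I: float, V: float, u: float):
    """Time derivatives (dG, dOD, dI, dV) from Eq. (a.1) in expanded form."""
    Rr = respiratory_flux(G)
    Rt = uptake_rate(G)
    Rf = Rt - Rr
    mu = Rr * Y_OD_R + Rf * Y_OD_F       # growth term of the cell balance
    dG = u * (G_FEED - G) / V - Rt * OD  # feed and dilution minus uptake
    dOD = (mu - u / V) * OD              # growth minus dilution
    dI = expression_rate(G) - (K_D + mu) * I
    dV = u
    return dG, dOD, dI, dV


# Example: derivatives at G = 5.0 g/l, OD = 0.15, I = 0.1 with no feed,
# for an illustrative volume of 0.6 l.
print(derivatives(5.0, 0.15, 0.1, 0.6, 0.0))
```

At low glucose concentrations the max in `uptake_rate` makes the uptake equal to the respiratory flux, so the fermentative flux is zero and growth is purely respiratory.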
A.2. Initial conditions1

Glucose concentration: G(0) = 5.0 g/l; feed glucose concentration: GF = 10.0 g/l; cell concentration: OD(0) = 0.15; invertase concentration: I(0) = 0.1 units/OD ml.

1 Only case B of Patkar and Seo (1993) has been considered in the current study.

A.3. Optimization

The profitability is maximized according to

\[
\max_{u \in [0,\,u_{\max}]} \left\{ \left. I \cdot OD \cdot V \right|_{t_f} - \lambda \cdot t_f \right\}; \qquad
u_{\max} = \max\left[\frac{1.2 - V}{\Delta t},\; 0.2722\right]
\tag{a.5}
\]

where tf (h) is the fermentation ending time, λ = 0.3 and Δt = 0.1 h. The rates of change of the four state variables OD, I, G and V depend on the state of the system that they define and on the feed rate policy u.

References

Altissimi, R., Brambilla, A., Deidda, A., & Semino, D. (1998). Optimal operation of a separation plant using artificial neural networks. Computers and Chemical Engineering, 22, 939–942.
Balsa-Canto, E., Banga, J. R., Alonso, A. A., & Vassiliadis, V. S. (2000). Efficient optimal control of bioprocesses using second-order information. Industrial & Engineering Chemistry Research, 39, 4287–4295.
Bhat, N., & McAvoy, T. (1990). Use of neural nets for dynamic modeling and control of chemical process systems. Computers and Chemical Engineering, 14, 573–583.
Becker, T., Enders, T., & Delgado, A. (2002). Dynamic neural networks as a tool for the online optimization of industrial fermentation. Bioprocess and Biosystems Engineering, 24, 347–354.
Bellman, R. E. (1957). Dynamic programming. Princeton University Press.
Bellman, R. E., & Dreyfus, S. E. (1962). Applied dynamic programming. Princeton University Press.
Bertsekas, D. P., & Tsitsiklis, J. (1996). Neuro-dynamic programming. Athena Scientific.
Carpenter, G. A., Grossberg, S., Markuzon, N., Reynolds, J. H., & Rosen, D. B. (1992). Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps. IEEE Transactions on Neural Networks, 3, 698–713.
Chaudhuri, B., & Modak, J. M. (1998). Optimization of fed-batch bioreactor using neural network model. Bioprocess Engineering, 19, 71–79.
Chiou, J., & Wang, F. (2001). Estimation of Monod model parameters by hybrid differential evolution. Bioprocess and Biosystems Engineering, 24, 109–113.
Cybenko, G. (1989). Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals, and Systems, 2, 303–314.
Desai, K., Badhe, Y., Tambe, S., & Kulkarni, B. D. (2006). Soft-sensor development for fed-batch bioreactors using support vector regression. Biochemical Engineering Journal, 27, 225–239.
Dhir, S., Morrow, K. J., Rhinehart, R. R., & Wiesner, T. (2000). Dynamic optimization of hybridoma growth in a fed-batch bioreactor. Biotechnology and Bioengineering, 67(2), 197–205.
Espinosa, G., Yaffe, D., Arenas, A., Cohen, Y., & Giralt, F. (2001). A fuzzy ARTMAP based quantitative structure–property relationships (QSPRs) for predicting physical properties of organic compounds. Industrial & Engineering Chemistry Research, 40, 2757–2766.
Georgieva, O., Hristozov, I., Pencheva, T., Tzonkov, St., & Hitzmann, B. (2003). Mathematical modeling and variable structure control systems for fed-batch fermentation of Escherichia coli. Chemical and Biochemical Engineering, 17(4), 293–299.
Giralt, F., Arenas, A., Ferre-Gine, J., Rallo, R., & Kopp, G. A. (2000). The simulation and interpretation of turbulence with a cognitive neural system. Physics of Fluids, 12, 1826–1836.
Groep, M. E., Gregory, M. E., Kershenbaum, L. S., & Bogle, I. D. L. (2000). Performance modeling and simulation of biochemical processes with interacting unit operations. Biotechnology and Bioengineering, 67, 300–311.
Hussain, M. A., & Kershenbaum, L. S. (2000). Implementation of an inverse-model-based control strategy using neural networks on a partially simulated exothermic reactor. Chemical Engineering Research and Design, 78, 299–311.
Nascimento, C. A., Giudici, R., & Guardani, R. (2000). Neural network based approach for optimisation of industrial chemical processes. Computers and Chemical Engineering, 24, 2303–2314.
Patkar, A., & Seo, J. (1992). Fermentation kinetics of recombinant yeast in batch and fed-batch cultures. Biotechnology and Bioengineering, 40, 103–109.
Patkar, A., & Seo, J. (1993). Modeling and optimization of cloned invertase expression in Saccharomyces cerevisiae. Biotechnology and Bioengineering, 41, 1066–1074.
Rallo, R., Arenas, A., Ferre-Gine, J., & Giralt, F. (2002). A neural virtual sensor for the inferential prediction of product quality from process variables. Computers and Chemical Engineering, 26(12), 1735–1754.
Ramaswamy, S., Cutright, T. J., & Qammar, H. K. (2005). Control of continuous bioreactor using model predictive control. Process Biochemistry, 40, 2763–2770.
Riascos, C. A. M., & Pinto, J. M. (2004). Optimal control of bioreactors: A simultaneous approach for complex systems. Chemical Engineering Journal, 99(1), 23–34.
Ronen, M., Shabtai, Y., & Guterman, H. (2002). Optimization of feeding profile for a fed-batch bioreactor by an evolutionary algorithm. Journal of Biotechnology, 97, 253–263.
Sarkar, D., & Modak, J. M. (2004). Optimization of fed-batch bioreactors using genetic algorithm: Multiple control variables. Computers and Chemical Engineering, 28, 789–798.
Sarkar, D., & Modak, J. M. (2005). Pareto-optimal solutions for multi-objective optimisation of fed-batch bioreactors using nondominated sorting genetic algorithm. Chemical Engineering Science, 60, 481–492.
Smets, I. Y., Claes, J. E., November, E. J., Bastin, G. P., & van Impe, J. F. (2004). Optimal adaptive control of (bio)chemical reactors: Past, present and future. Journal of Process Control, 14, 795–805.
Stigter, J. D., & Keesman, K. J. (2004). Optimal parametric sensitivity control of a fed-batch reactor. Automatica, 40, 1459–1464.
Valencia, C., Kaisare, N., & Lee, J. H. (2005). Optimal control of a fed-batch bioreactor using simulation-based approximate dynamic programming. IEEE Transactions on Control Systems Technology, 13(5), 786–790.
Zhang, H., & Lennox, B. (2004). Integrated condition monitoring and control of fed-batch fermentation processes. Journal of Process Control, 14, 41–50.