Mathematics and Computers in Simulation 49 (1999) 363–379

Empirical modeling of antibiotic fermentation process using neural networks and genetic algorithms

Primož Potočnik*, Igor Grabec
University of Ljubljana, Faculty of Mechanical Engineering, Aškerčeva 6, POB 394, SI-1000 Ljubljana, Slovenia

Received 1 November 1998; accepted 1 April 1999

Abstract

Empirical modeling of the industrial antibiotic fed-batch fermentation process is discussed in this paper. Several methods, including neural networks, genetic algorithms and feature selection, are combined with prior knowledge in the research methodology. A linear model, a radial basis function neural network and a hybrid linear–neural network model are applied for the model formation. Two approaches to the modeling of the antibiotic fermentation process are presented: dynamic modeling of the process and modeling in the fermentation sample space. The first approach is focused on the current state of the fermentation process and forecasts the future product concentration. The second approach treats the fermentation batch as one sample which is characterized by a set of extracted features. Based on these features, the fermentation efficiency is predicted. Modeling in the fermentation sample space integrates the prior knowledge of experts with empirical information and can represent a basis for the control of the fermentation process. © 1999 IMACS/Elsevier Science B.V. All rights reserved.

Keywords: Empirical modeling; Fermentation process; Neural networks; Genetic algorithms; Hybrid modeling; Feature extraction; Feature selection

1. Introduction

Fermentation processes are used to produce various substances in the pharmaceutical, chemical and food industries. The object of the research is industrial antibiotic production by a fed-batch fermentation process in which clavulanic acid is produced as a secondary metabolite by the microorganisms. Secondary metabolic fermentation is a complex non-linear process. Very high costs, a heavy environmental load and high energy consumption are characteristic of the production. Consequently, an optimization of the production is required, and the first step toward optimal production is modeling the process, which is still a challenging problem. In this paper an empirical approach to modeling of the fermentation process for antibiotic production is presented.


Secondary metabolic production is not yet well understood. Several analytical approaches to the modeling of secondary metabolic production systems have been reported [1,2], and particularly for the production of antibiotics [3–5]. For the process under consideration, we have not found any analytical models in the literature that are applicable at the present extent of knowledge. Therefore an empirical non-parametric approach [6] is investigated in this paper. Our research is focused mainly on the utilization of neural networks (NN), which are well suited for the modeling of nonlinear systems [7–9]. Commonly used structures are the radial basis function network and the multilayer perceptron.

Neural networks have already been successfully applied to the modeling of fermentation processes. The advantage of neural networks with respect to other modeling techniques was demonstrated in a simulation study [10] describing the use of neural networks for the dynamic modeling of bioprocesses. Willis et al. [11] and Massimo et al. [12] estimated the biomass concentration in an industrial fermentation system by neural network models. Neural estimation techniques were also used for process predictors in biotechnology [13]. The capability of neural networks in modeling cell growth from initial conditions only was demonstrated in [14]. Neural networks were used to enhance fermentation process development [15] with experimental design. Tsaptsinos and Leigh [16] reported the application of neural networks to the modeling of an industrial secondary metabolic fermentation process. In relation to the network configuration, the question of which variables should be used as inputs to the process model was also addressed. In subsequent work, neural network modeling was combined with classification methods: the available fermentations were classified into two classes and each class was modeled separately [17]. Such an approach represents an example of hybrid modeling.

Neural networks can be combined with prior knowledge and experience embedded in available physical models, model-based estimators, and linguistic descriptions of system behavior provided by human experts. A general framework for identifying the possible combinations of hybrid modeling is presented in [18]. Hybrid models are often more accurate and extrapolate better than neural networks or available first-principle models. Some of the hybrid applications in the bio-chemical industry include a gray box modeling strategy based on NN and macroscopic balances [19], the prediction of cell biomass and secondary metabolite in a fed-batch penicillin fermentation [20], a hybrid NN model of bioreaction kinetics [21], hybrid NN modeling of cell metabolism [22] and the use of a hybrid NN for predictive control of a batch polymerization process [23].

The aim of this research is to investigate the possibilities of non-parametric modeling of the antibiotic fed-batch fermentation process. Although only modeling of the process is discussed in this paper, our goal is to proceed toward process optimization with the objective of improving the yield of the fermentation product. The models under investigation include a radial basis function neural network, a hybrid linear–neural model and a linear model. Two complementary approaches to the process modeling are discussed: dynamic modeling of the process and modeling in the fermentation sample space.
We expect that investigating both approaches can help us find the most suitable modeling method for the fermentation process, which could be used to improve the process control. The dynamic modeling of the fermentation process is focused on the characterization of the state of the fermentation process and forecasts the future product concentration. Available process measurements are utilized to describe the current state of the process and to forecast the concentration of the fermentation product. A variable selection method is applied to the basic set of process measurements in order to select those variables which make it possible to form a model with optimal forecasting ability. The forecasting performance is estimated on a test set of fermentations.

The second approach treats a fermentation batch as one sample. With the help of the prior knowledge of human experts, the features for the fermentation batch description are extracted from the process measurements. Based on the extracted features, a model is built for the fermentation efficiency prediction. An optimal subset of features is determined by numerical optimization with genetic algorithms, where the prediction ability of the generated model is regarded in the search for the optimum. Modeling in the fermentation sample space presents a novel hybrid description of complex processes and combines the prior knowledge of experts with methods of non-parametric modeling.

2. Model structures

In this article the following structures are investigated for the purpose of process modeling: a linear model, a radial basis function network and a hybrid linear–neural model. A linear model is included for comparison with the other two non-linear models. Because the modeling is combined with a feature selection procedure, a fast learning algorithm is preferable for model generation. A radial basis function network was used for the modeling because of its fast learning procedure.

2.1. Linear model

The linear model is described by the following expression:

$$\hat{y}_{\mathrm{LIN}}(n) = \theta_{\mathrm{LIN}}^{T} x(n). \tag{1}$$

Here x is the model input vector, ŷLIN denotes the output of the linear model and θLIN^T is a transposed parameter vector which describes the system response. Given the set of N samples {x(n), y(n); n = 1, ..., N}, the parameter vector θLIN can be calculated analytically using linear regression.

2.2. Radial basis function network

A radial basis function network (RBFN) is a three-layered neural network [24] which has a universal approximation property for continuous real-valued functions [25,26]. The RBFN is defined by the equation:

$$\hat{y}_{\mathrm{RBF}}(n) = \sum_{k=1}^{K} w_k \, \phi_k(x(n)) \tag{2}$$

which composes the network output ŷRBF as the sum of basis functions φk multiplied by the weights wk. A common choice of basis function is a Gaussian, given by the equation:

$$\phi_k(x(n)) = \exp\!\left(-\frac{\|x(n) - q_k\|^2}{2\sigma^2}\right). \tag{3}$$


Here qk represents the center of the basis function and σ denotes its width. The norm ‖·‖ in Eq. (3) can be expressed by:

$$\|x - q_k\|^2 = (x - q_k)^T C^T C \,(x - q_k) \tag{4}$$

where C denotes either an identity, diagonal or non-diagonal matrix. In our case, a diagonal matrix C is defined by elements which assign a specific weight to each input coordinate. Construction of an RBFN model involves determination of the parameters wk, qk, C and σ. A two-stage learning algorithm was chosen for the RBFN construction [24] because it yields faster learning than the supervised gradient descent method. In the first stage, the network centers qk, the matrix C and the width σ are selected; the second stage involves a calculation of the output weights wk, which is a fast linear operation.

2.3. Hybrid linear–neural model

As a third model structure, a hybrid parallel linear–neural model is used. The model is shown schematically in Fig. 1 and is also referred to as a linear model with a non-linear error correction. The motivation to combine linear and neural models comes from the different properties of the two models. While neural networks have good approximation and interpolation properties, they show limited extrapolation capacity. Linear models, on the other hand, are limited in dealing with non-linearities but often exhibit more robust extrapolation behavior. Therefore, a combination of a linear model with a non-linear, neural-network-based error correction is used in our study. The hybrid model is described by the following equation:

$$\hat{y}_{\mathrm{HIB}}(n) = \theta_{\mathrm{LIN}}^{T} x(n) + \sum_{k=1}^{K} w_k \, \phi_k(x(n)). \tag{5}$$

The model output ŷHIB is composed of the linear part and the non-linear part which is modeled by the neural network. The two-step learning procedure involves formation of the linear model in the first step and training of the RBFN in the second step. The target values for the RBFN learning are the residual non-linearities of the modeled process.
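To make the two-step learning procedure of Eqs. (1)–(5) concrete, the following minimal Python sketch (not the authors' code) fits the linear part by least squares and then trains the RBF correction on the linear residuals. The random choice of centers, the single global width and the identity matrix C are simplifying assumptions; the paper's first learning stage selects the centers, the diagonal C and the width differently.

```python
import numpy as np

def fit_linear(X, y):
    # Least-squares estimate of the parameter vector theta_LIN (Eq. (1)).
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

def gaussian_basis(X, centers, width):
    # Eq. (3) with an identity matrix C: phi_k(x) = exp(-||x - q_k||^2 / (2 sigma^2)).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_hybrid(X, y, n_centers=10, width=1.0, seed=None):
    # Step 1: linear model; Step 2: RBFN trained on the linear residuals (Eq. (5)).
    rng = np.random.default_rng(seed)
    theta = fit_linear(X, y)
    resid = y - X @ theta                                       # residual non-linearities
    centers = X[rng.choice(len(X), n_centers, replace=False)]   # assumed: random centers
    Phi = gaussian_basis(X, centers, width)
    w, *_ = np.linalg.lstsq(Phi, resid, rcond=None)             # output weights: fast linear step
    return theta, centers, width, w

def predict_hybrid(X, theta, centers, width, w):
    # Eq. (5): y_hat = theta^T x + sum_k w_k phi_k(x).
    return X @ theta + gaussian_basis(X, centers, width) @ w
```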

Fig. 1. A hybrid linear–neural model.


Fig. 2. The typical growth of the fermentation product.

3. Antibiotic fermentation process

The considered fermentation process runs in a fed-batch mode, which refers to a continuous and variable rate of feeding, while the product is removed at the end of the process. The antibiotic is produced as a secondary metabolite by the microorganisms. The average batch lasts about 140 h and, after the product recovery, the process is restarted. Data for 65 fermentation batches are available, with process variables measured at 3 h time intervals. The measurements provide the values of the temperature T, the pH, the air flow, the O2 partial pressure pO2, the stirring frequency, the viscosity, the percentage of mass volume pmV, and the concentrations of nutrition Cn, product Cp and several other variables. Because the data are of a confidential nature, we describe the measurements in relative units. Fig. 2 shows a typical growth of the fermentation product concentration Cp, where a linear scale with an arbitrary unit is used vertically. Some of the measurements are sampled on-line with a data acquisition system, while the remaining data are acquired off-line by analysis of the samples in a laboratory. The off-line procedures are time consuming and as a result a sampling interval of 3 h is possible. Because of experimental errors, signal preprocessing is generally necessary to prepare data for the modeling purpose. Some missing values are estimated by second-order interpolation. Some of the measurements include experimental noise, which is removed by using Savitzky–Golay (SG) digital smoothing filters [27]. These filters approximate the underlying function within a moving window by a higher-order polynomial. The example in Fig. 3 presents the result of filtering the pH data with a second-order SG filter.

Internal states of the process are not accessible. Therefore, the current state of the fermentation process has to be estimated from the available process measurements. The goal of this investigation is to generate a proper process model on the basis of the available set of measurements. In the following sections two approaches to the fermentation process modeling are presented: dynamic modeling of the fermentation process and modeling in the fermentation sample space.

4. Dynamic modeling of the fermentation process

The aim of dynamic modeling of the fermentation process is to generate a model for forecasting the future product concentration from the current values of available process measurements. The model generation is based on the observation of dynamic changes in the fermentation process. The composition of all substances and the rates of their changes, which determine the state of the fermentation process in detail, are not observable but are indirectly represented by the measured variables. The future fermentation product concentration is selected as the modeling target because it is a basic indicator of the fermentation process quality.


Fig. 3. The original (a) and the filtered (b) pH data.
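As an illustration of the smoothing step, here is a short sketch using SciPy's Savitzky–Golay filter; the window length of 11 samples is an assumed value, since the paper only states that a second-order filter is applied.

```python
import numpy as np
from scipy.signal import savgol_filter

def smooth_ph(ph_raw, window_length=11, polyorder=2):
    # Savitzky-Golay filter: fits a polynomial of the given order within a
    # moving window and evaluates it at the window centre.
    return savgol_filter(ph_raw, window_length=window_length, polyorder=polyorder)

# Example: smooth a noisy pH trace of one batch (about 140 h / 3 h sampling ~ 47 samples).
ph_raw = 7.0 + 0.05 * np.random.randn(47)
ph_smooth = smooth_ph(ph_raw)
```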

A general scheme of dynamic modeling of the fermentation process is presented in Fig. 4. The data acquisition system is used to collect the process variables, which are then preprocessed by interpolation and filtering. The method of variable selection is applied to the basic set of process variables to select the most appropriate signals for the model formation. Based on the selected variables, the dynamic model of the fermentation process is built for forecasting the future product concentration. The model is applicable for testing the influence of variations of process variables on process quality. We also anticipate the application of the model in a model-based predictive control scheme.

Fig. 4. The first approach to modeling of the antibiotic fermentation process: A general scheme of dynamic modeling of the fermentation process.


4.1. Model formation

The current set of available process variables is described by:

$$\chi(t) = [\mathrm{pH}(t),\ \Phi_{O_2}(t),\ p_{O_2}(t),\ \nu(t),\ \eta(t),\ \mathrm{pmV}(t),\ C_n(t),\ \ldots,\ C_p(t)]. \tag{6}$$

The subset of variables which is selected for the model formation is denoted by:

$$X(t) \subseteq \chi(t). \tag{7}$$

The components of the model input vector:

$$x(t) = [x_1(t), \ldots, x_D(t)] \tag{8}$$

are composed of the current and past subsets of selected process variables:

$$x_d(t) = \{X(t),\ X(t-1),\ \ldots,\ X(t-\tau+1)\}, \tag{9}$$

where τ denotes the maximum embedding dimension. The unit of t refers to the 3 h sampling time interval. The output signal is defined by the future product concentration:

$$y(t+1) = C_p(t+1). \tag{10}$$

The mapping from the input space x to the output space y is described by the model:

$$\hat{y}(t+1) = M(x(t)) \tag{11}$$

where M denotes one of the previously mentioned model structures. The available data set Z = {χ(n); n = 1, ..., N} contains 65 fermentations with N = 2300 samples of measured variables. This data set is divided into a learning set Zl, which is used for the model generation, and a test set Zt, which is utilized for the model validation. Nl and Nt denote the numbers of samples in the learning and test sets. The mean square error of the prediction on the test set Zt is used for the model validation:

$$V_t = \frac{1}{N_t} \sum_{n=1}^{N_t} \left[\hat{y}(x(n)) - y(x(n))\right]^2. \tag{12}$$
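The following sketch shows how the regression data of Eqs. (8)–(12) can be assembled for one batch, assuming the selected process variables and the product concentration are available as NumPy arrays sampled every 3 h; the function names are illustrative and not taken from the paper.

```python
import numpy as np

def build_dataset(batch_vars, target, tau=1):
    """Assemble inputs x(t) and outputs y(t+1) for one batch (Eqs. (8)-(10))."""
    # batch_vars: list of 1-D arrays, the selected process variables X(t).
    # target: 1-D array of the product concentration Cp(t).
    # tau: embedding dimension (the paper finds tau = 1 sufficient).
    T = len(target)
    X, y = [], []
    for t in range(tau - 1, T - 1):
        # x(t) stacks the current and tau-1 past values of every selected variable.
        X.append([v[t - k] for k in range(tau) for v in batch_vars])
        y.append(target[t + 1])
    return np.array(X), np.array(y)

def test_error(y_hat, y_true):
    # Mean square prediction error V_t of Eq. (12).
    return float(np.mean((np.asarray(y_hat) - np.asarray(y_true)) ** 2))
```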

The inverse of the error measure Vt is applicable as a description of the model quality.

4.2. Variable selection

The complexity of the model depends on the number of applied process variables. Quite generally, one can expect that some of them will be more appropriate for modeling the product concentration than others. The aim of this section is to introduce a method by which the suitability of the variables for modeling can be estimated. We expect that by including only some properly selected variables the complexity can be reduced and the performance of the model increased.

An optimal subset of process variables for the model formation is determined by the method of variable selection. We apply the backward elimination procedure, which starts modeling with the complete set of variables (a sketch of the procedure is given after Section 4.4). At every step of the procedure, the model is formed and tested. In the next step, one of the variables is omitted from the input vector x and the model is built and tested again. By repetition of this procedure the variable with the least contribution to the model quality is found and rejected in the next step of the search. The procedure is repeated recursively until only one variable is left in the input vector x. All three model structures with several embedding dimensions (τ ∈ [1,5]) are considered in the modeling procedure. If forecasting is performed by a model on the learning set of fermentations, the best results are obtained by using the complete input vector X, but forecasting on the set of test fermentations reveals that a model with a complete input vector does not yield a minimal test error Vt and consequently does not exhibit good generalization ability. Optimal forecasting with minimal prediction error Vt is obtained by using a subset of representative variables, as described in the next section.

4.3. Results

Forecasting errors Vt obtained by a linear model, a radial basis function neural network and a hybrid linear–neural model are: Vt(LIN) = 0.00613, Vt(RBFN) = 0.00555 and Vt(HIB) = 0.00550. The best results of dynamic modeling of the fermentation process are obtained using a hybrid linear–neural model and the following subset of five input variables: the batch age t, the current product concentration Cp(t), the viscosity η(t), the nutrition concentration Cn(t), and the partial oxygen pressure pO2(t). The results of forecasting obtained by the hybrid model are shown in Fig. 5. The forecast values of the product concentration ŷ and the observed values y are plotted for three fermentations from the test set Zt. A scatter plot of all predictions on the test set is also presented.

4.4. Discussion of dynamic modeling

The forecasting accuracy and generalization properties of the non-linear models are better than those of the linear one, although the differences are small. The results of forecasting by the hybrid model are slightly better than the results obtained by the RBFN model. The scatter plot of all predictions on the test set Zt in Fig. 5 demonstrates good modeling accuracy and generalization ability of the model with optimally selected input components. Suitable process variables for the forecasting of the product concentration Cp(t+1) are: the current product concentration Cp(t), the viscosity η(t), the current age of the batch t, the nutrition concentration Cn(t) and the partial oxygen pressure pO2(t). Other process variables are not suitable for our modeling purpose and can deteriorate the generalization ability of the model. A proper embedding dimension τ for the composition of the input vector x is τ = 1; including additional delayed values of process variables does not improve the modeling accuracy. Among the tested models, the hybrid linear–neural model is the most suitable for dynamic modeling of the fermentation process. There remains a problem of model reliability outside the training domain. The extrapolation ability of the linear–neural model is slightly better than that of a NN model and could be further improved by introducing additional analytical knowledge into our approach. We expect the application of the model to simulation purposes and to model-based predictive control.
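The backward elimination of Section 4.2 can be sketched as follows; the `evaluate` argument is an assumed helper that builds one of the models above on the given variables and returns its test error Vt.

```python
def backward_elimination(all_vars, evaluate):
    """Greedy backward elimination over process variables (Section 4.2)."""
    # all_vars: list of candidate variable names.
    # evaluate: assumed helper returning the test error V_t of a model
    #           built on the given list of variables.
    selected = list(all_vars)
    history = [(list(selected), evaluate(selected))]
    while len(selected) > 1:
        # Try removing each remaining variable; keep the removal that
        # harms the test error least (or improves it most).
        trials = [(evaluate([v for v in selected if v != cand]), cand)
                  for cand in selected]
        best_err, drop = min(trials)
        selected.remove(drop)
        history.append((list(selected), best_err))
    # Return the recorded subset with the smallest test error.
    return min(history, key=lambda item: item[1])
```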


Fig. 5. Dynamic modeling of the fermentation process: forecasting of the fermentation product concentration by a hybrid model. The results of forecasting for three fermentations from the test set Zt and a scatter plot of all forecasting trials are shown.

5. Modeling in the fermentation sample space

The goal of modeling in the fermentation sample space is to build a model for forecasting the fermentation efficiency Q, which is estimated by the maximal achieved product concentration:

$$Q = \max_{t}\, C_p(t). \tag{13}$$

The fermentation sample space is defined by the set of samples, where every sample represents one fermentation batch. The sample is composed of the features which describe the fermentation batch and are extracted from the process variables with the help of prior knowledge. Based on the set of features by which a fermentation batch is characterized, a model is built to forecast the fermentation efficiency Q. There are two reasons to use the extracted features instead of the raw process measurements: first, the dimensionality of the input vectors is considerably reduced, and second, the prior knowledge of experts is utilized. The general scheme of modeling in the fermentation sample space is presented in Fig. 6. The features are extracted from the available variables and the optimal subset of fermentation features for the model formation is determined by numerical optimization based on genetic algorithms. The selected features are applied to the model input vector x and the model is used to predict the fermentation efficiency Q.


Fig. 6. The second approach to modeling of the antibiotic fermentation process: A general scheme of modeling in the fermentation sample space.

Modeling in the fermentation sample space can be used to estimate the final quality of the fermentation batch already during the run of the process. The model is applicable for simulation purposes and can serve for the optimization of the control policy.

5.1. Feature extraction

Samples in the fermentation sample space are determined by the set of features which are extracted from the available data for 65 fermentation batches. Cooperation with experts from the industry is of fundamental importance at this step, as they can suggest how to extract informative features from the process variables. Features for the fermentation process characterization are extracted by using auxiliary functions, some of which are shown in Fig. 7. The interpretation of the auxiliary functions is given in Table 1. The goal of this modeling approach is to extract the features already during the early stages of the fermentation process and to use these features for the estimation of the fermentation efficiency Q. As suggested by the experts from the industry, the extracted features include various characteristic values of the process variables, the trends describing their changes in time, and statistics such as the mean value and variance.

Fig. 7. Auxiliary functions for feature extraction from the process measurements.


Table 1
Interpretation of the auxiliary functions for feature extraction

Ave(X)           Statistical average
Var(X)           Variance
Val(X,t1)        Value of variable X at time t1
Max(X)           Maximum value
tmax(X)          Time when the maximum value occurs
tthresh(X)       Time when the predetermined threshold is achieved
Line(X,t1,t2)    Coefficients of the linear interpolation in the interval [t1,t2]
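A sketch of the auxiliary functions of Table 1 applied to a single process variable is given below; the threshold and the interpolation interval are illustrative assumptions, as the paper does not state their values.

```python
import numpy as np

def extract_features(x, t1, t2, threshold):
    """Features of one process variable, following the functions of Table 1."""
    # x: a single process variable of one batch, sampled every 3 h.
    # t1, t2, threshold: illustrative values; the paper does not state them.
    t = np.arange(len(x))
    above = np.nonzero(np.asarray(x) >= threshold)[0]
    slope, intercept = np.polyfit(t[t1:t2], x[t1:t2], 1)    # Line(X, t1, t2)
    return {
        "Ave": float(np.mean(x)),               # statistical average
        "Var": float(np.var(x)),                # variance
        "Val_t1": float(x[t1]),                 # value at time t1
        "Max": float(np.max(x)),                # maximum value
        "tmax": int(np.argmax(x)),              # time when the maximum occurs
        "tthresh": int(above[0]) if above.size else -1,  # time when the threshold is reached
        "Line_slope": float(slope),             # coefficients of the linear interpolation
        "Line_intercept": float(intercept),
    }
```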

Using the auxiliary functions given in Table 1, a set of 41 features is extracted for every sample in the fermentation sample space:

$$\Psi(n) = \{\psi_1(n), \ldots, \psi_{41}(n)\}, \qquad n = 1, \ldots, 65. \tag{14}$$

Here ψ1, ..., ψ41 denote the features which are included in the feature set Ψ. The features ψ are more relevant for the characterization of the fermentation batch than the raw data because they are chosen on the basis of the qualitative knowledge of the experts who control the industrial fermentation. The proposed method thus combines qualitative expert knowledge with the methods of non-parametric modeling in a hybrid manner.

5.2. Model formation

The components {xd; d = 1, ..., D} of the model input vector x are chosen among the first 40 features {ψ1(n), ..., ψ40(n)}. This part of the feature set Ψ is referred to as:

$$\Psi_x(n) = \{\psi_1(n), \ldots, \psi_{40}(n)\}. \tag{15}$$

The fermentation efficiency Q is estimated by the 41st feature, which represents the maximal achieved fermentation product concentration:

$$Q(n) = \psi_{41}(n) = \max_{t}\, C_p(t, n). \tag{16}$$

This feature is selected as the modeling target output. The goal of modeling in the fermentation sample space is to forecast the fermentation efficiency Q on the basis of the feature set Ψx. A model is generated to approximate the mapping from the feature space to the output space:

$$\hat{Q}(n) = M(x(n)). \tag{17}$$

A linear model, a RBFN and a hybrid model are examined as the structure M. As in the dynamic modeling of the fermentation process, the mean square error of prediction on the test set of fermentations is applied for the model validation:

$$V_t = \frac{1}{N_t} \sum_{n=1}^{N_t} \left[\hat{Q}(n) - Q(n)\right]^2. \tag{18}$$

The error measure Vt is a criterion for the model quality estimation.


5.3. Feature selection

The original set of features Ψx is an initial approximation of the possible descriptors for the fermentation batch, as suggested by the experts from the industry. It was found that when all 40 features are applied to the model input x, the model does not yield good predictions on the test set. Therefore an optimization of the utilized features is performed, where the prediction error given by Eq. (18) is used as the optimization criterion. The aim is to find an optimal subset of fermentation features:

$$\Psi_x^{*} \subseteq \Psi_x, \tag{19}$$

which makes it possible to generate a model with the minimal prediction error Vt. The selection of an arbitrary subset Ψx⁻ from the set Ψx = {ψ1, ..., ψ40} is performed by using an operator B called a binary selector:

$$\Psi_x^{-} = B(\Psi_x). \tag{20}$$

A binary selector B selects features from the feature set Ψx and is composed of binary digits (0,1) determining the exclusion or inclusion of each feature. The operation of a binary selector is presented in Fig. 8, where five features are selected from the original feature set Ψx. A search for the optimal binary selector B*(Ψx), which determines the optimal feature subset Ψx*, is performed by numerical optimization. The high dimensionality of the search space (40 possible features) can cause forward selection or backward elimination schemes to become trapped in a local minimum. Therefore a genetic algorithm (GA) is used to perform the optimization, because it is well suited to robust optimization in a multidimensional space [28,29]. No additional encoding is necessary because the binary selector B is already represented by binary digits.

A general form of a genetic algorithm is described in pseudo-code in Fig. 9. The optimization with a genetic algorithm starts with an initialization of the population P(t) of solutions. The initial population can be set randomly and its members are represented by binary strings. In each cycle of genetic operation, a subsequent generation P(t+1) is created from the current population P(t). Operators of selection, recombination and mutation are used to form the next generation. The suitability of the new solutions is evaluated and the procedure is repeated until the stopping criterion is met.

Fig. 8. The operation of a binary selector: the original feature set Ψx is mapped into a subset Ψx⁻ which contains only the selected features.


Fig. 9. A general form of a genetic algorithm.

The stopping criterion usually prescribes the maximum number of iterations or determines whether the best solution obtained is satisfactory. Optimization with a genetic algorithm is utilized in the search for the optimal binary selector B*(Ψx). The optimum is defined with respect to the minimum of the efficiency forecasting error. The GA evaluation function fe for each binary selector B(Ψx) is defined by the forecasting error Vt of the model which is built upon the evaluated feature subset Ψx⁻ = B(Ψx):

$$f_e(B(\Psi_x)) = V_t(B(\Psi_x)) = \frac{1}{N_t} \sum_{n=1}^{N_t} \left[\hat{Q}\!\left(B(\Psi_x(n))\right) - Q(n)\right]^2. \tag{21}$$
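A minimal genetic-algorithm sketch for the binary-selector search of Eqs. (20) and (21) is given below; the population size, number of generations, mutation rate and the `eval_error` helper are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np

def ga_feature_selection(eval_error, n_features=40, pop_size=30,
                         n_generations=50, p_mut=0.02, seed=None):
    """Search for a binary selector B that minimizes the test error V_t."""
    # eval_error(mask) is an assumed helper: it builds a model on the
    # features selected by the 0/1 mask and returns its test error (Eq. (21)).
    rng = np.random.default_rng(seed)
    pop = rng.integers(0, 2, size=(pop_size, n_features))     # random binary selectors
    for _ in range(n_generations):
        errors = np.array([eval_error(mask) for mask in pop])
        order = np.argsort(errors)
        parents = pop[order[: pop_size // 2]]                 # selection: keep the better half
        children = []
        while len(children) < pop_size - len(parents):
            i, j = rng.choice(len(parents), size=2, replace=False)
            cut = int(rng.integers(1, n_features))            # one-point recombination
            child = np.concatenate([parents[i][:cut], parents[j][cut:]])
            flip = rng.random(n_features) < p_mut             # mutation: flip a few bits
            children.append(np.where(flip, 1 - child, child))
        pop = np.vstack([parents, np.array(children)])
    errors = np.array([eval_error(mask) for mask in pop])
    best = int(np.argmin(errors))
    return pop[best], float(errors[best])
```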

5.4. Solution approach

The modeling is combined with an optimizing feature selection procedure and consists of the following steps:

1. A model structure {Mi; i = 1, 2, 3} is selected, with the index i representing a linear model, a RBFN or a hybrid model. Thirty experiments are repeated with each model structure. An experiment refers to one run of the optimization procedure by the genetic algorithm.
2. Every experiment is designed to optimize the feature subspace Ψx⁻ for the selected model structure Mi. The goal is to find the optimal subset of features Ψx*. The following operations are performed during the experiment:
   - An initial population of random binary selectors {Bm(Ψx); m = 1, ..., M} is generated.
   - A model is built with every binary selector Bm(Ψx) and its suitability is determined using the criterion function fe given by Eq. (21).
   - Iterations of the genetic algorithm are executed. The model with the minimal prediction error Vt is considered the optimal one and the corresponding binary selector B*(Ψx) determines the optimal fermentation feature subspace Ψx*.
3. When the experiments have been accomplished with all three model structures, the optimal model structure M* is determined together with the corresponding fermentation feature subset Ψx* = B*(Ψx).


5.5. Results

Forecasting errors Vt obtained by a linear model, a radial basis function neural network and a hybrid linear–neural model are: Vt(LIN) = 0.116, Vt(RBFN) = 0.104 and Vt(HIB) = 0.099. The best results are obtained by using the hybrid linear–neural model. The result of forecasting by the hybrid model with a mean square error of prediction of 0.099 corresponds to a 6.3% error of efficiency predictions on the test fermentations. The corresponding optimal binary selector B* for feature selection is given in Eq. (22).

Fig. 10 shows a scatter plot of the results obtained by the hybrid model. Predicted values of the fermentation efficiency Q̂(n) are plotted versus the actually observed fermentation efficiency Q(n) for the subset of learning fermentations {n = 1, ..., Nl} and for the subset of test fermentations {n = 1, ..., Nt}. The results refer to the use of the hybrid model, which yields the best results.

5.6. Discussion of modeling in the fermentation sample space

The hybrid linear–neural model is the most suitable for modeling in the fermentation sample space. Results are similar for the RBFN model and slightly worse for the linear model. Available data from 65 fermentations are used to form the fermentation sample space. The limited number of samples yields fast learning, but the model performance is sensitive to the splitting of the fermentation set into the learning and test subsets. Additional measurements should be collected to improve the modeling reliability. The available process variables do not contain high-quality information for the characterization of the fermentation batch. Therefore, the prior knowledge of experts is applied for the extraction of features which are appropriate for the description of the fermentation. The initial set of 40 features is further reduced to the set of the most suitable features. The selected features make the generation of a proper model possible and indicate the most important process properties.

Fig. 10. Modeling in the fermentation sample space: forecasting of the fermentation efficiency Q by a hybrid model. A scatter plot of predictions Q̂ for the learning subset and for the test subset of fermentations is shown.


This information can be utilized by the fermentation experts, who can interpret the features and focus on the indicated process characteristics in order to improve the control policy. The hybrid model in the fermentation sample space is applicable for simulations in which the consequences of process parameter variations are explored. This can lead to optimal parameters of the control policy, which corresponds to process optimization.

6. Conclusions

An empirical approach to modeling of industrial batch processes is proposed, where several methods are combined with prior knowledge in the research methodology. In this paper the empirical modeling is applied to industrial antibiotic production. A radial basis function network and a hybrid linear–neural model are used in the modeling procedure and are compared with a linear model. Particular emphasis is allotted to the selection of proper variables for the model formation. Two complementary approaches to empirical modeling of the antibiotic fermentation process are discussed.

Dynamic modeling of the fermentation process is focused on the prediction of the current state of the fermentation batch, where the state is estimated by the available process measurements. The optimal subset of process variables is determined for the model formation and several variables with low information content are rejected from the modeling procedure. Very good forecasting accuracy is obtained, primarily due to the correlation of the future product concentration with the current concentration. This correlation is automatically learned by the model and the best forecasting results are observed by using the hybrid linear–neural model. A deficiency observed when using the model is its low sensitivity to variations of several process parameters.

Complementary to dynamic modeling of the fermentation process, modeling in the fermentation sample space treats a complete fermentation batch as one sample. This approach resembles the treatment of the fermentation batches by the experts from the industry. With their help, the features for the process characterization are extracted from the available process variables. The features capture the qualitative information which was empirically observed to be important for the process behavior. The initial set of 40 extracted features is further reduced and optimized by a genetic algorithm to select only those features which make generating the best model possible. The hybrid linear–neural model is superior to the other models in forecasting the fermentation efficiency. The method proposed integrates the prior qualitative knowledge of experts with empirical information. The solution approach combines available knowledge with non-parametric modeling and numerical optimization. Based on our investigation we conclude that a promising approach to the modeling of very complex processes incorporates several types of knowledge and combines them in a hybrid manner. We expect that a hybrid model in the fermentation sample space could form a proper basis for an optimal control of the antibiotic fermentation process.

Acknowledgements

The authors would like to acknowledge the company LEK, d.d. for their support of this research and the National Ministry of Science and Technology for their financial support.


References

[1] S. Miličić, J. Velušček, M. Kremser, H. Sočič, Mathematical modeling of growth and alkaloid production in Claviceps purpurea batch fermentation, Biotechnol. Bioeng. 41(5) (1993) 503–511.
[2] J. Guardiola, J.L. Iborra, M. Cánovas, A model that links growth and secondary metabolite production in plant cell suspension cultures, Biotechnol. Bioeng. 46(3) (1995) 291–297.
[3] J.P. Cardoso, A simple model for the optimization of the extracted yield of antibiotics isolated from fermented broths by direct crystallization, Biotechnol. Bioeng. 42(9) (1993) 1068–1076.
[4] G.C. Paul, C.R. Thomas, A structured model for hyphal differentiation and penicillin production using Penicillium chrysogenum, Biotechnol. Bioeng. 51(5) (1996) 558–572.
[5] T.C. Zangirolami, C.L. Johansen, J. Nielsen, S.B. Jørgensen, Simulation of penicillin production in fed-batch cultivations using a morphologically structured model, Biotechnol. Bioeng. 56(6) (1997) 593–604.
[6] I. Grabec, W. Sachse, Synergetics of Measurements, Prediction and Control, Series in Synergetics, Springer, Berlin, 1997.
[7] K.S. Narendra, K. Parthasarathy, Identification and control of dynamical systems using neural networks, IEEE Trans. Neural Networks 1(1) (1990) 4–27.
[8] N.V. Bhat, P.A. Minderman, T. McAvoy, N.S. Wang, Modeling chemical process systems via neural computation, IEEE Control Systems Mag. 10 (1990) 24–29.
[9] S. Chen, S.A. Billings, Neural networks for nonlinear dynamic system modelling and identification, Int. J. Control 56(2) (1992) 319–346.
[10] J. Thibault, V.V. Breusegem, A. Chéruy, On-line prediction of fermentation variables using neural networks, Biotechnol. Bioeng. 36 (1990) 1041–1048.
[11] M.J. Willis, C.D. Massimo, G.A. Montague, M.T. Tham, A.J. Morris, Artificial neural networks in process engineering, IEE Proceedings-D 138(3) (1990) 256–266.
[12] C.D. Massimo, G.A. Montague, M.J. Willis, M.H. Tham, A.J. Morris, Towards improved penicillin fermentation via artificial neural networks, Comput. Chem. Engrg. 16(4) (1992) 283–291.
[13] M. Keulers, Structure and parameter identification of a batch fermentation process using non-linear modelling, in: Proceedings of the American Control Conference, San Francisco, CA, June 1993, pp. 2261–2265.
[14] M.-J. Syu, G.T. Tsao, Neural network modeling of batch cell growth pattern, Biotechnol. Bioeng. 42(3) (1993) 376–380.
[15] J. Glassey, G.A. Montague, A.C. Ward, B.V. Kara, Artificial neural network based experimental design procedures for enhancing fermentation development, Biotechnol. Bioeng. 44(4) (1994) 397–405.
[16] D. Tsaptsinos, J.R. Leigh, Modelling of a fermentation process using multi-layer perceptrons: Epochs vs pattern learning, sigmoid vs linear transfer function, J. Microcomput. Appl. 16 (1993) 125–136.
[17] D. Tsaptsinos, R. Tang, J.R. Leigh, Neuroidentification of a biotechnological process: Issues and applications, Neurocomputing 9 (1995) 63–79.
[18] M. Agarwal, Combining neural and conventional paradigms for modelling, prediction and control, Int. J. Systems Sci. 28(1) (1997) 65–81.
[19] H.J.L. van Can, H.A.B. te Braake, C. Hellinga, K.C.A.M. Luyben, An efficient model development strategy for bioprocesses based on neural networks in macroscopic balances, Biotechnol. Bioeng. 54(6) (1997) 549–566.
[20] M.L. Thompson, M.A. Kramer, Modeling chemical processes using prior knowledge and neural networks, AIChE J. 40(8) (1994) 1328–1340.
[21] B. Saxén, H. Saxén, A neural-network based model of bioreaction kinetics, Canadian J. Chem. Engrg. 74 (1996) 124–131.
[22] P.C. Pu, J.P. Barford, A hybrid neural network – first principles approach for modelling of cell metabolism, Comput. Chem. Engrg. 20(6/7) (1996) 951–958.
[23] A.Y. Tsen, S.S. Jang, D.S.H. Wong, B. Joseph, Predictive control of quality in batch polymerization using hybrid ANN models, AIChE J. 42(2) (1996) 455–465.
[24] J. Moody, C.J. Darken, Fast learning in networks of locally-tuned processing units, Neural Comput. 1 (1989) 281–294.
[25] F. Girosi, T. Poggio, Networks and the best approximation property, Biol. Cybernet. 63 (1990) 169–176.
[26] E.J. Hartman, J.D. Keeler, Layered neural networks with Gaussian hidden units as universal approximations, Neural Comput. 2 (1990) 210–215.


[27] W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical Recipes in C: The Art of Scientific Computing, 2nd ed., Cambridge University Press, Cambridge, 1992.
[28] D.E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, Reading, MA, 1989.
[29] K.F. Man, K.S. Tang, S. Kwong, Genetic algorithms: Concepts and applications, IEEE Trans. Industrial Electronics 43(5) (1996) 519–534.