JOURNAL OF BIOSCIENCE AND BIOENGINEERING
Vol. 88, No. 2, 215-220. 1999
Construction of COD Simulation Model for Activated Sludge Process by Fuzzy Neural Network

SHUTA TOMIDA, TAIZO HANAI, NAOYASU UEDA, HIROYUKI HONDA, AND TAKESHI KOBAYASHI*

Department of Biotechnology, Graduate School of Engineering, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8603, Japan

Received 17 February 1999/Accepted 10 May 1999

A fuzzy neural network (FNN) was applied to construct a simulation model for estimating the effluent chemical oxygen demand (COD) value of an activated sludge process in a “U” plant, in which most of the process variables were measured once an hour. The constructed FNN model could simulate periodic changes in COD with high accuracy. Comparison of the simulation result obtained using the FNN model with that obtained using the multiple regression analysis (MRA) model showed that the FNN model had 3.7 times higher accuracy than the MRA model. FNN models corresponding to each of the four seasons were also constructed. By analyzing the fuzzy rules acquired from the FNN models after learning, the operational characteristics of this plant could be elucidated. Construction of the simulation model for another plant “A”, in which process variables were measured once a day, was also carried out. This FNN model also had a relatively high accuracy.

[Key words: activated sludge process, fuzzy neural network, chemical oxygen demand, simulation]

The activated sludge process has been widely used for the treatment of both municipal and industrial wastewaters. Activated sludge can convert various organic compounds in wastewater to carbon dioxide by the oxidative activities of aerobic microorganisms. Since many microorganisms are associated with the removal of organic compounds, the kinetics of this process is normally more complex than that of industrial microbial processes, in which only one microorganism is used. To date, many models have been proposed to describe the dynamic characteristics of the process, such as the activated sludge models (ASMs) no. 1 (1) and no. 2 (2) presented by the International Association on Water Quality (IAWQ). These models are constructed based on Monod’s equation for growth of microorganisms. However, the ASMs no. 1 and no. 2 have 19 and over 50 parameters, respectively, and it is difficult to decide these parameters exactly. In the actual activated sludge process, the flow rate and the composition of the wastewater influent vary at all times. Therefore, it is very difficult to construct a mathematical model for control of the process, and well-established control theories (3) cannot be applied to the process without a model. For these reasons, wastewater treatment has been controlled by skilled operators.

In recent years, the load for treatment of municipal wastewater has increased due to increasing municipal population, and the criterion for the effluent from the wastewater treatment has become more stringent from the viewpoint of environmental preservation. Therefore, a sophisticated control system for the process needs to be established. Knowledge information processing has been applied to some systems for which a mathematical model is relatively difficult to construct (4). For the activated sludge process, some studies have reported the construction of models that simulate the performance of the process. Martin et al. (5) constructed a hybrid model combining a mathematical model and an artificial neural network (ANN) model. The model simulated the process dynamically using the data from the actual wastewater treatment. However, the ANN model used in their study was a black-box model, and it is difficult to interpret the meaning of the rules acquired in the model. Iwahori et al. constructed a control system for the process by applying fuzzy reasoning (6), and this system actually controlled a pilot-scale plant well. However, it is difficult to construct a fuzzy reasoning system that has generality. The membership functions and the fuzzy rules must be tuned by trial and error for application to each plant. Shiraishi and Nakahara also applied fuzzy reasoning to the process, with the model parameters decided by a genetic algorithm (7). However, they did not discuss the quality of the effluent from the treatment.

In previous papers (8-11), we applied a fuzzy neural network (FNN) to control fermentation processes, such as Japanese sake mashing and beer brewing, and a highly accurate FNN model was constructed. In the present study, we apply FNN to construct a model that estimates the chemical oxygen demand (COD) of the effluent from municipal wastewater treatment.
MATERIALS AND METHODS
Measurement items
The data used in this study were collected at the “U” wastewater treatment plant in “N” city, which applies the activated sludge method. One week of data collected in each of May, August and October, 1994 and in February, 1995, representing the spring, summer, autumn and winter seasons, respectively, was used for the analysis. Figure 1 shows the flowsheet of the “U” plant. The following items were measured at the points from (b) to (h) every hour, i.e., (b) primary effluent; SS (suspended solids), COD, and flow rate of the effluent, (c) the middle portion of the aeration tank; mixed liquor suspended solids (MLSS), dissolved oxygen concentration (DO), and added amount of sodium hypochlorite, (d) the exit of the aeration tank; MLSS, and DO, (e) effluent; COD, pH, and flow rate of the effluent, (f) line of excess
* Corresponding author.
FIG. 1. Flowsheet of wastewater treatment plant “U”. Labeled units and streams: influent, primary settling tank, aeration tank, final settling tank, effluent, (f) excess sludge, (g) blower, (h) return sludge.
sludge; flow rate of the excess sludge, (g) line from blower; air flow rate, (h) line of return sludge; flow rate of the return sludge, and MLSS. These items were used as the input variables for modeling. SS, COD and MLSS were measured using the optical density method (12). The average hydraulic retention times for the primary settling tank, the aeration tank and the final settling tank were calculated once a day. In this plant, there are two aeration tanks arranged in parallel. Since each aeration tank has its own measurement devices at points (c), (d), (g) and (h), the average values of the two devices at each point were used for the calculation.

Input and output variables for modeling in “U” plant
The average hydraulic retention times for the aeration tank and the final settling tank were approximately 6 h and 4 h, respectively. Assuming plug flow, the wastewater passing through point (b) will reach point (c) 3 h later, point (d) 6 h later, and point (e) 10 h later. Therefore, the measured values at point (b) at time T h, at point (c) at T+3 h, and at point (d) at T+6 h were used for the estimation of the COD value at point (e) at T+10 h. Since the air flow rate in the aeration tanks varied, the average air flow rate during 6 h was also added to the input variables. Temperatures of the atmosphere and at point (e) were measured twice a day and their average values were used for analysis, but the temperature in the aeration tank was not measured. Therefore, the temperature at point (e) at time T+6 h was used as the temperature in the aeration tank. The load of COD per unit amount of sludge (COD/MLSS) is an important index in the aeration tank, and it was added to the input variables. The mass flow rate of the return sludge, calculated as “volumetric flow rate of the return sludge × MLSS” at the line of return sludge, was also added to the input variables. As a result, the total number of input variables was 34.
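The plug-flow alignment described above can be illustrated with a short sketch. It pairs the readings at points (b), (c) and (d) with the effluent value at point (e) that they are assumed to influence, using the 0, 3, 6 and 10 h lags given in the text. The function names, the dictionary layout and the choice of a forward-looking 6 h window for the air flow average are assumptions made for illustration; the paper does not state how the window was positioned.

```python
# Plug-flow lags in hours, taken from the text:
# (b) at T, (c) at T+3, (d) at T+6, (e) at T+10.
LAGS_H = {"b": 0, "c": 3, "d": 6, "e": 10}


def align_plug_flow(series, lags=LAGS_H, target="e"):
    """Pair lagged inputs with the effluent value they are assumed to affect.

    series maps a point label to a list of hourly readings (index = hour).
    Returns a list of (inputs, target_value) tuples, one per start hour T.
    """
    # Number of start hours T for which every lagged reading exists.
    horizon = min(len(series[p]) - lag for p, lag in lags.items())
    rows = []
    for t in range(max(horizon, 0)):
        inputs = {p: series[p][t + lag] for p, lag in lags.items() if p != target}
        rows.append((inputs, series[target][t + lags[target]]))
    return rows


def window_mean(values, start, width=6):
    """Mean of `width` consecutive hourly readings beginning at `start`.

    Assumption: the 6 h air flow average covers hours T .. T+5.
    """
    window = values[start:start + width]
    return sum(window) / len(window)


if __name__ == "__main__":
    # Hypothetical readings: 20 hours at each point.
    demo = {
        "b": list(range(0, 20)),
        "c": list(range(100, 120)),
        "d": list(range(200, 220)),
        "e": list(range(300, 320)),
    }
    for inputs, cod_e in align_plug_flow(demo)[:2]:
        print(inputs, cod_e)
```

With one week of hourly data (168 readings per point), this alignment leaves 158 usable start hours, since the last ten hours of inputs have no matching effluent reading.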
If strongly correlated variables are used together to construct the model, multicollinearity (13) will occur and the constructed model will be incorrect. Therefore, if a variable has a correlation coefficient above 0.8 with another variable, only one of the two variables should be used for the modeling. As a result, 16 variables were used to construct the model, and these are listed in Table 1. Using these 16 variables, the FNN model estimating the COD value in the effluent, in which the COD value is the output variable, was constructed. Since sigmoid functions were used as the internal function of the FNN, all input and output data were normalized to the range from 0.05 to 0.95. Among all sets of the input and output data in each season, which totaled 168 data sets, half the data were used for learning of the
FNN and the other half were used for evaluation of the FNN after learning.

Analysis for “A” plant
In wastewater treatment plants, all measurements are normally carried out only once a day. In order to confirm that the method presented in this paper can be applied to conventional plants, data were collected once a day (at 3 o’clock in the afternoon) from June to August, 1993 at the “A” plant in “H” city, and were used for the analysis. The flowsheet of “A” plant is almost the same as that of “U” plant. However, some measurements and their intervals in “A” plant were different from those in “U” plant. No measurements were taken at points (d) and (g), but the following variables were measured once a day at points (a) to (h) except (d) and (g), i.e., (a) influent; SS, COD, and flow rate of the influent, (b) primary effluent; SS, and COD, (c) the middle portion of the aeration tank; sludge volume (SV), sludge volume index (SVI), DO, and oxygen utilization rate (OUR), (e) effluent; COD, (f) line of excess sludge; flow rate of the excess sludge, and MLSS, (h) line of return sludge; flow rate of the return sludge, and MLSS. As a result, the number of variables measured was 14. Utilizing these data, the FNN model which estimates the COD value at time D+1 d from the variables measured at time D d was constructed. All input and output data were normalized to the range from 0.05 to 0.95. The total amount of data was limited (actual data sets: 53) and half the data would be insufficient for learning of the FNN. Therefore, in order to construct the FNN model with high accuracy, two thirds of the data were used for learning of the FNN and one third was used for evaluation of the FNN after learning.

Fuzzy neural network
In this paper, the “Type I” FNN proposed by Horikawa et al. (14) was used. The learning conditions and evaluation method for the FNN models used in the previous study (8) were adopted.

TABLE 1. Input variables for the model estimating COD of the wastewater treatment plant “U”

Variables                                 Measured point in Fig. 1
Operation time
Hydraulic retention time                  Aeration tank
COD                                       (b)
MLSS                                      (c) and (d)
COD/MLSS                                  Aeration tank
DO                                        (c) and (d)
pH                                        (e)
Temperature                               (e)
Flow rate of water                        (b) and (e)
Flow rate of sludge (Fs)                  (f) and (h)
Mass flow rate of biomass (Fs × MLSS)     (h)
Air flow rate                             (g)
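The two preprocessing steps described in Materials and Methods, pruning variables whose correlation coefficient exceeds 0.8 and normalizing all data to the range 0.05 to 0.95, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure actually used: the paper does not say which variable of a correlated pair was kept or whether the sign of the correlation was considered, so the sketch keeps the first variable encountered and compares absolute values. All function and variable names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def drop_correlated(columns, threshold=0.8):
    """Keep a variable only if |r| <= threshold with every variable kept so far.

    Assumption: ties are resolved by column order (first one wins).
    """
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def normalize(values, lo=0.05, hi=0.95):
    """Linearly rescale values so that min maps to lo and max maps to hi."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        return [(lo + hi) / 2 for _ in values]
    return [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]


if __name__ == "__main__":
    # Hypothetical columns: "b" is nearly a multiple of "a" and is dropped.
    cols = {
        "a": [1, 2, 3, 4, 5],
        "b": [2, 4, 6, 8, 10.5],
        "c": [5, 1, 4, 2, 3],
    }
    print(drop_correlated(cols))
    print(normalize([10, 20, 30]))
```

Keeping the normalized values away from 0 and 1 matches the reason given in the text: a sigmoid output only approaches those limits asymptotically.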
If an unnecessary variable is used as an input for the FNN model, the fuzzy rules of the model can hardly be understood and the accuracy of the model would be lower than that of a model using only the necessary variables. Therefore, the parameter increasing method (PIM) was used for the optimization of the input variables (14). The PIM identifies a fuzzy model by increasing, step by step, such parameters as the number of membership functions and/or the number of input variables. The fuzzy modeling by PIM was carried out for every combination of input variables at every step, and the combination of input variables with the smallest error was selected at every step. Considering the relationship between the number of learning data and the scale of the model, the maximum number of PIM steps was set to six for the model of “U” plant and five for that of “A” plant. Multiple regression analysis (MRA) was also used for comparison with the FNN model. Its input variables were optimized by the PIM and the maximum number of input variables was set equal to the maximum number of steps used for the FNN modeling.

RESULTS AND DISCUSSION

Selected input variables and the estimated result
Modeling was carried out for the data sets collected in each of the four seasons at “U” plant. The results using the data in the summer season are discussed as an example. The selected input variables of the FNN and MRA models are listed in Table 2. The selected variables differ between the FNN and MRA models because MRA can handle only linear relationships, while FNN can deal with more complex relationships. Figure 2 shows the time courses of the input variables selected by PIM for the FNN model. Since “U” plant is a municipal wastewater treatment plant and a major part of the influent is domestic wastewater, the COD value at point (b) in Fig. 1 changed periodically with a 24 h cycle. The temperature at point (e) in Fig. 1 was measured twice a day, but the average value was used for the analysis as mentioned in Materials and Methods. Therefore, it is plotted stepwise. The hydraulic retention time is plotted stepwise for the same reason. The simulated results for the evaluation data by the constructed FNN and MRA models using these input variables are shown in Fig. 3. The FNN model estimated the time courses with high accuracy. The average error and the average relative error of this model were 0.094 mg/l and 0.69%, respectively. On the other hand, the MRA model could only roughly estimate the COD trend, and some points were highly inaccurate. The average error and the average relative error of this model were 0.355 mg/l and 2.62%, respectively. The FNN model showed 3.7 times higher accuracy than the MRA model.
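The comparison between the two models rests on the average error and the average relative error. The sketch below shows one plausible way to compute them and to form the accuracy ratio from the reported summer figures. The definitions (mean absolute deviation, and mean absolute deviation divided by the measured value) are assumptions, since the paper refers to an earlier study (8) for the evaluation method; only the four reported numbers come from the text.

```python
def average_error(measured, estimated):
    """Mean absolute difference between measured and estimated values."""
    return sum(abs(m - e) for m, e in zip(measured, estimated)) / len(measured)


def average_relative_error(measured, estimated):
    """Mean absolute difference relative to the measured value, in percent."""
    total = sum(abs(m - e) / abs(m) for m, e in zip(measured, estimated))
    return 100.0 * total / len(measured)


# Errors reported in the text for the summer evaluation data.
FNN_SUMMER = {"avg_error_mg_per_l": 0.094, "avg_rel_error_pct": 0.69}
MRA_SUMMER = {"avg_error_mg_per_l": 0.355, "avg_rel_error_pct": 2.62}

ratio_abs = MRA_SUMMER["avg_error_mg_per_l"] / FNN_SUMMER["avg_error_mg_per_l"]
ratio_rel = MRA_SUMMER["avg_rel_error_pct"] / FNN_SUMMER["avg_rel_error_pct"]

if __name__ == "__main__":
    print(f"MRA/FNN average error ratio:          {ratio_abs:.2f}")
    print(f"MRA/FNN average relative error ratio: {ratio_rel:.2f}")
```

Both ratios fall between 3.7 and 3.8, which is consistent with the roughly 3.7-fold difference in accuracy quoted for the summer data.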
FIG. 2. Time courses for input variables used for simulation of summer data. (A) COD at point (b); (B) hydraulic retention time; (C) temperature at point (e). Horizontal axis: time (d).
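The inputs plotted in Fig. 2 were chosen by the parameter increasing method (PIM) outlined in Materials and Methods. A minimal sketch of such a stepwise search is given here, assuming that each step either adds a new input variable or selects an existing one again (which would correspond to adding a membership function), and that the search stops early when no candidate lowers the error. The `score` callable stands in for training a model and returning its error; `toy_score` and its gain values are purely hypothetical and are not derived from the plant data.

```python
from collections import Counter


def parameter_increasing(variables, score, max_steps):
    """Greedy stepwise search in the spirit of PIM.

    variables : candidate input variable names
    score     : callable taking {variable: times_selected} and returning an error
    max_steps : maximum number of increments (six for plant "U" in the text)
    """
    config = Counter()
    best_error = float("inf")
    for _ in range(max_steps):
        # Try incrementing every candidate once and keep the best trial.
        trials = []
        for v in variables:
            trial = config.copy()
            trial[v] += 1
            trials.append((score(trial), v))
        error, choice = min(trials)
        if error >= best_error:
            break  # assumption: stop when no increment improves the error
        config[choice] += 1
        best_error = error
    return dict(config), best_error


def toy_score(config):
    """Hypothetical error surface used only to exercise the search."""
    gains = {"cod_b": [0.4], "op_time": [0.2, 0.1], "hrt": [0.05]}
    error = 1.0
    for name, count in config.items():
        useful = gains.get(name, [])
        error -= sum(useful[:count])
        error += 0.01 * max(0, count - len(useful))  # penalty for useless extras
    return error


if __name__ == "__main__":
    names = ["cod_b", "op_time", "hrt", "do_c"]
    print(parameter_increasing(names, toy_score, max_steps=6))
```

In this toy run the variable `op_time` is selected twice, which mirrors how Table 2 marks variables chosen twice by PIM; in practice `score` would train the FNN (or the MRA model) on the learning data and return its error.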
Comparing the input variables selected by PIM for the FNN model using the summer data with those for the MRA model, “the COD value of the primary effluent” and “the hydraulic retention time in the aeration tank” were selected as input variables for both models. It is reasonable for “the COD value of the primary effluent” to be selected, since this variable shows the COD value at the entrance of the aeration tank and is strongly related to the COD of the effluent. Since the organic compounds in the wastewater are removed mainly in the aeration tank, “the hydraulic retention time in the aeration tank” was selected as an input variable. The other input variables selected in the FNN model were “temperature of the effluent” and “operation time”. Selection of the former is quite reasonable, since temperature affects the activities of microorganisms in the aeration tank. Selection of the latter is also reasonable, since “U” plant is a treatment process for municipal wastewater, and the amount and quality of wastewater in the influent vary periodically with a 24 h cycle. Both “operation time” and “temperature of the effluent” were selected twice by PIM, suggesting the importance of these input variables. On the other hand, the other input variables selected for the MRA model were “the average DO in the middle of the aeration tank”, “the average DO at the exit of the aeration tank”, “volumetric flow rate of effluent” and “COD/MLSS”. The reason why “the average DO in the middle of the aeration tank” and “the average DO at the
TABLE 2. Selected input variables for the FNN and MRA models using summer data by PIM

FNN model                        MRA model
COD at point (b)                 COD at point (b)
Operation time(a)                DO at point (c)
Hydraulic retention time         DO at point (d)
Temperature at point (e)(a)      Flow rate of effluent at point (e)
                                 COD/MLSS
                                 Hydraulic retention time

(a) The variable was selected twice by PIM.

FIG. 3. Simulation results for summer data using the FNN (A) and MRA (B) models. Symbols distinguish measured and estimated values. Horizontal axis: time (d).
TABLE 3. Relative errors for FNN and MRA models in each season

                                        Relative errors
Model       Data                        Spring   Summer   Autumn   Winter   Average
FNN model   Data for learning (%)       0.61     0.49     1.10     0.79     0.75
FNN model   Data for evaluation (%)     0.99     0.69     1.21     0.83     0.91
MRA model   Data for learning (%)       2.59     2.34     1.98     2.73     2.41
MRA model   Data for evaluation (%)     2.68     2.62     2.11     3.27     2.67

TABLE 4. Selected input variables of other seasons for the FNN model

Spring                              Autumn                       Winter/Summer
COD at point (b)                    COD at point (b)             COD at point (b)
Operation time(a)                   Operation time(a)            Operation time(a)
pH at point (e)                     Temperature at point (e)     Hydraulic retention time
Average air flow rate during 6 h    MLSS at point (d)            Temperature at point (e)
Hydraulic retention time

(a) The variable was selected twice by PIM.
exit of the aeration tank” were selected as input variables seems to be that DO in the aeration tank strongly affects the activities of aerobic microorganisms. “Volumetric flow rate of effluent” is the amount of wastewater which has been treated, and was selected for a reason similar to that for the selection of “operation time” in the FNN model. “COD/MLSS” reflects the substrate concentration per unit amount of microorganisms, which is one of the important variables in activated sludge treatment.

Modeling for spring, autumn and winter was also carried out independently, and three further models were constructed. Table 3 shows the relative errors for the estimation results of the FNN and MRA models for each season. The FNN models for each season were more precise than the MRA ones, both for learning and for evaluation. The average relative error for evaluation in the MRA model was about 2.9 times higher than that of the FNN model. Figure 4 shows the simulation results for the evaluation data in spring, autumn and winter using the FNN models. In winter, COD values in the effluent were slightly higher than in the other seasons. This may be due to low microbial activities of activated sludge in winter. In all seasons, however, the FNN models could simulate periodic changes in COD with high accuracy. We also attempted to simulate this process using the IAWQ ASM
FIG. 4. Simulation results for spring (A), autumn (B) and winter (C). Symbols distinguish measured and estimated values. Horizontal axis: time (d).
no. 1 (1). All parameters in the IAWQ ASM no. 1 were tuned to fit the data for learning, and the simulation was reasonable for the evaluation data (results not shown). However, the values of these parameters were far from those with normal biological significance.

Table 4 shows the selected input variables of the FNN model for typical seasons: May, August, October and February. In all the seasons, “the COD value of the primary effluent” and “operation time” were selected as common input variables in the FNN models, and “operation time” was selected twice by PIM. It is reasonable to conclude that these input variables are very important for the process. It is also interesting that the same input variables were selected for the FNN models in summer and winter, as shown in Tables 2 and 4. “Temperature of the effluent” was selected twice for the models of summer and winter, but was selected once for the model of autumn and not at all for the model of spring. The temperature is moderate during spring and autumn, and it seems to have a strong influence on microbial activities in the summer and winter seasons.

Conditions of activated sludge wastewater treatment vary slightly from day to day, but moderately from month to month and dramatically from season to season. Therefore, a flexible simulation method should be developed. In Table 4, the data for one week in May, August, October and February were used to construct each FNN model. These FNN models were constructed independently of each other. However, the data could be replaced with new ones hour by hour, since the measurement is performed hourly. The FNN models and the selected input variables would then change gradually and automatically. In this manner, FNN is very flexible, and this is the most important characteristic of the FNN model.

Analysis of the FNN rules
In FNN, the acquired knowledge can be described in the form of fuzzy rules by analysis of the connection weights in the FNN after learning.
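To make the link between connection weights and fuzzy rules concrete, the following sketch evaluates a small rule table of the kind a “Type I” FNN yields, where each rule has a constant consequent. It is an illustration under several assumptions and not the network used in this study: only two membership functions (“S” and “B”) per input are used, they are built from a single sigmoid with hypothetical center and slope, rule firing strengths are formed by multiplication, and the consequent values are invented for the example.

```python
import math
from itertools import product


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def memberships(x, center=0.5, slope=10.0):
    """Grades of 'small' (S) and 'big' (B) for a normalized input x.

    Assumption: complementary sigmoid-shaped membership functions.
    """
    big = sigmoid(slope * (x - center))
    return {"S": 1.0 - big, "B": big}


def infer(inputs, consequents):
    """Weighted average of constant rule consequents.

    inputs      : {variable name: normalized value}
    consequents : {tuple of labels in sorted-name order: constant output}
    """
    names = sorted(inputs)
    grades = {name: memberships(inputs[name]) for name in names}
    numerator = denominator = 0.0
    for labels in product("SB", repeat=len(names)):
        firing = math.prod(grades[n][lab] for n, lab in zip(names, labels))
        numerator += firing * consequents[labels]
        denominator += firing
    return numerator / denominator


# Hypothetical rule table for inputs ("hrt", "temp"); larger means higher COD.
RULES = {
    ("S", "S"): 0.2,
    ("S", "B"): 0.8,
    ("B", "S"): 0.4,
    ("B", "B"): 0.6,
}

if __name__ == "__main__":
    print(infer({"hrt": 0.05, "temp": 0.95}, RULES))
    print(infer({"hrt": 0.5, "temp": 0.5}, RULES))
```

Because each consequent is a single number attached to one combination of labels, the table can be read directly as if-then rules: a large entry means that the corresponding combination of input conditions leads to a large output.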
Table 5 shows the acquired fuzzy rules of FNN models using the summer data. In this table, “S”, “M” and “B” indicate small, medium and big, respectively. If the values in this table become big, the COD value of the effluent becomes large. From this table, the following rules were elucidated after the analysis; (i) If “operation time” becomes medium (the time is about noon), then the COD value becomes small (effluent COD becomes low). If “operation time” becomes small or big
TABLE 5. Identified fuzzy rule using summer data

Column labels: Operation time (S, M, B); COD at point (b) (B, S)
Row label: Hydraulic retention time

S    0.93   0.12   0.26   0.20   0.74   0.72
B    0.01   0.36   0.95   0.08   0.58   0.76
S    1.11   0.57   0.25   0.18   0.62   0.61
B    0.41   0.81   0.29   0.44   0.74   0.80
S    0.44   1.08   0.10   0.08   0.19   0.38
B    0.79   0.66   0.26   0.56   0.74   0.74
(the time is early in the morning or late at night), then the COD value becomes big (effluent COD becomes high). This rule reflects the effect of the hydraulic retention time of this plant (about 12 h) and of the daily cycle of domestic wastewater. (ii) In the case of big (long) “hydraulic retention time”, if the “temperature in the effluent” becomes small, medium and big (water temperature becomes low, medium and high), then the effluent COD becomes small, medium and big (effluent COD becomes low, slightly high and high), respectively. This rule can be interpreted as follows. In this plant, a nitrifying reaction often occurs in summer at high water temperature. The skilled operator in this plant keeps the air flow rate at a low level so as to limit the nitrifying reaction, which leads to a decrease in the activities of aerobic microorganisms, and the effluent COD becomes high. (iii) In the case of small (short) “hydraulic retention time”, if the “temperature in the effluent” becomes small, medium and big (water temperature becomes low, medium and high), then the effluent COD becomes big, medium and small (effluent COD becomes high, slightly low and low), respectively. This means that effluent COD decreases with increasing temperature, since the hydraulic retention time is short and the nitrifying reaction does not easily occur. In these ways, characteristic rules of each season can be elucidated from the FNN model of each season.

The simulation method presented in this paper can be easily applied to other wastewater treatment plants, if the data are measured every hour. In those cases, the membership functions will be tuned automatically by learning and the fuzzy rules can also be elucidated automatically. By comparing these rules with the ones derived from the present plant, the characteristics of these plants can be understood clearly. These will lead to the construction of
high level wastewater treatment systems which can simulate and control precisely the activated sludge process under various qualities of influent water.

Estimation results for “A” plant
In the “A” plant, the measurements were normally carried out once a day. Using the data obtained in the “A” plant, COD in the effluent was estimated by the same method mentioned above, and the results are shown in Fig. 5. The average error of 0.11 mg/l and the average relative error of 5.5% were obtained using the data measured once a day. Data measured 4-6 times a day are usually necessary for a precise estimation of the activated sludge process, since the mean hydraulic retention time in an aeration tank is normally 4-6 h. However, the estimated values for COD in the effluent coincided well with the actual data, and an FNN model with relatively high accuracy was constructed for conventional activated sludge processes, in which the process variables are measured once a day. In the case of the MRA model, the average error and relative error were 0.25 mg/l and 10.5%, respectively. This revealed that the FNN model also had higher accuracy than the MRA model.

FIG. 5. Simulation results using the FNN model for wastewater treatment plant “A”. Symbols distinguish measured and estimated values. Horizontal axis: time (d).

In conclusion, an FNN model for estimating COD values in the effluent with high accuracy could be constructed for the activated sludge process with the use of process data collected from the actual plant, measured either every hour or once a day, without the use of a mathematical model. These FNN models will be applied for more precise control of the activated sludge process.

REFERENCES

1. Henze, M., Grady, C. P. L. Jr., Gujer, W., Marais, G. v. R., and Matsuo, T.: Activated sludge model no. 1. IAWPRC Scientific and Technical Reports, 1 (1987).
2. Henze, M., Gujer, W., Mino, T., Matsuo, T., Wentzel, M. C., and Marais, G. v. R.: Activated sludge model no. 2. IAWPRC Scientific and Technical Reports, 3 (1995).
3. Roger, G. E. F.: Modeling and simulation in chemical engineering, p. 1-17. Wiley Interscience, New York, USA (1972).
4. Dalle Molle, D. T., and Edgar, T. F.: Qualitative modeling of chemical reaction systems, p. 1-36. In Michael, L. M. (ed.), Artificial intelligence in process engineering. Academic Press, Boston, USA (1990).
5. Martin, C., Bernard, P. A. G., Paul, L., and Jules, T.: Dynamic modeling of the activated sludge process: improving prediction using neural networks. Wat. Res., 29, 995-1004 (1995).
6. Iwahori, K., Yamakawa, K., and Fujita, M.: Effect of fuzzy control on influent variations in a pilot-scale activated sludge process. Preprints of the 7th International Conference on Computer Applications in Biotechnology, 541-546 (1998).
7. Shiraishi, H. and Nakahara, S.: Application of fuzzy reasoning using genetic algorithm for control of an activated sludge process. Kagaku Kougaku Ronbunshu, 22, 1-6 (1996). (in Japanese)
8. Hanai, T., Katayama, A., Honda, H., and Kobayashi, T.: Automatic fuzzy modeling for ginjo sake brewing process using fuzzy neural networks. J. Chem. Eng. Japan, 30, 94-100 (1997).
9. Honda, H., Hanai, T., Katayama, A., Tohyama, H., and Kobayashi, T.: Temperature control of ginjo sake mashing process by automatic fuzzy modeling using fuzzy neural networks. J. Ferment. Bioeng., 85, 107-112 (1998).
10. Nishida, Y., Hanai, T., Katayama, A., Honda, H., Fukaya, I., and Kobayashi, T.: Experimental ginjo-sake brewing by fuzzy neural network. J. Brew. Soc. Japan, 92, 447-451 (1997). (in Japanese)
11. Noguchi, H., Hanai, T., Takahashi, W., Ichii, T., Tanikawa, M., Masuoka, S., Honda, H., and Kobayashi, T.: Model construction for quality of beer and brewing process using FNN. Kagaku Kougaku Ronbunshu, 25 (1999), accepted. (in Japanese)
12. Japan Sewage Works Association: Standard methods for the examination of sewage, p. 42-72. Japan Sewage Works Assoc., Tokyo (1993). (in Japanese)
13. Suga, T.: Practice of multi-variate analysis, p. 67-351. Gendai Sugaku Sha, Tokyo (1997). (in Japanese)
14. Horikawa, S., Furuhashi, T., and Uchikawa, Y.: A study on fuzzy modeling using fuzzy neural networks. Proc. of International Fuzzy Engineering Symposium ‘91, 562-573 (1991).