Journal Pre-proofs
Modeling and optimization of microbial lipid fermentation from cellulosic ethanol wastewater by Rhodotorula glutinis based on the support vector machine
Lihe Zhang, Bin Chao, Xu Zhang
PII: S0960-8524(20)30050-X
DOI: https://doi.org/10.1016/j.biortech.2020.122781
Reference: BITE 122781
To appear in: Bioresource Technology
Received Date: 29 November 2019
Revised Date: 6 January 2020
Accepted Date: 7 January 2020
Please cite this article as: Zhang, L., Chao, B., Zhang, X., Modeling and optimization of microbial lipid fermentation from cellulosic ethanol wastewater by Rhodotorula glutinis based on the support vector machine, Bioresource Technology (2020), doi: https://doi.org/10.1016/j.biortech.2020.122781
© 2020 Elsevier Ltd. All rights reserved.
Modeling and optimization of microbial lipid fermentation from cellulosic ethanol wastewater by Rhodotorula glutinis based on the support vector machine
Lihe Zhang, Bin Chao, Xu Zhang*
Beijing Key Lab of Bioprocess, National Energy R&D Center for Biorefinery, College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, China
Highlights
The changes in organic matter during lipid fermentation were analyzed.
BP-ANN and SVM models of the fermentation of the ethanol wastewater were established.
SVM is better than BP-ANN for prediction and optimization based on small samples.
The parameters were optimized by a genetic algorithm based on SVM.
Key words
Cellulosic ethanol wastewater; Microbial lipid; Support vector machine; Artificial neural network
Abstract
To establish models of microbial lipid production from cellulosic ethanol wastewater by R. glutinis, the biomass, lipid yield, and COD removal rate were investigated under different conditions. Subsequently, a genetic algorithm based on SVM was adopted to optimize the parameters for obtaining the maximum biomass. The results demonstrated that the initial COD and glucose content had a significant effect on lipid synthesis. Most of the organic matter in the wastewater was consumed as lipid was produced. Compared with BP-ANN, SVM had better fitting and generalization ability for a small amount of experimental data. With genetic algorithm optimization based on SVM, the maximum biomass and lipid yield could reach 11.87 g/L and 2.18 g/L, respectively. The results suggest that the SVM model could be used as an effective tool to optimize fermentation conditions.
1. Introduction
Biofuels, as a crucial renewable resource, have attracted worldwide attention, and their development presents both opportunities and challenges (Baeyens et al., 2015; Wang et al., 2019; Moreno et al., 2017). In the past few decades, numerous studies have been conducted on the production of biofuels (e.g., bioethanol, especially cellulosic ethanol) from lignocellulosic biomass to substitute for fossil fuels (Humbird et al., 2011; Chandel et al., 2018; Hanna, 2018; Ibarra-Gonzalez et al., 2019). Meanwhile, the increasing demand for cellulosic ethanol has driven technical progress in biofuel production. However, the production of cellulosic ethanol generates large volumes of high-strength wastewater (Humbird et al., 2011). Depending on the production process, 6 to 80 t of wastewater is generated per ton of ethanol produced (Wang et al., 2017). The wastewater has a high chemical oxygen demand (COD) and primarily contains sugars, organic acids, glycerin, inhibitors, and inorganic salts. Cellulosic ethanol wastewater is a typical biorefinery wastewater characterized by high concentration, high chroma, complex composition, and low pH (Zhang et al., 2018). Without purification, it would seriously pollute the water environment and adversely affect people’s daily lives. To achieve large-scale commercial production of fuel ethanol from lignocellulose, a practical solution for wastewater treatment should be developed (Sarkar et al., 2012).
However, the resource utilization of cellulosic ethanol wastewater has not been reported extensively. Previous studies have focused mainly on removing the COD or on reducing costs and energy consumption (Steinwinder, 2011; Zhao and Yu, 2013). Many studies have examined treatment methods for industrial wastewater (e.g., evaporation, membrane separation, single cell protein production, and electrochemical processes), but these methods are difficult and energy-consuming (Shan et al., 2015; Shan et al., 2015; Hu et al., 2017; Lynd et al., 2017). For cellulosic ethanol wastewater, which is rich in organic matter, the most economical and feasible method is to convert the residual sugars and organic compounds into lipid using oleaginous yeasts (Gude, 2016). As a crucial feedstock, microbial lipid can be used to produce biodiesel, biolubricant, and jet fuel by different methods (Chuck et al., 2014). However, because of the high cost, commercial biodiesel production from microbial lipid has been difficult to achieve. Raw material accounts for more than 80% of the cost of microbial cultivation (Xue et al., 2010). According to previous research, producing microbial lipid from wastewater can effectively lower its production cost. Numerous studies have addressed the production of microbial lipid from various types of organic wastewater by oleaginous yeasts, especially R. glutinis (Xue et al., 2006; Hall et al., 2011; Ling et al., 2013; Zhou et al., 2013; Peng et al., 2013; Chen et al., 2009). In our previous study, the feasibility of using cellulosic ethanol wastewater to culture R. glutinis was explored. Subsequently, a novel strategy for lipid production was formulated by coupling oleaginous yeasts with activated sludge biological methods. The results suggested that using cellulosic ethanol wastewater to produce microbial lipid with R. glutinis can both lower the cost of microbial lipid production and remove the COD of the wastewater.
During fermentation, many process parameters, such as fermentation time, temperature, pH, and solid-liquid ratio, have an important impact on the results, and the yield can be increased by optimizing them. Existing methods for obtaining the optimal process parameters include orthogonal experimental design (OED) (Zhu et al., 2013), response surface methodology (RSM) (Mohammadi et al., 2016), uniform design (UD) (Fang et al., 2000), artificial neural network (ANN) (Singh et al., 2017; Boukelia et al., 2016), and support vector machine (SVM) (Pablo et al., 2013). However, the accuracy of the OED, RSM, and UD methods is limited when the experimental data are insufficient, which restricts their range of application. Increasingly, researchers use ANN and SVM to build models for optimizing fermentation conditions. Modeling the fermentation can provide a reliable data reference for the control and optimization of fermentation process parameters. ANN simulates the biological nervous system. Generally, it is composed of an input layer, a hidden layer, and an output layer. The layers are connected by weights, and each layer contains one or more nodes. An ANN needs only the input and output data of the fermentation and does not require study of the reaction mechanism of the fermentation process. By analyzing the biological nervous system from different angles, a variety of artificial neural network models have been obtained, among which BP-ANN and RBF-ANN are commonly used. Modeling based on the ANN method is therefore simple, but its training algorithm converges slowly and easily falls into local optima, and it is not suitable for modeling with small sample data (Sebayang et al., 2017; Grahovac et al., 2016). SVM is a pattern classification and nonlinear regression method derived from statistical learning theory. It follows the structural risk minimization principle, minimizing the empirical risk on the sample points together with the structural risk, which enhances the generalization ability of the model. It has developed rapidly and has been successfully applied in many fields (bioinformatics, medicine, text and handwriting recognition, etc.) (Irawan et al., 2015; Guerbai et al., 2018). Compared with conventional ANN, SVM is capable of obtaining global optimal solutions based on small samples. Fueled by the rapid advancement of computer technology, SVM is now widely used in various disciplines of scientific research.
In this study, cellulosic ethanol wastewater was used as the raw material to culture R. glutinis in order to evaluate the effects of the initial concentrations of glucose and COD on lipid fermentation and to investigate the changes in the major organics in the wastewater. To this end, the biomass, lipid synthesis, and the concentrations of the major organics were monitored at different times under different initial concentrations. Based on the data obtained from fermentation, SVM and BP-ANN models of microbial lipid fermentation were established. Subsequently, the better model was selected and used with a genetic algorithm to find the process parameters giving the maximum biomass concentration.
2. Material and methods
2.1 Microorganism, culture conditions, and wastewater
The yeast strain R. glutinis CGMCC No. 2258 was obtained from the China National Research Institute of Food and Fermentation Industries. It was stored at 4 °C on agar slant medium containing yeast extract (4 g/L), urea (2 g/L), and glucose (200 g/L).
The seed and basal medium contained (g/L): glucose 40, (NH4)2SO4 2, KH2PO4 7, Na2SO4 2, MgSO4 1.5, and yeast extract 1.5. The cellulosic ethanol wastewater used in this study was purchased from the COFCO Corporation, China. The wastewater was diluted to different proportions before the medium was prepared; subsequently, only glucose was added. All media were adjusted to the same initial pH of 5.5 and sterilized at 121 °C for 20 min. The inocula were cultured at 30 °C in a 180-rpm shaker for 24 h and then transferred into 500 mL flasks containing 100 mL of medium at an inoculation size of 10% (v/v).
2.2 Analytical methods
The dry cell weight method was used to measure the biomass (Zhang et al., 2014). The concentration of glucose was measured with a glucose biosensor (SBA-40C, Biological Institute of Shandong Academy of Sciences). Lipid was extracted using the method reported by Xue et al. (2008). The lipid components were analyzed as described previously (Zhang et al., 2014).
HPLC (Thermo Scientific, Waltham, MA, USA) was used to measure the concentrations of sugars, organic acids, and other organics, following the procedure reported by Patel et al. (2015).
2.3 Experimental design and prediction model
In this paper, MATLAB was used to establish the SVM and BP-ANN models. In the experiments, the biomass, the time, and the concentration of glucose were recorded to prepare the data for modeling. The models were trained using the fermentation data collected from the experiments to obtain a prediction model of lipid fermentation from cellulosic ethanol wastewater by R. glutinis.
2.3.1 Establishment and functions of SVM model
In the present study, an SVM regression model was built using the fermentation data of R. glutinis. It is equivalent to a function map with one input and one output, as shown in Eq. (1).
$y = f(x)$ (1)
where x denotes the independent variable and y is the dependent variable. In this study, the fermentation time and the concentration of various substances at different times are the dependent variable.
(1) Data preprocessing
The selected data were normalized according to Eq. (2). In MATLAB, this normalization can be achieved with the ‘mapminmax’ function, called as in (3). The mapping adopted by the ‘mapminmax’ function is expressed as Eq. (4).
$f: x \to y = \frac{x - x_{min}}{x_{max} - x_{min}}$ (2)
[y, ps] = mapminmax(x, ymin, ymax) (3)
$y = \frac{(y_{max} - y_{min}) \times (x - x_{min})}{x_{max} - x_{min}} + y_{min}$ (4)
where x denotes the data before normalization, which in this paper mainly refers to the data obtained from fermentation; xmin and xmax are the minimum and maximum of x; ymin and ymax are the range parameters of the mapping, with default values of -1 and 1, respectively; y is the normalized data; and ps is the structure that holds the normalization mapping. ymin, ymax, and ps are parameters related to software settings.
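As an illustration of Eq. (4), the mapping can be sketched in a few lines of Python. This is a minimal sketch and not the MATLAB ‘mapminmax’ implementation: the helper names `fit_minmax`, `apply_minmax`, and `reverse_minmax` are hypothetical, and the dictionary `ps` only mimics the role of the settings structure described above. The four glucose levels are those used in Section 3.2.

```python
def fit_minmax(x, ymin=-1.0, ymax=1.0):
    """Collect the settings of the linear map in Eq. (4) from the data x."""
    xmin, xmax = min(x), max(x)
    if xmax == xmin:
        # Eq. (4) divides by (xmax - xmin), so constant data cannot be mapped
        raise ValueError("constant data cannot be rescaled")
    return {"xmin": xmin, "xmax": xmax, "ymin": ymin, "ymax": ymax}


def apply_minmax(x, ps):
    """Map x onto [ymin, ymax] using stored settings (Eq. (4))."""
    scale = (ps["ymax"] - ps["ymin"]) / (ps["xmax"] - ps["xmin"])
    return [scale * (v - ps["xmin"]) + ps["ymin"] for v in x]


def reverse_minmax(y, ps):
    """Invert Eq. (4) to return normalized values to the original units."""
    scale = (ps["xmax"] - ps["xmin"]) / (ps["ymax"] - ps["ymin"])
    return [scale * (v - ps["ymin"]) + ps["xmin"] for v in y]


glucose = [20.0, 30.0, 40.0, 50.0]   # initial glucose levels (g/L), Section 3.2
ps = fit_minmax(glucose)             # settings fitted on the training data
scaled = apply_minmax(glucose, ps)   # values now lie in [-1, 1]
restored = reverse_minmax(scaled, ps)
```

Keeping the settings separate from the data allows the same mapping to be reused for the test data and for converting predictions back to g/L.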
(2) Optimization selection of model parameters
In this paper, the best penalty factor c and model parameter g were obtained using the cross validation (CV) method. Function (5) in the LIBSVM-FarutoUltimate toolbox was employed to carry out the CV method.
[mse, bestc, bestg] = SVMcgForRegress(train_y, train_x, cmin, cmax, gmin, gmax, v, cstep, gstep, msestep) (5)
Input: train_y is the dependent variable to be regressed, of size n by 1, where n is the number of samples; train_x is the independent variable, of size n by m, where n is the number of samples and m is the number of independent variables; cmin and cmax are the base-2 exponents of the minimum and maximum values of the penalty coefficient c (i.e., c ranges from 2^cmin to 2^cmax), with default values of -5 and 5, respectively; gmin and gmax are the base-2 exponents of the minimum and maximum values of the model parameter g, with default values of -5 and 5, respectively; v is the CV parameter, with a default of 5; cstep and gstep are the step sizes of the parameters c and g, each with a default of 1; msestep is the step size of the MSE graph, with a default of 5.
Output: mse is the lowest mean square error in the CV process; bestc and bestg are the optimal parameters c and g, respectively.
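The search that this function performs can be illustrated with a short Python sketch. It is not the toolbox code: the callback `cv_mse(c, g)` is a hypothetical stand-in for training an SVM with parameters (c, g) and returning its v-fold CV error, and `toy_cv_mse` is an artificial error surface chosen only so that the example runs. The helper `kfold_indices` shows one common way to form the v folds.

```python
import math
import random


def power_of_two_grid(lo, hi, step):
    """Return 2**e for exponents lo, lo+step, ... not exceeding hi."""
    n = math.floor((hi - lo) / step + 1e-9)
    return [2.0 ** (lo + i * step) for i in range(n + 1)]


def kfold_indices(n_samples, v, seed=0):
    """Shuffle sample indices and deal them into v folds for cross validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::v] for i in range(v)]


def grid_search(cv_mse, cmin=-5, cmax=5, gmin=-5, gmax=5, cstep=1, gstep=1):
    """Scan the (c, g) grid and return (lowest CV MSE, best c, best g)."""
    best = (float("inf"), None, None)
    for c in power_of_two_grid(cmin, cmax, cstep):
        for g in power_of_two_grid(gmin, gmax, gstep):
            mse = cv_mse(c, g)
            if mse < best[0]:
                best = (mse, c, g)
    return best


def toy_cv_mse(c, g):
    # Artificial error surface (an assumption for illustration only);
    # a real callback would train on v-1 folds and score the held-out fold.
    return math.log2(c) ** 2 + (math.log2(g) - 3.0) ** 2


folds = kfold_indices(70, 5)   # e.g. 70 training samples, v = 5
best_mse, best_c, best_g = grid_search(toy_cv_mse)
```

A rough search over a wide exponent range followed by a finer search over a narrower range, as done in Section 3.4, amounts to calling `grid_search` twice with different bounds and step sizes.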
(3) Training and regression prediction
The SVM model was trained with the best parameters c and g obtained by the CV method, and the experimental data were then predicted by regression. The SVM model in this paper was implemented using the LIBSVM toolbox, whose major functions are the training function ‘svmtrain’ and the prediction function ‘svmpredict’.
Training function ‘svmtrain’:
model = svmtrain(train_y, train_x, options) (6)
Input: train_y is the dependent variable of the training set, of size n by 1, where n is the number of samples; train_x is the independent variable of the training set, of size n by m, where n is the number of samples and m is the number of independent variables; options is a parameter option string.
Output: model is the model obtained by training.
Prediction function ‘svmpredict’:
[predict_y, mse, dec_value] = svmpredict(test_y, test_x, model) (7)
Input: test_y is the dependent variable of the test set, of size n by 1, where n is the number of samples; test_x is the independent variable of the test set, of size n by m, where n is the number of samples and m is the number of independent variables; model is the SVM model trained by the svmtrain function.
Output: predict_y is the predicted result for the test set; mse is a 3×1 column vector (for regression, its second and third entries are the mean squared error and the squared correlation coefficient); dec_value is the decision value.
2.3.2 Establishment and functions of BP-ANN model
The establishment of the BP artificial neural network model can be divided into three steps: construction, training, and prediction. MATLAB has a neural network toolbox that includes BP-ANN. BP-ANN involves three functions: ‘newff’, ‘train’, and ‘sim’. Before the BP-ANN modeling, the data were also preprocessed by Eq. (3).
(1) Parameter setting function ‘newff’
net = newff(P, T, S) (8)
Input: P is the input variable matrix; T is the output variable matrix; S is the number of nodes in the hidden layer. The sizes of the variable matrices were determined by the experimental data.
Output: net is the BP artificial neural network after initialization.
(2) The training function ‘train’
net = train(NET, X, T, INi, OUTi) (9)
Input: NET is the network to be trained; X is the input variable matrix; T is the output variable matrix; INi is the input layer condition; OUTi is the output layer condition. In general, the first three parameters need to be set, and the last two use their default values; the last two are set only when the neural network needs to be optimized.
Output: net is the artificial neural network obtained after training.
(3) The prediction function ‘sim’
y = sim(net, x) (10)
Input: net is a trained network; x is the input data.
Output: y is the data predicted by the network.
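The three steps (construction, training, prediction) can be illustrated with a small pure-Python network. This is a simplified sketch, not the MATLAB toolbox: it assumes one tanh hidden layer, a linear output node, and plain per-sample gradient descent, whereas the toolbox's actual transfer functions and training algorithm depend on its settings. The function names `init_net`, `forward`, `train_net`, and `mean_sq_error` and the toy data are hypothetical.

```python
import math
import random


def init_net(n_in, n_hidden, seed=0):
    """Construction step: one hidden layer with small random weights."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Prediction step: return hidden activations and the network output."""
    h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
         for row, b in zip(net["w1"], net["b1"])]
    return h, sum(w * v for w, v in zip(net["w2"], h)) + net["b2"]


def mean_sq_error(net, X, T):
    return sum((forward(net, x)[1] - t) ** 2 for x, t in zip(X, T)) / len(X)


def train_net(net, X, T, epochs=1000, lr=0.1, goal=4e-7):
    """Training step: back-propagate the output error sample by sample."""
    for _ in range(epochs):
        for x, t in zip(X, T):
            h, y = forward(net, x)
            err = y - t
            for j, hj in enumerate(h):
                # Gradient at hidden node j, computed before w2[j] is updated
                grad_h = err * net["w2"][j] * (1.0 - hj * hj)
                net["w2"][j] -= lr * err * hj
                net["b1"][j] -= lr * grad_h
                for i, xi in enumerate(x):
                    net["w1"][j][i] -= lr * grad_h * xi
            net["b2"] -= lr * err
        if mean_sq_error(net, X, T) < goal:
            break   # learning goal reached
    return net


# Toy data (not fermentation data): learn t = x**2 on [-1, 1]
X = [[-1.0], [-0.5], [0.0], [0.5], [1.0]]
T = [x[0] ** 2 for x in X]
net = init_net(n_in=1, n_hidden=3)
mse_before = mean_sq_error(net, X, T)
train_net(net, X, T)
mse_after = mean_sq_error(net, X, T)
```

The defaults for `epochs`, `lr`, and `goal` mirror the iteration number, learning rate, and learning goal reported in Section 3.4; the number of hidden nodes is an arbitrary choice for the toy problem.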
3. Results and discussion
3.1 Effects of initial COD on the fermentation of lipid
R. glutinis, an important oleaginous yeast, can accumulate lipids using various wastewaters as raw materials. However, the cellulosic ethanol wastewater used in this study contains high concentrations of inhibitors (e.g., furfural, 5-hydroxymethylfurfural, and furfuryl alcohol), which suppress the growth and lipid production of R. glutinis. To reduce the inhibition, the wastewater was diluted before fermentation, and the effects of the initial COD on lipid fermentation were explored. Before fermentation, glucose was added to the wastewater at a concentration of 40 g/L. The results are shown in Fig. 1. The growth and lipid production of R. glutinis differed with the wastewater content of the medium. The decrease in biomass and lipid production was obvious when the proportion of wastewater exceeded 40%. In particular, when the wastewater content reached 50%, the lipid yield was nearly zero and the glucose concentration of the medium remained essentially unchanged. At this proportion, the concentration of the inhibitors exceeded the tolerance limit of R. glutinis, and cell growth was nearly stagnant. In contrast, the curves of cell growth and lipid yield were almost identical at wastewater proportions of 25% and 33%. Cells grew rapidly before 144 h, and the biomass peaked at 192 h at 10.6 g/L and 11.12 g/L, respectively. Subsequently, the biomass gradually declined, mainly because the nutrients were exhausted and the cells began to lyse. After the fermentation, the COD removal rate exceeded 80%.
The results suggested that the concentration of inhibitors was a significant factor affecting the biomass and lipid synthesis of R. glutinis. More importantly, compared with the synthetic medium, there was no significant difference in lipid yield when the wastewater was used as the raw material with only glucose added (Gong, 2019). Microbial lipid fermentation by R. glutinis could therefore serve as a practical approach to treating the wastewater, capable of both producing lipid and removing the COD of the wastewater.
3.2 Effects of initial glucose concentration on the fermentation process of lipid
To obtain the maximum lipid yield and COD removal rate, different initial glucose concentrations (20, 30, 40, and 50 g/L) were used to culture R. glutinis at a wastewater content of 33%, and the biomass, lipid accumulation, and COD removal rate were determined. As shown in Fig. 2, the biomass differed significantly as the initial sugar concentration increased from 20 g/L to 50 g/L. With the increase in the initial glucose concentration, the maximum biomass progressively increased, reaching 7.12 g/L, 9.13 g/L, 11.12 g/L, and 11.52 g/L, respectively. When the concentration of glucose was less than 30 g/L, the glucose was consumed rapidly within 160 h, and the biomass did not accumulate sufficiently. When it was more than 40 g/L, lipid and biomass could be synthesized and accumulated sufficiently; at the end of fermentation, the lipid yield was more than 1.9 g/L. Nevertheless, when the glucose concentration was 50 g/L, the rate of glucose consumption decreased noticeably.
The results revealed that the addition of glucose had a positive effect on cell growth and lipid synthesis and also promoted the COD reduction of the wastewater. When glucose was added to the wastewater at a concentration of 40 g/L, the COD removal rate reached 84%, which would greatly relieve the pressure on wastewater treatment. Compared with other culture methods without added glucose (Wang et al., 2017; Zhou et al., 2013), the COD removal rate and lipid yield obtained in this study were more competitive. Although the lipid content of the cells was not high, lipid production might be further improved by fed-batch cultivation in a bioreactor.
3.3 The variations of organic matter during fermentation
The characteristics of cellulosic ethanol wastewater were determined in a previous study (Zhang et al., 2018). The organic components of the wastewater primarily included sugars, organic acids, and aldehydes. To examine the lipid fermentation process, a culture medium containing 33% wastewater and 40 g/L glucose was taken as the initial medium for lipid production, and samples were taken every 24 h during the fermentation. The concentrations of the different organic components in the samples were determined and analyzed; the results are shown in Fig. 3.
According to the results, the biomass and lipid concentration rose over time, while the concentrations of glucose, lactic acid, acetic acid, glycerin, xylose, furfural, and furfuryl alcohol decreased. This suggests that R. glutinis can consume various substrates during the fermentation, as reported by Wiebe et al. (2012) and Patel et al. (2015). Fig. 3-D suggests that from 0 to 192 h, lactic acid decreased relatively slowly because glucose was abundant in the medium. Once the glucose was fully consumed at 192 h, lactic acid began to be taken up and utilized rapidly by R. glutinis. The change in acetic acid was more noticeable than that in lactic acid. From 0 to 24 h, acetic acid decreased markedly; thereafter, its decline gradually moderated. The results suggested that both lactic acid and acetic acid in the wastewater could be utilized by R. glutinis, and that R. glutinis utilizes acetic acid more readily than lactic acid. The same phenomenon occurred with xylose and glycerin: the presence of glucose hindered the utilization of other substrates, and it was not until the glucose concentration reached 0 g/L that glycerin and xylose began to be consumed rapidly.
The results also revealed that the concentrations of citric acid and succinic acid fluctuated irregularly during the fermentation, primarily because they are intermediate products of cell growth and metabolism. During the fermentation, furfural and furfuryl alcohol in the wastewater decreased rapidly and were completely consumed after 120 h, indicating that R. glutinis exhibited robust tolerance to and assimilation of furfural and furfuryl alcohol. For wastewater rich in complex organic matter, producing lipid and reducing the COD with R. glutinis is therefore very cost-effective. The changes in organic matter content during the fermentation show that the organic matter in the wastewater decreased gradually as the cell mass increased.
3.4 Training and prediction based on BP-ANN and SVM model
During the microbial lipid fermentation from cellulosic ethanol wastewater by R. glutinis, different reaction conditions significantly affected the biomass and the lipid yield. The lipid synthesis results revealed that lipid synthesis in R. glutinis is coupled to cell growth. Accordingly, the reaction conditions were optimized to find those giving the highest biomass. In the present study, a genetic algorithm was adopted to optimize the fermentation conditions in two steps: the first was the training and prediction of the model, and the second was extremum optimization based on the genetic algorithm. A fermentation model therefore had to be built from fermentation data under a range of reaction conditions. In this study, SVM and BP-ANN were each employed to build a fermentation model, and the better model was then used for the genetic algorithm optimization.
The quality of the models was assessed by statistical measures, namely the coefficient of determination (R²) and the mean squared error (MSE). The MSE can be expressed as:
$MSE = \frac{1}{n}\left[\sum_{i=1}^{n}\left(y_{exp} - y_{pre}\right)^{2}\right]$ (11)
where yexp denotes the experimental value, ypre denotes the predicted value, and n is the number of samples.
In this paper, the biomass was taken as the dependent variable of the models, while the volume fraction of wastewater, the concentration of glucose supplementation, and the fermentation time were adopted as the independent variables. From the experiments described above, 77 sets of data on the effects of the initial glucose concentration and initial COD on the lipid fermentation process were collected; the data are listed in Supplementary data Table 1. Seven sets of data were randomly taken as test data, and the remaining 70 sets served as training data to build the models. Subsequently, the trained model was used to predict the output of the test data, and the prediction results were analyzed.
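For reference, the two quality measures and the random 70/7 split can be sketched in Python as follows. The MSE follows Eq. (11). For R², the common definition 1 − SS_res/SS_tot is assumed, since the text does not state which definition was used. The experimental values are the maximum biomass values from Section 3.2; the predicted values are invented solely to make the example runnable, and the function names are hypothetical.

```python
import random


def mse(y_exp, y_pre):
    """Mean squared error between experimental and predicted values (Eq. (11))."""
    return sum((e - p) ** 2 for e, p in zip(y_exp, y_pre)) / len(y_exp)


def r_squared(y_exp, y_pre):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean = sum(y_exp) / len(y_exp)
    ss_res = sum((e - p) ** 2 for e, p in zip(y_exp, y_pre))
    ss_tot = sum((e - mean) ** 2 for e in y_exp)
    return 1.0 - ss_res / ss_tot


def split(n_samples, n_test, seed=0):
    """Randomly split sample indices into (training, test) index lists."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return idx[n_test:], idx[:n_test]


y_exp = [7.12, 9.13, 11.12, 11.52]   # maximum biomass (g/L), Section 3.2
y_pre = [7.0, 9.2, 11.0, 11.6]       # hypothetical predictions, illustration only
error = mse(y_exp, y_pre)
fit = r_squared(y_exp, y_pre)
train_idx, test_idx = split(77, 7)   # 70 training sets, 7 test sets
```

Note that an MSE computed on normalized data is not directly comparable with one computed in g/L, so both models should be scored on the same scale.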
To build the SVM model, the training and test data were normalized with the function in Eq. (3). The optimal parameters bestc and bestg were then obtained by the CV method using the ‘SVMcgForRegress’ function of Eq. (5). First, a rough search for bestc and bestg was conducted, with the range of both parameters c and g set to [2^-10, 2^10]. The results of the rough search are presented in Fig. 4 A and B. The optimal parameters c and g under the rough search were 2.2974 and 4, respectively, and the minimum MSE under CV was 0.0016. Based on the rough results, the search ranges of c and g were narrowed to [2^-4, 2^4] and [2^-5, 2^5], respectively. The results are shown in Fig. 4 C and D. The optimal parameters c and g were 1 and 8, respectively, and the minimum MSE under CV was 0.0016. Lastly, the SVM model was trained on the training data with the optimal parameters c and g, and regression prediction was then performed on the test data. The fitting results for the training data and test data are shown in Fig. 5 A and B. The curves in Fig. 5 show that the predicted data fitted the experimental data closely for both the training data and the test data. The results suggested that the SVM fermentation model exhibited a prominent generalization ability.
To build the BP-ANN model, the training data and test data were likewise normalized with the ‘mapminmax’ function of Eq. (3). The ‘newff’ function of Eq. (8) was adopted to build the BP-ANN; the number of iterations was set to 1000, the learning rate to 0.1, and the learning goal to 0.0000004. The BP-ANN was trained on the training data with the ‘train’ function of Eq. (9), after which the network was capable of predicting the biomass during the fermentation. Subsequently, the ‘sim’ function of Eq. (10) was called to test the network with the test data, and the fitting effect of the network was analyzed by assessing the error between the predicted output and the test output. The predicted results are shown in Fig. 5 C and Fig. 5 D. The results revealed that BP-ANN fitted the fermentation process of R. glutinis well, although some errors remained between the predicted and actual results, and some samples displayed relatively noticeable prediction errors.
The results in Table 1 show that the MSE of the training data and test data based on SVM were 0.0004 and 0.0018, respectively, and the R² values were 0.9959 and 0.9862, respectively. The MSE of the training data and test data based on BP-ANN were 0.0043 and 0.0105, respectively, and the R² values were 0.9899 and 0.9785, respectively. This suggests that, with only a few samples, the SVM model performed better than the ANN model. SVM has strong potential for soft sensing of internal variables in fermentation processes and for predicting fermentation results. The results suggest that the SVM model could be used to study the complex process of lipid fermentation. Accordingly, the genetic algorithm optimization in the present study was based on the SVM model.
3.5 Optimization by genetic algorithm based on SVM
Lastly, a genetic algorithm was adopted to find the optimal parameters for obtaining the maximum biomass based on the SVM model. The number of iterations, the population size, the crossover probability, the mutation probability, and the individual length were 500, 50, 0.4, 0.2, and 3, respectively. The fitness curve of the optimal individual during the optimization is plotted in Fig. 6. The fitness value of the optimal individual calculated by the genetic algorithm was 11.8723, and the optimal individual was [32.6048 46.2636 221.0520]. The results revealed that the biomass peaked at 11.87 g/L, an increase of 5%, and the lipid content was 2.18 g/L at a wastewater volume fraction of 32.6%, a sugar content of 46.2 g/L, and a fermentation time of 221 h.
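The optimization step can be sketched as a small real-coded genetic algorithm in Python, using the settings reported above (500 iterations, population 50, crossover probability 0.4, mutation probability 0.2, individual length 3). Several parts are assumptions made for illustration and are not taken from the paper: the function `surrogate` is an artificial single-peak function standing in for the trained SVM model, the search bounds in `BOUNDS` are assumed, and tournament selection, arithmetic crossover, uniform mutation, and elitism are one common choice of operators.

```python
import random

# Assumed search ranges: wastewater fraction (%), glucose (g/L), time (h)
BOUNDS = [(25.0, 50.0), (20.0, 50.0), (0.0, 240.0)]
TARGET = (30.0, 45.0, 200.0)   # arbitrary peak of the artificial surrogate


def surrogate(ind):
    """Artificial stand-in for the trained model's predicted biomass."""
    return 10.0 - sum(((v - t) / (hi - lo)) ** 2
                      for v, t, (lo, hi) in zip(ind, TARGET, BOUNDS))


def run_ga(fitness, bounds, pop_size=50, generations=500,
           p_cross=0.4, p_mut=0.2, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = max(pop, key=fitness)[:]
    for _ in range(generations):
        # Selection: binary tournaments fill the mating pool
        parents = [max(rng.sample(pop, 2), key=fitness) for _ in range(pop_size)]
        children = []
        for a, b in zip(parents[0::2], parents[1::2]):
            a, b = a[:], b[:]
            if rng.random() < p_cross:
                # Arithmetic crossover keeps children inside the bounds
                w = rng.random()
                a, b = ([w * x + (1 - w) * y for x, y in zip(a, b)],
                        [w * y + (1 - w) * x for x, y in zip(a, b)])
            children += [a, b]
        for child in children:
            if rng.random() < p_mut:
                # Uniform mutation: redraw one gene within its range
                k = rng.randrange(len(bounds))
                child[k] = rng.uniform(*bounds[k])
        pop = children
        gen_best = max(pop, key=fitness)
        if fitness(gen_best) > fitness(best):
            best = gen_best[:]
        pop[0] = best[:]   # elitism: the best individual always survives
    return best, fitness(best)


best, best_fit = run_ga(surrogate, BOUNDS)
```

In an actual application, `surrogate` would be replaced by a call to the trained model on normalized inputs, with the prediction mapped back to g/L before being used as the fitness.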
The fermentation of lipid from cellulosic ethanol wastewater by R. glutinis is a complicated batch process that is severely nonlinear and time-varying. Traditional optimization methods are time-consuming and laborious. In this paper, computational models were established to optimize the fermentation conditions. The results demonstrated that the model established in this study had good generalization and prediction ability for the fermentation of microbial lipid from cellulosic ethanol wastewater. The best fermentation parameters were obtained from the model, and the model can be used to optimize further process parameters based on different data.
400
4. Conclusions
This study investigated the changes in organic matter during lipid fermentation and established a corresponding fermentation model to optimize the fermentation parameters. The results demonstrated that the organic matter in cellulosic ethanol wastewater was indeed utilized by R. glutinis. The fermentation model provides important guidance for parameter optimization. With the development of big data and artificial intelligence technologies, such models can be continuously enriched with experimental data by adding novel detection methods and targets. Furthermore, they can be applied to other fermentation processes.
Acknowledgements
This work was supported by the National Key Research and Development Program of China (2017YFB0306800) and the Overseas Expertise Introduction Project for Discipline Innovation (B13005). The authors are grateful for this support.
References
415
Baeyens, J., Qian, K., Appels, L., Dewil, R., Tan, T., 2015. Challenges and
416
opportunities in improving the production of bio-ethanol. Prog. Energy Combust.
417
Sci. 47, 60-88. https://doi.org/10.1016/j.pecs.2014.10.003
418
Boukelia, T.E., Arslan, O., Mecibah, M.S., Baeyens, J., Qian, K., Appels, L., Dewil, R.,
419
Tan, T., 2016. ANN-based optimization of a parabolic trough solar thermal power
420
plant. Appl. Therm. Eng. 107, 1210-1218.
421
https://doi.org/10.1016/j.applthermaleng.2016.07.084
422
Brännström, H., Kumar, H., Alén, R., 2018. Current and Potential Biofuel Production
423
from Plant Oils. Bioenergy Res. 11, 592–613. https://doi.org/10.1007/s12155-018-
424
9923-2
425
Chandel, A.K., Silveira, M.H.L., Vanelli, B.A., 2018. Second Generation Ethanol
426
Production: Potential Biomass Feedstock, Biomass Deconstruction, and Chemical
427
Platforms for Process Valorization 135-152. https://doi.org/10.1016/B978-0-12-
428
804534-3.00006-9
429
Chen, X., Li, Z., Zhang, X. et al.2009. Screening of Oleaginous Yeast Strains Tolerant
430
to Lignocellulose. Degradation Compounds Appl Biochem Biotechnol 159: 591.
431
https://doi.org/10.1007/s12010-008-8491-x
432
Chuck, C.J., Lou-Hing, D., Dean, R., Sargeant, L.A., Scott, R.J., Jenkins, R.W., 2014.
433
Simultaneous microwave extraction and synthesis of fatty acid methyl ester from
434
the oleaginous yeast Rhodotorula glutinis. Energy 69, 446-454.
435
https://doi.org/10.1016/j.energy.2014.03.036
436
Fang, K.T., Lin, D.K.J., Winker, P., Zhang, Y., 2000. Uniform Design: Theory and
437
Application. Technometrics 42, 237-248.
438
https://doi.org/10.1080/00401706.2000.10486045
439
Gong, G., Liu, L., Zhang X., Tan, T., 2019. Comparative evaluation of different carbon
440
sources supply on simultaneous production of lipid and carotene of Rhodotorula
441
glutinis with irradiation and the assessment of key gene transcription. Bioresour.
442
Technol, 288:21559. https://doi.org/10.1016/j.biortech.2019.121559
443
Grahovac, J., Jokić, A., Dodić, J., Vučurović, D., Dodić, S., 2016. Modelling and
444
prediction of bioethanol production from intermediates and byproduct of sugar
445
beet processing using neural networks. Renew. Energy 85, 953-958.
446
https://doi.org/10.1016/j.renene.2015.07.054
447 448
449
Gude, V.G., 2016. Wastewater Treatment in Microbial Fuel Cells - An Overview 122, 287-307. https://doi.org/10.1016/j.jclepro.2016.02.022
Guerbai, Y., Chibani, Y., Hadjadji, B., 2018. Handwriting gender recognition system
450
based on the one-class support vector machines. Seventh International Conference
451
on Image Processing Theory. IEEE.
452
Hall, J., Hetrick, M., French, T., Hernandez, R., Donaldson, J., Mondala, A., Holmes,
453
W., 2011. Oil production by a consortium of oleaginous microorganisms grown
454
on primary effluent wastewater. Journal of Chemical Technology &
455
Biotechnology, 86(1), 54-60. https://doi.org/10.1002/jctb.2506
456
Hu, Q., Fan, L., Gao, D., 2017. Pilot-scale investigation on the treatment of cellulosic
457
ethanol biorefinery wastewater. Chem. Eng. J. 309, 409–416.
458
https://doi.org/10.1016/j.cej.2016.10.066
459 460
Ibarra-Gonzalez, P., Rong, B.G., 2019. A review of the current state of biofuels production from lignocellulosic biomass using thermochemical conversion routes.
461 462
Chinese J. Chem. Eng. 27, 1523–1535. https://doi.org/10.1016/j.cjche.2018.09.018 Irawan, M. I., (2015). Study comparison backpropagation, support vector machine, and
463
extreme learning machine for bioinformatics data.
464
https://doi.org/10.17746/1563-0102.2015.43.2.116-125
465
Jovana, G., Aleksandar, J., Jelena, D., Damjan, V., Siniša, D., (2016). Modelling and
466
prediction of bioethanol production from intermediates and byproduct of sugar
467
beet processing using neural networks. Renewable Energy, 85, 953-958.
468
https://doi.org/10.1016/j.renene.2015.07.054
469
Ling, J. , Nip, S. , & Shim, H. . (2013). Enhancement of lipid productivity of
470
rhodosporidium toruloides in distillery wastewater by increasing cell density.
471
Bioresource Technology, 146, 301-309.
472
https://doi.org/10.1016/j.biortech.2013.07.023
473
Lynd, L.R., Liang, X., Biddy, M.J., Allee, A., Cai, H., Foust, T., Himmel, M.E., Laser,
474
M.S., Wang, M., Wyman, C.E., 2017. Cellulosic ethanol: status and innovation.
475
Curr. Opin. Biotechnol. 45, 202–211. https://doi.org/10.1016/j.copbio.2017.03.008
476
Mohammadi, R., Mohammadifar, M.A., Mortazavian, A.M., Rouhi, M., Ghasemi,
477
J.B.,Delshadian, Z., (2016). Extraction optimization of pepsin-soluble collagen
478
from eggshell membrane by response surface methodology (RSM). Food Chem.
479
190, 186-193. https://doi.org/10.1016/j.foodchem.2015.05.073
480
Moreno, A.D., Alvira, P., Ibarra, D., Tomás-Pejó, E., 2017. Production of Ethanol from
481
Lignocellulosic Biomass. Biofuels and Biorefineries, Vol. 7. Springer Singapore.
482
https://doi.org/10.1007/978-981-10-4172-3_12
483
Pablo, R.P., Juan, C.R., Chaparro, D.G., Venzor, J.A.P., Carreon, A.Q., Rosiles, J.G.,
484
(2013).Support Vector Machines for Regression: A Succinct Review of Large-
485
Scale and Linear Programming Formulations. Int. J. Intell. Sci. 3, 5-14.
486
Patel, A., Pruthi, V., Singh, R.P., Pruthi, P.A., 2015. Synergistic effect of fermentable
487
and non-fermentable carbon sources enhances TAG accumulation in oleaginous
488
yeast Rhodosporidium kratochvilovae HIMPA1. Bioresour Technol 188, 136-144.
489
https://doi.org/10.1016/j.biortech.2015.02.062
490
Pattananuwat, N., Aoki, M., Hatamoto, M., Nakamura, A., Yamazaki, S., Syutsubo, K.,
491
Araki, N., Takahashi, M., Harada, H., Yamaguchi, T., 2013. Performance and
492
microbial community analysis of a full-scale hybrid anaerobic-aerobic membrane
493
system for treating molasses-based bioethanol wastewater. Int. J. Environ. Res. 7,
494
979–988. https://doi.org/10.22059/ijer.2013.681
495
Peng, W., Huang, C., Chen, Xue-fang, Xiong, L., Chen, Xin-de, Chen, Y., Ma, L.,
496
2013. Microbial conversion of wastewater from butanol fermentation to microbial
497
oil by oleaginous yeast Trichosporon dermatis. Renew. Energy 55, 31-34.
498
https://doi.org/10.1016/j.renene.2012.12.017
499
Sarkar, N., Ghosh, S.K., Bannerjee, S., Aikat, K., 2012. Bioethanol production from
500
agricultural wastes: An overview. Renew. Energy 37, 19-27.
501
https://doi.org/10.1016/j.renene.2011.06.045\
502
Sebayang, A.H., Masjuki, H.H., Ong, H.C., Dharma, S., Silitonga, A.S., Kusumo, F.,
503
Milano, J., 2017. Optimization of bioethanol production from sorghum grains
504
using artificial neural networks integrated with ant colony. Ind. Crop. Prod. 97,
505 506
146-155. https://doi.org/10.1016/j.indcrop.2016.11.064 Shan, L., Yu, Y., Zhu, Z., Zhao, W., Wang, H., Ambuchi, J.J., Feng, Y., 2015.
507
Microbial community analysis in a combined anaerobic and aerobic digestion
508
system for treatment of cellulosic ethanol production wastewater. Environ. Sci.
509
Pollut. Res. 22, 17789–17798. https://doi.org/10.1007/s11356-015-4938-0
510
Shan, L., Liu, J., Ambuchi, J.J., Yu, Y., Huang, L., Feng, Y., 2017. Investigation on
511
decolorization of biologically pretreated cellulosic ethanol wastewater by
512
electrochemical method. Chem. Eng. J. 323, 455–464.
513
https://doi.org/10.1016/j.cej.2017.04.121
514
Singh, D.K., Verma, D.K., Singh, Y., Hasan, S.H., 2017. Preparation of CuO
515
nanoparticles using Tamarindus indica pulp extract for removal of As(III):
516
Optimization of adsorption process by ANN-GA. J. Environ. Chem. Eng. 5, 1302-
517
1318. https://doi.org/10.1016/j.jece.2017.01.046
518 519 520
Steinwinder, T., Gill, E., Gerhardt, M., n.d. Process Design of Wastewater Treatment for the NREL Cellulosic Ethanol Model. Wang, J., Hu, M., Zhang, H., Bao, J., 2017. Converting Chemical Oxygen Demand
521
(COD) of Cellulosic Ethanol Fermentation Wastewater into Microbial Lipid by
522
Oleaginous Yeast Trichosporon cutaneum. Appl. Biochem. Biotechnol. 182,
523
1121–1130. https://doi.org/10.1007/s12010-016-2386-z
524
Wang, M., Dewil, R., Maniatis, K., Wheeldon, J., Tan, T., Baeyens, J., Fang, Y., 2019.
525
Biomass-derived aviation fuels: Challenges and perspective. Prog. Energy
526
Combust. Sci. 74, 31–49.https://doi.org/10.1016/j.pecs.2019.04.004
527
Xue, F., Zhang, X., Luo, H., Tan, T., 2006. A new method for preparing raw material
528
for biodiesel production. Process Biochem. 41, 1699-1702.
529
https://doi.org/10.1016/j.procbio.2006.03.002
530
Xue, F., Miao, J., Zhang, X., Luo, H., Tan, T., 2008. Studies on lipid production by
531
Rhodotorula glutinis fermentation using monosodium glutamate wastewater as
532
culture medium. Bioresour. Technol. 99, 5923-5927.
533
https://doi.org/10.1016/j.biortech.2007.04.046
534
Xue, F., Gao, B., Zhu, Y., Zhang, X., Feng, W., Tan, T., 2010. Pilot-scale production of
535
microbial lipid using starch wastewater as raw material. Bioresour. Technol. 101,
536
6092-6095. https://doi.org/10.1016/j.biortech.2010.01.124
537
Zhang, X., Meng, L., Xu, Z., Tianwei, T., n.d. Microbial lipid production and organic
538
matters removal from cellulosic ethanol wastewater through coupling oleaginous
539
yeasts and activated sludge biological method. Bioresour. Technol. 267, 395-400.
540
https://doi.org/10.1016/j.biortech.2018.07.075
541
Zhang, Z., Zhang, X., Tan, T., 2014. Lipid and carotenoid production by Rhodotorula
542
glutinis under irradiation/high-temperature and dark/low-temperature cultivation.
543
Bioresour. Technol. 157, 149-153. https://doi.org/10.1016/j.biortech.2014.01.039.
544
Zhao, & Yu, B., 2013. Study on treatment of cellulose fuel ethanol wastewater and
545
application. Advanced Materials Research, 777, 365-369.
546
https://doi.org/10.4028/www.scientific.net/AMR.777.365
547 548
Zhou, W., Wang, W., Li, Y., Zhang, Y., 2013. Lipid production by Rhodosporidium toruloides Y2 in bioethanol wastewater and evaluation of biomass energetic yield.
549 550
Bioresour. Technol. 127, 435-440. https://doi.org/10.1016/j.biortech.2012.09.067
Fig. 1 Effects of initial COD on biomass (A), glucose consumption (B), lipid content and lipid yield (C), and COD removal rate (D)
Fig. 2 Effects of initial glucose concentration on biomass (A), glucose consumption (B), lipid content and lipid yield (C), and COD removal rate (D)
Fig. 3 Changes of the organic matter in cellulose ethanol wastewater during the fermentation: A (Glucose, Xylose, Glycerin); B (Citric acid, Succinic acid); C (Furfural, Furfuryl alcohol, HMF); D (Lactic acid, Acetic acid)
Fig. 4 Contour map (A: Rough search, C: Fine search) and 3D view (B: Rough search, D: Fine search) of parameter optimization by CV
Fig. 5 The fitting results of the training data (A: SVM model, C: BP-ANN model) and the test data (B: SVM model, D: BP-ANN model)
Fig. 6 Curve of fitness (x-axis: iterations, 0 to 600; y-axis: fitness, 11.79 to 11.88)
Table 1 Comparison between SVM and BP-ANN
               SVM model         BP-ANN model
               MSE      R2       MSE      R2
Training data  0.0004   0.9959   0.0043   0.9899
Test data      0.0018   0.9862   0.0105   0.9785
Table S1. The detailed data of the models in this study
Number  Volume fraction of wastewater (%)  Glucose (g/L)  Time (h)  Biomass (g/L)
1       25                                 40             0         0.03
2       25                                 40             24        1.94
3       25                                 40             48        3.26
4       25                                 40             72        6.22
5       25                                 40             96        7.46
6       25                                 40             120       8.35
7       25                                 40             144       10.1
8       25                                 40             168       10.54
9       25                                 40             192       10.6
10      25                                 40             216       10.29
11      25                                 40             240       10.16
12      40                                 40             0         0.03
13      40                                 40             24        0.05
14      40                                 40             48        0.1
15      40                                 40             72        2.34
16      40                                 40             96        3.43
17      40                                 40             120       4.26
18      40                                 40             144       5.08
19      40                                 40             168       7.17
20      40                                 40             192       8.45
21      40                                 40             216       8.94
22      40                                 40             240       8.72
23      50                                 40             0         0.03
24      50                                 40             24        0.03
25      50                                 40             48        0.03
26      50                                 40             72        0.2
27      50                                 40             96        0.53
28      50                                 40             120       0.56
29      50                                 40             144       1.96
30      50                                 40             168       2.37
31      50                                 40             192       2.73
32      50                                 40             216       2.7
33      50                                 40             240       2.72
34      33.33                              20             0         0.03
35      33.33                              20             24        0.98
36      33.33                              20             48        4.02
37      33.33                              20             72        5.31
38      33.33                              20             96        6.47
39      33.33                              20             120       7.12
40      33.33                              20             144       6.93
41      33.33                              20             168       6.77
42      33.33                              20             192       6.51
43      33.33                              20             216       6.4
44      33.33                              20             240       6.21
45      33.33                              30             0         0.03
46      33.33                              30             24        1.26
47      33.33                              30             48        3.94
48      33.33                              30             72        5.98
49      33.33                              30             96        7.66
50      33.33                              30             120       8.58
51      33.33                              30             144       9.03
52      33.33                              30             168       8.32
53      33.33                              30             192       8.08
54      33.33                              30             216       8.07
55      33.33                              30             240       7.93
56      33.33                              40             0         0.03
57      33.33                              40             24        1.37
58      33.33                              40             48        3.96
59      33.33                              40             72        5.02
60      33.33                              40             96        6.66
61      33.33                              40             120       8.85
62      33.33                              40             144       9.26
63      33.33                              40             168       10.38
64      33.33                              40             192       11.12
65      33.33                              40             216       10.87
66      33.33                              40             240       10.01
67      33.33                              50             0         0.03
68      33.33                              50             24        1.67
69      33.33                              50             48        4.6
70      33.33                              50             72        6.01
71      33.33                              50             96        7.2
72      33.33                              50             120       7.89
73      33.33                              50             144       7.84
74      33.33                              50             168       9.21
75      33.33                              50             192       10.95
76      33.33                              50             216       11.37
77      33.33                              50             240       11.52
CRediT Author Statement
Lihe Zhang: Data curation; Methodology; Formal analysis; Investigation; Resources.
Bin Chao: Software.
Xu Zhang: Conceptualization; Funding acquisition; Supervision; Validation.
Declaration of interests
☒ The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
☐ The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: