Modeling and optimization of microbial lipid fermentation from cellulosic ethanol wastewater by Rhodotorula glutinis based on the support vector machine

Journal Pre-proofs

Lihe Zhang, Bin Chao, Xu Zhang

PII: S0960-8524(20)30050-X
DOI: https://doi.org/10.1016/j.biortech.2020.122781
Reference: BITE 122781

To appear in: Bioresource Technology

Received Date: 29 November 2019
Revised Date: 6 January 2020
Accepted Date: 7 January 2020

Please cite this article as: Zhang, L., Chao, B., Zhang, X., Modeling and optimization of microbial lipid fermentation from cellulosic ethanol wastewater by Rhodotorula glutinis based on the support vector machine, Bioresource Technology (2020), doi: https://doi.org/10.1016/j.biortech.2020.122781

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2020 Elsevier Ltd. All rights reserved.

Lihe Zhang, Bin Chao, Xu Zhang*

Beijing Key Lab of Bioprocess, National Energy R&D Center for Biorefinery, College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, China

Highlights

• The variation of organic matter during lipid fermentation was analyzed.
• BP-ANN and SVM models of the fermentation of cellulosic ethanol wastewater were established.
• SVM outperformed BP-ANN in prediction and optimization with a small sample.
• The parameters were optimized by a genetic algorithm based on SVM.

Keywords

Cellulosic ethanol wastewater; Microbial lipid; Support vector machine; Artificial neural network

Abstract

To establish models of microbial lipid production from cellulosic ethanol wastewater by R. glutinis, the biomass, lipid yield, and COD removal rate were investigated under different conditions. Subsequently, a genetic algorithm based on SVM was adopted to optimize the parameters for maximum biomass. The results demonstrated that the initial COD and glucose content had a significant effect on lipid synthesis. Most of the organic matter in the wastewater was consumed during lipid production. Compared with BP-ANN, SVM had better fitting and generalization ability for a small amount of experimental data. By genetic algorithm optimization based on SVM, the maximum biomass and lipid yield could reach 11.87 g/L and 2.18 g/L, respectively. The results suggest that the SVM model could be used as an effective tool to optimize fermentation conditions.

1. Introduction

Biofuels, as a crucial renewable resource, have attracted worldwide attention, and their development presents both opportunities and challenges (Baeyens et al., 2015; Wang et al., 2019; Moreno et al., 2017). In the past few decades, numerous studies have been conducted on the production of biofuels from lignocellulosic biomass (e.g., bioethanol, especially cellulosic ethanol) to substitute for fossil fuels (Humbird et al., 2011; Chandel et al., 2018; Hanna, 2018; Ibarra-Gonzalez et al., 2019). Meanwhile, the increasing demand for cellulosic ethanol has driven technical progress in biofuel production. However, the production of cellulosic ethanol is often accompanied by large volumes of high-strength wastewater (Humbird et al., 2011). Depending on the production process, 6 t to 80 t of wastewater is generated per ton of ethanol produced (Wang et al., 2017). Moreover, the wastewater has a high chemical oxygen demand (COD) and primarily contains sugars, organic acids, glycerol, inhibitors, inorganic salts, etc. Cellulosic ethanol wastewater is a typical biorefinery wastewater characterized by high concentration, high chroma, complex composition and low pH (Zhang et al., 2018). Without purification, it would seriously pollute the water environment and adversely affect people's daily lives. To achieve large-scale commercial production of fuel ethanol from lignocellulose, a practical solution for wastewater treatment should be developed (Sarkar et al., 2012).

However, the resource utilization of cellulosic ethanol wastewater has not been reported extensively. Previous studies have mainly focused on removing the COD or reducing the costs and energy consumption (Steinwinder, 2011; Zhao and Yu, 2013). Many treatment methods for industrial wastewater have been studied (e.g., evaporation, membrane separation, single cell protein production, and electrochemical processes), but they are difficult to implement and energy-consuming (Shan et al., 2015; Shan et al., 2015; Hu et al., 2017; Lynd et al., 2017). For cellulosic ethanol wastewater, which is rich in organic matter, the most economical and feasible method is to convert the residual sugars and organic compounds into lipid by oleaginous yeasts (Gude, 2016). As a crucial feedstock, microbial lipid can be used to produce biodiesel, biolubricant and jet fuel by different methods (Chuck et al., 2014). However, the high cost has made it difficult to commercialize biodiesel production from microbial lipid. Raw material is the main cost of microbial cultivation, accounting for more than 80% (Xue et al., 2010). According to previous research, producing microbial lipid from wastewater can effectively lower the production cost. Numerous studies have addressed the production of microbial lipid from various types of organic wastewater by oleaginous yeasts, especially R. glutinis (Xue et al., 2006; Hall et al., 2011; Ling et al., 2013; Zhou et al., 2013; Peng et al., 2013; Chen et al., 2009). In our previous study, the feasibility of using cellulosic ethanol wastewater to cultivate R. glutinis was explored. Subsequently, a novel strategy was formulated for lipid production by coupling oleaginous yeasts with activated sludge biological methods. The results suggested that using cellulosic ethanol wastewater to produce microbial lipid with R. glutinis not only lowers the cost of lipid production but also removes COD from the wastewater.

During fermentation, many process parameters have an important impact on the results, such as fermentation time, temperature, pH and solid-liquid ratio. The yield can be increased by optimizing these parameters. The existing methods for obtaining the optimal process parameters mostly include orthogonal experimental design (OED) (Zhu et al., 2013), response surface methodology (RSM) (Mohammadi et al., 2016), uniform design (UD) (Fang et al., 2000), artificial neural network (ANN) (Singh et al., 2017; Boukelia et al., 2016), and support vector machine (SVM) (Pablo et al., 2013). However, the accuracy of the OED, RSM and UD methods is limited when the experimental data are insufficient, which restricts their range of application. More and more researchers have therefore started to use ANN and SVM to build models for optimizing fermentation conditions. Modeling of the fermentation can provide a reliable data reference for the control and optimization of fermentation process parameters.

ANN simulates the biological nervous system. Generally, it is composed of an input layer, a hidden layer and an output layer. The layers are connected by weights, and each layer contains one or more nodes. It requires only the input and output data of the fermentation and does not require knowledge of the reaction mechanism of the fermentation process. By analyzing the biological nervous system from different angles, a variety of artificial neural network models have been obtained, among which BP-ANN and RBF-ANN are commonly used. Modeling based on the ANN method is therefore simple and easy, but its training algorithm converges slowly and easily falls into a local optimum, and it is not suitable for modeling with small sample data (Sebayang et al., 2017; Grahovac et al., 2016).

SVM is a pattern classification and nonlinear regression method based on statistical learning theory. It follows the structural risk minimization principle, which minimizes the empirical risk on the sample points together with the model complexity and thereby enhances the generalization ability of the model. It has developed rapidly and has been successfully applied in many fields (bioinformatics, medicine, text and handwriting recognition, etc.) (Irawan et al., 2015; Guerbai et al., 2018). Compared with conventional ANN, SVM is capable of obtaining global optimal solutions based on small samples. Fueled by the rapid advancement of computer technology, SVM has now been widely used in various disciplines of scientific research.

In this study, cellulosic ethanol wastewater was used as the raw material to culture R. glutinis, with the aims of evaluating the effect of the initial concentrations of glucose and COD on lipid fermentation and investigating the change of the major organics in the wastewater. To this end, the biomass, lipid synthesis and the concentrations of the major organics were monitored over time at different initial concentrations. Based on the data obtained from fermentation, the SVM and BP-ANN models of microbial lipid fermentation were established. Subsequently, the better model was selected, and a genetic algorithm was used to find the process parameters that give the maximum biomass concentration.

2. Material and methods

2.1 Microorganism, culture conditions, and wastewater

The yeast strain R. glutinis CGMCC No. 2258 was obtained from the China National Research Institute of Food and Fermentation Industries. It was stored on agar slant medium containing yeast extract (4 g/L), urea (2 g/L), and glucose (200 g/L) at 4 °C.

The seed and basic medium contained (g/L): glucose 40, (NH4)2SO4 2, KH2PO4 7, Na2SO4 2, MgSO4 1.5, and yeast extract 1.5. The cellulosic ethanol wastewater used in this study was purchased from COFCO Corporation, China. The wastewater was diluted to different proportions before the medium was made up, after which only glucose was added. All media were adjusted to the same initial pH of 5.5 and sterilized at 121 °C for 20 min. The inocula were cultured at 30 °C in a shaker at 180 rpm for 24 h and then transferred into 500 mL flasks containing 100 mL of medium at an inoculation size of 10% (v/v).

129

2.2. Analytical methods

130 131

The dry cell weight method was used to measure the biomass (Zhang et al., 2014). Moreover, the concentration of glucose was measured by a glucose biosensor (SBA

132

40C, Biological Institute of Shandong Academy of Sciences). Lipid was extracted using

133

the method reported by (Xue et al. 2008). The lipid components were analyzed as

134

described in the existing study (Zhang et al. 2014).

135

HPLC (Thermo Scientific, Waltham, MA, USA) was used to measure the

136

concentration of sugars, organic acids and other organics and the specific method was

137

followed the procedure reported in Patel et al. (2015).

2.3 Experimental design and prediction model

In this paper, MATLAB was used to establish the SVM model and the BP-ANN model. In the experiments, the biomass, the fermentation time and the glucose concentration were recorded to prepare data for modeling. The models were trained on the fermentation data collected from the experiments to obtain a prediction model of lipid fermentation from cellulosic ethanol wastewater by R. glutinis.

2.3.1 Establishment and functions of SVM model

In the present study, an SVM regression model was built using the fermentation data of R. glutinis. It is equivalent to a function map with an input and an output, as shown in Eq. (1).

$$ y = f(x) \qquad (1) $$

where x denotes the independent variable and y denotes the dependent variable. In this study, the variables are the fermentation time and the concentrations of the various substances at different times.

(1) Data preprocessing

The selected data were normalized by Eq. (2). In MATLAB, this normalization can be achieved with the ‘mapminmax’ function, called as in Eq. (3). The mapping adopted by the ‘mapminmax’ function is expressed as Eq. (4).

$$ f: x \rightarrow y = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \qquad (2) $$

[y, ps] = mapminmax(x, ymin, ymax)    (3)

$$ y = \frac{(y_{\max} - y_{\min}) \times (x - x_{\min})}{x_{\max} - x_{\min}} + y_{\min} \qquad (4) $$

where x denotes the data before normalization, which in this paper mainly refers to the data obtained from fermentation; xmin and xmax are the minimum and maximum of x; ymin and ymax are the range parameters of the mapping, with default values of -1 and 1, respectively; y is the normalized data; ps is the structure that holds the normalization mapping. ymin, ymax and ps are parameters related to the software settings.
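
The mapping of Eq. (4) can be sketched in a few lines of Python. This is an illustrative stand-in for the normalization step, not the MATLAB routine used in the study; the function names `fit_minmax`, `apply_minmax` and `reverse_minmax` and the sample values are hypothetical.

```python
def fit_minmax(values, ymin=-1.0, ymax=1.0):
    """Collect the settings needed to map `values` linearly onto [ymin, ymax]."""
    xmin, xmax = min(values), max(values)
    if xmax == xmin:
        raise ValueError("a constant column cannot be rescaled")
    return {"xmin": xmin, "xmax": xmax, "ymin": ymin, "ymax": ymax}


def apply_minmax(values, ps):
    """Eq. (4): y = (ymax - ymin) * (x - xmin) / (xmax - xmin) + ymin."""
    scale = (ps["ymax"] - ps["ymin"]) / (ps["xmax"] - ps["xmin"])
    return [scale * (x - ps["xmin"]) + ps["ymin"] for x in values]


def reverse_minmax(scaled, ps):
    """Invert Eq. (4) so that predictions can be read in original units."""
    scale = (ps["xmax"] - ps["xmin"]) / (ps["ymax"] - ps["ymin"])
    return [scale * (y - ps["ymin"]) + ps["xmin"] for y in scaled]


# Hypothetical column of initial glucose concentrations (g/L).
glucose = [20.0, 30.0, 40.0, 50.0]
ps = fit_minmax(glucose)
scaled = apply_minmax(glucose, ps)
restored = reverse_minmax(scaled, ps)
```

Keeping the settings in `ps` matters because the test data must be scaled with the minimum and maximum learned from the training data, and the predictions must be mapped back with the same settings.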

(2) Optimization selection of model parameters

In this paper, the best penalty factor c and model parameter g were obtained using the cross validation (CV) method. Function (5) in the LIBSVM-FarutoUltimate toolbox was employed to carry out the CV method.

[mse, bestc, bestg] = SVMcgForRegress(train_y, train_x, cmin, cmax, gmin, gmax, v, cstep, gstep, msestep)    (5)

Input: train_y refers to the dependent variable to be regressed, and its size is n by 1, where n is the number of samples; train_x is the independent variable, and its size is n by m, where n represents the number of samples and m represents the number of independent variables; cmin and cmax are the base-2 exponents of the minimum and maximum values of the penalty coefficient c, and their default values are -5 and 5, respectively; gmin and gmax are the base-2 exponents of the minimum and maximum values of the model parameter g, and their default values are -5 and 5, respectively; v represents the CV parameter, and its default is 5; cstep and gstep are the step sizes of the parameters c and g, and their defaults are both 1; msestep refers to the step size of the MSE graph, and its default is 5.

Output: mse denotes the lowest mean squared error in the CV process; bestc and bestg are the optimal parameters c and g, respectively.
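
The search that such a routine performs can be sketched as a plain grid search over powers of 2. The Python below is a minimal illustration, not the toolbox code: `cv_mse` is a hypothetical callable that would return the v-fold CV mean squared error of a model trained with a given (c, g), the steps are assumed to apply to the exponents, and `toy_cv_mse` is a made-up error surface used only so that the example runs.

```python
import math


def power_of_two_grid(lo, hi, step):
    """Return 2**e for the exponents e = lo, lo + step, ..., hi."""
    count = int(round((hi - lo) / step)) + 1
    return [2.0 ** (lo + i * step) for i in range(count)]


def grid_search(cv_mse, cmin=-5, cmax=5, gmin=-5, gmax=5, cstep=1, gstep=1):
    """Evaluate cv_mse(c, g) on the whole grid and return (mse, bestc, bestg)."""
    best = None
    for c in power_of_two_grid(cmin, cmax, cstep):
        for g in power_of_two_grid(gmin, gmax, gstep):
            mse = cv_mse(c, g)
            if best is None or mse < best[0]:
                best = (mse, c, g)
    return best


def toy_cv_mse(c, g):
    # Hypothetical error surface with its minimum at c = 2, g = 0.5.
    return (math.log2(c) - 1.0) ** 2 + (math.log2(g) + 1.0) ** 2


mse, bestc, bestg = grid_search(toy_cv_mse)
```

A coarse pass over a wide exponent range followed by a second pass with a narrower range and a smaller step is the usual way to refine the result without evaluating a dense grid everywhere.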

(3) Training and regression prediction

The SVM model was trained with the best parameters c and g obtained by the CV method, and the experimental data were subsequently predicted by regression analysis. The SVM model in this paper was implemented using the LIBSVM toolbox. The major functions of the LIBSVM toolbox are the training function ‘svmtrain’ and the prediction function ‘svmpredict’.

Training function ‘svmtrain’:

model = svmtrain(train_y, train_x, options)    (6)

Input: train_y denotes the dependent variable of the training set, and its size is n by 1, where n is the number of samples; train_x refers to the independent variable of the training set, and its size is n by m, where n represents the number of samples and m represents the number of independent variables; options holds the parameter options.

Output: model represents the model obtained by training.

The prediction function ‘svmpredict’:

[predict_y, mse, dec_value] = svmpredict(test_y, test_x, model)    (7)

Input: test_y denotes the dependent variable of the test set, and its size is n by 1, where n is the number of samples; test_x indicates the independent variable of the test set, and its size is n by m, where n represents the number of samples and m represents the number of independent variables; model is the SVM model trained by the svmtrain function.

Output: predict_y denotes the predicted result for the test set; mse refers to a column vector with the size of 3×1; dec_value is the decision value.
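
For reference, the roles of c and g can be written out under the assumption that the model is an ε-support vector regression with a radial basis function kernel. The text does not state the ‘options’ string, so this choice, as well as the symbols ε, ξ, φ and α, is an assumption made here for illustration.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^{*}}\quad
  & \tfrac{1}{2}\, w^{\top} w + c \sum_{i=1}^{n} \left( \xi_i + \xi_i^{*} \right) \\
\text{s.t.}\quad
  & y_i - w^{\top}\phi(x_i) - b \le \varepsilon + \xi_i, \\
  & w^{\top}\phi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\
  & \xi_i,\ \xi_i^{*} \ge 0, \qquad i = 1, \dots, n, \\[4pt]
K(x_i, x_j) &= \phi(x_i)^{\top}\phi(x_j) = \exp\!\left( -g\, \lVert x_i - x_j \rVert^{2} \right), \\
f(x) &= \sum_{i=1}^{n} \left( \alpha_i - \alpha_i^{*} \right) K(x_i, x) + b .
\end{aligned}
```

In this form, a larger c penalizes deviations beyond ε more heavily, and a larger g makes the kernel narrower, so the two parameters jointly trade fitting accuracy against smoothness. This is why they are tuned together by cross validation.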

2.3.2 Establishment and functions of BP-ANN model

The establishment of the BP artificial neural network model can be divided into three steps: construction, training and prediction. MATLAB has a neural network toolbox that includes BP-ANN. BP-ANN involves three functions: ‘newff’, ‘train’ and ‘sim’. Before the BP-ANN modeling, the data were also preprocessed by Eq. (3).

(1) Parameter setting function ‘newff’

net = newff(P, T, S)    (8)

Input: P is the input variable matrix; T is the output variable matrix; S is the number of nodes in the hidden layer. The sizes of the variable matrices were determined by the experimental data.

Output: net is the BP artificial neural network after initialization.

(2) The training function ‘train’

net = train(NET, X, T, INi, OUTi)    (9)

Input: NET is the network to be trained; X is the input variable matrix; T is the output variable matrix; INi is the input layer condition; OUTi is the output layer condition. In general, the first three parameters need to be set, and the last two parameters use the default values. The last two parameters are set only when the neural network needs to be optimized.

Output: net is the artificial neural network obtained after training.

(3) The prediction function ‘sim’

y = sim(net, x)    (10)

Input: net is a trained network; x is the input data.

Output: y is the data predicted by the network.
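
What the prediction step computes can be illustrated with the forward pass of a network with one hidden layer. The Python sketch below assumes a tanh activation in the hidden layer and a linear output node; the text does not state the transfer functions, and the function name `forward` and all weights are hypothetical values chosen for illustration.

```python
import math


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: the inputs feed tanh hidden nodes, then one linear output."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Three normalized inputs (wastewater fraction, glucose, time), two hidden nodes.
w_hidden = [[0.5, -0.2, 0.8], [0.1, 0.9, -0.4]]   # hypothetical weights
b_hidden = [0.0, 0.0]
w_out = [1.0, 1.0]
b_out = 0.25

at_origin = forward([0.0, 0.0, 0.0], w_hidden, b_hidden, w_out, b_out)
shifted = forward([1.0, 1.0, 1.0], w_hidden, b_hidden, w_out, b_out)
```

Training by back-propagation adjusts the weights and biases so that outputs like these match the measured biomass. Only the forward computation is shown here.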

3. Results and discussion

3.1 Effects of initial COD on the fermentation of lipid

R. glutinis, an important oleaginous yeast, can accumulate lipids using various wastewaters as raw materials. However, the cellulosic ethanol wastewater used in this study contains high concentrations of inhibitors (e.g., furfural, 5-hydroxymethyl, and furfuryl alcohol), which suppress the growth and lipid production of R. glutinis. To reduce the inhibition, the wastewater was diluted before fermentation, and the effects of initial COD on lipid fermentation were explored. Before fermentation, glucose was added to the wastewater at a concentration of 40 g/L. The results are shown in Fig. 1. The growth and lipid production of R. glutinis varied with the wastewater content of the medium. Biomass and lipid production decreased markedly when the proportion of wastewater exceeded 40%. In particular, when the wastewater content reached 50%, the lipid yield was nearly zero and the glucose concentration of the medium remained essentially unchanged. At this content, the concentration of inhibitors exceeded the tolerance limit of R. glutinis, and cell growth was nearly stagnant. In contrast, the curves of cell growth and lipid yield were almost identical at wastewater proportions of 25% and 33%. The cells grew rapidly before 144 h, and the biomass peaked at 192 h at 10.6 g/L and 11.12 g/L, respectively. Subsequently, the biomass gradually decreased, mainly because the nutrients were exhausted and the cells began to lyse. After the fermentation, the COD removal rate exceeded 80%.

The results suggested that the inhibitor concentration was a significant factor affecting the biomass and lipid synthesis of R. glutinis. More importantly, compared with the synthetic medium, there was no significant difference in lipid yield when the wastewater with only glucose added was used as the raw material (Gong, 2019). Moreover, microbial lipid fermentation by R. glutinis could indeed serve as a practical approach to treating the wastewater, since it both produces lipid and removes COD from the wastewater.

3.2 Effects of initial glucose concentration on the fermentation process of lipid

To obtain the maximum lipid yield and COD removal rate, R. glutinis was cultured at different initial glucose concentrations (20, 30, 40, and 50 g/L) with a wastewater content of 33%, and the biomass, lipid accumulation and COD removal rate were determined. As shown in Fig. 2, the biomass differed significantly as the initial glucose concentration increased from 20 g/L to 50 g/L. The maximum biomass increased progressively with the initial glucose concentration, reaching 7.12 g/L, 9.13 g/L, 11.12 g/L, and 11.52 g/L, respectively. When the glucose concentration was 30 g/L or lower, the glucose was consumed rapidly within 160 h, and the biomass did not accumulate sufficiently. When it was 40 g/L or higher, lipid and biomass were synthesized and accumulated sufficiently, and the lipid yield at the end of fermentation exceeded 1.9 g/L. Nevertheless, at a glucose concentration of 50 g/L, the rate of glucose consumption decreased noticeably.

The results revealed that adding glucose positively affected cell growth and lipid synthesis and also promoted the COD reduction of the wastewater. When glucose was added to the wastewater at 40 g/L, the COD removal rate reached 84%, which would greatly relieve the burden of wastewater treatment. Compared with other culture methods without added glucose (Wang et al., 2017; Zhou et al., 2013), the COD removal rate and lipid yield obtained in this study were more competitive. Although the lipid yield per unit of cell mass was not high, lipid production might be further improved by fed-batch cultivation in a bioreactor.
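
The text does not give the formula behind the COD removal rate. Under the usual definition, which is assumed here, it is the relative decrease of COD between the start and the end of fermentation; the symbols COD_0 and COD_t are introduced only for this illustration.

```latex
\text{COD removal rate}\ (\%) \;=\; \frac{\mathrm{COD}_0 - \mathrm{COD}_t}{\mathrm{COD}_0} \times 100
```

As a purely hypothetical example, a medium whose COD falls from 100 to 16 (in any consistent unit) has a removal rate of (100 - 16)/100 × 100 = 84%.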

3.3 The variations of organic matter during fermentation

Previous studies have characterized the cellulosic ethanol wastewater (Zhang et al., 2018). The organic components of the wastewater primarily include sugars, organic acids, aldehydes and so on. To examine the lipid fermentation process, a culture medium containing 33% wastewater and 40 g/L glucose was used as the initial medium for lipid production, and samples were taken every 24 h during the fermentation. The concentrations of the different organic compounds in the samples were measured and analyzed; the results are shown in Fig. 3.

According to the results, the biomass and lipid concentration rose over time, while the concentrations of glucose, lactic acid, acetic acid, glycerol, xylose, furfural, and furfuryl alcohol decreased. This suggests that R. glutinis can consume various substrates during the fermentation, as reported previously (Wiebe et al., 2012; Patel et al., 2015). Fig. 3-D shows that from 0 to 192 h, lactic acid decreased relatively slowly because glucose was abundant in the medium. Once the glucose was fully consumed at 192 h, the lactic acid began to be taken up and used rapidly by R. glutinis. The trend of acetic acid was more pronounced than that of lactic acid. From 0 to 24 h, acetic acid decreased markedly; thereafter, its decline gradually slowed. The results suggested that both lactic acid and acetic acid in the wastewater could be used by R. glutinis, and that R. glutinis uses acetic acid more readily than lactic acid. The same phenomenon occurred with xylose and glycerol: the presence of glucose hindered the utilization of other substrates, and it was not until the glucose concentration reached 0 g/L that glycerol and xylose began to be consumed rapidly.

The results also revealed that the concentrations of citric acid and succinic acid fluctuated irregularly during the fermentation, primarily because they are intermediate products of cell growth and metabolism. During the fermentation, the furfural and furfuryl alcohol in the wastewater decreased rapidly and were completely consumed after 120 h, which indicates that R. glutinis exhibits robust tolerance to and assimilation of furfural and furfuryl alcohol. For wastewater rich in complex organic matter, producing lipid and reducing the COD with R. glutinis is therefore very cost-effective. The change of organic matter content during the fermentation shows that the organic matter in the wastewater decreased gradually as the cell mass increased.

3.4 Training and prediction based on BP-ANN and SVM model

During microbial lipid fermentation from cellulosic ethanol wastewater by R. glutinis, different reaction conditions significantly affected the biomass and the lipid yield. The lipid synthesis results revealed that lipid synthesis and cell growth of R. glutinis are of the coupled type. Accordingly, the reaction conditions were optimized for the highest biomass. In the present study, a genetic algorithm was adopted to optimize the fermentation conditions in two steps: the first was the training and prediction of a model, and the second was the extremum optimization based on the genetic algorithm. A fermentation model therefore had to be built from fermentation data collected under a range of reaction conditions. In this study, SVM and BP-ANN were each employed to build a fermentation model, and the better model was then used as the basis of the genetic algorithm optimization.

The quality of the models was assessed by statistical measures, namely the coefficient of determination (R²) and the mean squared error (MSE). The MSE can be expressed as:

$$ MSE = \frac{1}{n}\left[\sum_{i=1}^{n}\left(y_{exp} - y_{pre}\right)^{2}\right] \qquad (11) $$

where yexp denotes the experimental value; ypre denotes the predicted value; n indicates the sample number.

In this paper, the biomass was taken as the dependent variable of the models, while the volume fraction of wastewater, the concentration of glucose supplementation and the fermentation time were adopted as independent variables. From the experiments on the effects of initial glucose concentration and initial COD on the fermentation process of lipid, 77 sets of data were collected; the data are listed in Supplementary data Table 1. Seven sets of data were randomly taken as test data, and the remaining 70 sets served as training data to build the models. Subsequently, the trained models were used to predict the output of the test data, and the prediction results were analyzed.
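
The two quality measures and the random 70/7 split can be sketched in Python as follows. The MSE follows Eq. (11). For R², the common definition 1 - SS_res/SS_tot is assumed, since the text does not spell it out, and LIBSVM's own regression output reports a squared correlation coefficient instead. The function names, the seed and the sample values are hypothetical.

```python
import random


def mse(y_exp, y_pre):
    """Eq. (11): mean of the squared differences between measured and predicted."""
    return sum((e - p) ** 2 for e, p in zip(y_exp, y_pre)) / len(y_exp)


def r_squared(y_exp, y_pre):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean = sum(y_exp) / len(y_exp)
    ss_res = sum((e - p) ** 2 for e, p in zip(y_exp, y_pre))
    ss_tot = sum((e - mean) ** 2 for e in y_exp)
    return 1.0 - ss_res / ss_tot


def split_indices(n_total=77, n_test=7, seed=0):
    """Randomly pick n_test row indices for testing; the rest are for training."""
    idx = list(range(n_total))
    random.Random(seed).shuffle(idx)
    return sorted(idx[n_test:]), sorted(idx[:n_test])


train_idx, test_idx = split_indices()
example_mse = mse([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
example_r2 = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```

Fixing the seed makes the split reproducible, which is useful when two models such as SVM and BP-ANN are to be compared on exactly the same test rows.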

To build the SVM model, the training and test data were normalized by the function of Eq. (3). The optimal parameters bestc and bestg were then obtained using the CV method based on the ‘SVMcgForRegress’ function of Eq. (5). First, a rough search for bestc and bestg was conducted with the ranges of c and g both set to [2⁻¹⁰, 2¹⁰]. The results of the rough search are presented in Fig. 4 A and B. The optimal parameters c and g under the rough search were 2.2974 and 4, respectively, and the minimum MSE under CV was 0.0016. Based on the rough results, the search ranges of c and g were narrowed to [2⁻⁴, 2⁴] and [2⁻⁵, 2⁵], respectively. The results are shown in Fig. 4 C and D: the optimal parameters c and g were 1 and 8, respectively, and the minimum MSE under CV was 0.0016. Lastly, the SVM model was trained on the training data with the optimal c and g, and regression prediction was then performed on the test data. The fitting results for the training data and test data are shown in Fig. 5 A and B. The curves in Fig. 5 show that the predicted data agreed closely with the experimental data for both the training data and the test data. The results suggested that the SVM fermentation model exhibited good generalization ability.

To build the BP-ANN model, the training data and test data were likewise normalized by the ‘mapminmax’ function of Eq. (3). The ‘newff’ function of Eq. (8) was adopted to build the BP-ANN; the number of iterations was set to 1000, the learning rate to 0.1, and the learning goal to 0.0000004. The BP-ANN was trained on the training data with the ‘train’ function of Eq. (9), so that the neural network was capable of predicting the biomass during the fermentation. Subsequently, the ‘sim’ function of Eq. (10) was called to test the network with the test data, and the fitting effect of the network was analyzed by assessing the error between the predicted output and the test output. The predicted results are shown in Fig. 5 C and Fig. 5 D. The results revealed that BP-ANN fitted the fermentation process of R. glutinis well, although some errors remained between the predicted and actual results, and some samples displayed relatively noticeable prediction errors.

The results in Table 1 show that the MSE of the training data and test data based on SVM was 0.0004 and 0.0018, respectively, and the R2 was 0.9959 and 0.9862, respectively. The MSE of the training data and test data based on BP-ANN was 0.0043 and 0.0105, respectively, and the R2 was 0.9899 and 0.9785, respectively. It is therefore suggested that, with only a few samples, the SVM model performed better than the BP-ANN model. SVM has strong potential for the soft sensing of internal variables in fermentation processes and for the prediction of fermentation results. The results suggest that the SVM model could be used to study the complex process of lipid fermentation. Accordingly, in the present study, the genetic algorithm optimization was also based on the SVM model.
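The two metrics compared in Table 1 can be computed as in the following Python sketch. It assumes the textbook definitions: MSE as the mean squared residual and R2 as the coefficient of determination. This section does not state whether the paper's R2 is the coefficient of determination or a squared correlation coefficient, nor whether the MSE was computed on normalized data, so the sketch only shows the general calculation. The measured values are the first five biomass entries of Table S1; the predicted values are invented for illustration only.

```python
def mse(y_true, y_pred):
    """Mean squared error between measured and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all measured values are equal")
    return 1.0 - ss_res / ss_tot


# Measured biomass (g/L): rows 1 to 5 of Table S1.
measured = [0.03, 1.94, 3.26, 6.22, 7.46]
# Hypothetical model outputs, invented for this example (not results from the paper).
predicted = [0.10, 1.80, 3.40, 6.00, 7.50]

example_mse = mse(measured, predicted)
example_r2 = r_squared(measured, predicted)
```

A lower MSE together with a higher R2 on the test data, as reported for SVM in Table 1, is what supports the claim of better generalization.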

3.5 Optimization by genetic algorithm based on SVM

Lastly, a genetic algorithm was adopted to find the optimal parameters for obtaining the maximum biomass based on the SVM model. The number of iterations, the population size, the crossover probability, the mutation probability, and the individual length were 500, 50, 0.4, 0.2, and 3, respectively. The fitness variation curve of the optimal individual during the optimization is plotted in Fig. 6. The fitness value of the optimal individual calculated by the genetic algorithm was 11.8723, and the optimal individual was [32.6048 46.2636 221.0520]. The results revealed that the biomass peaked at 11.87 g/L, an increase of 5%, and the lipid content was 2.18 g/L, at a wastewater volume fraction of 32.6%, a sugar content of 46.2 g/L, and a fermentation time of 221 h.
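The structure of such an optimization can be sketched in Python as follows. Only the settings stated above come from the paper (500 iterations, population size 50, crossover probability 0.4, mutation probability 0.2, individual length 3). Everything else is an assumption of this sketch: the search bounds are taken from the span of the conditions in Table S1, the selection scheme (binary tournament with elitism), the arithmetic crossover, and the per-individual mutation are illustrative choices, and surrogate_biomass is a hypothetical smooth function standing in for the trained SVM predictor, which is not available here.

```python
import random

# Assumed search ranges, taken from the span of the conditions in Table S1:
# wastewater volume fraction (%), glucose (g/L), fermentation time (h).
BOUNDS = [(25.0, 50.0), (20.0, 50.0), (0.0, 240.0)]


def surrogate_biomass(x):
    """Hypothetical stand-in for the trained SVM predictor (not the paper's model)."""
    w, g, t = x
    return (12.0 - ((w - 33.0) / 10.0) ** 2
            - ((g - 45.0) / 15.0) ** 2
            - ((t - 220.0) / 100.0) ** 2)


def tournament(pop, fits, rng, k=2):
    """Pick the fitter of k randomly drawn individuals (selection scheme assumed)."""
    picks = [rng.randrange(len(pop)) for _ in range(k)]
    return pop[max(picks, key=lambda i: fits[i])][:]


def crossover(a, b, rng):
    """Arithmetic crossover: each child is a weighted average of the parents."""
    alpha = rng.random()
    c1 = [alpha * x + (1.0 - alpha) * y for x, y in zip(a, b)]
    c2 = [(1.0 - alpha) * x + alpha * y for x, y in zip(a, b)]
    return c1, c2


def mutate(ind, rng, pm):
    """With probability pm, resample one randomly chosen gene within its bounds."""
    if rng.random() < pm:
        i = rng.randrange(len(ind))
        lo, hi = BOUNDS[i]
        ind[i] = rng.uniform(lo, hi)


def run_ga(fitness, generations=500, pop_size=50, pc=0.4, pm=0.2, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size)]
    best, best_fit, history = None, float("-inf"), []
    for _ in range(generations):
        fits = [fitness(ind) for ind in pop]
        i_best = max(range(pop_size), key=lambda i: fits[i])
        if fits[i_best] > best_fit:
            best, best_fit = pop[i_best][:], fits[i_best]
        history.append(best_fit)   # best-so-far fitness, one point per iteration
        children = [best[:]]       # elitism: carry the best individual over
        while len(children) < pop_size:
            a, b = tournament(pop, fits, rng), tournament(pop, fits, rng)
            if rng.random() < pc:
                a, b = crossover(a, b, rng)
            mutate(a, rng, pm)
            mutate(b, rng, pm)
            children.extend([a, b])
        pop = children[:pop_size]
    return best, best_fit, history
```

Calling run_ga(surrogate_biomass) returns the best individual, its fitness, and the best-so-far fitness per iteration, which is the kind of curve plotted in Fig. 6. Replacing surrogate_biomass with a function that normalizes the three inputs and queries a trained regression model would turn the sketch into a model-based optimization of the kind described above.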

The fermentation of lipid from cellulosic ethanol wastewater by R. glutinis is a complicated batch process that is severely nonlinear and time-varying. Traditional optimization methods are time-consuming and laborious. In this paper, computer models were established to optimize the fermentation conditions. The results demonstrated that the model established in our study had good generalization and prediction ability for the fermentation of microbial lipid from cellulosic ethanol wastewater. The best fermentation parameters were obtained from the model, and the model can be used to optimize more process parameters based on different data.

4. Conclusions

This study investigated the change of organic matter during lipid fermentation and established the corresponding fermentation model to optimize the fermentation parameters. The results demonstrated that the organic matter in cellulosic ethanol wastewater was indeed utilized by R. glutinis. The establishment of the fermentation model provides important guidance for optimizing parameters. With the development of big data and artificial intelligence technologies, these models can be continuously enriched with experimental data by adding novel detection methods and targets. Furthermore, the approach can be applied to other fermentation processes.

Acknowledgements

This work was supported by the National Key Research and Development Program of China (2017YFB0306800) and the Overseas Expertise Introduction Project for Discipline Innovation (B13005). The authors would like to express their thanks for this support.


Fig. 1 Effects of initial COD on biomass (A), glucose consumption (B), lipid content and lipid yield (C), and COD removal rate (D)

Fig. 2 Effects of initial glucose concentration on biomass (A), glucose consumption (B), lipid content and lipid yield (C), and COD removal rate (D)

Fig. 3 Changes of the organic matter in cellulosic ethanol wastewater during the fermentation: A (Glucose, Xylose, Glycerin); B (Citric acid, Succinic acid); C (Furfural, Furfuryl alcohol, HMF); D (Lactic acid, Acetic acid)

Fig. 4 Contour map (A: Rough search, C: Fine search) and 3D view (B: Rough search, D: Fine search) of parameter optimization by CV

Fig. 5 The fitting results of the training data (A: SVM model, C: BP-ANN model) and the test data (B: SVM model, D: BP-ANN model)

Fig. 6 Curve of fitness

Table 1 Comparison between SVM and BP-ANN


Fig. 6 Curve of fitness (plot not reproduced; x-axis: Iterations, 0 to 600; y-axis: Fitness, 11.79 to 11.88)

Table 1 Comparison between SVM and BP-ANN

                 SVM model            BP-ANN model
                 MSE       R2         MSE       R2
Training data    0.0004    0.9959     0.0043    0.9899
Test data        0.0018    0.9862     0.0105    0.9785

Table S1. The detailed data of the models in this study

Number  Volume fraction of wastewater (%)  Glucose (g/L)  Time (h)  Biomass (g/L)
1       25                                 40             0         0.03
2       25                                 40             24        1.94
3       25                                 40             48        3.26
4       25                                 40             72        6.22
5       25                                 40             96        7.46
6       25                                 40             120       8.35
7       25                                 40             144       10.1
8       25                                 40             168       10.54
9       25                                 40             192       10.6
10      25                                 40             216       10.29
11      25                                 40             240       10.16
12      40                                 40             0         0.03
13      40                                 40             24        0.05
14      40                                 40             48        0.1
15      40                                 40             72        2.34
16      40                                 40             96        3.43
17      40                                 40             120       4.26
18      40                                 40             144       5.08
19      40                                 40             168       7.17
20      40                                 40             192       8.45
21      40                                 40             216       8.94
22      40                                 40             240       8.72
23      50                                 40             0         0.03
24      50                                 40             24        0.03
25      50                                 40             48        0.03
26      50                                 40             72        0.2
27      50                                 40             96        0.53
28      50                                 40             120       0.56
29      50                                 40             144       1.96
30      50                                 40             168       2.37
31      50                                 40             192       2.73
32      50                                 40             216       2.7
33      50                                 40             240       2.72
34      33.33                              20             0         0.03
35      33.33                              20             24        0.98
36      33.33                              20             48        4.02
37      33.33                              20             72        5.31
38      33.33                              20             96        6.47
39      33.33                              20             120       7.12
40      33.33                              20             144       6.93
41      33.33                              20             168       6.77
42      33.33                              20             192       6.51
43      33.33                              20             216       6.4
44      33.33                              20             240       6.21
45      33.33                              30             0         0.03
46      33.33                              30             24        1.26
47      33.33                              30             48        3.94
48      33.33                              30             72        5.98
49      33.33                              30             96        7.66
50      33.33                              30             120       8.58
51      33.33                              30             144       9.03
52      33.33                              30             168       8.32
53      33.33                              30             192       8.08
54      33.33                              30             216       8.07
55      33.33                              30             240       7.93
56      33.33                              40             0         0.03
57      33.33                              40             24        1.37
58      33.33                              40             48        3.96
59      33.33                              40             72        5.02
60      33.33                              40             96        6.66
61      33.33                              40             120       8.85
62      33.33                              40             144       9.26
63      33.33                              40             168       10.38
64      33.33                              40             192       11.12
65      33.33                              40             216       10.87
66      33.33                              40             240       10.01
67      33.33                              50             0         0.03
68      33.33                              50             24        1.67
69      33.33                              50             48        4.6
70      33.33                              50             72        6.01
71      33.33                              50             96        7.2
72      33.33                              50             120       7.89
73      33.33                              50             144       7.84
74      33.33                              50             168       9.21
75      33.33                              50             192       10.95
76      33.33                              50             216       11.37
77      33.33                              50             240       11.52

Highlights

• The pattern of change of organic matter during lipid fermentation was analyzed.
• BP-ANN and SVM models of the fermentation of ethanol wastewater were established.
• SVM outperformed BP-ANN in prediction and optimization based on a small sample.
• The parameters were optimized by a genetic algorithm based on SVM.

Credit Author Statement

Lihe Zhang: Data curation; Methodology; Formal analysis; Investigation; Resources.

Bin Chao: Software.

Xu Zhang: Conceptualization; Funding acquisition; Supervision; Validation.

Declaration of interests

☒ The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

☐ The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: