An adaptive mode convolutional neural network based on bar-shaped structures and its operation modeling to complex industrial processes

Journal Pre-proof

Yongjian Wang, Hongguang Li, Chu Qi

PII: S0169-7439(19)30802-0
DOI: https://doi.org/10.1016/j.chemolab.2020.103932
Reference: CHEMOM 103932
To appear in: Chemometrics and Intelligent Laboratory Systems
Received Date: 10 December 2019
Revised Date: 25 December 2019
Accepted Date: 6 January 2020

Please cite this article as: Y. Wang, H. Li, C. Qi, An adaptive mode convolutional neural network based on bar-shaped structures and its operation modeling to complex industrial processes, Chemometrics and Intelligent Laboratory Systems (2020), doi: https://doi.org/10.1016/j.chemolab.2020.103932. © 2020 Published by Elsevier B.V.

Yongjian Wang: Writing - Reviewing, Conceptualization, Visualization. Hongguang Li: Supervision, Language improvement. Chu Qi: Validation, Language improvement.

Yongjian Wang1,2, Hongguang Li1*, Chu Qi1

(1. College of Information Science and Technology, Beijing University of Chemical Technology.
2. Department of Chemical and Biomolecular Engineering, University of California, Los Angeles)

*Corresponding author, E-mail: [email protected]

Abstract: Optimal operation modeling plays an important role in complex industrial processes; however, with the increasing complexity and high nonlinearity of industrial processes, it becomes more and more difficult to establish an accurate operation model using first-principles methods. In this paper, an adaptive mode convolutional neural network framework based on bar-shaped structures (BS-AMCNN), a data-driven model, is proposed. First, a bar-shaped structure is designed specifically to handle industrial process data; it carries over the advantages of CNNs in processing image data to industrial process data. Meanwhile, the convolution windows and pooling windows in the proposed BS-AMCNN algorithm are replaced by translation-only sliding bar-shaped windows, so the algorithm can adjust the CNN structure adaptively among three different modes depending on the current process status, and the optimal operation model can be obtained accordingly. An experiment on a real complex industrial process, the methanol production process, validates the effectiveness of the proposed method. The proposed method is further compared with the traditional CNN method and the back-propagation (BP) method; the results demonstrate its advantages.

Keywords: Convolutional neural networks; Bar-shaped; Operation modeling; Adaptive mode; Methanol production process

1. Introduction

With the development of industrial processes, the efficiency of industrial processes has attracted more and more attention [1-2]. High productivity can be achieved through research on industrial processes, and accurate operations are required to ensure optimal production efficiency; an appropriate operation model is therefore crucial. Generally, modeling methods can be divided into mechanism modeling methods and data-driven modeling methods [3-4]. Using mechanism modeling methods, the mathematical relationship between variables can be established by analyzing and interpreting the physical and chemical mechanisms of processes [5]. Glarborg et al. [6] proposed a mechanism model for the gaseous sulfation of alkali hydroxide and alkali chloride; the method relies on a detailed chemical kinetic model for the high-temperature gas-phase interaction between alkali metals, the O/H radical pool, and chlorine/sulfur species. Frenklach et al. [7] proposed a detailed chemical reaction model for the growth of polycyclic aromatic hydrocarbons and soot particle nucleation and growth; the established model could calculate the optical properties of an arbitrary ensemble of soot particles. Jiang et al. [8] proposed a new numerical model describing the micro-interactions between grains and workpiece material in the grinding contact zone to predict the roughness of ground surfaces accurately. However, as the complexity of industrial processes increases, it becomes almost impossible to build a precise mechanism model. Optimal operating strategies are often adjusted manually by experienced operators, and the optimization of industrial processes can hardly be implemented without an accurate model or advice from such operators. Therefore, an accurate optimal operation model is indispensable.

Meanwhile, with the application of various instruments and distributed control systems, industrial data have grown explosively and become increasingly accessible. Industrial process data reflect operation knowledge of the industrial process, which means the data contain abundant operating information. Therefore, many researchers have adopted data-driven modeling methods [9-11], which are dedicated to mining useful information from input and output data. The mathematical relationship between independent variables and dependent variables can be established using data-driven models [12]. Data-driven modeling methods need less process knowledge than mechanism modeling methods, depending mainly on the collected process data; as a result, data-driven modeling has seen increasing popularity compared with traditional mechanism modeling. Depending on whether the object is non-linear, data-driven modeling methods can be divided into the following categories [13]: linear regression methods, fuzzy modeling methods, artificial neural network (NN) methods, etc. The NN is one of the most widely used modeling methods; it represents the mathematical model as a discrete parallel information process and can solve complex nonlinear problems effectively. Since industrial process data are highly non-linear, NNs have been successfully applied to the modeling of complex processes by various researchers [14]. Lee et al. [15] built a hybrid neural network model of a full-scale industrial wastewater treatment process. He et al. [16] proposed a hybrid robust model based on an improved functional link neural network integrated with partial least squares, and the proposed model was successfully applied to predicting key process variables. Ling et al. [17] improved a hybrid particle swarm optimized wavelet neural network for modeling the development of fluid dispensing for electronic packaging.

The modeling problem of industrial processes can be solved by these NN methods: even without an accurate mechanism model, complex industrial process data can be used effectively for modeling. NN data-driven methods have also been applied successfully to optimal operation modeling for optimization purposes [18]. Cui et al. [19] proposed operational-pattern optimization in blast furnace pulverized coal injection based on a neural network prediction model. Rangwala et al. [20] used the computing abilities of neural networks to learn and optimize machine operations. Ochoa-Estopier et al. [21] proposed a new methodology for optimizing heat-integrated crude oil distillation systems.

However, a serious problem arises when industrial process objects are modeled using traditional NNs: the neurons are fully connected. When the neural network has several hidden layers, the fully connected neurons significantly increase the computational burden. As the time cost grows sharply with the amount of computation, NN methods are not suitable for overly complex data modeling. The convolutional neural network (CNN) is a more recently developed NN [22]. Sparse sampling and shared weighting are used to reduce the amount of computation dramatically, and this structure provides a solution to the computational burden problem.

The CNN is a feedforward neural network. With the advancement of hardware, especially the use of GPUs, deep learning has attracted more and more attention in academic research and industrial application [23-24], and for modeling problems in different fields, more and more scholars use deep learning to solve practical problems [25-26]. Deep learning neural networks can extract features from complex high-dimensional data. Four basic deep neural networks are currently available: deep belief networks (DBNs) [27-28], stacked auto-encoders (SAEs) [29-30], recurrent neural networks (RNNs) [31-32] and convolutional neural networks (CNNs) [33-34]. The CNN does not need manually selected features; it only trains weights to process high-dimensional data, thus achieving good training results, and it has been widely used in image processing and feature extraction. Chen et al. [35] proposed a regularized deep feature extraction method to classify hyperspectral images using a convolutional neural network. Wiatowski et al. [36] proposed a mathematical theory of deep convolutional neural networks for feature extraction. Acharya et al. [37] proposed a computer-aided diagnosis system with the CNN method. Data with grid-topology features can be processed well with a CNN, and time-series data can be regarded as one-dimensional grid data with fixed time intervals, so CNNs can also be used to analyze time series. The optimal operation modeling problem is a complex multidimensional nonlinear data modeling problem with time-series characteristics. Considering the advantages mentioned above, the CNN is a good candidate for the optimal operation modeling of industrial processes with time series in this paper.

However, the CNN structure cannot be used to process industrial process data directly. To solve this problem, an adaptive mode convolutional neural network modeling framework based on bar-shaped structures (BS-AMCNN) is proposed in this paper, in which the structure of the traditional CNN is modified. The inputs of the traditional CNN are usually a square image of n × n size; image features are extracted by a translation sliding window in sequence, and the pooling window is also a translation sliding window. The proposed structure in the BS-AMCNN breaks the input square image into long bar-shaped data, and the sliding convolution windows and pooling windows are both replaced by long bar-shaped, translation-only sliding windows. Moreover, the operating conditions of a complex industrial process change all the time. If only one simple, single algorithm is used during the data analysis, some important features hidden in the historical data cannot be extracted, and some unnecessary computation costs extra calculation time. In our proposed BS-AMCNN method, the single network structure is therefore replaced by three alternative network structures: reusing the previous modeling result as a standard; using an ordinary convolutional neural network structure; and increasing the number of filters and the network depth to perform deep-level extraction of local features. The range of the input data selected each time, the length of the convolution windows and the length of the pooling windows are determined to minimize the final error using the trial-and-error method. Therefore, using the proposed BS-AMCNN model, optimal operational strategies can be learned from industrial process data and predicted efficiently. To validate the performance of the proposed BS-AMCNN, optimal operation modeling simulations on the methanol production process are carried out. The simulation results show that the proposed BS-AMCNN can achieve high accuracy in operation modeling; compared with other methods, the BS-AMCNN method is relatively advanced in operation modeling.

The article consists of the following parts: Section 2 describes the basic CNN structure; Section 3 introduces the steps of the proposed BS-AMCNN model in detail and illustrates the intelligent operation framework; Section 4 gives a case study of the methanol production process using the proposed BS-AMCNN model and proves the effectiveness of the proposed method; Section 5 summarizes this paper.

2. Traditional CNN structure

The CNN is a relatively new type of deep learning method. LeCun [38] proposed the original CNN structure to deal with handwritten digit recognition problems. The learning performance of the system is improved by the three core points of CNN technology: sparse interaction, parameter sharing and pooling.

2.1 Sparse interactions

The traditional neural network is a fully connected structure, in which the connections between inputs and outputs are realized by matrix multiplication. The interaction between input units and output units is reflected by the parameters in the parameter matrix, each of which represents the relationship between a corresponding output unit and input unit. In contrast, the sparse connection of the CNN makes the size of the kernel much smaller than the size of the inputs, thus reducing the dimension of the system parameters and the computing complexity. Figure 1(a) describes the full connection between two neuron layers of the traditional neural network: each neuron unit is connected to every neuron unit in the next layer. Figure 1(b) describes the sparse connection structure in the CNN: each neuron unit is connected only to the adjacent part of the neuron units. As shown in Figure 2, although a sparsely connected network structure is adopted, most inputs can still contribute to the outputs after passing through a deep network, since deep units are indirectly connected to all or most inputs at the same time. Sparse connection offers a simpler network structure and lower computational complexity, and complex relationships among multiple variables can be extracted effectively through the partial connections between neurons.
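To make the computational saving of sparse interactions concrete, the following sketch counts connections in a fully connected layer versus a sparsely connected one. The layer sizes and kernel width are illustrative and not taken from the paper:

```python
# Parameter counts for a fully connected layer vs. a sparsely
# connected (convolution-style) layer over the same 1-D input.

def dense_params(n_in, n_out):
    """Every input unit connects to every output unit (biases ignored)."""
    return n_in * n_out

def sparse_params(kernel_size, n_out):
    """Each output unit sees only `kernel_size` neighbouring inputs;
    connections are counted here without weight sharing."""
    return kernel_size * n_out

n_in, n_out, k = 1000, 998, 3
print(dense_params(n_in, n_out))   # 998000 connections
print(sparse_params(k, n_out))     # 2994 connections
```

Even before parameter sharing, restricting each unit to a local neighbourhood cuts the connection count by more than two orders of magnitude in this example.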

Figure 1(a) Fully connected structure
Figure 1(b) Sparsely connected structure

Figure 2 The acceptable domain of the CNN architecture

2.2 Parameter sharing

Figure 3(a) Traditional neural network without parameter sharing
Figure 3(b) Neural network with parameter sharing

The weights of the convolution kernel are acquired in the learning process and do not change during the convolution process. Figure 3(a) describes the traditional neural network: the purple-red arrow represents the use of the intermediate elements of the weight matrix in the fully connected model. This model does not use parameter sharing, so each parameter is used only once. In Figure 3(b), the purple-red arrow indicates the use of the intermediate elements of the kernel in the convolution model; because of parameter sharing, this single parameter is used at all input locations. Parameter sharing reduces the network complexity and the number of parameters that the model needs to store in computation.
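Parameter sharing can be sketched as a 1-D convolution in which one small kernel is reused at every input position, so the stored parameter count is just the kernel length, however long the input grows. The kernel values below are illustrative:

```python
import numpy as np

kernel = np.array([0.2, 0.5, 0.3])   # three shared weights, reused everywhere

def conv1d_shared(x, w):
    """Valid 1-D convolution: the same w is applied at each position."""
    n = len(x) - len(w) + 1
    return np.array([np.dot(x[i:i + len(w)], w) for i in range(n)])

x = np.arange(10, dtype=float)
y = conv1d_shared(x, kernel)
# 8 outputs are produced, but only 3 parameters are stored.
print(len(y), kernel.size)
```

A fully connected layer mapping the same 10 inputs to 8 outputs would instead store 80 distinct weights.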

2.3 Pooling

Typical convolutional neural networks mainly consist of three parts: convolution, detection and pooling. Pooling replaces the network output at the current point with an overall statistic of the adjacent locations, thus greatly reducing the dimension of the parameters. The two most common pooling methods are average pooling and maximum pooling, shown in Figure 4. In Figure 4(a) and 4(b), x1, x2, x3, x4 stand for the items of the input matrix X, and y stands for the maximum value among x1, x2, x3, x4; the transformation matrix of the pooling processing is represented by W. The input dimension is clearly reduced by the pooling layers. Therefore, pooling structures can greatly reduce the computational burden of the network and improve the efficiency of network operation.

Figure 4(a) Maximum pooling
Figure 4(b) Average pooling
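The two pooling operations of Figure 4 can be sketched as follows; the 4 × 4 input and the 2 × 2 window are illustrative:

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Max or average pooling over non-overlapping size x size windows."""
    h, w = x.shape[0] // size, x.shape[1] // size
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            win = x[i * size:(i + 1) * size, j * size:(j + 1) * size]
            out[i, j] = win.max() if mode == "max" else win.mean()
    return out

x = np.array([[1., 3., 2., 4.],
              [5., 7., 6., 8.],
              [4., 2., 3., 1.],
              [8., 6., 7., 5.]])
print(pool2d(x, mode="max"))   # each window keeps its maximum
print(pool2d(x, mode="avg"))   # each window keeps its mean
```

Either way, a 4 × 4 input shrinks to 2 × 2, which is exactly the dimension reduction the text describes.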

3. The proposed BS-AMCNN

3.1 The structure of the BS-CNN

Traditional CNN methods are usually used to process image information; however, industrial process data are always in digital format with time series, so traditional CNN methods cannot be applied directly to industrial process modeling. To solve this problem, the structure of the traditional CNN has been modified. In traditional CNN methods, the sizes of the input image data, the convolution windows and the pooling windows are n × n, t_c × t_c and t_p × t_p, respectively, with n > t_c and n > t_p. In an industrial process, the amount of data is much larger than the number of variables. The number of associated variables in the industrial process data can be regarded as the width of the input data of the proposed BS-CNN structure, and the amount of industrial process data can be regarded as the total length of the input data. The pixel points of an image are independent of each other; industrial process variables, however, are related to each other. When using a CNN to process industrial process data, the convolution windows need to cover all the associated variables. As shown in Figure 5(a), in this paper the width of the convolution windows in the proposed BS-CNN structure equals the number of associated variables. In the pooling structure, the characteristics of the required variables must not be removed by the pooling process, so the associated variables still need to be retained; as shown in Figure 5(b), the width of the pooling windows is also consistent with the number of associated variables. Considering the information in industrial process data, the pooled values are averaged for each variable in the pooling windows to make the results more accurate. The lengths of the convolution windows and pooling windows, l_c and l_p, are obtained by the trial-and-error method. In Figure 5(a) and Figure 5(b), t_c and t_p represent the sizes of the traditional convolution window and the traditional pooling window, respectively.

Figure 5(a) The proposed convolution window
Figure 5(b) The proposed pooling window
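As a rough sketch of the bar-shaped convolution described above: the window spans all s associated process variables (the full data width) and slides only along the time axis. The array sizes, window length and the helper name `bar_conv` are illustrative assumptions, not code from the paper:

```python
import numpy as np

def bar_conv(X, W):
    """X: (m, s) process data; W: (l_c, s) bar-shaped kernel.
    Returns a 1-D feature sequence of length m - l_c + 1."""
    m, s = X.shape
    l_c = W.shape[0]
    assert W.shape[1] == s, "bar window must cover every associated variable"
    return np.array([np.sum(X[t:t + l_c] * W) for t in range(m - l_c + 1)])

X = np.random.default_rng(0).normal(size=(100, 7))  # 100 samples, 7 variables
W = np.ones((4, 7)) / 28.0                          # length-4 bar window
features = bar_conv(X, W)
print(features.shape)   # (97,)
```

Because the kernel width equals the data width, no variable is ever excluded from a window; only the time position changes, which is the translation-only behaviour the text requires.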

The structure of the BS-CNN is shown in Figure 6. Assume that the initial samples X have m rows and s columns, where s represents the number of input variables and m stands for the number of initial samples. With the aid of the proposed panning bar-shaped convolution window, the first convolution layer C1 with output data X1 can be obtained. The width of the proposed bar-shaped convolution window is equal to the number of input attributes s, and the length of the convolution layer follows from the convolution computation. The convolution kernel numbers stand for different shared weight numbers, and the features are extracted from X. Then X1 is put into the first pooling layer P1, whose output is obtained after dimensionality reduction according to the proposed bar-shaped pooling window; the width of the proposed bar-shaped pooling window is also equal to the number of input attributes s, and the length of the pooling layer follows from the pooling computation. After the calculation of several convolution layers (C1, C2, ...) and pooling layers (P1, P2, ...), flat data can be obtained from the outputs of the last pooling layer. Then, two fully connected layers F1 and F2 are added into the proposed BS-CNN structure, with respective outputs Z1 and Z2. Finally, the final outputs Y of the proposed BS-CNN are calculated from these upper outputs.


Figure 6 The BS-CNN architecture
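A minimal sketch of the shape bookkeeping implied by Figure 6, assuming valid convolutions and non-overlapping pooling along the time axis: each bar-shaped convolution shortens the time axis by its window length minus one, each pooling stage divides it by the pooling length, and the last pooling output is flattened into the fully connected layers. The stage sizes reuse the window lengths found later in the case study (l_c = 4, l_p = 3), while the input range of 40 is likewise taken from there:

```python
def bs_cnn_length(m, layers):
    """Propagate the time-axis length m through ('conv', l_c) and
    ('pool', l_p) stages of the bar-shaped pipeline."""
    for kind, size in layers:
        if kind == "conv":
            m = m - size + 1   # valid convolution along time
        else:
            m = m // size      # non-overlapping pooling
    return m

m = 40                                # range of input data per modeling step
stages = [("conv", 4), ("pool", 3), ("conv", 4), ("pool", 3)]
flat_len = bs_cnn_length(m, stages)   # per-filter length fed to dense layers
print(flat_len)
```

With these assumed stages the 40-step input shrinks to 3 time positions per filter before flattening; the actual layer counts in the paper are set per mode, as described in Section 3.2.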

3.2 The structure of the proposed BS-AMCNN

The BS-CNN structure solves the problem that the CNN cannot be applied directly to industrial process data, and it considers all the input variables at the same time. However, the operation statuses in the production process change all the time. When the steady-state changes are not obvious, the modeling speed drops if all the data are used to build the operation model; when the data change drastically, a fixed network structure cannot extract deep features, which causes underfitting on that segment. To overcome this weakness, we propose a new adaptive mode CNN based on bar-shaped structures (BS-AMCNN). When modeling the industrial process data, the network structures are divided into three different modes for different steady-state data conditions. All boundary values are obtained by combining process knowledge with trial-and-error algorithms. The three modes are as follows.

Inheritance mode

If the steady state fluctuates within a small range, we can regard the industrial data as having the same characteristics as the previous data. There is then no need to use data with the same features to re-establish the operation model: the previously trained model can represent a steady-state process that is essentially unchanged. Equation (1) shows the boundary:

N_{\mathrm{inh}} = N_{\mathrm{previous}}, \quad \text{if } \left| \frac{x_t - x_{t-1}}{x_{t-1}} \right| \le a    (1)

where N_{\mathrm{inh}} stands for the network structure selected under the inheritance mode, N_{\mathrm{previous}} stands for the previous network structure, x_{t-1} represents the previous value of the input variable, x_t represents the current value of the input variable, and a is the bound value of the mode, which usually ranges between 0% and 5%.

Normal mode

If the steady state fluctuates between the boundary of the inheritance mode and the boundary of the enhanced mode, we keep the normal structure of the proposed BS-AMCNN. The normal mode is usually the most frequently used mode; we continue to train the operation model with the proposed BS-AMCNN method using the default number of layers and filters for a CNN, and the mode can be described as follows:

N_{\mathrm{nor}} = N_{\mathrm{normal\text{-}CNN}}, \quad \text{if } a < \left| \frac{x_t - x_{t-1}}{x_{t-1}} \right| \le b    (2)

where N_{\mathrm{nor}} represents the network selected under the normal mode, N_{\mathrm{normal\text{-}CNN}} represents the normal BS-AMCNN structure, in which the numbers of filters and layers are set as in the normal CNN method, and b is the bound value of the enhanced mode, which usually ranges between 5% and 10%.

Enhanced mode

When the steady-state fluctuation is large enough, the normal BS-AMCNN structure cannot extract the deeper features hidden in the process data. Without changing the form of the network, we can extract these deeper features only by increasing the number of layers of the network and the number of filters; the final numbers of layers and filters are obtained by the trial-and-error method. The enhanced mode can be described as follows:

N_{\mathrm{enh}} = N^{*}_{\mathrm{normal\text{-}CNN}}, \quad \text{if } b < \left| \frac{x_t - x_{t-1}}{x_{t-1}} \right| \le c    (3)

where N_{\mathrm{enh}} represents the network selected under the enhanced mode, N^{*}_{\mathrm{normal\text{-}CNN}} represents the enhanced BS-AMCNN structure, in which the depth of the network and the number of filters are adjusted to the optimal state, and c is the upper bound of the enhanced mode, which usually ranges from 10% to 15%. When the data change by more than 15%, we treat the process data as non-steady-state.
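The three-mode switching of Equations (1)-(3) can be sketched as follows; the bounds use the indicative ranges quoted in the text (5%, 10%, 15%), while real values would come from process knowledge plus trial and error:

```python
def select_mode(x_t, x_prev, a=0.05, b=0.10, c=0.15):
    """Pick the BS-AMCNN mode from the relative steady-state change."""
    r = abs((x_t - x_prev) / x_prev)
    if r <= a:
        return "inheritance"   # reuse the previously trained model
    if r <= b:
        return "normal"        # default layers and filters
    if r <= c:
        return "enhanced"      # deeper network, more filters
    return "non-steady-state"  # change above c: treated separately

print(select_mode(102.0, 100.0))  # 2% change  -> inheritance
print(select_mode(108.0, 100.0))  # 8% change  -> normal
print(select_mode(113.0, 100.0))  # 13% change -> enhanced
print(select_mode(120.0, 100.0))  # 20% change -> non-steady-state
```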

3.3 The parameter assignment of the proposed BS-AMCNN

Two computing functions, the activation function and the cost function, are involved in the proposed BS-AMCNN structure. Nonlinearity is introduced by the activation function to deal with nonlinear data. The cost function often affects the experimental results, and a small cost function value indicates a desirable result.

3.3.1 Activation functions

The ReLU function has the advantages of simple calculation, fast convergence and non-saturation, and it is a commonly used activation function. In the proposed BS-AMCNN architecture, the ReLU function is used instead of the traditional sigmoid function, and it is defined as follows:

\mathrm{ReLU}(z) = \begin{cases} z, & z > 0 \\ 0, & z \le 0 \end{cases}    (4)

where z represents the value before the activation function. ReLU is a special activation function that alters negative values to zero while positive values remain unchanged. The unilateral inhibition of ReLU creates conditions for the sparse connection of the CNN, and the main features of complex objects can also be obtained through the sparse property of the ReLU function.
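Equation (4) can be sketched directly; the sample inputs are illustrative:

```python
import numpy as np

def relu(z):
    """ReLU of Eq. (4): negatives become zero, positives pass through."""
    return np.maximum(z, 0.0)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(z))                    # negatives and zero clipped to 0
sparsity = np.mean(relu(z) == 0)  # fraction of inactive units
print(sparsity)
```

Here three of the five activations are zero, which illustrates the sparsifying (unilateral inhibition) property mentioned in the text.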

3.3.2 Cost functions

In a complex neural network structure, the value of the cost function is usually used to evaluate the fit between the predicted value and the target value. For a single sample (x^{(i)}, y^{(i)}), the cost function can be defined as follows:

J(W, b; x^{(i)}, y^{(i)}) = \frac{1}{2} \left\| h_{W,b}(x^{(i)}) - y^{(i)} \right\|^2    (5)

where (x^{(1)}, y^{(1)}), \dots, (x^{(m)}, y^{(m)}) represent the m samples. Equation (6) describes the whole cost function:

J(W, b) = \frac{1}{m} \sum_{i=1}^{m} J\left(W, b; x^{(i)}, y^{(i)}\right) + \frac{\lambda}{2} \sum_{l} \sum_{i} \sum_{j} \left(W_{ji}^{(l)}\right)^2    (6)

where W_{ji}^{(l)} is the weight relating unit i in layer l to unit j in layer l+1, \lambda represents the weight decay parameter, and J(W, b) represents the mean of the cost function as a whole. Combining this with the principle of gradient descent, we obtain the update formulas of W_{ji}^{(l)} and b_{i}^{(l)} in the direction of the negative gradient, as shown in Equations (7) and (8):

W_{ji}^{(l)} = W_{ji}^{(l)} - \alpha \frac{\partial J(W, b)}{\partial W_{ji}^{(l)}}    (7)

b_{i}^{(l)} = b_{i}^{(l)} - \alpha \frac{\partial J(W, b)}{\partial b_{i}^{(l)}}    (8)

where \alpha represents the learning rate. Since W_{ji}^{(l)} and b_{i}^{(l)} are layered, iterative computation can complete the whole algorithm as long as the partial derivatives of W_{ji}^{(l)} and b_{i}^{(l)} can be obtained.
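A minimal sketch of the updates in Equations (7)-(8) applied to the weight-decay cost of Equation (6), using a single linear unit h(x) = w*x + b and synthetic data; the data, learning rate and decay value are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = 3.0 * x + rng.normal(scale=0.1, size=50)   # true slope 3.0 plus noise

w, b = 0.0, 0.0
alpha, lam = 0.1, 1e-3   # learning rate and weight-decay parameter

for _ in range(200):
    err = (w * x + b) - y
    grad_w = np.mean(err * x) + lam * w   # dJ/dw including the decay term
    grad_b = np.mean(err)                 # the bias is not decayed
    w -= alpha * grad_w                   # Eq. (7)
    b -= alpha * grad_b                   # Eq. (8)

print(round(w, 2))   # converges close to the true slope 3.0
```

The decay term slightly shrinks w toward zero, which is the regularizing effect of the second sum in Equation (6).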

In summary, thanks to the proposed bar-shaped input data, bar-shaped convolution windows and bar-shaped pooling windows of the BS-AMCNN, the industrial process data associated with optimal operating conditions can be extracted, and with the proposed adaptive mode in the BS-AMCNN structure, the optimal operating strategy of the industrial process can be modeled. Each layer of the proposed BS-AMCNN structure can adjust itself for the final task, so effective communication between layers is achieved. The application procedure of the proposed BS-AMCNN consists of two phases: the off-line training phase and the on-line prediction phase. In the off-line training phase, convolution and pooling processing are carried out repeatedly, and the outputs of the final pooling layer are then put into the fully connected layers to calculate the outputs of the BS-AMCNN; the optimal BS-AMCNN model is confirmed by changing the lengths of the convolution and pooling windows using the trial-and-error method. During the on-line prediction phase, real-time process variables are put into the well-trained BS-AMCNN to predict the optimal operation strategies. Based on the descriptions above, the block diagram of the proposed BS-AMCNN method is graphically illustrated in Figure 7.

Figure 7. The application procedure of the proposed BS-AMCNN

4. Case Study

With the development of modern industry, the consumption of non-renewable energy sources such as petroleum is increasing, and new alternative renewable energy sources must be found. Methanol is an important industrial product and a renewable, green and clean energy source, which can alleviate the problem of oil shortage to a large extent. Due to its complex engineering background, the methanol production process involves many random, fuzzy, uncertain and uncontrollable factors. Improper operation settings tend to reduce product quality and increase material consumption; therefore, it is necessary to monitor the state of complex chemical processes, evaluate the performance of manual operations and adjust operations in a timely and accurate manner. We use the proposed BS-AMCNN method to establish the operation model in this paper, which will help reduce material consumption and improve production efficiency. Figure 8 shows the flowchart of the methanol production process.


Figure 8 The flow chart of the methanol production process

Several important variables are related to the production; they are shown in Table 1.

Table 1 The operation variables most relevant to the target variables

Device Number | Variable Name | Unit
FIC3103 | The airflow of the factory | Nm3/h
FIC3106 | Hydrogen flow | Nm3/h
FIC3503 | Crude methanol flow | t/h
PIC3302 | Outlet pressure of the methanol separator | MPa
PIC3701B | Outlet pressure of the purge gas | MPa
TIC3401 | Outlet gas temperature of the first synthesis tower | ℃
TIC3402 | Outlet gas temperature of the second synthesis tower | ℃

There are many steps in the design of the methanol production process. The main ones are as follows: first, the raw material gas is compressed and fed into the preheater; after reaching the optimal reaction temperature, it is sent to the reactor for synthesis under the action of the catalyst. The obtained products are first cooled by a cooler and then sent to a separator after reaching a certain temperature. Crude methanol is separated in the separator, and the raw material gas that has not fully reacted re-enters the cycle to continue the reaction. The main reactions involved are as follows:

 + 2f ijjjjk f f g6gWh-6

343

 + 3f ijjjjk f f + f 

344

 + f ijjjjk  + f 

345 346 347 348 349

g6gWh-6

(9) (10) (11)

All the above reactions are exothermic. FIC3503, the crude methanol flow, is regarded as the output of the methanol production process in this paper. There are 20,000 samples of each variable: 12,000 samples are selected to train the BS-AMCNN model, and the remaining 8,000 samples are used to test the well-trained model. Figure 9 shows the historical data of the selected samples. The program runs on a computer with an Intel(R) Core(TM) i7-7700HQ CPU and an NVIDIA GeForce GTX 1060 GPU.
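The chronological split described above can be sketched as follows. The random arrays standing in for the plant records (7 input variables from Table 1 plus the FIC3503 target) are assumptions for illustration, not the authors' data or code:

```python
import numpy as np

# Stand-in for the 20,000 recorded samples: 7 operation variables from
# Table 1 as inputs and the FIC3503 crude methanol flow as the target.
rng = np.random.default_rng(0)
inputs = rng.normal(size=(20000, 7))
target = rng.normal(size=20000)

# Chronological split for time-series data: the first 12,000 samples
# train the model, the remaining 8,000 test it (no shuffling).
X_train, X_test = inputs[:12000], inputs[12000:]
y_train, y_test = target[:12000], target[12000:]

print(X_train.shape, X_test.shape)  # (12000, 7) (8000, 7)
```

Keeping the split chronological, rather than shuffling, preserves the temporal structure that the bar-shaped windows later exploit.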


Figure 9 The historical data of the selected samples

First, the range of the input data selected each time is obtained by the trial and error method. The error for different range values is shown in Figure 10. When the range of the selected input data is 40, the final valve opening degree reaches the minimum error of 7.482 × 10⁻⁶. The lengths of the proposed bar-shaped convolution windows are determined in the same way; the error for different length values is shown in Figure 11. When the bar-shaped convolution window size is 4, the operation model reaches the minimum error of 7.397 × 10⁻⁶. The lengths of the proposed bar-shaped pooling windows are likewise determined by trial and error (Figure 11): when the size of the bar-shaped pooling window is 3, the operation model reaches the minimum error of 5.459 × 10⁻⁶. These selected parameters are used in the proposed BS-AMCNN method to obtain its optimal result.
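The trial-and-error selection of the three sizes amounts to a small grid search. The sketch below is illustrative only: `validation_error` is a toy surrogate whose minimum is placed at the values reported above (data range 40, convolution window 4, pooling window 3), standing in for actually training and validating the network at each candidate setting:

```python
import itertools

def validation_error(data_range, conv_len, pool_len):
    # Toy surrogate for "train the model with these sizes and return its
    # validation error"; the minimum sits at (40, 4, 3) to mirror the
    # values found by trial and error in the paper.
    return (1e-6 * ((data_range - 40) ** 2
                    + (conv_len - 4) ** 2
                    + (pool_len - 3) ** 2)
            + 5.459e-6)

# Exhaustive trial and error over candidate sizes.
candidates = itertools.product(range(10, 101, 10),  # input data range
                               range(2, 9),         # convolution window
                               range(2, 7))         # pooling window
best = min(candidates, key=lambda sizes: validation_error(*sizes))
print(best)  # (40, 4, 3)
```

In practice each call to the real error routine requires a full training run, so the candidate grids are kept coarse.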


Figure 10 Output errors with different ranges of the selected data

Figure 11 Output errors with different sizes of the pooling and convolution windows

After the parameters are determined, the training set is fed into the proposed BS-AMCNN algorithm to verify the effectiveness of the method. Figure 12 shows the modeling results and Figure 13 shows the errors of the built model. It is clear that the proposed BS-AMCNN helps to build a good operation model, which can be used in the methanol production process to predict the crude methanol production.


Figure 12 The training results of the crude methanol production

Figure 13 The training errors of the methanol production

Next, we use the test set to verify the established model; the testing results meet our requirements. Figure 14 shows the test results of the crude methanol production, and Figure 15 shows the corresponding errors. Table 2 summarizes the training and testing results of the proposed BS-AMCNN method.


Figure 14 The test results of the crude methanol production

Figure 15 The test errors of the crude methanol production

Table 2 The training and testing results of the proposed BS-AMCNN method

                              Training errors   Testing errors
Crude methanol production     −4.26 × 10⁻²      −5.95 × 10⁻²

In order to validate the performance of the proposed BS-AMCNN method, we further compare it with two other methods, the traditional CNN method and the Back Propagation (BP) neural network method. The experimental results of the three operation models are shown in Figure 16, and the errors of the three models are shown in Figure 17.
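Such a comparison can be run with a few lines of numpy. The predictions below are synthetic, with invented noise scales; they illustrate the evaluation procedure only and do not reproduce the paper's figures:

```python
import numpy as np

# Synthetic hold-out target and model predictions; the noise scales are
# invented for illustration (smaller scale = better model).
rng = np.random.default_rng(1)
y_true = rng.normal(size=8000)
preds = {
    "BP":       y_true + rng.normal(scale=0.20, size=8000),
    "CNN":      y_true + rng.normal(scale=0.15, size=8000),
    "BS-AMCNN": y_true + rng.normal(scale=0.05, size=8000),
}

# Root-mean-square error of each model on the test set.
rmse = {name: float(np.sqrt(np.mean((p - y_true) ** 2)))
        for name, p in preds.items()}
best = min(rmse, key=rmse.get)
print(best)  # BS-AMCNN (the model with the smallest noise scale)
```

Using a squared-error metric avoids the sign cancellation that mean signed errors (as reported in Tables 2 and 3) can hide.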


Figure 16 The comparison of the three different operation modeling methods

Figure 17 The errors of the three different operation modeling methods

Table 3 The errors for the crude methanol production with different methods

Methods   BP             The traditional CNN   The proposed BS-AMCNN
Errors    −3.42 × 10⁻²   −2.65 × 10⁻²          −5.95 × 10⁻²

To compare the results more precisely, the errors are listed in Table 3. The prediction errors for the crude methanol production with the BP method, the traditional CNN method and the proposed BS-AMCNN method are −3.42 × 10⁻², −2.65 × 10⁻² and −5.95 × 10⁻², respectively. It is demonstrated that the proposed BS-AMCNN method achieves the most accurate operation modeling results with the help of the bar-shaped structures and the adaptive network mode. This kind of long, translation-only sliding bar window suits time-series industrial process data well.
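The bar-shaped windows discussed above can be illustrated with a minimal numpy sketch. For simplicity a single time series is assumed: `bar_conv` slides a length-4 window along the time axis only (no movement across variables), and `bar_pool` applies non-overlapping length-3 max pooling, mirroring the window sizes selected earlier. This is an illustration of the idea, not the authors' implementation:

```python
import numpy as np

def bar_conv(x, kernel):
    # Bar-shaped convolution: a length-k window sliding along time only.
    k = len(kernel)
    return np.array([x[t:t + k] @ kernel for t in range(len(x) - k + 1)])

def bar_pool(x, size):
    # Bar-shaped max pooling: non-overlapping windows of the given size.
    n = len(x) // size
    return x[:n * size].reshape(n, size).max(axis=1)

x = np.arange(12, dtype=float)        # toy time series 0, 1, ..., 11
feat = bar_conv(x, np.ones(4) / 4)    # length-4 bar window (moving mean)
out = bar_pool(feat, 3)               # length-3 bar pooling

print(feat)  # [1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5 9.5]
print(out)   # [3.5 6.5 9.5]
```

Because the window translates along time only, the temporal ordering of the samples is preserved, which is exactly what square image-style windows would destroy on process data.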

5. Conclusions

A novel intelligent modeling framework combining an adaptive mode convolutional neural network with bar-shaped structures (BS-AMCNN) is proposed to deal with optimal operation modeling problems in industrial processes. With the aid of the proposed bar-shaped structures, the CNN algorithm can be successfully applied to complex industrial time-series data, while the adaptive mode allows the proposed BS-AMCNN method to cope with time-varying working environments. To prove the effectiveness of the proposed BS-AMCNN, a real methanol production process is taken as the experimental object, and the proposed structure is applied to an optimal operation modeling problem. The BP method and the traditional CNN method are further used to benchmark the performance of the proposed BS-AMCNN method. The results show that the proposed BS-AMCNN method achieves the highest prediction accuracy. Therefore, the proposed BS-AMCNN method can serve as an effective operation modeling tool for industrial processes.




Highlights

• An adaptive mode CNN based on bar-shaped structures (BS-AMCNN) is proposed.
• The convolution and pooling windows are transformed into bar-shaped structures.
• The adaptive mode in BS-AMCNN can suit changeable working conditions.
• Optimal operations can be effectively extracted using the proposed BS-AMCNN method.
• The proposed method outperforms standard models on the methanol production process.

Declaration of interests

☒ The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.