Aerospace Science and Technology (2020), https://doi.org/10.1016/j.ast.2020.105775

A modeling method for aero-engine by combining stochastic gradient descent with support vector regression

Li-Hua Ren, Zhi-Feng Ye, Yong-Ping Zhao*

College of Energy and Power Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China

* Corresponding author. E-mail address: [email protected] (Y.-P. Zhao).

Article history: Received 9 September 2019; Received in revised form 10 December 2019; Accepted 8 February 2020. Communicated by Xinqian Zheng.

Keywords: Modeling; Aero-engine; Machine learning; Stochastic gradient descent; Support vector regression

Abstract: The aero-engine aerodynamic model is widely applied to identify the aerodynamic parameters of components, such as compressor pressure and turbine temperature. A data-driven modeling method for the aero-engine aerodynamic model that combines stochastic gradient descent with support vector regression (SGDSVR) is proposed. A novel support vector regression (SVR) training mechanism that combines batch learning with online learning is presented according to the demands and characteristics of the aero-engine aerodynamic model. In this mechanism, batch learning builds the initial model and online learning modifies the online model based on the initial model. An improved sequential minimal optimization (SMO) algorithm is introduced in the phase of building the initial model, and the SGDSVR algorithm is proposed for the phase of modifying the model online. The simulation data of an aero-engine component-level model and the flight data of a certain aircraft are used to test the modeling method, and the proposed method shows better performance than traditional methods. © 2020 Elsevier Masson SAS. All rights reserved.
1. Introduction
The aero-engine is the power source of an airplane, so its safety and reliability are critically important; this need gave rise to engine health management (EHM) [1]. The health monitoring system is a critical part of an EHM system [2]: it provides real-time information on whether the aero-engine is healthy or not. With the rapid development of modern aircraft, health monitoring systems are widely embedded in aero-engines to guarantee safety, and health monitoring techniques are also applied to engines during the development and test processes [3]. One monitoring approach is to build an identification model using machine learning algorithms such as support vector regression or neural networks. These identification models are trained on healthy-state data and predict the values of the health state variables; if the sensed state variables deviate from the predicted ones, the engine can be considered abnormal or unhealthy [4,5].

Generally, the machine learning algorithms for identification models are implemented by batch learning. However, conventional batch learning methods are inefficient for building an aero-engine model, because the engine degrades with time [6]. The training data grow with the number of flights, and the identification model would need to be rebuilt after each flight. Consequently, online learning is an ideal approach to these problems: it can constantly improve the model and adapt to the degradation of the aero-engine as flights accumulate. Online learning is a kind of training method rather than a new algorithm, and in principle any machine learning model can be trained online. For neural networks, online learning can be a simple stochastic gradient descent step whenever a new sample arrives. For SVR, however, online learning is not so simple, because the solution idea of SVR is to find the support vectors among all the samples; the SVR training method therefore cannot be changed from batch learning to online learning easily. Although both online neural networks and online SVR are viable for building an aero-engine online model, a neural network might become trapped in a local optimum whereas SVR will not. In this article, SVR is therefore applied to build the aero-engine model.

Online SVR algorithms have been actively researched in recent years. Accurate online support vector regression (AOSVR) was proposed by Ma [7]; it is a regression version of the incremental and decremental support vector machine (SVM) proposed by Cauwenberghs [8]. In this algorithm, the samples are divided into an error support vector set, a margin support vector set, and a remaining sample set. When a new sample arrives, the basic idea is to change the Lagrange coefficient corresponding to the new sample in a finite number of discrete steps until it meets the Karush-Kuhn-Tucker (KKT) conditions; the new sample is finally added
into one of the three sets, while ensuring that the existing samples in the training set continue to satisfy the KKT conditions at each step. Based on the work of Cauwenberghs and Ma, Parrella implemented a stabilization procedure to avoid instability errors caused by the enormous number of floating-point operations [9]. The basic idea is to apply the same principles used in the normal algorithm and extend them with new possible moves, permitted only when a sample exits its set. He also generously provides a Matlab prototype, a C++ implementation, and a Windows interface on his website for interested researchers. Laskov developed a new storage design and organization of computation for incremental SVM learning to improve computational efficiency; the idea is to judiciously use row-major and column-major storage of matrices instead of one-dimensional arrays, to eliminate selection operations where possible [10].

However, there are two major problems if these online learning algorithms are applied directly to the aero-engine health monitoring model. One is that they must retain the learned samples, so the computational complexity grows with the number of samples, which degrades the real-time performance of aero-engine monitoring. The other is that their learning process starts from scratch, so a long online learning period is needed before the model meets the health monitoring demand. In fact, there is always some historical data available for training an initial model, and modifying the model starting from that initial model can speed up the modeling process.

According to the demands and characteristics of aero-engine health monitoring, a novel modeling method based on SGDSVR is proposed. The proposed algorithm combines batch learning with online learning: batch learning builds the initial model, and online learning modifies the online model based on the initial model. In summary, this algorithm consists of three key components, as follows.
1) The initial identification model is trained with SMO, an optimal batch learning algorithm for SVR; the SMO used in this paper is the improved version proposed by Fan [11]. In the improved SMO algorithm, the working sets are selected according to second-order information, which achieves faster convergence than traditional SMO.

2) The online model is modified from the initial SVR model with the SGD algorithm. When a new sample arrives, the support vectors and the bias are updated according to the error between the actual and predicted values of the current sample. The modification only involves the current sample, and the learned samples do not need to be retained, so the computational efficiency improves markedly. Online learning the SVR model with the SGD algorithm is our original idea for the regression function of SVR, and it resembles training a radial basis function (RBF) neural network to some extent. In a sense, the modified model is no longer an SVR model but more like an RBF neural network, because the support vectors no longer satisfy the KKT conditions.

3) A modifying switch is introduced so that the online learning process can be stopped and started by the operator. There are three main reasons for the switch: to avoid modifying the model when the aero-engine is abnormal, which would contaminate the model; to avoid modifying the model when the aircraft is waiting to take off or cruising for a long time, which might cause overfitting; and to reduce the computational burden when there is no need to modify the model.
This paper is structured as follows. In Section 2, the nonlinear autoregressive with exogenous inputs (NARX) model is introduced to build the aero-engine identification model, with SVR as the mapping function of NARX. In Section 3, the mathematical model of SVR is dwelled on, and an improved SMO algorithm is applied to train the SVR model as the initial model. In Section 4, the SGDSVR algorithm is proposed to modify the model online, starting from the initial SVR model built in Section 3. In Section 5, the process of building the initial model with the improved SMO algorithm and modifying it online with the SGDSVR algorithm is elaborated; the robustness of the proposed algorithms is then tested with noisy data, and the performance is compared with AOSVR. Finally, conclusions follow.
Notes: in the following sections, scalars are represented by lowercase letters, vectors by lowercase bold letters, and matrices by uppercase bold letters.
2. NARX model
The NARX model is a widely used nonlinear system identification model with excellent dynamic performance, and many successful cases prove that NARX is an effective way to build dynamic identification models [12,13]. The NARX model is usually combined with a neural network as its mapping function, but the mapping function can also be replaced with SVR. As can be seen in Fig. 1, there are two architectures of the NARX SVR model, the open-loop architecture and the closed-loop architecture, given by Eqs. (1) and (2), respectively [14]:
$$\begin{aligned}\bigl[\hat{y}_1(t), \hat{y}_2(t), \cdots, \hat{y}_n(t)\bigr] = f\bigl[\,&u_1(t), u_1(t-1), \cdots, u_1(t - T_{order\_u}),\\ &u_2(t), u_2(t-1), \cdots, u_2(t - T_{order\_u}),\\ &\cdots,\\ &u_m(t), u_m(t-1), \cdots, u_m(t - T_{order\_u}),\\ &y_1(t-1), y_1(t-2), \cdots, y_1(t - T_{order\_y}),\\ &y_2(t-1), y_2(t-2), \cdots, y_2(t - T_{order\_y}),\\ &\cdots,\\ &y_n(t-1), y_n(t-2), \cdots, y_n(t - T_{order\_y})\bigr]\end{aligned} \tag{1}$$

$$\begin{aligned}\bigl[\hat{y}_1(t), \hat{y}_2(t), \cdots, \hat{y}_n(t)\bigr] = f\bigl[\,&u_1(t), u_1(t-1), \cdots, u_1(t - T_{order\_u}),\\ &u_2(t), u_2(t-1), \cdots, u_2(t - T_{order\_u}),\\ &\cdots,\\ &u_m(t), u_m(t-1), \cdots, u_m(t - T_{order\_u}),\\ &\hat{y}_1(t-1), \hat{y}_1(t-2), \cdots, \hat{y}_1(t - T_{order\_y}),\\ &\hat{y}_2(t-1), \hat{y}_2(t-2), \cdots, \hat{y}_2(t - T_{order\_y}),\\ &\cdots,\\ &\hat{y}_n(t-1), \hat{y}_n(t-2), \cdots, \hat{y}_n(t - T_{order\_y})\bigr]\end{aligned} \tag{2}$$
where m and n are the numbers of input and output variables, respectively; u, y and ŷ denote the input, actual output and predicted output variables, respectively; T_order_u and T_order_y are the input and output orders, respectively; and f(·) is a nonlinear mapping function between inputs and outputs. In this article the map f(·) is an SVR function.

In the open-loop architecture, the present value ŷ(t) is predicted from the present value of u, past values of u, and past values of y. In the closed-loop architecture, the present value ŷ(t) is predicted from the present value of u, past values of u, and past values of ŷ. Generally, the open-loop architecture is used during the training phase, because using the true past output values makes the trained model more accurate. The closed-loop architecture is used during the testing and online predicting phases when the true past outputs are not available or multi-step-ahead prediction is required. Multi-step-ahead prediction means that future values of the output must be predicted: if the value ŷ(t + n) after n time steps is needed, the present predicted value ŷ(t) is calculated first, then ŷ(t) and u(t + 1) are taken as inputs to predict ŷ(t + 1), and so forth until ŷ(t + n) is obtained. In this recursive process the prediction error accumulates, because each output is predicted from previously predicted outputs. In the health monitoring system, the present and past true values can be obtained and there is no need for multi-step-ahead prediction; therefore, for its accuracy, the open-loop architecture is also used in the testing and online predicting phases in this article.

[Fig. 1. Architectures of NARX SVR model.]
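To make the regressor construction of Eqs. (1)-(2) concrete, the following is a minimal Python sketch, not the authors' C++ implementation; the function names, the array layout, and the callback `f` standing for the trained SVR map are our own illustrative assumptions.

```python
import numpy as np

def narx_row(u_hist, y_hist, t, order_u, order_y):
    """Open-loop regressor of Eq. (1) at time t.

    u_hist: (T, m) array of inputs, y_hist: (T, n) array of measured outputs.
    The row is channel-major: u_1(t)...u_1(t-T_order_u), u_2(t)..., then
    y_1(t-1)...y_1(t-T_order_y), y_2(t-1)... .
    """
    u_part = u_hist[t - order_u:t + 1][::-1].T.ravel()  # u(t) ... u(t - order_u)
    y_part = y_hist[t - order_y:t][::-1].T.ravel()      # y(t-1) ... y(t - order_y)
    return np.concatenate([u_part, y_part])

def closed_loop_predict(f, u_hist, y_seed, t0, steps, order_u, order_y):
    """Closed-loop multi-step-ahead prediction of Eq. (2).

    f maps a regressor row to the n predicted outputs; y_seed holds the last
    order_y measured outputs. Predictions are fed back, so errors accumulate.
    """
    y_hat = [np.asarray(v, dtype=float) for v in y_seed]
    for t in range(t0, t0 + steps):
        u_part = u_hist[t - order_u:t + 1][::-1].T.ravel()
        y_part = np.array(y_hat[-order_y:])[::-1].T.ravel()  # past *predicted* outputs
        y_hat.append(f(np.concatenate([u_part, y_part])))
    return np.array(y_hat[order_y:])
```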
3. SVR
3.1. Mathematical model

The NARX model is a kind of nonlinear model, and SVR can approximate any nonlinear function [15]. Consider the samples D = {(x_1, y_1), (x_2, y_2), ..., (x_l, y_l)}, x_i ∈ R^n, y_i ∈ R, i = 1, 2, ..., l, where x_i is an n-dimensional vector representing the input of a sample, y_i is a real number representing the output of a sample, and l is the number of samples. The SVR regression function is

$$\hat{y} = f(x) = w^{T}\phi(x) + b \tag{3}$$

where w is the weight vector of the regression function, b is the bias of the regression function, and φ(x) is the mapped vector of the input x. For a traditional regression model, the loss is calculated from the error between the predicted output f(x) and the real output y, and the loss equals zero if and only if f(x) and y are identical. In contrast, SVR accepts a deviation of up to ε between f(x) and y, and a loss is incurred only when the error between f(x) and y is larger than ε. As Fig. 2 shows, this yields a tube centered on f(x) with the insensitive loss factor ε as its half-width; a predicted value ŷ is regarded as correct if it is located inside this "ε-tube", so the regression problem is known as ε-SVR.

[Fig. 2. Schematic diagram of ε-insensitive SVR.]

The loss function of ε-SVR is defined as

$$L = C\sum_{i=1}^{l}\ell_{\varepsilon}\bigl(f(x_i) - y_i\bigr) + \frac{1}{2}\lVert w\rVert^{2} \tag{4}$$

where C is the regularization constant and $\ell_{\varepsilon}(\cdot)$ is the ε-insensitive loss function

$$\ell_{\varepsilon}(z) = \begin{cases} 0, & \lvert z\rvert \le \varepsilon \\ \lvert z\rvert - \varepsilon, & \text{otherwise} \end{cases} \tag{5}$$
In this loss function, the first term is the error term and the second term, $\frac{1}{2}\lVert w\rVert^{2}$, is the L2 regularization term. SVR is usually applied to small-sample problems, which are prone to overfitting, and L2 regularization can mitigate overfitting to a certain extent. So far we have assumed that all training samples lie inside the "ε-tube", but this assumption is often hard to meet in actual tasks. To alleviate the problem, some samples are allowed to exceed the constraint, and a soft margin is introduced for this reason. In soft-margin SVR, slack variables ξ_i and ξ̂_i represent the degree to which each sample exceeds the constraint. The loss function then has four optimization parameters, w, b, ξ_i and ξ̂_i; the SVR optimization problem can be rewritten as Eq. (6), with the constraints quantified according to the definition of the slack variables:
$$\min_{w, b, \xi_i, \hat{\xi}_i} L(w, b, \xi_i, \hat{\xi}_i) = \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{l}(\xi_i + \hat{\xi}_i)$$
$$\text{s.t.}\quad f(x_i) - y_i \le \varepsilon + \xi_i,\qquad y_i - f(x_i) \le \varepsilon + \hat{\xi}_i,\qquad \xi_i \ge 0,\ \hat{\xi}_i \ge 0,\quad i = 1, 2, \cdots, l \tag{6}$$
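As a quick numerical illustration of the loss in Eqs. (4)-(5), the short sketch below evaluates the ε-insensitive loss; it is a standalone example under our own naming, not part of the paper's implementation.

```python
import numpy as np

def eps_insensitive(z, eps):
    """l_eps(z) of Eq. (5): zero inside the tube, linear outside it."""
    return np.maximum(np.abs(z) - eps, 0.0)

# residuals f(x_i) - y_i against a tube of half-width eps = 0.1:
print(eps_insensitive(np.array([0.05, -0.25, 0.40]), 0.1))  # [0.   0.15 0.3 ]
```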
3.2. Lagrange dual problem
In Eq. (6), support vector regression is cleverly formulated under the umbrella of convex optimization. Meanwhile, we should note that each training sample appears in four constraints. This constrained optimization problem is called the primal problem, and it is basically characterized as follows:

• The loss function is a convex function of the optimization parameters.
• The constraints are linear in the optimization parameters.

Accordingly, we may solve the constrained optimization problem by the method of Lagrange multipliers [16]. First, construct the Lagrange function by introducing the multipliers μ_i ≥ 0, μ̂_i ≥ 0, α_i ≥ 0 and α̂_i ≥ 0:
$$L(w, b, \alpha_i, \hat{\alpha}_i, \xi_i, \hat{\xi}_i, \mu_i, \hat{\mu}_i) = \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{l}(\xi_i + \hat{\xi}_i) - \sum_{i=1}^{l}\mu_i\xi_i - \sum_{i=1}^{l}\hat{\mu}_i\hat{\xi}_i + \sum_{i=1}^{l}\alpha_i\bigl(f(x_i) - y_i - \varepsilon - \xi_i\bigr) + \sum_{i=1}^{l}\hat{\alpha}_i\bigl(y_i - f(x_i) - \varepsilon - \hat{\xi}_i\bigr) \tag{7}$$

The solution of the constrained optimization problem is determined by the saddle point of the Lagrange function: the Lagrangian must be minimized with respect to the primal variables w, b, ξ_i and ξ̂_i, and maximized with respect to the multipliers. Setting the partial derivatives of L(w, b, α_i, α̂_i, ξ_i, ξ̂_i, μ_i, μ̂_i) with respect to w, b, ξ_i and ξ̂_i to zero yields the following equations:

$$w = \sum_{i=1}^{l}(\hat{\alpha}_i - \alpha_i)\phi(x_i) \tag{8}$$

$$0 = \sum_{i=1}^{l}(\hat{\alpha}_i - \alpha_i) \tag{9}$$

$$C = \alpha_i + \mu_i \tag{10}$$

$$C = \hat{\alpha}_i + \hat{\mu}_i \tag{11}$$

Substituting Eqs. (8) to (11) into Eq. (7) gives the SVR dual problem:

$$\min_{\alpha, \hat{\alpha}} f(\alpha, \hat{\alpha}) = \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}(\hat{\alpha}_i - \alpha_i)(\hat{\alpha}_j - \alpha_j)K(x_i, x_j) - \sum_{i=1}^{l}y_i(\hat{\alpha}_i - \alpha_i) + \varepsilon\sum_{i=1}^{l}(\hat{\alpha}_i + \alpha_i)$$
$$\text{s.t.}\quad \sum_{i=1}^{l}(\hat{\alpha}_i - \alpha_i) = 0;\quad 0 \le \alpha_i, \hat{\alpha}_i \le C \tag{12}$$

where K(x_i, x_j) = φ(x_i)^T φ(x_j) represents the kernel function. The above process must satisfy the KKT conditions:

$$\begin{cases} \alpha_i\bigl(f(x_i) - y_i - \varepsilon - \xi_i\bigr) = 0\\ \hat{\alpha}_i\bigl(y_i - f(x_i) - \varepsilon - \hat{\xi}_i\bigr) = 0\\ \alpha_i\hat{\alpha}_i = 0,\quad \xi_i\hat{\xi}_i = 0\\ (C - \alpha_i)\xi_i = 0,\quad (C - \hat{\alpha}_i)\hat{\xi}_i = 0 \end{cases} \tag{13}$$

Substituting Eq. (8) into Eq. (3) gives the SVR function:

$$f(x) = \sum_{i=1}^{l}(\hat{\alpha}_i - \alpha_i)K(x_i, x) + b \tag{14}$$

Considering the KKT conditions in Eq. (13), each sample (x_i, y_i) must satisfy (C − α_i)ξ_i = 0 and α_i(f(x_i) − y_i − ε − ξ_i) = 0. Thereupon, after obtaining α_i from Eq. (12), if 0 < α_i < C then ξ_i = 0, so f(x_i) − y_i − ε = 0, and b can be obtained by replacing f(x_i) with Eq. (14):

$$b = y_j + \varepsilon - \sum_{i=1}^{l}(\hat{\alpha}_i - \alpha_i)K(x_i, x_j) \tag{15}$$

In theory, b can be calculated from any sample satisfying 0 < α_j < C. In practice, a more robust approach is often used: select several or all of the samples satisfying 0 < α_j < C, solve for b from each, and take the mean value. The final expression for b is Eq. (16), where p represents the number of samples satisfying 0 < α_j < C:

$$b = \frac{1}{p}\sum_{j=1}^{p}\Bigl(y_j + \varepsilon - \sum_{i=1}^{l}(\hat{\alpha}_i - \alpha_i)K(x_i, x_j)\Bigr) \tag{16}$$
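Once the dual coefficients are known, Eqs. (14)-(16) translate directly into code. Below is a minimal sketch assuming a generic kernel callable `k`; the names and the guard on the free set are our assumptions, not the paper's implementation.

```python
import numpy as np

def svr_predict(x, sv, coef, b, k):
    """Eq. (14): f(x) = sum_i (alpha_hat_i - alpha_i) * K(x_i, x) + b."""
    return sum(c * k(xi, x) for c, xi in zip(coef, sv)) + b

def average_bias(X, y, alpha, alpha_hat, eps, C, k):
    """Eq. (16): average the per-sample b of Eq. (15) over the free samples."""
    coef = alpha_hat - alpha
    free = [j for j in range(len(y)) if 0.0 < alpha[j] < C]  # 0 < alpha_j < C
    assert free, "at least one free sample is needed to estimate b"
    bs = [y[j] + eps - sum(coef[i] * k(X[i], X[j]) for i in range(len(y)))
          for j in free]
    return float(np.mean(bs))
```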
3.3. Improved SMO

The initial model can be trained with a batch learning algorithm, since historical flight data are available. The choice of batch learning algorithm is significant, because an efficient and accurate initial modeling method speeds up the whole modeling process. An SMO algorithm using second-order information is adopted to train the initial model. The key to the ε-SVR problem is how to solve the dual problem (12) efficiently, and an SMO decomposition method using second-order information is applied to this end. Dual problem (12) can be rewritten as

$$\min_{\alpha, \hat{\alpha}} f(\alpha, \hat{\alpha}) = \frac{1}{2}\bigl[\alpha^{T}, \hat{\alpha}^{T}\bigr]\begin{bmatrix} K & -K\\ -K & K \end{bmatrix}\begin{bmatrix} \alpha\\ \hat{\alpha} \end{bmatrix} + \bigl[\varepsilon e^{T} + y^{T}, \varepsilon e^{T} - y^{T}\bigr]\begin{bmatrix} \alpha\\ \hat{\alpha} \end{bmatrix}$$
$$\text{s.t.}\quad Z^{T}\begin{bmatrix} \alpha\\ \hat{\alpha} \end{bmatrix} = 0;\quad 0 \le \alpha_i, \hat{\alpha}_i \le C,\quad i = 1, 2, \cdots, l \tag{17}$$

where e is the vector of all ones and Z is a 2l-by-1 vector with Z_i = 1 for i = 1, 2, ..., l and Z_i = −1 for i = l+1, l+2, ..., 2l. Problem (17) can be converted to problem (18) [17]:

$$\min_{\beta} f(\beta) = \frac{1}{2}\beta^{T}Q\beta - P^{T}\beta$$
$$\text{s.t.}\quad Z^{T}\beta = 0;\quad 0 \le \beta_i \le C,\quad i = 1, 2, \cdots, 2l \tag{18}$$

where $\beta = [\alpha^{T}, \hat{\alpha}^{T}]^{T}$, $Q = \begin{bmatrix} K & -K\\ -K & K \end{bmatrix}$, and $P = -\begin{bmatrix} \varepsilon e + y\\ \varepsilon e - y \end{bmatrix}$.
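The doubled matrices of Eq. (18) assemble mechanically from the l-by-l kernel matrix; a brief sketch (our own helper, with shapes as assumed in the text):

```python
import numpy as np

def doubled_problem(K, y, eps):
    """Build Q, P, Z of Eq. (18) from the kernel matrix K and targets y."""
    l = len(y)
    Q = np.block([[K, -K], [-K, K]])                            # 2l x 2l
    P = -np.concatenate([eps * np.ones(l) + y, eps * np.ones(l) - y])
    Z = np.concatenate([np.ones(l), -np.ones(l)])               # Z_i = +1 / -1
    return Q, P, Z
```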
The matrix Q is usually fully dense and may be too large to store, and decomposition methods are designed to handle this difficulty [18]. Unlike most optimization methods, which update the whole vector β in each step of an iterative process, a decomposition method modifies only a subset of the variables per iteration. This subset, denoted the working set B, leads to a small sub-problem to be minimized in each iteration. An extreme case is SMO [19,20], which restricts B to two elements; then no general optimization routine is needed in each iteration, since the resulting two-variable problem can be solved directly. The main steps are given in Algorithm 1, and the flowchart of Algorithm 1 is shown in Table 1.
Algorithm 1: SMO
1: Find β^1 as an initial feasible solution, and set k = 1.
2: If β^k is an optimal solution of Eq. (18), stop. Otherwise, find a two-element working set B^k = {i, j} ⊂ {1, 2, ..., 2l} by Algorithm 2.
3: Define N ≡ {1, 2, ..., 2l} \ B^k, and let β_B^k and β_N^k be the sub-vectors of β^k corresponding to B^k and N^k, respectively.
4: Solve the following sub-problem with respect to the variable β_B^k:

$$\begin{aligned} \min_{\beta_B^k} \;& \frac{1}{2}\bigl[(\beta_B^k)^T, (\beta_N^k)^T\bigr] \begin{bmatrix} Q_{BB}^k & Q_{BN}^k \\ Q_{NB}^k & Q_{NN}^k \end{bmatrix} \begin{bmatrix} \beta_B^k \\ \beta_N^k \end{bmatrix} - \bigl[(P_B^k)^T, (P_N^k)^T\bigr] \begin{bmatrix} \beta_B^k \\ \beta_N^k \end{bmatrix} \\ =\;& \frac{1}{2}(\beta_B^k)^T Q_{BB}^k \beta_B^k + (-P_B^k + Q_{BN}^k \beta_N^k)^T \beta_B^k + \text{constant} \\ =\;& \frac{1}{2}[\beta_i^k, \beta_j^k] \begin{bmatrix} Q_{ii}^k & Q_{ij}^k \\ Q_{ij}^k & Q_{jj}^k \end{bmatrix} \begin{bmatrix} \beta_i^k \\ \beta_j^k \end{bmatrix} + (-P_B^k + Q_{BN}^k \beta_N^k)^T \begin{bmatrix} \beta_i^k \\ \beta_j^k \end{bmatrix} + \text{constant} \\ \text{s.t.}\;& 0 \le \beta_i^k, \beta_j^k \le C; \quad z_i^k \beta_i^k + z_j^k \beta_j^k = -(z_N^k)^T \beta_N^k \end{aligned} \tag{19}$$

5: Set β_B^{k+1} to the optimal solution of step 4 and β_N^{k+1} ≡ β_N^k. Set k ← k + 1 and go to step 2.

Table 1. The flowchart of Algorithm 1.

Algorithm 1: SMO
Input: training set D = {(x_1, y_1), (x_2, y_2), ..., (x_l, y_l)}; working set selection algorithm (Algorithm 2): B(·);
Process:
1: compute Q = [K, -K; -K, K], P = -[εe + y; εe - y], Z = [e, -e]^T; initialize β = [O, O]^T and the gradient G = ∇f(β) = Qβ - P, where K_ij = K(x_i, x_j), O = [0 0 ... 0]_{1×l}, e = [1 1 ... 1]_{1×l};  #Eq. (18)
2: while(true) do
3:   B = [i, j] = B(G, β, Z, Q);  #Algorithm 2
4:   if j = -1; break while; end if;
5:   solve the constrained optimization problem in the two parameters β_i and β_j, and update G accordingly;  #Eq. (19)
6: end while
Output: β
Better methods of selecting the working set B^k can reduce the number of iterations. The working set selection method using second-order information proffered by Fan [11] is employed here, which leads to faster convergence. The selection procedure is organized in Algorithm 2, and its detailed flowchart is shown in Table 2.

Algorithm 2: Working set selection
1: define

$$a_{ts} \equiv K_{tt} + K_{ss} - 2K_{ts} \tag{20}$$

$$b_{ts} \equiv -Z_t \nabla f(\beta^k)_t + Z_s \nabla f(\beta^k)_s \tag{21}$$

where $\nabla f(\beta) = Q\beta - P$ is the gradient of $f(\beta)$.
2: define

$$\bar{a}_{ts} \equiv \begin{cases} a_{ts}, & \text{if } a_{ts} > 0 \\ \tau, & \text{otherwise} \end{cases} \tag{22}$$

where τ is a small positive number.
3: select

$$i \in \arg\max_t \{-Z_t \nabla f(\beta^k)_t \mid t \in I_{up}(\beta^k)\} \tag{23}$$

$$j \in \arg\min_t \Bigl\{-\frac{b_{it}^2}{\bar{a}_{it}} \,\Big|\, t \in I_{low}(\beta^k),\ -Z_t \nabla f(\beta^k)_t < -Z_i \nabla f(\beta^k)_i\Bigr\} \tag{24}$$

where

$$I_{up}(\beta) \equiv \{t \mid \beta_t < C, Z_t = 1 \text{ or } \beta_t > 0, Z_t = -1\} \tag{25}$$

$$I_{low}(\beta) \equiv \{t \mid \beta_t < C, Z_t = -1 \text{ or } \beta_t > 0, Z_t = 1\} \tag{26}$$

4: return B^k = {i, j}.

Table 2. The flowchart of Algorithm 2.

Algorithm 2: Working set selection
Input: G, β, Z, Q from Algorithm 1;
Process:
1: i = -1, g_max = -∞, g_min = +∞
2: for t = 1, 2, ..., 2l
3:   if (Z_t = 1 and β_t < C) or (Z_t = -1 and β_t > 0)  #Eq. (25)
4:     if -Z_t·G_t ≥ g_max; i = t, g_max = -Z_t·G_t; end if;  #Eq. (23)
5:   end if
6: end for
7: j = -1, h_min = +∞;
8: for t = 1, 2, ..., 2l
9:   if (Z_t = 1 and β_t > 0) or (Z_t = -1 and β_t < C)  #Eq. (26)
10:    b = g_max + Z_t·G_t, g_min = min(g_min, -Z_t·G_t);  #Eq. (21)
11:    if b > 0;
12:      a = Q_ii + Q_tt - 2·Z_i·Z_t·Q_it; if a ≤ 0 then a = τ;  #Eqs. (20), (22)
13:      if -b²/a ≤ h_min; j = t, h_min = -b²/a; end if;  #Eq. (24)
14:    end if
15:  end if
16: end for
17: if g_max - g_min < τ; i = -1, j = -1; end if
Output: [i, j]
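To make the two passes of Table 2 concrete, here is a direct Python transcription of the selection rule; `G` is the current gradient ∇f(β) and `tau` the small positive tolerance. This is a sketch of the published flowchart, not the authors' C++ code.

```python
import numpy as np

def select_working_set(G, beta, Z, Q, C, tau=1e-12):
    """Second-order working set selection (Table 2, Eqs. (23)-(26))."""
    n = len(beta)                            # n = 2l doubled variables
    i, g_max = -1, -np.inf                   # first pass: Eq. (23) over I_up
    for t in range(n):
        if (Z[t] == 1 and beta[t] < C) or (Z[t] == -1 and beta[t] > 0):
            if -Z[t] * G[t] >= g_max:
                i, g_max = t, -Z[t] * G[t]
    if i == -1:
        return -1, -1
    j, g_min, h_min = -1, np.inf, np.inf     # second pass: Eq. (24) over I_low
    for t in range(n):
        if (Z[t] == 1 and beta[t] > 0) or (Z[t] == -1 and beta[t] < C):
            g_min = min(g_min, -Z[t] * G[t])
            b = g_max + Z[t] * G[t]          # b_it of Eq. (21)
            if b > 0:
                a = Q[i, i] + Q[t, t] - 2 * Z[i] * Z[t] * Q[i, t]  # a_it, Eq. (20)
                a = a if a > 0 else tau                            # a_bar, Eq. (22)
                if -b * b / a <= h_min:
                    j, h_min = t, -b * b / a
    if g_max - g_min < tau:                  # near-optimal: signal the caller to stop
        return -1, -1
    return i, j
```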
The above are the initial SVR model training algorithms. However, SVR can only handle multi-input single-output (MISO) problems, so multiple SVR machines are combined to handle multi-input multi-output (MIMO) problems. To simplify model storage and use, the multiple SVR machines are stored in a single file, juxtaposed in the order of the outputs, and in MIMO problems the parameters of each output are trained and tested in turn automatically.
4. SGDSVR
The initial aero-engine model is usually trained on small samples, which may lead to overfitting and poor generalization. This means the initial model may predict accurately under the flight conditions whose data were collected in the training set, yet perform poorly when the aircraft encounters new flight conditions whose data were never trained. The traditional approach is to add the new flight data to the training set manually and batch-learn the model again after the flight. Compared with online learning this is inefficient: online learning retains the trained parameters and merely modifies or extends them according to the new flight data, whereas the traditional approach abandons all trained parameters and retrains from scratch on the union of old and new flight data. Online SVR, which can modify the model online, is therefore an attractive choice. However, traditional online SVR methods are incremental and decremental algorithms: the learned samples must be recalculated whenever a new sample arrives, so traditional online SVR is time-consuming and affects the real-time performance of monitoring. SGDSVR is advocated to deal with this problem.

The initial regression model obtained by the training algorithm above is

$$f(x) = \sum_{i=1}^{s}(\hat{\alpha}_i - \alpha_i)K(x_i, x) + b \tag{27}$$

where s is the number of support vectors stored in the model. The RBF kernel is employed in this article:

$$K(x_i, x) = e^{-\frac{\lVert x_i - x\rVert^{2}}{2\delta^{2}}} \tag{28}$$

We hypothesize that the new flight data for online learning also fit the form of Eq. (27), so online learning algorithms can be applied to modify the parameters in Eq. (27). The prediction function of an RBF neural network is shown in Eq. (29):

$$g_j(x) = \sum_{i=1}^{h} w_{ij}K(c_i, x) + b,\quad j = 1, 2, \cdots, n \tag{29}$$

where w_ij is a hidden-layer weight, c_i is a center vector, and n is the number of output neurons. If the RBF neural network has only one output neuron, and we regard α̂_i − α_i as the weight w_i1 and the support vector x_i as the center vector c_i, the SVR function is the same as the RBF neural network [21,22]. Therefore the SGD algorithm, which is widely applied to online learning of neural networks, is also applicable to online learning of SVR [23-27].

Although the model is trained for abnormality detection, it cannot execute that task well before the online training process is complete, and it is better to avoid online modification while the system is abnormal, because abnormal data would contaminate the online model. Generally, abnormal conditions are far rarer than normal conditions, so the model still maintains high accuracy if it is trained online for a sufficiently long time. Nevertheless, modern aircraft carry some simple fault diagnosis systems, and online learning had better stop when an existing diagnosis system sends an alert. Thus, a modifying switch is introduced so that the online learning process can be stopped and started manually, avoiding modification when the aero-engine is obviously abnormal. Furthermore, the modifying switch avoids modifying the model when the aircraft is waiting to take off or cruising for a long time, which might cause overfitting, and it reduces the computational burden. At present, the modifying switch is simply a control button operated by the operator, who should turn it on and off according to the prevailing flight conditions: for example, the switch should be turned off when another fault diagnosis system sends an alert or when the aircraft is waiting to take off or cruising for a long time, and turned on when the aircraft is executing a new flight mission. In future work, we will develop an algorithm that turns the switch on and off automatically according to the flight data.

The SGDSVR algorithm is demonstrated in Algorithm 3.

Algorithm 3: SGDSVR
Step 1: calculate the predicted value ŷ through Eq. (27) when a new sample arrives.
Step 2: calculate the mean square error (MSE) between the predicted value ŷ and the actual value y:

$$MSE = \frac{1}{2}(\hat{y} - y)^{2} \tag{30}$$

Step 3: calculate the partial derivatives of the MSE with respect to each support vector x_i and the bias b as the updating gradient:

$$\frac{\partial MSE}{\partial b} = \hat{y} - y \tag{31}$$

$$\frac{\partial MSE}{\partial x_i} = \frac{\partial MSE}{\partial \hat{y}}\cdot\frac{\partial \hat{y}}{\partial K(x_i, x)}\cdot\frac{\partial K(x_i, x)}{\partial x_i} = (\hat{y} - y)(\hat{\alpha}_i - \alpha_i)\frac{\partial K(x_i, x)}{\partial x_i} \tag{32}$$

where $\frac{\partial K(x_i, x)}{\partial x_i}$, i = 1, 2, ..., s, is the partial derivative of the kernel function with respect to the vector:

$$\frac{\partial K(x_i, x)}{\partial x_i} = \Bigl[\frac{\partial K(x_i, x)}{\partial x_{i1}}, \frac{\partial K(x_i, x)}{\partial x_{i2}}, \cdots, \frac{\partial K(x_i, x)}{\partial x_{im}}\Bigr],\qquad \frac{\partial K(x_i, x)}{\partial x_{im}} = -e^{-\frac{\lVert x_i - x\rVert^{2}}{2\delta^{2}}}\cdot\frac{x_{im} - x_m}{\delta^{2}} \tag{33}$$

Step 4: update the vectors and the bias according to the gradient:

$$b = b - \eta\frac{\partial MSE}{\partial b} \tag{34}$$

$$x_i = x_i - \eta\frac{\partial MSE}{\partial x_i} \tag{35}$$

where η is the learning rate, a small number, because the initial model is already accurate and only slight modifications are needed.

Table 3. The flowchart of Algorithm 3.

Algorithm 3: SGDSVR
Input: optimization parameters x_i and b; a new sample (x_j, y_j); learning rate η;
Process:
1: calculate ŷ_j;  #Eq. (27)
2: calculate MSE;  #Eq. (30)
3: calculate the gradients ∂MSE/∂b and ∂MSE/∂x_i;  #Eqs. (31), (32)
4: update the parameters x_i and b;  #Eqs. (34), (35)
Output: x_i, b

In SGDSVR, the gradient of the vectors and bias depends only on the present sample, so the learned samples do not need to be stored and recalculated; the modification therefore takes far less time than traditional online SVR (see Table 3).
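A compact sketch of one SGDSVR update (Algorithm 3, Eqs. (27)-(35)) under the RBF kernel of Eq. (28); the array names and the functional style are our own assumptions, not the paper's C++ implementation.

```python
import numpy as np

def rbf(sv, x, delta):
    """Eq. (28): K(x_i, x) = exp(-||x_i - x||^2 / (2 delta^2)) for every row of sv."""
    return np.exp(-np.sum((sv - x) ** 2, axis=1) / (2.0 * delta ** 2))

def sgdsvr_step(sv, coef, b, x, y, eta, delta):
    """One online update: move the support vectors and bias down the MSE gradient.

    sv: (s, m) support vectors; coef: (s,) values of (alpha_hat_i - alpha_i),
    which stay fixed. Returns updated sv and b plus the pre-update prediction.
    """
    k = rbf(sv, x, delta)
    y_hat = coef @ k + b                        # Eq. (27)
    err = y_hat - y                             # dMSE/db, Eqs. (30)-(31)
    dk = -(k / delta ** 2)[:, None] * (sv - x)  # dK/dx_i, Eq. (33)
    grad_sv = err * coef[:, None] * dk          # dMSE/dx_i, Eq. (32)
    return sv - eta * grad_sv, b - eta * err, y_hat  # Eqs. (34)-(35)
```

Note that only the support vectors and the bias move while the coefficients α̂_i − α_i stay fixed, which is exactly why the modified model behaves like an RBF network rather than a strict SVR.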
5. Aero-engine modeling with SGDSVR

5.1. Modeling process
The modeling process is divided into two parts: off-line building of the initial model and online modification of the model. The processes of building and online modifying the aero-engine model are shown in Fig. 3, and the steps are as follows.

Step 1: divide the historical flight data into a training set and a testing set.
Step 2: preprocess the data: 1) wipe off the abnormal data; 2) interpolate the missing values; 3) remove similar samples; 4) normalize the data.
Step 3: apply the NARX processing of Eq. (1) to the training set.
Step 4: set the SVR hyperparameters: the regularization coefficient C, the insensitive loss coefficient ε, and the RBF spread δ.
Step 5: train the initial SVR model with Algorithm 1.
Step 6: compute the mean absolute error on the processed testing set by invoking the model trained in Step 5. If the error is smaller than 2%, save the model and go to Step 7; otherwise, go to Step 4. The mean absolute error is defined as
[Fig. 3. Off-line building and online modifying model process.]
$$E_{N_h} = \sum_{i=1}^{h}\bigl|N_{h,i} - \hat{N}_{h,i}\bigr|/h = \sum_{i=1}^{h}\lvert\Delta N_h\rvert/h,\qquad E_{T_6} = \sum_{i=1}^{h}\bigl|T_{6,i} - \hat{T}_{6,i}\bigr|/h = \sum_{i=1}^{h}\lvert\Delta T_6\rvert/h \tag{36}$$
where h is the number of samples in the test set.

Step 7: obtain online data through the online health monitoring software.
Step 8: normalize the online data.
Step 9: save the data at the present and several previous moments in an online data queue, according to the maximal order of the NARX model. When a new sample arrives, update the queue by the first-in first-out (FIFO) principle, as sketched after this list.
Step 10: apply the NARX processing to the online data queue.
Step 11: calculate the online predicted value by invoking the model of Step 5.
Step 12: calculate the absolute prediction error of the online data.
Step 13: if the modifying switch is on, go to Step 14; otherwise, go to Step 15.
Step 14: modify the model with Algorithm 3.
Step 15: if the monitoring mission is to be continued, go to Step 1; otherwise, end the procedure.
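Step 9's fixed-length queue maps naturally onto a double-ended queue; a minimal sketch, where the window length of max(T_order_u, T_order_y) + 1 is our assumption about how much history the regressor needs.

```python
from collections import deque

order_u, order_y = 3, 3                       # illustrative NARX orders
window = deque(maxlen=max(order_u, order_y) + 1)

def on_new_sample(sample):
    """FIFO update (Step 9): appending evicts the oldest record automatically."""
    window.append(sample)                     # sample = normalized (u(t), y(t))
    return len(window) == window.maxlen       # regressor ready once window is full
```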
5.2. Data acquisition
First, we acquire simulation data from the component-level model of a certain two-rotor turbofan aero-engine. The original data of the component-level model are smooth; to test the effect of noise, strong Gaussian white noise is added to the smooth data, and the proposed algorithms are tested with the noisy and the original data respectively. Some of the simulation data are shown in Fig. 4. Secondly, we acquire actual flight data of a certain two-rotor turbojet aero-engine.
Table 4. Mean absolute error for different training data.

        Original data   Noisy data
E_Nh    0.0019          0.0069
E_T6    0.0057          0.0134
The flight data are themselves mixed with slight noise, and they are used to compare the performance of SGDSVR with traditional online SVR. Flight height H, Mach number Ma, total air temperature T_0 and fuel quantity W_fb (power lever angle PLA) are chosen as the input variables, and high-pressure rotor speed N_h and gas temperature after the low-pressure turbine T_6 are chosen as the output variables of this model. Before modeling, all data are normalized into [0, 1].
5.3. Results and analysis
5.3.1. Simulation data

First, we train initial models with the noisy and the original training data respectively, and then test both models with the original testing data. The results are shown in Table 4 and Figs. 5-6. As can be seen, the accuracy of the initial model declines only slightly even though the training data contain strong noise. The results indicate that the algorithm for building the initial model is robust to noise.

Secondly, we modify the initial model trained with the original data, using noisy and original online data respectively. Fig. 7(a) and (b) show the modifying process of T_6 with original and noisy online data respectively, and Fig. 8 and Table 5 show the errors during these processes. At the beginning, E_T6 is 0.0731 for both the original and the noisy data. About 100 seconds later, E_T6 drops dramatically with the modifying switch on, and it stays at a very low level while modifying: 0.0065 and 0.0047 for the original and noisy data respectively. Another 100 seconds later, E_T6 still stays at a very low level with the modifying switch off. The results demonstrate that the online modifying algorithm is also robust to noise.

Finally, we should note that the data are acquired from different operating conditions (as Fig. 4 shows), while a single model attains high accuracy over all of them (as Fig. 5 shows); that is, the proposed algorithms are also robust to the aero-engine operating conditions.

[Fig. 4. Original and noisy simulation data.]

[Fig. 5. Actual and predicted values of N_h and T_6 for two initial models.]

Table 5. Mean absolute errors of two modifying processes.

                         Original data   Noisy data
E_T6 before modifying    0.0731          0.0731
E_T6 while modifying     0.0065          0.0047
E_T6 after modifying     0.0019          0.0030
5.3.2. Flight data

To verify the performance of SGDSVR on the actual flight data, the presented algorithm is compared with AOSVR. The hardware environment for all these algorithms is an Intel(R) Core(TM) i5-7300HQ CPU @ 2.50 GHz with 16 GB memory. First, the initial models are trained with SGDSVR and AOSVR respectively. The initial AOSVR model training algorithm is implemented in Visual Studio 2013, and the C++ source code is the improved version with stabilization provided by Parrella on his website (http://onlinesvr.altervista.org/). The initial SGDSVR model is also implemented in Visual Studio 2013 with C++, and its training algorithm is the improved SMO. Both online modifying algorithms are likewise implemented in C++.

The performance of the two algorithms in building the initial model is shown in Table 6 and Figs. 9-10.
[Fig. 6. Absolute errors of N_h and T_6 for two initial models.]

[Fig. 7. Actual and predicted values of two modifying processes.]

[Fig. 8. Absolute errors of two modifying processes.]

Table 6. Performance of building the initial model.

                  SGDSVR    AOSVR
Training time/s   105.06    2332
E_Nh              0.0129    0.0362
E_T6              0.0173    0.0324

Table 7. Performance of online modifying the model.

                              SGDSVR    AOSVR
Modifying time/(s/sample)     0.21      2.72
E_Nh before modifying         0.0210    0.0455
E_Nh while modifying          0.0008    0.0061
E_Nh after modifying          0.0043    0.0194
E_T6 before modifying         0.0182    0.0519
E_T6 while modifying          0.0009    0.0063
E_T6 after modifying          0.0080    0.0203
108 109 110 112
46
53
107
111
45
50
106
the initial model with two outputs can be trained completely in 105.06 seconds while it takes 2332 seconds for AOSVR algorithm. For SGDSVR, E Nh and E T 6 of test set are 0.0129 and 0.0173 respectively while these values are 0.0362 and 0.0324 respectively for AOSVR. As we can also see in Fig. 9–10, the SGDSVR initial model visually demonstrates a better regression accuracy than AOSVR in
terms of N h and T 6 . The results indicate that the SGDSVR is more efficient and accurate than AOSVR in building the initial model. The performance of two algorithms on online modifying model is shown in Table 7 and Fig. 11–12. Firstly, it can be seen that for SGDSVR the online modifying time is 0.21 second while it takes 2.72 seconds for AOSVR when a new sample comes. The sampling time is 1 second for the tested aero-engine. SGDSVR can satisfy the real-time demand while the AOSVR cannot. Subsequently, to verify the online modifying effectiveness, the initial model is applied to a working condition in which the initial model shows poor performance. In general, online modifying manifests the similar influence on N h and T 6 . Hence, we can only focus on N h . At the beginning, the E Nh are 0.0210 and 0.0455 for SGDSVR and AOSVR respectively. About 100 seconds later, the E Nh drop dramatically with the modifying switch on, whereas SGDSVR shows faster modifying efficiency. The E Nh maintain a pretty low level while modifying, i.e., the value are 0.0008 and 0.0061 for SGDSVR and AOSVR respectively. Another 100 seconds later, the E Nh of two algorithms increases slightly with the modifying switch off, but remain relatively low compared with the E Nh
[Fig. 9. Actual and predicted values of N_h and T_6 of the test set.]

[Fig. 10. Absolute errors of N_h and T_6 of the test set.]

[Fig. 11. Actual and predicted values of N_h and T_6 of the online data.]
The results demonstrate that both algorithms increase the model accuracy through online modification, while SGDSVR modifies faster and more effectively than AOSVR.
6. Conclusions
A novel SGDSVR training mechanism that combines batch learning with online learning has been presented according to the demands and characteristics of the aero-engine model. The main conclusions are as follows:

(1) The results on simulation data and actual flight data indicate that the proposed algorithms are robust to noise and to operating conditions, and are reliable.
(2) During the off-line phase, since historical data are available for training the initial model, an improved SMO algorithm, a batch learning algorithm, is adopted to train the initial model. In this way, an accurate model is obtained in about one hundred seconds, rather than spending thousands of seconds training a usable model online from scratch; both the efficiency and the accuracy of building the initial model are improved.

(3) During the online phase, for the sake of real-time performance, the SGD algorithm is adopted to modify the initial SVR model instead of incremental and decremental learning algorithms. The modifying time is reduced by about a factor of ten, which lets it meet the real-time demand; meanwhile, SGDSVR also shows faster and better modifying effectiveness than AOSVR.
[Fig. 12. Absolute errors of N_h and T_6 of online data.]
In general, with the SGDSVR algorithm, which combines batch learning and online learning, a usable initial aero-engine model can be obtained in a short time and then modified online efficiently.

Declaration of competing interest

There is no conflict of interest.

Acknowledgements

This research was supported by the Fundamental Research Funds for the Intelligent Aero-engines under grant No. 2017-JCJQZD-047-21.

References

[1] Y.P. Zhao, G. Huang, Q.K. Hu, J.F. Tan, J.J. Wang, Z. Yang, Soft extreme learning machine for fault detection of aircraft engine, Aerosp. Sci. Technol. 91 (2019) 70-81.
[2] V. Camerini, G. Coppotelli, S. Bendisch, Fault detection in operating helicopter drivetrain components based on support vector data description, Aerosp. Sci. Technol. 73 (2018) 48-60.
[3] D. Clifton, Condition Monitoring of Gas-Turbine Engines, University of Oxford, London, 2005.
[4] L. Tarassenko, A. Nairac, N. Townsend, Novelty detection for the identification of abnormalities, Int. J. Syst. Sci. 31 (2000) 1427-1439.
[5] L. Tarassenko, A. Nairac, N. Townsend, Novelty detection in jet engines, in: IEEE Colloquium on Condition Monitoring: Machinery, External Structures & Health, 1999.
[6] F. Lu, J. Wu, J. Huang, X. Qiu, Aircraft engine degradation prognostics based on logistic regression and novel OS-ELM algorithm, Aerosp. Sci. Technol. 84 (2019) 661-671.
[7] J. Ma, J. Theiler, S. Perkins, Accurate on-line support vector regression, Neural Comput. 15 (2003) 2683-2703.
[8] G. Cauwenberghs, T. Poggio, Incremental and decremental support vector machine learning, in: Advances in Neural Information Processing Systems, 2001, pp. 409-415.
[9] F. Parrella, Online Support Vector Machines for Regression, University of Genoa, Italy, 2007.
[10] P. Laskov, C. Gehl, S. Krüger, Incremental support vector learning: analysis, implementation and applications, J. Mach. Learn. Res. 7 (2006) 1909-1936.
[11] R.-E. Fan, P.-H. Chen, C.-J. Lin, Working set selection using second order information for training support vector machines, J. Mach. Learn. Res. 6 (2005) 1889-1918.
[12] M. Basso, L. Giarre, S. Groppi, NARX models of an industrial power plant gas turbine, IEEE Trans. Control Syst. Technol. 13 (2005) 0-604.
[13] B. Zina, C. Octavian, R. Ahmed, A nonlinear autoregressive exogenous (NARX) neural network model for the prediction of the daily direct solar radiation, Energies 11 (2018) 620-641.
[14] H. Asgari, X. Chen, M. Morini, NARX models for simulation of the start-up operation of a single-shaft gas turbine, Appl. Therm. Eng. 93 (2016) 368-376.
[15] H. Drucker, C.J.C. Burges, L. Kaufman, A. Smola, V. Vapnik, Support Vector Regression Machines, 1997, pp. 155-161.
[16] D.P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods, Academic Press, 1982.
[17] H. Drucker, C.J.C. Burges, L. Kaufman, Support vector regression machines, in: Neural Information Processing Systems Conference, 1996.
[18] D. Hush, C. Scovel, Polynomial-time decomposition algorithms for support vector machines, Mach. Learn. 51 (2003) 51-71.
[19] C.-J. Lin, Linear convergence of a decomposition method for support vector machines, Mach. Learn. 12 (2001) 291-314.
[20] J. Platt, Fast Training of Support Vector Machines Using Sequential Minimal Optimization, MIT Press, 2000.
[21] Z.Q. Li, Y.P. Zhao, Z.Y. Cai, P.P. Xi, Y.T. Pan, G. Huang, T.H. Zhang, A proposed self-organizing radial basis function network for aero-engine thrust estimation, Aerosp. Sci. Technol. 87 (2019) 167-177.
[22] J. Park, I.W. Sandberg, Universal approximation using radial-basis-function networks, Neural Comput. 3 (1991) 246-257.
[23] L. Bottou, On-line learning and stochastic approximations, in: On-Line Learning in Neural Networks, 1999.
[24] Z. Wang, K. Crammer, S. Vucetic, Breaking the curse of kernelization: budgeted stochastic gradient descent for large-scale SVM training, J. Mach. Learn. Res. 13 (2012) 3103.
[25] N.P. Thanh, Y.S. Kung, S.C. Chen, H.H. Chou, Digital hardware implementation of a radial basis function neural network, Comput. Electr. Eng. 53 (2016) 106-121.
[26] Z.A. Zhu, W. Chen, G. Wang, C. Zhu, Z. Chen, P-PackSVM: Parallel Primal Gradient Descent Kernel SVM, 2009, pp. 677-686.
[27] J. Sai, B. Wang, B. Wu, BPPGD: Budgeted Parallel Primal Gradient Descent Kernel SVM on Spark, 2017, pp. 74-79.
82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119
54
120
55
121
56
122
57
123
58
124
59
125
60
126
61
127
62
128
63
129
64
130
65
131
66
132