Communicated by Dr. Nianyin Zeng
Accepted Manuscript
Robust identification approach for nonlinear state-space models Xin Liu, Xianqiang Yang PII: DOI: Reference:
S0925-2312(18)31471-1 https://doi.org/10.1016/j.neucom.2018.12.017 NEUCOM 20234
To appear in:
Neurocomputing
Received date: Revised date: Accepted date:
15 August 2018 29 November 2018 25 December 2018
Please cite this article as: Xin Liu, Xianqiang Yang, Robust identification approach for nonlinear statespace models, Neurocomputing (2018), doi: https://doi.org/10.1016/j.neucom.2018.12.017
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT
Robust identification approach for nonlinear state-space models Xin Liua , Xianqiang Yanga,b,∗ a
School of Astronautics, Harbin Institute of Technology, Harbin, Heilongjiang 150080, China State Key Laboratory of Robotics and Systems of Harbin Institute of Technology, Harbin, Heilongjiang 150080, China
CR IP T
b
Abstract
AN US
The identification of nonlinear state-space model (NSSM) with output observations corrupted by outliers is investigated in this paper. The outlier is commonly encountered in practical industrial processes which should not be ignored in nonlinear processes modeling. The statistical scheme based on the Student’s t-distribution is applied to resist the outlier and the expectation-maximization (EM) algorithm is employed to simultaneously identify the undetermined model and noise parameters. A particle smoother is introduced and used to approximately calculate the desired Q-function. The usefulness of the proposed approach is demonstrated via the numerical and mechanical examples.
1. Introduction
ED
M
Keywords: Nonlinear system identification; Robustness, Student’s t-distribution; Particle smoother; Expectation-maximization algorithm
AC
CE
PT
Nowadays, industrial processes in modern society are often designed to perform certain complicated tasks. Majority of them exhibit complex intrinsic mechanism and nonlinear property which makes it intractable for modeling these processes with traditional first principle modeling method [1, 2, 3, 4, 5]. Nonlinear system identification (NSI) is regarded as a favorable alternative and has attracted more and more attentions [6, 7, 8, 9, 10]. The main purpose of NSI is to obtain an explicit mathematical expression which can accurately describe the process behavior of nonlinear system based on the informative process data [11, 12]. The identification methods for nonlinear systems with various types of models, such as Hammerstein models [13], Nonlinear Autoregressive eXogenous (NARX) models [14], Wiener models [15], and state-space model (SSM) [16], are widely studied. Currently, the existing identification techniques for nonlinear systems can be roughly divided into three types: the black-box, white-box and grey-box approaches [9, 17]. In the white-box methods such as subspace method [18], set-membership method [19] and recursive method [20], the prior knowledge about the system dynamics should be comprehensively ∗
Corresponding author Email address:
[email protected] (Xianqiang Yang)
Preprint submitted to Neurocomputing
December 28, 2018
ACCEPTED MANUSCRIPT
AC
CE
PT
ED
M
AN US
CR IP T
known. On the contrary, the black-box methods generally construct the system models only based on the process data, the specific process knowledge is not necessary in this kind of methods [13, 15, 17, 21, 22, 23]. The grey-box methods can be treated as a combination of the white-box and black-box methods. In the grey-box identification method, black-box estimation approach is performed and supported by a priori information about the system [9, 17]. In majority of the conventional system identification methods, the Gaussian distributed noise model is adopted and it lacks of robustness for the outlier. The outlier is a common issue existed in practical processes and it is usually caused by data recording error, system disturbance, sensor malfunction and so on [12]. Recently, the usage of robust observation models such as Gaussian mixture models [24] and Student’s t-distribution model [25, 26] for NSI has gained more and more attention. In [24], Jin et al. found a Gaussian mixture models based identification solution for switched systems with one Gaussian component was applied to model the output and the other Gaussian component was applied to model the random outliers. But in practical industrial data, the number of outliers and the outliers positions can be arbitrary which make their method lack of flexibility since only one Gaussian model was used to model the outliers. In [26], Lu et al. built a Student’s t-based identification framework for nonlinear systems. The Student’s t-distribution is a heavy-tailed disctribution and it can be factored into infinite Gaussian components with varying variances which makes it robust for various types of outliers. This paper considers the robust identification of NSSM with measurements corrupted by outliers. Among various model structures, the SSM can describe the system dynamics naturally and, especially, represent the multiple input and multiple output (MIMO) systems feasibly, which show great advantages [16, 27]. In [28, 29], the unscent Kalman filter and particle filter were applied for NSSM identfication and the unknown parameters were both identified with the EM algorithm. The subspace and set-membership identification approaches can be found in [18] and [19], respectively. However, all these identification methods were designed for Gaussian measurement noises and they were not robust for the outliers. In [30], a robust Kalman filter was developed based on Student’s t distribution, but it was only suitable for linear SSM. For NSSM, the robust state estimation methods were developed by using a combination of the Student’s t distribution and Variational Bayesian (VB) approach in [31, 32]. But the accurate model parameters were required in these methods which limited their applications for system identification. Huang et al. designed a robust Gaussian approximate (GA) fixed-interval fiter for NSSM identification with the heavy-tailed t-distribution [33]. However, the variance of noise should be known accurately and it also suffers heavy computational burden. In this paper, the robust identification method for NSSM is developed and the robust observation model based on the Student’s t-distribution is established to deal with the outliers. The proposed method is derived under the EM algorithm scheme and a particle smoother is adopted to numerically calculate the Q-function. The major contributions of this paper are: 1. The robust framework for identifying the NSSM with contaminated measurements is established in this work; 2. To resist the outliers existed in the output, the Student’s t-based statistical model is applied for modeling the measurement noise; 3. The formulas to calculate the undetermined model 2
ACCEPTED MANUSCRIPT
2. Problem preliminary Consider the following NSSM expressed as
CR IP T
and noise parameters iteratively are all derived with EM algorithm. The rest of this work is given as follows. Section 2 states the identification of the NSSM with measurements contaminated with outliers in detail. Section 3 first introduces the EM algorithm, particle filter, and particle smoother, respectively, and then the detailed derivations of the proposed robust approach are presented. The efficacy of current work is verified through a numerical example and a two-link rigid manipulator in Section 4. Section 5 gives the concluding remarks of current work.
AN US
xk = f (θf , uk−1 , xk−1 ) + wk−1 , yk = g(θg , uk , xk ) + vk .
(1) (2)
where {uk }k=1,2,3...,N and {yk }k=1,2,3...,N denote the input and output sequences, respectively; {xk }k=1,2,3...,N are the hidden states at all the sampling moments; f (·) and g(·) are the nonlinear and smooth functions and they are utilized to describe the transition and measurement processes, respectively; θf and θg represent the unknown parameters of the above NSSM. wk is the state noise which is generally Gaussian distributed. In order to resist the outliers existed in the measurements, the noise vk is assumed to be Student’s t-distributed. That is (3) (4)
ED
M
wk ∼ N (wk |0, Rw ), vk ∼ St(vk |0, Rv , r).
CE
PT
where Rw and Rv are the variance and scale of wk and vk , respectively; r > 0 is the degree of freedom (DOF) of vk . With the property of Student’s t-distribution, it is derived that [26] Z St(vk |0, Rv , r) = p(vk |0, Rv , λk )p(λk |r)dλk =
Γ((r + 1)/2)(πrRv )−1/2 . Γ(r/2){1 + vk2 /(rRv )}(r+1)/2
(5)
AC
where λk is the auxiliary random variable and Γ(·) is the Gamma function. The term p(vk |0, Rv , λk ) is a Gaussian distributed probability density function (PDF) with zero mean and variance Rv /λk and p(λk |r) follows Gamma distribution p(λk |r) ∼ G(λk |r/2, r/2).
(6)
According to Eq. (2), yk is also t-distributed [12] yk ∼ St(yk |g(xk , uk , θg ), Rv , r), 3
(7)
ACCEPTED MANUSCRIPT
and St(yk |(g(xk , uk , θg ), Rv , r)) Z = p(yk |g(xk , uk , θg ), Rv , λk )p(λk |r)dλk
Γ((r + 1)/2)(πrRv )−1/2 . Γ(r/2){1 + (yk − g(xk , uk , θg ))2 /(rRv )}(r+1)/2
(8)
CR IP T
=
In current work, the input-output data sequences {yk , uk }k=1,2,...,N are collected to construct the available set Cobs = {u1:N , y1:N }. The hidden state variables {xk }k=1,2,...,N and the auxiliary variables {λk }k=1,2,...,N are used to construct the missing data set Cmis = {x1:N , λ1:N }. Therefore, this paper aims to identify the parameters Θ = {θg , θf , Rw , Rv , r} based on Cobs and the contaminated observations.
AN US
3. The derivations of the proposed method
In the sequel, the EM algorithm is used to identify the NSSM with contaminated output [34, 35, 36, 37]. In the E-step, the cost function (Q-function) is calculated as [12, 35, 38, 39] Q(Θ|Θs ) = ECmis |Cobs ,Θs {log p(Cmis , Cobs |Θ)},
(9)
M
In the M-step, the unknown parameters are estimated as [12, 37] Θ = arg max Q(Θ|Θs ).
(10)
ED
Θ
3.1. Calculation of the Q-function First, according to the probability chain rule, log p(Cmis , Cobs |Θ) can be rewritten as
CE
PT
log p(x1:N , λ1:N , y1:N , u1:N |Θ) = log p(y1:N |u1:N , x1:N , λ1:N , Θ) + log p(λ1:N |u1:N , x1:N |Θ) + log p(x1:N |u1:N , Θ) + log p(u1:N , Θ).
(11)
AC
Based on Eqs. (1)-(8), log p(Cmis , Cobs |Θ) can be further simplified as log p(x1:N , λ1:N , y1:N , u1:N |Θ)
= +
N X
k=1 N X k=2
log p(yk |uk , xk , λk , Rv , θg ) +
N X k=1
log p(λk |uk , xk , r)
log p(xk |xk−1 , uk−1 , Rw , θf ) + log p(x1 |Θ) + C1 .
4
(12)
ACCEPTED MANUSCRIPT
where C1 = log p(u1:N , Θ) is a constant. The expectation of log p(Cmis , Cobs |Θ) with respect to λk can be constructed as Z p(λk |Cobs , Θs ) log p(Cmis , Cobs |Θ)dλk λk
+ +
k=1 λk N Z X k=1 N X k=2
λk
p(λk |Cobs , Θs ) × log p(yk |uk , xk , λk , Rv , θg )dλk
CR IP T
=
N Z X
p(λk |Cobs , Θs ) × log p(λk |xk , uk , r)dλk
log p(xk |xk−1 , uk−1 , Rw , θf ) + log p(x1 |Θ1 ) + C1 .
(13)
AN US
Morever, the conditional expectation of Eq. (13) with respect to the hidden variable xk is calculated as Z N Z X s s p(λk |Cobs , Θs ) p(xk |Cobs , Θ ) × Q(Θ|Θ ) = λk
k=1
× log p(yk |uk , xk , λk , Rv , θg )dλk dxk
+
Z
M
k=1 N Z X
s
p(xk |Cobs , Θ ) ×
λk
s
p(λk |Cobs , Θ ) × log p(λk |xk , uk , r)dλk dxk
ED
+
N Z X
p(xk , xk−1 |Cobs , Θs ) × log p(xk |xk−1 , uk−1 , Rw , θf )dxk dxk−1
PT
Zk=2 + p(x1 |Cobs , Θs ) × log p(x1 |Θ1 )dx1 + C1 .
(14)
AC
CE
To further simplify the Q-function in Eq. (14), the next two PDFs are necessary (1) p(xk |Cobs , Θs ) (2) p(xk , xk−1 |Cobs , Θs )
(15)
Calculations of p(xk |Cobs , Θs ) and p(xk , xk−1 |Cobs , Θs ) can be simplified as calculations of p(xk |y1:N , Θs ) and p(xk , xk−1 |y1:N , Θs ), which are both smoothing problems [11]. Since it is intractable to calculate these two terms directly, they are numerically approximated with particle smoother which is introduced in the following sections. 3.2. Particle filter and smoother The particle filter provides a numerical solution to approximate the PDFs of interest with finite random weighted samples [11, 37, 39, 40]. In this way, the PDF p(xk |y1:k ) can be 5
ACCEPTED MANUSCRIPT
numerically computed as [39] p(xk |y1:k ) ≈
L X l=1
Wkl δ(xk − xlk ), δ(xk − xlk ) =
(
1, xk = xlk 0, others
(16)
l Wkl ∝ Wk−1 p(yk |xlk ).
CR IP T
where δ(·) is the Dirac delta function. Since it is unrealistic to generate particles, the importance density is often used here [11, 37]. Generally, the importance density is often chosen as the probablilty of state transition, then corresponding weight of every particle is updated as [39, 40] (17)
M
AN US
Moreover, the resampling procedure is applied to avoid the degeneracy of the particles. Similarly, the particle smoother is employed to calculate the PDFs p(xk |y1:N , Θs ) and p(xk , xk−1 | y1:N , Θs ). First, the smoothed PDF p(xk |y1:N , Θs ) is factored as [39] Z s p(xk |y1:N , Θ ) = p(xk+1 , xk |y1:N , Θs )dxk+1 Z = p(xk |y1:k , xk+1 , Θs )p(xk+1 |Θs , y1:N )dxk+1 Z p(xk+1 |y1:N , Θs )p(xk+1 |xk , Θs ) s dxk+1 . (18) = p(xk |y1:k , Θ ) × R p(xk+1 |y1:N , Θs )p(xk+1 |xk , Θs )dxk
PT
ED
Clearly, the filtered PDF p(xk |y1:k , Θs ), smoothed PDF p(xk+1 |y1:N , Θs ) and state transition density p(xk+1 |xk , Θs ) are all involved in the calculation of the smoothed PDF p(xk |y1:N , Θs ) [39, 40]. Based on the conclusion in [39], the the smoothed PDF p(xk |y1:N , Θs ) at instant k can be numerically approximated as s
CE
p(xk |y1:N , Θ ) ≈
L X l=1
l Wk|N δ(xk − xlk ),
(19)
AC
l is recursively calculated as [39, 40] and the weight Wk|N
l Wk|N = Wkl
"
L X j=1
j Wk+1|N PL
p(xjk+1 |xlk , Θs )
j i i s i=1 Wk p(xk+1 |xk , Θ )
#
.
(20)
As illustrated in Eq. (18), the joint smoothed PDF p(xk , xk−1 |y1:N , Θs ) can be rewritten as [39] p(xk+1 , xk |y1:N , Θs ) = p(xk |Θs , y1:k ) × R 6
p(xk+1 |Θs , y1:N )p(xk+1 |Θs , xk ) . p(xk+1 |Θs , y1:N )p(xk+1 |Θs , xk )dxk
(21)
ACCEPTED MANUSCRIPT
Similarly, the joint smoothed PDF p(xk+1 , xk |y1:N , Θs ) can be numerically approximated as [29, 39] s
p(xk+1 , xk |y1:N , Θ ) ≈
L X l=1
l Wk,k+1|N δ(xk − xlk )δ(xk+1 − xlk+1 ),
(22)
κl l = PL k Wk,k+1|N
l=1
and
l κlk = Wkl Wk+1|N PL
κlk
,
CR IP T
l is calculated as [39] where the joint weight Wk,k+1|N
p(xlk+1 |xlk , Θs )
i=1
Wki p(xik+1 |xik , Θs )
.
(23)
(24)
+
×
Z
p(λk |Cobs , Θs ) × log p(λk |xlk , uk , r)dλk
M
l Wk|N
λk
l Wk,k−1|N × log p(xlk |xlk−1 , uk−1 , Rw , θf )
k=2 l=1 L X W1l l=1
× log p(xl1 |Θ1 ) + C1 .
(25)
PT
+
k=1 l=1 L N X X
λk
ED
+
k=1 l=1 N X L X
AN US
3.3. Mathematical derivations under EM algorithm scheme 3.3.1. E-step According to Eqs. (19) and (22), the Q-function in Eq. (14) is rewritten as Z N X L X s l Q(Θ|Θ ) = Wk|N × p(λk |Cobs , Θs ) × log p(yk |uk , xlk , λk , Rv , θg )dλk
CE
The first integral term in Eq. (25) is computed as Z p(λk |Cobs , Θs ) × log p(yk |uk , xlk , λk , Rv , θg )dλk
AC
λk
1 1 hλk il (yk − g(xlk , uk , θg ))2 , = − log(2πRv ) + hlog λk il − 2 2 2Rv
(26)
where hλk il and hlog λk il represent the expectations of λk and log λk , which are respectively calculated as Z 1 + rs hλk il = p(λk |Cobs , Θs )λk dλk = , (27) (yk − g(xlk , uk , θgs ))2 /Rvs + rs Z hlog λk il = p(λk |Cobs , Θs ) log λk dλk = − log(((yk − g(xlk , uk , θgs ))2 /Rvs + rs )/2) + ψ((1 + rs )/2). 7
(28)
ACCEPTED MANUSCRIPT
CR IP T
where rs is obtained in the sth iteration and ψ(x) = d log(Γ(x))/dx is the Digamma function [12]. The second integral in Eq. (25) can be computed as Z p(λk |Cobs , Θs ) × log p(λk |xlk , uk , r)dλk λk r r r r r − 1 hlog λk il − hλk il − log Γ . (29) = log + 2 2 2 2 2 Based on Eqs. (1), (26), and (29), the Q-function in Eq. (25) is rewritten as N L 1 XX l hλk il (yk − g(xlk , uk , θg ))2 Q(Θ|Θ ) = W hlog λk il − 2 k=1 l=1 k|N Rv s
N
L
AN US
r XX l N W (hlog λk il − hλk il ) − log(2πRv ) + 2 2 k=1 l=1 k|N N X L X Nr r r l + log − N log Γ( ) − Wk|N hlog λk il 2 2 2 k=1 l=1
k=2 l=1
(xlk − f (xlk−1 , uk−1 , θf ))2 2Rw L
X N −1 log(2πRw ) + W1l × log p(xl1 |Θ1 ) + C1 . 2 l=1
(30)
ED
−
l Wk,k−1|N
M
−
N X L X
PT
3.3.2. M-step To perform the M-step, the derivatives of the Q-function in Eq. (30) with respects to Rw and Rv are taken and set to zero, it is derived that L
(31)
Rv =
(32)
CE AC
N
1 XX l W (xl − f (xlk−1 , uk−1 , θf ))2 , Rw = N − 1 k=2 l=1 k,k−1|N k N L 1 XX l W hλk il (yk − g(xlk , uk , θg ))2 . N k=1 l=1 k|N
Substituting Eqs. (31) and (32) into Eq. (30) yields Q(Θ|Θs ) = O1 (θf ) + O2 (θg ) + O3 (r) + C,
8
(33)
ACCEPTED MANUSCRIPT
where
N
L
CR IP T
PN PL 2 l l l N −1 k=2 l=1 Wk,k−1|N (xk − f (xk−1 , uk−1 , θf )) × log , O1 (θf ) = − 2 N −1 PN PL l l 2 N k=1 l=1 Wk|N hλk il (yk − g(xk , uk , θg )) O2 (θg ) = − log , 2 N N L Nr r r r XX l log − N log Γ( ), O3 (r) = Wk|N (hlog λk il − hλk il ) + 2 k=1 l=1 2 2 2
(34) (35) (36)
2N − 1 1 XX l C=− (log(2π) + 1) − W hlog λk il 2 2 k=1 l=1 k|N l=1
W1l × log p(xl1 |Θ1 ) + C1 .
AN US
+
L X
(37)
Since O3 (r) is only dependent on parameter r, the derivative of O3 (r) is taken with respect to r and equating it to zero yields L N r N 1 XX l r W (hlog λk il − hλk il ) + log − ψ + 1 = 0. 2 k=1 l=1 k|N 2 2 2
(38)
M
Then, the parameter r is updated by solving Eq. (38). The convexity of O3 (r) is proved in [41]. The parameters θf and θg are updated as
ED
θf = arg max O1 (θf ), θf
θg = arg max O2 (θg ).
(39)
PT
θg
CE
Clearly, it is very difficult to obtain the analytical solutions of θf and θg directly. The gradient based search strategy and the Newton method are often applied instead [29, 33, 39]. The complete steps of the developed algorithm are concluded in Algorithm 1. 4. Illustrative example
AC
4.1. Numerical example Consider the next NSSM expressed as xk = axk−1 + buk−1 + wk−1 , yk = c cos(xk ) + vk .
(40)
where wk ∼ N (0, 0.1) and vk ∼ N (0, 0.1). The true values are set as a = 0.9, b = 1, c = 1. 9
(41)
ACCEPTED MANUSCRIPT
Alogrithm 1 Robust identification approach for NSSM
AN US
CR IP T
Input: The available data Cobs = {y1:N , u1:N } Output: The parameters Θ = {θg , θf , Rw , Rv , r} 1: Initialize Θs = Θ1 ; 2: repeat 3: E-step: Compute the smoothed PDFs and the necessary expectations 4: Compute the smoothed PDFs p(xk |Cobs , Θs ) and p(xk , xk−1 |Cobs , Θs ) based on Eqs. (19) and (22), respectively; 5: Construct the expectation terms of rk and log rk via Eqs. (27) and (28), respectively; 6: M-step: Estimate all the parameters in Θ 7: Update the model parameters θf and θg through maximizing O1 (θf ) and O2 (θg ) in Eqs. (34) and (35), estimate the DOF r through solving Eq. (38); 8: Estimate Rw and Rv via Eqs. (31) and (32), respectively. 9: until convergence; ∗ , Rv∗ , r∗ } 10: Finally, the desired parameters are expressed as Θ∗ = {θg∗ , θf∗ , Rw
ED
M
A simulation is firstly executed to evaluate the proposed Algorithm 1. The Gaussian noise is selected as the excitation signal. N = 500 outputs are generated and 10% outliers are utilized to deteriorate the output data quality. The outliers are uniformly generated in range [-5,5] and the training data set is presented in Fig. 1. In the proposed Algorithm 1, the parameter θf is estimated by maximizing the cost function O1 (θf ). According to Eq. (40), O1 (θf ) can be rewritten as O1 (θf ) = −
PT
and
CE
J1 (a, b) =
L N X X k=2 l=1
J1 (a, b) N −1 log , 2 N −1
l Wk,k−1|N (xlk − axlk−1 − buk−1 )2 .
(42)
(43)
AC
Based on the monotonicity of log function, the parameters a and b are estimated through taking the derivatives of J1 (a, b) with respects to a and b and setting them to zero, PN PL l l l k=2 l=1 Wk,k−1|N (xk − bold uk−1 )xk−1 anew = , (44) PN PL l l 2 k=2 l=1 Wk,k−1|N (xk−1 ) PN PL l l l k=2 l=1 Wk,k−1|N (xk − aold xk−1 )uk−1 bnew = . (45) PN PL l 2 k=2 l=1 Wk,k−1|N uk−1 Similarly, based on Eq. (40), O2 (θg ) can be rewritten as O2 (θg ) = −
N J2 (c) log , 2 N
10
(46)
ACCEPTED MANUSCRIPT
Output data with 10% outliers
y
5 0 −5 0
100
200
300
400
2 0 −2 −4 0
100
200 300 Samples
400
CR IP T
4
u
500
Input data
500
AN US
Figure 1: The training data set of the numerical example
and J2 (c) =
L N X X k=1 l=1
l Wk|N hλk il (yk − c cos(xlk ))2 .
The parameter c is updated as
PN PL
l=1
l hλk il (yk cos(xlk )) Wk|N
M
k=1
cnew = PN PL
l l 2 l=1 Wk|N hλk il (cos(xk ))
k=1
(47)
.
(48)
PT
1.3 1.2
1
0.2
20
Rv
0.7
60
80
100
120
20
40
60
80
100
120
1.2
r
0.5 0.4
140
160
180
200
The scale parameter R v
0.05
0.6
20
40
0.1
0.8
0.3
The variance R w
0.15 0.1
CE
0.9
AC
Parameter trajectories
1.1
a b c
Rw
ED
Here, hλk il is treated as a related weight that assigned to each data point yk . As seen in
140
160
180
200
The degree of freedom r
1 0.8
40
60
80
100 120 Iterations
140
160
180
20
200
(a) The identified parameters a, b and c
40
60
80
100
120
140
160
180
200
(b) The identified Rw , Rv and r
Figure 2: The identification results with outputs corrupted by 10% outliers
Eq. (27), when yk is an outlier, the weight hλk il is a very small value that close to zero. Therefore, the influence of the outliers acted on parameter estimation is suppressed [41]. 11
ACCEPTED MANUSCRIPT
After running the robust Algorithm 1, the identification results are obtained and presented in Figs. 2a and 2b. It is clearly seen that all the estimated parameters can converge to the real values after about 100 iterations. Gopaluni [39] proposes a particle filter based approach to identify the NSSM with missing outputs. Then comparisons with Gopaluni’s method are conducted to verify the superiority of the proposed method in this paper.
CR IP T
Output data with no outliers
y
1 0 −1 0
50
100
150
200
250
300
350
Input data
4
AN US
u
2 0 −2 −4 0
400
50
100
150
200 250 Samples
300
350
400
ED
1
0.8 0.7
PT
Parameter trajectories
0.9
0.6 0.5
CE
0.4
Our approach 1.1 1 Parameter trajectories
Gopaluni’s approach 1.1
M
Figure 3: The data of the first comparison test with output data corrputed by no outlier
a b c
100
200 300 Iterations
400
a b c
0.7 0.6
0.4 0
500
(a) Estimation results of the method in [39]
AC
0.8
0.5
0.3
0.2 0
0.9
100
200 300 Iterations
400
500
(b) Estimation results of the developed method
Figure 4: The comparison results when 0% outlier is added to output data
In the first comparison test, the input is selected as the Gaussian noise and N = 400 output data are collected. 0% outlier is added to the outputs and the resulting data set is shown in Fig. 3. The comparison results of the method in [39] and the method proposed in this paper are presented in Figs. 4a and 4b, respectively. As shown in these two figures, when the outputs are pure and ideal, all the identified parameters by using both methods have a good/accurate convergence property. 12
ACCEPTED MANUSCRIPT
Output data with 5% outliers
y
5 0 −5 0
50
100
150
200
250
300
350
4
CR IP T
u
2 0 −2 −4 0
400
Input data
50
100
150
200 250 Samples
300
350
400
AN US
Figure 5: The identification data of the second comparison test with output data corrputed by 5% outliers
ED
M
In the second comparison test, 5% outliers, which are uniformly generated in range [6,6], are added to the output data and the identification data set is shown in Fig. 5. After applying the method in [39] and the proposed method in this paper, the identification results are provided in Figs. 6a and 6b, respectively. In this case, the method in [39] is not robust to the outliers and the estimated parameters diverge greatly from the corresponding true values; However, the estimated parameters by using the method proposed in this paper can still converge to the true values. It can be concluded from the above results that the proposed method outperforms the method in [39] when the output observations of the system are corrupted by outliers. Our approach
Gopaluni’s approach
1 0.8
Parameter trajectories
PT
1
CE
Parameter trajectories
1.2
1.1
a b c
0.6 0.4
AC
0.2
0 0
100
200 300 Iterations
0.9 0.8
a b c
0.7 0.6 0.5
400
0.4 0
500
100
200 300 Iterations
400
500
(a) Identification results of the method in [39] (b) Identification results of the developed method Figure 6: The comparison results when the outputs corrupted by 5% outliers
To further verify the usefulness of the proposed Algorithm 1, another comparison with the state-of-the-art robust approach is conducted. In [33], Huang et al. found a robust identification method for NSSM. N = 1000 output data points are collected and the Student’s 13
0.5 100
150
200
Parameter b
5
0
50
100
150
200
0.5 100 150 Iterations
200
250
50
100
150
200
250
50
100
150
200
250
100 150 Iterations
200
250
1
0.5
250
1
50
0.5
250
Parameter c
Parameter c
Parameter b
50
Our approach
1
1
0.5 50
(a) The method in [33]
CR IP T
Huang’s approach
1
Parameter a
Parameter a
ACCEPTED MANUSCRIPT
(b) The method proposed in this paper
AN US
Figure 7: The parameter trajectories with both method
M
t-distributed noise vk ∼ St(vk |0, 0.01, 1) is set as the measurement noise. 250 iterations are used in both methods and the parameter trajectories are compared in Fig. 7. Moreover, the Monte Carlo simulations with 100 different noise sequences are performed and the mean value and standard derivation (S.D.) of every parameter are listed in Table 1. ˆ k2 is also computed to assess the identified parameters The bias norm (BN) k θtrue − E(θ) of the Monte Carlo simulations.
TRUE 0.9 1 1
Our approach The method in [33] MEAN S.D. MEAN S.D. 0.8998 0.0017 0.8982 0.0043 1.0215 0.0088 1.0661 0.0167 1.0018 0.0062 0.9959 0.0091 0.0216 0.0662
CE
PT
a b c BN
ED
Table 1: The identified parameters with both methods under contaminated outputs
AC
As compared in Fig. 7, the estimated parameters of the proposed approach can converge to the true values with higher accuracy and less iterations. In Table 1, the calculated BN of the estimations with the proposed approach is smaller than the method in [33], which indicates that the model parameters are estimated more accurately with the proposed approach. The efficacy of the the proposed approach is confirmed in this comparison. 4.2. A two-link manipulator system The two-link rigid robotic manipulator is a common mechanical system in real-world industry and the lateral view of which is briefly seen in Fig. 8 [42, 43, 44]. F (t) is a varying 14
ACCEPTED MANUSCRIPT
F (t ) D3
m, l
m, l
X
D1
X
CR IP T
D2
Y
AN US
Figure 8: The lateral view of the manipulator system
vertical downward force and α3 ≡ 90◦ [42]. The explanations of rest physical variables can be found in [42, 44]. Under small angular vibrations, the dynamic behavior of this mechanical system is expressed as [42]
Where
ED
b1 0 E11 E12 K11 K12 M= ,E = , K(t) = , 0 b3 E21 E22 K21 K22
(50)
PT
and
(49)
M
M Φ¨ + E Φ˙ + K(t)Φ = 0,
AC
CE
E11 K11 K12 K22
= d1 + d2 , E12 = E21 = −d2 , E22 = d2 , = k1 + k2 − b4 sin(α10 ) − F (t)l cos(α10 − α3 ), = K21 = −k2 , = k2 − b5 sin(α20 ) − F (t)l cos(α20 − α3 ),
Φ = [α11 α21 ]T , b1 = 4ml2 /3, b2 = ml2 /2, b3 = ml2 /3, b4 = 3ml2 g/2, b5 = ml2 g/2.
(51)
Suppose the hidden state is selected as x = [Φ˙ T ΦT ]T , then Eq. (49) can be transformed into dx = A(t)x, dt
15
(52)
ACCEPTED MANUSCRIPT
Angle α11/rad
0.6
Output1 with 10% outliers
0.4 0.2 0 100
200
300
Angle α21/rad
0.6
400
Output2 with 10% outliers
0.4 0.2 0 −0.2 0
500
100
200
Time
300
400
CR IP T
−0.2 0
500
AN US
Figure 9: The outputs of the manipulator system with 10% outliers
where
(53)
M
E11 E12 K11 K12 − − − − b1 b1 b1 b1 E E K K − 21 − 22 − 21 − 22 A(t) = . b3 b3 b3 b3 1 0 0 0 0 1 0 0
ED
Estimations of θ1 and θ1 under 10% outliers
0.9
θ1
AC
CE
PT
Trajectories of θ1 and θ2
1
θ2
0.8 0.7 0.6 0.5
0
50
100
Iterations
150
200
Figure 10: The estimated parameters θ1 and θ2 with outputs corrupted by 10% outliers
Based on finite difference discretization method, the above continuous time differential
16
ACCEPTED MANUSCRIPT
equations can be transformed into NSSM in Eqs. (1)-(2), where E11 E12 K11 K12 − − − − b1 b1 b1 b1 E E K K − 21 − 22 − 21 − 22 f (xk−1 , uk−1 , θf ) = xk−1 + ∆t x , b3 b3 b3 k−1 b3 θ1 0 0 0 0 θ2 0 0
CR IP T
(54)
0 0 1 0 g(xk , uk , θg ) = x . 0 0 0 1 k
(55)
M
AN US
where ∆t is the sampling period. Both θ1 and θ2 need to be estimated and the true values are θ1 = θ2 = 1. The values of other parameters can be found in [42]. In this test, the periodic signal F (t) = 10 sin(0.02πt) N is utilized to excite the system and N = 500 outputs are collected. 10% outliers are generated to deteriorate the output data quality and the output data are provided in Fig. 9. After running the developed Algorithm 1, the convergence curves of parameters θ1 and θ2 are obtained and presented in Fig. 10 and the estimation errors versus iterations are also provided in Figs. 11a and 11b, respectively. After about 50 iterations, both estimation errors of θ1 and θ2 are very small which shows that the identified parameters converge to the corresponding true values.
0.6
ED
0.6
PT
0.3
0.5
Estimation error of θ2
0.4
0.2 0.1
CE
Estimation error of θ1
0.5
0
−0.1 0
AC
50
0.4 0.3 0.2 0.1 0
100
Iterations
150
(a) Estimation error of θ1
200
−0.1 0
50
100
Iterations
150
200
(b) Estimation error of θ2
Figure 11: The parameter estimation errors under 10% outliers
To comprehensively prove the effectiveness of the developed Algorithm 1, three additional tests with outlier percentages 5%, 20% and 40% are performed and all the results are given in Figs. 12a and 12b. After 200 iterations, the estimated values of θ1 and θ2 are all ˆ |θ − θ| given in Table 2 and the estimation error rate r = × 100% is calculated as well. θ θ 17
ACCEPTED MANUSCRIPT
Estimations of θ2 under different outliers ratios
1
1
0.9
0.9
p=5% p=10% p=20% p=40%
0.7 0.6 0.5
0.8 0.7 0.6 0.5
0
50
100 Iterations
150
200
0
(a) The estimated θ1
p=5% p=10% p=20% p=40%
CR IP T
0.8
Trajectories of θ2
Trajectories of θ1
Estimations of θ1 under different outliers ratios
50
100 Iterations
150
200
(b) The estimated θ2
AN US
Figure 12: The estimated θ1 and θ2 with different outlier percentages
M
denotes the true parameter value and θˆ is the estimated value. It can be seen from these identification results that all estimated parameters have an accurate convergence property when the outputs are corrupted by different percentages of outliers. As shown in Figs. 12a and 12b, the estimated parameters converge faster with the decrease of the outlier percentages. It can be concluded that the proposed method is robust to the outliers and is capable of providing satisfactory parameter estimates.
ED
5. Conclusions
CE
PT
This paper investigates a robust method for NSSM identification with output data corrupted by outliers. In order to overcome the difficulties of outlier imposed on nonlinear processes modeling, the robust observation model based on the Student’s t-distribution is established. The uncertain model parameters are estimated with EM algorithm and a particle smoother is adopted to calculate the target Q-function of the EM algorithm. In the proposed method, a small weight that is inversely proportional to the squared output es-
AC
Table 2: The estimated parameters after 200 iterations under different outlier percentages
Parameter values θ1 Error rate θ2 Error rate True 1 1 5% outliers 1.0014 0.14% 1.0032 0.32% 10% outliers 0.9953 0.47% 0.9967 0.33% 20% outliers 1.0050 0.50% 0.9936 0.64% 40% outliers 1.0074 0.74% 0.9925 0.75%
18
ACCEPTED MANUSCRIPT
timation error is placed on the outlier so that the adverse effects of outlier to parameter estimation is eliminated. However, the main disadvantage of the developed algorithm is that the performance of which is directly impacted by the initial model parameters. In the future, we will continue to study the identification of NSSM with undetermined time-delay and the identification of nonlinear systems with quantized output. References
AC
CE
PT
ED
M
AN US
CR IP T
[1] X. Yang, X. Yang, Local identification of lpv dual-rate system with random measurement delays, IEEE Transactions on Industrial Electronics 65 (2) (2018) 1499–1507. [2] N. Hou, H. Dong, Z. Wang, W. Ren, F. E. Alsaadi, Non-fragile state estimation for discrete markovian jumping neural networks, Neurocomputing 179 (2016) 238–245. [3] Y. Yu, H. Dong, Z. Wang, W. Ren, F. E. Alsaadi, Design of non-fragile state estimators for discrete time-delayed neural networks with parameter uncertainties, Neurocomputing 182 (2016) 18–24. [4] X. Yang, X. Liu, B. Han, Lpv model identification with an unknown scheduling variable in presence of missing observations-a robust global approach, IET Control Theory & Applications 12 (10) (2018) 1465–1473. [5] X. Yang, H. Gao, Multiple model approach to linear parameter varying time-delay system identification with em algorithm, Journal of The Franklin Institute 351 (12) (2014) 5565–5581. [6] H. Dong, Z. Wang, S. X. Ding, H. Gao, Finite-horizon estimation of randomly occurring faults for a class of nonlinear time-varying systems, Automatica 50 (12) (2014) 31823189. [7] H. Dong, Z. Wang, S. X. Ding, H. Gao, Finite-horizon reliable control with randomly occurring uncertainties and nonlinearities subject to output quantization, Automatica 52 (2015) 355362. [8] R. D. Nowak, Nonlinear system identification, Circuits Systems and Signal Processing 21 (1) (2002) 109–122. [9] A. A. Adeniran, S. E. Ferik, Modeling and identification of nonlinear systems: a review of the multimodel approach-part i, IEEE Transactions on Systems, Man, and Cybernetics: Systems 47 (7) (2017) 1149–1159. [10] N. Zeng and Z. Wang and Y. Li and M. Du and X. Liu, A hybrid EKF and switching PSO algorithm for joint state and parameter estimation of lateral flow immunoassay models, IEEE/ACM Transactions on Computational Biology and Bioinformatics 9 (2) (2012) 321–329. [11] J. Deng, B. Huang, Identification of nonlinear parameter varying systems with missing output data, AIChE Journal 58 (11) (2012) 1081–1092. [12] X. Yang, S. Yin, Robust global identification and output estimation for lpv dual-rate systems subjected to random output time-delays, IEEE Transactions on Industrial Informatics 13 (6) (2017) 2876–2885. [13] S. Ozer, H. Zorlu, S. Mete, System identification application using hammerstein model, Indian Academy of Science 41 (7) (2016) 597–605. [14] A. Rahrooh, S. Shepard, Identification of nonlinear systems using narmax model, Nonlinear Analysis 71 (12) (2009) 1198–1202. [15] E. W. Bai, Frequency domain identification of hammerstein models, Automatica 39 (9) (2003) 1521– 1530. [16] A. Marconato, J. Sjoberg, J. A. K. Suykens, J. Schoukens, Improved initialization for nonlinear statespace modeling, IEEE Transactions on Instrumentation and Measurement 63 (4) (2014) 972–980. [17] I. M. Yassin, M. N. Taib, R. Adnan, Recent advancements & methodologies in system identification: A review, Scientific Research Journal 1 (1) (2013) 14–33. [18] B. Gunes, J. W. V. Wingerden, M. Verhaegen, Predictor-based tensor regression (pbtr) for lpv subspace identification, Automatica 79 (2017) 235–243. [19] V. Cerone, D. Piga, D. Regruto, Set-membership lpv model identification of vehicle lateral dynamics, Automatica 47 (8) (2011) 1794–1799. [20] T. Wigren, Recursive prediction error identification and scaling of non-linear state space models using a restricted black box parameterization, Automatica 42 (1) (2006) 159–168.
19
ACCEPTED MANUSCRIPT
AC
CE
PT
ED
M
AN US
CR IP T
[21] R. D. Veaux, D. C. Psichogios, L. H. Ungar, A comparison of two nonparametric estimation schemes: Mars and neural networks, Computers & Chemical Engineering 17 (8) (1993) 819837. [22] S. A. Billings, H. L. Wei, A new class of wavelet networks for nonlinear system identification, IEEE Transactions on Neural Networks 16 (4) (2005) 862874. [23] A. Gretton, A. Doucet, R. Herbrich, P. J. W. Rayner, B. Scholkopf, Support vector regression for black-box system identification, in: Proceedings of 11th IEEE Workshop Statistical Signal Processing, 2001, pp. 341344. [24] X. Jin, B. Huang, Robust identification of piecewise/switching autoregressive exogenous process, AIChE Journal 56 (7) (2010) 1829–1844. [25] F. Guo, K. Hariprasad, B. Huang, Y. Ding, Robust identification for nonlinear errors-in-variables systems using the em algorithm, Journal of Process Control 54 (1) (2017) 129–137. [26] Y. Lu, B. Huang, Robust multiple-model lpv approach to nonlinear process identification using mixture t distributions, Journal of Process Control 24 (9) (2014) 1472–1488. [27] N. Zeng and Z. Wang and H. Zhang, Inferring nonlinear lateral flow immunoassay state-space models via an unscented Kalman filter, Science China Information Sciences 59 (11) (2016) 1–10. [28] M. Gasperin, D. Juricic, Application of unscented transformation in nonlinear system identification, in: Proceedings of 18th IFAC World Congress, 2011, pp. 44284433. [29] T. B. Schon, A. Wills, B. Ninness, System identification of nonlinear state-space models, Automatica 47 (1) (2010) 39–49. [30] Y. Huang, Y. Zhang, N. Li, Z. Wu, J. Chambers, A novel robust students t-based kalman filter, IEEE Transactions on Aerospace and Electronic Systems 53 (3) (2017) 1545–1554. [31] G. Agamennoni, J. I. Nieto, E. M. Nebot, Approximate inference in state-space models with heavytailed noise, IEEE Transactions on Signal Processing 60 (10) (2012) 5024–5037. [32] H. Nurminen, T. Ardeshiri, R. Piche, F. Gustafsson, Robust inference for state-space models with skewed measurement noise, IEEE Transactions on Signal Processing 22 (11) (2015) 1898–1902. [33] Y. Huang, Y. Zhang, N. Li, S. M. Naqvi, J. Chambers, A robust and efficient system identification method for a state-space model with heavy-tailed process and measurement noises, in: Proceedings of International Conference on Information Fusion, 2016, pp. 441448. [34] X. Jin, B. Huang, D. S. Shook, Multiple model lpv approach to nonlinear process identification with em algorithm, Journal of Process Control 21 (1) (2011) 182–193. [35] L. Chen, B. Huang, F. Liu, Multi-model approach to nonlinear system identification with unknown time delay, IFAC Proceedings Volumes 47 (3) (2014) 9388–9393. [36] J. Wu, On the convergence properties of the em algorithm, Annals of Statistics 11 (1) (1983) 95–103. [37] L. Chen, A. Tulsyan, B. Huang, F. Liu, Multiple model approach to nonlinear system identification with an uncertain scheduling variable using em algorithm, Journal of Process Control 23 (10) (2013) 1480–1496. [38] N. Zeng and Z. Wang and Y. Li and M. Du and J. Cao and X. Liu, Time series modeling of nanogold immunochromatographic assay via expectation maximization algorithm, IEEE Transactions on Biomedical Engineering 60 (12) (2013) 3418–3424. [39] R. B. Gopaluni, A particle filter approach to identification of nonlinear processes under missing observations, The Canadian Journal of Chemical Engineering 86 (6) (2008) 1081–1092. [40] M. S. Arulampalam, S. Maskell, N. Gordon, A tutorial on particle filters for online nonlinear/nongaussian bayesian tracking, IEEE Transactions on Signal Processing 50 (2) (2002) 174–188. [41] C. Liu and D B. Rubin, ML estimation of the t distribution using EM and its extensions, ECM and ECME, Statistica Sinica 5 (1) (1995) 19–39. [42] K. Liu, Identification of linear time-varying systems, Journal of Sound and Vibration 206 (4) (1997) 487–505. [43] M. Tong, Y. Pan, Z. Li, W. Lin, Valid data based normalized cross-correlation (vdncc) for topography identification, Neurocomputing 308 (1) (2018) 184–193. [44] M. B. A. Jabali, M. H. Kazemi, Uncertain polytopic lpv modelling of robot manipulators and trajectory tracking, International Journal of Control, Automation and Systems 15 (2) (2017) 883–891.
20
M
AN US
CR IP T
ACCEPTED MANUSCRIPT
AC
CE
PT
ED
Xin Liu received the B.E. degree in measurement-control technology and instrumentation and M.E. degree in control engineering from Harbin Engineering University, Harbin, China, in 2013 and 2015, respectively. He is currently a Ph.D student in the Institute of Intelligent Control and Systems, Harbin Institute of Technology, Harbin, China. His research interests include nonlinear systems identification, data-driven process modeling and soft sensor development.
21
M
AN US
CR IP T
ACCEPTED MANUSCRIPT
AC
CE
PT
ED
Xianqiang Yang (M’16) received the B.E. degree in automation and M.E. degree in control theory and control engineering from Shandong University of Science and Technology, Qingdao, China, in 2008 and 2011, respectively, and Ph.D degree in control science and engineering from Harbin Institute of Technology, Harbin, China, in 2015. Currently, he is an Associate professor in the School of Astronautics, Harbin Institute of Technology. His research interests include identification, soft sensor development, and image processing.
22