Author's Accepted Manuscript

A New Sampling Method in Particle Filter Based on Pearson Correlation Coefficient

Haomiao Zhou, Zhihong Deng, Yuanqing Xia, Mengyin Fu

www.elsevier.com/locate/neucom

PII: S0925-2312(16)30792-5
DOI: http://dx.doi.org/10.1016/j.neucom.2016.07.036
Reference: NEUCOM17393

To appear in: Neurocomputing
Received date: 11 December 2015
Revised date: 18 May 2016
Accepted date: 23 July 2016

Cite this article as: Haomiao Zhou, Zhihong Deng, Yuanqing Xia and Mengyin Fu, A new sampling method in particle filter based on Pearson correlation coefficient, Neurocomputing, http://dx.doi.org/10.1016/j.neucom.2016.07.036

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting galley proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
A New Sampling Method in Particle Filter Based on Pearson Correlation Coefficient

Haomiao Zhou, Zhihong Deng, Yuanqing Xia, Mengyin Fu
Abstract: Particle filters have proven very effective for nonlinear/non-Gaussian systems. However, their great disadvantages are particle degeneracy and sample impoverishment. An improved particle filter based on the Pearson correlation coefficient (PCC) is proposed to reduce these disadvantages. The PCC is adopted to determine whether the particles are close to the true state. By reselecting the particles in the prediction step, the new PF performs better than the generic PF. Finally, simulations are carried out to illustrate the effectiveness of the proposed filter.
Keywords: Particle filter, Pearson correlation coefficient, Importance density

I. INTRODUCTION

Nonlinear filtering is a very active topic in signal processing and control theory. Although the equations of the optimal nonlinear filter were developed in the mid-1960s, the integrals involved remain intractable. Hence, many suboptimal nonlinear filters have been introduced in the literature. The simplest, though biased, way to approximate a nonlinear state-space model is the extended Kalman filter (EKF), which replaces the state transition and measurement equations by Taylor series expansions [1]. The unscented Kalman filter (UKF) uses several sigma points to recursively calculate the means and covariances in the Kalman filter [2]. Both filters use a Gaussian distribution to approximate the true posterior distribution. More recently, particle filtering (PF), also known as sequential Monte Carlo, proposed by Gordon et al. for online filtering and prediction of nonlinear and non-Gaussian state-space models, has

H. Zhou, Z. Deng, Y. Xia, and M. Fu are with the School of Automation, Beijing Institute of Technology, Beijing 100081, China. Email: [email protected]; dzh[email protected]; xia[email protected]; [email protected].

July 25, 2016    DRAFT
received growing attention [3]. PF is a numerical method that approximates the posterior probability density function (PDF) of the state variables using empirical distributions formed by evolving weighted particles. Since it requires neither simplification of the nonlinearity nor assumption of specific distributions, it is currently one of the most successful methods for approximating nonlinear and non-Gaussian state estimation problems, and it has been widely used in fields such as signal processing [4][5], target tracking [6]-[9], mobile robot localization [10]-[11], image processing [12], and economics and engineering [13].

PF consists of three important operations: generation of new particles, computation of particle weights, and resampling [14]. Resampling, which is needed to deal with the degeneracy problem, includes multinomial resampling [15], systematic resampling [16], residual resampling [17], stratified resampling [18], and so on, and is a major stumbling block in the development of sequential MC methods. Unfortunately, resampling causes another well-known problem, sample impoverishment, which results in a loss of particle diversity. To compensate, large numbers of particles are required in realistic problems, which increases the computational complexity.

The Pearson correlation coefficient (PCC) is a statistical metric that measures the strength and direction of the linear relationship between two random variables [19]. It has been applied to various problems in statistics, such as data classification [20], data analysis [21], clustering and decision making [22], finance [23], biological research [24], etc. In this paper, motivated by the merits of the PCC, we propose a new PCC-based particle filter (PPF) algorithm to alleviate the sample impoverishment problem and obtain better estimation accuracy. Simulation results compared with the standard PF are given in this paper.

The paper is structured as follows. In Section II we present background information on the PF and the Pearson product-moment correlation. In Section III the improved version of the PF is derived. Numerical simulations are carried out and results are analyzed in Section IV. Finally, conclusions are drawn in Section V.

II. PRELIMINARIES

In this section we give an overview of the basic definitions and properties concerning the asymptotic behavior of the particle filter and the Pearson correlation coefficient.
A. Background on particle filter

Consider the filtering problem for the following nonlinear discrete-time system in state-space form:

x_{k+1} = f_k(x_k, ν_k),   (1)
y_k = h_k(x_k, ω_k),   (2)

where x_k ∈ R^{n_x} denotes the state at time k, y_k ∈ R^{n_y} denotes the corresponding measurement, and ν_k and ω_k represent the process and measurement noise, mutually independent and independent of x_0, with known PDFs P_{ν_k} and P_{ω_k}, respectively. The PDF p(x_0) of the initial state x_0 is known. From (1) and (2) we derive the transition probability kernel K(x_k|x_{k−1}) and the likelihood function g(y_k|x_k), defined by

K(x_k|x_{k−1}) = P_{ν_k}(x_k − f_{k−1}(x_{k−1})),   (3)
g(y_k|x_k) = P_{ω_k}(y_k − h_k(x_k)),   (4)

which are the conditional PDF of x_k given the previous state x_{k−1} and the conditional PDF of y_k given the current state x_k.

The particle filter is a special version of the Bayes filter based on sequential Monte Carlo sampling. The objective of PF is to estimate and represent the posterior probability density P(x_k|y_{1:k}). This density usually cannot be obtained in closed form and is difficult to sample from. The PF therefore represents the posterior PDF P(x_k|y_{1:k}) by N random samples (particles) {x_{0:k}^i}_{i=1}^N with associated weights {ω_k^i}_{i=1}^N. The posterior state PDF at time k is approximated as

P(x_k|y_{1:k}) ≈ Σ_{i=1}^N ω_k^i δ(x_k − x_k^i),   (5)
where δ denotes the Dirac delta at zero. Let φ be a bounded test function, π_{t|t−1} the measure corresponding to the probability density p(x_t|y_{1:t−1}), and π_{t|t} the measure corresponding to the density p(x_t|y_{1:t}). We use the notation

(ν, φ) = ∫ φ(x) ν(dx),   (6)
Kφ(x) = ∫ K(dz|x) φ(z),   (7)
TABLE I: Generic Particle Filter

1) Initialize the particles: x_0^i ∼ π_0(dx_0), i = 1, 2, ..., N.
2) Predict the particles by drawing independent samples: x_k^i ∼ P(x_k|x_{k−1}^i), i = 1, ..., N.
3) Evaluate the importance weights ω_k^i = P(y_k|x_k^i), i = 1, ..., N,
   and normalize: ω_k^i = ω_k^i / Σ_{j=1}^N ω_k^j.
4) Calculate N̂_eff = (Σ_{i=1}^N (ω_k^i)^2)^{−1}.
   If N̂_eff < N_thr:
      [{x_k^i, ω_k^i}_{i=1}^N] = RESAMPLE[{x_k^i, ω_k^i}_{i=1}^N].
   x̂_k = Σ_{i=1}^N ω_k^i x_k^i.
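The steps of TABLE I can be sketched in a few lines of code. The following is a minimal, self-contained Python sketch (the experiments in Section IV use MATLAB, so this is an illustration, not the authors' code); `f_sample` and `likelihood` are placeholders for the model-specific transition sampler P(x_k|x_{k−1}) and likelihood P(y_k|x_k), and simple multinomial resampling stands in for the RESAMPLE step.

```python
import math
import random

def generic_pf(y, x0_particles, f_sample, likelihood, n_thr):
    """Generic (bootstrap) particle filter of TABLE I.

    f_sample(x)     -- draws x_k ~ P(x_k | x_{k-1} = x)
    likelihood(y,x) -- evaluates P(y_k | x_k = x)
    """
    particles = list(x0_particles)
    n = len(particles)
    estimates = []
    for yk in y:
        # 2) predict: propagate each particle through the transition density
        particles = [f_sample(x) for x in particles]
        # 3) weight by the likelihood of the current measurement, then normalize
        w = [likelihood(yk, x) for x in particles]
        s = sum(w) or 1.0  # guard against total weight underflow
        w = [wi / s for wi in w]
        # 4) resample when the effective sample size drops below the threshold
        n_eff = 1.0 / sum(wi * wi for wi in w)
        if n_eff < n_thr:
            particles = random.choices(particles, weights=w, k=n)
            w = [1.0 / n] * n
        estimates.append(sum(wi * xi for wi, xi in zip(w, particles)))
    return estimates
```

For a scalar model, `f_sample` might add process noise to a deterministic map and `likelihood` evaluate a Gaussian density of the innovation; the filter then returns the weighted-mean estimate x̂_k at each step.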
where ν is a measure, φ is a function, and K is a Markov transition kernel. The Bayesian filtering equations can be written as

(π_{t|t−1}, φ) = (π_{t−1|t−1}, Kφ),   (8)
(π_{t|t}, φ) = (π_{t|t−1}, gφ) / (π_{t|t−1}, g).   (9)

The generic particle filter as introduced by [13] is described in TABLE I. It has been proven that the variance of the weights can only increase over time. This means that, in general, after a few iterations all but one weight will be (close to) zero, which is called the degeneracy problem. When that occurs, most of the computational power is wasted on updating negligible weights, and the accuracy of the algorithm deteriorates strongly, since the posterior PDF is approximated by only a small set of significant particles. To prevent this, the particles are regularly resampled, i.e., particles with small weights are eliminated and new particles are created at or around those with large weights. To decide when to resample, the effective sample size N̂_eff is compared to some predefined threshold N_thr. Many efficient resampling schemes have been proposed, such as systematic resampling (its operation is described in TABLE II), multinomial resampling, and residual resampling.

B. Pearson correlation coefficient

The Pearson correlation coefficient is a measure of the linear dependence between two random variables (real-valued vectors). Historically it is the first formal measure of correlation, and it is still one of the most widely used measures of relationship.
TABLE II: Systematic Resampling

1) Construct the cumulative distribution function (CDF): initialize c_0 = 0 and set c_i = c_{i−1} + ω_k^i, i = 1, 2, ..., N.
2) Start at the bottom of the CDF: i = 1. Draw a starting point u_1 ∼ U[0, N^{−1}].
3) For each j = 1, 2, ..., N, move along the CDF: u_j = u_1 + N^{−1}(j − 1).
   While u_j > c_i: i = i + 1.
   Assign particle: x_k^{j∗} = x_k^i.
   Assign weight: ω_k^j = N^{−1}.
   Assign parent: i^j = i.
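The procedure of TABLE II can be sketched directly. This is a Python illustration (not the authors' MATLAB code); `particles` and `weights` are assumed to be plain lists with the weights already normalized.

```python
import random

def systematic_resample(particles, weights):
    """Systematic resampling (TABLE II): one uniform draw, then N evenly
    spaced positions along the CDF of the normalized weights."""
    n = len(particles)
    # 1) cumulative distribution of the weights
    cdf = []
    c = 0.0
    for w in weights:
        c += w
        cdf.append(c)
    # 2) single starting point in [0, 1/N)
    u = random.uniform(0.0, 1.0 / n)
    out = []
    i = 0
    for j in range(n):
        uj = u + j / n                 # 3) evenly spaced positions u_j
        while uj > cdf[i]:             # walk forward along the CDF
            i += 1
        out.append(particles[i])
    # every resampled particle carries equal weight 1/N
    return out, [1.0 / n] * n
```

Because only one random number is drawn, systematic resampling is cheaper and has lower variance than drawing N independent uniforms as in multinomial resampling.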
The Pearson correlation coefficient of two variables X and Y is formally defined as the covariance of the two variables divided by the product of their standard deviations (which acts as a normalization factor), and it can be equivalently computed as [19]:

r_{xy} = Σ_{i=1}^n (x_i − x̄)(y_i − ȳ) / ( √(Σ_{i=1}^n (x_i − x̄)²) √(Σ_{i=1}^n (y_i − ȳ)²) ),   (10)

where x̄ = (1/n) Σ_{i=1}^n x_i denotes the mean of x and ȳ = (1/n) Σ_{i=1}^n y_i denotes the mean of y.

The coefficient r_{xy} ranges from −1 to 1 and is invariant to linear transformations of either variable. The PCC indicates the strength of the linear relationship between the two random variables x and y. The sign of the correlation coefficient is positive if the variables are directly related and negative if they are inversely related. If r_{xy} = 0, then x and y are said to be uncorrelated. The closer |r_{xy}| is to 1, the closer the relationship is to an exactly linear one, because the measure reflects the tendency of each pair of corresponding entries in the two profiles to change together. In other words, the Pearson correlation coefficient measures the similarity of the changes in two profiles, that is, the strength of the linear relationship between them. If the observation path of a particle is close to the observation path of the true state, its state path will also be close to the path of the true state when the observation noise is small. The PCC can therefore be used to choose particles that are close to the true state.
III. PEARSON PARTICLE FILTER

A. Survey of importance densities

Clearly, the degeneracy problem is an undesirable effect in particle filters, and there has been a sustained effort over the last decade to find suitable remedies. The two main ways to improve the PF are the choice of a good importance density and the use of resampling. Many efficient resampling schemes have been proposed. Note that the resampling step need not be executed at every time instant, because it introduces further simulation error into the estimates. A good choice of importance density (ID) can therefore increase the quality of the state estimate significantly, as it can minimize Var(ω_k^i) so that N_eff is maximized. Much work has been done on choosing the ID, but no general rule seems to exist. In summary, ID design techniques include direct approaches and composite approaches.

(A) Direct approaches

1) Fixed ID: The fixed ID [25] takes the form

q(x_k^i|x_{k−1}^i, y_k) = q(x_k^i).   (11)

The results are very poor because the fixed ID is independent of both the previous samples and the current measurement.

2) Optimal ID: The optimal ID [26] is of the form

q(x_k^i|x_{k−1}^i, y_k) = (1/N) Σ_{i=1}^N P(x_k^i|x_{k−1}^i, y_k).   (12)

Its major drawback is that it can only be used when the model of p(x_k^i|x_{k−1}^i, y_k) is Gaussian, which occurs when the state dynamics are nonlinear but the measurements h_k(·) are linear.

3) Prior ID: The prior ID [3] is of the form

q(x_k^i|x_{k−1}^i, y_k) = (1/N) Σ_{i=1}^N P(x_k^i|x_{k−1}^i).   (13)

The current measurement y_k is ignored during sampling. This disregard of the measurement is critical when the samples are generated near a tail of the measurement PDF, and it may lead to sample impoverishment. The prior ID is used in the bootstrap filter.

4) Auxiliary ID: This approach uses an auxiliary variable [27] to select the particles and is used in the auxiliary PF:

q(x_k^i|x_{k−1}^i, y_k) = Σ_{i=1}^N λ_k^i p(x_k^i|x_{k−1}^i).   (14)

These methods may provide a good approximation of the true posterior distribution, but they are only valid when the noise variance of the state is small.

5) Likelihood ID: The likelihood ID [28], given by the likelihood p(y_k|x_k), may not always be feasible, since the PDF p(y_k|x_k) has to be inverted to obtain the PDF of x_k.

6) Hybrid ID: This method is a combination of the optimal ID and the prior ID [29], with the state separated into two parts:

q(x_k^i|x_{k−1}^i, y_k) = p(x_{1,k}^i|x_{1,k−1}^i, y_k) p(x_{2,k}^i|x_{2,k−1}^i, y_k).   (15)

7) Bridging ID: The bridging ID [30] uses a sequence of densities drawn from intermediate distributions between the initial transition density and the final likelihood:

q(x_k^i|x_{k−1}^i, y_k) ∝ p(x_k|x_{k−1}, y_k)^{α_m} p(x_k|y_k)^{1−α_m}.   (16)

(B) Composite approaches

It is possible to construct a suboptimal ID using local linearization techniques. The idea is to move the particles from the domain of low likelihood to the domain of high likelihood using existing local linearization filters.

1) Extended Kalman filter ID: This method approximates the optimal ID [31] by a Gaussian PDF, so it may not be usable for non-Gaussian PDFs.

2) Gaussian sum ID: This method is a mixture of EKFs, where all the random variables are described by weighted sums of Gaussian PDFs [32].

3) Sigma-point Kalman filter ID: The idea is similar to the EKF ID, with the unscented KF used instead.

4) Gaussian mixture sigma-point Kalman filter ID: This is an extension of the UKF ID.

5) H∞ filter ID: The extended H∞ filter [33][34] is adopted for generating the ID.

All of the ideas mentioned above try to find a more efficient ID, and each has its own strengths and weaknesses. In this paper we propose a new way to choose the particles from the ID, based on the PCC. When the likelihood lies in the tails of the prior distribution, or when it is too narrow, most particles drawn from the prior distribution have small weights and the estimation is not reliable. In this case the likelihood provides more information than the prior, so we may use the information in the measurements to let the particles move towards the region of high likelihood.
B. Pearson particle filter

It is well known that the performance of the PF depends heavily on whether the significant regions of the state space are covered by particles. The transition density has been widely used as the prior importance density, but in some situations it is not suitable: when the likelihood lies in the tails of the prior distribution, or when it is too narrow, most particles drawn from the prior distribution have small weights, and the estimation accuracy can be significantly decreased, since this ID contains only the model information and ignores the current measurement. To overcome this drawback we propose a new technique of sampling from the prior ID which uses the observation information and pushes the particles towards the regions of high likelihood. We incorporate the PCC into the generic PF as the criterion for deciding whether a particle is similar to the true state and should be kept, or should be discarded, in the next step of the process. The coefficient r_k^{i,m} will be close to 1 when the observation path of a particle is similar to the observation path of the true state. With the help of the PCC correction term, particles are expected to be pushed into the significant regions of the state space.

Let x_k^{i,m} denote a set of particles sampled from x_k^i, y_k^{i,m} the corresponding observation paths of these particles, and y_k the observation path of the true state. The correlation coefficient r_k^{i,m} between the particle and true state paths can be computed as

r_k^{i,m} = Σ_{k=1}^n (y_k^{i,m} − ȳ^{i,m})(y_k − ȳ) / ( √(Σ_{k=1}^n (y_k^{i,m} − ȳ^{i,m})²) √(Σ_{k=1}^n (y_k − ȳ)²) ).   (17)

We then choose the particle x_k^{i,m} most similar to the true state as the final particle x_k^i. The corresponding PPF is described in TABLE III.

Remark 3.1: In the PPF, the ID is still the transition density. We diffuse the particles sampled from the ID randomly according to the variance and obtain N × M particles. Then N particles are reselected according to their PCC.

Remark 3.2: The difference between PF resampling and the PPF is that the PPF selects good particles, close to the true states, in the prediction step.

Remark 3.3: Choosing the parameter M is important. When the variance used to diffuse the particles is appropriate, only a few particles are enough to find the right one close to the true state.
TABLE III: Pearson Particle Filter

1) Initialize the particles: x_0^i ∼ π_0(dx_0), i = 1, 2, ..., N.
2) Predict the particles by drawing independent samples: x_k^i ∼ P(x_k|x_{k−1}^i), i = 1, ..., N.
3) Handpick the particles to make them close to the true state:
   Draw x_k^{i,m} ∼ x_k^i, m = 1, 2, ..., M;
   calculate r_k^{i,m};
   assign x_k^i = x_k^{i,m}, where x_k^{i,m} has the maximum r_k^{i,m}.
4) Evaluate the importance weights ω_k^i = P(y_k|x_k^i), i = 1, ..., N,
   and normalize: ω_k^i = ω_k^i / Σ_{j=1}^N ω_k^j.
5) Calculate N̂_eff = (Σ_{i=1}^N (ω_k^i)^2)^{−1}.
   If N̂_eff < N_thr:
      [{x_k^i, ω_k^i}_{i=1}^N] = RESAMPLE[{x_k^i, ω_k^i}_{i=1}^N].
   x̂_k = Σ_{i=1}^N ω_k^i x_k^i.
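Step 3 of TABLE III, the handpick step, can be sketched as follows. This is a hedged Python sketch, not the authors' MATLAB implementation: the diffusion of each predicted particle into M candidates is modeled here as additive Gaussian noise with a user-chosen standard deviation `sigma`, and `h_path` is a hypothetical helper that produces the observation path of a candidate, whose exact construction the paper leaves to the model at hand.

```python
import math
import random

def pearson_cc(a, b):
    # Pearson correlation coefficient, Eqs. (10)/(17)
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    sa = math.sqrt(sum((ai - ma) ** 2 for ai in a))
    sb = math.sqrt(sum((bi - mb) ** 2 for bi in b))
    return cov / (sa * sb) if sa > 0 and sb > 0 else 0.0

def handpick(particles, y_path, h_path, m, sigma):
    """For each predicted particle, draw M diffused candidates, score each
    candidate's observation path against the measured path y_path with the
    PCC, and keep the highest-scoring candidate (Step 3 of TABLE III).

    h_path(x) -- assumed helper returning the observation path of candidate x
    sigma     -- assumed diffusion standard deviation (the paper's "variance")
    """
    picked = []
    for x in particles:
        candidates = [x + random.gauss(0.0, sigma) for _ in range(m)]
        best = max(candidates, key=lambda c: pearson_cc(h_path(c), y_path))
        picked.append(best)
    return picked
```

After this reselection, steps 4) and 5) proceed exactly as in the generic PF, so the extra cost is the M candidate evaluations per particle, consistent with Remark 3.3 that a small M suffices when `sigma` is chosen well.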
C. Convergence of empirical measures of the PPF

In this section we investigate the asymptotic behavior of the PPF with respect to the operators in [35].

Definition 3.1: The transition probability kernel K(x_k|x_{k−1}) has the Feller property if, for every continuous and bounded function ϕ, the function

z → ∫_X ϕ(x) K(dx|z)   (18)

is continuous and bounded.

In this section we assume that the transition kernel K(x_k|x_{k−1}) is Feller and that the likelihood function g(y_k|x_k) is a continuous, bounded, strictly positive function.

Definition 3.2: We define the prediction operator b_k : P(X) → P(X) to be the mapping

b_k(ν)(dx_k) = ∫_X K(dx_k|x_{k−1}) ν(dx_{k−1})   (19)

for arbitrary ν ∈ P(X). We have π_{k|k−1} = b_k(π_{k−1|k−1}), where the marginal distributions are defined as π_{k|k−1} := P(x_k|y_{1:k−1}) and π_{k|k} := P(x_k|y_{1:k}).
Definition 3.3: We define the update operator a_k : P(X) → P(X) to be the mapping such that for arbitrary ν ∈ P(X), a_k(ν) is the probability measure defined as

(a_k(ν), ϕ) = (ν, g)^{−1} (ν, ϕg).   (20)

We have π_{k|k} = a_k(π_{k|k−1}).

Definition 3.4: We define the perturbation (random sampling) operator c^N as, for all ν ∈ P(X),

c^{N,x}(ν) = (1/N) Σ_{j=1}^N δ_{V_j(x)},   (21)

where N > 0, x ∈ X, and V_j, j = 1, 2, ..., N, are i.i.d. random variables with common distribution ν. We will ignore the dependence on x.

Theorem 3.1 (Doucet, 2002): Assuming that the transition kernel K is Feller and that the likelihood function g is bounded, continuous, and strictly positive, lim_{N→∞} π_{t|t}^N = π_{t|t} almost surely.

The particle filter can be expressed in terms of the operators above acting on the probability space:

π_{k|k}^N := [c̄^N ∘ a_k ∘ c^N ∘ b_k] π_{k−1|k−1}^N,   (22)

where ∘ denotes the composition of functions and c̄^N is the resampling operator on P(X) that maps the measure ν into a random measure.

Remark 3.4: The operators c^N ∘ b_k applied to π_{k−1|k−1}^N describe the prediction stage of the PF, a_k denotes the update stage, and c̄^N corresponds to the resampling step.

Theorem 3.2: Assume that the transition kernel K is Feller and the likelihood function g is bounded, continuous, and strictly positive, and let c̄^N be a resampling operator such that for every bounded function φ there exists a constant C such that

E[( ∫_X φ(x) c̄^N(ν)(dx) − ∫_X φ(x) ν(dx) )^4] ≤ C/N².   (23)

Then as N → ∞ the marginal distribution π_{k|k}^N defined in (22) converges almost surely towards the true posterior PDF; this has been proven in [34].

Let us now discuss the situation when the PCC is used to choose the particles. The prediction step in (22) is changed as follows:

π_{k−1|k−1}^N  −(c^N ∘ b_k)→  [K(x_k|x_{k−1}^1), ..., K(x_k|x_{k−1}^N)]
              −(q^M)→        [K(x_k|x_{k−1}^{1,m}), ..., K(x_k|x_{k−1}^{N,m})]
              −(using r_k^{i,m})→ [K(x_k|x_{k−1}^1), ..., K(x_k|x_{k−1}^N)]
              −(a_k)→        π_{k|k}^N,   (24)
where the perturbation q^M denotes a random sample of size M of the predicted states (m = 1, 2, ..., M) and satisfies:

for all e^M, e ∈ E such that lim_{M→∞} e^M = e,   (25)
lim_{M→∞} q^M(e^M) = e,   (26)
lim_{M→∞} q^M(b_t(e^M)) = b_t(e).   (27)

Then we get that π_{k|k−1}^N still converges towards π_{k|k−1}.
D. Convergence of the mean square error

In this section the convergence of the PPF is analyzed by computing a bound on the mean square error. We show that at each stage of the algorithm the approximation admits a mean square error of order 1/N in the number of particles.

Theorem 3.3: For all t ≥ 0 there exists a constant C_{t|t} such that, for any bounded function φ,

E[((π_{t|t}^N, φ) − (π_{t|t}, φ))²] ≤ C_{t|t} ‖φ‖² / N,   (28)

where ‖φ‖ = sup_x |φ(x)|. The proof is carried out by induction.

Lemma 3.1: For any bounded function φ there exists a real number C_{0|0} such that after Step 1 in TABLE III,

E[((π_{0|0}^N, φ) − (π_{0|0}, φ))²] ≤ C_{0|0} ‖φ‖² / N   (29)

holds at time t = 0. Because the N particles from the prior distribution π_{0|0} are assumed to be independent and identically distributed, (φ(x_0^i) − (π_{0|0}, φ)) is a sequence of independent integrable random variables. From the Marcinkiewicz-Zygmund inequality there exists a constant c such that

E[((π_{0|0}^N, φ) − (π_{0|0}, φ))²] = E[((1/N) Σ_{i=1}^N (φ(x_0^i) − (π_{0|0}, φ)))²]
≤ c E[(1/N²) Σ_{i=1}^N (φ(x_0^i) − (π_{0|0}, φ))²]
≤ C_{0|0} ‖φ‖² / N.   (30)
Lemma 3.2: Assume that for any bounded function φ there exists a real number C_{t−1|t−1} such that

E[((π_{t−1|t−1}^N, φ) − (π_{t−1|t−1}, φ))²] ≤ C_{t−1|t−1} ‖φ‖² / N   (31)

holds. Then after Steps 2, 3, and 4 in TABLE III we have

E[((π̃_{t|t}^N, φ) − (π_{t|t}, φ))²] ≤ C̃_{t|t} ‖φ‖² / N.   (32)

Let F_{t−1} denote the σ-field generated by the particles {x_{t−1}^i}_{i=1}^N. Then

E[(π_{t|t−1}^N, φ) | F_{t−1}] = (π_{t−1|t−1}^N, Kφ).   (33)

We have

|(π̃_{t|t}^N, φ) − (π_{t|t}, φ)| = |(π_{t|t−1}^N, gφ)/(π_{t|t−1}^N, g) − (π_{t|t−1}, gφ)/(π_{t|t−1}, g)|
= |(π̃_{t|t}^N, φ)/(π_{t|t−1}, g) [(π_{t|t−1}, g) − (π_{t|t−1}^N, g)] + 1/(π_{t|t−1}, g) [(π_{t|t−1}^N, gφ) − (π_{t|t−1}, gφ)]|.   (34)

We let

(π_{t|t−1}^N, gφ) − (π_{t|t−1}, gφ) = Π_1 + Π_2,   (35)

where

Π_1 = (π_{t|t−1}^N, gφ) − E[(π_{t|t−1}^N, gφ)],   (36)
Π_2 = E[(π_{t|t−1}^N, gφ)] − (π_{t|t−1}, gφ).   (37)

Then

E[|Π_1|² | F_{t−1}] = E[(π_{t|t−1}^N, gφ)² | F_{t−1}] − (π_{t−1|t−1}^N, Kgφ)²
≤ (1/N) E[(1/N) Σ_{i=1}^N (φ(x_t^i) g(y_t|x_t^i))² | F_{t−1}]
≤ C_1 ‖φ‖² / N,   (38)

E[|Π_2|² | F_{t−1}] = E[((π_{t−1|t−1}^N, Kgφ) − (π_{t−1|t−1}, Kgφ))² | F_{t−1}]
≤ C_{t−1|t−1} ‖g‖² ‖φ‖² / N = C_2 ‖φ‖² / N.

Using the Minkowski inequality, we have

E[((π_{t|t−1}^N, gφ) − (π_{t|t−1}, gφ))²] ≤ C_3 ‖φ‖² / N.   (39)

Applying the Minkowski inequality to (34), we have

(E[((π̃_{t|t}^N, φ) − (π_{t|t}, φ))²])^{1/2}
≤ |(π̃_{t|t}^N, φ)/(π_{t|t−1}, g)| (C_3/N)^{1/2} + |1/(π_{t|t−1}, g)| (C_3 ‖φ‖²/N)^{1/2}
≤ (‖φ‖/(π_{t|t−1}, g)) (C_3/N)^{1/2} + (1/(π_{t|t−1}, g)) (C_3 ‖φ‖²/N)^{1/2}
≤ C̃_{t|t} ‖φ‖ / √N.   (40)

Lemma 3.3: Assume that Lemma 3.2 holds. Then after Step 5 in TABLE III we have

E[((π_{t|t}^N, φ) − (π_{t|t}, φ))²] ≤ C_{t|t} ‖φ‖² / N,   (41)

since

(E[((π_{t|t}^N, φ) − (π_{t|t}, φ))²])^{1/2} ≤ (E[((π_{t|t}^N, φ) − (π̃_{t|t}^N, φ))²])^{1/2} + (E[((π̃_{t|t}^N, φ) − (π_{t|t}, φ))²])^{1/2}
≤ C ‖φ‖/√N + C̃_{t|t} ‖φ‖/√N = C_{t|t} ‖φ‖/√N.   (42)
Therefore, the proof of Theorem 3.3 is complete.

IV. SIMULATION

In this section the Pearson particle filter (PPF) is compared with traditional filters and with the ER algorithm (PF-ER [37]) in two examples. All algorithms were coded in MATLAB. The root mean square error (RMSE) is used to measure the estimation accuracy.

Example 1. The univariate non-stationary growth model, which has been widely analyzed in many publications, is used as the first example. The discrete-time system dynamics are

x_{t+1} = x_t/2 + 25 x_t/(1 + x_t²) + 8 cos(1.2t) + ν_t,   (43)
y_t = x_t²/20 + e_t,   (44)
where ν_t ∼ N(0, 10) and e_t ∼ N(0, 1); the initial value is x_0 = 10, the initial error covariance is P(0) = 1, the initial particle states follow a uniform distribution, the sample interval is T = 1 s, and the sampling time is t_f = 50 s. The state transition probability density is selected as the importance density. In order to show the particle distribution, in this experiment we let N = 10 and M = 3; the results are shown in Fig. 1 (the solid line denotes the path of the true state, ∗ denotes the sampled particles in PF, and the second marker denotes the sampled particles in PPF). The particles in the PPF are closer to the true state.

Fig. 1: The true state and the distribution of particles in PF (∗) and PPF

We use the root mean squared error (RMSE) of the state vector as the performance metric, which yields a combined measure of the bias and variance of a filter estimate, to compare the performance of the various algorithms. We define the average RMSE over all sampling times as

RMSE = √( (1/T) Σ_{k=1}^T (x̂_k^n − x_k^n)² ),   (45)

where x_k^n and x̂_k^n are the true state vector and the minimum mean square error estimate of the state vector at time step k in the n-th Monte Carlo run, respectively. The RMSEs shown in Fig. 2 are based on 100 particles for PF, PF-ER, and PPF (M = 5), and are used to compare the accuracy of the estimations and illustrate the consistency and convergence of the filters. The PPF has better accuracy than PF-ER and PF.
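The model of Eqs. (43)-(44) and the RMSE of Eq. (45) are straightforward to reproduce. The following Python sketch simulates one realization; note that ν_t ∼ N(0, 10) has variance 10, i.e. standard deviation √10, and the `seed` argument is an assumption added here for reproducibility.

```python
import math
import random

def simulate_ungm(tf, x0, seed=0):
    """Univariate non-stationary growth model, Eqs. (43)-(44):
    x_{t+1} = x_t/2 + 25 x_t/(1+x_t^2) + 8 cos(1.2 t) + v_t,  v_t ~ N(0, 10)
    y_t     = x_t^2 / 20 + e_t,                               e_t ~ N(0, 1)
    """
    rng = random.Random(seed)
    xs, ys = [], []
    x = x0
    for t in range(tf):
        xs.append(x)
        ys.append(x * x / 20.0 + rng.gauss(0.0, 1.0))
        x = (x / 2.0 + 25.0 * x / (1.0 + x * x)
             + 8.0 * math.cos(1.2 * t)
             + rng.gauss(0.0, math.sqrt(10.0)))  # std = sqrt(variance 10)
    return xs, ys

def rmse(est, truth):
    # average RMSE of Eq. (45) over the T sampling times
    return math.sqrt(sum((e - x) ** 2 for e, x in zip(est, truth)) / len(truth))
```

Feeding the simulated measurements to any of the filters above and calling `rmse` on the resulting estimates reproduces the performance metric used in Figs. 2 and 3.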
Fig. 2: The RMSE of PF (N=100), PF-ER (N=100), and PPF (N=100, M=5)

The sample mean and sample variance of the RMSE, calculated from 500 independent Monte Carlo realizations, are listed in TABLE IV.

TABLE IV: Estimation Results

Filter              RMSE mean   RMSE variance   Time/s
PF (N=100)          3.374290    0.142336        0.046875
PF-ER (N=100)       2.866485    0.098238        1.062500
PPF (N=100, M=5)    1.582924    0.033893        0.828125
In Fig. 3 we compare the PPF with 100 particles (M = 5) against the PF with 1000 particles and the PF with 500 particles in terms of RMSE. The PPF still has better accuracy than the PF with 1000 particles and a large advantage over the PF with 500 particles. This verifies that the PPF algorithm reduces the number of particles required to achieve a given level of accuracy compared with the PF algorithm.
Fig. 3: The RMSE of PF (N=1000), PF (N=500), and PPF (N=100, M=5)

The average elapsed time of 10 independent Monte Carlo runs for different particle numbers is shown in TABLE V. Although the elapsed time of the PPF is longer than that of the PF with the same number of particles, it has better accuracy. Moreover, the elapsed time of the PPF (N = 100) is less than that of the PF (N = 1000) while it still performs better. Thus the PPF algorithm does not increase the computational cost for the same level of accuracy.

TABLE V: Estimation Results

Filter              RMSE mean   RMSE variance   Time/s
PF (N=1000)         2.335544    0.126547        0.343751
PF (N=500)          3.125031    0.136481        0.203153
PPF (N=100, M=5)    1.582924    0.033893        0.828125
Example 2. We consider tracking a target moving in a 2-D space. The sensor is assumed to be stationary and located at the origin of the X-Y plane. The target moves according to the following process model:

x_t = F x_{t−1} + G v_{t−1},   (46)

where

x_t = [x_t, ẋ_t, y_t, ẏ_t]^T,  v_{t−1} = [v_{x,t}, v_{y,t}]^T,   (47)

F = | 1 1 0 0 |      G = | 0.5  0  |
    | 0 1 0 0 |          | 1    0  |
    | 0 0 1 1 |          | 0   0.5 |
    | 0 0 0 1 |,         | 0    1  |,   (48)

and x_t and y_t denote the Cartesian coordinates of the target at time t, while ẋ_t and ẏ_t denote the velocities in the X and Y directions, respectively. The system noise vector v_{t−1} is assumed to be Gaussian white noise, i.e., v_{t−1} ∼ N(0, Q_{t−1}), where Q_{t−1} := σ_v² I₂ is the covariance matrix and I₂ is the 2 × 2 identity matrix. The initial prior distribution of the state vector is Gaussian, N(x_0; x̄_0, P_0). The measurement equation is

z_t = h(x_t) + ω_t.   (49)

The measurement vector z_t has two components, the range and the bearing, so h(x_t) can be expressed as

h(x_t) = [ √(x_t² + y_t²), arctan(y_t/x_t) ]^T.   (50)

ω_t is the glint measurement noise, modeled as a mixture of two Gaussian distributions:

p(ω_t) = (1 − ε) N₁(ω_t; 0, R₁) + ε N₂(ω_t; 0, R₂),   (51)

where ε is the glint probability, and R₁ and R₂ are the background noise covariance and the glint noise covariance, respectively. The numerical simulations were performed for 40 time steps with the parameter values σ_v = 10 m/s², x̄_0 = [2000 m, −50 m/s, 2000 m, −10 m/s]^T, P_0 = diag[(20 m)², (50 m/s)², (20 m)², (5 m/s)²], ε = 0.1, and R₂ = diag[(10 m)², (9 mrad)²]. The covariance of the background noise is time-varying: R₁ = diag[(10 m)², (0.05 mrad)²] for 0 < t ≤ 20 and R₁ = diag[(10 m)², (0.2 mrad)²] for 20 < t ≤ 40. The particle number is N = 100.
Fig. 4: The RMSE of position in the X direction (m)

Fig. 5: The RMSE of position in the Y direction (m)

Figs. 4 and 5 plot the RMSE curves of the position in the X and Y directions. Due to the large measurement noise variance and the small particle number, the estimation accuracy of the PPF is better than that of the PF and PF-ER over a long period of time. When the target passes through the vicinity of the observer (at about time index 26), the PF diverges frequently and its RMSE is significantly larger than that of the PPF and PF-ER; no divergence was observed for the PPF and PF-ER. This indicates that the PPF gives the most accurate results for this multi-dimensional nonlinear estimation problem.
V. CONCLUSION

In this paper a new PF based on the PCC has been proposed as an improvement on the PF with resampling. By computing the PCC of the particles in the prediction step, new particles are generated that are close to the true state, which alleviates the degeneracy and sample impoverishment problems simultaneously. Simulations indicate that the PPF algorithm clearly improves the estimation accuracy. The PPF remains to be further improved in order to obtain better performance and reduce the computational complexity of the filtering.

VI. ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their detailed comments, which helped to improve the quality of the paper. This work was supported by the National Natural Science Foundation for Outstanding Youth (61422102).

REFERENCES

[1] K. Reif et al., Stochastic stability of the discrete-time extended Kalman filter, IEEE Trans. on Automatic Control, vol. 44, pp. 714-728, 1999.
[2] S. J. Julier, J. K. Uhlmann, Unscented filtering and nonlinear estimation, Proc. IEEE, vol. 92, pp. 401-422, Mar. 2004.
[3] N. J. Gordon, D. J. Salmond, A. F. M. Smith, Novel approach to nonlinear/non-Gaussian Bayesian state estimation, IEE Proceedings, vol. 140, pp. 107-113, 1993.
[4] H. X. Li, T. B. Schon, L. Ljung, A basic convergence result for particle filtering, IEEE Trans. Signal Process., vol. 56, no. 4, pp. 1337-1348, Apr. 2008.
[5] Hongyi Li, Chengwei Wu, Ligang Wu, Hak-Keung Lam, Yabin Gao, Filtering of interval type-2 fuzzy systems with intermittent measurements, IEEE Trans. on Cybernetics, vol. 46, no. 3, Mar. 2016.
[6] J. Kim, M. Tandale, P. K. Menon, E. Ohlmeyer, Particle filter for ballistic target tracking with glint noise, Journal of Guidance, Control, and Dynamics, vol. 33, pp. 1918-1921, 2010.
[7] Hongyi Li, Yabin Gao, Peng Shi, Hak-Keung Lam, Observer-based fault detection for nonlinear systems with sensor fault and limited communication capacity, IEEE Trans. on Automatic Control, DOI: 10.1109/TAC.2015.2503566.
[8] L. L. Ong, T. Bailey, H. Durrant-Whyte, B. Upcroft, Asynchronous particle filter for tracking using non-synchronous sensor networks, Signal Processing, vol. 91, pp. 2304-2313, 2011.
[9] Xunyuan Yin, Kevin Arulmaran, Jinfeng Liu, Subsystem decomposition and configuration for distributed state estimation, Wiley Online Library, DOI: 10.1002/aic.15170.
[10] Michal Zajac, Online fault detection of a mobile robot with a parallelized particle filter, Neurocomputing, vol. 126, pp. 151-165, 2014.
[11] Hongyi Li, Yingnan Pan, Qi Zhou, Filter design for interval type-2 fuzzy systems with D stability constraints under a unified frame, IEEE Trans. on Fuzzy Systems, vol. 23, no. 3, Jun. 2015.
[12] Jianfang Dou, Jianxun Li, Robust visual tracking based on interactive multiple model particle filter by integrating multiple cues, Neurocomputing, vol. 125, pp. 118-129, 2014.
[13] Lixian Zhang, Xunyuan Yin, Zepeng Ning, Dong Ye, Robust filtering for a class of networked nonlinear systems with switching communication channels, IEEE Trans. on Cybernetics, DOI: 10.1109/TCYB.2016.2523811.
[14] S. J. Hong, M. Bolic, P. M. Djuric, An efficient fixed-point implementation of residual resampling scheme for high-speed particle filters, IEEE Signal Processing Letters, vol. 11, pp. 482-485, May 2004.
[15] T. Li, T. P. Sattar, S. Sun, Deterministic resampling: unbiased sampling to avoid sample impoverishment in particle filters, Signal Processing, vol. 92, pp. 1637-1645, 2012.
[16] G. Kitagawa, Monte Carlo filter and smoother for non-Gaussian nonlinear state space models, Journal of Computational and Graphical Statistics, vol. 5, pp. 1-25, 1996.
[17] J. S. Liu, R. Chen, Sequential Monte Carlo methods for dynamical systems, Journal of the American Statistical Association, vol. 93, pp. 1032-1044, 1998.
[18] J. Carpenter, P. Clifford, P. Fearnhead, Improved particle filter for nonlinear problems, IEE Proceedings: Radar, Sonar and Navigation, vol. 146, pp. 2-7, 1999.
[19] J. L. Rodgers, W. A. Nicewander, Thirteen ways to look at the correlation coefficient, The American Statistician, vol. 42, pp. 59-66, 1988.
[20] Sanjay Kumar Tyagi, Correlation coefficient of dual hesitant fuzzy sets and its applications, Applied Mathematical Modelling, vol. 38, pp. 659-666, 2014.
[21] Diego Pavanello, Willem Zaaiman, Alessandra Colli, John Heiser, Scott Smith, Statistical functions and relevant correlation coefficients of clearness index, Journal of Atmospheric and Solar-Terrestrial Physics, pp. 142-150, 2015.
[22] Huchang Liao, Zeshui Xu, Xiao-Jun Zeng, Novel correlation coefficients between hesitant fuzzy sets and their application in decision making, Knowledge-Based Systems, vol. 82, pp. 115-127, 2015.
[23] Yunmi Kim, Tae-Hwan Kim, Tolga Ergun, The instability of the Pearson correlation coefficient in the presence of coincidental outliers, Finance Research Letters, pp. 243-257, 2015.
[24] Marie-Therese Puth, Markus Neuhauser, Graeme D. Ruxton, Effective use of Pearson's product-moment correlation coefficient, Animal Behaviour, vol. 93, pp. 183-189, 2014.
[25] H. Tanizaki, Nonlinear Filters: Estimation and Applications, ser. Lecture Notes in Economics and Mathematical Systems, Berlin: Springer, 1993.
[26] M. Simandl, O. Straka, Functional sampling density design for particle filters, Signal Processing, vol. 88, pp. 2784-2789, 2008.
[27] C. Fritsche, T. Schon, A. Klein, The marginalized auxiliary particle filter, in Proceedings of the 2009 3rd IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), Aruba, Dutch Antilles, pp. 289-292, Dec. 2009.
[28] Arpita Mukherjee, Aparajita Sengupta, Likelihood function modeling of particle filter in presence of non-stationary non-Gaussian measurement noise, Signal Processing, vol. 90, pp. 1873-1885, 2010.
[29] Junyi Zuo, Yan Liang, Yizhe Zhang, Quan Pan, Particle filter with multi mode sampling strategy, Signal Processing, vol. 93, no. 3, pp. 3192-3201, 2013.
[30] S. Godsill, A. Doucet, M. West, Maximum a posteriori sequence estimation using Monte Carlo particle filters, Ann. Inst. Statist. Math., no. 1, pp. 82-96, 2001.
[31] T. Chen, T. Schon, H. Ohlsson, L. Ljung, Decentralized particle filter with arbitrary state decomposition, IEEE Trans. on Signal Processing, vol. 59, no. 2, pp. 465-478, 2011.
[32] O. Straka, J. Dunik, M. Simandl, Truncated unscented particle filter, in Proceedings of the American Control Conference, San Francisco, California, pp. 1825-1830, 2011.
[33] Xunyuan Yin, Zhaojian Li, Lixian Zhang, Changhong Wang, Wafa Shammakh, Bashir Ahmad, Model reduction of a class of Markov jump nonlinear systems with time-varying delays via projection approach, Neurocomputing, vol. 166, pp. 436-446, 2015.
[34] Lixian Zhang, Yanzheng Zhu, Peng Shi, Yuxin Zhao, Resilient asynchronous H∞ filtering for Markov jump neural networks with unideal measurements and multiplicative noises, IEEE Trans. on Cybernetics, vol. 45, no. 12, pp. 2840-2852, 2015.
[35] D. Crisan, A. Doucet, A survey of convergence results on particle filtering methods for practitioners, IEEE Trans. on Signal Processing, vol. 50, pp. 735-746, 2002.
[36] L. Zhang, Y. Zhu, W. X. Zheng, Energy-to-peak state estimation for Markov jump RNNs with time-varying delays via nonsynchronous filter with nonstationary mode transitions, IEEE Trans. on Neural Networks and Learning Systems, vol. 26, no. 10, pp. 2346-2356, 2015.
[37] X. Y. Fu, Y. M. Jia, An improvement on resampling algorithm of particle filters, IEEE Trans. on Signal Processing, vol. 58, no. 10, pp. 5414-5420, 2010.