An online sequential algorithm for the estimation of transition probabilities for jump Markov linear systems

An online sequential algorithm for the estimation of transition probabilities for jump Markov linear systems

Automatica 42 (2006) 1735 – 1744 www.elsevier.com/locate/automatica Brief paper An online sequential algorithm for the estimation of transition prob...

351KB Sizes 3 Downloads 122 Views

Automatica 42 (2006) 1735 – 1744 www.elsevier.com/locate/automatica

Brief paper

An online sequential algorithm for the estimation of transition probabilities for jump Markov linear systems夡 Umut Orguner ∗ , Mübeccel Demirekler Department of Electrical and Electronics Engineering, Middle East Technical University, 06531 Ankara, Turkey Received 1 December 2005; received in revised form 7 March 2006; accepted 3 May 2006 Available online 21 June 2006

Abstract This paper describes a new method to estimate the transition probabilities associated with a jump Markov linear system. The new algorithm uses stochastic approximation type recursions to minimize the Kullback–Leibler divergence between the likelihood function of the transition probabilities and the true likelihood function. Since the calculation of the likelihood function of the transition probabilities is impossible, an incomplete data paradigm, which has been previously applied to a similar problem for hidden Markov models, is used. The algorithm differs from the existing algorithms in that it assumes that the transition probabilities are deterministic quantities whereas the existing approaches consider them to be random variables with prior distributions. 䉷 2006 Elsevier Ltd. All rights reserved. Keywords: Jump Markov linear systems; Kullback–Leibler distance measure; Transition probability; Stochastic approximation

1. Introduction Jump Markov linear systems (JMLSs), which are defined to be linear systems whose parameters evolve according to a finitestate Markov chain, are used in many applications of signal processing, control theory and communications (see Logothetis & Krishnamurthy, 1999 and the references therein). In especially maneuvering target tracking, where the motion of a target can switch between different models along its trajectory, JMLs are used for modelling the signals of interest (Bar-Shalom & Li, 1993). Also, in applications where the systems involved have finite number of different modes of operation (such as fault detection and isolation, etc.), they appear naturally in the modelling process. There are many methods of achieving the state estimation of JMLSs which differ in their estimation criteria and means (Costa, 1994; Johnston & Krishnamurthy, 2001; 夡 This paper was not presented at any IFAC meeting. This paper was recommended for publication in revised form by Associate Editor George Yin under the direction of Editor Ian Petersen. ∗ Corresponding author. Tel.: +90 312 210 2374; fax: +90 312 210 2304. E-mail addresses: [email protected] (U. Orguner), [email protected] (M. Demirekler).

0005-1098/$ - see front matter 䉷 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.automatica.2006.05.002

Logothetis & Krishnamurthy, 1999; Zhang, 1999). The existing algorithms range from the classical generalized pseudoBayesian (GPB) (Ackerson & Fu, 1970; Tugnait, 1982) and the interacting multiple model (IMM) (Blom & Bar-Shalom, 1988; Mazor, Averbuch, Bar-Shalom, & Dayan, 1998) filters to stochastic sampling based methods (Boers & Driessen, 2003; Doucet, Logothetis, & Krishnamurthy, 2000, 2001). In almost all existing state estimation methods for the JMLSs, the transition probabilities of the underlying finite-state Markov chain are assumed to be known a priori. Even in the cases when they are not known, some diagonally dominant probability transition matrix is generally used. The online estimation of the transition probabilities as well as the base- and modal-states associated with a JMLS has been recently addressed by Jilkov and Li (2004) where some (conditionally) optimal (although infeasible to implement with its ever-growing memory requirements) and sub-optimal algorithms are generated using the minimum mean square error (MMSE) criterion. In addition, in Jilkov, Li, and Angelova (2003), the problem of state estimation of JMLSs with unknown transition probabilities is solved using Bayesian sampling. Although, to the authors’ knowledge, there is no previous attempt to solve the problem until (Jilkov et al., 2003), a literature is already available for a similar problem

1736

U. Orguner, M. Demirekler / Automatica 42 (2006) 1735 – 1744

associated with hidden Markov models (HMM) (Collings, Krishnamurthy, & Moore, 1994; Krishnamurthy & Moore, 1993; Krishnamurthy & Yin, 2002; Stiller & Radons, 1999). The aim of this study is to adapt one of these existing methods in the HMM literature, namely, the recursive Kullback–Leibler (RKL) estimation procedure (Collings et al., 1994; Krishnamurthy & Moore, 1993) for transition probabilities to the case of the JMLSs. Our method is fundamentally different from the algorithms given in Jilkov and Li (2004) in that we consider the transition probabilities as deterministic quantities (whereas Jilkov & Li, 2004 consider them to be random variables). The paper is organized as follows. In Section 2, the problem definition and the method of solution are given. The RKL algorithm is applied to the case of JMLSs in Section 3. The performance of the algorithm is illustrated in two examples with Monte Carlo runs in Section 4. The paper is finalized with conclusions in Section 5. 2. Problem definition and solution methodology The following JMLS model is considered: xk+1 = A(rk+1 )xk + B(rk+1 )wk+1 ,

(1)

yk = C(rk )xk + D(rk )vk ,

(2)

of the last element, i.e., Y n {y1 , y2 , . . . , yn },

(6)

 n {ˆ 1 , ˆ 2 , . . . , ˆ n }. 

(7)

The estimates are shown by hats over the letters. The iterations for the estimated quantities are shown by subscripts or by parenthesized superscripts for quantities already subscripted (n) (e.g., ˆ n or ˆ ij ). The individual elements of vector or matrix quantities are denoted by subscripted parentheses (e.g., when x is a vector, its ith element is shown by (x)i . When  is a matrix, the element of it corresponding to its ith row and jth column is denoted as ()i,j ). All the probability density functions involved in the equations are assumed to exist. The parametric probability density functions and expected value operations with  being the vector of parameters are denoted using the parameter value as a given condition (e.g., p(x|) and E{x|}) although the parameters are assumed to be deterministic quantities. A more complicated example is E{p(x|)| }



p(x|)p(x| ) dx.

(8)

where • {xk } is the continuous-valued base-state sequence with initial distribution x0 ∼ N(x0 ; xˆ0 , P0 ),

(3)

where the notation N(x; x, ¯ ) stands for a Gaussian probability density function for dummy variable x which has a mean x¯ and covariance . • {rk } is the unknown discrete-valued modal-state sequence. • {yk } is the noisy observation sequence. • {wk } is a white process noise sequence with distribution wk ∼ N(wk ; 0, Qk ).

(4)

• {vk } is a white measurement noise sequence independent from the process noise wk with distribution vk ∼ N(vk ; 0, Rk ).

(5)

The discrete-valued modal-state rk ∈ {1, 2, . . . , N} is assumed to be a first-order finite-state homogeneous Markov chain with fixed but unknown transition probability matrix  = [ij ]. The basic variables wk , vk , x0 and the modal-state sequence rk are assumed to be mutually independent for all k. The time-varying matrices A(rk ), B(rk ), C(rk ), and D(rk ) are assumed to be known for each value of rk . 2.1. Notes about the notation In the following, the capital letters with superscripts will denote sequences with the superscript corresponding to the index

The aim of our study is to find an online update mechanism for estimating the unknown fixed transition probabilities of the JMLS defined above which can work coupled with a conventional state estimator such as GPB or IMM algorithm. 2.2. Solution methodology The RKL approach introduced, in general, by Weinstein, Feder, and Oppenheim (1990) and applied later to HMMs by Krishnamurthy and Moore (1993), achieves the estimation of the unknown parameters by minimizing recursively the Kullback–Leibler divergence (Cover & Thomas, 1991) (called as relative entropy by information theorists)   p(Y n |0 )  Cn () = E log 0 p(Y n |)     p(Y n |0 )  log p(Y n |0 ) dY n p(Y n |) 

(9) (10)

between the likelihood function p(Y n |) of the unknown parameters  and the true likelihood p(Y n |0 ). The parameter RKL is therefore given by estimate ˆ n

RKL ˆ n = arg min Cn ().



(11)

Minimizing the cost function given in Eq. (10) is equivalent to maximizing the reward function Jn () given as Jn () = E{log p(Y n |)|0 }.

(12)

U. Orguner, M. Demirekler / Automatica 42 (2006) 1735 – 1744

The likelihood function p(Y n |) is extremely difficult (if not impossible) to calculate for the JMLSs like the case in many other applications. Therefore, we are going to consider Y n as the known (incomplete) part of the complete data K n = {Y n , Xn , R n }. The same incomplete data approach is used in Logothetis and Krishnamurthy (1999) and Johnston and Krishnamurthy (2001) to calculate offline and online base-state estimates using the expectation maximization procedure, respectively. We can obtain using the Bayes rule that, p(Y n |) =

p(Y n , Xn , R n |) p(X n , R n |Y n , )

(13)

which gives

Taking expected values of both sides given rameter value  ,

ˆ k+1 = arg max Qn (, ˆ k ). 

(14) Yn

at a fixed pa-

where the left-hand side of Eq. (14) is taken immediately out of the expectation since the likelihood p(Y n |) is deterministic with the given information in the expectation. Defining the functions Qn (,  , Y n ) and Pn (,  , Y n ) as Qn (,  , Y n )E{log p(Y n , Xn , R n |)|Y n ,  },

(16)

Pn (,  , Y n )E{log p(X n , R n |Y n , )|Y n ,  },

(17)

we can express log p(Y n |) as follows: log p(Y n |) = Qn (,  , Y n ) − Pn (,  , Y n ).

(18)

The reward function Jn () is then given by Jn () = E{Qn (,  , Y n )|0 } − E{Pn (,  , Y n )|0 }  Qn (,  ) − P n (,  ).

(19) (20)

Note that by Jensen’s inequality (Cover & Thomas, 1991), ∀Y n

(21)

and, therefore,

(27)

The sequential version of this algorithm becomes  n , Y n+1 ) ˆ n+1 = arg max Qn+1 (,  

(15)

(26)

Recognizing that the calculation of the function Qn (, ˆ n ) = E{Qn (, ˆ n , Y n )|0 } requires the knowledge of the true parameter value 0 , we necessarily must choose a suboptimal recursion. The generally adopted solution to this type of problems is to use Qn (, ˆ n , Y n ) instead of its ensemble average. Therefore, an approximate iterative algorithm for maximizing Jn () is 

log p(Y n |) = E{log p(Y n , Xn , R n |)|Y n ,  } − E{log p(Xn , R n |Y n , )|Y n ,  },

Pn (,  , Y n )Pn ( ,  , Y n )

An iterative but offline algorithm maximizing the reward function Jn () is therefore given as

ˆ k+1 = arg max Qn (, ˆ k , Y n ).

log p(Y n |) = log p(Y n , Xn , R n |) − log p(X n , R n |Y n , ).

1737

(28)

which avoids the reprocessing of all the accumulating data when a new estimate ˆ n is obtained. In the cases where the  n , Y n+1 ) is not possible, analytical maximization of Qn+1 (,  one can use the so-called stochastic approximation type algorithms (Benveniste, Métivier, & Priouret, 1990; Kushner & Yin, 1997). These algorithms were introduced to find the zeros of the functions which are unknown but whose noise corrupted observations can be obtained (Robbins & Monro, 1951). The stochastic approximation algorithm that we will use is intended to find the zeros of the gradient of the reward function,  n , Y n+1 ). This type of algoi.e., the zeros of (j/j)Qn+1 (,  rithms is generally called in the field as the stochastic gradient algorithms. The recursion of the algorithm is as follows:  n , Y n+1 ), ˆ n+1 = ˆ n + n Gn+1 (ˆ n , 

(29)

where n is a sequence of small scalar gains and Gn+1 (ˆ n ,  n , Y n+1 ) is a (possibly noisy) measurement of the gradient   n , Y n+1 ) with respect to the unknown parameter of Qn+1 (,   evaluated at the last parameter estimate ˆ n , i.e.,   j n n n+1 n+1  ˆ   Gn+1 (n ,  , Y )= ) Qn+1 (,  , Y j =ˆ n n n+1  + Wn+1 ( , Y ), (30)

which results in the following fact:

 n , Y n+1 ) is the parameter-dependent noise where Wn+1 ( term in the measurement of the gradient. The almost-sure and some other weaker types of convergence for the algorithm to the true parameter value 0 is guaranteed if some conditions on the step-size sequence n and the measurement  n , Y n+1 ) are satisfied (Benveniste et al., 1990; noise Wn+1 ( Kushner & Yin, 1997). In the cases where constraints exist on the parameters , the updated parameters are usually projected onto the constraint surface, i.e.,

Qn (,  )Qn ( ,  ) implies Jn ()Jn ( ).

 n , Y n+1 )}, ˆ n+1 = P{ˆ n + n Gn+1 (ˆ n , 

P n (,  )P n ( ,  ).

(22)

Using this, we can say that Jn ( ) = Qn ( ,  ) − P n ( ,  ) Qn ( ,  ) − P n (,  ) = Jn () + Qn ( ,  ) − Qn (,  )

(23) (24) (25)

(31)

1738

U. Orguner, M. Demirekler / Automatica 42 (2006) 1735 – 1744

where P denotes the projection operator related with the constraint surface (manifold) which returns the nearest element in the constraint surface to the updated parameter value. When the unknown parameter vector  is composed of the elements of the probability transition matrix of the JMLS, i.e.,  = [11 , 12 , . . . , 1N , 21 , 22 , . . . , 2N , . . . or in a more compact form, 1i, j N ,

(32)

the following constraint set is in effect: N 

ij = 1

for i = 1, . . . , N

and ij 0

(33)

j =1

for 1i, j N . These constraints define N standard Nsimplices. At the end of each parameter update, each updated set of probabilities {ij }N j =1 must be projected onto the standard simplex Si defined by the constraints N 

ij = 1

and ij 0 for j = 1, . . . , N.

log p(K n+1 |) = log(p(yn+1 |xn+1 , rn+1 ) × p(xn+1 |xn , rn+1 )P (rn+1 |rn , )) + log p(Y n , Xn , R n |).

(39)

Defining the log terms L1 and L2 as

. . . . . . , N1 , N2 , . . . , NN ]T

()N(i−1)+j = ij ,

The log-likelihood log p(K n+1 |) is simply

(34)

L1 (yn+1 , xn+1 , xn , rn+1 , rn |)  log(p(yn+1 |xn+1 , rn+1 )p(xn+1 |xn , rn+1 ) × P (rn+1 |rn , )), L2 (Y n , Xn , R n |) log p(Y n , Xn , R n |),

(40) (41)

 n , Y n+1 ) is written as the function Qn+1 (,   n , Y n+1 ) = E1 + E2 , Qn+1 (, 

(42)

where n

 }, E1 E{L1 (yn+1 , xn+1 , xn , rn+1 , rn |)|Y n+1 ,  n

 }. E2 E{L2 (Y n , Xn , R n |)|Y n+1 , 

(43) (44)

j =1

The problem of finding the projection of a vector onto a standard simplex can be formulated as a standard quadratic programming problem and the projection can be found in at most N steps. Throughout the paper, we are going to call the projection operator of a standard N-simplex as P and, with an abuse of notation, we are going to denote the projected probability value ∗ij obtained after the set of updated probabilities {ij }N j =1 is projected onto the corresponding simplex Si with ∗ij = P{ij }.

(35)

A description of the recursive method used for obtaining the projections for our algorithm is given in Appendix A. See, for example, Section 2.6.2 of Poznyak and Najim (1997) for a general solution of the problem of obtaining projections onto a simplex. 3. Derivation of the algorithm In this section, we are going to adapt the RKL procedure described in Section 2 to the problem of estimating the transition probabilities associated with a JMLS. For this purpose, we are  n , Y n+1 ) which is going to maximize the function Qn+1 (,  given as  n , Y n+1 ) = E{log p(K n+1 |)|Y n+1 ,   n }, Qn+1 (, 

(36)

where the likelihood p(K n+1 |) of the complete data set K n+1 for a JMLS has the following expansion: p(K n+1 |)p(Y n+1 , Xn+1 , R n+1 |) = p(yn+1 |xn+1 , rn+1 )p(xn+1 |xn , rn+1 ) × P (rn+1 |rn , )p(Y n , Xn , R n |).

(37) (38)

3.1. Calculation of the expectation E1 The expectation E1 is by definition E1 =

  i

L1 (yn+1 , xn+1 , xn , j, i|)

j

 n ) dxn+1 dxn , (45) × p(xn+1 , xn , rn+1 =j, rn =i|Y n+1 , where L1 can be written as follows: L1 (yn+1 , xn+1 , xn , j, i|) = log p(yn+1 |xn+1 , rn+1 = j ) + log p(xn+1 |xn , rn+1 = j ) + log ij . (46) The integrals in Eq. (45) can be expanded as the sum of three terms as   log(p(yn+1 |xn+1 , rn+1 = j )) E1 = j

 n ) dxn+1 × p(xn+1 , rn+1 = j |Y n+1 ,     + log(p(xn+1 |xn , rn+1 = j )) j

 n ) dxn dxn+1 × p(xn+1 , xn , rn+1 = j |Y n+1 ,    n ). + log(ij )P (rn+1 = j, rn = i|Y n+1 ,  i

(47)

j

In the sequential identification process, we are only interested  n , Y n+1 ) in the derivative of the reward function Qn+1 (,  with respect to the unknown parameters ij which constitute the parameter vector . Only the third term on the right-hand

U. Orguner, M. Demirekler / Automatica 42 (2006) 1735 – 1744

side of Eq. (47) contributes to the related derivative which gives the result,

i

where  n ) Qn (yn+1 , , 

jE1 j  = log(ij ) j j



j

n

×

(48)



jE1 N(i−1)+j jij 1 n) = P (rn+1 = j, rn = i|Y n+1 ,  ij 

(49)

for 1i, j N.

   n−1 , Y n )  jE2  jQn (,  =   j =ˆ n j

=ˆ n

+

The second expectation E2 is by definition E2 =

log(p(Y n , Xn , R n |))

 n ) dX n . × p(X n , R n |Y n+1 , 

(50) n

 ) can be written using The density function p(Xn , R n |Y n+1 ,  Bayes rule as n) p(Xn , R n |Y n+1 ,  n) p(yn+1 |X n , R n , Y n ,  n) = p(X n , R n |Y n ,  n n  p(yn+1 |Y ,  ) p(yn+1 |xn , rn , ˆ n )  n−1 ), = p(X n , R n |Y n ,  n) p(yn+1 |Y n , 

(51) (52)

where Eq. (52) is written by dropping the redundant terms in the given conditions of Eq. (51). Substituting Eq. (52) into Eq. (50), E2 =

Rn

n

n

n−1

 ) dX n × p(Xn , R n |Y n ,    log(p(Y n , Xn , R n |)) = Rn

×



n−1

=ˆ n

≈ 0.

(56)

The assumption in Eq. (56), which is rarely true, is a generally adopted one in practice to generate recursive identification algorithms (Ljung, 1987). Defining the sequence Mn as  j n   , Mn  Qn (yn+1 , ,  ) j =ˆ n

(57)

the partial derivative jE2 /j evaluated at the current parameter estimate ˆ n is approximately given as (58)

(59)

Rn



p(yn+1 |xn , rn , ˆ n ) −1 n) p(yn+1 |Y n , 

 ) dX n × p(X n , R n |Y n ,   n−1 , Y n ) + Qn (yn+1 , ,   n ), = Qn (, 

 , Y n )   

 n )|Y n ,  n} E{Qn (yn+1 , ,    n ) dyn+1  Qn (yn+1 )p(yn+1 |Y n ,   = log(p(Y n , Xn , R n |))

n−1

n

(55)

Remark 1. Note here that,

 ) dX n × p(X n , R n |Y n ,   log(p(Y n , Xn , R n |)) + R

n−1

 jQn (,  j

 jE2  ≈ Mn . j =ˆ n

p(yn+1 |xn , rn , ˆ n ) log(p(Y , X , R |)) n) p(yn+1 |Y n ,  n

  j  n ) Qn (yn+1 , ,   ˆ . j =n

 n−1 , Y n ) is assumed to be maximized by The function Qn (,  the previous iteration and, therefore, the first term on the righthand side of Eq. (55) is approximately zero, i.e.,

Rn





p(yn+1 |xn , rn , ˆ n ) −1 n) p(yn+1 |Y n , 

In the stochastic approximation algorithm, we are interested in the derivative of the expectation E2 evaluated at the current model value ˆ n which is

3.2. Calculation of the expectation E2



log(p(Y n , Xn , R n |))

 n−1 ) dX n . × p(X n , R n |Y n , 

Then, it easily follows that, jE1 j

 Rn

 ). × P (rn+1 = j, rn = i|Y n+1 , 



1739

×

 n ) dyn+1 p(yn+1 |xn , rn , ˆ n ) − p(yn+1 |Y n ,  

(53) (54)

n

n

n

 n−1

× p(X , R |Y , 

=0

) dX n = 0.

(60)

1740

U. Orguner, M. Demirekler / Automatica 42 (2006) 1735 – 1744

Actually, more can be said about the sequence Qn in the following way:

 n , Y n+1 ) evaluThe noisy gradient measurement Gn+1 (,  ated at the current parameter estimate ˆ n is finally given as

 n )|Qn−1 } E{Qn (yn+1 , ,   ⎫ ⎧  ⎬ ⎨   n )|Y n ,   n } Qn−1 = 0 = E E{Qn (yn+1 , ,    ⎭ ⎩

 , Y n+1 ))N(i−1)+j (Gn+1 (ˆ n ,  1 n) = (n) P (rn+1 = j, rn = i|Y n+1 ,  ˆ ij

n

(61)

=0

almost surely where Qn−1 {Q1 , Q2 , . . . , Qn−1 }. The first equality in Eq. (61) can be written due to the fact that Qn−1 is (Y n )-measurable where (Y n ) denotes the algebra generated by Y n . As a result, the random sequence Qn is a martingale difference. Under the assumption that the partial derivative with respect to the unknown parameters  and the integration corresponding to expectation can be interchanged, E{Mn |M n−1 }       j n   n−1  ) =E Qn (yn+1 , ,   M  ˆ  j =n ⎫ ⎧   ⎬ ⎨ j n n  n  n−1   = E E{Qn (yn+1 , ,  )|Y ,  }  M  ⎭ j ⎩ =0

(n) for 1 i, j N where ˆ ij = (ˆ n )N(i−1)+j .

Remark 2. Note here that the partial derivatives given in Eq. (64) are always non-negative. This stems from the fact that the constraints are not considered in the cost function. The constraints, therefore, will play an essential role for the stability of the algorithm. Adding the projection operation related with the constraints to each recursion of the algorithm, the update rule for the tran(n) sition probability estimate ˆ ij is given as   n) P (rn+1 = j, rn = i|Y n+1 ,  (n+1) (n) = P ˆ ij + n ˆ ij (65) (n) ˆ ij for 1 i, j N .

=ˆ n

=0

(64)

a.s.

Remark 3. Defining the updated non-projected parameters (n+1) as ˜ ij

Consequently, the sequence Mn is also a martingale difference.

˜ ij

3.3. RKL recursions

we see that they are always non-negative and satisfy the inequality

(n+1)

Combining Eqs. (49) and (58), the elements of the gradient  n , Y n+1 ) is written as of the reward function Qn+1 (, 

N 



j =1



j  n , Y n+1 ) Qn+1 (,  j =

N(i−1)+j n+1  n

P (rn+1 = j, rn = i|Y ij

, )

+ (Mn )N(i−1)+j

(62)

for 1 i, j N. Thus, the first term on the right-hand side of Eq. (62) can be used as a measurement (in a martingale difference noise) of the left-hand side, i.e.,  n , Y n+1 ))N(i−1)+j (Gn+1 (,  1  n ). = P (rn+1 = j, rn = i|Y n+1 ,  ij

(63)

The martingale difference characteristics of the noise term Mn (see Remark 1) is a nice property which is generally used in proving convergence of this type of algorithms (Kushner & Yin, 1997). Nevertheless, the complexity involved in the algorithm makes it difficult to reach a direct conclusion on the algorithm’s convergence using the existing results. We are therefore going to address the convergence issue in this document only through simulation results.

(n)

ˆ ij + n

(n+1)

˜ ij

1

1 (n) ˆ ij

n

 ), P (rn+1 = j, rn = i|Y n+1 , 

for i = 1, 2, . . . , N.

(66)

We give a simple method of projection for the set of non(n+1) negative quantities {˜ ij }N j =1 satisfying this inequality in Appendix A. 3.4. Calculation of the probabilities In order to be able to calculate the probabilities P (rn+1 =  n ), we are going to assume that an online j, rn = i|Y n+1 ,  conventional state estimator like IMM or GPB algorithm which  n of the RKL uses the probability transition matrix estimates  algorithm is also run on the measurements Y n . The conventional state estimator is required to supply the mode-conditioned state estimates n

i  } for i = 1 . . . N, E{xn |Y n , rn = i,  xˆn|n

(67)

covariances i i i n} E{(xn − xˆn|n )(xn − xˆn|n )T |Y n , rn = i,  Pn|n

(68)

for i = 1 . . . N, and the mode probabilities n

 }. i (n) = P {rn = i|Y n , 

(69)

U. Orguner, M. Demirekler / Automatica 42 (2006) 1735 – 1744

1741

Using these quantities, we can calculate the required probabilities approximately as

4. Simulation results

n) P (rn+1 = j, rn = i|Y n+1 ,  n) p(yn+1 |rn+1 = j, rn = i, Y n ,  = n) p(yn+1 |Y n ,   n )P (rn = i|Y n ,  n) × P (rn+1 = j |rn = i, 

This section evaluates the performance of the RKL method by means of computer simulations. For this purpose, we are going to use two examples which illustrate the capabilities of the RKL algorithm we have derived in the previous section.



i ,Pi ) p(yn+1 |rn+1 = j, xˆn|n n|n

4.1. Example 1

n) p(yn+1 |Y n ,   n )P (rn = i|Y n ,  n) × P (rn+1 = j |rn = i,  (70)

In order to evaluate the performance of our method, in this subsection, we are going to use the same example and methodology as in Jilkov and Li (2004) which is related with a system with failures. Consider the scalar dynamical system given by

(71)

xk+1 = xk + wk ,

(75)

yk = (rk − 1)xk + (100 − 90(rk − 1))vk ,

(76)

(n)

=

i , P i ) p(yn+1 |rn+1 = j, xˆn|n n|n ˆ ij i (n)

n) p(yn+1 |Y n ,  ij

ij

(n)

N(yn+1 ; yˆn+1|n , Sn+1|n )ˆ ij i (n) =  , ij ij (n) ˆ ij i (n) j i N(yn+1 ; yˆn+1|n , Sn+1|n ) where ij

i , yˆn+1|n C(j )A(j )xˆn|n

(72)

ij

i Sn+1|n C(j )(A(j )Pn|n AT (j ) + B(j )Qk B(j )T )C T (j )

+ D(j )Rk D(j )T .

(73)

Substituting Eq. (71) into Eq. (65), the update rule for the transition probabilities becomes ⎧ ⎨ (n+1) (n) = P ˆ ij ˆ ij ⎩ ⎫ ij ij ⎬ N(yn+1 ; yˆn+1|n , Sn+1|n )i (n) +n   , ij ij (n) ⎭ N(y ; y ˆ , S )  ˆ  (n) n+1 i j i n+1|n n+1|n ij where the projection operator P is described in Section 2.2. 3.5. Selection of the step-size sequence n The choice of the step-size sequence n is an important issue for obtaining a satisfactory performance with this type of algorithms. The sufficient conditions on the step-sizes for convergence are ∞  n=0

n = ∞

and

∞  n=0

2n < ∞

(74)

which guarantee a decrease in the value of step-size n which is fast enough to allow sufficient averaging of the noisy gradient measurements and slow enough to avoid premature convergence of the algorithm. In the cases where one needs the algorithm to track small changes in the parameter values, constant and sufficiently small step-sizes are used (Benveniste et al., 1990; Kushner & Yin, 1997). Some adaptive step-size sequence selection mechanisms exist in the literature (Benveniste et al., 1990) but we are going to choose constant step-sizes during the simulations for the sake of simplicity.

where x0 ∼ N(x0 ; 0, 202 ), wk ∼ N(wk ; 0, 22 ), vk ∼ N(vk ; 0, 1) with x0 , wk , and vk being mutually independent for k = 1, 2, . . . . The model sequence rk ∈ {1, 2} is a first-order, two-state, homogeneous Markov process with probability transition matrix  = [ij ] given as 

0.6 = 0.85

 0.4 . 0.15

(77)

Note that this system corresponds to a system with frequent measurement failures with the modal-state rk = 1 corresponding to the case of the failure. Three IMM algorithms were implemented to estimate the state of this system: • The first IMM algorithm is an exact one which uses the true probability transition matrix of the system given in Eq. (77). • The second IMM algorithm is a non-adaptive one which uses a typical probability transition matrix with 11 = 22 = 0.9. • The third IMM algorithm is an adaptive one which estimates the probability transition matrix of the underlying Markov chain using the RKL method with the constant step-size sequence n = 0.02. The algorithms use equal initial mode-probabilities (i.e., i (0)= 0.5 for i = 1, 2) and therefore assume no knowledge of the initial state distribution for the finite-state Markov chain. In order to obtain ensemble averages, 1000 Monte Carlo runs are performed. In each run, • The underlying Markov chain is simulated using the true probability transition matrix given in Eq. (77). The true mode-states are obtained. The initial state distribution of the Markov chain is randomly created assuming that all possible state distributions are equally likely. • The true base-state of the system described by Eq. (75) is simulated using an artificially generated process noise sequence {wk }.

1742

U. Orguner, M. Demirekler / Automatica 42 (2006) 1735 – 1744

1

0.8 π11 π22

0.7

0.9 0.8

0.6

π11 π22 π33 π12 π21 π32

0.7 0.6

0.5

0.5 0.4

0.4

0.3

0.3 0.2

0.2 0.1

0.1 0 900 1000

1000 2000 3000 4000 5000 6000 7000 8000 9000 10000

Fig. 1. The average transition probability estimation performance of the recursive Kullback–Leibler method.

Fig. 3. The average transition probability estimation performance of the recursive Kullback–Leibler method.

0

100

200

300

400

500

600

700

800

16 Exact Non-Adaptive RKL Adaptive

14

12

1.1

10

1

8

0.9

6

0.8

4

0.7

0

100

200

300

400

500

600

700

800

Exact Non-Adaptive RKL Adaptive

1.2

900 1000

Fig. 2. The mean-absolute base-state estimation errors of the IMM algorithms.

• The measurement sequence of the system described by Eq. (76) is obtained using the true mode-state sequence, base-state sequence, and artificially generated measurement noise sequence {vk }. • The IMM algorithms are executed using the measurement sequence. • The absolute base-state estimation errors of the algorithms are calculated. The average estimation performance of the RKL method is shown in Fig. 1. Note that there exist very small biases in the estimates like the case in Jilkov and Li (2004). These seem to be due to the approximations in the derivation process of the algorithm and seem to be acceptable. The mean-absolute basestate estimation errors of the three IMM filters are shown in Fig. 2. The errors of the exact IMM are at the minimum level

0

1000 2000 3000 4000 5000 6000 7000 8000 9000 10000

Fig. 4. Mean-absolute base-state estimation errors of the IMM algorithms.

as expected. The adaptive algorithm fed by the RKL estimation process appears to beat the non-adaptive IMM filter significantly. 4.2. Example 2 As a second example, a hypothetical three-model scalar JMLS (i.e., rk ∈ {1, 2, 3}) is used. The parameters of the system are given as • • • • •

A(1) = 0.8, A(2) = 0.9, A(3) = 1, B(i) = 1 and D(i) = 1 for i = 1, 2, 3, C(1) = 1, C(2) = 2, C(3) = 4, x0 ∼ N(x0 ; 0, 22 ), wk ∼ N(wk ; 0, 22 ), vk ∼ N(vk ; 0, 1),

U. Orguner, M. Demirekler / Automatica 42 (2006) 1735 – 1744

and the true transition probability associated with the mode sequence rk is taken as 

0.2 0.4  = 0.25 0.5 0.1 0.1

 0.4 0.25 . 0.8

(78)

The three IMM algorithms which have the same features as in the first example are executed. The constant probability transition matrix of the non-adaptive IMM filter is selected as 

0.9 Non-adaptive = 0.05 0.05

 0.05 0.05 0.9 0.05 . 0.05 0.9

(79)

 0 for the The initial probability transition matrix estimate  RKL algorithm of the adaptive IMM algorithm is taken as 

0.33  0 = 0.33  0.33

0.33 0.33 0.33

0.34 0.34 0.34

5. Conclusions A new method for the estimation of the transition probabilities of the JMLSs is given. Although the derivation of the method is quite complicated, the resulting recursions are easy to implement using the outputs of an online multiple-model state estimation algorithm like IMM or GPB approaches. The simulation results show that the algorithm can be an efficient alternative to those described in Jilkov and Li (2004). Appendix A. Recursive projection algorithm used in the simulations The following operation finds the projection x¯ of the N vector x = [x1 x2 · · · xN ]T which satisfies the inequalities (see Remark 3): xi 1

and xi 0 for i = 1, 2, . . . , N

(A.1)

i=1

onto the standard N-simplex defined by its elements in at most N recursions: x¯ = Project(x, N ),

function p=Project(x, N )  =( N i=1 xi − 1)/N if  mini xi p = x − 1¯ N else j = arg mini xi y = [x((1 : (j − 1)))T x((j + 1) : N )T ]T m=Project(y, N − 1) p(1 : (j − 1)) = m(1 : (j − 1)) p(j ) = 0 p((j + 1) : N ) = m(j : (N − 1)) end. Here, 1¯ N = [1 1 · · · 1]T denotes the N-vector composed of ones and the notation x(i : j ) stands for 

[xi · · · xj ]T [.]T

for j i, otherwise,

(A.3)

(80)

and the step-size sequence of the RKL algorithm is set to n = 0.002. The transition matrix estimation performance of the RKL algorithm is illustrated in Fig. 3. Small biases again exist in the estimates but the convergence trend is evident. The meanabsolute base-state estimation errors of the three IMM filters are shown in Fig. 4. The errors of the adaptive algorithm fed by the RKL estimation process are very close to those of the exact IMM after a transient period.

N 

where the recursive function Project(., .) is defined as follows:

x(i : j )



1743

(A.2)

where the notation [.] denotes an empty vector. The proof of the fact that this operation obtains the projection of x onto the standard N-simplex can be made easily by formulating the problem as a standard quadratic programming problem and is not given here. See Poznyak and Najim (1997) for a general solution which does not require the conditions in Eq. (A.1).

References Ackerson, G. A., & Fu, K. S. (1970). On state estimation in switching environments. IEEE Transactions on Automatic Control, 15(1), 10–17. Bar-Shalom, Y., & Li, X. R. (1993). Estimation and tracking: Principles, techniques, and software. Norwell, MA: Artech House. Benveniste, A., Métivier, M., & Priouret, P. (1990). Adaptive algorithms and stochastic approximations. Berlin, Germany: Springer. Blom, H. A. P., & Bar-Shalom, Y. (1988). The interacting multiple model algorithm for systems with Markov switching coefficients. IEEE Transactions on Automatic Control, 33(8), 780–783. Boers, Y., & Driessen, J. N. (2003). Interacting multiple model particle filter. IEE Proceedings—Radar Sonar and Navigation, 150(5), 344–349. Collings, I. B., Krishnamurthy, V., & Moore, J. B. (1994). Online identification of hidden Markov models via recursive prediction error techniques. IEEE Transactions on Signal Processing, 42(12), 3535–3539. Costa, O. L. V. (1994). Linear minimum mean square error estimation for discrete-time Markovian jump linear systems. IEEE Transactions on Automatic Control, 39(8), 1685–1689. Cover, T. M., & Thomas, J. A. (1991). Elements of information theory. New York: Wiley. Doucet, A., Gordon, N. J., & Krishnamurthy, V. (2001). Particle filters for state estimation of jump Markov linear systems. IEEE Transactions on Signal Processing, 49(3), 613–624. Doucet, A., Logothetis, A., & Krishnamurthy, V. (2000). Stochastic sampling algorithms for state estimation of jump Markov linear systems. IEEE Transactions on Signal Processing, 45(1), 188–202. Ford, J. J., & Moore, J. B. (1998). Adaptive estimation of HMM transition probabilities. IEEE Transactions on Signal Processing, 46(5), 1374–1385. Jilkov, V. P., & Li, X. R. (2004). Online Bayesian estimation of transition probabilities for Markovian jump systems. IEEE Transactions on Signal Processing, 52(6), 1620–1630.

1744

U. Orguner, M. Demirekler / Automatica 42 (2006) 1735 – 1744

Jilkov, V. P., Li, X. R., & Angelova, D. S., (2003). Estimation of Markovian jump systems with unknown transition probabilities through Bayesian sampling. In NMA ’02: Revised papers from the fifth international conference on numerical methods and applications, London, UK, Berlin: Springer, pp. 307–315. Johnston, L. A., & Krishnamurthy, V. (2001). An improvement to the interacting multiple model (IMM) algorithm. IEEE Transactions on Signal Processing, 49(12), 2909–2923. Krishnamurthy, V., & Moore, J. B. (1993). Online estimation of hidden Markov model parameters based on the Kullback–Leibler information measure. IEEE Transactions on Signal Processing, 41(8), 2557–2573. Krishnamurthy, V., & Yin, G. G. (2002). Recursive algorithms for estimation of hidden Markov models and autoregressive models with Markov regime. IEEE Transactions on Information Theory, 48(2), 458–476. Kushner, H. J., & Yin, G. G. (1997). Stochastic approximation algorithms and applications. New York: Springer. Ljung, L. (1987). System identification: Theory for the user. Englewood Cliffs, NJ: Prentice-Hall. Logothetis, A., & Krishnamurthy, V. (1999). Expectation maximization algorithms for MAP estimation of jump Markov linear systems. IEEE Transactions on Signal Processing, 47(8), 2139–2156. Mazor, E., Averbuch, A., Bar-Shalom, Y., & Dayan, J. (1998). Interacting multiple model methods in target tracking: A survey. IEEE Transactions on Aerospace and Electronic Systems, 34(1), 103–123. Poznyak, A. S., & Najim, K. (1997). Learning automata and stochastic optimization. New York, Secaucus, NJ, USA: Springer. Robbins, H., & Monro, S. (1951). A stochastic approximation method. Annals of Mathematical Statistics, 22, 400–407. Stiller, J. C., & Radons, G. (1999). Online estimation of hidden Markov models. IEEE Signal Processing Letters, 6(8), 213–215.

Tugnait, J. K. (1982). Adaptive estimation and identification for discrete systems with Markov jump parameters. IEEE Transactions on Automatic Control, 27(5), 1054–1065. Weinstein, E., Feder, M., & Oppenheim, A. V. (1990). Sequential algorithms for parameter estimation based on the Kullback–Leibler information measure. IEEE Transactions on Acoustics, Speech and Signal Processing, 38(9), 1652–1654. Zhang, Q. (1999). Optimal filtering of discrete-time hybrid systems. Journal of Optimization Theory and Applications, 100(1), 123–144. Umut Orguner received B.S. and M.S. degrees in electrical engineering from Middle East Technical University (Ankara, Turkey) in 1999 and 2002, respectively. He is currently a teaching assistant and a Ph.D. candidate in Department of Electrical and Electronics Engineering of the same university. His research interests include estimation theory, multiple-model estimation, target tracking, and information fusion.

Mübeccel Demirekler received her Ph.D. degree in electrical engineering from Middle East Technical University (Ankara, Turkey) in 1979. She is currently a professor in Department of Electrical and Electronics Engineering of the same university. Her research interests are target tracking systems, on which she conducted several projects, information fusion, and speech processing.