Proceedings of the 18th World Congress of the International Federation of Automatic Control, Milano (Italy), August 28 - September 2, 2011
An Information-Theoretic Approach to Distributed State Estimation

Giorgio Battistelli, Luigi Chisci, Stefano Morrocchi, Francesco Papi
Dipartimento di Sistemi e Informatica (DSI), Università di Firenze, Via S. Marta 3, 50139 Firenze, Italy (e-mail: {giorgio.battistelli, luigi.chisci}@unifi.it).
Abstract: It is shown that the covariance intersection fusion rule, widely used in the context of distributed estimation, has a nice information-theoretic interpretation in terms of consensus on the Kullback-Leibler average of Gaussian probability density functions (PDFs). Based on this observation, a novel distributed state estimator based on the consensus among local posterior PDFs is proposed and its stability properties are analyzed.

Keywords: Distributed state estimation; sensor fusion; Kalman filters; networks; information theory.

1. INTRODUCTION

A challenging research topic is to develop efficient techniques for distributed information fusion in multi-agent systems, i.e. networks consisting of multiple nodes (agents) with local data acquisition, processing and communication capabilities. Such networks are widely employed in modern automation and supervision systems in both industrial and other contexts (e.g., environmental monitoring) and pose interesting problems that need a thorough investigation from both a theoretical and a practical point of view.

A still open issue is how to effectively counteract the deleterious effects of "data incest" that occur even in the simple case in which multiple inter-communicating agents, without any centralized coordination and according to an unknown and possibly time-varying network topology, try to estimate the state of the same linear dynamical process. In fact, in such a case, due to the correlation among the estimates of different agents, a remarkable performance degradation with respect to the centralized Kalman estimator is commonly observed or, if the target process is unstable, the estimation error may even diverge. A key objective to be pursued is, therefore, to devise, with reference to the state estimation of a linear process, distributed state estimation algorithms (to be implemented in each agent) that guarantee for each agent, under the assumptions of process observability from the whole network (but not necessarily from the individual agents) and network connectivity, a bounded estimation error as close as possible to the one obtained with the optimal centralized estimator.

Among the approaches devised to counteract the data incest phenomenon, special attention is deserved by the consensus filters (Xiao et al., 2005; Olfati-Saber et al., 2007) and by the covariance intersection fusion rule (Julier and Uhlmann, 1997; Chen et al., 2002). Consensus filters provide a general tool for distributed averaging; in the context of distributed state estimation, they have been exploited for averaging innovations (Olfati-Saber, 2007; Kamgarpour and Tomlin, 2008) or state estimates
(Alriksson and Rantzer, 2006; Carli et al., 2008; Olfati-Saber, 2009; Stankovic et al., 2009). The covariance intersection fusion rule adopts an information filtering approach, propagating the information (inverse covariance) matrix and the information vector, defined via premultiplication of the state estimate by the information matrix, and computes the fused information pair as a convex combination of the local information pairs. Other relevant approaches to distributed state estimation include those based on moving-horizon estimation (Farina et al., 2010) and diffusion processes (Cattivelli and Sayed, 2010).

In this paper, we follow an information-theoretic approach by formulating a consensus problem among the probability density functions (PDFs) of the state vector to be estimated and by defining the average PDF as the one that minimizes the sum of the information gains from the initial PDFs. In particular, it is shown that, in the linear Gaussian case and for a single consensus step, this yields exactly the covariance intersection fusion rule which, therefore, can be interpreted as a single-step consensus on local PDFs. This observation also suggests how multi-step consensus can be exploited to possibly improve the performance of the covariance intersection fusion. Another contribution of this paper is to show that, under weak network observability and connectivity assumptions and for any number of consensus steps, distributed state estimation with consensus on the posterior PDFs guarantees a bounded estimation error covariance in all network nodes. All the proofs are omitted due to space constraints.

2. CONSENSUS ON THE KULLBACK-LEIBLER AVERAGE

Consider a network consisting of $N$ nodes. Using graph formalism, the communication structure among the nodes is represented by a directed graph $\mathcal{G} = (\mathcal{N}, \mathcal{E})$ where $\mathcal{N} := \{1, 2, \ldots, N\}$ is the set of nodes and $\mathcal{E} \subseteq \mathcal{N} \times \mathcal{N}$ is the set of links. In particular, it is supposed that $(j, i)$ belongs to $\mathcal{E}$ if and only if node $j$ can communicate with
node $i$ (by definition $(i, i) \in \mathcal{E}$ for any $i \in \mathcal{N}$). Further, let $\mathcal{N}^i = \{j \in \mathcal{N} : (j, i) \in \mathcal{E}\}$ denote the set of neighbors of node $i$ (note that, by definition, $i$ always belongs to $\mathcal{N}^i$). Finally, let $|\mathcal{N}^i|$ be the cardinality of $\mathcal{N}^i$ (i.e., the in-degree of node $i$ plus one).

Suppose that, in each node $i$ of the network, a PDF $p^i(\cdot)$ is available representing the local information on some random vector $x \in \mathbb{R}^{n_x}$ of interest. For example, such PDFs can be obtained via statistical inference or can be the result of some recursive Bayesian estimation algorithm (as will be discussed in the following sections). Here, it is supposed that all the local PDFs belong to the same parametric family
$$\mathcal{P} = \{p(\cdot) = g(\cdot; \theta), \ \theta \in \Theta \subset \mathbb{R}^{n_\theta}\}$$
where $g$ is a given function and $\theta$ a parameter vector completely characterizing the PDF. In other words, each local PDF can be written as $p^i(\cdot) = g(\cdot; \theta^i)$ for some vector $\theta^i \in \Theta$.

The aim of this section is to investigate whether it is possible to devise a suitable consensus algorithm guaranteeing that all the nodes of the network reach an agreement regarding the PDF of $x$. Here, by consensus algorithm we mean a distributed interaction rule specifying: i) the information exchange between a node and its neighbors; ii) the mechanism for updating the local PDF on the basis of the received information. In particular, the focus is on consensus algorithms that preserve the shape of the PDFs so that all the updated local PDFs always belong to the parametric family $\mathcal{P}$. To this end, let $p^i_{[\ell]}(\cdot) \in \mathcal{P}$ denote the PDF available at node $i$ at the $\ell$-th iteration of the consensus algorithm (where $p^i_{[0]}(\cdot) = p^i(\cdot)$) and let $\theta^i_{[\ell]}$ be the corresponding parameter vector. Then, the following average consensus problem is addressed.

Problem 1. Find a consensus algorithm such that
$$\lim_{\ell \to \infty} \theta^i_{[\ell]} = \bar{\theta} \,, \quad \forall i \in \mathcal{N} \,,$$
where the asymptotic PDF $\bar{p}(\cdot) = g(\cdot; \bar{\theta})$ represents the average (in some sense) of the initial PDFs $p^i(\cdot) = g(\cdot; \theta^i)$, $i \in \mathcal{N}$.

Clearly, the first important issue to be addressed is how to define the average PDF $\bar{p}(\cdot) = g(\cdot; \bar{\theta})$ given the initial PDFs $p^i(\cdot) = g(\cdot; \theta^i)$, $i \in \mathcal{N}$. To this end, it is convenient to recall that, given a set of points $\{x^1, \ldots, x^N\}$ belonging to $\mathbb{R}^n$, their average $\bar{x} = \frac{1}{N} \sum_{i=1}^N x^i$ also satisfies the variational property of minimizing the mean squared distance from such points, i.e.,
$$\bar{x} = \arg\min_{x \in \mathbb{R}^n} \frac{1}{N} \sum_{i=1}^N \|x - x^i\|^2 \qquad (1)$$
where $\|\cdot\|$ denotes the Euclidean norm. As is well known, there exist many ways of measuring the distance between two PDFs $p^i(\cdot)$ and $p^j(\cdot)$. From the information-theoretic point of view, the most typical choice corresponds to the Kullback-Leibler divergence (KLD), or relative entropy, defined as
$$D_{KL}(p^i \| p^j) = \int p^i(x) \log \frac{p^i(x)}{p^j(x)} \, dx \,.$$
The KLD has many meaningful interpretations; for instance, in Bayesian statistics, it can be seen as the
information gain achieved when moving from a prior PDF $p^j(\cdot)$ to a posterior PDF $p^i(\cdot)$. It is also worth noting that the KLD can be seen as the equivalent of the squared Euclidean distance in the space of the PDFs; in fact, they are both examples of Bregman divergences (Banerjee et al., 2005). In this connection, taking into account (1), it seems quite natural to introduce the following notion.

Definition 1. The Kullback-Leibler average (KLA) of $N$ PDFs $p^i(\cdot)$ belonging to the parametric family $\mathcal{P}$ is the PDF
$$\bar{p} = \arg\inf_{p \in \mathcal{P}} \frac{1}{N} \sum_{i=1}^N D_{KL}(p \| p^i) \,.$$

According to Definition 1, the average PDF is the one that minimizes the sum of the information gains from the initial PDFs. Thus, this choice is coherent with the Principle of Minimum Discrimination Information (PMDI), according to which the PDF which best represents the current state of knowledge is the one which produces an information gain as small as possible (the interested reader is referred to (Campbell, 1970; Akaike, 1973) for a discussion on such a principle and its relation with Gauss' principle and maximum likelihood estimation) or, in other words, "the probability assignment which most honestly describes what we know should be the most conservative assignment in the sense that it does not permit one to draw any conclusions not warranted by the data" (Jaynes, 2003).

In the remainder of this section, for the sake of brevity, the attention will be restricted to a particular parametric family of PDFs, namely, normal distributions. However, similar considerations could be made also for all those parametric families for which the KLD can be computed in closed form (such families include those belonging to the exponential class of density functions, like the exponential distribution, the binomial distribution, etc.). Let now all the local PDFs $p^i(\cdot)$ take the form
$$p^i(x) = \Phi(x; \mu^i, \Sigma^i) = \frac{1}{\sqrt{(2\pi)^{n_x} \det(\Sigma^i)}} \; e^{-\frac{1}{2}(x - \mu^i)^\top (\Sigma^i)^{-1} (x - \mu^i)} \qquad (2)$$
where $\mu^i \in \mathbb{R}^n$ is the mean and $\Sigma^i \in \mathbb{R}^{n \times n}$ is the (positive definite) covariance matrix. Then, the following result can be stated.

Lemma 1. The KLA of $N$ Gaussian PDFs as in (2) takes the form $\bar{p}(\cdot) = \Phi(\cdot; \bar{\mu}, \bar{\Sigma})$ where the mean $\bar{\mu}$ and the covariance matrix $\bar{\Sigma}$ can be computed from the algebraic equations
$$\bar{\Sigma}^{-1} = \frac{1}{N} \sum_{i=1}^N (\Sigma^i)^{-1} \,, \qquad (3)$$
$$\bar{\Sigma}^{-1} \bar{\mu} = \frac{1}{N} \sum_{i=1}^N (\Sigma^i)^{-1} \mu^i \,. \qquad (4)$$

Recalling that $\Omega^i := (\Sigma^i)^{-1}$ and $q^i := (\Sigma^i)^{-1} \mu^i$ are the information matrix and information vector, respectively, associated with the Gaussian PDF (2), Lemma 1 states that the KLA of $N$ Gaussian PDFs can be simply obtained
by averaging their information matrices and information vectors. An important consequence of this state of affairs is that Problem 1 can be readily solved by applying one of the many existing consensus algorithms for distributed averaging of real numbers. In this connection, the simplest distributed averaging algorithms consist of updating the local data via convex combination with the data received from the neighbors (Xiao et al., 2005; Olfati-Saber et al., 2007). In this case, such a choice gives rise to the following consensus algorithm:
$$\Omega^i_{[\ell+1]} = \sum_{j \in \mathcal{N}^i} \pi^{i,j} \, \Omega^j_{[\ell]} \,, \quad i \in \mathcal{N} \,, \ \ell = 0, 1, \ldots \qquad (5)$$
$$q^i_{[\ell+1]} = \sum_{j \in \mathcal{N}^i} \pi^{i,j} \, q^j_{[\ell]} \,, \quad i \in \mathcal{N} \,, \ \ell = 0, 1, \ldots \qquad (6)$$
where $\pi^{i,j}$ are suitable non-negative weights such that $\sum_{j \in \mathcal{N}^i} \pi^{i,j} = 1$, $\forall i \in \mathcal{N}$. Clearly, the recursion is initialized by letting $\Omega^i_{[0]} = \Omega^i$ and $q^i_{[0]} = q^i$ for all $i \in \mathcal{N}$. Along the lines of Lemma 1, it can be seen that the update rules (5) and (6) correspond to computing a weighted KLA between the local PDF and the PDFs of the neighbors.

Let us now denote by $\Pi$ the consensus matrix whose generic $(i,j)$-element coincides with the consensus weight $\pi^{i,j}$ (if $j \notin \mathcal{N}^i$ then $\pi^{i,j}$ is taken as 0). Then, the following proposition derives from well-known results on distributed averaging (Xiao et al., 2005; Olfati-Saber et al., 2007).

Proposition 1. Let the consensus matrix $\Pi$ be primitive and doubly stochastic (recall that a non-negative square matrix $\Pi$ is doubly stochastic if all its rows and columns sum up to 1, and primitive if there exists an integer $m$ such that all the elements of $\Pi^m$ are strictly positive). Then, the consensus algorithm (5)-(6) asymptotically yields the KLA of the initial local PDFs in that
$$\lim_{\ell \to \infty} \Omega^i_{[\ell]} = \bar{\Omega} := \bar{\Sigma}^{-1} \,, \qquad \lim_{\ell \to \infty} q^i_{[\ell]} = \bar{q} := \bar{\Sigma}^{-1} \bar{\mu} \,.$$
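As an illustration of Lemma 1 and Proposition 1, the following Python sketch (ours, not part of the paper; the four-node ring topology and the hand-picked doubly stochastic weights are only assumptions) iterates (5)-(6) on randomly generated Gaussian information pairs and checks that they converge to the KLA given by Lemma 1.

```python
import numpy as np

# Sketch: consensus iterations (5)-(6) on the information pairs of N Gaussian PDFs,
# compared with the direct KLA of Lemma 1 (illustrative assumptions only).
rng = np.random.default_rng(0)
N, nx = 4, 2

# Random local Gaussian PDFs, parameterized by information pairs (Omega^i, q^i).
Sigmas = [np.diag(rng.uniform(0.5, 2.0, nx)) for _ in range(N)]
mus = [rng.normal(size=nx) for _ in range(N)]
Omega = [np.linalg.inv(S) for S in Sigmas]          # information matrices
q = [Om @ mu for Om, mu in zip(Omega, mus)]         # information vectors

# A doubly stochastic, primitive consensus matrix for a 4-node ring
# (each node averages itself with its two ring neighbors).
Pi = np.array([[0.5, 0.25, 0.0, 0.25],
               [0.25, 0.5, 0.25, 0.0],
               [0.0, 0.25, 0.5, 0.25],
               [0.25, 0.0, 0.25, 0.5]])

for _ in range(50):                                  # consensus iterations (5)-(6)
    Omega = [sum(Pi[i, j] * Omega[j] for j in range(N)) for i in range(N)]
    q = [sum(Pi[i, j] * q[j] for j in range(N)) for i in range(N)]

# Direct KLA (Lemma 1): plain average of the initial information pairs.
Omega_bar = sum(np.linalg.inv(S) for S in Sigmas) / N
q_bar = sum(np.linalg.inv(S) @ mu for S, mu in zip(Sigmas, mus)) / N
print(np.allclose(Omega[0], Omega_bar), np.allclose(q[0], q_bar))   # True True
```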
As is well known (Calafiore and Abrate, 2009), a necessary condition for the matrix $\Pi$ to be primitive is that the graph $\mathcal{G}$ associated with the sensor network be (strongly) connected. In this case, a possible choice satisfying the hypotheses of Proposition 1 is given by the so-called Metropolis weights (Xiao et al., 2005; Calafiore and Abrate, 2009)
$$\pi^{i,j} = \frac{1}{\max\{|\mathcal{N}^i|, |\mathcal{N}^j|\}} \,, \quad i \in \mathcal{N} \,, \ j \in \mathcal{N}^i \,, \ i \neq j \,,$$
$$\pi^{i,i} = 1 - \sum_{j \in \mathcal{N}^i, \, j \neq i} \pi^{i,j} \,.$$
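A minimal sketch of how the Metropolis weights might be computed from the neighbor sets is given below (ours; the three-node line network is only an illustrative assumption).

```python
import numpy as np

# Sketch (assumed interface, not from the paper): Metropolis consensus weights
# built from the neighbor sets N^i, each of which includes the node itself.
def metropolis_weights(neighbors):
    """neighbors[i] is the set of in-neighbors of node i, with i included."""
    N = len(neighbors)
    Pi = np.zeros((N, N))
    for i in range(N):
        for j in neighbors[i]:
            if j != i:
                Pi[i, j] = 1.0 / max(len(neighbors[i]), len(neighbors[j]))
        Pi[i, i] = 1.0 - Pi[i].sum()   # pi^{i,i} = 1 - sum of off-diagonal weights
    return Pi

# Example: a line network 0 -- 1 -- 2 (undirected links, self-loops included).
neighbors = [{0, 1}, {0, 1, 2}, {1, 2}]
Pi = metropolis_weights(neighbors)
# For an undirected connected graph these weights are doubly stochastic and primitive.
print(Pi, Pi.sum(axis=0), Pi.sum(axis=1))
```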
3. APPLICATION TO DISTRIBUTED STATE ESTIMATION

In this section, the developments of the previous section are applied to distributed state estimation. To this end, consider a discrete-time dynamical system
$$x_{k+1} = f(x_k) + w_k$$
where $k$ is the discrete time index, $x_k \in \mathbb{R}^{n_x}$ is the system state and $w_k \in \mathbb{R}^{n_x}$ is the process noise. The initial state $x_0$ of the system is unknown but distributed according to a known PDF $p_{0|-1}(\cdot)$. It is supposed that the sequence $\{w_k\}$ is generated by a zero-mean white stochastic process with known PDF $p_w(\cdot)$. Further, suppose that, at each time $k = 0, 1, \ldots$, each node of the network collects a measurement
$$y^i_k = h^i(x_k) + v^i_k$$
of the state vector $x_k$, where $v^i_k \in \mathbb{R}^{n_y}$ is generated by a zero-mean white stochastic process with known PDF $p_{v^i}(\cdot)$ (the initial state $x_0$ and the sequences $\{w_k\}$ and $\{v^i_k\}$ are supposed to be mutually independent). If no communication between the nodes of the network were possible, the solution of the local state estimation problem would yield the well-known Bayesian filtering recursion:
$$p^i_{0|-1}(x) = p_{0|-1}(x) \,, \qquad (7)$$
$$p^i_{k|k}(x) = \frac{p_{v^i}[y^i_k - h^i(x)] \, p^i_{k|k-1}(x)}{\int p_{v^i}[y^i_k - h^i(\xi)] \, p^i_{k|k-1}(\xi) \, d\xi} \,, \qquad (8)$$
$$p^i_{k+1|k}(x) = \int p_w[x - f(\xi)] \, p^i_{k|k}(\xi) \, d\xi \,, \qquad (9)$$
for $k = 0, 1, \ldots$, where $p^i_{k|t}(\cdot)$ represents the PDF of $x_k$ conditioned on all the measurements collected at node $i$ up to time $t$. If, however, a communication structure is available as described in the previous section, then each node can improve its local estimate by fusing the local information with the one received from its neighbors. More specifically, if one constrains all the local PDFs to belong to a parametric family $\mathcal{P}$, then one can perform at each time instant a certain number, say $L$, of consensus steps on the posterior PDFs $p^i_{k|k}(\cdot)$, $i \in \mathcal{N}$, in order to compute in a distributed fashion their KLA. (Notice that in some cases of interest, like the linear-Gaussian one, such an assumption is automatically satisfied. Further, even in a nonlinear and/or non-Gaussian setting, since it is usually not possible to compute the true conditional PDF in closed form, it is quite common to approximate it with a PDF belonging to some fixed parametric family $\mathcal{P}$; for example, when the Extended Kalman Filter (EKF) or the Unscented Kalman Filter (UKF) is used, all the PDFs are approximated as Gaussian.) This idea leads to the following Distributed State Estimation Algorithm with Consensus on the Posteriors (DSE-CP).

Algorithm 1. At each time $k = 0, 1, \ldots$, for each node $i \in \mathcal{N}$:
(1) collect the local measurement $y^i_k$ and update the local PDF $p^i_{k|k-1}(\cdot)$ via equation (8) to obtain the local posterior $p^i_{k|k,[0]}$;
(2) perform $L$ steps of consensus on the KLA to obtain the fused posterior $p^i_{k|k,[L]}$;
(3) compute the local prior $p^i_{k+1|k}$ from $p^i_{k|k} := p^i_{k|k,[L]}$ via equation (9).

Remark 1. The rationale for combining the local posteriors according to the KLA paradigm is to counteract the deleterious effects of the so-called data incest (Smith and Singh, 2006; Chang et al., 2008). In fact, it is well known that, when multiple inter-communicating agents try to estimate the state of the same dynamical process without any centralized coordination, a remarkable performance degradation with respect to the centralized estimator (or even divergence of the estimation error) can occur due to the possible correlation among the estimates of different agents and the consequent double-counting of information. As discussed in the previous section, an information fusion rule based on the KLA paradigm is expected to avoid such double-counting since it follows the PMDI (this intuition will be formalized in Section 4 for the linear-Gaussian case).
3.1 The linear-Gaussian case: Covariance Intersection revisited

Suppose now that both the system dynamics and the measurement equations are linear, i.e., consider a discrete-time linear system
$$x_{k+1} = A x_k + w_k \qquad (10)$$
whose state is measured by $N$ linear sensors
$$y^i_k = C^i x_k + v^i_k \,, \quad i = 1, \ldots, N \,. \qquad (11)$$
Further, let the initial state, the process disturbance and all the measurement noises be normally distributed, i.e.,
$$p_0(x) = \Phi(x; \hat{x}_{0|-1}, P_{0|-1}) \,, \quad p_w(w) = \Phi(w; 0, Q) \,, \quad p_{v^i}(v^i) = \Phi(v^i; 0, R^i) \,, \ i = 1, \ldots, N \,,$$
where $\hat{x}_{0|-1}$ is a known vector and $P_{0|-1}$, $Q$, $R^i$, $i = 1, \ldots, N$, are known positive definite matrices. As is well known, in this case the Bayesian filtering recursion (7)-(9) admits a closed-form solution in that, for each node $i$, one has
$$p^i_{k|k}(x) = \Phi(x; \hat{x}^i_{k|k}, P^i_{k|k}) \,, \qquad p^i_{k+1|k}(x) = \Phi(x; \hat{x}^i_{k+1|k}, P^i_{k+1|k})$$
where the mean vectors $\hat{x}^i_{k|k}$ and $\hat{x}^i_{k+1|k}$ and the covariance matrices $P^i_{k|k}$ and $P^i_{k+1|k}$ can be recursively computed by means of the Kalman filter recursion. In view of the developments of the previous section, in order to streamline the presentation as well as the stability analysis, it is convenient to consider the information form of the Kalman filter recursion whereby, instead of the mean vectors and the covariance matrices, the information matrices
$$\Omega^i_{k|k} = (P^i_{k|k})^{-1} \,, \qquad \Omega^i_{k+1|k} = (P^i_{k+1|k})^{-1}$$
and information vectors
$$q^i_{k|k} = \Omega^i_{k|k} \hat{x}^i_{k|k} \,, \qquad q^i_{k+1|k} = \Omega^i_{k+1|k} \hat{x}^i_{k+1|k}$$
are recursively propagated. With this choice, the update step can be simply written as
$$q^i_{k|k} = q^i_{k|k-1} + (C^i)^\top (R^i)^{-1} y^i_k \,, \qquad (12)$$
$$\Omega^i_{k|k} = \Omega^i_{k|k-1} + (C^i)^\top (R^i)^{-1} C^i \,, \qquad (13)$$
whereas the prediction step takes the form
$$q^i_{k+1|k} = A^{-\top} \left[ I - \Omega^i_{k|k} \left( \Omega^i_{k|k} + A^\top Q^{-1} A \right)^{-1} \right] q^i_{k|k} \,, \qquad (14)$$
$$\Omega^i_{k+1|k} = A^{-\top} \Omega^i_{k|k} A^{-1} - A^{-\top} \Omega^i_{k|k} \left( \Omega^i_{k|k} + A^\top Q^{-1} A \right)^{-1} \Omega^i_{k|k} A^{-1} \,. \qquad (15)$$
Notice that, for equations (14) and (15) to make sense, the system matrix $A$ has to be invertible. This hypothesis is automatically satisfied in sampled-data systems wherein the matrix $A$ is obtained by discretization of a continuous-time system matrix. Further, in the other cases, such a limitation could be easily overcome by considering different information filter algorithms (e.g., the generalized square-root filter proposed by Chisci and Zappa (1992)). However, these generalizations are not addressed here in order to keep the presentation as streamlined as possible.

By exploiting the above-defined information filter as well as the consensus algorithm (5)-(6), it is possible to specialize Algorithm 1 as follows.

Algorithm 2. At each time $k = 0, 1, \ldots$, for each node $i \in \mathcal{N}$:
(1) collect the local measurement $y^i_k$ and update the local information pair $(q^i_{k|k-1}, \Omega^i_{k|k-1})$ via equations (12)-(13) to obtain the local posterior information pair $(q^i_{k|k,[0]}, \Omega^i_{k|k,[0]})$;
(2) perform $L$ steps of consensus on the KLA,
$$\Omega^i_{k|k,[\ell+1]} = \sum_{j \in \mathcal{N}^i} \pi^{i,j} \, \Omega^j_{k|k,[\ell]} \,, \quad \ell = 0, \ldots, L-1 \,, \qquad (16)$$
$$q^i_{k|k,[\ell+1]} = \sum_{j \in \mathcal{N}^i} \pi^{i,j} \, q^j_{k|k,[\ell]} \,, \quad \ell = 0, \ldots, L-1 \,, \qquad (17)$$
to obtain the fused information pair $(q^i_{k|k}, \Omega^i_{k|k}) := (q^i_{k|k,[L]}, \Omega^i_{k|k,[L]})$;
(3) compute the local prior information pair $(q^i_{k+1|k}, \Omega^i_{k+1|k})$ from $(q^i_{k|k,[L]}, \Omega^i_{k|k,[L]})$ via equations (14)-(15).

In principle, the algorithm should be initialized by letting
$$\Omega^i_{0|-1} = P_{0|-1}^{-1} \,, \qquad q^i_{0|-1} = P_{0|-1}^{-1} \hat{x}_{0|-1} \,, \qquad i \in \mathcal{N} \,.$$
However, since it is quite unrealistic to assume that all the nodes of the network share the same a priori information on the initial state $x_0$, a more practical initialization can be obtained by letting
$$\Omega^i_{0|-1} = 0 \,, \quad i \in \mathcal{N} \,, \qquad (18)$$
$$q^i_{0|-1} = 0 \,, \quad i \in \mathcal{N} \,, \qquad (19)$$
which would amount to assuming that no a priori information is available.
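To make the data flow of Algorithm 2 concrete, here is a minimal Python sketch (ours, not the authors' implementation); the two-node network, the system matrices and the consensus weights are illustrative assumptions. It runs the correction (12)-(13), the consensus steps (16)-(17) and the prediction (14)-(15), starting from the uninformative initialization (18)-(19).

```python
import numpy as np

# Minimal sketch of Algorithm 2 (DSE-CP, linear-Gaussian case); illustrative setup only.
rng = np.random.default_rng(1)
n, N, L = 2, 2, 1                                      # state dim, nodes, consensus steps

A = np.array([[1.0, 0.1], [0.0, 1.0]])                 # invertible system matrix (A1)
Q = 0.01 * np.eye(n)                                    # process noise covariance
C = [np.array([[1.0, 0.0]]), np.array([[0.0, 1.0]])]    # node i measures one state component
R = [np.array([[0.04]]), np.array([[0.04]])]             # measurement noise covariances
Pi = np.array([[0.5, 0.5], [0.5, 0.5]])                 # Metropolis weights of the 2-node graph

Omega = [np.zeros((n, n)) for _ in range(N)]             # uninformative initialization (18)
q = [np.zeros(n) for _ in range(N)]                      # uninformative initialization (19)

x = np.array([1.0, -1.0])                                # true state (unknown to the nodes)
for k in range(100):
    # step (1): local correction (12)-(13)
    y = [C[i] @ x + rng.multivariate_normal(np.zeros(1), R[i]) for i in range(N)]
    q = [q[i] + C[i].T @ np.linalg.solve(R[i], y[i]) for i in range(N)]
    Omega = [Omega[i] + C[i].T @ np.linalg.solve(R[i], C[i]) for i in range(N)]

    # step (2): L consensus steps (16)-(17) on the information pairs
    for _ in range(L):
        Omega = [sum(Pi[i, j] * Omega[j] for j in range(N)) for i in range(N)]
        q = [sum(Pi[i, j] * q[j] for j in range(N)) for i in range(N)]

    # step (3): prediction (14)-(15)
    M = A.T @ np.linalg.solve(Q, A)                      # A^T Q^{-1} A
    Ai, AiT = np.linalg.inv(A), np.linalg.inv(A).T
    S = [np.linalg.inv(Omega[i] + M) for i in range(N)]
    q = [AiT @ (np.eye(n) - Omega[i] @ S[i]) @ q[i] for i in range(N)]
    Omega = [AiT @ (Omega[i] - Omega[i] @ S[i] @ Omega[i]) @ Ai for i in range(N)]

    x = A @ x + rng.multivariate_normal(np.zeros(n), Q)  # propagate the true state

# Local estimates (the pseudo-inverse covers the singular first steps, cf. Theorem 1).
print([np.linalg.pinv(Omega[i]) @ q[i] for i in range(N)], x)
```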
It is important to point out that the idea of fusing the information received from multiple sensors by making a convex combination of the information matrices and
vectors as in (16)-(17) is not novel in the literature (albeit the derivation proposed here is quite different from the existing ones). In fact, such a fusion rule is known in the literature as Covariance Intersection and dates back to over a decade ago (Julier and Uhlmann, 1997; Chen et al., 2002). In this connection, a contribution of this paper consists of showing that such a widely-used fusion rule naturally arises as a step of a consensus algorithm for computing, in a distributed fashion, the KLA of the local posterior PDFs. It is believed that, in the spirit of Algorithm 1, this result can serve as a starting point for deriving useful generalizations of such a fusion rule in nonlinear and/or non-Gaussian settings.

4. STABILITY ANALYSIS

In this section, the stability properties of the proposed distributed state estimation algorithm are analyzed. To this end, in order to ensure the well-posedness of the state estimation problem, the following preliminary assumptions are needed.

A1. The system matrix $A$ is invertible.
A2. The pair $(A, C)$ is observable, where $C := \mathrm{col}(C^1, \ldots, C^N)$.
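Note that A2 only requires observability of the collective pair $(A, C)$, not of any individual pair $(A, C^i)$. A small sketch (ours, with matrices chosen purely for illustration) of how this can be checked numerically:

```python
import numpy as np

# Sketch: check collective observability of (A, col(C^1, ..., C^N)) via the rank of
# the observability matrix; the individual pairs (A, C^i) need not be observable.
def observable(A, C):
    n = A.shape[0]
    O = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])
    return np.linalg.matrix_rank(O) == n

A = np.diag([0.9, 1.1])                                   # invertible system matrix
C1, C2 = np.array([[1.0, 0.0]]), np.array([[0.0, 1.0]])   # each node sees one mode only

print(observable(A, C1), observable(A, C2))    # False False: no local observability
print(observable(A, np.vstack([C1, C2])))      # True: collective observability (A2)
```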
One of the most basic features that a recursive state estimation algorithm should enjoy is that the true estimation errors be consistent with their predicted statistics. In fact, as is well known (Jazwinski, 1970), divergence of the estimation error can take place when the covariance matrix becomes too small or optimistic (because in this case subsequent observations tend to be ignored). In this connection, the following definition can be introduced (Jazwinski, 1970; Julier and Uhlmann, 1997).

Definition 2. Consider a random vector $x$. Further, let $\hat{x}$ be an unbiased estimate of $x$ and $P$ an estimate of the corresponding error covariance. Then, the pair $(\hat{x}, P)$ is said to be consistent if
$$\mathrm{E}\left[(x - \hat{x})(x - \hat{x})^\top\right] \leq P \,. \qquad (20)$$

In words, according to inequality (20), consistency amounts to requiring that the estimated error covariance $P$ be an upper bound (in the positive definite sense) of the true error covariance. This property becomes even more important in distributed state estimation because the unaware reuse of the same data due to the presence of loops within the network, as well as the possible correlation between measurements of different sensors, can lead to inconsistency and divergence. In fact, this was the primary motivation that led to the development of the Covariance Intersection fusion rule. Notice now that, if one considers the information pair $(q, \Omega) = (P^{-1}\hat{x}, P^{-1})$, inequality (20) can be rewritten as
$$\Omega \leq \left\{ \mathrm{E}\left[(x - \Omega^{-1} q)(x - \Omega^{-1} q)^\top\right] \right\}^{-1} \,.$$
Then, taking into account the relationship between Algorithm 2 and Covariance Intersection, the following result can be stated.

Theorem 1. Let Assumption A1 hold and let the distributed state estimation Algorithm 2 be adopted with the initialization (18)-(19). Then, for each $k = 0, 1, \ldots$ and each $i \in \mathcal{N}$, the information pair $(q^i_{k|k}, \Omega^i_{k|k})$ is consistent in that
$$\Omega^i_{k|k} \leq \left\{ \mathrm{E}\left[(x_k - \hat{x}^i_{k|k})(x_k - \hat{x}^i_{k|k})^\top\right] \right\}^{-1} \qquad (21)$$
with $\hat{x}^i_{k|k} := (\Omega^i_{k|k})^{+} q^i_{k|k}$ (the Moore-Penrose pseudoinverse of the information matrix $\Omega^i_{k|k}$ is considered to account for the fact that, under the uninformative prior (18)-(19), such a matrix is singular during the first time instants).
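As a side note, the inequalities (20)-(21) are in the positive semidefinite ordering; a simple numerical way to test such an ordering (a sketch of ours with made-up numbers, not part of the paper) is to look at the smallest eigenvalue of the difference.

```python
import numpy as np

# Sketch: Monte Carlo check of the consistency inequality E[(x - x_hat)(x - x_hat)^T] <= P,
# where "<=" is the positive semidefinite ordering (illustrative numbers only).
def is_psd_leq(A, B, tol=1e-9):
    """Return True if A <= B in the positive semidefinite sense."""
    return np.all(np.linalg.eigvalsh(B - A) >= -tol)

rng = np.random.default_rng(2)
P = np.array([[2.0, 0.3], [0.3, 1.0]])               # claimed error covariance
true_cov = np.array([[1.5, 0.2], [0.2, 0.8]])        # actual error covariance (unknown in practice)
errors = rng.multivariate_normal(np.zeros(2), true_cov, size=20000)
emp_cov = errors.T @ errors / len(errors)             # empirical error covariance

print(is_psd_leq(emp_cov, P))                          # consistent: True (up to sampling error)
```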
Theorem 1 points out that, at least in the linear-Gaussian case, the intuitions of Remark 1 are correct in that, thanks to the adherence to the PMDI, the proposed distributed state estimation algorithm avoids double-counting and preserves consistency of all the information pairs. Further, in view of Theorem 1, in order to prove the boundedness of the error covariance $\mathrm{E}\left[(x_k - \hat{x}^i_{k|k})(x_k - \hat{x}^i_{k|k})^\top\right]$ it is sufficient to show that asymptotically the information matrix $\Omega^i_{k|k}$ is bounded below by some positive definite matrix (or, equivalently, that $P^i_{k|k} = (\Omega^i_{k|k})^{-1}$ is asymptotically bounded above by some constant matrix). In this connection, the following theorem, which represents the main result of this section, can be stated.

Theorem 2. Let Assumptions A1-A2 hold and let the distributed state estimation Algorithm 2 be adopted with the initialization (18)-(19). Then, if the consensus matrix $\Pi$ is primitive and the number $L$ of consensus iterations is greater than 0, there exist a time instant $\bar{k}$ and a positive definite matrix $\tilde{\Omega}$ such that
$$0 < \tilde{\Omega} \leq \Omega^i_{k|k} \,, \quad \forall i \in \mathcal{N} \ \text{and} \ \forall k \geq \bar{k} \,.$$
As a consequence, the estimation error is asymptotically bounded in mean square in that
$$\limsup_{k \to \infty} \mathrm{E}\left[(x_k - \hat{x}^i_{k|k})^\top (x_k - \hat{x}^i_{k|k})\right] \leq \mathrm{tr}\{\tilde{\Omega}^{-1}\} \,.$$
A few remarks on Theorem 2 are in order. First of all, it is important to point out that, to the best of our knowledge, this is the first stability result for distributed state estimation that relies only on collective observability assumptions. In fact, even the most recent stability proofs (concerning other estimation algorithms) require some sort of local observability (or detectability) condition in each node, thus limiting their practical applicability (see, for instance, (Stankovic et al., 2009; Olfati-Saber, 2009; Cattivelli and Sayed, 2010)). Such an advancement is mainly due to the fact that, in the proposed algorithm, the local posterior PDFs (rather than just the local estimates) are combined, so that also the covariance matrices are updated and kept consistent with the combined estimates. It is also worth noting that such a stability result holds regardless of the possible correlation between the measurements coming from different sensors (which is supposed to be unknown). Further, only one consensus step per iteration is needed for stability.

Another important feature of the proposed distributed state estimation algorithm is that, when the system
dynamics as well as all the measurement equations are noise-free, it is possible to show that all the local estimation errors $x_k - \hat{x}^i_{k|k}$ converge asymptotically to zero. To see this, it is convenient to consider the quadratic time-varying Lyapunov functions
$$L^i_k(x) := x^\top \Omega^i_{k|k-1} x \,, \quad i \in \mathcal{N} \,,$$
and to define the vector
$$\mathbf{L}_k = \mathrm{col}\left( L^1_k(x_k - \hat{x}^1_{k|k-1}), \ldots, L^N_k(x_k - \hat{x}^N_{k|k-1}) \right) \,.$$
Then, the following theorem can be stated.

Theorem 3. Let the system dynamics (10) and the measurement equations (11) be noise-free, i.e.,
$$w_k = 0 \,, \qquad v^i_k = 0 \,, \quad i \in \mathcal{N} \,,$$
for $k = 0, 1, \ldots$. Then, under the same assumptions of Theorem 2, the following facts hold:
(i) there exist a time instant $\bar{k}$ and a positive real $\beta < 1$ such that
$$\mathbf{L}_{k+1} \leq \beta \, \Pi^L \, \mathbf{L}_k \,, \quad k = \bar{k}, \bar{k}+1, \ldots \,;$$
(ii) the proposed distributed state estimation algorithm yields an asymptotic observer on each node of the network, in that
$$\lim_{k \to \infty} \left( x_k - \hat{x}^i_{k|k} \right) = 0 \,, \quad \forall i \in \mathcal{N} \,.$$
5. CONCLUSIONS

An information-theoretic approach to distributed estimation, exploiting consensus on the Kullback-Leibler average of Gaussian probability density functions (PDFs), has been introduced. Following this approach, a distributed state estimator based on the consensus among local posterior PDFs has been derived. It has been proved that the proposed estimator guarantees a bounded state estimation error covariance in all network nodes, under network connectivity and collective observability. Simulation experiments, not shown here due to lack of space, have demonstrated the effectiveness of the distributed estimator even in networks with few and low-degree sensor nodes. Future work will concern the development of distributed state estimation algorithms for nonlinear systems and/or with fault detection & accommodation features.

REFERENCES

Akaike, H. (1973). Information theory and the extension of the maximum likelihood principle. In Proceedings of the Second International Symposium on Information Theory, 267-281.
Alriksson, P. and Rantzer, A. (2006). Distributed Kalman filtering using weighted averaging. In Proceedings of the 17th International Symposium on Mathematical Theory of Networks and Systems.
Banerjee, A., Guo, X., and Wang, H. (2005). On the optimality of conditional expectation as a Bregman predictor. IEEE Transactions on Information Theory, 51(7), 2664-2669.
Calafiore, G.C. and Abrate, F. (2009). Distributed linear estimation over sensor networks. International Journal of Control, 82(5), 868-882.
Campbell, L. (1970). Equivalence of Gauss's principle and minimum discrimination information estimation of probabilities. The Annals of Mathematical Statistics, 41(3), 1011-1015.
Carli, R., Chiuso, A., Schenato, L., and Zampieri, S. (2008). Distributed Kalman filtering based on consensus strategies. IEEE Journal on Selected Areas in Communications, 26, 622-633.
Cattivelli, F.S. and Sayed, A.H. (2010). Diffusion strategies for distributed Kalman filtering and smoothing. IEEE Transactions on Automatic Control, 55(9), 2069-2084.
Chang, K.C., Chong, C.Y., and Mori, S. (2008). On scalable distributed sensor fusion. In Proceedings of the 11th International Conference on Information Fusion, 1-8.
Chen, L., Arambel, P., and Mehra, R. (2002). Estimation under unknown correlation: covariance intersection revisited. IEEE Transactions on Automatic Control, 47(11), 1879-1882.
Chisci, L. and Zappa, G. (1992). Square-root Kalman filtering of descriptor systems. Systems & Control Letters, 19(4), 325-334.
Farina, M., Ferrari-Trecate, G., and Scattolini, R. (2010). Distributed moving horizon estimation for linear constrained systems. IEEE Transactions on Automatic Control, 55, 2462-2475.
Jaynes, E.T. (2003). Probability Theory: The Logic of Science. Cambridge University Press.
Jazwinski, A. (1970). Stochastic Processes and Filtering Theory. Academic Press.
Julier, S. and Uhlmann, J. (1997). A non-divergent estimation algorithm in the presence of unknown correlations. In Proceedings of the 1997 American Control Conference, volume 4, 2369-2373.
Kamgarpour, M. and Tomlin, C. (2008). Convergence properties of a decentralized Kalman filter. In Proceedings of the 47th IEEE Conference on Decision and Control, 3205-3210.
Olfati-Saber, R. (2007). Distributed Kalman filtering for sensor networks. In Proceedings of the 46th IEEE Conference on Decision and Control, 5492-5498.
Olfati-Saber, R. (2009). Kalman-consensus filter: optimality, stability, and performance. In Proceedings of the 48th IEEE Conference on Decision and Control, held jointly with the 28th Chinese Control Conference, 7036-7042.
Olfati-Saber, R., Fax, J.A., and Murray, R.M. (2007). Consensus and cooperation in networked multi-agent systems. Proceedings of the IEEE, 95(1), 49-54.
Smith, D. and Singh, S. (2006). Approaches to multisensor data fusion in target tracking: a survey. IEEE Transactions on Knowledge and Data Engineering, 18(12), 1696-1710.
Stankovic, S.S., Stankovic, M.S., and Stipanovic, D.M. (2009). Consensus based overlapping decentralized estimation with missing observations and communication faults. Automatica, 45(6), 1397-1406.
Xiao, L., Boyd, S., and Lall, S. (2005). A scheme for robust distributed sensor fusion based on average consensus. In Proceedings of the 4th International Symposium on Information Processing in Sensor Networks (IPSN), 63-70.