Automatica 41 (2005) 1433–1438
www.elsevier.com/locate/automatica

Brief paper

Variance estimation and ranking of target tracking position errors modeled using Gaussian mixture distributions

Lidija Trailović∗, Lucy Y. Pao

Department of Electrical and Computer Engineering, University of Colorado, Boulder, CO 80309-0425, USA

Received 6 July 2002; received in revised form 3 February 2003; accepted 14 March 2005. Available online 13 May 2005.

Abstract

In this paper, variance estimation and ranking methods are developed for stochastic processes modeled by Gaussian mixture distributions. It is shown that the variance estimate from a Gaussian mixture distribution has the same properties as the variance estimate from a single Gaussian distribution based on a reduced number of samples. Hence, well-known tools for variance estimation and ranking of single Gaussian distributions can be applied to Gaussian mixture distributions. As an application example, we present optimization of sensor processing order in the sequential multi-target multi-sensor joint probabilistic data association (MSJPDA) algorithm. © 2005 Elsevier Ltd. All rights reserved.

Keywords: Variance estimation; Variance ranking; Gaussian mixture; Target tracking; Tracking error

1. Introduction

A popular performance metric used for evaluating tracking algorithms is the root mean square (RMS) position error computed from a variance estimate based on data generated by the tracking algorithm (Frei & Pao, 1998; Pao & Trailović, 2000; Trailović & Pao, 2001, 2004). It has been shown that the position error distribution is much better modeled using mixtures of Gaussians rather than a single Gaussian (Trailović, 2002). Estimation and ranking of variances in stochastic processes that generate data with Gaussian distributions are based on well-known properties of the χ² and F-distributions (Devore, 2003; Hoel, 1962). The purpose of this paper is to develop corresponding tools for variance estimation and ranking of Gaussian mixture distributions.

The paper is organized as follows. Variance estimation for Gaussian mixture distributions is discussed in Section 2, including derivations of the moment generating function and the probability density function. Section 3 addresses Gaussian mixture variance ranking. As an application example, we present selection of the optimum order of sensor processing in the sequential multi-sensor joint probabilistic data association (MSJPDA) target tracking algorithm (Bar-Shalom & Fortmann, 1998; Bar-Shalom & Tse, 1975; Frei & Pao, 1998; Pao, 1994; Pao & Trailović, 2000; Trailović, 2002). The results are summarized in Section 4.

This paper was not presented at any IFAC meeting. This paper was recommended for publication in revised form by Associate Editor R. Boel under the direction of Editor I. Petersen.
∗ Corresponding author. E-mail addresses: [email protected] (L. Trailović), [email protected] (L.Y. Pao).

2. Variance distribution for Gaussian mixtures

A k-component Gaussian mixture pdf of a random variable x is

f_k(x) = \sum_{j=1}^{k} w_j \phi(x; m_j, \sigma_j),   (1)

where \phi(x; m_j, \sigma_j) is a Gaussian pdf with mean m_j and standard deviation \sigma_j, and the w_j are weights satisfying \sum_{j=1}^{k} w_j = 1, w_j \geq 0. We consider a class of zero-mean Gaussian mixture distributions with m_j = 0, j = 1, \ldots, k. Such distributions can be used to model the position error


in target tracking algorithms (Trailović, 2002). For the zero-mean Gaussian mixtures, the variance \sigma_{eq}^2 is

\sigma_{eq}^2 = \sum_{j=1}^{k} w_j \sigma_j^2.   (2)

Given a sample set \{e_1, \ldots, e_n\}, the unbiased variance estimator is

\hat{\sigma}_{eq}^2 = \frac{1}{n-1} \sum_{i=1}^{n} e_i^2.   (3)

When samples originate from a single Gaussian distribution, the tools for variance estimation and ranking are based on properties of the χ² and the F-distributions (Devore, 2003; Hoel, 1962). Our objective is to develop similar tools for data originating from a k-component zero-mean Gaussian mixture distribution. Following the derivation for the χ² distribution outlined in Hoel (1962), let us define a random variable

\varepsilon_n = \sum_{i=1}^{n} x_i^2,   (4)

where x_i is a zero-mean k-component Gaussian mixture random variable with the pdf in (1). Further, let g_{kn}(x; n) be the pdf of the random variable \varepsilon_n having a k-component "χ²-mixture" distribution with n degrees of freedom (i.e., measurements or samples) (Devore, 2003; Hoel, 1962). The scaled random variable

\varepsilon = \frac{1}{n-1} \varepsilon_n   (5)

produces the unbiased variance estimator (3). With the change of variables (5), the objective is to find the density

g_k(x; n) = (n-1)\, g_{kn}((n-1)x; n).   (6)

2.1. Moment generating function (mgf)

A pdf is uniquely determined by its moment generating function when it exists (Hoel, 1962). The mgf of the random variable \varepsilon_n is defined as

M_{\varepsilon_n}(s) = E[e^{s\varepsilon_n}] = \int_{-\infty}^{\infty} e^{sx} g_{kn}(x; n)\, dx,   (7)

where E[·] is the expectation of the random variable. Substituting (4), where the x_i are independent random variables with the same zero-mean k-component Gaussian mixture pdf, we have

M_{\varepsilon_n}(s) = E\!\left[e^{s \sum_{i=1}^{n} x_i^2}\right] = \prod_{i=1}^{n} E[e^{s x_i^2}] = \left(E[e^{s x^2}]\right)^n = \left(M_{x^2}(s)\right)^n,   (8)

M_{x^2}(s) = \int_{-\infty}^{\infty} e^{s x^2} \sum_{j=1}^{k} w_j \phi(x; 0, \sigma_j)\, dx
= \sum_{j=1}^{k} w_j \int_{-\infty}^{\infty} e^{s x^2} \frac{1}{\sqrt{2\pi}\,\sigma_j} e^{-x^2/(2\sigma_j^2)}\, dx
= \sum_{j=1}^{k} w_j (1 - 2\sigma_j^2 s)^{-1/2}.   (9)

Combining (8) and (9) yields

M_{\varepsilon_n}(s) = \left( \sum_{j=1}^{k} w_j (1 - 2\sigma_j^2 s)^{-1/2} \right)^{\!n} = \left( \sum_{j=1}^{k} w_j M_j \right)^{\!n}.   (10)

For k = 1, w_1 = 1, and \sigma_1 = \sigma_{eq} = 1, the mgf (10) reduces to the mgf of the χ² distribution (Hoel, 1962),

M_{\varepsilon_n}(s)\big|_{k=1} = M_{\chi^2}(s) = (1 - 2s)^{-n/2},   (11)

with the corresponding density

g_{1n}(x; n) = \frac{x^{(n/2)-1} e^{-x/2}}{2^{n/2}\, \Gamma(n/2)}.   (12)

Unfortunately, for k > 1, a closed-form solution for the pdf corresponding to the given mgf (10) is not known.

2.2. χ²-mixture distribution for large n

Without loss of generality, we assume that the standard deviation (2) of the zero-mean mixture is normalized to \sigma_{eq} = 1. By the central limit theorem, the pdf of a sum of n random variables converges to a Gaussian pdf for large n. The mean m and the variance \sigma^2 can be found from the mgf M(s) as (Hoel, 1962)

m = M'(s)\big|_{s=0} = M'(0),   (13)

\sigma^2 = M''(s)\big|_{s=0} - m^2 = M''(0) - m^2,   (14)

where M'(s) and M''(s) are the derivatives of M(s) with respect to s. Applying (13) and (14) to (11), we conclude that the χ² pdf with n degrees of freedom converges to the Gaussian pdf

g_{1n}(x; n) \xrightarrow{n \gg 1} \phi(x; m_1, \sigma_1)   (15)

with the mean m_1 = n and the variance \sigma_1^2 = 2n. Therefore, by the change of variables (5), the scaled χ² pdf converges to the Gaussian pdf

g_1(x; n) \xrightarrow{n \gg 1} \phi\!\left(x; 1, \sqrt{2/n}\right).   (16)

Similarly, applying (13) and (14) to (10), we find that the χ²-mixture density converges to the Gaussian pdf

g_{kn}(x; n) \xrightarrow{n \gg 1} \phi(x; m_k, \sigma_k)   (17)

with the mean and the variance

m_k = M'_{\varepsilon_n}(s)\big|_{s=0} = n \left( \sum_{j=1}^{k} w_j M_j(0) \right)^{\!n-1} \times \sum_{j=1}^{k} w_j M'_j(0) = n,   (18)

\sigma_k^2 = M''_{\varepsilon_n}(s)\big|_{s=0} - m_k^2 = n \left( \sum_{j=1}^{k} w_j M''_j(0) - 1 \right) = n \left( 3 \left( \sum_{j=1}^{k} w_j \sigma_j^4 \right) - 1 \right).   (19)

From (5), (17), (18), and (19), it follows that

g_k(x; n) \xrightarrow{n \gg 1} \phi\!\left(x; 1, \sqrt{2/n^*}\right), \qquad n^* = \bar{\gamma} n,   (20)

\bar{\gamma} = \frac{2}{3\left(\sum_{j=1}^{k} w_j \sigma_j^4\right) - 1}.   (21)

Note that 0 \leq \bar{\gamma} \leq 1. Comparing (16) and (20), we come to the conclusion

g_k(x; n) \xrightarrow{n \gg 1} g_1(x; \bar{\gamma} n),   (22)

which shows that the distribution of the variance estimate based on a sample set of size n \gg 1 from a zero-mean Gaussian mixture with a standard deviation \sigma_{eq} is the same as the distribution of the variance estimate based on a reduced number n^* = \bar{\gamma} n of samples from the zero-mean single Gaussian distribution having the same standard deviation \sigma_{eq}. The parameter \bar{\gamma} in (21) can be named the reduction factor. Based on (22), well-known tools for variance estimation and ranking for samples originating from a Gaussian distribution can be applied to the class of zero-mean Gaussian mixture distributions simply by taking into account the reduction factor \bar{\gamma} of the degrees of freedom.

The reduction factor \bar{\gamma} in (21) depends on the mixture parameters, which can be estimated using maximum likelihood methods as discussed in Trailović (2002). Further, it is of interest to find worst-case limits for the reduction factor \bar{\gamma} when the mixture parameter estimates are not available. Let us consider a two-component (k = 2) zero-mean Gaussian mixture pdf with the parameters

\frac{\sigma_2}{\sigma_1} = \rho, \quad \rho \leq 1, \quad \sigma_{eq} = 1, \quad w_1 = 1 - w, \quad w_2 = w.   (23)

From (2), (21), and (23), we have

\bar{\gamma} = \frac{2(1 - w + w\rho^2)^2}{3(1 - w) + 3w\rho^4 - (1 - w + w\rho^2)^2}.   (24)

For the worst-case mixing weight w = 1/(1 + \rho^2), the reduction factor \bar{\gamma} in (24) has the minimum value

\bar{\gamma}_{min}(\rho) = \frac{8\rho^2}{3 + 2\rho^2 + 3\rho^4}.   (25)

If \rho = 1, the mixture components are equal, the mixture pdf reduces to a Gaussian pdf, and the worst-case reduction factor is \bar{\gamma}_{min}(1) = 1. If \rho \to 0, i.e., if the mixture components have a very wide spread of standard deviations, the worst-case reduction factor approaches zero.

For k > 2, finding the parameters resulting in the minimum possible value of \bar{\gamma} reduces to the problem of finding the maximum of \sum_{j=1}^{k} w_j \sigma_j^4 under the constraints \sum_{j=1}^{k} w_j = 1 and \sum_{j=1}^{k} w_j \sigma_j^2 = 1, which can be solved using the Lagrange method. In the solution, all but one \sigma_j are equal. Therefore, the worst-case analysis leading to (25) for k = 2 holds for any k, with \rho being the ratio of the smallest and the largest standard deviations in the mixture,

\rho = \frac{\sigma_{j,min}}{\sigma_{j,max}}.   (26)

In practice, an estimate of \rho may come from process heuristics. For example, in target tracking applications, \sigma_{j,min} can be the value obtained when tracking without clutter, while a reliable bound on \sigma_{j,max} can be obtained by mixture parameter estimation from the data (Trailović, 2002).

2.3. Computation of the χ²-mixture density for arbitrary n

In this section, we show how the χ²-mixture pdf can be obtained numerically for any n. Since g_{kn}(x; n) = 0 for x < 0, the mgf (7) can be written as the Laplace transform L[·] of the pdf

M_{\varepsilon_n}(-s) = \int_{0}^{\infty} e^{-sx} g_{kn}(x; n)\, dx = L[g_{kn}(x; n)].   (27)

Therefore, numerical inversion of the Laplace transform yields the pdf,

g_{kn}(x; n) = L^{-1}[M_{\varepsilon_n}(-s)] = \frac{1}{2\pi i} \int_{-i\infty}^{+i\infty} e^{sx} M_{\varepsilon_n}(-s)\, ds.   (28)

The results we obtained for various \{w_j, m_j, \sigma_j\} and n showed that the scaled χ²-mixture density can indeed be approximated with a scaled χ² density with n^* \leq n degrees of freedom,

g_k(x; n) \approx g_1(x; n^*).   (29)
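The closed-form results above are easy to check numerically. The sketch below (assuming NumPy is available; the function names and sample size are illustrative, not from the paper) draws samples from a zero-mean mixture, forms the unbiased variance estimator (3), and evaluates the reduction factor of eqs. (21) and (25) for the two-component mixture used in the example that follows.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mixture(w, sigma, n, rng):
    """Draw n samples from the zero-mean Gaussian mixture of eq. (1)."""
    comp = rng.choice(len(w), size=n, p=w)          # component index per sample
    return rng.normal(0.0, np.asarray(sigma)[comp])

def reduction_factor(w, sigma, sigma_eq):
    """Reduction factor gamma_bar of eq. (21); the component standard
    deviations are first normalized so that sigma_eq = 1."""
    w = np.asarray(w, float)
    s = np.asarray(sigma, float) / sigma_eq
    return 2.0 / (3.0 * np.sum(w * s**4) - 1.0)

def worst_case_reduction(rho):
    """Lower bound gamma_bar_min(rho) of eq. (25)."""
    return 8 * rho**2 / (3 + 2 * rho**2 + 3 * rho**4)

# Two-component mixture from the Section 2.3 example.
w, sigma, sigma_eq = [0.5, 0.5], [1.66, 0.33], 1.2

# The unbiased variance estimate (3) converges to sigma_eq^2 of eq. (2).
e = sample_mixture(w, sigma, 200_000, rng)
var_hat = np.sum(e**2) / (len(e) - 1)                       # eq. (3)
var_eq = np.sum(np.asarray(w) * np.asarray(sigma) ** 2)     # eq. (2)

gbar = reduction_factor(w, sigma, sigma_eq)  # matches the 0.4443 quoted for this mixture
gmin = worst_case_reduction(1.0)             # equal components: gamma_bar = 1
print(var_hat, var_eq, gbar, gmin)
```

With the paper's rounded \sigma_{eq} = 1.2 for this mixture, the computed reduction factor agrees with the quoted value \bar{\gamma} = 0.4443.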

As an example, let us consider a two-component Gaussian mixture with the parameters w_1 = 0.5, \sigma_1 = 1.66, w_2 = 0.5, \sigma_2 = 0.33, and \sigma_{eq} = 1.2. Corresponding to this mixture, Fig. 1 presents a comparison of (a) the numerically computed scaled χ²-mixture density g_2(x; n), with n = 10 degrees of freedom, (b) the scaled χ² density g_1(x; n) with n = 10 degrees of freedom, and (c) the χ² density g_1(x; n^*) with n^* = 4.6 degrees of freedom. The best-fitting reduction factor \gamma = 0.46 can be compared to \bar{\gamma} = 0.4443 found from (21).

Fig. 1. Fitting the numerically computed χ²-mixture pdf g_2(x; n = 10) (dots) with a χ² pdf g_1(x; n = 10) (dotted line) and a χ² pdf with reduced degrees of freedom g_1(x; n^* = 4.6) (solid line).

3. Variance ranking for Gaussian mixtures

Let us consider two stochastic processes \Theta_u and \Theta_v producing random variables e_u and e_v with zero-mean Gaussian mixture densities. Given data samples \{e_{u1}, \ldots, e_{un_u}\} from \Theta_u and \{e_{v1}, \ldots, e_{vn_v}\} from \Theta_v, the respective variance estimates \hat{\sigma}_{eq,u}^2 and \hat{\sigma}_{eq,v}^2 are obtained from (3). Suppose that \hat{\sigma}_{eq,u} < \hat{\sigma}_{eq,v}. The problem is to compute the confidence level, i.e., the a posteriori probability p_{uv} that \sigma_{eq,u} < \sigma_{eq,v},

p_{uv} = P\!\left( \left( \frac{\sigma_{eq,v}}{\sigma_{eq,u}} \right)^{\!2} > 1 \,\middle|\, \hat{\sigma}_{eq,u}, \hat{\sigma}_{eq,v} \right).   (30)

If the random variables e_u and e_v were Gaussian, the solution to this ranking problem would be obtained based on properties of the F-distribution (Trailović & Pao, 2004)

p_{uv} = 1 - CDF(n_u, n_v, 1/\lambda_{uv}^2) = \int_{1/\lambda_{uv}^2}^{\infty} f(n_u, n_v, x)\, dx,   (31)

where f(n_u, n_v, x) is the pdf of the random variable having the F_{n_u-1, n_v-1} distribution with (n_u - 1) degrees of freedom in the numerator and (n_v - 1) degrees of freedom in the denominator, \lambda_{uv} = \hat{\sigma}_{eq,v}/\hat{\sigma}_{eq,u}, and CDF(·) is the cumulative distribution function of the F-distribution random variable.

For the random variables e_u and e_v with Gaussian mixture distributions, assuming that n_u and n_v are large, the confidence level p^*_{uv} can be computed as in (31), except that n_u and n_v are replaced by

n^*_u = \bar{\gamma}_u n_u, \qquad n^*_v = \bar{\gamma}_v n_v,   (32)

p^*_{uv} = 1 - CDF(n^*_u, n^*_v, 1/\lambda_{uv}^2) = \int_{1/\lambda_{uv}^2}^{\infty} f(n^*_u, n^*_v, x)\, dx.   (33)

Application examples based on (a) synthetic data and (b) position error data generated from the MSJPDA target tracking simulator are presented next.

3.1. Synthetic data example

Consider a process \Theta_u producing a Gaussian random variable e_u with m_u = 0 and \sigma_u = 1. For this process, the reduction factor is \bar{\gamma}_u = 1. Another process \Theta_v produces a random variable e_v having a two-component Gaussian mixture distribution with the mixture parameters k_v = 2, m_{v1} = m_{v2} = 0, w_{v1} = 0.309, \sigma_{v1} = 2.068, w_{v2} = 0.691, \sigma_{v2} = 0.414, \sigma_{eq,v} = 1.2, and \rho = \sigma_{v2}/\sigma_{v1} = 0.2. From (21), the reduction factor is \bar{\gamma}_v = 0.125. The number of samples generated from \Theta_u and \Theta_v ranges from n_u = n_v = n = 10 to n_u = n_v = n = 200. Fig. 2 shows the confidence level as a function of n for: (a) p_{uv} computed from (31), i.e., assuming the random variables are Gaussian, with no reduction in the degrees of freedom; (b) p^*_{uv} computed from (33); and (c) p^{MC}_{uv} obtained by a Monte Carlo experiment in which the sampled variances are compared in 10,000 trials.

Fig. 2. Confidence level p_{uv} as a function of the number of samples n using: (a) (dotted line) the F-distribution in (31), (b) (solid line) the F-distribution with the reduction factors in (33), and (c) a Monte Carlo simulation (dots).

The assumption that the random variables are Gaussian (case (a)) results in incorrect, overly optimistic confidence levels. The variance ranking for Gaussian mixtures (case (b)) gives confidence levels that closely match the Monte Carlo results for large n and slightly pessimistic confidence levels for low n. These results illustrate the simplicity and power of the variance ranking method based on the reduction factors.
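The ranking rule of eqs. (31)–(33) is straightforward to implement. The sketch below (assuming SciPy is available; the variance estimates and sample counts are illustrative, not the paper's data) compares the naive Gaussian confidence (31) with the mixture-aware confidence (33) obtained by shrinking the degrees of freedom with the reduction factors.

```python
from scipy.stats import f as f_dist

def ranking_confidence(var_u, var_v, n_u, n_v, gamma_u=1.0, gamma_v=1.0):
    """Confidence that sigma_eq,u < sigma_eq,v given variance estimates
    var_u < var_v, per eqs. (31)-(33).  gamma_u = gamma_v = 1 gives the
    single-Gaussian rule (31); reduction factors < 1 implement (32)-(33)."""
    lam2 = var_v / var_u                       # lambda_uv^2
    nu, nv = gamma_u * n_u, gamma_v * n_v      # effective sample counts, eq. (32)
    return 1.0 - f_dist.cdf(1.0 / lam2, nu - 1.0, nv - 1.0)

# Setup mirroring Section 3.1: process u Gaussian (gamma_u = 1), process v a
# two-component mixture with gamma_v = 0.125; illustrative variance estimates.
p_gauss = ranking_confidence(1.0, 1.44, 100, 100)                  # eq. (31)
p_mix   = ranking_confidence(1.0, 1.44, 100, 100, gamma_v=0.125)  # eq. (33)
print(p_gauss, p_mix)
```

As in Fig. 2, the Gaussian assumption yields the larger, overly optimistic confidence; the reduced degrees of freedom give the more conservative value.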

3.2. Target tracking application example

Let us consider a system of N_s (not necessarily identical) sensors tracking T targets using the sequential MSJPDA algorithm (Pao, 1994; Pao & Trailović, 2000). The sensor set is S = \{r_1, r_2, \ldots, r_{N_s}\}, where r_s is the parameter in the sensor noise covariance matrix. The sensor design \theta_u, u = 1, \ldots, N_d, is an ordered set of the sensors in S, \theta_u = \{r_{u1}, r_{u2}, \ldots, r_{uN_s}\}. In each simulation run for the design \theta_u, n = N_{pts} sample points are generated by the simulator. The sample points are used to update the variance estimate for the design \theta_u. Let \theta_b be the design that results in the smallest variance estimate. The approximate probability of correct selection (APCS) (Chen, 1996) is computed as

APCS = \prod_{u=1, u \neq b}^{N_d} p^*_{bu},   (34)

where p^*_{bu} follows from (33) with n^* = \bar{\gamma} N_{pts} points per simulation run. The APCS is a measure of confidence of the selection of the best performing design. Since N_{pts} \gg 1 and n^* > 1, the confidence levels increase much faster with the number of simulation runs compared to the very conservative approach of using n^* = 1 in Trailović and Pao (2001, 2004).

An example of ranking N_d = 8 designs is shown in Fig. 3. The average number of sample points per run is N_{pts} = 50. The data collected from the first two runs for each design are used to estimate the mixture parameters as described in Trailović (2002) and to find that the smallest ratio of the standard deviations in the mixtures is \rho = 0.2. From (25), a conservative value \bar{\gamma} = 0.1 is then used in all subsequent steps of the optimization algorithm. The ranking confidence levels in (34) were computed from (33). As shown in Fig. 3, a total of 189 simulation runs was needed to achieve an APCS of 85%, compared to 897 runs in Trailović and Pao (2001, 2004) and well over 1000 runs (based on Chen, Chen, & Yucesan (2000)) needed in a Monte Carlo experiment. The selected best performing design with the smallest RMS position error was the same, \theta_b = \theta_4.

Fig. 3. APCS as a function of N_{runs} for (a) the conservative algorithm described in Trailović and Pao (2001) and (b) the algorithm using the variance ranking described in this paper. The designs are \theta_1 = \{0.01, 0.1, 0.1, 1.0\}, \theta_2 = \{0.01, 0.1, 1.0, 0.1\}, \theta_3 = \{0.01, 1.0, 0.1, 0.1\}, \theta_4 = \{1.0, 0.1, 0.1, 0.01\}, \theta_5 = \{1.0, 0.1, 0.01, 0.1\}, \theta_6 = \{1.0, 0.01, 0.1, 0.1\}, \theta_7 = \{0.1, 1.0, 0.01, 0.1\}, \theta_8 = \{0.1, 0.01, 1.0, 0.1\}. The best performing design is \theta_b = \theta_4 with confidence APCS = 85%.

4. Conclusion

We have developed estimation and ranking methods for comparing variances of stochastic processes modeled by Gaussian mixture distributions. Following a derivation of the moment generating function, it is shown that the distribution of the variance estimate based on a large number n of data samples from a zero-mean Gaussian mixture distribution has the same properties as the distribution of the variance estimate based on a reduced number \bar{\gamma} n (where \bar{\gamma} \leq 1) of samples from a zero-mean single Gaussian distribution. As a result, the well-known tools of variance estimation and ranking based on properties of the χ² and F-distributions can be applied to data originating from k-component zero-mean Gaussian mixtures simply by taking into account the reduction factor \bar{\gamma}. For large n, the reduction factor is found in closed form as a function of the mixture parameters. A lower bound for the reduction factor is also found as a function of the ratio of the minimum and maximum standard deviations of the components in the mixture. This bound can be used for conservative variance estimation and ranking even in cases when the mixture parameters are not available.

The developed approach can be applied for performance evaluation and ranking of different algorithms where variance (root mean square error) is the performance metric. Application of the new approach for optimizing the sensor processing order in the sequential multi-sensor joint probabilistic data association (MSJPDA) tracking algorithm shows over an order of magnitude reduction in computational complexity compared to traditional ranking methods.
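The APCS-based selection of Section 3.2 can be sketched as follows (assuming SciPy is available; the designs, variance estimates, and effective sample count are hypothetical, for illustration only): each candidate design is compared against the current best via (33), and the pairwise confidences are multiplied into the APCS of eq. (34).

```python
import numpy as np
from scipy.stats import f as f_dist

def pairwise_confidence(var_b, var_u, n_star):
    """p*_{bu} of eq. (33): confidence that design b beats design u, with
    n_star effective samples per design (reduction factor already applied)."""
    return 1.0 - f_dist.cdf(var_b / var_u, n_star - 1.0, n_star - 1.0)

def apcs(var_hats, b, n_star):
    """Approximate probability of correct selection, eq. (34)."""
    return float(np.prod([pairwise_confidence(var_hats[b], v, n_star)
                          for u, v in enumerate(var_hats) if u != b]))

# Hypothetical variance estimates for N_d = 4 designs; theta_b is the
# design with the smallest estimate.
var_hats = [1.00, 1.30, 1.55, 2.10]
b = int(np.argmin(var_hats))
print(apcs(var_hats, b, n_star=20.0))  # grows toward 1 as samples accumulate
```

In a full implementation, n_star would grow with the number of completed runs, and simulation effort would be allocated until the APCS exceeds the desired confidence, as in Fig. 3.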

Acknowledgements

This work was supported in part by the Office of Naval Research (Grant N00014-02-1-0136) and a University of Colorado Faculty Fellowship.

References

Bar-Shalom, Y., & Fortmann, T. (1998). Tracking and data association. New York: Academic Press Inc.
Bar-Shalom, Y., & Tse, E. (1975). Tracking in a cluttered environment with probabilistic data association. Automatica, 11(5), 451–460.
Chen, C. H. (1996). A lower bound for the correct subset-selection probability and its application to discrete-event system simulations. IEEE Transactions on Automatic Control, 41(8), 1227–1231.
Chen, H. C., Chen, C. H., & Yucesan, E. (2000). Computing efforts allocation for ordinal optimization and discrete event simulation. IEEE Transactions on Automatic Control, 45(5), 960–964.
Devore, J. L. (2003). Probability and statistics for engineering and the sciences (6th ed.). Pacific Grove, CA: Thomson, Brooks/Cole.


Frei, C. W., & Pao, L. Y. (1998). Alternatives to Monte-Carlo simulation evaluations of two multi-sensor fusion algorithms. Automatica, 34(1), 103–110.
Hoel, P. G. (1962). Introduction to mathematical statistics. Wiley.
Pao, L. Y. (1994). Centralized multi-sensor fusion algorithms for tracking applications. Control Engineering Practice, 2(5), 875–887.
Pao, L. Y., & Trailović, L. (2000). Optimal order of processing sensor information in sequential multi-sensor fusion algorithms. IEEE Transactions on Automatic Control, 45(8), 1532–1536.
Trailović, L. (2002). Ranking and optimization of target tracking algorithms. Ph.D. thesis, University of Colorado at Boulder.
Trailović, L., & Pao, L. Y. (2001). Computing budget allocation for optimization of sensor processing order in sequential multi-sensor fusion algorithms. Proceedings of the American Control Conference (pp. 1841–1847).
Trailović, L., & Pao, L. Y. (2004). Computing budget allocation for efficient ranking and selection of variances with application to target tracking algorithms. IEEE Transactions on Automatic Control, 49(1), 58–67.

Lidija Trailović was born in Belgrade, Yugoslavia. She received the B.S. degree in Electrical Engineering from the University of Belgrade in 1986, and the M.S. and Ph.D. degrees in Electrical and Computer Engineering from the University of Colorado at Boulder in 1994 and 2002. She was a lecturer in the Applied Mathematics Department at the University of Colorado at Boulder. Currently, Dr. Trailović is a Systems Engineer with Northrop Grumman Mission Systems working on a weather satellite program. Her research interests are in multi-sensor data fusion, target tracking, computing budget allocation, ordinal optimization, particle filters, and remote sensing systems.

Lucy Y. Pao was born in Washington, D.C. in 1968. She received the B.S., M.S., and Ph.D. degrees in Electrical Engineering from Stanford University. She is currently a full Professor in the Electrical and Computer Engineering Department at the University of Colorado at Boulder. She spent the 2001–2002 academic year as a Visiting Scholar at Harvard University. She has published over 120 refereed journal and conference papers in the areas of control systems (with applications to flexible structures, disk drives, tape systems, and wind turbines), multisensor data fusion, and haptic and multimodal visual/haptic/audio interfaces. Her research is currently supported by the US National Science Foundation (NSF), the US Office of Naval Research (ONR), the Colorado Center for Information Storage, and Maxtor Corporation. She also currently has active collaborations with Quantum Corporation and the National Renewable Energy Laboratory (NREL). Dr. Pao is the recipient of the 1996 IFAC World Congress Young Author Prize, and she is co-author and advisor on the paper that was awarded the Best Student Paper Award at the 2001 American Control Conference (ACC). She also received an NSF CAREER Award (1996–2001) and an ONR Young Investigator Award (1997–2000). Other recent awards include a 2003 Subaru Teaching Excellence Award and the Best Commercial Potential Award at the 2004 International Symposium on Haptic Interfaces for Virtual Environments and Teleoperator Systems, held in conjunction with the IEEE Virtual Reality Conference. She has been on organizing committees and program committees for over a dozen international conferences and workshops. Recently, she was the Vice Chair for Invited Sessions for the 2003 ACC, an appointed member of the IEEE Control Systems Society (CSS) Board of Governors (BoG) in 2003, and the Program Chair for the 2004 ACC. She is currently an elected member of the IEEE CSS BoG (2005–2007), a member of the IFAC Committee on the Past, Present, and Future of Control Education, and an Associate Editor for the International Journal of Control, Automation, and Systems.