Computers and Electrical Engineering 40 (2014) 884–896
Contents lists available at ScienceDirect
Computers and Electrical Engineering journal homepage: www.elsevier.com/locate/compeleceng
Mixture reduction techniques and probabilistic intensity models for multiple hypothesis tracking of targets in clutter q Hugh L. Kennedy ⇑ Defence and Systems Institute (DASI), University of South Australia, Australia
a r t i c l e
i n f o
Article history: Available online 24 August 2013
a b s t r a c t A linear combination of Gaussian components, i.e. a Gaussian ‘mixture’, is used to represent the target probability density function (pdf) in Multiple Hypothesis Tracking (MHT) systems. The complexity of MHT is typically managed by ‘reducing’ the number of mixture components. Two complementary MHT mixture reduction algorithms are proposed and assessed using a simulation involving a cluttered infrared (IR) seeker scene. A simple means of incorporating intensity information is also derived and used by both methods to provide well balanced peak-to-track association weights. The first algorithm (MHT-2) uses the Integral Squared Error (ISE) criterion, evaluated over the entire posterior MHT pdf, in a guided optimization procedure, to quickly fit at most two components. The second algorithm (MHT-PE) uses many more components and a simple strategy, involving Pruning and Elimination of replicas, to maximize hypothesis diversity while keeping computational complexity under control. Ó 2013 Elsevier Ltd. All rights reserved.
1. Introduction Multiple Hypothesis Tracking (MHT) is a complete Bayesian framework for tracking targets in clutter [1]. In theory, it is an ideal data association approach in surveillance sensor systems using detect-before-track architectures where all target and clutter peaks are considered to be point-like with no distinguishing features. Tracks on targets are maintained primarily using kinematic models which define the relationship between spatial and temporal variables, using a Kalman filter. In some cases, intensity measurements may also be used to probabilistically adjust the weights of peak-to-track assignment hypotheses. By comparison, clutter is assumed, on average, to be less intense and spatiotemporally uncorrelated. After local maxima in the digitized sensor data have been identified, a threshold is applied to extract high-intensity peaks that may be due to targets. This simple but crude peak declaration process reduces the number of possible data association hypotheses by several orders of magnitude. MHT is then applied over time to further reduce the hypotheses to a number that provides a tolerably low probability of track divergence at an acceptable computational cost. The digitized sensor data are first pre-processed by a whitening algorithm to ensure that the aforementioned assumptions are reasonable. An ideal whitener transforms structured backgrounds into spatiotemporally de-correlated white noise. The target tracking problem is then reduced to a form that is readily solved using MHT, regardless of the sensor modality – acoustic, electro-optic, radio-frequency or hyper-spectral. An infrared (IR) seeker application is considered in this paper [2,3]. While perfect whitening is unlikely in real electro-optic systems, the assumption of de-correlated clutter is a convenient theoretical starting point.
q
Reviews processed and recommended for publication to Editor-in-Chief by Deputy Editor Dr. Ferat Sahin.
⇑ Tel.: +61 8 8302 6582.
E-mail address:
[email protected] 0045-7906/$ - see front matter Ó 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.compeleceng.2013.07.023
H.L. Kennedy / Computers and Electrical Engineering 40 (2014) 884–896
885
In practice, MHT implementation in an embedded processor of modest capability is a challenge. MHT is perhaps best regarded as an unrealizable abstraction. MHT instantiations in real systems require simplifications, of one kind or another, to ensure that the number of competing data association hypotheses is manageable. Mixture reduction and sub-problem formation are particularly effective strategies for controlling MHT complexity [4]. Probabilistic Data Association (PDA) is an extreme case where each track is a single sub-problem and each sub-problem is reduced to a single Gaussian component [5,6]. Joint PDA (JPDA) also maintains a single component for each track but sub-problems with multiple tracks are permitted [6]. Mixture reduction strategies for tracking multiple closely-spaced targets using MHT has more recently been examined in [7], where the minimum mean optimal subpattern assignment (MMOSPA) metric [8], is used to produce ‘‘smooth, coalescencefree state estimates’’. Reduction algorithms are also required to manage complexity in the newer Gaussian-mixture probability hypothesis density (PHD) trackers [9–11]. A number of generic reduction strategies have been proposed to allow the use of Gaussian mixtures of arbitrary size. Replacement of pairs and clusters by a single component in proposed in [12]. Candidates for replacement are identified using a statistical measure of inter-component separation. This approach has been incorporated in a multi-target tracker in [13]. The Integral Squared Error (ISE) is used in [14–17] to quantify the error associated with a given reduction. Unlike the Kullback–Leibler measure used in [18,11], the ISE has a closed analytical form, allowing it to be quickly evaluated without numerical approximation. Expressions for partial derivatives of the ISE [14], and the normalized ISE [17], have also been derived, which in principal, may be used to define a system of ordinary differential equations (ODEs) [14,17]; however, in practice, iterative rule-based approaches are found to strike a more sensible balance between tracking performance and computational cost. The error surface of the optimization problem is difficult to negotiate; thus mathematical rigor gives way to a somewhat more pragmatic series of: moment-preserving merge operations [18]; merge-or-prune operations [14–16]; or successive split operations, guided by an adaptive solver [17]. More recently, in [11] a generalized likelihood ratio is used in a statistical hypothesis test to determine which Gaussian components should be merged; in [19] sparse modelling is used within a L1/2 regularization framework, while a computationally expensive ‘brute-force’ combinatorial approach is adopted in [4]. In this paper, an efficient bimodal reduction algorithm is presented (MHT-2) [20]. The use of two components is sufficient to resolve temporary association ambiguity that might otherwise lead to track divergence in a simpler one-component approach. In the simple abstract scenarios investigated here, maintaining a greater number of components, using a more complicated mixture reduction scheme, offers diminishing returns. An alternative reduction algorithm (MHT-PE), capable of maintaining an arbitrary number of components at a reasonable computational cost, is also presented [20]. It is used to highlight the performance/complexity trade-off associated with MHT system design. In the next section the use of intensity information is discussed. The target tracking problem is then posed, in the context of an IR seeker in a long-range air-to-air engagement. The low-observable target, which is well approximated by a single point, is set against blue sky. It is assumed that the track management functions (initiation and termination) are manually performed by the pilot therefore the formulations do not include an integrated track confidence model. Finally, Monte Carlo (MC) simulations are used to assess the various mixture reduction strategies, using the mean-time-before-failure (mtbf) and the mean-time-to-process-frame (mtpf) metrics as measures of performance.
2. Intensity ‘Information’ The use of non-kinematic feature attributes has been shown to aid data association in imaging sensors. Peak curvature and volume [21]; color distribution [22]; down-range and cross-range extent [23]; target size, shape and orientation [24]; and of course signal strength or intensity [21,25,26]; have all been proposed. The volume and diversity of published research on the use of intensity information in PDA and MHT trackers suggests that the subject is more of an art than a science. In [21,25], clutter-peak signal-to-noise ratio (SNR) statistics are represented using discrete histograms and Gaussian distributions are used to model target SNR. The process- and measurement-noise variance of the target SNR is assumed to be known a priori and a Kalman filter is used to estimate the mean. A high filter gain in required in this dimension to ensure that the SNR estimate follows the target during periods of sudden signal power fluctuation. As a side-effect, the high gain may cause the intensity estimate to follow clutter. A Markov stationary random signal was shown to be an appropriate target intensity model in the real IR data sets analyzed in [26]. Rayleigh pdfs were used to represent clutter-peak and target-peak amplitude distributions in [27,28], with the mean of the target pdf estimated using an exponential average. The resulting likelihood ratio increases very rapidly with the measured amplitude; for example, see (29) in [26]. As discussed in [29], this gives a unity probabilistic weighting to high-amplitude peaks, causing the filter to over commit, which negates the benefits of a PDA filter, that is, its ‘hedging’ behavior. Similar problems are also likely when Rayleigh distributions are used in operational PHD-type trackers [9]. This highlights an important design consideration – if intensity models are used in practice, then care must be taken to ensure that their influence does not overwhelm the influence of kinematic models, which are arguably more reliable. In contrast to kinematic distributions, which are governed by the laws of Newtonian mechanics, intensity distributions are notoriously difficult to model because their parameters vary with time and past behavior is not a reliable indicator of future behavior. Blackman et al. point out that high-SNR clutter peaks occur more frequently than might otherwise be expected from a simple statistical analysis [30]. Consequently in their MHT, they recommend the use of heavy-tailed
886
H.L. Kennedy / Computers and Electrical Engineering 40 (2014) 884–896
functions, such as Student’s t distribution to better represent the occurrence of clutter outliers. If these outliers are given a large weighting in a data association filter, then track divergence is likely. It is therefore important to choose intensity models that maintain an appropriate balance between kinematic and non-kinematic characteristics. Other heavy-tailed distributions have also been used, for example Type-1 Swerling targets in K-distributed clutter are assumed in [31,32], due to their physical plausibility and mathematical convenience. If prior amplitude/intensity distributions are used then their parameters must either be estimated on-line or set using prior information. The former approach offers flexibility while the latter approach offers certainty. The measured intensity of targets observed using an IR sensor in an air-to-air engagement changes rapidly and unpredictably due to: (1) (2) (3) (4) (5) (6)
Reflection of sunlight from the target. Target throttle adjustments. Dependence of hot-spot visibility on target aspect. Obscuration of the target by cloud. Changes in sensor gain or frequency band settings. Directed IR counter measures.
While short-term trends may be seen in the data, they are broken suddenly and without warning. For these reasons, no attempt is made here to estimate the target intensity. Instead, the mean of the target intensity pdf is assumed to be a constant multiple of the clutter intensity pdf. A factor of two is recommended. When the actual ratio is significantly less than this value, and the clutter-peak density is high, the PDA filter is unlikely to maintain track, with or without the use of intensity. When the actual ratio is greater than this value, the tracking problem is simpler because the spatial clutter-peak density is lower and the probability of detection is closer to unity, thus the tracker is less reliant on intensity information. It is postulated that the tracker therefore stands to gain the most from intensity information when the SNR is in the 3–10 dB range. The intensity of the portion of the clutter peaks above the detection threshold is assumed to be exponentially distributed while the target peaks are assumed to be Rayleigh distributed. The (dimensionless) intensity likelihood ratio is then shown to be proportional to the intensity of a given peak, with the constant of proportionality equal to the inverse of the mean clutter-peak intensity. This simple and intuitive relationship gives rise to well-balanced event probabilities – high-intensity peaks are favoured but low-intensity peaks are not ignored. 3. Formulation The Nc brightest peaks in a given frame are retained and passed to the tracker. The detection threshold intensity (i.e. the intensity of the dimmest peak) is subtracted from all raw intensity measurements. This adjusted intensity I, is then used in all downstream processing functions. The way in which intensity information is incorporated into the PDA filter is presented in this section, along with a description of the two-component MHT filter. As the focus here is on the maintenance of a single track on a high-interest target in clutter, multiple target extensions are not presented here. 3.1. PDA The PDA filter update manages association ambiguity by correcting a predicted track using all feasible assignment hypotheses (or events). A Kalman filter correction is used for hypotheses involving a target peak. The prediction is unchanged for the single hypothesis with an absent target peak. The result is a Gaussian mixture. For the so-called ‘parametric’ form of the PDA filter, the un-normalized mixture weights are
aPDAz ¼ ð1 pg pd Þfc ðNg ; q; V g Þ 0
ð1Þ
for the prediction, or the target-peak-is-absent hypothesis, and
aPDAz ¼ n
1 ^; SÞ p fc ðNg 1; q; V g Þft ðzn ; y Ng d
ð2Þ
for the correction involving nth peak, for n = 1 . . . Ng, in the ellipsoidal association (or validation) gate, where:
The PDAz superscript is used to denote a PDA-related variable, in units of density, raised to a power of Ng. Ng is the number of peaks inside the association gate. Vg is the volume of the association gate. pg is the probability that the target peak is within the association gate. pd is the probability that the target peak is above the detection threshold. fc(N; q, V) is the ‘likelihood’ that there are N clutter peaks, within a volume (or area) of V, given a local clutter-peak spatialdensity of q. It is the product of multiple uniform probability density function (pdf) terms and a Poisson probability mass function (pmf) term
H.L. Kennedy / Computers and Electrical Engineering 40 (2014) 884–896
fc ðN; q; V g Þ ¼
1 Vg
N
ðqV g ÞN expfqV g g: N!
887
ð3Þ
When the target density is low and the clutter density is high, the density parameter q is equal to the number of peaks (Nc) divided by the sensor sensor’s field of view [33]. The factor of N1g in (2) is used to account for the fact that there are Ng different ways of removing a single clutter peak from the validation volume. ^; SÞ is the likelihood that the n th peak is due to the target being tracked, ft ðzn ; y
^; SÞ ¼ ft ðzn ; y
1 ^ T S1 ½zn y ^ pffiffiffiffiffiffi exp 2 ½zn y jSj ð2pÞ 1
Mz 2
ð4Þ
^ is the estimate of the where zn is the spatial coordinate vector of the n th peak inside the association gate, of length Mz, y target’s state predicted forward to the current time and transformed into the measurement space, S is the target’s innovation (or residual) covariance matrix. The association gate is defined using the ellipsoid
^T S1 ½z y ^ < k2 : ½z y
ð5Þ
For each track, the number of associated peaks is limited to Nd. All peaks gated by a given track are sorted on the metric defined in (5), that is, the squared Mahalanobis distance. The most distant peaks are discarded to ensure that N g 6 N d . The dimensions of the measurement (Mz) and state spaces are assumed here to be 2 (x and y position) and 4 (x and y position and velocity) respectively. Dividing (1) and (2) by pdfc(Ng; q, Vg), and simplifying, cancels the gate volume terms and results in two dimensionless quantities
aPDA ¼ 0
ð1 pg pd Þ pd
ð6Þ
and
aPDA ¼ n
^; SÞ f1 ðzn ; y
q
ð7Þ
where the PDA superscript is used to denote a dimensionless PDA-related variable. If a simplistic model of target motion is assumed, then pg is close to unity in the implementation used here, because a very large association gate is used, therefore
aPDA ¼ 0
ð1 pa Þ pa
ð8Þ
where pa is the probability of target peak association, which is equal to pd if a simple model of sensor measurement is also assumed; otherwise, pa < pd. Using this definition, 1 pa is the probability of any adverse event, that causes a target peak to be absent, such as a target peak below the detection threshold, an extreme maneuver, an occluded target, and uncompensated sensor motion. This modification to the standard parametric PDA filter is used here to simplify the event space and to introduce some flexibility into the tracker tuning process. The normalized mixture weights are
aPDA PDA n0 ¼0 an0
ð9Þ
aPDA PDA n0 ¼0 an0
ð10Þ
bPDA ¼ PNg 0 0 and
bPDA ¼ PNg n n
The mixture is then reduced to a single Gaussian component. The parameters of the Gaussian are chosen to equal the parameters of the mixture, by matching the first and second moments of the two distributions using
^r ¼ x
N X ^n bn x
ð11Þ
n¼0
and
Pr ¼
N n o X ^n x ^ r Þðx ^n x ^r ÞT bn Pn þ ðx n¼0
ð12Þ
888
H.L. Kennedy / Computers and Electrical Engineering 40 (2014) 884–896
3.2. PDA with Intensity Intensity information is incorporated into the PDA filter using
aiPDAz ¼ ð1 pa Þfc ðNg ; q; V g Þ 0
Ng Y gðIn ; jc ; hc Þ
ð13Þ
n¼1
for the prediction hypothesis and
aiPDAz ¼ n
Ng Y 1 ^; SÞgðIn ; jt ; ht Þ gðIn0 ; jc ; hc Þ pa fc ðNg 1; q; V g Þft ðzn ; y Ng n0 ¼1
ð14Þ
n0 –n
for the correction involving the nth peak in the association gate, where the c and t subscripts correspond to the clutter-peak and target-peak populations, respectively; and where the iPDAz superscript refers to a PDA-related quantity, with intensity information, in units of density, raised to a power of Ng. In (13) and (14), g(I; j, h) is the gamma distribution, that is,
gðI; j; hÞ ¼
Ij1 I exp : h h ðj 1Þ!
ð15Þ
j
In the above definition, j and h are the shape and scale parameters, respectively. The mean of the distribution is jh. The clutter peaks above the detection threshold are assumed to be exponentially distributed (jc = 1). In theory, the mean clutterpeak intensity Ic, is assumed to be known; however, in practice, it is set equal to the mean intensity bI, of all peaks above the detection threshold, in a given frame. The target peaks are assumed to be Rayleigh distributed (jt = 2); furthermore, the mean intensity of the target peaks It, is assumed to be twice that of the clutter peaks. Dividing (13) and (14) by QNg pa fc ðN g ; q; V g Þ n¼1 gðIn ; j0 ; h0 Þ, and simplifying, then yields
aiPDA ¼ 0
ð1 pa Þ pa
ð16Þ
aiPDA ¼ n
^ ; SÞ I n ft ðzn ; y q Ic
ð17Þ
and
where the iPDA superscript refers to a dimensionless PDA-related quantity, with intensity information. 3.3. MHT with Intensity PDA and JPDA may result in the posterior pdf being too thinly ‘spread’ in cases of severe peak-to-track assignment ambiguity and conflict. Then on the next update, none of the more likely hypotheses in the previous frame make an appropriate contribution to the estimate of the target’s state in the current frame because the target-peak-is-absent hypothesis is over represented. An MHT mixture reduction algorithm, that does not reconsider and adjust previous association hypotheses, maintains at most Ld components. Like PDA and JPDA, the parameters of the reduced mixture components need not be the same as any of the original mixture components. The posterior target pdf for the kth frame is then
pðx; LÞ ¼
L X ^ l ; Pl Þ wkl N ðx; x
ð18Þ
l¼1
where L 6 Ld and w is the component weight. The reduced mixture is propagated forward to the time of the next frame, transformed from state space to measurement space, then used to create the new multi-modal target likelihood function (ft). After peaks have been gated in the next frame, the MHT mixture is used to represent the prior pdf when evaluating the likelihood of all feasible association hypotheses. The MHT mixture weights are Ng Y
aiMHTz ¼ wk1 ð1 pa Þfc ðNg ; q; V g Þ l l;0
gðIn ; jc ; hc Þ
ð19Þ
n¼1
for the prediction hypothesis due to the lth component and
aiMHTz ¼ l;n
Ng Y 1 k1 ^; SÞgðIn ; jt ; ht Þ gðIn0 ; jc ; hc Þ wl pa fc ðNg 1; q; V g Þft ðzn ; y Ng n0 ¼1
ð20Þ
n0 –n
for the correction involving the lth component and the nth peak in the association gate in the kth frame. Dividing throughout QNg by pa fc ðN g ; q; V g Þ n¼1 gðIn ; jc ; hc Þ and simplifying yields the dimensionless, un-normalized, MHT mixture weights
H.L. Kennedy / Computers and Electrical Engineering 40 (2014) 884–896
889
aiMHT ¼ wk1 l l;0
ð1 pa Þ pa
ð21Þ
aiMHT ¼ wk1 l l;n
^ l ; Sl Þ In ft ðzn ; y q Ic
ð22Þ
and
In simple scenarios L = 1 is typically sufficient (i.e. PDA/JPDA). In slightly more challenging scenarios, the posterior pdf may be strongly bimodal. Bimodality typically arises when (1) A single distant peak leads to target-peak-absent and target-peak-present hypotheses of similar likelihood. (2) Two nearby peaks, on either side of the predicted measurement position, lead to two target-peak-present hypotheses of similar likelihood. Both cases ‘stretch’ a single Gaussian to cover both assignment possibilities; however, there is insufficient probability mass where it is most required. The MHT-2 approach proposed is this paper therefore uses a simple hypothesis reduction algorithm, capable of handling these cases, without incurring a heavy processing penalty. The process introduces an additional degree of freedom when representing the posterior pdf. Firstly, all components with negligible mixture coefficients less than a specified threshold (wr) are deleted and the pdf is renormalized. A planar partition is then used to separate the mixture components of p(x; Ng + 1) into two groups. A single Gaussian is fitted to the Gaussian components in each group to form a candidate two-component mixture p(x; 2). The parameters of the single Gaussian are chosen to equal the parameters of the mixture in the group, by matching their first and second moments. The merit of the candidate two-component mixture, is determined using the global ISE [14–16], where
ISEðA; BÞ ¼
Z
2 fpðx; NA Þ pðx; NB Þg dx:
ð23Þ
It is evaluated analytically using T
ISEðA; BÞ ¼ aT IAA a 2aT IAB b þ b IBB b
ð24Þ
where, for NA = Ng + 1 and NB = 2, a is a column vector of length Ng + 1 containing the mixture coefficients of p(x; Ng + 1). b is a column vector of length 2 containing the reduced mixture coefficients of p(x; 2) and IAA, IAB and IBB are matrices containing the integrals of Gaussian product pairs, with elements evaluated using
Iab ¼
Z
n o 1 ^a x ^b ÞT ðPa þ Pb Þ1 ðx ^a x ^b Þ : ^b ; Pb Þdx ¼ qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi exp ðx ^a ; Pa ÞN ðx; x N ðx; x D ð2pÞ jPa þ Pb j
ð25Þ
As the first term of the ISE (aTIAAa) is the same for all candidate configurations, it may be omitted as a time saving measure. In state space, the normal vector of the planar partition coincides with the major principal axis of a one-component ref ^REF ; PREF , which is fitted to the complete MHT pdf prior to reduction. The partition is shifted erence Gaussian ðx; 1Þ ¼ N x; x along the axis and the merit of all possible Ng candidate two-component mixtures evaluated. The Gaussian components are
^l x ^ REF , onto the major principal axis direction vector, which assigned to a group by projecting their relative locations, l ¼ x REF with the greatest eigenvalue. The two-component mixture that minimizes the ISE is used as the is the eigenvector of P MHT posterior pdf of the track. In the current application, position and velocity estimates are produced in the x and y dimensions, therefore the state-space dimension is equal to four (D = 4). In some cases the two reduced components (a and b) are effectively identical. The normalized overlap integral (S) is used as a final measure of similarity
R ^ a ; Pa ÞN ðx; x ^b ; Pb Þdx N ðx; x ffi Sab ¼ qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi R R 2 ^b ; Pb Þ2 dx ^a ; Pa Þ dx N ðx; x N ðx; x
ð26Þ
The reference Gaussian is used when S > Sr, where Sr is a system constant. In this case, MHT-2 reduces to PDA. Regardless of the reduction process outcome, the reference Gaussian is used as the estimate of the target’s state when reporting to the pilot or a control subsystem. As the mean of the reference Gaussian is a weighted average quantity, this provides a relatively smooth output. The reference Gaussian is also used to gate peaks during association processing. Gating is a crude process, which is used to limit the scope of the assignment problem; in contrast, likelihood evaluation is a more subtle process, which benefits from a high-fidelity representation of the pdf. The reference Gaussian is a persistent entity within the system while the corresponding L MHT mixture components are ephemeral entities that simply transfer probability mass from one frame to the next. Mixture reduction is a formal means of performing hypothesis pruning and merging functions in an MHT tracker. As a result of the mixture reduction process, there is no continuity of component identity. Unlike
890
H.L. Kennedy / Computers and Electrical Engineering 40 (2014) 884–896
track-splitting MHT implementations, track components only exist from the correction stage of a given frame to the prediction stage of the next frame. 4. Simulation 4.1. Scenarios Clutter was randomly generated and distributed over a 64 64 pixel (pix) Field-Of-View (FOV), according to (3) and (15). Unless otherwise indicated, the ‘baseline’ scenario used 100 peaks per frame (frm) and SNR = 3 dB, where SNR = 20log10(jtht/ jchc). A single point-target was injected, with pd = 1.0 for the first two frames to facilitate track establishment, and pd = 0.8 thereafter. Unity pd was used in [12]. Unfortunately, pd < 1.0 is an inextricable part of the tracking-in-clutter problem and consecutive missed target detections were a contributing factor in most cases of tracking failure in the simulations performed here. The target track was ‘manually’ initiated on the first target detection. The target orbited the FOV center on a circular trajectory with a radius of rt = 2/3 64 (pix) and a constant tangential velocity of vt = 1 (pix/frm). Gaussian noise with lR = 0 pix) and rR = 1 (pix) was added to target position measurements. 4.2. Trackers The diagonal elements of the R matrix used by the Kalman filter were set equal to r2R ; while the Q matrix was parameterized using an approximate linearization of target motion using rQ = at, where at (pix/frm2) is the centripetal acceleration, determined using at ¼ v 2t =r t . Parameters for all filters in all scenarios, unless otherwise stated, were set as follows: k2 ¼ 64, Nd = 24, Ld = 64, Sr = 0.7 and wr = 1.0E4. A value of pa = 0.90 was used by the PDA, MHT-2 and MHT-PM; pa = 0.99 was used
^ ¼ zx ; 0; zy ; 0 T by MHT-P and MHT-PE. The tracker designations are defined below. The Kalman filter was initialized using x and diagðPÞ ¼ ½r2R ; v 2t ; r2R ; v 2t . One hundred instantiations (‘runs’) of each scenario were generated. For a given tracker, each ^ REF for the MHT trackers) and the true target run was terminated when the distance between the estimated target position (x position exceeded 16 pixels – i.e. when the track is deemed to be ‘divergent’. PDA, MHT-2, MHT-P, MHT-PE, and MHT-PM tracking algorithms were implemented: (1) PDA is the one-component reduction scheme described in Sections 3.1 and 3.2. (2) MHT-2 is the two-component reduction scheme described in Section 3.3. (3) MHT-P uses a simple Prune-only strategy. It sorts the components on their weight and keeps only the Ld ‘heaviest’ terms. (4) MHT-PE uses a slightly more involved Prune then Eliminate duplication strategy. Component duplicates are identified using (26). Then the term with the lesser weight is deleted while its weight is added to the term with the greater weight, which is retained. All component pairs with S > Sr are treated consecutively in this manner, from most similar to least similar. Finally, the Ld heaviest terms remaining are identified and retained.
Fig. 1. Baseline simulation example, processed with MHT-2.
891
H.L. Kennedy / Computers and Electrical Engineering 40 (2014) 884–896 Table 1 Baseline scenario results.a
mtbf (frm) mtpf (s) a
PDA
MHT-2
MHT-P
MHT-PE
84.0 0.0023
153.4 0.0131
102.2 0.0360
334.7 0.0322
PDA
MHT-2
MHT-P
MHT-PE
51.3 42.2 0.0024
78.2 59.2 0.0227
57.6 63.2 0.0395
154.2 126.4 0.0498
PDA
MHT-2
MHT-P
MHT-PE
51.3 676.1 0.0021
78.2 1419.2 0.0079
57.6 447.9 0.0326
154.2 2191.5 0.0126
PDA
MHT-2
MHT-P
MHT-PE
153.9 300.3 0.0012
276.3 820.6 0.0049
153.4 306.3 0.0191
730.0 1818.9 0.0078
PDA
MHT-2
MHT-P
MHT-PE
34.9 46.2 0.0041
40.1 63.0 0.0408
34.7 48.2 0.0648
45.5 88.3 0.1546
(SNR = 3, Nd = 100).
Table 2 LOW-SNR scenario results.a
mtbf (frm)a mtbf (frm) mtpf (s) a
(SNR = 0, Nd = 100); Intensity information ignored by all trackers.
Table 3 High-SNR scenario results.a
mtbf (frm)a mtbf (frm) mtpf (s) a
(SNR = 10, Nd = 100); Intensity information ignored by all trackers.
Table 4 Low-clutter scenario results.a
mtbf (frm)a mtbf (frm) mtpf (s) a
(SNR = 3, Nd = 50); Intensity information ignored by all trackers.
Table 5 High-clutter scenario results.a
mtbf (frm)a mtbf (frm) mtpf (s) a
(SNR = 3, Nd = 200); Intensity information ignored by all trackers.
(5) MHT-PM is the iterative Prune or Merge strategy described in [14–16], where on each iteration, the operation that minimizes the global ISE cost is applied. Reduction processing in all MHT implementations begins with the deletion of insignificant components, with normalized weights less than wr. A simulation example is depicted in Fig. 1; results for the baseline scenario and alternative scenarios are given in Tables 1–5; results are plotted in Figs. 2 and 3 along with some tracker parameter tuning results; and a time-before-failure (tbf) histogram is plotted in Fig. 4. 5. Discussion 5.1. Summary of results The results show that in the baseline scenario, the mtbf of MHT-2 is approximately twice that of PDA and that the mtbf of MHT-PE is approximately twice that of MHT-2. The mtbf of MHT-P is slightly greater than the mtbf of PDA; however, the mtpf of MHT-P is an order of magnitude greater.
892
H.L. Kennedy / Computers and Electrical Engineering 40 (2014) 884–896
Fig. 2. Mean-time-before-failure (mtbf), for each tracker, as a function of algorithm tuning and simulation scenario parameters.
Fig. 3. Mean-time-to-process-frame (mtpf), for each tracker, as a function of algorithm tuning and simulation scenario parameters.
5.2. Computational complexity When an upper limit is placed on the number of MHT components, the removal of duplicates is required because they consume computing resources that could be better utilized elsewhere. However, the complexity of a reduction algorithm that executes merge operations has quadratic complexity which scales as OfðN d Ld Þ2 g when association ambiguity is high,
H.L. Kennedy / Computers and Electrical Engineering 40 (2014) 884–896
893
Fig. 4. Time-before-failure (tbf) histogram, for each tracker, for 1000 repetitions of the baseline scenario.
in contrast to the linear complexity of the component updates (prediction, association and correction) which scales as OfðN d Ld Þg. It is therefore important to ensure that the time spent reducing the mixture is not greater than the time saved through the manipulation of fewer components during the remainder of the track update cycle. This design principle rules out the use of more rigorous methods in the current application, such as the combinatorial approach adopted in [4], which takes around 10 min to reduce a ten-component mixture in a four-dimensional state-space. MHT-P is faster than MHT-PE because it employs a simpler prune-only reduction strategy, with no replica management policy; however it struggles to outperform PDA because maintaining many hypotheses is of little benefit if they are all concentrated in the same region of state space. In contrast, the moment matching fitting process used to reduce the MHT-2 and PDA pdfs helps to ‘spread out’ the probability mass. A more ‘diffuse’ pdf represents uncertainty more effectively than many closely spaced components, resulting in a very favorable (i.e. large) mtbf to mtpf ratio. However, the moment-matching fitting process does have an undesirable side effect – it destroys the symmetric x–y block structure of the component covariance matrices, which complicates the Kalman filter matrix manipulations. As all filters used the same Kalman filter Matlab code, the MHT-P and MHT-PE filters were unable to exploit this property to gain a speed advantage. 5.3. The role of pa A less-than-unity probability of association means that the target-peak-is-absent hypothesis always has a non-negligible likelihood. It was found that pa P pd is desirable in PDA and MHT-2, probably to compensate for the fact that the target’s acceleration is not distributed as a zero-mean normal variable. Higher pa increases the weighting of peak-is-target events, which in turn increases the filter gain. An even higher pa benefits the MHT-P and MHT-PE filters because it helps to spread the pdf, encouraging it to ‘explore’ unlikely association hypotheses, which may become more likely in the future as more data are acquired. With a lower pa value, the target-peak-is-absent hypotheses ‘spawned’ from many similar components overlap, resulting in an undesirable accumulation of prediction-only probability mass when the track is ‘struggling’ to ‘find’ the target. 5.4. MHT-PM The MHT-PM algorithm has a number of attractive features [14–16]. Firstly, it seeks a reduced representation that minimizes a clearly defined error metric, evaluated over the entire pdf. It does not simply consider component doublets or singlets in isolation. Furthermore, the error metric has a closed analytical form, allowing it to be evaluated quickly and accurately without numerical approximation. The MHT-PM algorithm also has a number of shortcomings in the tracking scenarios used here. Firstly, like the approach used in [18], the algorithm aims to reduce the mixture to a minimum number of components. Consequently, during periods of low association ambiguity, many more components may be maintained than are actually needed. For this reason, the approach described in [17] is probably more appropriate, because it builds up from one component, rather than breaking down from all components. The MHT-PM search process is also prohibitively slow. The speed problem is exacerbated by the fact that the reduced mixture components evolve as the search is executed – they are not necessarily the same as the components of the original mixture, due to merge operations. This means that all of the required error integrals cannot simply be pre-computed and
894
H.L. Kennedy / Computers and Electrical Engineering 40 (2014) 884–896
reused as the reduction unfolds and that the integrals are unable to exploit x–y block symmetry in matrix inversions. Even though the integrals are evaluated analytically, a matrix determinant and inverse is required; however, it is possible to simplify these matrix operations if the product of the diagonal matrix elements is used for the determinant and if U–D factorization with back substitution is used instead of the full matrix inverse [15]. We were unable to encode a MHT-PM implementation that ran fast enough to allow a complete performance optimization and investigation. Only the baseline scenario was run with Ld = 2 and all other parameters set to the MHT defaults. In the Ld = 2 case, mtbf = 96.0 frames and mtpf = 2.2 s. Thus its mtbf is much shorter than MHT-2 (153.4 frames), longer than PDA (84.0 frames) and MHT-PE (83.0 frames, with Ld = 4), but less than MHT-PE (157.6 frames, with Ld = 8), for an mtpf that is two orders of magnitude greater than MHT-PE (0.0323 s, with Ld = 128). While implementation optimizations have been suggested [15], the iterative nature of the algorithm and the need to evaluate many integrals on each pass, places it at a disadvantage relative to other simpler approaches [12], even when all the integrals are evaluated analytically. 5.5. MHT-2 Like the MHT-PM algorithm, the MHT-2 reduction process minimizes a global cost function (the ISE); unlike MHT-PM, it employs a guided search, which greatly reduces the dimension of the parameter space. The MHT-2 algorithm was motivated by the observation that the maintenance of more components does not automatically translate into more performance, i.e. a longer mtbf. The PDA algorithm performs surprisingly well with only one well-fitted component, while the MHT-P performance gain is surprisingly slight, with so many more – albeit poorly selected – components retained. 5.6. MHT-PE The MHT-PE algorithm is more of a ‘brute force’ approach. It aims to maintain a large number of hypotheses using a fast and simple reduction approach. Comparatively naive logic is used to remove unlikely and duplicate hypotheses from a very large starting set. The MHT-PE algorithm was motivated by the observation that the maintenance of more components only translates into a longer mtbf without an unreasonable increase in the mtpf, if hypothesis duplication is eliminated and if the reduction process is simple. While the intrinsic OfðN d Ld Þ2 g complexity of a pair-wise reduction process cannot be avoided, the simplicity of the approach ensures that the proportionality constant is small. After twin components have been identified using (26), no attempt is made to merge the components by combining their parameters. This means that normalized overlap integrals between elements do not need to be computed ‘on-the-fly’ and that matrix blocking is preserved. The identification of duplicates causes MHT-PE to run much more slowly than MHT-P, i.e. an increased mtpf, for the same number of retained components; however, the increased hypothesis diversity results in a much longer mtbf. 5.7. Tuning considerations The results in Figs. 2 and 3 suggest that Sr = 0.7 is an appropriate setting for MHT-PE: A larger value results in a much longer mtpf because the processor is burdened by many similar components; while a smaller value means that the tracker is unable to fully utilize the tracking benefits a large Ld, because too many components are deemed to be duplicates and eliminated. With an mtbf of 126.4 frames, MHT-PE performs well in the low-SNR scenario, relative to 42.2, 59.2 and 63.2 frames, for PDA, MHT-2 and MHT-P, respectively. The results were similar in the high-clutter scenario, with an mtbf of 88.3, compared with 46.2, 63.0 and 48.2 frames, for PDA, MHT-2 and MHT-P, respectively. The plateau in mtbf as a function of Nd and Ld, suggests that more hypotheses do not necessarily result in a longer mtbf in the baseline scenario and that the rate of mtbf increase is greatest for small Nd and Ld. MHT-2 has the advantage of a more uniform and predictable system load. The MHTPE load is less predictable due to the highly variable number of retained hypotheses; however, there is greater scope to configure it for an optimal tradeoff between mtbf and mtpf. The performance of MHT-PE with Ld = 8 is very similar to MHT-2. In this case, for the baseline scenario, the mtbf and mtpf of MHT-PE is 157.6 frames and 0.0134 s, respectively; compared with, 153.4 frames and 0.0131 s for MHT-2. In the highclutter scenario, the mtbf and mtpf of MHT-PE with Ld = 8, is 50.9 frames and 0.0317 s, respectively; compared with 63.0 frames and 0.0408 s for MHT-2. This suggests that MHT-2 mixture reduction allows more to be done with fewer components but that the associated computational cost of this ‘smarter’ MRA is not insignificant. 5.8. The benefits of an intensity model Results with and without intensity information are given in Tables 2–4. When intensity information is ignored, the intensity likelihood ratio is set to unity (In/Ic = 1). In the baseline scenario, use of intensity information nearly doubles the mtbf for all trackers, for an insignificant increase in mtpf (compare the first row of Table 1 with the first row of Table 2). In the lowSNR scenario, where target and clutter intensities are equal on average, the distribution used to generate the intensity data (i.e. an SNR of 0 dB) does not match the models (i.e. an SNR of 3 dB) assumed by the trackers (compare the second row of Table 2 with the first row of Table 2). As a consequence, there is a slight reduction in the mtbf for all trackers, with the exception of MHT-P, when an incorrect intensity model is used. The intensity models are also mismatched in the high-SNR scenario; however, the mtbf is increased in all cases, regardless. Note that when intensity information is ignored, the results
H.L. Kennedy / Computers and Electrical Engineering 40 (2014) 884–896
895
are the same for the low-SNR, baseline, and high-SNR, scenarios. The mbtf dispersion of all trackers is ‘compressed’ in the high-clutter scenario when intensity information is ignored [20]. This suggests that the value of retaining more hypotheses diminishes in the absence of intensity information and that the tracker needs a way of ‘knowing’ when one of the components has ‘found’ the target again after a period of heightened uncertainty. It is impossible to prove the utility of the probabilistic intensity models used in this paper using simulated data alone. The design rationale was motivated by three observations: (1) Firstly, that a brighter object is usually more likely to be a target than clutter. (2) Secondly, that intensity information should be treated with ‘scepticism’ due to the unreliable nature of intensity models. (3) Thirdly, that the complexity of the probabilistic model should be commensurate with the reliability of the information. One undesirable consequence of using intensity information is degraded tracking of targets with equal or lesser brightness than the clutter; however, this is likely to be the case for all methods that rely on intensity information. 6. Conclusion The global ISE is a simple and effective measure of the goodness-of-fit for the reduction of the high-order Gaussian mixtures used in MHT; however, putting it to use in an ideal algorithm is not straightforward, due to: the high dimension of the parameter search space, constraints imposed on the pdf, and the ‘dimpled’ error surface. As tracking algorithms are required to run in real-time in mission-critical systems, mathematically sub-optimal implementations are needed. MHT-2 uses the ISE in a simple optimization process, where the major principal axis of the optimal one-component fit, is used to guide the search for a near-optimal two-component fit. For the baseline scenario described herein, MHT-2 increases the meantime-before-failure (mtbf) of the tracker by a factor of 1.8, while increasing the mean-time-to-process-frame (mtpf) by a factor of 5.7, relative to PDA. MHT-PE is a useful implementation for probing the performance limits of MHT in a reasonable timeframe; it is also more flexible than MHT-2, allowing the mtbf/mtpf balance to be optimized for a given tracking scenario. Both MHT-2 and MHT-PE provide similar or superior track-maintenance performance to MHT-PM, at a much lower computational cost. For all MHT methods, the improved ability to maintain track, relative to PDA, is particularly noticeable when the simple probabilistic intensity model described in this paper is used. The intensity model is intended to be simple, subtle, and flexible, so that it is able to accommodate a wide range of unexpected clutter conditions and target behaviors. References [1] Blackman SS. Multiple hypothesis tracking for multiple target tracking. IEEE Aerosp Electron Syst Mag 2004;19:5–18. [2] Bai Xiangzhi, Zhou Fugen. Infrared small target enhancement and detection based on modified top-hat transformations. Comput Elec Eng 2010;36:1193–201. [3] Bae Tae-Wuk, Kim Byoung-Ik, Kim Young-Choon, Sohng Kyu-Ik. Small target detection using cross product based on temporal profile in infrared image sequences. Comput Electr Eng 2010;36:1156–64. [4] Crouse DF, Willett P, Pattipati K, Svensson L. A look at Gaussian mixture reduction algorithms. In: Proc 14th int conf on information fusion; 2011. p. 1– 8. [5] Bar-Shalom Y, Kirubarajan T, Lin X. Probabilistic data association techniques for target tracking with applications to sonar, radar and EO sensors. IEEE Aerosp Electr Syst Mag 2005;20:37–56. [6] Bar-Shalom Y, Fortmann TE. Tracking and data association. Mathematics in science and engineering series, vol. 179. San Diego CA: Academic Press; 1988. [7] Crouse DF, Willett P, Svensson L, Svensson D, Guerriero M. The set MHT. In: Proc 14th int conf on information fusion; 2011. p. 1–8. [8] Guerriero M, Svensson L, Svensson D, Willett P. Shooting two birds with two bullets: how to find minimum mean OSPA estimates. In: Proc 13th int conf on information fusion; 2010. p. 1–8. [9] Clark D, Ristic B, Vo Ba-Ngu, Vo Ba-Tuong. Bayesian multi-object filtering with amplitude feature likelihood for unknown object SNR. IEEE Trans Signal Process 2010;58:26–37. [10] Georgescu R, Willett P. The GM-CPHD tracker applied to real and realistic multistatic sonar data sets. IEEE J Ocean Eng 2012;37:220–35. [11] Ardeshiri T, Orguner U, Lundquist C, Schon TB. On mixture reduction for multiple target tracking. In: Proc 15th international conference on information fusion (FUSION); 2012. p. 692–99. [12] Salmond DJ. Mixture reduction algorithms for point and extended object tracking in clutter. IEEE Trans Aeros Electr Syst 2009;45:667–86. [13] Pao LY. Multisensor mixture reduction algorithms for tracking. J Guid Contr Dynam 1994;17:1205–11. [14] Williams JL. Gaussian mixture reduction for tracking multiple maneuvering targets in clutter, M.S.E.E. Thesis, Air Force Institute of Technology, WrightPatterson Air Force Base, OH; 2003.
. [15] Williams JL, Maybeck PS. Cost-function-based hypothesis control techniques for multiple hypothesis tracking. Math Comput Model 2006;43:976–89. [16] Maybeck PS, Kozak MC, Smith BD. Mixed-model multiple-hypothesis tracking of targets in clutter. IEEE Trans Aeros Electr Syst 2008;44:1402–15. [17] Huber MF, Hanebeck UD. Progressive Gaussian mixture reduction. In: Proc 11th int conf on information fusion (ICIF); 2008. p. 1–8. [18] Runnalls AR. Kullback–Leibler approach to Gaussian mixture reduction. IEEE Trans Aeros Electr Syst 2007;43:989–99. [19] Zhu Hongyan, Han Chongzhao, Lin Yan. A reduced Gaussian mixture representation based on sparse modeling. In: Proc 15th international conference on information fusion (FUSION); 2012. p. 684–91. [20] Kennedy HL. Mixture reduction techniques for multiple hypothesis tracking of targets in clutter. In: Proc the 7th int conf on intelligent sensors, sensor networks and information processing (ISSNIP); 2011. p. 413–18. [21] Colegrove SB, Cheung B. A peak detector that picks more than peaks. In: Proc RADAR; 2002. p. 167–71. [22] Jaward M, Mihaylova L, Canagarajah N, Bull D. Multiple object tracking using particle filters. In: Proc IEEE Aerospace Conf; 2006. p. 8. [23] Salmond DJ, Parr MC. Track maintenance using measurements of target extent. IEE Proc Radar Son Nav 2003;150:389–95. [24] Koch Wolfgang. On Bayesian tracking of extended objects. In: Proc multisensor fusion and integration for intelligent systems; 2006. p. 209–16.
896
H.L. Kennedy / Computers and Electrical Engineering 40 (2014) 884–896
[25] Colegrove SB, Davey SJ. PDAF with multiple clutter regions and target models. IEEE Trans Aeros Electr Syst 2003;39:110–24. [26] Li Zhengzhou, Jin Gang, Dong Nengli. Novel approach for tracking and recognizing dim small moving targets based on probabilistic data association filter. Opt Eng 2007;46:016401. [27] Lerro D, Bar-Shalom Y. Automated tracking with target amplitude information. Proc Am Control Conf 1990:2875–80. [28] Bae SH, Kim DY, Yoon JH, Shin V, Yoon K-J. Automated multi-target tracking with kinematic and non-kinematic information. IET Radar Son Nav 2012;6:272–81. [29] Ehrman LM, Blair WD. Probabilistic data association with amplitude information versus the strongest neighbor filter. In: Proc IEEE Aerospace Conf; 2007. p. 1–6. [30] Blackman SS, Dempster RJ, Roszkowski SH, Sasaki DM, Singer PF. Improved tracking capability and efficient radar allocation through the fusion of radar and infrared search-and-track observations. Opt Eng 2000;39:1391–8. [31] Brekke E, Hallingstad O, Glattetre J. Tracking small targets in heavy-tailed clutter using amplitude information. IEEE J Ocean Eng 2010;35:314–29. [32] Brekke E, Hallingstad O, Glattetre J. The modified riccati equation for amplitude-aided target tracking in heavy-tailed clutter. IEEE Trans Aeros Electr Syst 2011;47:2874–86. [33] Bar-Shalom Y, Blackman SS, Fitzgerald RJ. Dimensionless score function for multiple hypothesis tracking. IEEE Trans Aeros Electr Syst 2007;43:392–400. Hugh L. Kennedy (BE & PhD from The University of New South Wales) is a principal engineer in the Defence and Systems Institute at the University of South Australia. Prior to joining the university in late 2010, he worked in industry on the design, development, integration and maintenance of a variety of different sensor systems – electro-optic, radio-frequency and acoustic.