Knowledge-Based Systems 21 (2008) 225–231 www.elsevier.com/locate/knosys
On a novel ACO-Estimator and its application to the Target Motion Analysis problem

Lars Nolle

School of Science and Technology, Nottingham Trent University, Clifton Lane, Nottingham NG11 8NS, UK
E-mail address: [email protected]

Available online 23 November 2007
doi:10.1016/j.knosys.2007.11.011
Abstract

In the oceanic context, the aim of Target Motion Analysis (TMA) is to estimate the state, i.e. location, bearing and velocity, of a sound-emitting object. These estimates are based on a series of passive measurements of both the angle and the distance between an observer and the source of sound, which is called the target. These measurements are corrupted by noise and false readings, which are perceived as outliers. Usually, sequences of measurements are taken and statistical methods, like the Least Squares method or the Annealing M-Estimator, are applied to estimate the target's state by minimising the residual in range and bearing for a series of measurements. In this project, an ACO-Estimator, a novel hybrid optimisation algorithm based on Ant Colony Optimisation, has been developed and applied to the TMA problem, and its effectiveness was compared with standard estimators. It was shown that the new algorithm outperforms conventional estimators by successfully removing outliers from the measurements.
© 2007 Elsevier B.V. All rights reserved.

Keywords: Target Motion Analysis; Robust estimation; Ant Colony Optimisation
1. Introduction

The aim of Target Motion Analysis (TMA) in the maritime context is to predict the state, i.e. location, bearing and velocity, of a signal-emitting object, also known as the target, based on previous observations [1]. The observer receives signals that are emitted from the target, where the time of emission is not known. The range R and the bearing θ of the target are usually determined by measuring differences in arrival time of short-duration acoustic emissions along the paths R1, R and R2 of a target T, using hydrophones that are mounted at some distance D on a cable towed by an observer platform (Fig. 1). In a real application, the time delay measurements are disturbed by noise, caused, for example, by the cross-correlation function used for finding a common signal in a pair of sensors or by the environment [2]. The time delay error distribution is non-Gaussian. Another source of errors is false readings, or clutter. This clutter is usually
assumed to be uniformly distributed over an area A and to follow a Poisson probability density function. Fig. 2 shows a typical scenario for the TMA problem: an observer and a target are moving with constant speed, and the target is detected at unequally spaced time instances by the observer. Fig. 3 shows the noisy measurements and clutter for the same scenario. Both types of errors introduce additional complexity to the target state estimation problem. Usually, the measurements are taken and estimators, for example the Least Squares (LS) method [3], are applied to estimate the target's state.

2. Robust estimation

The LS method attempts to minimise the sum of the squares of the errors (called residuals $r_i$) between the estimated positions and the corresponding measurements, Eq. (1):

$p = \arg\min_{p} \sum_{i=1}^{n} r_i^2$    (1)
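To make the Least Squares formulation of Eq. (1) concrete, the following minimal Python sketch fits a constant-velocity target state to range and bearing measurements by minimising the sum of squared residuals. The state parametrisation, the synthetic data and all names are illustrative assumptions for this sketch and are not taken from the paper.

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(p, t, obs_xy, meas_range, meas_bearing):
    """Range and bearing residuals for an assumed constant-velocity target state p = (x0, y0, vx, vy)."""
    x0, y0, vx, vy = p
    dx = x0 + vx * t - obs_xy[:, 0]
    dy = y0 + vy * t - obs_xy[:, 1]
    d_bearing = np.arctan2(dy, dx) - meas_bearing
    d_bearing = np.arctan2(np.sin(d_bearing), np.cos(d_bearing))  # wrap angle residuals to [-pi, pi]
    return np.concatenate([np.hypot(dx, dy) - meas_range, d_bearing])

# Synthetic example: stationary observer at the origin, target moving with constant velocity.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 100.0, 11)
obs_xy = np.zeros((t.size, 2))
tx, ty, tvx, tvy = 2000.0, 5000.0, 3.0, -1.5
dx, dy = tx + tvx * t, ty + tvy * t
meas_range = np.hypot(dx, dy) + rng.normal(0.0, 20.0, t.size)
meas_bearing = np.arctan2(dy, dx) + rng.normal(0.0, 0.01, t.size)

# Eq. (1): the estimate is argmin_p sum_i r_i^2; least_squares minimises the squared residuals.
p_hat = least_squares(residuals, x0=[1000.0, 1000.0, 0.0, 0.0],
                      args=(t, obs_xy, meas_range, meas_bearing)).x
print(p_hat)
```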
Fig. 1. Geometry for the Target Motion Analysis problem (target T, paths R1, R and R2, bearing θ, sensors 1–3 at spacing D on a cable towed by the observer).

Fig. 2. Typical scenario for the Target Motion Analysis problem (true target trajectory and observer trajectory in the x–y plane, in metres).

Fig. 3. Noisy measurements and clutter for the scenario given in Fig. 2 (clutter, observed target positions and observer trajectory in the x–y plane, in metres).

However, the LS method is not robust against outliers and hence is not suitable for the TMA problem. Recent research [2] has shown that, for the TMA problem with range and bearing measurements, an Annealing M-Estimator [4] outperforms traditional methods in highly cluttered environments. M-Estimators are estimation methods that replace the squared residuals $r_i^2$ with a function $\rho(r_i)$ of the residuals, Eq. (2):

$p = \arg\min_{p} \sum_{i=1}^{n} \rho(r_i)$    (2)

Here, $\rho$ is an arbitrary symmetric, positive-definite function with a unique minimum at zero. It is usually chosen so that it increases less rapidly than the squared residual of the least squares estimator as the residual departs from zero, and hence it reduces the negative effect of measurements with large errors. This is what makes the M-Estimator more robust against outliers. The target vector p can then be found by finding the root of the first derivative $\rho'(r_i)$ of the objective function $\rho(r_i)$. The right choice of $\rho(r_i)$ is crucial for the success of the M-Estimator and is problem specific. The functions most often used [5] are Huber's function, Eq. (3), and Tukey's function, Eq. (4):

$\rho(x) = \begin{cases} \frac{1}{2}x^2 & \text{for } |x| \le c \\ c|x| - \frac{1}{2}c^2 & \text{for } |x| > c \end{cases}$    (3)

$\rho(x) = \begin{cases} \frac{c^2}{6}\left[1 - \left(1 - \left(\frac{x}{c}\right)^2\right)^3\right] & \text{for } |x| \le c \\ \frac{c^2}{6} & \text{for } |x| > c \end{cases}$    (4)

In both equations, c is a tuning constant; it is common to choose c = 1.345σ for the Huber function and c = 4.685σ for the Tukey function [5]. Often, iterative versions of M-Estimators are used, also known as the Iterative Re-weighted Least Squares (IRLS) algorithm [5]. Here, $\rho(x)$ is replaced with a weighting function $w(x) = \rho'(x)/x$, so that the iterative form of an M-Estimator becomes:

$p_k = \arg\min_{p} \sum_{i=1}^{n} w(r_i)_{k-1}\, r_i^2$    (5)

Here, k denotes the kth iteration of the algorithm. A commonly used weighting function is Tukey's biweight function, Eq. (6):

$w(x) = \begin{cases} \left[1 - x^2\right]^2 & \text{for } |x| \le 1 \\ 0 & \text{for } |x| > 1 \end{cases}$    (6)

It can be seen that for small errors x the weighting factor approaches one, whereas for larger error values the weighting factor approaches zero. This means that the influence of measurements with large errors on the search process is reduced, while more importance is given to measurements with small errors. Because Eq. (6) is only defined for x ∈ [−1, +1], the errors $r_i$ need to be normalised, which results in Eq. (7):
$w(r_i) = \begin{cases} \left[1 - \left(\frac{r_i}{cS}\right)^2\right]^2 & \text{for } r_i/cS \le 1 \\ 0 & \text{for } r_i/cS > 1 \end{cases}$    (7)
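As an illustration of the weighting in Eqs. (6) and (7), the following minimal Python sketch computes the normalised biweight weights for a set of residuals. The function name is an assumption for this sketch, and the choices of S as the median residual and c = 9 follow the description given below.

```python
import numpy as np

def biweight_weights(residuals, c=9.0):
    """Tukey biweight weights with residuals normalised by c*S, as in Eq. (7)."""
    r = np.abs(np.asarray(residuals, dtype=float))
    S = np.median(r)                                      # spread estimate: median of the residuals
    u = r / (c * S)                                       # normalised residuals
    return np.where(u <= 1.0, (1.0 - u ** 2) ** 2, 0.0)   # Eq. (7): zero weight beyond the threshold

# Example: the large residual (a potential outlier) receives a weight close to zero.
print(biweight_weights([0.1, 0.2, 0.15, 5.0]))
```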
Here, cS determines the threshold for accepting measurements and depends on the spread of the data. Usually, S is taken as the median of the measurements and c is set to either 6 or 9. The reason for this is that the median error for a Gaussian distribution is approximately equal to the probable error σ_probable, which specifies the range that contains 50% of the measured values. For a Gaussian distribution, the probable error σ_probable equals 0.6745σ, and hence a value of 9 for c corresponds approximately to 6σ. In other words, every measurement from this range is considered part of the Gaussian distribution, and measurements with larger values are rejected.

The IRLS algorithm starts from an initial guess for p_0, and the weights are recalculated in each iteration n. A new target parameter vector p_n is then determined successively by applying an arbitrary minimisation algorithm, using p_{n-1} as the start value.

The main problem with M-Estimators is that they are not robust to the initialisation [4]. Another problem is that they depend on the right degree of scaling; for example, the heuristic for choosing the tuning constant c is based on the global standard deviation σ of a Gaussian error distribution. Also, the performance of the iterative IRLS algorithm depends on the start values for p_0. As a consequence, the convergence of M-Estimators to the global optimum is not guaranteed.

The Annealing M-Estimator (AM-Estimator) [4] is a combination of an M-Estimator and the Simulated Annealing optimisation algorithm [6]. It overcomes the scaling problem of M-Estimators by using an Adaptive Weighting Function (AWF), Eq. (8), which replaces the scaling estimate by a parameter c that is set to a very high value at the start of the algorithm and is reduced over the run of the algorithm towards c → 0+:

$w(r_i, c) = \left(1 + \frac{r_i^2}{c}\right)^{-1}$    (8)

Eq. (9) shows the basic form of the AM-Estimator:

$p_k = \arg\min_{p} \sum_{i=1}^{n} w(r_i; c)_{k-1}\, r_i^2$    (9)

At the beginning of the search process, c needs to be set to a sufficiently large value, so that the algorithm can overcome local minima, and the cooling schedule must ensure that c approaches zero towards the end of the search run in order to find the global minimum. Eq. (10) represents a class of commonly used cooling schedules for AM-Estimators [2,4]:

$c_k = c\, k^{-1}$    (10)

In Eq. (10), the constant c has to be greater than one; a typical setting is c = 1.5.

For the TMA problem using linear arrays, Carevic has shown that AM-Estimators outperform other estimation techniques, including M-Estimators [2]. However, for the TMA problem with noisy measurements and clutter, the best estimation results would obviously be achieved if the clutter, i.e. the outliers, were not used at all in the estimation process. None of the standard estimators described above makes hard decisions about whether to include or exclude each item of data from the set of measurements. Instead, they use all available data, including the clutter. The IRLS and the AM-Estimators weight the data according to their importance, which is basically their distance from the proposed trajectory. It would clearly be of benefit if the clutter were completely removed from the data set before the estimation process takes place. This would mean making a decision for every single measurement in the set as to whether or not it is included in the estimation process.

Fig. 4. Example of a decision vector.

This can be seen as constructing a binary decision vector of length m, where m is the number of measurements in the set. Each bit in the decision vector would act as a switch, i.e. a '1' would mean the associated data item is included in the subset, a '0' would mean it is removed. Fig. 4 shows an example of a binary decision vector for a set of 10 measurements. In this example, only three bits are set to '1' and hence only the measurements x2, x8 and x10 would be used to estimate the target's state. Therefore, the problem can be formulated as a binary optimisation problem in which the optimum decision vector has to be found, i.e. the binary vector that selects all measurements that stem from the real target. The aim of this research was to develop an intelligent estimator that has the ability to decide whether or not a particular measurement is clutter. The new estimator developed, which is referred to as the ACO-Estimator, is described in more detail in the next section.
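As an illustrative sketch of this binary formulation (not code from the paper), a candidate decision vector can be applied as a mask so that only the selected measurements enter the estimation:

```python
import numpy as np

def select_subset(measurements, decision_vector):
    """Keep only the measurements whose bit in the decision vector is set to 1."""
    mask = np.asarray(decision_vector, dtype=bool)
    return [m for m, keep in zip(measurements, mask) if keep]

# The example from the text: only the bits for x2, x8 and x10 are set.
measurements = [f"x{i}" for i in range(1, 11)]
decision_vector = [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]
print(select_subset(measurements, decision_vector))   # ['x2', 'x8', 'x10']
```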
3. ACO-Estimator

The ACO-Estimator is based on Ant Colony Optimisation (ACO) [7]. ACO refers to a class of discrete optimisation algorithms, i.e. a Computational Intelligence (CI) meta-heuristic, which is modelled on the collective behaviour of ant colonies. Real ants are very limited in their individual cognitive and visual capabilities, but an ant colony as a social entity is capable of solving complex problems and tasks in order to survive in an ever-changing hostile environment. For example, ants are capable of finding the shortest path to a food source [8]. If the food source is depleted, the ant colony adapts itself in such a way that it will explore the environment and discover new food sources.
Ants communicate indirectly with other ants by depositing a substance called pheromone on the ground while they are walking around. This pheromone trail can then be used by the ant to find its way back to its nest after it has found a food source, and other ants can also sense it. Ants have the tendency to follow existing paths with high pheromone levels. If there is no existing pheromone trail, they walk around in a random fashion. If an ant has to make a decision, for example to choose a way around an obstacle in its path, it follows existing paths with a high probability. However, there is always a chance that the ant explores a new path or a path with a lower pheromone level. If an ant has chosen an existing path, the pheromone level of this path will be increased, because the ants deposit new pheromone on top of the existing one. This makes it more likely that other ants will also follow this path, increasing the pheromone level again. This is a positive feedback process, known as autocatalysis [9]. Although the pheromone evaporates over time, the entire colony builds up a complex solution based on this indirect form of communication, called stigmergy [10].

Fig. 5 demonstrates the basic principle of the ACO meta-heuristic, which is modelled on the behaviour described above. In this example, the system S that has to be optimised has three independent variables x1...x3, and the quality of the solution can be measured by the achieved fitness value y. Each input can take one of five different discrete alternative values s_ij, where i represents the input and j the chosen alternative for that input. Each alternative has an associated probability value, which is randomly initialised. The collection of probability distributions can be seen as a global probability matrix. Each artificial ant in the colony has to choose randomly a 'path' through the state space, i.e. one input value for each independent variable. In the example, the ant chooses alternative s12 for input x1, s24 for input x2 and s33 for input x3. The chosen path depends on the probabilities associated with the states, i.e. a state with a high probability is more likely to be selected for a trial solution than states with a low probability value. These probability values are referred to as the pheromone level τ. A chosen path represents one candidate solution, which is evaluated, and the probabilities of the states that the ant has visited on that trail are updated based on the achieved fitness. In the next generation, the updated probability matrix is used, which means that states that have proven fit in the past are more likely to be selected for the subsequent trial.
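The mechanism described above can be sketched as follows (an illustrative Python fragment; the matrix shape, the evaporation rate and the function names are assumptions, and the fitness is a dummy value):

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs, n_alternatives = 3, 5                      # as in the example: inputs x1..x3, five alternatives each
tau = rng.random((n_inputs, n_alternatives))         # randomly initialised probability (pheromone) matrix

def construct_path(tau):
    """Choose one alternative per input, with probability proportional to its pheromone level."""
    probs = tau / tau.sum(axis=1, keepdims=True)
    return [int(rng.choice(n_alternatives, p=row)) for row in probs]

def reinforce(tau, path, fitness, evaporation=0.1):
    """Evaporate all pheromone, then deposit new pheromone on the visited states."""
    tau *= (1.0 - evaporation)
    for i, j in enumerate(path):
        tau[i, j] += fitness

# One generation for a single artificial ant, with a dummy fitness value.
path = construct_path(tau)
reinforce(tau, path, fitness=1.0)
print(path, tau.round(2))
```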
Fig. 5. Example of an artificial ant constructing a trial vector by traversing through state space (states s11...s35 for inputs x1...x3 of system S with output y).
However, it should be pointed out that a 'path' does not actually traverse the state space; it simply refers to the collection of chosen alternatives for a particular candidate solution. The order in which the states are selected does not have any effect on the candidate solution itself, i.e. one could start by determining the input for x1 first or, alternatively, with x2 or x3. The resulting candidate solutions would still be the same. A major advantage of ACO is that adjacent states in the neighbourhood do not need to show similarities, i.e. the state space does not need to be ordered. This is different from most optimisation heuristics, which rely on ordered collections of states, i.e. fitness landscapes.

The main idea behind the ACO-Estimator is to use a colony of artificial ants to build up a probability distribution over a set of measurements for TMA. Artificial ants 'travel' through the state space, choosing their paths based on the associated pheromone levels. Based on the achieved fitness, which is related to the mean residuals achieved using local search and the LS method, the pheromone levels are adjusted. Fig. 6a shows an example of an ant choosing a path through the state space for a data set containing four measurements. The chosen states, which result in the decision vector given in Fig. 6b, are s10, s21, s31 and s40. After a number of time-steps, the algorithm converges towards a stable state and the target state, i.e. position and velocity vector, is estimated. Fig. 7 shows a flowchart of the ACO-Estimator.

The fitness of an individual ant k at time step t is calculated using Eq. (11). Here, m is the number of measurements, j is the number of measurements selected for the subset, r_i(t) is the residual for measurement i at time step t and ξ(j) is a penalty function. To avoid a division by zero, one has been added to the denominator in Eq. (11).

$\Delta\tau_{ij}^{k}(t) = \xi(j)\,\frac{j}{1 + \sum_{i=1}^{m} r_i(t)\,\delta_i}, \qquad \delta_i = \begin{cases} 1 & \text{if item } i \text{ is selected} \\ 0 & \text{otherwise} \end{cases}$    (11)
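A minimal sketch of the fitness calculation in Eq. (11) could look as follows (illustrative Python; the function name and the example values are assumptions, and the threshold k of the penalty function is anticipated from Eq. (12) below):

```python
import numpy as np

def ant_fitness(residuals, selected, k_min):
    """Pheromone increment of Eq. (11) for one ant.

    residuals: residual r_i(t) for every measurement i = 1..m
    selected:  binary decision vector (delta_i = 1 if measurement i is used)
    k_min:     minimum number of selected measurements (threshold k of Eq. (12))
    """
    selected = np.asarray(selected, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    j = int(selected.sum())                            # number of selected measurements
    if j < k_min:                                      # penalty function xi(j), Eq. (12)
        return 0.0
    return j / (1.0 + float(np.dot(residuals, selected)))

# Example: ten measurements, three of them selected.
r = np.abs(np.random.default_rng(1).normal(0.0, 50.0, 10))
d = np.array([0, 1, 0, 0, 0, 0, 0, 1, 0, 1])
print(ant_fitness(r, d, k_min=3))
```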
Fig. 6. Example of a path through state space and the resulting decision vector: (a) path through state space (states s10/s11...s40/s41 for measurements 1–4); (b) resulting decision vector (0, 1, 1, 0 for x1...x4).
$\xi(j) = \begin{cases} 0 & \text{if } j < k \\ 1 & \text{if } j \ge k \end{cases}$    (12)

The penalty function ξ(j), Eq. (12), is a simple threshold function where k represents the minimum number of measurements that have to be used for the estimation. Any estimation based on fewer measurements would be invalidated by setting the resulting fitness value to zero.

Fig. 7. Flowchart of the ACO-Estimator. (Steps: select control parameters; initialise global probability matrix; for each ant i = 1...n, construct a trail vector, select measurements based on the trail vector, apply the LS method to calculate the fitness based on the mean residuals, and update the local probability matrix for ant i based on the fitness; combine the local and global probability matrices; if the stop criteria are fulfilled, print the best solution, otherwise repeat.)
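The loop in Fig. 7 can be illustrated with the following self-contained toy version (a heavily simplified sketch under stated assumptions, not the author's implementation: the least-squares fit of the target state is replaced by estimating a single constant from scalar measurements, the local/global matrix combination is collapsed into one evaporation-and-deposit step, and all parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data set: noisy measurements of a single unknown value plus clutter (outliers).
true_value = 10.0
data = np.concatenate([true_value + rng.normal(0.0, 0.5, 12),   # genuine measurements
                       rng.uniform(-50.0, 50.0, 4)])            # clutter
m = data.size

tau = np.full((m, 2), 0.5)          # pheromone for the two states of each bit (0 = exclude, 1 = include)
k_min, n_ants, n_steps, rho = 6, 20, 50, 0.1

for _ in range(n_steps):
    deposits = np.zeros_like(tau)
    for _ in range(n_ants):
        probs = tau / tau.sum(axis=1, keepdims=True)            # global probability matrix
        trail = (rng.random(m) < probs[:, 1]).astype(int)       # construct a binary decision vector
        j = int(trail.sum())
        if j >= k_min:                                          # Eq. (12): enough measurements selected
            subset = data[trail == 1]
            estimate = subset.mean()                            # stand-in for the LS fit of the target state
            fitness = j / (1.0 + np.abs(subset - estimate).sum())  # Eq. (11)
        else:
            fitness = 0.0                                       # penalised: too few measurements
        for i, bit in enumerate(trail):
            deposits[i, bit] += fitness                         # reward the states visited by this ant
    tau = (1.0 - rho) * tau + deposits                          # evaporation plus new pheromone

best_trail = (tau[:, 1] > tau[:, 0]).astype(int)                # converged decision vector
print(best_trail, data[best_trail == 1].mean())
```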
4. Experiments

Two of the four tracking scenarios proposed by Carevic [2] have been used in this work: scenario 1 and scenario 4. For both scenarios, data sets with seven different Mean Percentages of Clutter (MPC) were generated. The MPC was varied from 0% to 60% in order to cover the whole range up to and beyond 50%, which is the theoretical breakdown point for robust estimators [11]. For each MPC, 100 sets were generated, resulting in a total of 1400 data sets, i.e. experiments. The Mean Trajectory Distance (MTD), Eq. (13), was used as the fitness function, where Δx_i and Δy_i are the differences between the measurements and the model predictions for each of the N measurements in a set.

$\mathrm{MTD} = \frac{1}{N}\sum_{i=1}^{N}\sqrt{\Delta x_i^2 + \Delta y_i^2}$    (13)
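In code, Eq. (13) amounts to the following small computation (illustrative Python; the input arrays are assumed to hold the per-measurement differences):

```python
import numpy as np

def mean_trajectory_distance(dx, dy):
    """Mean Trajectory Distance, Eq. (13), from per-measurement differences."""
    return float(np.mean(np.hypot(dx, dy)))

# dx, dy: differences between the measurements and the model predictions.
print(mean_trajectory_distance(np.array([10.0, -5.0, 3.0]), np.array([2.0, 4.0, -1.0])))
```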
For each of the 100 simulations in an MPC group, the average Mean Trajectory Distance (MMTD) for that group and its Standard Deviation (STD) were calculated. Because the TMA problem is a semi-real-time application, the maximum number of fitness evaluations was limited to 200,000 for each of the experiments. The new estimator was compared with three standard estimators: the LS estimator, the Iterative Re-weighted Least Squares Estimator (IRLS) and the Annealing M-Estimator (AM). Fig. 8 shows a graphical representation of the results for the MMTDs obtained from the experiments for tracking scenario 1, and Fig. 9 shows the MMTDs for tracking scenario 4. Figs. 10 and 11 present graphical representations of the achieved STDs of the MMTDs.

Fig. 8. Average Mean Trajectory Distance (MMTD) for scenario 1 (MMTD [m] versus Mean Clutter [%]; curves for LS, IRLS, AM and ACO).

Fig. 9. Average Mean Trajectory Distance (MMTD) for scenario 4 (MMTD [m] versus Mean Clutter [%]; curves for LS, IRLS, AM and ACO).

Fig. 10. Standard deviation of the average Mean Trajectory Distance (MMTD) for scenario 1 (STD [m] versus Mean Clutter [%]; curves for LS, IRLS, AM and ACO).

Fig. 11. Standard deviation of the average Mean Trajectory Distance (MMTD) for scenario 4 (STD [m] versus Mean Clutter [%]; curves for LS, IRLS, AM and ACO).

5. Discussion

As can be seen from Figs. 8 and 9, for both tracking scenarios the three conventional estimators showed very similar performance, although the AM-Estimator performed slightly better. This is in agreement with the findings reported by Carevic [2]. Table 1 shows the average improvements in terms of MMTD achieved using the ACO-Estimator. It can be seen that the overall improvement is 46.3% for scenario 1 and 43.1% for scenario 4. Therefore, the new ACO-Estimator outperformed all of the other estimators used in terms of MMTD. However, in terms of STD, the performance of the ACO-Estimator was worse for both tracking scenarios (Table 2). The ACO-Estimator had problems converging fully within the time available for the estimations, especially when the amount of MPC was high (Figs. 10 and 11). Fig. 12 shows a typical run of the algorithm without time limitations. As can be seen, the algorithm converges after approximately 24 time-steps, whereas in the experiments the algorithm usually carried out 11 time-steps before the available time elapsed.

Table 1
Average improvement of MMTD

Algorithm compared with ACO-Estimator    Average improvement of MMTD for ACO-Estimator
                                         Scenario 1 [%]    Scenario 4 [%]
LS                                       47.7              44.4
IRLS                                     46.4              42.6
AM                                       44.9              42.3

Table 2
Average improvement of STD

Algorithm compared with ACO-Estimator    Average improvement of STD for ACO-Estimator
                                         Scenario 1 [%]    Scenario 4 [%]
LS                                       48.9              43.3
IRLS                                     57.6              51.5
AM                                       57.6              48.0

Fig. 12. Typical search run without time limitations (best and average fitness versus time-steps).

6. Conclusion

The main aim of this research project was to develop a new robust estimation method with the ability to solve the linear-array-based TMA problem. This has been achieved by developing a new ACO-based estimation algorithm, which was implemented and successfully applied to the TMA problem. Based on the statistical analysis of the results obtained from the experiments, it was concluded that the new algorithm outperforms conventional estimators by successfully removing outliers from the measurements.
The findings might be of interest not only to practitioners in the established field of TMA, but also to scientists and engineers trying to solve other problems where only noisy measurements with outliers can be observed while some problem-specific domain knowledge is readily available. Examples of such problems are the recovery of lost asteroids [12] and tracking problems in the automotive sector [13].
References

[1] J.C. Hassab, B.W. Guimond, S.C. Nardone, Estimation of location and motion parameters of moving source observed from a linear array, Journal of the Acoustical Society of America 70 (4) (1981) 1054–1061.
[2] D. Carevic, Robust estimation techniques for Target Motion Analysis using passively sensed transient signals, IEEE Journal of Oceanic Engineering 28 (2) (2003) 262–270.
[3] K.F. Gauss, Theoria motus corporum coelestium in sectionibus conicis solem ambientium, Perthes and Besser, Hamburg, 1809.
[4] S.Z. Li, Robustizing robust M-estimation using deterministic annealing, Pattern Recognition 29 (1) (1996) 159–166.
[5] X. Hong, S. Chen, M-estimator and D-optimality model construction using orthogonal forward regression, IEEE Transactions on Systems, Man, and Cybernetics, Part B 35 (1) (2005) 155–162.
[6] S. Kirkpatrick, C.D. Gelatt Jr., M.P. Vecchi, Optimization by simulated annealing, Science 220 (4598) (1983) 671–680.
[7] M. Dorigo, G. Di Caro, The Ant Colony Optimization meta-heuristic, in: D. Corne, M. Dorigo, F. Glover (Eds.), New Ideas in Optimization, McGraw-Hill, 1999.
[8] S. Goss, S. Aron, J.L. Deneubourg, J.M. Pasteels, Self-organized shortcuts in the Argentine ant, Naturwissenschaften 76 (1989) 579–581.
[9] M. Dorigo, V. Maniezzo, A. Colorni, Positive feedback as a search strategy, Technical Report No. 91-016, Politecnico di Milano, 1991.
[10] M. Dorigo, G. Di Caro, L.M. Gambardella, Ant algorithms for discrete optimization, Artificial Life 5 (1999) 137–172.
[11] P.J. Rousseeuw, A.M. Leroy, Robust Regression and Outlier Detection, John Wiley, New York, 1987.
[12] A. Milani, The asteroid identification problem. I. Recovery of lost asteroids, Icarus 137 (1999) 269–292.
[13] C.V. Stewart, Robust parameter estimation in computer vision, SIAM Review 41 (3) (1999) 513–537.