Robust appearance-guided particle filter for object tracking with occlusion analysis


Int. J. Electron. Commun. (AEÜ) 62 (2008) 24–32. www.elsevier.de/aeue. doi:10.1016/j.aeue.2007.01.006

Bo Zhang*, Weifeng Tian, Zhihua Jin

Department of Instrument Science and Engineering, Shanghai Jiao Tong University, Dong Chuan Road 800, Shanghai 200240, PR China

Received 11 December 2006; accepted 25 January 2007

*Corresponding author. E-mail address: [email protected] (B. Zhang).

Abstract

A major challenge for most tracking algorithms is how to address the changes of object appearance during tracking, incurred by large illumination, scale and pose variations and by occlusions. Without any adaptability to these variations, the tracker may fail. In contrast, if it adapts too fast, the appearance model is likely to absorb improper parts of the background or of occluding objects. In this paper, we explore a tracking algorithm based on a robust appearance model which can account for slow or rapid changes of object appearance. Specifically, each pixel in the appearance model is represented by a mixture of Gaussians whose parameters are learned on-line by sequential kernel density approximation. The appearance model is then embedded into a particle filter framework. In addition, an occlusion handling scheme is invoked to explicitly indicate outlier pixels and deal with occlusion events, thus preventing the appearance model from being contaminated by undesirable outliers. Extensive experiments demonstrate that our appearance-based tracking algorithm can successfully track the object in the presence of dramatic appearance changes, cluttered background and even severe occlusions. © 2007 Elsevier GmbH. All rights reserved.

Keywords: Object appearance; Sequential kernel density approximation; Particle filter; Outlier pixels

1. Introduction

Object tracking is a challenging task in computer vision with a wide range of applications, including visual surveillance, human-machine interaction and robot navigation. Although researchers have made impressive efforts at object tracking, developing a robust and efficient tracking algorithm remains an open problem due to the inherent difficulty of the task. From the probabilistic viewpoint, object tracking can be viewed as a recursive Bayesian estimation problem, where the estimated states represent parameters related to the tracked object, e.g., position, scale and velocity. The particle filter [1,2] and its variants [3,4], approximate methods based on Monte Carlo sampling for Bayesian estimation, have recently been used extensively and have achieved promising success in the tracking field owing to their ability to address non-Gaussian and non-linear problems. Two important components required for a particle filter are the state transition model and the likelihood model. At each time step, a few possible position hypotheses of the object are first predicted by the transition model, and then the likelihood model is used to determine the most likely location among those hypotheses as the new observation becomes available. The likelihood model contributes significantly to the performance of the tracking algorithm. Consequently, the appearance model used to construct the likelihood model becomes important. The appearance model should be able to adapt to the variations of object appearance, inevitably caused by intrinsic factors (e.g. pose variation and shape deformation) and


extrinsic factors (e.g. illumination, view angle changes and occlusions). Without any adaptability to these changes, the tracker tends to fail because the likelihood model becomes unreliable. At the other extreme, overeager adaptation is likely to make the appearance model absorb undesirable background or occluding objects, leaving the tracker unstable. It is therefore necessary to pursue a compromise appearance model that effectively accommodates those changes while remaining resistant to erroneous distractions. This motivates our paper.

In this paper, we develop a robust tracking algorithm based on an adaptive pixel-wise appearance model. The intensity value of each pixel in the appearance model is modeled by a mixture Gaussian density whose parameters are updated on-line using sequential kernel density approximation (SKDA), proposed by Han et al. [5]. SKDA allows us to represent the density function accurately in an incremental fashion. More attractively, the parameters of the mixture Gaussians, such as the number of components, means, covariances and weights, can be determined automatically. Instead of the gradient-based optimal solution in [5], we resort to the particle filter framework for state estimation, making the tracker robust against cluttered backgrounds and partial occlusions. Moreover, robust statistics is invoked to deal with occlusions.

The remainder of the paper is arranged as follows. A brief review of previous work is provided in Section 2. In Section 3, the SKDA method is introduced. Section 4 describes the whole tracking algorithm in detail, including the implementation of the particle filter and occlusion handling. Section 5 presents extensive experiments on many difficult test sequences and gives some remarks. Finally, conclusions are drawn in Section 6.

2. Previous work

There exists a huge body of literature on object tracking; in this section we review only the most relevant work. In [6], Nguyen et al. proposed a tracking algorithm based on template matching. A Kalman filter, one per pixel, was used to smooth appearance features and update the template, keeping the tracker robust against occlusions. Differently from [6], Raja et al. [7] described the pixel color distribution using a Gaussian mixture model (GMM) whose parameters were updated through the Expectation-Maximization (EM) algorithm. A more delicate appearance model was presented in [8], where a three-component mixture model, WSL, reflecting the stable (S), outlier (L) and wandering (W) ingredients of appearance change, was used to adapt to the appearance of the tracked object. Following this research, Zhou et al. [9] and Li et al. [10] modified the original WSL appearance model and presented improved versions. Instead of the phase features in [8], their appearance models directly used pixel intensity as the appearance feature, decreasing the computational complexity.


More importantly, a particle filter was employed to incorporate the appearance model for robust tracking and, at the same time, an explicit occlusion handling scheme was invoked, making the tracker more robust against outliers and occlusions. Experimental results showed the robustness and effectiveness of their methods in the presence of large illumination and pose variations and partial, or even short-term full, occlusion.

Although the above-mentioned algorithms achieve good tracking performance, the fixed number of Gaussian components in their appearance models may limit their ability to deal with multi-modal densities that have many modes or whose number of modes changes frequently. Some incremental learning algorithms [11–13] for GMMs, which simultaneously estimate the parameters of the Gaussian components and determine the number of components, have been proposed; these approaches save memory and suit real-time applications. Recently, the non-parametric kernel density estimation (KDE) technique was innovatively applied to model the background and the appearance of objects [14,15]. KDE approximates the underlying density by a set of Gaussian distributions whose means are the samples drawn from the density. This data-driven modeling makes KDE flexible for arbitrary densities. Nevertheless, to maintain the non-parametric representation of the density, KDE needs large memory and incurs a heavy computational cost [16]. To explore a trade-off between the flexibility of the model and the complexity of computation, Han et al. [5] presented an alternative to KDE, named SKDA, which provides an accurate and compact representation of the density function. Experimental results on simulated and real image sequences demonstrated the correctness and effectiveness of their method.

Largely motivated by [5], we propose a tracking algorithm with the following characteristics that distinguish it from their method: (1) we use a particle filter as the state estimator, whereas a maximum likelihood estimator was used in [5]; the particle filter is more suitable for, and robust against, partial occlusion and clutter in tracking scenes; (2) our algorithm directly handles outlier pixels and occlusion events through robust statistics, alleviating their influence on state estimation; (3) our appearance model automatically deals with scale change through the particle filter, whereas their algorithm needs an additional step to update the appearance model in scale space.

3. Sequential kernel density approximation To keep the paper self-contained, we review SKDA in this section.


Table 1. Sequential kernel density approximation (SKDA)

Algorithm 1: SKDA
Input: $\hat f_k(x)$ and the new observation component $N(\alpha, x_k^{n_k+1}, P_k^{n_k+1})$
Output: $\hat f_{k+1}(x)$

(1) $c_i = \mathrm{meanshift}(\hat f_{k+1}(x), x_k^i)$, $i = 1, \ldots, n_k + 1$, where the operation meanshift denotes VBMS started from the sample point $x_k^i$ in the density $\hat f_{k+1}(x)$ and $c_i$ denotes the corresponding convergence point.
(2) Find the convergence locations $y_j$, $j = 1, \ldots, C$, where at least two sample points or $x_k^{n_k+1}$ converge. Denote the corresponding starting locations converging to $y_j$ by $x_k^i$, $i = 1, \ldots, S_j$.
(3) For $j = 1, \ldots, C$:
  • Compute the Hessian matrix $\hat H(y_j)$.
  • If $\hat H(y_j)$ is negative definite, allocate a Gaussian component $N(\omega_y, y_j, P(y_j))$ for the mode $y_j$, where $\omega_y$ is the sum of the weights of $x_k^i$, $i = 1, \ldots, S_j$, and replace the Gaussian components centered at $x_k^i$, $i = 1, \ldots, S_j$, by $N(\omega_y, y_j, P(y_j))$, with
    $$P(y_j) = -\omega_y^{2/(d+2)}\, \hat H(y_j)^{-1}\, \left|2\pi\left(-\hat H(y_j)^{-1}\right)\right|^{-1/(d+2)}.$$
  • Else, keep the components related to $x_k^i$, $i = 1, \ldots, S_j$, unchanged.
(4) Combine the changed modes with the unchanged modes to obtain the updated density function $\hat f_{k+1}(x)$ at time $k + 1$:
$$\hat f_{k+1}(x) = \frac{1}{(2\pi)^{d/2}} \sum_{i=1}^{n_{k+1}} \frac{\omega_{k+1}^i}{|P_{k+1}^i|^{1/2}} \exp\left(-\frac{1}{2} D^2(x, x_{k+1}^i, P_{k+1}^i)\right).$$

In the literature [17,18], Han and Comaniciu first introduced the SKDA method and innovatively applied it to background modeling and object tracking. In particular, Han's PhD thesis [16] gives a more detailed and thorough discussion of its applications in computer vision. Roughly speaking, like KDE, SKDA is a non-parametric density estimation technique that approximates the density by a weighted sum of Gaussians. However, unlike KDE, where the probability density at each evaluation point is computed by averaging the effect of a set of Gaussian distributions with means at each sample point, SKDA finds the dominant modes of the underlying density by invoking a mode-finding algorithm, variable-bandwidth mean shift (VBMS), and then estimates the density using only those Gaussian functions centered at the modes, producing a more compact representation and mitigating the burden of storing sample points. SKDA operates in an incremental fashion that facilitates on-line processing. Compared with other density estimation methods, it is more flexible in representing complicated densities owing to its ability to determine the parameters of the mixture Gaussians automatically.

The concrete SKDA algorithm proceeds as follows. Given the density function $\hat f_k(x)$ at time $k$, approximated by $n_k$ weighted Gaussian modes $N(\omega_k^i, x_k^i, P_k^i)$, $i = 1, \ldots, n_k$, and a new observation $N(\alpha, x_k^{n_k+1}, P_k^{n_k+1})$, where $N(\omega, \mu, \sigma^2)$ denotes a Gaussian distribution with weight $\omega$, mean $\mu$ and covariance $\sigma^2$ and $\alpha$ is the learning rate, the goal of SKDA is to derive the density function $\hat f_{k+1}(x)$. The density

can be empirically written as
$$\hat f_{k+1}(x) = \frac{1-\alpha}{(2\pi)^{d/2}} \sum_{i=1}^{n_k} \frac{\omega_k^i}{|P_k^i|^{1/2}} \exp\left(-\frac{1}{2} D^2(x, x_k^i, P_k^i)\right) + \frac{\alpha}{(2\pi)^{d/2}\, |P_k^{n_k+1}|^{1/2}} \exp\left(-\frac{1}{2} D^2(x, x_k^{n_k+1}, P_k^{n_k+1})\right), \quad (1)$$
where $D^2(x, x_k^i, P_k^i) = (x - x_k^i)^T (P_k^i)^{-1} (x - x_k^i)$ is the Mahalanobis distance between $x$ and $x_k^i$ and $d$ is the dimension of $x_k^i$. $P_k^{n_k+1} \in \mathbb{R}^{d \times d}$ is the covariance matrix of the new observation and can be computed according to the method suggested in [14]. The variable-bandwidth mean shift (VBMS) [19] is then utilized to find the density modes of $\hat f_{k+1}(x)$. We list the SKDA algorithm in Table 1 (refer to [5,16] for a more detailed introduction).
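To make the update concrete, the following is a minimal 1-D sketch of one SKDA step, assuming Gaussian kernels and a simple fixed-point mean-shift iteration. The merge tolerance, the default learning rate, and the moment-matched merged variance (used here in place of the paper's Hessian-based covariance $P(y_j)$) are our own illustrative choices, not the authors' implementation.

```python
import numpy as np

def density(x, w, mu, var):
    """Evaluate the 1-D Gaussian mixture f(x) = sum_i w_i N(x; mu_i, var_i)."""
    return float(np.sum(w * np.exp(-0.5 * (x - mu) ** 2 / var)
                        / np.sqrt(2.0 * np.pi * var)))

def mean_shift_mode(x0, w, mu, var, iters=200, tol=1e-8):
    """Variable-bandwidth mean shift in 1-D: iterate toward the nearest mode."""
    x = x0
    for _ in range(iters):
        k = w * np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)
        x_new = np.sum(k * mu / var) / np.sum(k / var)
        if abs(x_new - x) < tol:
            break
        x = x_new
    return x

def skda_update(w, mu, var, z, var_z, alpha=0.05, merge_tol=1e-3):
    """One SKDA step (cf. Eq. (1) and Table 1): blend in the observation
    N(alpha, z, var_z), run mean shift from every component mean, and merge
    components that converge to the same mode."""
    w = np.append((1.0 - alpha) * w, alpha)   # re-weight old modes, add new one
    mu = np.append(mu, z)
    var = np.append(var, var_z)
    modes = np.array([mean_shift_mode(m, w, mu, var) for m in mu])
    out_w, out_mu, out_var = [], [], []
    used = np.zeros(len(mu), dtype=bool)
    for i in range(len(mu)):
        if used[i]:
            continue
        grp = np.abs(modes - modes[i]) < merge_tol   # components sharing a mode
        used |= grp
        m = np.sum(w[grp] * mu[grp]) / np.sum(w[grp])
        # Moment-matched variance: a simplification of the paper's
        # Hessian-based covariance P(y_j).
        v = np.sum(w[grp] * (var[grp] + (mu[grp] - m) ** 2)) / np.sum(w[grp])
        out_w.append(np.sum(w[grp]))
        out_mu.append(modes[i])
        out_var.append(v)
    return np.array(out_w), np.array(out_mu), np.array(out_var)
```

For instance, starting from a single mode, `skda_update(np.array([1.0]), np.array([100.0]), np.array([25.0]), 104.0, 25.0)` yields either one broadened mode or two separate modes, depending on whether the mean-shift runs converge to the same point.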

4. Particle filter-based tracking algorithm

In this section, the whole tracking algorithm is implemented within the particle filter framework, and occlusion analysis is introduced.


4.1. Object motion model

Suppose the object being tracked is represented by a rectangular window; the motion between two consecutive frames can then be approximated by a similarity transformation. The state vector of the object is defined by the four parameters of the similarity transformation
$$X = (x, y, s, \theta), \quad (2)$$
where $x, y$ denote the $x$- and $y$-translation, $s$ the scale and $\theta$ the orientation angle. Due to the uncertainty of motion, each variable in $X$ is independently modeled by a random walk
$$X_k = X_{k-1} + w_k, \quad (3)$$
where the noise $w_k$ is Gaussian with zero mean and a covariance matrix whose diagonal entries are the corresponding variances of the state variables, i.e., $\sigma^2_{x_k}, \sigma^2_{y_k}, \sigma^2_{s_k}, \sigma^2_{\theta_k}$.
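As a sketch, the random-walk transition of Eq. (3) amounts to adding independent zero-mean Gaussian noise to each state component. The noise standard deviations below are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative standard deviations for (x, y, s, theta); the values are assumptions.
SIGMA = np.array([4.0, 4.0, 0.02, 0.02])

def propagate(particles):
    """Random-walk prediction X_k = X_{k-1} + w_k with w_k ~ N(0, diag(SIGMA^2))."""
    return particles + rng.normal(0.0, SIGMA, size=particles.shape)

# Example: N = 100 particles, each a state (x, y, s, theta).
particles = np.tile([120.0, 80.0, 1.0, 0.0], (100, 1))
particles = propagate(particles)
```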

4.2. Robust likelihood model

Given the state $X_k$, the similarity transformation can be used to produce an image region corresponding to $X_k$ by warping a standard region, which is created by resizing the object being tracked to a rectangle centered at coordinates (0, 0) with fixed width and height. The transformation is mathematically represented as
$$P_k = s_k \begin{pmatrix} \cos\theta_k & -\sin\theta_k \\ \sin\theta_k & \cos\theta_k \end{pmatrix} P^s + \begin{pmatrix} x_k \\ y_k \end{pmatrix}, \quad (4)$$
where $P_k = \{(p_{x,k}^i, p_{y,k}^i), i = 1, \ldots, P\}$ are the pixel points within the object region after the similarity transformation and $P^s = \{(p_x^i, p_y^i), i = 1, \ldots, P\}$ are the points within the standard rectangular window; $P$ is the number of pixels in the appearance model. Consequently, the observed appearance is acquired by taking the intensity values at the pixel points, that is,
$$Z_k = I(P_k), \quad (5)$$
where $I(P_k)$ denotes the intensity values at $P_k$ and $Z_k = \{Z_k^i, i = 1, \ldots, P\}$.

The likelihood model arises from the appearance model. The intensity value of each pixel in the appearance model is modeled by SKDA. Denote the appearance model by $A_{k-1} = \{\omega_{m,k-1}^i, \mu_{m,k-1}^i, \sigma_{m,k-1}^i;\ i = 1, \ldots, P,\ m = 1, \ldots, M\}$, where $\omega_{m,k-1}^i, \mu_{m,k-1}^i, \sigma_{m,k-1}^i$ denote the weight, mean and standard deviation of the $m$th Gaussian mode of the $i$th pixel, respectively, and $M$ is the number of modes. Under the assumption that the pixels in the appearance model are independent, the observation likelihood $p(Z_k|X_k)$ is obtained by feeding the observation into the Gaussian modes:
$$p(Z_k|X_k) = \prod_{i=1}^{P} \sum_{m=1}^{M} \omega_{m,k-1}^i\, N\big(Z_k^i; \mu_{m,k-1}^i, (\sigma_{m,k-1}^i)^2\big). \quad (6)$$
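In outline, the warping and likelihood evaluation of Eqs. (4)–(6) might look as follows. The nearest-neighbor intensity sampling and the helper names are our assumptions (the paper does not specify the interpolation); the 20 × 20 grid matches the appearance-model size reported in Section 5.

```python
import numpy as np

def warp_points(state, Ps):
    """Similarity transform of Eq. (4): rotate and scale the standard grid Ps
    (a 2 x P array of (px, py) points) and translate by (x, y)."""
    x, y, s, th = state
    R = np.array([[np.cos(th), -np.sin(th)],
                  [np.sin(th),  np.cos(th)]])
    return s * R @ Ps + np.array([[x], [y]])

def observe(image, state, Ps):
    """Eq. (5): Z_k = I(P_k), with nearest-neighbor sampling (an assumption)."""
    Pk = np.rint(warp_points(state, Ps)).astype(int)
    cols = np.clip(Pk[0], 0, image.shape[1] - 1)
    rows = np.clip(Pk[1], 0, image.shape[0] - 1)
    return image[rows, cols].astype(float)

def likelihood(Z, w, mu, sigma):
    """Eq. (6): product over pixels of per-pixel Gaussian mixtures.
    w, mu, sigma are (P, M) arrays; accumulated in log space for stability
    (the final exp can still underflow for large P; keeping the log value
    is safer when only relative weights are needed)."""
    comp = (w * np.exp(-0.5 * ((Z[:, None] - mu) / sigma) ** 2)
            / (np.sqrt(2.0 * np.pi) * sigma))
    return np.exp(np.sum(np.log(comp.sum(axis=1) + 1e-300)))

# Example standard grid P^s: a 20 x 20 template centered at the origin
# (20 x 20 matches the appearance-model size used in Section 5).
gx, gy = np.meshgrid(np.arange(20) - 9.5, np.arange(20) - 9.5)
Ps = np.vstack([gx.ravel(), gy.ravel()])
```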

4.3. Implementation of particle filter

Once the motion model and the likelihood model are obtained, our particle filter-based tracking algorithm can easily be implemented. For the sake of completeness, we briefly review the particle filter (see [1] for a detailed introduction). With the first-order Markovian assumption, the posterior density $p(X_k|Z_{1:k})$ can be written as
$$p(X_k|Z_{1:k}) \propto p(Z_k|X_k) \int p(X_k|X_{k-1})\, p(X_{k-1}|Z_{1:k-1})\, \mathrm{d}X_{k-1}, \quad (7)$$
where $p(X_k|X_{k-1})$ is the transition model obeying the distribution in Eq. (3) and $p(X_{k-1}|Z_{1:k-1})$ is the posterior density at time $k - 1$. Because the integral in Eq. (7) is intractable, the particle filter approximates the posterior by a set of particles drawn from the density, each particle consisting of a state vector and an associated weight $\{(X_k^i, w_k^i), i = 1, \ldots, N\}$:
$$p(X_k|Z_{1:k}) \approx \sum_{i=1}^{N} w_k^i\, \delta(X_k - X_k^i). \quad (8)$$

In practice, the approximation is achieved by performing four recursive steps, namely particle predicting, particle weighting, state outputting and re-sampling.

Particle predicting: Prior to state estimation, the posterior density is unknown, so it is hard to sample from it directly. As an alternative, we can draw particles from a known and easily sampled density, $X_k^i \sim q(X_k|X_{k-1}^i, Z_{1:k})$, referred to as the proposal density. In computer vision, the proposal density is usually chosen as the transition model $p(X_k|X_{k-1})$, and the predicted particle set $\{X_k^i, i = 1, \ldots, N\}$ is then obtained by propagating each particle $X_{k-1}^i$ according to Eq. (3).

Particle weighting: Given the predicted particle set, the weights are evaluated by
$$w_k^i \propto \frac{p(Z_k|X_k^i)\, p(X_k^i|X_{k-1}^i)}{q(X_k^i|X_{k-1}^i, Z_{1:k})}. \quad (9)$$
If the proposal $q(X_k^i|X_{k-1}^i, Z_{1:k})$ is taken as $p(X_k|X_{k-1})$, Eq. (9) simplifies to
$$w_k^i \propto p(Z_k|X_k^i). \quad (10)$$

State outputting: The mean state of the new particle set, which specifies the position of the object being tracked, is calculated with the minimum mean square error (MMSE) estimator
$$\hat X_k = E(X_k) \approx \sum_{i=1}^{N} w_k^i X_k^i. \quad (11)$$

Re-sampling: Re-sampling is an auxiliary step used to alleviate the particle degeneracy problem inevitably encountered over time [1]. It is implemented by replicating particles with high weights multiple times and discarding particles with relatively low weights. The result of re-sampling is a new set of equally weighted particles $\{(X_k^i, 1/N), i = 1, \ldots, N\}$.
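The paper does not state which re-sampling scheme is used; below is a sketch of systematic re-sampling, a common low-variance choice:

```python
import numpy as np

def systematic_resample(particles, weights, rng=None):
    """Systematic re-sampling: draw N replacement particles with probability
    proportional to weight; returns an equally weighted set {(X_k^i, 1/N)}.
    Assumes `weights` is normalized to sum to 1."""
    rng = rng or np.random.default_rng()
    N = len(weights)
    positions = (rng.random() + np.arange(N)) / N   # one stratified draw per slot
    idx = np.searchsorted(np.cumsum(weights), positions)
    idx = np.minimum(idx, N - 1)                    # guard against round-off
    return particles[idx].copy(), np.full(N, 1.0 / N)
```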


Table 2. The proposed tracking algorithm with occlusion handling

(1) Initialization:
  • For $i = 1, \ldots, N$: draw $X_0^i \sim p(X_0)$ /* initial particles from the prior distribution */
  • Set occ_flag = 1 /* the occlusion flag indicates no occlusion at the beginning */
  • Select the initial object appearance $A_0$ in the first frame.
(2) For $k = 1, 2, \ldots$
  • Particle predicting and weighting: for $i = 1, \ldots, N$:
    - $X_k^i \sim p(X_k|X_{k-1}^i)$ /* propagate the particle set */
    - Compute the observation $Z_k^i$ of the corresponding state $X_k^i$ by Eqs. (4) and (5).
    - Compute the likelihood density $p(Z_k|X_k^i)$ by Eq. (6).
    - Compute the weight $w_k^i$ by Eq. (10).
  • Normalize the weights so that $\sum_{i=1}^{N} w_k^i = 1$.
  • State outputting: estimate the state $\hat X_k$ with the MMSE estimator, Eq. (11).
  • Re-sampling: re-sample to obtain a set of replacement particles $(X_k^i, 1/N) \sim (X_k^i, w_k^i)$, $i = 1, \ldots, N$.
  • Appearance model updating:
    - Calculate the observation $\hat Z_k$ of the estimated state $\hat X_k$ by Eqs. (4) and (5).
    - Set the occlusion flag occ_flag according to the number of outlier pixels in $\hat Z_k$.
    - If occ_flag = 1: update the mixture Gaussians of each pixel in $A_{k-1}$ by the SKDA algorithm in Table 1. Else: $A_k = A_{k-1}$.
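Putting the steps together, a compact skeleton of the Table 2 loop might look as follows. It reuses the hypothetical helpers sketched earlier (propagate, observe, density, skda_update, systematic_resample); the observation-noise variance default is an illustrative assumption, and the outlier test and 25% threshold anticipate Section 4.4:

```python
import numpy as np

def track_frame(image, particles, weights, appearance, Ps, rng, var_z=25.0):
    """One iteration of the Table 2 loop (sketch). `appearance` is a list of P
    per-pixel (w, mu, var) mixture parameters, which accommodates a varying
    number of modes per pixel."""
    particles = propagate(particles)                      # Eq. (3)
    for i, X in enumerate(particles):
        Z = observe(image, X, Ps)                         # Eqs. (4) and (5)
        # Eq. (6): product over pixels (log-space is safer against underflow).
        weights[i] = np.prod([density(z, *g) for z, g in zip(Z, appearance)])
    weights /= weights.sum()
    state = weights @ particles                           # MMSE estimate, Eq. (11)
    particles, weights = systematic_resample(particles, weights, rng)
    # Appearance model updating with occlusion gating (Section 4.4).
    Zhat = observe(image, state, Ps)
    outlier = np.array([np.all(np.abs(z - mu) > 2.0 * np.sqrt(var))  # 2-sigma rule
                        for z, (w, mu, var) in zip(Zhat, appearance)])
    if outlier.mean() <= 0.25:                            # occ_flag = 1: no severe occlusion
        appearance = [skda_update(w, mu, var, z, var_z)
                      for (w, mu, var), z in zip(appearance, Zhat)]
    return particles, weights, state, appearance
```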


4.4. Occlusion handling


In visual tracking, occlusion is an unavoidable event that typically occurs when the object being tracked moves behind occluding objects, e.g., trees, buildings or other interacting objects. In the presence of occlusion, the tracker is liable to lose the object. Therefore, an explicit occlusion handling strategy is indispensable for robust tracking. Like [10], we use robust statistics to address occlusion. Generally speaking, occlusion incurs large variations of the intensity values within the occluded region. These pixels are viewed as outlier pixels, which cannot be explained by the underlying process and adversely influence state estimation. A robust Huber function $\rho$ is adopted to alleviate this influence; it is defined as
$$\rho(x) = \begin{cases} \dfrac{1}{2}x^2 & \text{if } |x| \le c, \\[4pt] c|x| - \dfrac{1}{2}c^2 & \text{otherwise}. \end{cases} \quad (12)$$
For the $i$th pixel, we set
$$x = \frac{Z_k^i - \mu_{m,k-1}^i}{\sigma_{m,k-1}^i}. \quad (13)$$
Substituting Eq. (13) into Eq. (12), for all $m = 1, \ldots, M$ the Huber function becomes
$$\rho\!\left(\frac{Z_k^i - \mu_{m,k-1}^i}{\sigma_{m,k-1}^i}\right) = \begin{cases} \dfrac{1}{2}\left(\dfrac{Z_k^i - \mu_{m,k-1}^i}{\sigma_{m,k-1}^i}\right)^2 & \text{if } |Z_k^i - \mu_{m,k-1}^i| \le c\,\sigma_{m,k-1}^i, \\[6pt] c\left(\dfrac{|Z_k^i - \mu_{m,k-1}^i|}{\sigma_{m,k-1}^i} - \dfrac{c}{2}\right) & \text{otherwise}, \end{cases} \quad (14)$$
where $c$ is a constant used to control the outlier rate. In the experiments we take $c = 2$. This corresponds to the well-known 2-$\sigma$ rule, under which there is only about a 4.5 percent chance of considering a pixel an outlier. If the newly observed intensity $Z_k^i$ of a pixel matches none of the Gaussian modes, we treat it as an outlier pixel. The corresponding likelihood evaluation in Eq. (6) is then replaced by
$$N\big(Z_k^i; \mu_{m,k-1}^i, (\sigma_{m,k-1}^i)^2\big) = \frac{1}{\sqrt{2\pi}\,\sigma_{m,k-1}^i} \exp\!\left(-c\left(\frac{|Z_k^i - \mu_{m,k-1}^i|}{\sigma_{m,k-1}^i} - \frac{c}{2}\right)\right). \quad (15)$$


This replacement reduces the influence of outlier pixels on the observation likelihood. If the total number of outlier pixels exceeds 25 percent of the appearance size $P$, we deem that severe occlusion has occurred and stop updating the appearance model, which effectively prevents the appearance model from wrongly absorbing the occluding area. Incorporating the appearance updating algorithm (SKDA) and the occlusion handling scheme into the particle filter framework yields a robust tracking algorithm, summarized in Table 2.
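A sketch of the robust per-pixel likelihood of Eqs. (12)–(15): residuals are normalized as in Eq. (13), a pixel matching no mode under the 2-$\sigma$ test is treated as an outlier, and its Gaussian evaluation is replaced by the exponential Huber tail of Eq. (15). The function name is ours; the severe-occlusion gating itself appears in the Table 2 sketch above.

```python
import numpy as np

def robust_pixel_likelihood(z, w, mu, sigma, c=2.0):
    """Per-pixel likelihood with the outlier replacement of Eq. (15).
    w, mu, sigma: weights, means and standard deviations of one pixel's modes."""
    r = np.abs(z - mu) / sigma                  # normalized residuals, Eq. (13)
    if np.all(r > c):                           # matches no mode: outlier pixel
        # Eq. (15): linear Huber tail replaces the quadratic Gaussian tail.
        comp = np.exp(-c * (r - c / 2.0)) / (np.sqrt(2.0 * np.pi) * sigma)
    else:
        comp = np.exp(-0.5 * r ** 2) / (np.sqrt(2.0 * np.pi) * sigma)
    return float(np.sum(w * comp))
```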

5. Experimental results and remarks

To verify the correctness and effectiveness of our algorithm, numerous experiments were carried out. The image sequences used in the experiments involve many difficult scenarios such as large pose variations, large illumination changes, cluttered background and severe occlusions. All the algorithms were run in the Matlab environment. For all experiments, the size of the appearance model is set to 20 × 20 and the number of particles is 100. The object being tracked is enclosed by a white rectangle, and the tracking algorithms are initialized by manually selecting the object in the first frame. All test sequences can be downloaded from the websites listed in [20].

5.1. Tracking a car

In the first video, we aim at tracking a car exposed to different illumination conditions. The sequence was recorded outdoors. The car moved from a bright road into a shadow area cast by a bridge, resulting in a large appearance variation. For comparison, we implemented two particle filter-based tracking algorithms: one based on a fixed appearance model and the other based on our adaptive appearance model. Fig. 1 shows sample frames for the two algorithms, where the first and second rows list the results of the fixed-appearance-model algorithm and of our adaptive appearance model, respectively. The sequence demonstrates the stability of our appearance model in dealing with large illumination changes.

5.2. Tracking a man's face

Fig. 2 shows the result of tracking a man's face undergoing illumination and expression changes. The main difficulty in this sequence lies in sudden light changes. The tracker based on the fixed appearance model drifted away from the face when the light was suddenly turned on in frames 220 and 570 and finally lost the face, while our adaptive appearance model accommodates the change of appearance and shows strong stability.

5.3. Tracking a man's face under drastic pose variation

In Fig. 3, we list selected frames of tracking a man's face. The appearance of the face varied dramatically as the person walked and simultaneously changed his pose. As illustrated in Fig. 3, the tracker with the fixed appearance model fails at frames 330 and 375, where the appearance of the face changes significantly, while our tracker reliably tracks the face. The experiment demonstrates the ability of our algorithm to cope with pose variations.

5.4. Tracking a man's face undergoing partial occlusion and pose variation

In this section, we show a more challenging sequence containing many difficult situations, such as partial occlusions (e.g. frames 38 and 41), facial expression changes (e.g. frames 200, 300 and 405), pose variations (e.g. frames 545, 775 and 800) and lighting variations (e.g. frame 689). The sequence tests robustness and effectiveness over a long period of time despite these combined effects. The experimental results in Fig. 4 show that our adaptive appearance model is able to constantly learn the changes of appearance and successfully track the object.

Fig. 1. Demonstration of tracking a car. Row 1: tracking using the fixed appearance model. Row 2: tracking using our appearance model.


Fig. 2. Sampling results of tracking a man’s face. Row 1: tracking using the fixed appearance model. Row 2: tracking using our appearance model.

Fig. 3. Experimental results of tracking a man’s face with large pose variations. Row 1: tracking using the fixed appearance model. Row 2: tracking using our appearance model.

Fig. 4. Some key frames of tracking a man’s face.

Fig. 5. Tracking a person undergoing frequent occlusions in real surveillance scenes.


5.5. Tracking a person undergoing frequent occlusions

In the last experiment, we validate our algorithm on the CAVIAR test sequence database to investigate its ability to handle severe occlusions. Three sequences are tested and the tracking results are shown in Fig. 5, each row listing one sequence. Since an explicit occlusion handling strategy is applied, outlier pixels are effectively addressed, making the tracker more robust against unstable and undesirable pixels. Additionally, the mechanism for judging severe occlusion events endows our appearance model with the ability to avoid absorbing the occluding area. In the third sequence, the person being tracked was fully occluded by another person. Despite being occasionally distracted at frame 156, where the trousers of the occluding person resembled the appearance of the person being tracked, the proposed algorithm still tracks the person through the whole sequence. Moreover, owing to the particle filter, our algorithm can recover from temporary loss, as can be seen at frame 191.

6. Conclusions

In this paper, we present a robust tracking algorithm using an adaptive appearance model which accounts for the changes of object appearance during tracking. Each pixel in the appearance model is modeled by a mixture of Gaussians whose parameters are automatically determined by SKDA. A key strength of our algorithm is the use of a particle filter for state estimation, which makes the tracker more robust against occlusions and cluttered backgrounds. In addition, occlusion events are explicitly handled by robust statistics. The whole algorithm is a potent synergy of an adaptive appearance model and the particle filter framework. Extensive experiments demonstrate that our algorithm can reliably and effectively track objects undergoing dramatic appearance changes.

Acknowledgement We are very grateful to David Ross for providing image sequences used in our experiments. We also appreciate the valuable comments and suggestions of the anonymous reviewers.

References

[1] van der Merwe R, Doucet A, de Freitas N, Wan E. The unscented particle filter. Technical Report CUED/F-INFENG/TR 380, Department of Engineering, Cambridge University, UK; 2000.
[2] Nummiaro K, Koller-Meier EB, Van Gool L. An adaptive color-based particle filter. Image Vision Comput 2003;21:99–110.
[3] Chang C, Ansari R. Kernel particle filter for visual tracking. IEEE Signal Process Lett 2005;12:242–5.
[4] Zhang B, Tian WF, Jin ZH. Head tracking based on the integration of two different particle filters. Meas Sci Technol 2006;17:2877–83.
[5] Han B, Davis L. On-line density-based appearance modeling for object tracking. In: Proceedings of the IEEE international conference on computer vision, Beijing; 2005. p. 1492–9.
[6] Nguyen HT, Worring M, van den Boomgaard R. Occlusion robust adaptive template tracking. In: Proceedings of the IEEE international conference on computer vision, Vancouver; 2001. p. 678–83.
[7] Raja Y, McKenna SJ, Gong S. Object tracking using adaptive colour mixture models. Image Vision Comput 1999;17:223–9.
[8] Jepson AD, Fleet DJ, El-Maraghi TF. Robust online appearance models for visual tracking. IEEE Trans Pattern Anal Mach Intell 2003;25:1296–311.
[9] Zhou S, Chellappa R, Moghaddam B. Visual tracking and recognition using appearance-adaptive models in particle filters. IEEE Trans Image Process 2004;13:1491–506.
[10] Li A, Jing Z, Hu S. Learning-based appearance model for probabilistic visual tracking. Opt Eng 2006;45(7):077204.
[11] Zivkovic Z, van der Heijden F. Recursive unsupervised learning of finite mixture models. IEEE Trans Pattern Anal Mach Intell 2004;26:651–6.
[12] Arandjelović O, Cipolla R. Incremental learning of temporally coherent Gaussian mixture models. In: Proceedings of the British machine vision conference, Oxford; 2005.
[13] Cheng J, Yang J, Zhou Y, Cui Y. Flexible background mixture models for foreground segmentation. Image Vision Comput 2006;24:473–82.
[14] Elgammal A, Duraiswami R, Harwood D, Davis L. Background and foreground modeling using non-parametric kernel density estimation for visual surveillance. Proc IEEE 2002;90:1151–63.
[15] Elgammal A, Duraiswami R, Davis L. Efficient kernel density estimation using the fast Gauss transform with applications to color modeling and tracking. IEEE Trans Pattern Anal Mach Intell 2003;25:1499–504.
[16] Han B. Adaptive kernel density approximation and its applications to real-time computer vision. PhD dissertation, University of Maryland, College Park, MD; 2005.
[17] Han B, Comaniciu D, Zhu Y, Davis L. Incremental density approximation and kernel-based Bayesian filtering for object tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition, Washington, DC; 2004. p. 638–44.
[18] Han B, Comaniciu D, Davis L. Sequential kernel density approximation through mode propagation: applications to background modeling. In: Proceedings of the Asian conference on computer vision, Jeju Island, Korea; 2004.
[19] Comaniciu D, Ramesh V, Meer P. The variable bandwidth mean shift and data-driven scale selection. In: Proceedings of the IEEE international conference on computer vision, Vancouver, Canada; 2001. p. 438–45.
[20] Test sequences used in the paper can be downloaded from: http://www.cs.toronto.edu/~dross/ivt/ and http://groups.inf.ed.ac.uk/vision/caviar/.


Bo Zhang was born in Datong, China, in 1977. He received his B.S. degree from Shenyang Ligong University, Shenyang, China, in 1999 and his M.S. degree from Northeastern University, Shenyang, China, in 2003. He is now a Ph.D. student in the Department of Instrument Science and Engineering, Shanghai Jiao Tong University, China. His research interests include motion detection, visual tracking, background modeling and pattern analysis.

Weifeng Tian was born in Shanghai, China, in 1958. She received her B.S. and Ph.D. degrees, both from the Department of Instrument Science and Engineering, Shanghai Jiao Tong University, Shanghai, China, in 1984 and 1998, respectively. Since 2000, she has been a professor in the Department of Instrument Science and Engineering of Shanghai Jiao Tong University. Her main research areas include navigation systems, robot localization, information fusion and computer vision. Currently, she is director of the Navigation and Control Lab.

Zhihua Jin was born in Shanghai, China, in 1942. He received his B.S. degree from Shanghai Jiao Tong University in 1965. Since 1998, he has been a professor in the Department of Instrument Science and Engineering of Shanghai Jiao Tong University, China. His research interests include navigation systems, inertial instruments and information fusion. He has authored two books and published more than 50 papers.