Camera handoff with adaptive resource management for multi-camera multi-object tracking


Chung-Hao Chen a,*, Yi Yao b, David Page c, Besma Abidi d, Andreas Koschan d, Mongi Abidi d

a Department of Mathematics and Computer Science, North Carolina Central University, NC 27713, USA
b GE Global Research Center, Niskayuna, NY 12309, USA
c Third Dimension Technologies LLC, Knoxville, TN 37920, USA
d Imaging, Robotics, and Intelligent Systems Laboratory, Department of Electrical Engineering and Computer Science, The University of Tennessee, Knoxville, TN 37996, USA


Article history: Received 5 September 2008; Received in revised form 1 September 2009; Accepted 26 October 2009

Keywords: Camera handoff; Multi-camera multi-object tracking; Resource management; Surveillance system

Abstract

Camera handoff is a crucial step to obtain a continuously tracked and consistently labeled trajectory of the object of interest in multi-camera surveillance systems. Most existing camera handoff algorithms concentrate on data association, namely consistent labeling, where images of the same object are identified across different cameras. However, there remain many unsolved questions in developing an efficient camera handoff algorithm. In this paper, we first design a trackability measure to quantitatively evaluate the effectiveness of object tracking, so that camera handoff can be triggered in a timely manner and the camera to which the object of interest is transferred can be selected optimally. Three components are considered: resolution, distance to the edge of the camera's field of view (FOV), and occlusion. In addition, most existing real-time object tracking systems see a decrease in the frame rate as the number of tracked objects increases. To address this issue, our handoff algorithm employs an adaptive resource management mechanism to dynamically allocate cameras' resources to multiple objects with different priorities so that the required minimum frame rate is maintained. Experimental results illustrate that the proposed camera handoff algorithm can achieve a substantially improved overall tracking rate, by 20%, in comparison with the algorithm presented by Khan and Shah. © 2009 Elsevier B.V. All rights reserved.

1. Introduction

With the increase in the scale and complexity of surveillance systems, it becomes increasingly difficult for a single camera to accomplish object tracking and monitoring with the required resolution and continuity. Camera networks have therefore emerged and found extensive applications. The employment of multiple cameras not only improves coverage but also brings in more flexibility. However, the use of multiple cameras induces problems such as camera handoff. Camera handoff is a decision process for transferring a mobile object from one camera to another, wherein consistent labeling solves the identity problem among multiple observing cameras and lays the foundation for camera handoff. In general, camera handoff regulates the collaboration among multiple cameras and answers the questions of When and Who: when a handoff request should be triggered to secure sufficient time for a successful consistent labeling, and who is the most qualified camera to take over

* Corresponding author. Tel.: +1 919 530 6237; fax: +1 919 530 6125. E-mail addresses: [email protected] (C.-H. Chen), [email protected] (Y. Yao), [email protected] (D. Page), [email protected] (B. Abidi), [email protected] (A. Koschan), [email protected] (M. Abidi). doi:10.1016/j.imavis.2009.10.013

the object of interest before it falls out of the FOV of the currently observing camera. Most existing camera handoff algorithms focus on developing efficient consistent labeling schemes. In the literature, consistent labeling methods can be grouped into three main categories: (I) feature-based, (II) geometry-based, and (III) hybrid approaches. In feature-based approaches [1,2,33], color or other distinguishing features of the tracked objects are matched, generating correspondence among cameras. The geometry-based approach can be divided into three sub-categories: location-based, alignment-based, and homography-based approaches. In location-based approaches [3,4], consistent labeling is established by projecting the trace of the tracked object back into the world coordinate system and then establishing equivalence between objects projected onto the same location. In alignment-based approaches [5,6], the tracks of the same object are recovered across different cameras after being aligned by the geometric transformation between cameras. The homography-based approach [7,8,31,32] obtains position correspondences between overlapped views in the 2D image plane. For instance, Calderara et al. [31] used a likelihood that is computed by warping the vertical axis of the new object onto the FOVs of the other cameras and computing the amount of match therein. This improves the algorithm's capability in handling both the cases


of single individuals and groups. Hu et al. [32] selected principal axes of people, under homography, as the primary features for matching. The hybrid approach [9] is a combination of geometry- and feature-based methods. Most existing consistent labeling methods [5,6,23] need a certain amount of time/frames to be carried out successfully. This validates the importance of when to trigger a handoff request, because a timely handoff ensures a successful consistent labeling before the object of interest falls out of the FOV of the currently observing camera. Meanwhile, the question of Who is challenging when the object of interest can be observed by multiple cameras. Most existing handoff algorithms choose the camera that the object of interest is approaching. This simple rule is frequently insufficient and leads to unnecessary handoffs. Note that, in essence, our proposed method mainly addresses multi-object tracking with joint views. Although the works of Javed et al. [22,25], Kang et al. [26], and Lim et al. [24] can consistently label objects in the case of disjoint views, those tracking systems cannot detect the occurrence of unusual events due to the lack of continuous observations of the object. This may cause a serious loophole in a surveillance system. Therefore, the motivation for introducing the trackability measure in this paper is to assist the camera handoff algorithm in preventing the occurrence of occlusion or discontinuity in a multi-camera surveillance system with overlapped FOVs. The camera handoff algorithm can then transfer the to-be-occluded or to-be-unseen objects to another proper camera beforehand. In comparison, the works of Javed et al. [22,25], Kang et al. [26], and Lim et al. [24] can only be used for compensation purposes. Due to the lack of research work addressing the questions of When and Who, there is no clear formulation to govern the transition between adjacent cameras. As a result, the abovementioned

camera handoff algorithms, concentrating on consistent labeling, are unable to optimize the system's performance in terms of handoff success rate. For instance, a handoff request in the work of Khan and Shah [5] is triggered when the object is close to the edge of the camera's FOV. No quantitative measure is given to describe what distance is considered close to the edge of the camera's FOV. One exemplary camera handoff approach, in the work of De Silva et al. [10], selects the successive camera by measuring which camera can obtain a better frontal view of the person to achieve a better recognition rate. A quantitative measure is derived and is sufficient for face recognition applications. However, the measure of frontal view is unable to evaluate the overall quality of object tracking. To select the optimal camera and minimize unnecessary handoff requests, multiple criteria should be considered, including resolution, occlusion, and distance to the edge of the camera's FOV. Fig. 1 shows examples of these criteria. Fig. 1a illustrates a scenario where two objects of interest are moving towards each other, leading to a high probability of occlusion. It is desirable to transfer the to-be-occluded object to another camera to avoid the potential occlusion. In Fig. 1b, the object of interest is moving toward the boundaries of the camera's FOV, which also requires a transition between cameras before it falls out of the FOV of the currently observing camera. In parallel, the object of interest in Fig. 1c is moving away from the camera along the camera's optical axis. As a result, the resolution of the object decreases to the point where it is infeasible for the camera to maintain its track. Under such conditions, a handoff is necessary as well. Therefore, in this paper, we propose a trackability measure including these multiple components, each of which describes a different aspect of object tracking. Equipped with this quantified and comprehensive measure of the effectiveness of object tracking, we can answer the questions of When and Who with an optimized solution.

Fig. 1. Image sequence examples of: (a) occlusion, (b) distance to the edge of the camera’s FOV, and (c) resolution.


In addition, most multiple object tracking systems [11–14] find it difficult to maintain a constant frame rate given limited resources. Note that frame rates in this paper represent the number of frames per second processed by the tracking system for executing functions such as tracking, crowd segmentation, and behavioral understanding, instead of the number of frames read in by the cameras themselves. This difference occurs because the tracking system is incapable of processing every read-in frame while accommodating the execution of all functions in real time given limited resources, even though the cameras themselves are capable of acquiring more frames. Herewith, resources include (I) CPU capacity for executing object tracking, crowd segmentation, and behavior understanding in an automated manner [16] and (II) network bandwidth for exchanging camera handoff information. The computational complexity of most existing tracking systems [11–14] is of the order of N_p O(n) to N_p O(n^3) [15], where N_p is the number of tracked objects and n represents the number of steps to execute the algorithm. There inherently exists an upper bound on the number of objects that can be tracked simultaneously without deteriorating the system's frame rate. Unprocessed read-in frames may be dropped immediately or reserved for future reference. Therefore, it is crucial for a tracking system to be able to maintain a reasonable frame rate in real time. A lower frame rate may result in the following problems: (I) the surveillance system's real-time ability to automatically detect a threatening event degrades, causing possible observation leaks. This dangerous loophole impedes the practical application of these real-time multi-camera multi-object tracking systems [17]. (II) The decreased frame rate also affects the performance of consistent labeling and consequently camera handoff, because a successful execution of consistent labeling requires accumulated information about the object of interest over a period of time [5,6,23]. The reduced frame rate leads to a decreased number of available frames/information for carrying out consistent labeling successfully. In summary, the contributions of this paper are: (I) a trackability measure is introduced to quantitatively evaluate the effectiveness of a camera in observing the tracked object. This gives a quantified metric to direct camera handoff for continuous and automated tracking before the tracked object is occluded or falls out of the FOV of the currently observing camera. (II) An adaptive resource management algorithm that automatically and dynamically allocates resources to objects with different priority ranks is developed. (III) Based on the trackability measure and adaptive resource management, a camera handoff algorithm is designed. The proposed handoff algorithm can achieve a significantly improved overall tracking rate while maintaining a constant frame rate for each camera. The remainder of this paper is organized as follows. Section 2 illustrates the overall system architecture of our proposed camera handoff algorithm. Section 3 defines the trackability measure. Section 4 presents the adaptive resource management algorithm. Experimental results are demonstrated in Section 5, and Section 6 concludes the paper.

2. Camera handoff

The flow chart of the proposed camera handoff algorithm is shown in Fig. 2, where operations are carried out at the handoff request and handoff response sides. Let the jth camera be the handoff request side and the ith object be the one that needs a transfer. To maintain persistent and continuous object tracking, a handoff request is triggered before the object of interest becomes untraceable or unidentifiable in the currently observing camera. The object of interest may become untraceable or unidentifiable for the following reasons: (I) the object is being occluded by other objects, (II) the object is leaving the camera's FOV, and (III) the object's resolution is getting low. Accordingly, three criteria are defined in the trackability measure to determine when to trigger a handoff request: occlusion (M_O), distance to the edge of the camera's FOV (M_D), and resolution (M_S). Let M_O,ij, M_D,ij, and M_S,ij be the M_O, M_D, and M_S values of the ith object observed by the jth camera, respectively. These three components M_O,ij, M_D,ij, and M_S,ij, to be discussed in detail in Section 3, are scaled to [0, 1], where zero means that the object is untraceable or unidentifiable and one means that the camera has the best effectiveness in tracking the object.

Fig. 2. Flow chart of the proposed camera handoff algorithm.


Define the trigger criterion C_T,ij as:

$$C_{T,ij} = \left[(M_{O,ij} < T_O) \wedge \left(\frac{dM_{O,ij}}{dt} < 0\right)\right] \vee \left[(M_{D,ij} < T_D) \wedge \left(\frac{dM_{D,ij}}{dt} < 0\right)\right] \vee \left[(M_{S,ij} < T_S) \wedge \left(\frac{dM_{S,ij}}{dt} < 0\right)\right], \quad (1)$$

where ∧ and ∨, both logical symbols, represent 'and' and 'or' operations, respectively. T_O, T_D, and T_S, associated with M_O, M_D, and M_S, represent the predefined thresholds for triggering handoff and are mainly determined by the time needed for handoff execution and the objects' maximal moving speed. A handoff request, therefore, is triggered and broadcasted if C_T,ij = 1, which indicates that at least one of the three components is below its predefined threshold and is decreasing. The decreasing criterion in Eq. (1) may appear redundant, since at the first instant that one of the components drops below its predefined threshold, its derivative is necessarily negative. However, a certain amount of execution time is required from the point when a handoff is triggered to the point when a handoff is granted. During this period of time, the trackability measure may change due to the dynamics of the object of interest; the measure may still be below the threshold while its value is increasing. For instance, the object of interest may change its motion direction and begin to move towards the center of the camera's FOV. To address this situation and avoid back-and-forth transitions, the decreasing criterion is included. Furthermore, since each component of the trackability measure is computed from the estimated position of the object of interest, noise and possible jitter exist, primarily due to the limited detection and tracking accuracy. Kalman filtering is therefore employed to smooth the trackability measure by exploiting the dynamics of the object. In so doing, the probability of back-and-forth transitions is reduced, which in turn improves the efficiency of the proposed handoff algorithm. Afterwards, the jth camera keeps tracking the ith object and waits for confirmation responses from adjacent cameras while the object is still visible. At the handoff response side, the (j')th camera examines its current load. Let N_th,j',r denote the maximum number of objects with a priority rank smaller than or equal to r that can be tracked simultaneously, and n_j',r the number of objects with priority rank r that have been tracked by the (j')th camera. A positive handoff response for the ith object is granted if ∑_{k=1}^{r} n_j',k < N_th,j',r, i.e., the total number of tracked objects with priority ranks smaller than or equal to r must be less than the corresponding threshold. To achieve a higher acceptance rate, or equivalently a higher handoff success rate, the thresholds N_th,j',r should be adaptively adjusted according to the system's current load. Given limited capacity, more resources should be allocated to objects with higher priorities at the cost of dropping objects with lower priorities. Such a system provides a higher threat awareness level compared to systems where all objects have the same priority rank. Sometimes additional requirements on the overload probabilities of objects with different priority ranks are given. To meet these requirements, we need an online learning process that automatically adjusts the distribution of the capacities according to the estimated system load. Since the priority rank plays an important role in evaluating the load of a camera, before continuing with the proposed handoff algorithm, we pause here to clarify possible selection methods for the priority rank. The priority rank assignment is application dependent. Frequently, the priority rank depends on the object's behavior.
For example, in an airport surveillance system, the moving direction is a good hint for allocating the priority rank. Passengers walking in the opposite direction of an exit hallway should be assigned a higher priority rank than passengers following the regulated direction. Another example is a workplace surveillance system, where close observation is necessary for workers handling valuable assets [34]. Workers in the close vicinity of these highly valuable assets should be assigned a higher priority rank. The initial priority ranks are obtained from low-level behavior understanding that can be performed easily once the object of interest is detected, for example, the motion direction and location as discussed in the previous two examples. These initial priority ranks can be adjusted and refined once the observation is long enough to carry out more complex behavior understanding. Back at the handoff request side, if no positive handoff response is received before the jth camera loses track of the ith object, a handoff failure is issued. Otherwise, consistent labeling is carried out between the handoff request side and all available candidate cameras. A handoff failure means that the ith object is no longer tracked or monitored by any camera in the system. It might be picked up by a camera later on, once it enters that camera's FOV and the camera has resources to process it. However, without a successful handoff, its original identity is lost and a new identity is assigned instead. In order to select the most appropriate candidate camera to take over the object of interest from the pool of candidate cameras, the one with the lowest system load P_O,ij' and the highest trackability measure Q_ij' is chosen:

$$B_{ij'} = (1 - P_{O,ij'})\, Q_{ij'}, \quad (2)$$

where P_O,ij' is the overload probability of the ith object in the (j')th camera and Q_ij' denotes the trackability measure of the ith object in the (j')th camera. The detailed definitions of Q_ij' and P_O,ij' are given in Sections 3 and 4, respectively. The term (1 − P_O,ij') is included to reduce the chances of choosing a camera with a high system load, which ensures an evenly distributed system load across all cameras. The execution criterion C_E,ij* is defined as:

$$C_{E,ij^*} = \left[(M_{O,ij} < M_{O,ij^*}) \vee \left(\frac{dM_{O,ij^*}}{dt} > 0\right)\right] \wedge \left[(M_{D,ij} < M_{D,ij^*}) \vee \left(\frac{dM_{D,ij^*}}{dt} > 0\right)\right] \wedge \left[(M_{S,ij} < M_{S,ij^*}) \vee \left(\frac{dM_{S,ij^*}}{dt} > 0\right)\right]. \quad (3)$$

An efficient tracking system should be able to direct camera handoff for continuous and automated tracking before the tracked object is occluded or falls out of the FOV of the currently observing camera, while the system load is evenly distributed without deteriorating the frame rate of each camera. Thus, the ith object is transferred to the (j*)th camera if C_E,ij* = 1. Note that in some applications, such as the work of Lien and Huang [28], each object of interest is tracked by multiple cameras to obtain more or better monitoring results. Our proposed handoff algorithm can easily be applied to these applications, because each camera still needs to hand off the object of interest to another camera that is not tracking the object when occlusion, low resolution, or falling out of its FOV occurs. In one extreme case, where the object of interest is tracked or monitored by all the cameras that can see it, our algorithm can be employed with the following modifications. The handoff triggering and camera selection processes are not necessary. Once an object of interest enters the field of view of a camera, the camera computes its current load to determine whether it has spare computational resources to accept the object of interest. If the object of interest is accepted, consistent labeling is performed between this camera and adjacent cameras.
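To make these two decisions concrete, the following is a minimal sketch, in Python, of the request-side trigger of Eq. (1) and the response-side camera selection of Eq. (2). It is not the authors' implementation: the finite difference stands in for the derivative and the Kalman smoothing described above, the thresholds follow the values used in Section 5, and all object and camera data are illustrative.

```python
# Minimal sketch of the When/Who decisions of Section 2.
# trigger_handoff evaluates C_T,ij of Eq. (1): request a handoff when any
# trackability component is below its threshold AND still decreasing.
# select_camera evaluates Eq. (2): pick j* = argmax (1 - P_O,ij') * Q_ij'.
# Thresholds follow Section 5; all numeric data below are illustrative.

T_O = T_D = T_S = 0.2

def trigger_handoff(curr, prev, thresholds=(T_O, T_D, T_S)):
    """curr/prev: smoothed (M_O, M_D, M_S) of object i in camera j at the
    current and previous frames; the finite difference stands in for the
    time derivative in Eq. (1)."""
    return any(m < t and m - m_prev < 0.0
               for m, m_prev, t in zip(curr, prev, thresholds))

def select_camera(candidates):
    """candidates: camera_id -> (P_O_ij, Q_ij) for cameras that returned a
    positive handoff response. Returns j* of Eq. (2)."""
    return max(candidates,
               key=lambda j: (1.0 - candidates[j][0]) * candidates[j][1])

# Object drifting toward the FOV edge: M_D drops from 0.24 to 0.18 < T_D.
prev, curr = (0.5, 0.24, 0.45), (0.5, 0.18, 0.45)
if trigger_handoff(curr, prev):
    responses = {'cam1': (0.1, 0.55), 'cam6': (0.3, 0.57)}  # (P_O, Q)
    print('transfer to', select_camera(responses))          # -> cam1
```

Note how the load factor (1 − P_O,ij') can override a slightly higher trackability measure: cam6 sees the object marginally better, but the more lightly loaded cam1 wins the selection.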


3. Trackability measure

In the following discussion, formulas are derived for a single target observed by a single camera. For clarity of representation, the subscripts i and j are omitted. Assume that object tracking provides estimates of the target image's relative scale ρ and center of mass g = [g_x g_y]^T. The size-preserving tracking algorithm discussed in [18] can be used for this purpose. The resolution component M_S is defined as:

$$M_S = \alpha_S \frac{f}{Z_r} = \alpha_S \rho, \quad (4)$$

where f represents the camera's focal length, Z_r is the average target depth, and α_S denotes the normalization coefficient. Let f_max denote the maximum focal length of the camera and Z_r,min the minimum distance between the target and the camera such that the target can still be observed completely. The normalization coefficient α_S is given by

$$\alpha_S = \frac{Z_{r,min}}{f_{max}}.$$

To reserve enough computation time for the execution of the handoff between cameras, the object should remain at a distance from the boundaries of the camera’s FOV. This margin distance is also affected by the object’s depth. When the object is at a closer distance to the observing camera, its projected image undergoes a larger displacement in the image plane. Therefore, a larger margin should be reserved. In our definition, a varying polynomial power is used to achieve different decreasing/increasing rates and in turn different margin distances. The MD term is defined as:

( " 2  2 #)b1 qþb0     2g x 2g y 1    MD ¼  1 þ 1    1 ; 1 2 Nx Ny

ð5Þ

where N_x (N_y) denotes the width (height) of the image. The M_D component evaluates the distance from the four image boundaries defined by x = ±N_x/2 and y = ±N_y/2. The coefficients β_1 and β_0 are used to adjust the polynomial power according to the target depth. In our experiments, we choose β_1 and β_0 according to β_1 f_max/Z_r,min + β_0 = 1 and 2β_1 f_max/Z_r,min + β_0 = 0.5, which leads to β_1 = −Z_r,min/(2 f_max) and β_0 = 1.5. The above equations are obtained empirically. In order to continuously track multiple objects, the system should be able to transfer a tracked object with latent occlusion to another camera with a clear view. Therefore, occlusion caused by objects' motion is also considered. The M_O term is defined as:

$$M_O = \alpha_O \left\{\min_{i \neq j}\left[(g_{x,i} - g_{x,j})^2 + (g_{y,i} - g_{y,j})^2\right]\right\}^{\beta_1 \rho + \beta_0}, \quad (6)$$

where α_O is a normalization weight, and [g_x,i g_y,i]^T and [g_x,j g_y,j]^T denote the centers of mass of any pair of objects in the field of view of the currently tracking camera. Occlusion can be caused by stationary obstacles, such as tables and cabinets, or by other moving pedestrians in the environment. Thus, these objects include not only mobile objects but also stationary ones. Nevertheless, how to differentiate which object is in front and which is behind is beyond the scope of this paper; interested readers can refer to the work of Hoiem et al. [27]. In conclusion, the trackability measure is given by:

$$Q = M_O (w_S M_S + w_D M_D), \quad (7)$$

where w_S and w_D are importance weights for the resolution and distance components, respectively. The sum of these importance weights is one. The selection of the importance weights is application dependent. We purposefully reserve the freedom for users to choose different importance weights according to their special requirements to increase our algorithm's flexibility. Meanwhile, default values can be used if the corresponding variables are not specified by users; the default values of w_S and w_D are simply 0.5. The resolution and distance components describe two aspects of the same interaction between the target and the observing camera. Summation is used to combine the quantitative measures of these two aspects. In contrast, the occlusion component measures the interaction between two targets, which is independent of the interaction between the target and the camera. Therefore, the occlusion component is included via multiplication.
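As a concrete illustration of Eqs. (4)–(7), the sketch below evaluates the three components and combines them into Q. The camera constants (f_max, Z_r,min) and the occlusion normalization α_O are assumed values for illustration only; in particular, folding α_O inside the power in M_O is a simplification of Eq. (6) that keeps the component within [0, 1].

```python
# Sketch of the trackability measure, Eqs. (4)-(7). Camera constants
# (F_MAX, ZR_MIN) and the occlusion normalization are assumed values.

F_MAX, ZR_MIN = 70.0, 2.0        # assumed max focal length, min depth
NX, NY = 640, 480                # image width/height (Section 5)
B1 = -ZR_MIN / (2.0 * F_MAX)     # beta_1 from the two conditions above
B0 = 1.5                         # beta_0

def power(rho):
    """Depth-dependent polynomial power beta_1*rho + beta_0."""
    return B1 * rho + B0

def m_s(rho):
    """Resolution component, Eq. (4): M_S = alpha_S * rho."""
    return (ZR_MIN / F_MAX) * rho

def m_d(gx, gy, rho):
    """Distance-to-edge component, Eq. (5)."""
    base = 1.0 - 0.5 * ((2.0 * gx / NX) ** 2 + (2.0 * gy / NY) ** 2)
    return max(base, 0.0) ** power(rho)

def m_o(centers, rho, alpha_o=1.0 / (NX ** 2 + NY ** 2)):
    """Occlusion component, Eq. (6): minimum squared distance between any
    pair of object centers. Folding the (assumed) normalization alpha_O
    inside the power is a simplification that keeps M_O within [0, 1]."""
    d2 = min((xi - xj) ** 2 + (yi - yj) ** 2
             for a, (xi, yi) in enumerate(centers)
             for b, (xj, yj) in enumerate(centers) if a != b)
    return (alpha_o * d2) ** power(rho)

def trackability(rho, gx, gy, centers, ws=0.5, wd=0.5):
    """Overall measure, Eq. (7): Q = M_O * (w_S*M_S + w_D*M_D)."""
    return m_o(centers, rho) * (ws * m_s(rho) + wd * m_d(gx, gy, rho))

# Target near the image center at (50, -30) with one nearby object:
print(trackability(20.0, 50.0, -30.0, [(50.0, -30.0), (260.0, 90.0)]))
```

In this configuration the occlusion component dominates the product, which reflects the design choice above: a well-resolved, well-centered object still has a low trackability measure if another object is about to occlude it.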

4. Adaptive resource management

In this section, we first derive the overload probabilities of objects at different priority ranks and then introduce our resource management algorithm. In the following discussion, formulas are derived for any single camera. For clarity of representation, the subscript j is omitted.

4.1. Probability of camera overload

Assume that the arrival of objects with priority rank r follows a Poisson distribution with rate λ_r. The amount of time that an object remains within the camera's FOV is independent and follows an exponential distribution with mean 1/μ. The exponential distribution describes a process where events occur continuously and independently at a constant average rate. It has been proved to be a good approximate model for service rates and has been widely used in queueing systems such as bank service and wireless communication [21,35,36]. The case of an object of interest entering the FOV of a surveillance system for the service of "tracking" and "monitoring" is similar to the case of a customer entering a bank for the service of account transactions and the case of a mobile call entering the base station for the service of wireless communication. Therefore, the exponential distribution is chosen to model the amount of time an object remains within the FOV. Let N_th,r be the maximum number of objects with a priority rank smaller than or equal to r that can be tracked simultaneously. We deliberately add N_th,0 = 0 to simplify the formulation. Let the maximum number of objects that can be tracked simultaneously be N_max and the total number of priority ranks be N_pr. To derive the overload probability of objects at different priority ranks, a multi-object tracking system is modeled as an M/M/N_max/N_max/FCFS queuing system, where M represents a Poisson arrival or departure distribution and the serving rule is first come first served (FCFS) [19,20]. Such a system constitutes a Markov process of the birth–death type, as shown in Fig. 3. We examine the queuing system at equilibrium. Under proper conditions, such equilibrium is reached after the system has been operating for a period of time. It also implies that the probability of n objects being tracked, P(n), eventually becomes stable, where n ranges from 0 to N_max. Therefore, the probability of the nth state can be computed given the probability of the (n−1)th state:

Fig. 3. Illustration of the state transition of an M/M/Nmax/Nmax/FCFS queuing system, which is used to model a multi-object tracking system.


$$P(n) = \frac{\sum_{k=r}^{N_{pr}} \lambda_k}{n\mu}\, P(n-1), \quad (8)$$

where N_th,r−1 < n ≤ N_th,r. This relation leads to:

$$P(n) = \frac{P(0)}{n!} \prod_{k=1}^{r-1} \left( \sum_{l=k}^{N_{pr}} \frac{\lambda_l}{\mu} \right)^{N_{th,k}-N_{th,k-1}} \left( \sum_{k=r}^{N_{pr}} \frac{\lambda_k}{\mu} \right)^{n-N_{th,r-1}}, \quad (9)$$

where P(0) follows from the normalization condition ∑_{n=0}^{N_max} P(n) = 1:

$$P(0) = \left\{ 1 + \sum_{r=1}^{N_{pr}} \sum_{n=N_{th,r-1}+1}^{N_{th,r}} \frac{1}{n!} \prod_{g=1}^{r-1} \left( \sum_{l=g}^{N_{pr}} \frac{\lambda_l}{\mu} \right)^{N_{th,g}-N_{th,g-1}} \left( \sum_{k=r}^{N_{pr}} \frac{\lambda_k}{\mu} \right)^{n-N_{th,r-1}} \right\}^{-1}. \quad (10)$$

According to (9) and (10), the overload probability for an object with a priority rank of r is given by

$$P_{O,r} = \sum_{n=N_{th,r}}^{N_{max}} P(n). \quad (11)$$

The overload probability is one important criterion to evaluate the performance of a multi-camera system fulfilling multiple object tracking. It determines the number of objects that may be dropped due to limited resources. Therefore, in practice, it is desirable to distribute the resources dynamically according to the system's current load and the objects' priority ranks. From the above derivations, we learn that N_th,r determines the overload probabilities. Given the overload probabilities for objects at different priority ranks, we can adjust these thresholds to achieve the requirements. If the real-time estimated overload probability, P̂_O,r, for the object with a priority rank r exceeds the desired overload probability, P_th,r, we need to decrease the thresholds N_th,k with 1 ≤ k < r or increase the thresholds N_th,k with r ≤ k ≤ N_pr. Based on this key concept, we develop our adaptive resource management algorithm.

4.2. Algorithm description

The flow chart of our resource management algorithm is illustrated in Fig. 4. If the arrival rates λ_r with 1 ≤ r ≤ N_pr are known, the initial thresholds N_th,r can be computed as

$$N_{th,r} = \frac{\sum_{k=1}^{r} \lambda_k}{\sum_{k=1}^{N_{pr}} \lambda_k}\, N_{max}.$$

If not, the initial values can be set to N_th,r = r N_max / N_pr. Let n_r be the number of tracked objects with priority rank r. As we mentioned before, if ∑_{k=1}^{r} n_k < N_th,r, the handoff request is accepted. Otherwise the handoff request is rejected. Afterwards, the real-time arrival rates of objects with different ranks, λ̂_r, are estimated during the time frame 1/μ. Note that even in scenarios with known average arrival rates, it is still necessary to estimate the real-time arrival rates so as to adjust resource allocation among objects with different ranks according to the current system load. Given the estimated λ̂_r, the real-time overload probability, P̂_O,r, for objects with rank r can be computed according to Eq. (11). The estimated overload probability P̂_O,r is then compared with the predefined or desired overload probability P_th,r. If P̂_O,r > P_th,r, the thresholds N_th,r−1 and N_th,r should be adjusted.

Fig. 4. Flow chart of the proposed adaptive resource management scheme. In general, if the real-time estimated overload probability, P̂_O,r, for the object with a priority rank r exceeds the predefined or desired overload probability P_th,r, we need to decrease the thresholds N_th,k with 1 ≤ k < r or increase the thresholds N_th,k with r ≤ k ≤ N_pr.

Ideally, we want to increase N_th,r and decrease N_th,r−1. However, varying N_th,r−1 and N_th,r also affects the overload probabilities of objects from other ranks. In addition, the estimated overload probability P̂_O,r may fluctuate, which in turn induces unnecessary adjustment of the thresholds. Therefore, to smooth the decisions over a period of time and incorporate the requirements from objects of other ranks, a flag, denoted F_r, is set up for the threshold at each priority rank. If P̂_O,r > P_th,r, F_r−1 is decreased by r, suggesting that a decrease in N_th,r−1 is requested, and F_r is increased by r, suggesting that an increase in N_th,r is preferred. Since it is cumulative, F_r takes previous decisions into consideration as well. If multiple handoff requests are received, the same procedure repeats for each object, and the decisions from multiple objects are combined in F_k with k = 1, ..., N_pr. The contribution to F_k from each object is associated with its priority rank. In so doing, more importance is assigned to the decisions from objects with higher priorities, and the subsequent adjustment of the thresholds favors a smaller overload probability for objects with higher priorities. A more prompt response is also achieved for objects with higher priorities. In addition, the priority rank is included to improve the system's level of threat awareness. Priority ranks can be assigned to tracked objects according to their behaviors. For example, in the surveillance of an airport, passengers moving along the indicated direction (from the gates to the exit) in the hallway are assigned a lower priority, while passengers moving in the opposite direction are assigned a higher priority. After all the objects have been processed, the thresholds are updated. If F_k > F_th, N_th,k is increased by one, where F_th is a predefined threshold. If F_k < −F_th, N_th,k is reduced by one. After the adjustment of N_th,k, the corresponding F_k is reset to zero. N_th,k remains the same if |F_k| ≤ F_th. The complexity of computing P̂_O,r and F_k is of the order O(∑_{k=1}^{N_pr} n_k). The adjustment of the thresholds N_th,k has a computational complexity of O(N_pr). As a result, the proposed resource adjustment is able to dynamically reallocate the available resources at marginally increased computational cost in comparison with the complexity of multiple object tracking and consistent labeling.
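A minimal sketch of this flag-based adjustment is given below; the vote weights and flag threshold follow the description above, but the sequence of overload estimates and all numeric values are illustrative, not values from the paper.

```python
# Sketch of the flag-based threshold adaptation described above.
# Each overloaded rank votes, weighted by its priority r, to lower
# N_th,r-1 and raise N_th,r; a threshold moves only once the
# accumulated flag |F_k| exceeds F_th. All constants are examples.

def vote(flags, r, p_hat, p_th):
    """If the estimated overload P_hat_O,r exceeds the target P_th,r,
    request F_{r-1} -= r (decrease N_th,r-1) and F_r += r (increase
    N_th,r)."""
    if p_hat > p_th:
        if r - 1 >= 1:
            flags[r - 1] -= r
        flags[r] += r

def update_thresholds(flags, n_th, f_th):
    """After all objects are processed: move each N_th,k by one step
    when its flag passes +/-F_th, then reset that flag."""
    for k in range(1, len(n_th) - 1):   # N_th,Npr = Nmax stays fixed
        if flags[k] > f_th:
            n_th[k] += 1
            flags[k] = 0
        elif flags[k] < -f_th:
            n_th[k] -= 1
            flags[k] = 0

flags, n_th, f_th = {1: 0, 2: 0}, [0, 2, 6], 3
for _ in range(4):                       # four low-priority overload votes
    vote(flags, r=1, p_hat=0.71, p_th=0.2)
update_thresholds(flags, n_th, f_th)
print(n_th)                              # [0, 3, 6]: N_th,1 raised by one
```

Because the votes accumulate across objects and frames, a single noisy overload estimate cannot move a threshold on its own, which is the smoothing behavior described above.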

4.3. Example system

To further study the effect of adjusting N_th,r for adaptive resource management, we consider an asset monitoring system as an example. In this application, people who are close to or carry the valued asset should be adaptively allocated more resources when the system's load is high, because the tracking system needs to track these people continuously in order to immediately detect any threats to the valued asset. A system with N_pr = 2, therefore, represents a system with only two types of objects, of high and low priorities. Let λ_H and λ_L be the arrival rates of objects with high and low priorities, respectively. The probability of n tracked objects is given by

$$P(n) = \begin{cases} P(0)\, \dfrac{1}{n!} \left( \dfrac{\lambda_H + \lambda_L}{\mu} \right)^{n}, & 0 \le n \le N_{th}, \\[2ex] P(0)\, \dfrac{1}{n!} \left( \dfrac{\lambda_H + \lambda_L}{\mu} \right)^{N_{th}} \left( \dfrac{\lambda_H}{\mu} \right)^{n-N_{th}}, & N_{th} < n \le N_{max}, \end{cases} \quad (12)$$


Fig. 5. Illustration of the overload probabilities P_O,H and P_O,L as functions of N_th, with λ_H/μ = 2, λ_L/μ = 1, and N_max = 6. With the initial value N_th = 2, the corresponding P_O,H and P_O,L are 0.015 and 0.710, respectively. In the beginning, P_O,L is much higher than the desired probability P_th,L = 0.2. Our resource management algorithm increases N_th by one at a time so as to decrease P_O,L. At equilibrium, we arrive at the adjusted value N_th = 5, resulting in P_O,H = 0.035 and P_O,L = 0.142.

with

$$P(0) = \left[\sum_{n=0}^{N_{th}} \frac{1}{n!}\left(\frac{\lambda_H+\lambda_L}{\mu}\right)^{n} + \sum_{n=N_{th}+1}^{N_{max}} \frac{1}{n!}\left(\frac{\lambda_H+\lambda_L}{\mu}\right)^{N_{th}}\left(\frac{\lambda_H}{\mu}\right)^{n-N_{th}}\right]^{-1}. \quad (13)$$

The overload probabilities for the objects of high and low priorities are P_O,H = P(N_max) and P_O,L = ∑_{n=N_th}^{N_max} P(n). These two probabilities are monotonically increasing and decreasing functions of the threshold N_th, respectively, as shown in Fig. 5. Suppose we have λ_H/μ = 2, λ_L/μ = 1, and N_max = 6. The initial N_th is initialized by (∑_{g=1}^{r} λ_g / ∑_{g=1}^{N_pr} λ_g) N_max = 2. The corresponding P_O,H and P_O,L are 0.015 and 0.710, respectively. P_O,L is much higher than the desired probability P_th,L = 0.2. Our resource management algorithm increases N_th by one at a time so as to decrease P_O,L. At equilibrium, we arrive at N_th = 5, resulting in P_O,H = 0.035 and P_O,L = 0.142. Fig. 5 also depicts this adjustment process.
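The numbers in this example can be reproduced directly from Eqs. (12) and (13), as in the sketch below, which raises N_th one step at a time until P_O,L falls below P_th,L = 0.2.

```python
# Sketch reproducing the example above via Eqs. (12) and (13):
# lambda_H/mu = 2, lambda_L/mu = 1, Nmax = 6, starting from Nth = 2.

def probs(a_h, a_l, n_th, n_max):
    """Normalized P(n) of Eq. (12); a_h = lambda_H/mu, a_l = lambda_L/mu."""
    p = [1.0]
    for n in range(1, n_max + 1):
        rate = (a_h + a_l) if n <= n_th else a_h   # low priority admitted
        p.append(p[-1] * rate / n)                 # only while n <= Nth
    z = sum(p)                                     # 1/P(0), Eq. (13)
    return [x / z for x in p]

def overloads(a_h, a_l, n_th, n_max=6):
    """P_O,H = P(Nmax) and P_O,L = sum_{n >= Nth} P(n)."""
    p = probs(a_h, a_l, n_th, n_max)
    return p[-1], sum(p[n_th:])

n_th = 2
while overloads(2, 1, n_th)[1] > 0.2 and n_th < 6:  # drive P_O,L below Pth,L
    n_th += 1                                       # one step at a time
po_h, po_l = overloads(2, 1, n_th)
print(n_th, f"{po_h:.3f}", f"{po_l:.3f}")           # 5 0.035 0.142
```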


Fig. 6. Floor plan of the experimental environment.

5. Experiment results

In this section, we study the individual and combined effects of the three components, M_S, M_D, and M_O, defined in the trackability measure. Afterwards, experiments are conducted to verify the effectiveness of our proposed camera handoff algorithm via video sequences generated by ourselves and dataset S7 in PETS 2006 [29]. Fig. 6 shows the floor plan of the experimental environment. Yao et al.'s camera placement algorithm [30] is used in our experiment to optimally preserve overlapped FOVs. Static perspective cameras with a resolution of 640 × 480 are placed along the walls at a height of 3 m with a tilt angle θ_T of 30°. Two priority levels are assigned to the objects, N_pr = 2. The maximum number of objects that can be tracked simultaneously is three for all cameras, N_max = 3 in our case. The thresholds T_O, T_D, and T_S are 0.2 to comply with the time needed for executing camera handoff (5 s on average) and the maximal moving speed of the objects (0.6 m/s). The surveillance system in our experiment includes behavioral understanding in addition to the multiple object tracking algorithm. The behavioral understanding part is necessary for assigning different priorities to tracked objects. As a result, the surveillance system illustrated in our experiment can only sustain at most three tracked objects without deteriorating the system's frame rate. In other words, if the system included only multi-object tracking, it could monitor 10 objects without deteriorating the frame rate. This observation also exemplifies the importance of resource management in a real-life scenario. Since the focus of this paper is not developing object tracking and consistent labeling algorithms, we use existing algorithms for multi-object tracking and consistent labeling. Image difference and homography-based approaches are implemented for object tracking and consistent labeling, respectively.

5.1. Experiments on trackability measure

Starting from the definition of the trackability measure, we first study the individual effects of M_S, M_D, and M_O based on a real-time tracking system, where camera 2, indicated in Fig. 6, is used in this experiment. According to the derivations introduced in (4) and (5), we notice that the components M_S and M_D mainly describe the variations along and orthogonal to the camera's optical axis, respectively. As expected, in Figs. 7 and 8, M_S increases as the target moves toward the camera along the optical axis and M_D increases as the target moves toward the image center. In Fig. 9, two targets walk

Fig. 7. The computed resolution component MS from frames acquired by a real-time tracking system as the object of interest moves toward the camera along the optical axis.


Fig. 8. The computed distance component MD from frames acquired by a real-time tracking system as the object of interest moves toward the image center.

diagonally across the camera's FOV in the same direction at different speeds. As a result, the relative distance between them decreases. This variation is indicated by a decreased MO, as shown in Fig. 9. Fig. 10 illustrates sampled frames at fn and fn+15 from real-time tracking sequence 1 with two static perspective cameras. The cameras' positions are specified in Fig. 6 as cameras 1 and 2. Table 1 lists MS,ij, MD,ij, and MO,ij for the ith object observed by the jth camera at frames fn and fn+15, where i ranges from 1 to 5 and j is either 1

Fig. 9. The computed occlusion component MO from frames acquired by a real-time tracking system. Two objects move across the camera’s FOV at different speeds, resulting in a decreased relative distance between them.

or 2. Fig. 11 illustrates the continuous trackability measures, MS,ij, MD,ij, and MO,ij, of objects 1, 2, 3, 4, and 5 from frame fn to fn+20 in real-time tracking sequence 1. In frame fn, object 4 is blocked by object 3 in camera 1, while object 1 is blocked by object 2 in camera 2. Both objects can be observed without occlusion in the other camera. Thus, objects 4 and 1 are transferred to cameras 2 and 1, respectively. Object 5 in camera 1 is close to objects 3 and 4. Its MO,51 is 0.18, less than TO = 0.2. A handoff request is, therefore,

Fig. 10. Illustration of the effectiveness of our proposed trackability measure in the camera handoff procedure at sampled frames fn and fn+15 in real-time tracking sequence 1.

Table 1. The illustration of MO,ij, MD,ij, and MS,ij shown in Fig. 10.

                         Object 1 (i = 1)   Object 2 (i = 2)   Object 3 (i = 3)   Object 4 (i = 4)   Object 5 (i = 5)
                         fn      fn+15      fn      fn+15      fn      fn+15      fn      fn+15      fn      fn+15
Camera 1 (j = 1)  MO,ij  0.31    0.15       0.41    0.15       0       0          0       0          0.18    0.25
                  MD,ij  0.6     0.5        0.5     0.4        0.45    0.5        0.43    0.4        0.38    0.14
                  MS,ij  0.43    0.41       0.41    0.42       0.43    0.42       0.3     0.3        0.45    0.6
Camera 2 (j = 2)  MO,ij  0       0.25       0       0          0.6     0.5        0.24    0          0.5     0.6
                  MD,ij  0.6     0.15       0.6     0.6        0.9     0.6        0.85    0.56       0.43    0.15
                  MS,ij  0.42    0.41       0.43    0.41       0.42    0.42       0.41    0.41       0.40    0.41


Fig. 11. Illustration of continuous trackability measures, MS,ij, MD,ij, and MO,ij, of objects 1, 2, 3, 4, and 5 from frame fn to fn+20 in real-time tracking sequence 1.

triggered for object 5. Meanwhile, camera 1 sends out a handoff request to its adjacent camera 2 and receives a positive response. As a result, object 5 in camera 1 is transferred to camera 2, as marked by a yellow rectangle. Similarly, in frame fn+15, object 5 in camera 2 is close to the edge of the camera's FOV, where its MD,52 is 0.15, less than TD = 0.2. It requires camera handoff. Camera 2 sends out the handoff request to its adjacent camera 1 and the request is granted, which is marked with a yellow rectangle in camera 1. In general, we can see that the trackability measure gives a quantified metric to direct the camera handoff successfully and smoothly before the tracked object is occluded or falls out of the FOV of the currently observing camera.

5.2. Experiments on adaptive resource management

In order to illustrate the importance of our proposed adaptive resource management in camera handoff, Fig. 12 illustrates sampled frames at fn and fn+15 from real-time tracking sequence 2 with three static perspective cameras. The cameras' positions are specified in Fig. 6 as cameras 3, 4, and 5. To illustrate the effectiveness of adaptive resource management, we focus on object 1. In frame fn, even though camera 5 can see object 1, it does not track the object. This is because camera 4 tracks object 1 first and does not send out a handoff request to adjacent cameras. In frame fn+15, object 1 is moving out of the FOV of camera 4, and camera 4 had sent out a handoff request to adjacent cameras 3 and 5 before frame fn+15. Since camera 3 has reached its maximum system load (P_O,13' = 0.9 and P_O,15' = 0.1) and MS, MD, and MO are not dominant factors in the camera selection process, camera 5 is the next best camera to track object 1. In general, our adaptive resource management is able to guide the camera handoff procedure to choose the camera with the least system load.

5.3. Experiments on overall performance

In order to examine the overall performance of our proposed camera handoff algorithm including the trackability measure and


Fig. 12. Illustration of the effectiveness of our proposed adaptive resource management in the camera handoff procedure at sampled frames fn and fn+15 in real-time tracking sequence 2.

adaptive resource management, the algorithm discussed in [5] is implemented and serves as the comparison reference. The reference algorithm simply triggers a handoff request whenever the object of interest is close to the edge of the camera's FOV, without regard to the system's load, the object priority, or the next best camera to track the object. Note that, since there are no existing works directly corresponding to ours to the best of our knowledge, we choose Khan and Shah's work as a representative algorithm to demonstrate the problems we face and then overcome in a real-life case. To accommodate Khan and Shah's work to our experiments, we make the following adjustments to their algorithm: (I) we trigger a handoff request when the object's distance to the edge of the camera's FOV (MD) is smaller than the predefined threshold TD, (II) we choose the next best camera merely by the largest MD among adjacent cameras, and (III) according to our experiments, an average of 10 frames is necessary for Khan and Shah's work to carry out a successful consistent labeling in a general situation. The failure of consistent labeling may occur when fewer than 10 frames are collected before the object moves out of the FOV of the currently observing camera. One solution to reduce the possibility of failure of consistent labeling is to increase the overlapped views among adjacent cameras. This leads to decreased overall coverage, thus requiring more cameras to cover the area, which may not be practical in many cases. Thus, optimizing the tradeoff between coverage and overlapped views [30] is used in this experiment. As a result, accumulating a sufficient number of frames before objects fall out of the FOV of the currently observing camera is necessary to avoid the failure of the handoff process. In our experiment, we first illustrate how frame rates fluctuate when the adaptive resource management scheme is not considered in the tracking system. The overall tracking rate, the ratio between the time of objects being tracked by the system and the total time of objects staying in the FOV of the system, is used to describe the

system's overall performance. To obtain a statistically valid estimation of the overall tracking rate, simulations are carried out to enable a large number of tests under various conditions. Several points of interest are generated randomly to form a pedestrian trace. The overall tracking rate is obtained from simulation results of 300 randomly generated traces. In order to understand the behavior of our proposed camera handoff algorithm facing varying arrival rates of the objects with low and high priorities, the ratio λL/λH is set to vary from 0.8 to 1.2. The expected probabilities of camera overload for objects with low and high priorities are Pth,L = 0.2 and Pth,H = 0.2. Note that once we lose track of an object due to failure of camera handoff, we do not recover it until the object moves to another adjacent camera. Fig. 13 compares the performance of our adaptive resource management method and the reference algorithm [5] with various λL/λH in terms of the handoff success rate. The notation Adaptive-0.8 denotes a system using our proposed resource management method with λL/λH = 0.8, and the notation KS-0.8 denotes the reference system [5] with λL/λH = 0.8. Fig. 13a illustrates that the system equipped with our adaptive resource management can keep a steady frame rate of 8 fps, while the frame rate of the system based on the reference algorithm varies between 3 fps and 8 fps. In addition, in Fig. 13b and c, regardless of λL/λH, the overall tracking rate of our adaptive approach is higher than that of the static approach. A considerable improvement in overall tracking rate, by 20%, is achieved in comparison with Khan and Shah's work. The observed inferior overall tracking rate of the reference method results from its fluctuating frame rate. When the frame rate is low, less information is acquired for the execution of consistent labeling, hence deteriorating the accuracy of identity matching and in turn the overall tracking rate. In other words, the continuity of objects being tracked in the system is compromised.
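As a side note, the overall tracking rate defined above reduces to a simple ratio of accumulated durations; the following is a minimal sketch with illustrative interval data, not measurements from the paper.

```python
# Sketch of the overall tracking rate defined above: the ratio of the
# total time objects are tracked to the total time they spend inside
# the system's FOV. The interval data below are illustrative only.

def overall_tracking_rate(objects):
    """objects: list of (seconds_tracked, seconds_in_fov) per object."""
    tracked = sum(t for t, _ in objects)
    present = sum(p for _, p in objects)
    return tracked / present

trace_stats = [(55.0, 60.0), (40.0, 70.0), (88.0, 90.0)]
print(f"{100 * overall_tracking_rate(trace_stats):.1f}%")  # 83.2%
```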


Fig. 13. Comparisons of camera handoff approaches with our proposed adaptive and Khan and Shah's static resource management methods with various λL/λH: (a) illustration of how frame rates fluctuate when the adaptive resource management scheme is not considered in the system, (b) handoff success rate for objects with high priority, and (c) handoff success rate for objects with low priority. In (a), (b), and (c), Adaptive and KS denote our proposed adaptive and Khan and Shah's static resource management methods, respectively.

Fig. 14. Illustration of the effectiveness of our proposed camera handoff procedure including trackability measure and adaptive resource management at sampled frames fn and fn+30 in real-time tracking sequence 3.

Fig. 14 illustrates sampled frames from fn to fn+30 from real-time tracking sequence 3 with three static perspective cameras. In this

sequence, since objects 1 and 2 are carrying valuable materials, reduced frame rates are not allowed for the sake of security. Thus, in


Table 2. The illustration of MO,ij, MD,ij, MS,ij, and PO,ij' shown in Fig. 14.

                          Object 1 (i = 1)             Object 2 (i = 2)             Object 3 (i = 3)
                          fn    fn+10  fn+20  fn+30    fn    fn+10  fn+20  fn+30    fn    fn+10  fn+20  fn+30
Camera 7 (j = 7)  MO,ij   0.31  0      –      –        0.31  0      –      0.7      –     –      –      –
                  MD,ij   0.16  0.85   –      –        0.35  0.84   –      0.39     –     –      –      –
                  MS,ij   0.5   0.49   –      –        0.48  0.48   –      0.8      –     –      –      –
                  PO,ij'  0.6   0.6    0.1    0.3      0.6   0.6    0.1    0.3      0.6   0.6    0.1    0.3
Camera 1 (j = 1)  MO,ij   –     0.23   0.25   0.3      –     0.23   0.28   0.29     –     –      –      –
                  MD,ij   –     0.3    0.32   0.34     –     0.2    0.15   0.1      –     –      –      –
                  MS,ij   –     0.7    0.7    0.71     –     0.7    0.65   0.6      –     –      –      –
                  PO,ij'  0.1   0.6    0.6    0.6      0.6   0.6    0.6    0.6      0.6   0.6    0.6    0.6
Camera 6 (j = 6)  MO,ij   –     –      –      –        –     –      –      –        0.99  0.89   0.69   0.59
                  MD,ij   –     –      –      –        –     –      –      –        0.8   0.8    0.8    0.8
                  MS,ij   –     –      –      –        –     –      –      –        0.75  0.75   0.75   0.75
                  PO,ij'  0.3   0.3    0.3    0.3      0.3   0.3    0.3    0.3      0.3   0.3    0.3    0.3

this experiment, objects 1 and 2 represent the high priority rank, and object 3 represents the low priority rank. The cameras' positions are specified in Fig. 6 as cameras 1, 6, and 7. Table 2 lists MS,ij, MD,ij, MO,ij, and PO,ij' for the ith object observed by the jth camera at frames fn to fn+30, where i ranges from 1 to 3 and j is either 1, 6, or 7. In frame fn, objects 1 and 2 are tracked by camera 7, and object 3 is tracked by camera 6. In frame fn+10, object 2 is occluded by object 1 in camera 7. However, our trackability measure triggered the camera handoff procedure before the occlusion happened. Even though object 1 can be seen by cameras 1 and 6 and presents similar MS,ij, MD,ij, and MO,ij in both cameras, camera 1 has the lowest computational load compared with camera 6 (P_O,11' = 0.1 and P_O,16' = 0.3). Thus, object 1 is transferred to camera

1. In frame fn+20, object 2 undergoes the camera handoff procedure since it is moving out of the FOV of camera 1 (MD,21 = 0.15). In frame fn+30, object 2 has been successfully handed over to camera 7. In general, we can see that the newly defined trackability measure gives a quantified metric to direct the camera handoff successfully and smoothly before the tracked object is occluded or falls out of the FOV of the currently observing camera. Also, our adaptive resource management is able to effectively guide camera handoff to choose the camera with the least system load. This can reduce the probability of missing critical events and improve the system's level of threat awareness. The maintained frame rate also stabilizes the performance of consistent labeling and leads to an improved handoff success rate.

Fig. 15. Illustration of the effectiveness of our proposed camera handoff procedure including trackability measure and adaptive resource management at sampled frames f1147, f1225, f1292, f1348, and f1414 in PETS’ 2006 dataset S7.


5.4. Experiment on PETS’ video sequence Fig. 15 illustrates sampled frames at f1147, f1225, f1292, f1348, and f1414 from PETS’ 2006 dataset S7 where it contains a single person with a suitcase who loiters before leaving the item of luggage unattended and four cameras are monitoring the scene. During this event other people move in close proximity to the item of luggage. Two priority levels are assigned to the objects, Npr = 2. The maximum number of objects that can be tracked simultaneously is also three for all cameras, Nmax = 3. The thresholds TO, TD, and TS are 0.2 to comply with the time needed for executing camera handoff (5 s average) and the maximal moving speed of the objects (0.6 m/s). In this sequence, since object 1 is leaving his luggage unattended in the scene, which may post a threat to the area, reduced frame rates are not allowed. To illustrate the effectiveness of our proposed handoff algorithm, we focus on object 1. In the beginning, object 1 is tracked by camera first. In frame f1292, because object 4 is going to occlude object 1 (MO,1A = 0.18), handoff request from camera A is sent out to adjacent cameras B, C, and D. Since camera C has the lowest system load (P O;1B0 ¼ 0:4; PO;1C 0 ¼ 0:1 and PO;1D0 ¼ 0:3), the resolution of object 1 in camera B is too low (MS,1B = 0.13), and object 1 has similar MS,ij, MD,ij, and MO,ij in both cameras C and D, object 1 is transferred to camera C. In general, we can see that our defined trackability measure gives a quantified metric to direct the camera handoff successfully and smoothly before the tracked object is occluded by other objects. Also, our adaptive resource management is able to effectively guide camera handoff to choose the camera with the least system load. This can reduce the probability of missing critical events and improve the system’s level of threat awareness. 6. Conclusion Most existing camera handoff algorithms leave two crucial unsolved problems: (I) no quantitative measure is given to guide the transitions between adjacent cameras and (II) it is difficult to maintain a constant frame rate given limited resources. These two problems lead to a deteriorated performance of consistent labeling and possible observation leaks. As a result, the surveillance system is unable to continuously track the object of interest and immediately detect threatening events in the monitored area. In this paper, we first defined a trackability measure based on resolution, distance to the edge of the camera’s FOV, and occlusion to quantitatively evaluate the effectiveness of object tracking. The trackability measure is used to determine when to trigger a handoff request and to select the optimal camera to which the object of interest is transferred. We also developed an adaptive resource management algorithm based on system’s current load to adaptively allocate the resources among multiple objects with different privileges. Experimental results illustrated that our handoff algorithm outperforms Khan and Shah’s method by keeping a higher overall tracking rate and a more stable frame rate. This improves the reliability of the tracking system for continuously tracking multiple objects across multiple cameras. Acknowledgment This work was supported in part by the University Research Program in Robotics under Grant DOE-DE-FG52-2004NA25589. References [1] M. Balcells, D. DeMenthon, D. Doermann, An appearance-based approach for consistent labeling of humans and objects in video, Pattern and Application (2005) 373–385. [2] D.G. 
6. Conclusion

Most existing camera handoff algorithms leave two crucial problems unsolved: (I) no quantitative measure is given to guide the transitions between adjacent cameras, and (II) it is difficult to maintain a constant frame rate given limited resources. These two problems lead to deteriorated consistent labeling performance and possible observation leaks. As a result, the surveillance system is unable to continuously track the object of interest and immediately detect threatening events in the monitored area. In this paper, we first defined a trackability measure based on resolution, distance to the edge of the camera's FOV, and occlusion to quantitatively evaluate the effectiveness of object tracking. The trackability measure is used to determine when to trigger a handoff request and to select the optimal camera to which the object of interest is transferred. We also developed an adaptive resource management algorithm that uses the system's current load to adaptively allocate resources among multiple objects with different privileges. Experimental results illustrated that our handoff algorithm outperforms Khan and Shah's method by maintaining a higher overall tracking rate and a more stable frame rate, improving the reliability of the tracking system for continuously tracking multiple objects across multiple cameras.

Acknowledgment

This work was supported in part by the University Research Program in Robotics under Grant DOE-DE-FG52-2004NA25589.

References

[1] M. Balcells, D. DeMenthon, D. Doermann, An appearance-based approach for consistent labeling of humans and objects in video, Pattern Analysis and Applications (2005) 373–385.
[2] D.G. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision 60 (2) (2004) 91–110.

[3] P. Kelly, A. Katkere, D. Kuramura, S. Moezzi, S. Chatterjee, R. Jain, An architecture for multiple perspective interactive video, in: Proceedings of ACM Multimedia 95, May 1995.
[4] J. Black, T. Ellis, Multiple camera image tracking, in: Proceedings of the Performance Evaluation of Tracking and Surveillance Conference (PETS 2001), with CVPR 2001, December 2001.
[5] S. Khan, M. Shah, Consistent labeling of tracked objects in multiple cameras with overlapping fields of view, IEEE Transactions on PAMI 25 (10) (2003) 1355–1361.
[6] F. Fleuret, J. Berclaz, R. Lengagne, P. Fua, Multicamera people tracking with a probabilistic occupancy map, IEEE Transactions on PAMI 30 (2) (2008) 267–273.
[7] L. Lee, R. Romano, G. Stein, Monitoring activities from multiple video streams: establishing a common coordinate frame, IEEE Transactions on PAMI 22 (8) (2000) 758–767.
[8] S. Calderara, A. Prati, R. Vezzani, R. Cucchiara, Consistent labeling for multi-camera object tracking, in: 13th International Conference on Image Analysis and Processing, September 2005.
[9] J. Kang, I. Cohen, G. Medioni, Continuous tracking within and across camera streams, in: IEEE International Conference on Computer Vision and Pattern Recognition, June 2003.
[10] G.C. De Silva, T. Yamasaki, T. Ishikawa, K. Aizawa, Video handover for retrieval in a ubiquitous environment using floor sensor data, in: IEEE International Conference on Multimedia and Expo, July 2005.
[11] C. Beleznai, B. Fruhstuck, H. Bischof, Multiple object tracking using local PCA, in: 18th International Conference on Pattern Recognition, June 2006.
[12] X. Luo, S.M. Bhandarkar, Multiple object tracking using elastic matching, in: IEEE Conference on Advanced Video and Signal Based Surveillance, September 2005.
[13] Y. Yao, B. Abidi, M. Abidi, Fusion of omnidirectional and PTZ cameras for accurate cooperative tracking, in: IEEE International Conference on Advanced Video and Signal Based Surveillance, Sydney, Australia, November 2006.
[14] M. Han, W. Xu, H. Tao, Y. Gong, An algorithm for multiple object trajectory tracking, in: IEEE Conference on Computer Vision and Pattern Recognition, 2004.
[15] I.O. Sebe, S. You, U. Neumann, Globally optimum multiple object tracking, in: SPIE Defense and Security Symposium, 2005.
[16] W. Hu, T. Tan, L. Wang, S. Maybank, A survey on visual surveillance of object motion and behaviors, IEEE Transactions on Systems, Man, and Cybernetics 34 (3) (2004) 334–352.
[17] M. Shah, Understanding human behavior from motion imagery, Machine Vision and Applications 14 (September) (2003) 210–214.
[18] Y. Yao, B. Abidi, M. Abidi, 3D target scale estimation for size preserving in PTZ video tracking, in: IEEE International Conference on Image Processing, Atlanta, GA, October 2006.
[19] L. Kleinrock, Queueing Systems, vol. 1: Theory, Wiley, New York, 1975.
[20] L. Huang, S. Kumar, C.-C. Jay Kuo, Adaptive resource allocation for multimedia QoS management in wireless networks, IEEE Transactions on Vehicular Technology 53 (2) (2004) 547–558.
[21] Y. Nagai, T. Kobayashi, Statistical characteristics of pedestrians' motion and effects on teletraffic of mobile communication networks, in: Second IFIP International Conference on Wireless and Optical Communications Networks, March 2005.
[22] O. Javed, Z. Rasheed, K. Shafique, M. Shah, Tracking across multiple cameras with disjoint views, in: IEEE International Conference on Computer Vision, October 2003.
[23] S. Guler, J.M. Griffith, I.A. Pushee, Tracking and handoff between multiple perspective camera views, in: IEEE Proceedings of the 32nd Applied Imagery Pattern Recognition Workshop (AIPR 03), USA, October 2003.
[24] F.L. Lim, W. Leoputra, T. Tan, Non-overlapping distributed tracking system utilizing particle filter, The Journal of VLSI Signal Processing 49 (3) (2007) 343–362.
[25] O. Javed, K. Shafique, M. Shah, Appearance modeling for tracking in multiple non-overlapping cameras, in: IEEE Conference on Computer Vision and Pattern Recognition, June 2005.
[26] J. Kang, I. Cohen, G. Medioni, Persistent objects tracking across multiple non-overlapping cameras, in: IEEE Workshop on Motion and Video Computing, 2005.
[27] D. Hoiem, A.N. Stein, A.A. Efros, M. Hebert, Recovering occlusion boundaries from a single image, in: IEEE International Conference on Computer Vision, October 2007.
[28] K.-C. Lien, C.-L. Huang, Multi-view-based cooperative tracking of multiple human objects in cluttered scenes, in: 18th International Conference on Pattern Recognition, June 2006.
[29] PETS: Performance Evaluation of Tracking and Surveillance.
[30] Y. Yao, C.-H. Chen, B. Abidi, D. Page, A. Koschan, M. Abidi, Sensor planning for automated and persistent object tracking with multiple cameras, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, June 2008.
[31] S. Calderara, R. Cucchiara, A. Prati, Bayesian-competitive consistent labeling for people surveillance, IEEE Transactions on PAMI 30 (2) (2008) 354–360.
[32] W. Hu, M. Hu, X. Zhou, T. Tan, J. Lou, S. Maybank, Principal axis-based correspondence between multiple cameras for people tracking, IEEE Transactions on PAMI 28 (4) (2006) 663–671.
[33] J.-Y. Choi, J.-W. Choi, Y.-K. Yang, Improved tracking of multiple vehicles using invariant feature-based matching, Pattern Recognition and Machine Intelligence 4815 (2007) 649–656.
[34] C.-H. Chen, Y. Yao, D. Page, B. Abidi, A. Koschan, M. Abidi, Video-based multi-camera automated surveillance of high value assets in nuclear facilities, Transactions of the American Nuclear Society, Washington, DC, November 2007.
[35] R. Panneerselvam, Research Methodology, Prentice Hall, 2004.
[36] K.S. Trivedi, Probability and Statistics with Reliability, Queuing, and Computer Science Applications, first ed., Wiley-Interscience, 2001.

Chung-Hao Chen received his B.S. and M.S., both in Computer Science and Information Engineering, from Fu-Jen University, Taiwan, in 1997 and 2001, respectively. He received his Ph.D. from the Department of Electrical Engineering and Computer Science at the University of Tennessee, Knoxville, in 2009. His research interests include object tracking, robotics, and image processing. He is currently an Assistant Professor in the Department of Mathematics and Computer Science at North Carolina Central University.

Yi Yao received her B.S. and M.S., both in Electrical Engineering, from Nanjing University of Aeronautics and Astronautics, China, in 1996 and 2000, respectively. She received her Ph.D. from the Department of Electrical Engineering and Computer Science at the University of Tennessee, Knoxville, in 2008. Her research interests include object tracking and multi-camera surveillance systems. She is currently with the Global Research Center, General Electric.

David Page received the B.S. and M.S. degrees in electrical engineering from Tennessee Technological University, Cookeville, in 1993 and 1995, respectively, and the Ph.D. degree in electrical engineering from the University of Tennessee (UT), Knoxville, in 2003. After graduation, he was a Civilian Research Engineer with the Naval Surface Warfare Center, Dahlgren, VA. From 2003 to 2008, he was a Research Assistant Professor with the Imaging, Robotics, and Intelligent Systems Laboratory, Department of Electrical and Computer Engineering, UT. He is currently a partner with Third Dimension Technologies LLC, a Knoxville-based startup. His research interests include 3-D scanning and modeling for computer vision applications, robotic vision systems, and 3-D shape analysis for object description.

Besma Abidi is a Research Assistant Professor with the Department of Electrical and Computer Engineering at the University of Tennessee, Knoxville, which she joined in 1998. She was a research scientist at the Oak Ridge National Laboratory from 1998 until 2001. From 1985 to 1988 she was an Assistant Professor at the National Engineering School of Tunis, Tunisia. Dr. Abidi obtained two M.S. degrees, in 1985 and 1986, in Image Processing and Remote Sensing, with honors, from the National Engineering School of Tunis. She received her Ph.D. from the University of Tennessee in 1995. Her general areas of research are sensor positioning and geometry, video tracking, sensor fusion, nano-vision, and biometrics. She is a senior member of IEEE and a member of SPIE, Tau Beta Pi, Eta Kappa Nu, Phi Kappa Phi, and the Order of the Engineer.

Andreas Koschan received his Diploma (M.S.) in Computer Science and his Dr.-Ing. (Ph.D.) in Computer Engineering from the Technical University Berlin, Germany, in 1985 and 1991, respectively. Currently he is a Research Associate Professor in the Department of Electrical and Computer Engineering at the University of Tennessee, Knoxville. His work focuses on color image processing and 3D computer vision, including stereo vision and laser range-finding techniques. He is a coauthor of two textbooks on 3D image processing and a member of IS&T and IEEE.

Mongi Abidi, Professor and Associate Department Head in the Department of Electrical and Computer Engineering, directs activities in the Imaging, Robotics, and Intelligent Systems Laboratory. He received his Ph.D. in Electrical Engineering from The University of Tennessee in 1987, his M.S. in Electrical Engineering from The University of Tennessee in 1985, and his Principal Engineer degree in Electrical Engineering from the National Engineering School of Tunis, Tunisia, in 1981. Dr. Abidi conducts research in the field of 3D imaging, specifically in the areas of scene building, scene description, and data visualization.