Contour-based object tracking in video scenes through optical flow and Gabor features




Optik 157 (2018) 787–797


Original research article

S. Kanagamalliga*, S. Vasuki

Velammal College of Engineering and Technology, ECE Department, Viraganoor, Madurai, 625009, India

Article history: Received 1 February 2017; Accepted 22 November 2017

Keywords: Video surveillance; Background subtraction; Motion estimation; Feature extraction; Object classification; Occlusion detection; Object tracking

Abstract

While many algorithms have been proposed for object tracking with demonstrated success, a crucial problem persists: improving performance on non-rigid object structures. This paper presents a new, efficient algorithm for motion estimation and object tracking in video scenes using an optical flow and Gabor features based contour model. Target motion detection is performed with an optical flow method that calculates the flow field according to the optical flow distribution characteristics. Once the flow field has been determined, it is used for motion analysis, and background subtraction based on the Expectation Maximization based Effective Gaussian Mixture Model (EM-EGMM) algorithm is performed to obtain the foreground pixels. With this method, complete motion, shape, and Gabor features are estimated. The extracted features are classified using an Adaboost classifier to effectively handle the region of interest. Contour-based object tracking is then carried out by locating the object region in every frame through the object model created from the previous frames. The object shapes are treated as boundary silhouettes, and the tracking results are updated dynamically in the video frames. Experimental outcomes validate that our proposed method runs faster and is more accurate than several state-of-the-art tracking methods. © 2017 Elsevier GmbH. All rights reserved.

1. Introduction

Video processing is a particular case of signal processing where the input and output of video surveillance systems are video formats or video streams [4]. The three major steps involved in video processing are detection, classification, and tracking. The most commonly used methods for detecting moving objects of interest are background subtraction and optical flow. The optical flow method captures the apparent change of a moving object between frames, determining the velocity and direction of each point in a video frame. Due to its higher detection accuracy, it is well suited to non-rigid object analysis. Through optical flow estimation, motion information of moving objects can also be obtained across video frames [1,19]. The Gaussian Mixture Model (GMM) has been widely used for non-rigid object recognition due to its broad applicability. However, the GMM cannot appropriately model noisy or non-stationary background modes. The dense displacement fields, or optical flows, between consecutive video frames are the natural tool to build dense point trajectories: any optical flow technique, from the classic Horn and Schunck estimator and its alternatives [10,11] to its most recent descendants [5], can readily be used to construct any number of point tracks over arbitrarily long video shots via numerical integration.

∗ Corresponding author. E-mail addresses: [email protected] (K. S.), [email protected] (V. S.). https://doi.org/10.1016/j.ijleo.2017.11.181 0030-4026/© 2017 Elsevier GmbH. All rights reserved.


Pattern classification methods have been shown to attain fruitful results in many areas of non-rigid object detection [6]. These methods can be decomposed into two significant components: feature extraction and classifier construction. In feature extraction, the dominant features are extracted from numerous training samples. These dominant features are used to train the classifier. During testing, the trained classifier scans the entire input to look for particular object patterns. The Support Vector Machine (SVM) classifier is broadly used for detection and recognition. Methods based on boosting [8] show impressive performance and attract much attention. Some existing approaches [12–14] have demonstrated impressive recognition results. However, it is unrealistic to expect flawless performance in non-rigid object recognition. Most non-rigid object recognition methods may miss a target or incorrectly classify a person in a stream of video scenes. Such inevitable errors cause any object recognition method to confuse object and background pixels, which adds further difficulty to the non-rigid object recognition problem. As reported in a recent experimental study [8], the Adaboost-based approach has the fastest detection speed and comparable accuracy without the time constraint.

Tracking involves estimating the trajectory of an object moving around in a video scene [2,3]. Several attempts in the literature use contour and segmentation methods for dynamic target tracking [15,16]. Despite their promising performance, traditional trackers face a practical problem: they use a rectangular bounding box or oval to approximate the tracked target, whereas non-rigid objects in practice may have complex shapes. Since the rectangular box used to represent the tracked target directly determines the samples extracted in the subsequent target appearance modeling step, it is a critical factor in tracking performance. Inaccurate target representation easily results in performance loss due to the pollution of non-object regions residing inside the rectangular box. Ideally, a better way to describe the target is to use the accurate silhouette along the target's surface.

In this paper, the proposed optical flow is used for motion estimation, with flow vectors obtained by combining the Lucas-Kanade and Horn-Schunck methods. The background subtraction method EM-EGMM is proposed to discard noise and fill holes in order to obtain the complete background region. Then, an Adaboost classifier with Gabor features is prescribed to guarantee both the accuracy and speed required for real-world applications. Finally, contour tracking has been chosen because silhouette-based methods give a precise shape sketch of the targets.

The remainder of this paper is structured as follows. In Section 2, the current state of the related work is reviewed. The proposed model is introduced in Section 3, and its generalized version is presented in Section 4. Section 5 presents extensive experiments conducted on a number of challenging video sequences. Section 6 concludes the paper.

2. Related work

2.1. Object detection

Frame differencing is a simple approach that thresholds the difference between two image frames; large changes are taken to be the foreground object. Another approach is to construct a representation of the background that is used for comparison against new images.
The pixel-wise median filter is a commonly used background modeling method, where the background is defined as the median at each pixel location. Rather than using the median value of a group of pixels, a more reasonable assumption is that each pixel value follows a Gaussian distribution over time, and the model is used to calculate the likelihood of foreground and background for a particular pixel. When a single Gaussian cannot satisfactorily account for the variance, a Mixture of Gaussians [9] is used to improve the accuracy of the assessment. Another technique [18] estimates the probability of observing the pixel intensity values based on a sample of the intensity values of each pixel. The model is designed to adapt quickly to changes in the video scene, thereby enabling responsive recognition of targets. An entirely different idea [17] exploits global information instead of local information: similar to eigenfaces, an eigenbackground is formed to capture the dominant variability of the background. In general, the Markov Random Field (MRF) model representation strikes a superior balance between algorithmic complexity and accuracy, where the idea is to establish the segmentation mask via a Maximum A-Posteriori (MAP) approximation.

2.2. Object classification

The performance of numerous object recognition approaches has been evaluated in [14]. Manifold feature-classifier combinations have been evaluated with respect to receiver operating characteristic performance and efficiency. Different features, including PCA, local receptive fields (LRF) features, and Haar wavelets, are used to train support vector machine (SVM) and neural network (NN) classifiers. In addition, the feature-based Adaboost classifier has the maximum accuracy under a rigid constraint.

2.3. Moving object tracking

A real-time tracker of non-rigid object shapes using the mean-shift technique has been proposed [7]. The mean-shift method iteratively seeks the most probable target position in the current frame. Analysis of the method shows that it relates to the Bayesian framework while providing a resourceful solution. The tracking algorithm thus tries to identify the area of a video frame that is locally most similar to a previously initialized representation, where the object region to be tracked is described by a histogram. Here the target is represented by a rectangular or elliptical region.


Fig. 1. Basic work flow of the proposed method.

Some objects have complex shapes, such as hands, fingers, and shoulders. These objects cannot be described by well-defined geometric shapes like boxes or blobs, as in the mean-shift tracking algorithm. Silhouette-based models [20] give an accurate shape sketch of the objects; the contour-based technique is therefore able to track a variety of non-rigid complex shapes. From the survey of existing methods, the challenging problems are:

• abrupt object motion
• non-rigid object structures
• object-to-object and object-to-scene occlusions
• viewpoint variation
• illumination

The proposed system focuses on overcoming the difficulties in object tracking caused by non-rigid object structures.

3. System overview

The ultimate aim of our approach is to discover the object present in the foreground region of a video frame using object tracking methods. The proposed system workflow is illustrated in Fig. 1 and consists of five major stages. In the following sections, each stage is described systematically.

3.1. Background subtraction

An adaptive background subtraction is performed to produce the initial masks for the moving persons. An EM-based EGMM method is proposed to improve the segmentation quality of the moving objects; EM-EGMM models the background from image sequences. Even though the foreground objects are extracted, it is important to extract the target accurately, since unwanted regions in the foreground of video scenes may cause improper tracking. Hence, to remove those unwanted portions, morphological operations are performed: opening and closing operators refine the foreground pixels, and binary mask images are obtained as a result.
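As a rough illustration of this stage only (OpenCV's built-in MOG2 Gaussian-mixture subtractor standing in for the paper's EM-EGMM model, with a hypothetical input file name), the following Python sketch performs GMM-based background subtraction followed by morphological opening and closing:

```python
import cv2

# MOG2 is an illustrative substitute for the paper's EM-EGMM background model.
subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=16,
                                                detectShadows=True)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

cap = cv2.VideoCapture("input_video.avi")  # hypothetical input file
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)                          # raw foreground mask
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # opening: remove specks
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # closing: fill holes
    cv2.imshow("binary foreground mask", mask)
    if cv2.waitKey(30) & 0xFF == 27:                        # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```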

3.2. Motion estimation

The masks obtained from background subtraction are given as input to an optical flow based segmentation algorithm that enforces spatial coherency. A new, efficient optical flow estimation technique is described, and its application in a motion-based analysis framework unifying non-rigid object recognition and tracking is outlined. Our proposed algorithm for motion detection represents the optical flow by grid-based flow vectors, which are calculated efficiently. The motion patterns of the tracked non-rigid targets are modeled by the positions, velocities, motion magnitudes, and movement directions of their flow vectors.
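A minimal sketch of grid-based flow vectors, using OpenCV's Farneback dense flow as an illustrative stand-in for the combined Lucas-Kanade/Horn-Schunck estimator detailed in Section 4.3:

```python
import cv2
import numpy as np

def grid_flow_vectors(prev_gray, gray, step=16):
    """Dense optical flow sampled on a regular grid (illustrative sketch)."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5,
                                        poly_sigma=1.2, flags=0)
    u, v = flow[..., 0], flow[..., 1]
    magnitude = np.sqrt(u ** 2 + v ** 2)   # per-pixel motion magnitude
    direction = np.arctan2(v, u)           # per-pixel motion direction (radians)
    # Sample the field on a coarse grid to obtain grid-based flow vectors.
    ys, xs = np.mgrid[step // 2:gray.shape[0]:step,
                      step // 2:gray.shape[1]:step]
    return xs, ys, u[ys, xs], v[ys, xs], magnitude, direction
```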


3.3. Region of interest (ROI) classification

Shape and optical flow features are extracted from the output of target detection, and Gabor features are extracted as well. These extracted features are then classified into object and background using the Adaboost classifier. The classifier thus works quickly while also giving accurate results.

3.4. Occlusion estimation

In the non-occlusion and occlusion cases, if the distances in both the x and y directions between several objects are smaller than a predefined threshold in the present frame, occlusion may occur in the next video frame; otherwise, the object descriptions remain separated. Based on this concept, the number of persons involved in an occlusion can be deduced.

3.5. Contour extraction

Adaboost, in combination with the extracted features, significantly outperforms the alternatives. From the feature-based results, an exact bounding silhouette is obtained to indicate the tracked target. The prescribed method extracts precise contours of the target as the tracking output, which achieves a better description of the non-rigid target objects while reducing background pollution of the target model.

4. Description of the algorithm

A video sequence containing non-rigid objects is taken as input. In the pre-processing step, the video datasets are converted into successive frames for processing.

4.1. EM based EGMM

The EM algorithm is started either by initializing the procedure with a set of initial parameters and performing an E-step, or by starting with a set of initial weights and performing an M-step. The EM algorithm for Gaussian mixtures is defined as follows; every iteration involves an E-step and an M-step.

1. E (Estimation)-step: For the current parameter values, the expected value of the latent variable is calculated, i.e., the membership weights $\omega_{ml}$ are computed for all data points $x_m$, $1 \le m \le N$, and all mixture components $1 \le l \le L$.

2. M (Maximization)-step: The model parameters are updated from the latent variables using the maximum likelihood method. The membership weights and the data are used to calculate new parameter values. Let

$$N_l = \sum_{m=1}^{N} \omega_{ml}.$$

This is the effective number of data points assigned to component $l$. The new parameters are computed using Eqs. (1)–(3), for $1 \le l \le L$:

$$\beta_l^{new} = \frac{N_l}{N} \tag{1}$$

These are the new mixture weights.

$$\mu_l^{new} = \frac{1}{N_l} \sum_{m=1}^{N} \omega_{ml}\, x_m \tag{2}$$

The updated mean is estimated in a manner similar to a standard empirical average, except that the mth data vector $x_m$ carries a fractional weight $\omega_{ml}$.

$$\Sigma_l^{new} = \frac{1}{N_l} \sum_{m=1}^{N} \omega_{ml}\, \left(x_m - \mu_l^{new}\right)\left(x_m - \mu_l^{new}\right)^t \tag{3}$$

Eq. (3) is the analogue of an empirical covariance matrix, except that the contribution of each data point is weighted by $\omega_{ml}$. In the M-step, the new $\beta_l$'s are computed first, then the new $\mu_l$'s, and finally the new $\Sigma_l$'s. After the new parameters are computed, the M-step is complete; the membership weights are then recomputed in the E-step, the parameters are recomputed in the following M-step, and in this way the parameters are updated iteratively. The EM-EGMM method is one of the simpler and faster-converging approaches to object detection. It is a model-based method that tries to maintain a balance between implementation complexity and accuracy.
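A compact NumPy sketch of the M-step updates of Eqs. (1)–(3), assuming the membership weights from the E-step are already available:

```python
import numpy as np

def m_step(X, W):
    """One M-step of EM for a Gaussian mixture (Eqs. 1-3).

    X : (N, d) array of data points x_m
    W : (N, L) array of membership weights omega_{ml} from the E-step
    """
    N, d = X.shape
    L = W.shape[1]
    Nl = W.sum(axis=0)                     # effective counts per component
    beta = Nl / N                          # Eq. (1): new mixture weights
    mu = (W.T @ X) / Nl[:, None]           # Eq. (2): weighted means
    sigma = np.empty((L, d, d))
    for l in range(L):
        D = X - mu[l]                      # data centered on the new mean
        sigma[l] = (W[:, l, None] * D).T @ D / Nl[l]  # Eq. (3): weighted covariance
    return beta, mu, sigma
```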


4.2. Morphological operation

Morphological techniques probe the output obtained from background subtraction with a pattern called a structuring element. The operation tests whether the element fits within the neighbourhood; zero-valued pixels of the structuring element are ignored. The desired shape information is obtained after the morphological operation is performed. The midpoint pixel of the structuring element, called the origin, identifies the pixel of interest being processed. In the structuring element, pixels with value 1 define the neighbourhood of the structuring element.

4.3. A combined local and global method

The optical flow algorithm describes the direction and rate of change of pixels between two consecutive images of the video. A two-dimensional velocity vector, carrying information on the direction and velocity of motion, is assigned to each pixel in a given place of the video frame. The image in the video is described by the 2-D dynamic brightness function of location and time, I(x, y, t), under the assumption that, in neighboring pixels, brightness and intensity do not change along the motion vector field. To estimate the optical flow between two images, the optical flow constraint equation (4) is solved:

$$I_a u + I_b v + I_t = 0 \tag{4}$$

The spatio-temporal brightness derivatives of the video frame are denoted $I_a$, $I_b$, and $I_t$; u is the horizontal optical flow and v is the vertical optical flow. To solve for u and v, the Lucas-Kanade and Horn-Schunck methods are used. To resolve the optical flow constraint equation for u and v, the LK technique splits the input frame into segments and assumes a constant velocity within each. Then, a weighted least-squares fit of the optical flow constraint equation to a constant model for $[u\ v]^T$ is executed by minimizing Eq. (5):

$$\sum_{x \in \Omega} W^2 \left[ I_a u + I_b v + I_t \right]^2 \tag{5}$$
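A sketch of the per-segment weighted least-squares solve of Eq. (5), assuming the derivative values inside one window are given as flattened arrays and using a uniform weight by default:

```python
import numpy as np

def lucas_kanade_segment(Ia, Ib, It, W=None):
    """Weighted least-squares solve of Eq. (5) within one image segment.

    Ia, Ib, It : flattened derivative values inside the window.
    Returns the constant flow [u, v] assumed for the segment.
    """
    if W is None:
        W = np.ones_like(Ia)               # uniform weighting
    A = np.stack([Ia, Ib], axis=1)         # n x 2 system matrix
    b = -It
    W2 = W ** 2
    # Normal equations of Eq. (5): (A^T W^2 A) [u v]^T = A^T W^2 b
    AtWA = A.T @ (W2[:, None] * A)
    AtWb = A.T @ (W2 * b)
    return np.linalg.solve(AtWA, AtWb)     # [u, v]
```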

By assuming that the optical flow is smooth over the entire image, the Horn-Schunck method determines an estimate of the velocity field $[u\ v]^T$ that minimizes Eq. (6):

$$E = \iint \left(I_a u + I_b v + I_t\right)^2 da\, db + \alpha \iint \left[ \left( \frac{\partial u}{\partial a} \right)^2 + \left( \frac{\partial u}{\partial b} \right)^2 + \left( \frac{\partial v}{\partial a} \right)^2 + \left( \frac{\partial v}{\partial b} \right)^2 \right] da\, db \tag{6}$$

where $\partial u/\partial a$ and $\partial u/\partial b$ are the spatial derivatives of the optical velocity component u (and similarly for v), and $\alpha$ scales the global smoothness term. The Horn-Schunck method minimizes Eq. (6) to obtain the velocity field $[u\ v]$ for each pixel in the image.
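A short NumPy sketch of the classical Horn-Schunck iteration that minimizes Eq. (6), in its standard textbook form and assuming the derivatives Ia, Ib, It have been precomputed:

```python
import numpy as np
from scipy.ndimage import convolve

def horn_schunck(Ia, Ib, It, alpha=1.0, n_iter=100):
    """Iteratively minimize the Horn-Schunck energy of Eq. (6)."""
    # Kernel approximating the local average of the flow field.
    avg = np.array([[1/12, 1/6, 1/12],
                    [1/6,  0.0, 1/6],
                    [1/12, 1/6, 1/12]])
    u = np.zeros(Ia.shape, dtype=float)
    v = np.zeros(Ia.shape, dtype=float)
    for _ in range(n_iter):
        u_bar = convolve(u, avg)
        v_bar = convolve(v, avg)
        # Update derived from the Euler-Lagrange equations of Eq. (6).
        common = (Ia * u_bar + Ib * v_bar + It) / (alpha ** 2 + Ia ** 2 + Ib ** 2)
        u = u_bar - Ia * common
        v = v_bar - Ib * common
    return u, v
```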

4.4. Adaboost classifier

The AdaBoost classifier is a blend of several classifiers, each of which focuses on the classification of one dimension of the data feature vector; for this reason, each classifier is known as a weak classifier. The classifier first trains the system by employing weak learners and producing solutions together with confidences for those solutions. The confidences are then estimated through the weights of the system, and finally the best values are selected as features. Error rate and accuracy are measured as the performance metrics of the process: accuracy represents the efficiency of the classification process, while the error rate reflects the number of pixels wrongly identified as objects.
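As an illustrative sketch only (scikit-learn's AdaBoost with its default decision-stump weak learners, and hypothetical precomputed feature files), the classification step could look like:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Hypothetical feature files: each row holds the concatenated shape,
# optical-flow, and Gabor features of one candidate region;
# labels are 1 = object, 0 = background.
X_train = np.load("train_features.npy")
y_train = np.load("train_labels.npy")

# By default AdaBoost boosts depth-1 decision trees ("stumps"), each of
# which splits on a single feature dimension -- the weak classifiers
# described above.
clf = AdaBoostClassifier(n_estimators=100)
clf.fit(X_train, y_train)

X_test = np.load("test_features.npy")
pred = clf.predict(X_test)             # 1 = object region, 0 = background
error_rate = np.mean(pred != np.load("test_labels.npy"))
accuracy = 1.0 - error_rate
```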


4.5. Contour tracking

Silhouette tracking methods iteratively evolve an initial silhouette from the previous frame to its new position in the current frame. The image is first smoothed with a Gaussian, which balances the trade-off between noise filtering and edge localization. The gradient magnitude is then calculated using approximations of the partial derivatives (2 × 2 filters). Edges are thinned by applying non-maxima suppression to the gradient magnitude, and edges are detected by double thresholding. The gradient can be computed as shown in Eqs. (8)–(10):

$$G_b = \begin{bmatrix} 1 & 1 \\ -1 & -1 \end{bmatrix}; \quad G_a = \begin{bmatrix} -1 & 1 \\ -1 & 1 \end{bmatrix} \tag{8}$$

$$M[i,j] = \sqrt{P[i,j]^2 + Q[i,j]^2} \tag{9}$$

$$\theta[i,j] = \tan^{-1}\left(Q[i,j], P[i,j]\right) \tag{10}$$

where M[i,j] is the magnitude of the gradient and $\theta$[i,j] is the orientation of the gradient.
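A sketch of Eqs. (8)–(10) in NumPy, with cv2.Canny and cv2.findContours standing in for the non-maxima suppression, double thresholding, and silhouette tracing steps (the thresholds are illustrative, not the authors' values):

```python
import cv2
import numpy as np
from scipy.ndimage import convolve, gaussian_filter

def gradient_magnitude_orientation(image, sigma=1.0):
    """Gaussian smoothing followed by the 2x2 gradient filters of Eq. (8)."""
    smoothed = gaussian_filter(image.astype(float), sigma)
    Ga = np.array([[-1., 1.], [-1., 1.]])   # Eq. (8): horizontal differences
    Gb = np.array([[1., 1.], [-1., -1.]])   # Eq. (8): vertical differences
    P = convolve(smoothed, Ga)
    Q = convolve(smoothed, Gb)
    M = np.sqrt(P ** 2 + Q ** 2)            # Eq. (9): gradient magnitude
    theta = np.arctan2(Q, P)                # Eq. (10): gradient orientation
    return M, theta

def extract_contours(gray):
    """Edge map via non-maxima suppression and double thresholding
    (bundled in cv2.Canny), then traced into object silhouettes."""
    edges = cv2.Canny(gray, 50, 150)        # illustrative thresholds
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return contours
```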


Table 1. Historical elaboration of the most important video datasets.

Dataset  | Source                           | Type of problem                                             | Complexity
Weizmann | Recorded videos (indoor/outdoor) | Unrealistic action analysis (simple and static background)  | Low
PETS     | Recorded videos (indoor/outdoor) | Realistic object analysis (complex and static background)   | Moderate
BEHAVE   | Videos from recording TV shows   | Interaction analysis (complex background)                   | Moderate
ViSOR    | Videos from different sources    | Repositories (complex object analysis)                      | High

Table 2. Video parameter information.

Video Dataset          | Frame Size (W × H) | Source Rate (fps) | Frame Count
PETS2006 S7-T6-B3      | 256 × 256          | 15                | 50
Weizman 'ira walk.avi' | 180 × 144          | 25.00             | 88
Weizman 'ira side.avi' | 180 × 144          | 25.00             | 64
Weizman 'eli run.avi'  | 180 × 144          | 25.00             | 49

The computation of foreground pixels by the hybrid method is described as follows: for each pixel k, calculate the difference R(k) = |Background frame(k) − Current frame(k)|; if R(k) > threshold, set F(k) = true, else F(k) = false; and continually update the background frame.

4.6. Performance measure

In the quantitative analysis, the performance of the process is measured by the parameters shown in Eqs. (11)–(16): F-Measure, False Alarm Rate (FAR), Accuracy (Acc), Jaccard similarity (J), Sensitivity (Se), and Specificity (Sp). Here TP, TN, FP, and FN denote the numbers of true positives (foreground pixels correctly classified as foreground), true negatives (background pixels correctly classified as background), false positives (background pixels wrongly classified as foreground), and false negatives (foreground pixels wrongly classified as background), respectively.

$$\text{F-Measure} = \frac{2 \cdot \frac{TP}{TP+FN} \cdot \frac{TP}{TP+FP}}{\frac{TP}{TP+FN} + \frac{TP}{TP+FP}} \tag{11}$$

$$FAR = \frac{FP}{FP+TN} \tag{12}$$

$$Acc = \frac{TP+TN}{TP+FN+TN+FP} \tag{13}$$

$$J = \frac{TP}{TP+FP+FN} \tag{14}$$

$$Se = \frac{TP}{TP+FN} \tag{15}$$

$$Sp = \frac{TN}{TN+FP} \tag{16}$$
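These measures follow directly from pixel-level confusion counts; a minimal sketch, assuming binary NumPy masks for the predicted and ground-truth foreground:

```python
import numpy as np

def pixel_metrics(pred, truth):
    """Eqs. (11)-(16) from binary foreground masks (boolean arrays)."""
    tp = np.sum(pred & truth)      # foreground correctly labeled foreground
    tn = np.sum(~pred & ~truth)    # background correctly labeled background
    fp = np.sum(pred & ~truth)     # background wrongly labeled foreground
    fn = np.sum(~pred & truth)     # foreground wrongly labeled background
    se = tp / (tp + fn)                                      # Eq. (15)
    precision = tp / (tp + fp)
    return {
        "F-Measure": 2 * se * precision / (se + precision),  # Eq. (11)
        "FAR": fp / (fp + tn),                               # Eq. (12)
        "Acc": (tp + tn) / (tp + fn + tn + fp),              # Eq. (13)
        "J": tp / (tp + fp + fn),                            # Eq. (14)
        "Se": se,
        "Sp": tn / (tn + fp),                                # Eq. (16)
    }
```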

5. Experimental results and analysis

The performance of the proposed method is assessed on a number of video data sequences. Table 1 shows the historical development of the most important video datasets. Information about the video frame rate, frame size, and total number of frames is given in Table 2. A video scene comprising non-rigid objects is taken as input and converted into frames. The EM-based EGMM algorithm offers a practical way to estimate the parameters of GMM models: it is an iterative model that can be used to obtain maximum likelihood estimates of the parameters from the data sets. Fig. 2 shows that, using the EM-EGMM method, the selected background appears black while the target object retains its original color. The shadow components also carry larger weights when occlusion exists. The EM-EGMM algorithm shows better stability, sensitivity, and quick convergence.

The moving object is detected using the optical flow method for estimating non-rigid object motion features. Robustness and dense optical flow fields are obtained by combining the local and global methods: the Lucas-Kanade method serves as the representative local method, while the Horn-Schunck approach is our representative global method. The flow vectors are obtained from the optical flow method. Local differential techniques are known to be robust under noise, while dense optical flow fields are produced by global techniques; combining local and global methods therefore gives better detection accuracy. Fig. 3 shows the flow vectors for both the Lucas-Kanade method and the proposed method on the input video sequences, with motion in both the forward and reverse directions. Local and global differential approaches have complementary shortcomings and advantages; hence, it is attractive to construct a hybrid technique that combines the best of both worlds. The result of the LK optical flow algorithm, and of the combination of the robustness of local approaches with the density of global methods, is shown in Fig. 3. The results of the object detection method, which combines background subtraction and optical flow to produce a background-eliminated foreground, are shown in Fig. 4.


Fig. 2. Assessment of the proposed method and a related method. First column: input video sequences; second column: background subtraction using the GMM method; third column: output results using our proposed method.

Table 3. Quantitative evaluations on different video datasets.

                | Video Dataset 1: PETS2006 S7-T6-B3 | Video Dataset 2: Weizman 'ira walk.avi'
Algorithm       | FAR    | J      | F-Measure         | FAR    | J      | F-Measure
Optical Flow    | 0.4588 | 0.4968 | 0.6638            | 0.5001 | 0.4430 | 0.6517
EGMM            | 0.4698 | 0.4968 | 0.6639            | 0.4659 | 0.4812 | 0.6519
Contour         | 0.4550 | 0.4969 | 0.6639            | 0.4575 | 0.4838 | 0.6525
Proposed method | 0.4662 | 0.4969 | 0.6639            | 0.4459 | 0.4844 | 0.6527

The prediction of the objects in the video is the estimation of the object contour based on shape, optical flow, and Gabor features. Fig. 5 shows the results of Gabor feature extraction. The Gabor features are distinctive for each region of the images, and hence recognition based on their similarities can be more accurate. Weak learners are assigned to the classification process; each learner finds the weighted-sum error value for the data, and these values are used to classify the data. The process is repeated by updating the classifier weights and re-applying the classification.

The output of the proposed system is compared with a related method in this research. Fig. 6 shows that individual objects can be classified and tracked by the proposed method, which produces more precise object silhouettes than existing techniques. The detected objects are then tracked by measuring the exact blobs identified. The most significant advantage of contour tracking is its flexibility in handling a wide variety of shapes.

Table 3 shows that the output of the proposed method is at least as good as that of the other existing methods, and that the proposed method identifies target objects more accurately. Table 4 shows the performance evaluation of the object tracking results for the different video datasets. The proposed technique is also contrasted with the current state-of-the-art method for human identification, which uses shape detection, optical flow, and feature-based classification, as shown in Fig. 7. The accuracy of the proposed method is higher for human tracking in video input because it provides a background model that works well for relatively static backgrounds. ACC is used in the results to highlight correct classification; FAR, J, and F-Measure, reported in Table 3, are used for the experimental analysis.


Fig. 3. Assessment of the proposed method and a related method (motion in forward and reverse directions). First and second columns: flow vectors using the Lucas-Kanade method; third and fourth columns: flow vectors using our proposed method.

Fig. 4. The results of moving object detection, from left to right: PETS2006 S7-T6-B3, Weizman 'ira walk.avi', Weizman 'ira side.avi', and Weizman 'eli run.avi'.


Fig. 5. The results of Gabor feature extraction: (a) PETS2006 S7-T6-B3, (b) Weizman 'ira walk.avi', (c) Weizman 'ira side.avi', (d) Weizman 'eli run.avi'.

Table 4. Performance comparison of object tracking results for different video datasets.

Video Dataset          | Sensitivity | Specificity | Error Rate | Overall Time (sec)
PETS2006 S7-T6-B3      | 0.9773      | 0.9434      | 0.067114   | 134.000624
Weizman 'ira walk.avi' | 1           | 0.9009      | 0.067114   | 141.176999
Weizman 'ira side.avi' | 1           | 0.9009      | 0.067114   | 140.626774
Weizman 'eli run.avi'  | 1           | 0.9009      | 0.040268   | 137.016705

FAR is used for the average results to highlight misclassification; lower FAR together with higher ACC, J, and F-Measure represents better results. The comparison between the different tracking methods in Table 3 demonstrates that the proposed strategy performs best. Features extracted from the ROI give more information, accuracy, and clarity than features extracted from the raw input. Moreover, output obtained directly from the input suffers from pitfalls such as occlusion, illumination, and background clutter. These pitfalls are minimized in the proposed method by the background subtraction performed during extraction, which also increases the accuracy of the output by eliminating unwanted features in the background. The proposed method is simpler than the existing method because it combines two simple algorithms in a straightforward way, and it provides shape- and motion-based information, which is preferable to the existing method. Fig. 7 depicts the tracking accuracy obtained for traditional algorithms such as optical flow (OF), the Effective Gaussian Mixture Model (EGMM), and contour tracking, together with the proposed algorithm described in Section 4. It is observed that the proposed algorithm outperforms the traditional methods.

6. Conclusion and future work

In this paper, an algorithm has been presented to show how background subtraction, optical flow, and feature extraction can together be applied to target tracking in video sequences. Contours extracted by our method better characterize the non-rigid objects present in the video frames. The EM algorithm is used for parameter estimation of the EGMM. The EGMM is utilized in the setting of a complex environment, while optical flow is utilized for fast estimation with a simple background.


Fig. 6. ROI contour extraction. Top row: Adaboost-based ROI-classified video frames; bottom row: object contour tracking results of the proposed method. (a) PETS2006 S7-T6-B3, (b) Weizman 'ira walk.avi', (c) Weizman 'ira side.avi', (d) Weizman 'eli run.avi'.

Fig. 7. Comparison of traditional algorithms with the proposed method for tracking accuracy.

This trade-off motivates us to present and assess a novel technique that combines the advantages of the background subtraction and optical flow approaches. Object recognition using shape, optical flow vectors, and Gabor features feeds these descriptors into the tracking system, and the Adaboost classifier makes decisions regarding the presence of an object. Experiments performed on various sequences demonstrate excellent results. The combination of detection and feature-based approaches facilitates improved contour tracking of non-rigid object structures. The results validate that the proposed methodology is indeed superior to some existing detection and tracking algorithms, especially for video sequences. Object recognition and tracking are described in order to distinguish human beings from other objects. The future scope of this work is a system that could alert the authorities if a pedestrian displays suspicious behavior, such as entering a secured zone, walking or running unevenly, or wandering or walking against the traffic flow. The work can be extended to other non-rigid objects, such as animals and birds.

References

[1] Pushpendra Kumar, Sanjeev Kumar, Balasubramanian Raman, A fractional order variational model for the robust estimation of optical flow from image sequences, Opt.-Int. J. Light. Electron Opt. 127 (20) (2016) 8710–8727.


[2] Xiaohui Luo, Fuqing Wang, Mingli Luo, Collaborative target tracking in lopor with multi-camera, Opt.-Int. J. Light. Electron Opt. 127 (23) (2016) 11588–11598. [3] Mohd Zulkifley, Asyraf, Robust single object tracker based on kernelled patch of a fixed RGB camera, Opt.-Int. J. Light. Electron Opt. 127 (3) (2016) 1100–1110. [4] Hyungtae Kim, Jaehoon Jung, Joonki Paik, Fisheye lens camera based surveillance system for wide field of view monitoring, Opt.-Int. J. Light. Electron Opt. 127 (14) (2016) 5636–5646. [5] Sandeep Singh Sengar, Susanta Mukhopadhyay, Moving object area detection using normalized self adaptive optical flow, Opt.-Int. J. Light. Electron Opt. 127 (16) (2016) 6258–6267. [6] Jagdish Raheja, Swati Deora Lal, Ankit Chaudhary, Cross border intruder detection in hilly terrain in dark environment, Opt.-Int. J. Light. Electron Opt. 127 (2) (2016) 535–538. [7] Haichao Zheng, et al., Adaptive edge-based mean shift for drastic change gray target tracking, Opt.-Int. J. Light. Electron Opt. 126 (23) (2015) 3859–3867. [8] H. Wang, Y. Cai, Monocular based road vehicle detection with feature fusion and cascaded adaboost algorithm, Opt.-Int. J. Light. Electron Opt. 126 (22) (2015) 3329–3334. [9] X. Chen, C. Xi, J. Cao, Research on moving object detection based on improved mixture gaussian model, Opt.-Int. J. Light. Electron Opt. 126 (20) (2015) 2256–2259. [10] J. Lan, J. Li, G. Hu, B. Ran, L. Wang, Vehicle speed measurement based on gray constraint optical flow algorithm, Opt.-Int. J. Light. Electron Opt. 125 (1) (2014) 289–295. [11] Y. Xin, J. Hou, L. Dong, L. Ding, A self-adaptive optical flow method for the moving object detection in the video sequences, Opt.-Int. J. Light. Electron Opt. 125 (19) (2014) 5690–5694. [12] Jianfang Dou, Jianxun Li, Moving object detection based on improved VIBE and graph cut optimization, Opt.-Int. J. Light. Electron Opt. 124 (23) (2013) 6081–6088. [13] Qing Ye, Rentao Gu, Yuefeng Ji, Human detection based on motion object extraction and head–shoulder feature, Opt.-Int. J. Light. Electron Opt. 124 (19) (2013) 3880–3885. [14] Jorge L. Flores, et al., Edge linking and image segmentation by combining optical and digital methods, Opt.-Int. J. Light. Electron Opt. 124 (18) (2013) 3260–3264. [15] J. Han, Real-time multiple people tracking for automatic group-behavior evaluation in delivery simulation training, Multimed. Tools Appl. 51 (3) (2011) 913–933. [16] O. Ozturk, T. Matsunami, Y. Suzuki, T. Yamasaki, K. Aizawa, Real-time tracking of humans and visualization of their future footsteps in public indoor environments, Multimed. Tools Appl. 59 (1) (2012) 65–88. [17] T. Marciniak, A. Chmielewska, R. Weychan, M. Parzych, A. Dabrowski, Influence of low resolution of images on reliability of face detection and recognition, Multimed. Tools Appl. 74 (12) (2015) 4329–4349. [18] K. Kopaczewski, M. Szczodrak, A. Czyzewski, H. Krawczyk, A method for counting people attending large public events, Multimed. Tools Appl. 74 (12) (2015) 4289–4301. [19] K. Aihara, T. Aoki, Motion dense sampling and component clustering for action recognition, Multimed. Tools Appl. 74 (16) (2015) 6303–6321. [20] R. Goldenberg, R. Kimmel, E. Rivlin, M. Rudzsky, Fast geodesic active contours, IEEE Trans. Image Process. 10 (10) (2001) 1467–1475.

S. Kanagamalliga is an Assistant Professor, with the Department of Electronics and Communication Engineering, Velammal College of Engineering and Technology, Madurai, India. She received her B.E degree in Electronics and Communication Engineering from Kamaraj College of Engineering and Technology, Virudhunagar in 2008 and her M.E degree in Communication Systems from Mepco Schlenk Engineering College at Sivakasi in 2010. She is the author of more than 30 Journal/Conference papers. Her current research interests include image, video and audio signal processing.

S. Vasuki received her B.E degree from Government College of Technology, Coimbatore, her M.E degree from A.C. College of Engineering and Technology, Karaikudi, Tamil Nadu, India, and her Ph.D degree in Color Image Processing from Anna University, Chennai. She is working as Professor and Head of the Electronics and Communication Engineering Department, Velammal College of Engineering and Technology, Madurai, Tamil Nadu, India. She has published over 115 technical papers in international journals and international/national conferences. She is a life member of the Indian Society for Technical Education, a Fellow of the Institution of Electronics and Telecommunication Engineers, and a member of the Institute of Electrical and Electronics Engineers.