Available online at www.sciencedirect.com
ScienceDirect — Mathematics and Computers in Simulation
www.elsevier.com/locate/matcom

Original article

Multi-vehicle detection algorithm through combining Harr and HOG features

Yun Wei a, Qing Tian b, Jianhua Guo c, Wei Huang c,∗, Jinde Cao d

a Beijing Urban Construction Design and Development Group Co., Ltd, Beijing 100000, China
b North University of Technology, Beijing 100000, China
c Intelligent Transportation System Research Center, Southeast University, Nanjing 210096, China
d School of Mathematics, and Research Center for Complex Systems and Network Sciences, Southeast University, Nanjing 210096, China

Received 15 August 2017; received in revised form 28 December 2017; accepted 30 December 2017

Abstract

In order to achieve better detection and tracking of multi-vehicle targets in complex urban environments, we propose a two-step detection algorithm that combines Harr features and Histogram of Oriented Gradients (HOG) features. The algorithm exploits the complementary strengths of the two features: the front-view region of interest (ROI) is first extracted with Harr features and a cascade-structured AdaBoost classifier, and the HOG features extracted from the ROI target areas are then classified with a support vector machine (SVM) to extract the precise targets. Experimental results on video collected from real-world scenarios show that the proposed method achieves higher detection accuracy and time efficiency than conventional methods, and can successfully detect and track multi-vehicle targets in complex urban environments.

© 2018 International Association for Mathematics and Computers in Simulation (IMACS). Published by Elsevier B.V. All rights reserved.

Keywords: Harr features; HOG features; Vehicle detection; Environment perception; Computer vision

1. Introduction

With the rapid development of urbanization and motorization, the number of vehicles is increasing rapidly worldwide, leading to serious issues such as an increased number of traffic accidents, energy waste, environmental pollution, and economic losses [13]. A transportation system is an integrated system in which vehicles operate within their surrounding environment; therefore, vehicle environment perception plays an important role in regulating vehicle operations by capturing and applying environmental information. In fact, vehicle environment perception is based primarily on computer vision technology, which can provide warning

∗ Corresponding author.

E-mail address: [email protected] (W. Huang).
https://doi.org/10.1016/j.matcom.2017.12.011
0378-4754/© 2018 International Association for Mathematics and Computers in Simulation (IMACS). Published by Elsevier B.V. All rights reserved.

Please cite this article in press as: Y. Wei, et al., Multi-vehicle detection algorithm through combining Harr and HOG features, Mathematics and Computers in Simulation (2018), https://doi.org/10.1016/j.matcom.2017.12.011.

and auxiliary driving instructions to drivers, such as vehicle safety warnings, anti-collision warnings, autopilot, and tracking assistance, by acquiring real-time information on the surrounding environment. Considering the importance of vehicle operations in transportation systems, vehicle environment perception has become a central issue internationally for improving the efficiency and safety of vehicle operations [15,25].

Moving target detection based on computer vision is one of the foundations of vehicle environment perception. In transportation systems, the moving objects usually refer to the moving vehicles or pedestrians present in the driving environment; by contrast, non-moving objects such as road infrastructure are usually referred to as background. It is therefore very important to separate the moving objects from the background in real time, primarily by analyzing the video stream collected from the surrounding environment, so that further analysis and processing can generate information for assisting vehicle operations. In this regard, a variety of techniques have been developed for visual detection of moving objects [14,31], especially for single moving object detection in simple driving scenes [32]. In the real world, however, vehicles generally operate in complex environments. First, many types of driving scenes may be encountered during the driving task, e.g., urban or highway environments. Second, for a vehicle running in traffic, multiple objects must be detected and identified to enhance the understanding of the driving scene. Therefore, the study of multiple object detection and identification is needed for complex driving environment perception. The main purpose of this paper is to propose a multiple object detection technique for complex driving environments.
Noting that existing works have concentrated on single features for driving scene perception, this paper exploits a combined application of Harr and HOG features to enhance the efficiency and accuracy of multiple object detection. The organization of this paper is as follows. First, a literature review of the related investigations is conducted. The proposed approach is then described in the third section. The fourth section examines the performance of the proposed approach, including comparisons. Finally, the paper concludes with summaries and discussions.

2. Literature review

Vehicle detection is an important topic in the computer vision field and has been widely investigated due to its potential applications in vehicle safety and auxiliary driving. A variety of vehicle detection algorithms have been proposed from different perspectives, based on different features and/or different classifiers [22,28]. Currently, a large number of studies address single object detection, which has achieved desirable results in simple driving scenes. Among typical single object detection algorithms, Kate et al. proposed a vehicle detection algorithm based on feature information [32]: shadow detection, entropy analysis, and level symmetry measurement were employed to detect vehicles without prior knowledge of road geometry. Matthews introduced a method using PCA to extract features and then SVM or neural network classifiers to detect vehicles [20]. Hoffmann et al. employed shadow and symmetry features to detect front-view vehicles; the basic idea is to use the shadow formed at the bottom of the vehicle to determine the possible positions and rough width, and then to use the symmetry of vehicles to separate the target vehicles [11]. Khammari et al.
introduced a three-tier Gaussian pyramid Sobel filter to obtain the local gradient maxima [16] and used a temporal filter to remove non-vehicle pixels; they then extracted the bounding box of the candidate region to conduct a symmetry test. Other methods include vehicle detection through combined use of road marking features [17,38]. These algorithms were designed for single target detection. Since there is no mutual interference among objects in this case, the single object detection problem is relatively simple. Moreover, the extracted features are usually a priori vehicle features that are not easy to extract. In other studies, features based on regional information were widely applied. For example, Sun proposed methods using a Gabor filter to extract rectangle features, or combining Harr wavelet features and Gabor features with an SVM classifier, for vehicle detection [29,30]; the optical flow method has also been widely used for vehicle detection [2,3]. However, these methods generally cannot be applied in real-time systems due to their low computational efficiency.

Multi-vehicle detection has also received great attention from the research community. Shen proposed a multi-vehicle detection algorithm based on vehicle symmetry and wavelet maxima [27]: a candidate area is found using the shadow at the bottom of the vehicle, gray-level symmetry is calculated, and the final vehicle region is determined using wavelet modulus maxima. This method has a high detection rate because of the acquired prior knowledge of the vehicles; however, a priori vehicle characteristics


such as the bottom shadow may be difficult to differentiate from the background when the driving scene is complex. In addition, for multiple object detection, other interfering targets may share similar a priori features, reducing the detection accuracy in such cases. Leissi proposed a vehicle detection algorithm in which offline learning is conducted on a large number of images obtained from different (dynamic or static) cameras to impart good generalization ability to the classifier [18]. Chong employed a multi-step method for vehicle detection [5]: the bottom shadow is extracted to calibrate the region of interest (ROI) of the vehicle, and mean models of the energy and edges of the region are created. The method was tested on typical routes in Hong Kong with desirable results; however, the shadow is also difficult to extract in complex environments. Considering vehicle wavelet features and a coarse-to-fine search strategy, a vehicle detection algorithm was proposed by Schneiderman [26]; this method needs long computing time and its detection accuracy is not high. To overcome these drawbacks, more satisfactory results have been obtained by Chang et al. on the basis of Harr-like features [4], although the long computing time makes the method inappropriate for real-time systems. In summary, the above methods propose different target detection models from different application perspectives, and each model is greatly affected by the driving scene as well as the number of targets to be detected. Compared with the success achieved for single object detection in simple driving scenes, further studies are needed for multiple object detection in complex driving environments with adequate detection accuracy and efficiency.
3. Proposed multiple object detection algorithm

Considering the complexity of the urban environment for vehicle detection, it is necessary to obtain a good balance between detection accuracy and efficiency. This paper proposes a two-step vehicle detection algorithm using combined Harr and HOG features [6,21,37]. The algorithm ensures better operational efficiency while remaining adaptive to complex environments. The two-step target retrieval follows a coarse-to-fine detection strategy and contains two main steps: initial segmentation of front-view targets to narrow down the region of interest (ROI), and precise extraction of vehicle targets in the ROI. The detailed procedure is given in the following (see Fig. 1).

3.1. Initial segmentation of front-view targets

The initial segmentation of front-view targets is conducted on the strength of the Harr features; the purpose of this step is to find a reduced ROI for each potential object. Harr features are feature descriptors commonly used in computer vision. Because they describe the edges and lines of an object well, they are usually applied to detect object outlines. The extended Harr wavelet, first proposed by Lienhart [19], includes three types of features — edge features, line features, and center–surround features — as shown in Fig. 2. Since vehicle frame extraction is the emphasis of vehicle detection in this study, and the vehicle frame has clear edge and linear structure features, this paper chooses eight basic feature types with good edge and linear descriptions, i.e., a, b, c, d of the edge features and a, c, e, g of the line features. In addition, the integral image calculation proposed by Viola et al. is used for fast computation [34]. The number of extracted features is 38 677. Based on the selected features, a reduced ROI can be obtained to improve real-time calculation performance.
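The coarse-to-fine strategy above can be sketched as the composition of two stages. This is a minimal illustration, not the authors' implementation: `harr_stage` and `hog_stage` are hypothetical stand-ins for the Harr+AdaBoost ROI proposer and the HOG+SVM verifier described in Sections 3.1 and 3.2.

```python
def detect_vehicles(frame, harr_stage, hog_stage):
    """Coarse-to-fine detection: the Harr stage proposes ROIs cheaply,
    and the HOG stage keeps only the ROIs it confirms as vehicles."""
    rois = harr_stage(frame)                               # step 1: reduced ROIs
    return [roi for roi in rois if hog_stage(frame, roi)]  # step 2: precise check
```

For example, with a proposer returning two candidate windows and a verifier accepting only one of them, `detect_vehicles` returns the single confirmed window.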
To balance the detection rate and false positive rate, this study aims for a higher detection rate in this first detection period, so as to maximize the possibility of including all potential objects. The detailed steps are given below.

Step 1: Read Image. Select an image L(x, y) from the video sequence.

Step 2: Initialization. The purpose of this step is to determine the templates and model parameters, primarily selecting the Harr feature prototypes and setting the system parameters. The aforementioned eight basic feature types (i.e., the edge features a, b, c, and d, and the line features a, c, e, and g) are defined. The sample size is 24 × 24, the step is 10, the scale factor is 1.2, and the number of classifier levels is n = 3, . . . , 12.

Step 3: Image Preprocessing. The algorithm operates on the gray image for ease of calculation. Meanwhile, in order to reduce the influence of light, this paper applies Gamma filtering to the gray image, which effectively reduces the variations of shadows and lighting in the image [10].
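A minimal pure-Python sketch of the Step 3 gamma filtering, assuming a gray image stored as a list of rows with intensities in [0, 255]; the paper does not give the gamma value, so γ = 0.5 here is an illustrative assumption:

```python
def gamma_correct(gray, gamma=0.5):
    """Power-law (gamma) compression of intensities to damp shadow and
    lighting variation; gamma < 1 brightens dark regions."""
    lut = [round(255.0 * (i / 255.0) ** gamma) for i in range(256)]  # lookup table
    return [[lut[p] for p in row] for row in gray]
```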




Fig. 1. The proposed two-step detection algorithm flowchart.


Fig. 2. Prototypes of extended Haar-like features.

Step 4: Multi-scale Scaling. Targets differ greatly in position and size in the images, and the same target also presents different sizes from different perspectives. As a result, images need to be traversed at multiple scales. Without loss of generality, the scale factor is set to 1.2. To achieve multi-scale target detection, images at different resolutions are scanned and scaled gradually to the real image size.

Step 5: Integral Image Calculation. Harr feature calculation is performed at the pixel level and must be repeated at multiple scales, so direct feature extraction is extremely expensive. With an integral image, calculating an image feature reduces to retrieving four corner coordinates. For this purpose, according to Eq. (1), each point in the image is scanned and the integral image is calculated, which makes the calculation of Harr features more efficient:

L_G(x, y) = ∑_{x′ ≤ x} ∑_{y′ ≤ y} L(x′, y′),  (1)

where L(x, y) is the original image and L_G(x, y) is the integral image.

Step 6: Harr Features Extraction. The specific feature extraction is obtained by learning from the positive and negative samples; the detailed learning process is shown in Fig. 3. First, the integral image is obtained from the original image. Second, the integral image is intensively scanned using the eight selected Harr feature prototypes at different scales (from 2 × 2 to 24 × 24), forming rectangle feature vectors of different scales and positions. Then, after each scan, a classification threshold q is obtained if the positive and negative samples can be separated; otherwise, the scan results are discarded. Discriminative features in the target images are usually sparse: most non-discriminative feature regions are discarded after scanning, leaving a small number of useful features (often reduced from several hundred thousand to a few dozen). These features constitute a vector group over scale, position, and threshold, defined as T = (l, x, y, q). The rectangle feature vector group can be used directly to classify images without repeatedly comparing against the training samples; this is why feature training is slow while target detection is fast.

Step 7: Front-view Object Segmentation (Harr feature detection). Front-view object segmentation takes advantage of an AdaBoost classifier with a cascade structure [7,8]. The target segmentation process is a two-level iterative algorithm that conducts the overall classification by adjusting the test parameters in each level. Before the iteration, the classification parameters are set in advance: the maximum false positive rate and the minimum detection rate in each level are set to f and d, respectively; n_k denotes the number of weak classifier layers in level k; and F_k and D_k stand for the total false positive rate and total detection rate of the first k levels.
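The integral image of Eq. (1) and the four-corner retrieval of Step 5 can be sketched as follows; this is a hedged pure-Python illustration (images as lists of rows), not the authors' code:

```python
def integral_image(img):
    """L_G(x, y) = sum of L(x', y') over x' <= x, y' <= y  (Eq. (1))."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii

def rect_sum(ii, x0, y0, x1, y1):
    """Sum over the rectangle [x0..x1] x [y0..y1] from four corner look-ups,
    so any Harr rectangle feature costs O(1) per rectangle."""
    s = ii[y1][x1]
    if x0 > 0:
        s -= ii[y1][x0 - 1]
    if y0 > 0:
        s -= ii[y0 - 1][x1]
    if x0 > 0 and y0 > 0:
        s += ii[y0 - 1][x0 - 1]
    return s
```

An edge-type Harr feature is then simply the difference of two `rect_sum` calls over adjacent rectangles.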







Fig. 3. Process of Harr features extraction.

Fig. 4. Block segmentation and feature vector of HOG features extraction. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

The overall target rate is set to F = d^n, with F_0 = 1.0, D_0 = 1.0, K_0 = 0. In the first-level iteration, the N-layer strong cascade classifier calculation stops once F_k ⩽ d^n; otherwise, the second-level iterative calculation is conducted. In the second-level iteration, the strong classifier of each level is determined and computed: while the current F_k is larger than f times the total false positive rate F_{k−1}, weak classifiers are added; n_k, the number of weak classifiers in the level, is thus determined and F_k is updated. Only misclassified negative samples are retained and sent to the next-level classifier. The iterations stop when all the conditions are met.

Step 8: Merge the detected targets and output the front-view mask figure (front-view object segmentation figure).

3.2. Precise extraction of vehicle target

The purpose of this step is to find the precise objects by using HOG features to analyze the images with the reduced ROI. Histogram of Oriented Gradients (HOG) features describe objects well and have been widely used to extract precise targets; however, their dense, overlapping features require intensive image scanning and heavy computation, resulting in low detection efficiency [1]. In this study, HOG features together with the integral image are used to detect vehicles [23]. For this purpose, the horizontal and vertical gradients of the image are calculated, and the image is divided into equal-size cells, with adjacent cells combined into larger blocks. Overlapping blocks then slide over the image to build the HOG features of the detection window. This paper chooses a simple first-order template to calculate the gradient and employs R-HOG to extract features. The specific feature extraction process is shown in Fig. 4, where (a) is the original sample image, (b) shows the cell division (blue) and sliding block (red), and (c) is the HOG feature vector of the block in the upper left corner.
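Returning to Step 7, what makes the cascade efficient at run time is early rejection: a window must pass every level, so most negatives are discarded by the first, cheap levels. A minimal sketch, in which `levels` is a hypothetical list of (score function, threshold) pairs standing in for the trained strong classifiers:

```python
def cascade_accept(window, levels):
    """Evaluate a detection window through the cascade; reject at the first
    level whose strong-classifier score falls below its threshold."""
    for score, threshold in levels:
        if score(window) < threshold:
            return False  # rejected early: later, more expensive levels never run
    return True           # survived all levels: kept as a front-view candidate
```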
Therefore, after obtaining the reduced ROI for potential objects by means of Harr features, the subsequent HOG feature extraction can be carried out on the basis of these ROI, significantly saving the computing time needed for extracting


accurate vehicle targets. The vehicle detection process involves intensive scanning of HOG features in the target ROI, and a linear SVM (Support Vector Machine) classifier is used for classification. The steps are as follows.

Step 1: Read Image. Precise target extraction is the second step of the algorithm; the input image is the front-view mask image K(x, y) segmented by the Harr stage.

Step 2: Initialization. The aim of initialization is to set the minimum detection window, the sizes of the cell and block, the sliding step, and the scaling factor. Based on previous experience, the specific settings are: detection window width 64, height 64, blockw = 16, blockh = 16, cellw = 8, cellh = 8, step = 8, and scaling factor 1.2.

Step 3: Calculation of the Gradient Chart. HOG feature extraction is carried out on the gradient chart. First, the horizontal gradient dx and vertical gradient dy of the front-view image are calculated. The gradient magnitude and direction are obtained according to Eqs. (2) and (3); the gradient image then has the same size as the original image.

A(x, y) = √[(L(x + 1, y) − L(x − 1, y))² + (L(x, y + 1) − L(x, y − 1))²],  (2)

θ(x, y) = arctan[(L(x, y + 1) − L(x, y − 1)) / (L(x + 1, y) − L(x − 1, y))].  (3)
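Eqs. (2) and (3) at a single interior pixel can be sketched as below (pure Python, list-of-rows image); `math.atan2` is used instead of the bare arctan of Eq. (3) to avoid division by zero, a small deviation from the formula as printed:

```python
import math

def gradient(img, x, y):
    """Central-difference gradient magnitude A(x, y) and direction
    theta(x, y) at an interior pixel, following Eqs. (2)-(3)."""
    dx = img[y][x + 1] - img[y][x - 1]  # horizontal difference
    dy = img[y + 1][x] - img[y - 1][x]  # vertical difference
    magnitude = math.sqrt(dx * dx + dy * dy)
    direction = math.atan2(dy, dx)
    return magnitude, direction
```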

Step 4: Multi-scale Scaling. Targets in the image differ greatly in position and size, and the same target presents different sizes from different perspectives. As before, the scale factor in this period is 1.2; the minimum 64 × 64 window is gradually scaled to detect targets at multiple scales. The integral image calculation is similar to the process above, with the difference that the integral image of HOG features has nine bins: in accordance with 20° intervals and symmetry, the phase angle over the 180° range is quantized into nine direction bins and the amplitudes are accumulated, generating nine integral images of the same size as the original image.

Step 5: Calculate HOG Feature Vectors.
(a) Slide the detection window and perform sliding detection on the input image;
(b) Slide the block within the detection window, take the gradient amplitude of each pixel as the weight, and count the gradient direction histogram of each cell by retrieving the integral images, forming a group of 9-dimensional HOG_C feature vectors;
(c) Form a 36-dimensional HOG_B vector for each block by concatenating the gradient direction histograms of its four cells;
(d) Normalize each vector with the following equation:

V∗ = V / √(‖V‖₂² + ε²),  (4)

where V is the original vector, V∗ is the normalized vector, and ε is a small constant.
(e) Cascade and combine all the HOG_B blocks in the detection window, forming the HOG feature vector of the detected target. The number of blocks is calculated by the following formula:

n = ((Width − Blockw)/Step + 1) ∗ ((Height − Blockh)/Step + 1).  (5)

Step 6: Classifier Selection and Target Classification. Object extraction using HOG features is a two-class classification problem, for which the SVM classifier, a method developed from two-class linearly separable problems, is a good choice [9,12].
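Eqs. (4) and (5) can be sketched as below; the Step 2 defaults are filled in as keyword arguments, and the ε value is an illustrative assumption:

```python
import math

def normalize_block(v, eps=1e-3):
    """L2 normalization of a block vector HOG_B, Eq. (4)."""
    norm = math.sqrt(sum(x * x for x in v) + eps * eps)
    return [x / norm for x in v]

def blocks_per_window(width=64, height=64, blockw=16, blockh=16, step=8):
    """Number of sliding blocks in one detection window, Eq. (5)."""
    return ((width - blockw) // step + 1) * ((height - blockh) // step + 1)
```

With the Step 2 settings this gives 7 × 7 = 49 blocks per window, i.e. a 49 × 36 = 1764-dimensional window descriptor.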
The core of SVM is to find a hyperplane, g(x) = w^T x + w_0, meeting the classification requirements: it maximizes the margin between the classification boundaries of the different classes while ensuring classification accuracy. Compared with existing statistical learning methods, it focuses on minimizing the structural risk in the learning process. It has good generalization ability in the small-sample case and great advantages in solving small-sample, nonlinear, and high-dimensional pattern recognition problems [33]. This paper uses a linear SVM classifier for target classification.

Step 7: Merge the detected targets and output the final frame of detected targets.
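At detection time, the linear SVM's decision reduces to one dot product per window. A minimal sketch, where the weight vector `w`, bias `w0`, and threshold are hypothetical stand-ins for trained values:

```python
def svm_decide(x, w, w0, thresh=0.0):
    """Linear SVM decision g(x) = w^T x + w0; the window is labeled a
    vehicle when the score meets the chosen threshold."""
    g = sum(wi * xi for wi, xi in zip(w, x)) + w0
    return g >= thresh
```

Shifting `thresh` biases the hyperplane toward positive or negative decisions, which is exactly the sweep carried out in Section 4.4.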




Fig. 5. Positive samples in sample library.

Fig. 6. Negative samples in sample library.

4. Empirical study and analysis

4.1. Sample database creation

The feature extraction algorithm is an integrated learning process that requires a large number of positive and negative samples (positive samples are images of different vehicle targets; negative samples are complex background images without vehicle targets). To better reflect real-world requirements, videos were captured on multiple roads in Beijing to form the positive samples, which include 9443 pictures of different situations covering different roads, lighting conditions, and vehicle models. A total of 128 837 negative samples from the internationally used INRIA database are also selected. Altogether there are 4000 training positive samples, 5443 testing positive samples, 50 000 training negative samples, and 78 837 testing negative samples. The sample sizes differ between the two periods: 24 × 24 in the Harr feature training period and 64 × 64 in the HOG feature training period. Some positive and negative samples from the sample library are shown in Figs. 5 and 6.

4.2. Experiment and performance measures

In order to verify the effectiveness of the proposed algorithm, this paper conducts experiments on the above testing sample library, containing 5443 positive samples and 78 837 negative samples. By integrated learning


Fig. 7. Vehicle detection ROC curve from the 3rd level to the 12th level.

on a large sample set, the number of cascade AdaBoost classifier levels (n), the threshold of the HOG classifier (v), and other parameters can be determined. Several methods are compared and analyzed. The first uses a single Harr feature with a cascade AdaBoost classifier; the test results at different cascade layers are shown in Fig. 7. The second uses a single HOG feature with an SVM classifier; the experimental results for different thresholds are shown in Fig. 8. The third uses the combined features introduced in this paper; the results for different cascade structure layers and different SVM classifier thresholds are shown in Fig. 9. The experimental platform uses the Intel OpenCV computer vision library, with the following hardware configuration: Intel Core i7-2600 CPU 3.4 GHz, 12 GB RAM. Meanwhile, in order to evaluate the effectiveness of the different methods, after reviewing the evaluation indexes of target detection systems in the related literature, this paper defines several performance indicators: detection rate t_p, false positive rate f_p, non-detection rate n_p, detection time t, and image size m × m, with the specific definitions as follows [35]:

t_p = (Number of vehicles detected correctly) / (Number of vehicles detected correctly + Number of non-vehicles detected incorrectly),  (6)

f_p = (Number of vehicles detected incorrectly) / (Number of vehicles detected incorrectly + Number of non-vehicles detected correctly),  (7)

n_p = (Number of vehicles not detected) / (Number of vehicles detected correctly + Number of non-vehicles detected incorrectly).  (8)
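Reading the counts in Eqs. (6)–(8) in the usual confusion-matrix sense (so that t_p + n_p = 1, as the figures reported in Sections 4.3–4.5 imply) gives the following sketch; the TP/FP/FN/TN interpretation is ours, since the printed denominators are ambiguous:

```python
def detection_metrics(tp, fp, fn, tn):
    """Detection rate t_p, false positive rate f_p, and non-detection rate
    n_p from true/false positive and negative counts (Eqs. (6)-(8))."""
    t_p = tp / (tp + fn)   # share of vehicles found
    f_p = fp / (fp + tn)   # share of non-vehicles wrongly reported
    n_p = fn / (tp + fn)   # share of vehicles missed; t_p + n_p = 1
    return t_p, f_p, n_p
```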

4.3. Results of Harr and AdaBoost

The proposed method uses a cascade AdaBoost classifier for front-view target segmentation. The specific parameters are set as follows: the maximum false positive rate in each level is f = 0.004, the minimum detection rate is d = 0.99, and the total detection rate is F = 0.99^n. The final false positive rate is adjusted by choosing the level n = 3, 4, . . . , 12. The ROC (Receiver Operating Characteristic) curve of the system is plotted with the false positive rate on the horizontal axis and the missing detection rate on the vertical axis, as shown in Fig. 7. As the number of detection layers changes, the false positive rate and missing detection rate vary greatly. When the missing detection rate reaches 0.4% (i.e., t_p = 99.6%), the false positive rate f_p reaches 56.8%. When the false positive rate f_p is reduced to 2.8%, the missing detection rate increases significantly, to 27.1%.
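The cascade's overall rates follow multiplicatively from the per-level targets above: with per-level minimum detection rate d and per-level maximum false positive rate f, an n-level cascade attains at best roughly D = d^n and F = f^n. A small illustration with the paper's parameters:

```python
def cascade_rates(d=0.99, f=0.004, n=3):
    """Overall detection rate d**n and false positive rate f**n for an
    n-level cascade with per-level rates d and f."""
    return d ** n, f ** n
```

For n = 3 this gives an overall detection rate near 0.99³ ≈ 0.970; raising n drives both rates down, which is the trade-off visible in Fig. 7.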




Fig. 8. ROC curve of HOG+SVM vehicle detection when threshold varies from −1.5 to 3.

The detection rate and the false positive rate of the algorithm are therefore in conflict. A detection system based on the Harr+AdaBoost technique alone can ensure a high detection rate, but the accompanying large number of false positives prevents reliable vehicle detection, so a balance must be struck between the two. This paper places the higher requirement on the detection rate in the front-view extraction period and treats the false positive rate as a reference indicator. When the number of levels is low, the false positive rate is high, which also results in longer overall operating time.

4.4. Results of HOG+SVM

The SVM classifier threshold determines the bias of the hyperplane toward positive or negative samples, and thus plays a decisive role in detection accuracy. This paper adopts an enumeration method, selecting ten values from −1.5 to 3 at intervals of 0.5 for experimental analysis [24,36]. Using HOG features and SVM classifiers on the 5443 positive testing samples and 78 837 negative testing samples, the experimental results for the different thresholds are plotted in Fig. 8. As can be seen from Fig. 8, the detection accuracy is greatly affected by the threshold. When the threshold is −1.5, the detection rate is 98.6%; when the threshold is 3, the detection rate is only 70%. The false positive rate of this algorithm is low (basically less than 4%) because of the descriptive ability of HOG features. Taking the experimental data as an example, when the maximum false positive rate is about 3.4%, the overall detection effect is better than that of the Harr features, showing that HOG features are strongly robust. The drawback of the algorithm is its poor time efficiency, so improving the detection efficiency is important.
Results of improved method (Harr+HOG) Single Harr feature has high vehicle detection efficiencies, and higher detection accuracy can be obtained by adjusting levels. Accordingly, the false positive rate is higher, (The false positive rate reaches 56.8% in level 3.) Conversely, though the vehicle detection rate of HOG features is poor, it has a low false positive. So this paper proposes a layered target detection algorithm based on multi-features. By adjusting the cascade classifier levels under different thresholds, ROC curve of improved method is shown in Fig. 9, where the horizontal axis represents the ultimate false positive rate of the vehicle detection system and the vertical axis represents the final detection rate of Please cite this article in press as: Y. Wei, et al., Multi-vehicle detection algorithm through combining Harr and HOG features, Mathematics and Computers in Simulation (2018), https://doi.org/10.1016/j.matcom.2017.12.011.
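The threshold enumeration behind Fig. 8 can be sketched as follows. This is an illustration, not the authors' code: the decision scores below are made-up stand-ins for real SVM outputs on HOG features, and only the sweep structure (ten thresholds from −1.5 to 3 in steps of 0.5, each yielding one detection-rate/false-positive-rate point) matches the experiment described above.

```python
def roc_points(pos_scores, neg_scores, thresholds):
    """For each threshold t, a sample is classified as a vehicle
    when its SVM decision score exceeds t."""
    points = []
    for t in thresholds:
        tp_rate = sum(s > t for s in pos_scores) / len(pos_scores)
        fp_rate = sum(s > t for s in neg_scores) / len(neg_scores)
        points.append((t, tp_rate, fp_rate))
    return points

# Hypothetical scores: positives cluster above zero, negatives below.
pos = [2.8, 2.1, 1.5, 0.9, 0.4, -0.2, 3.2, 1.1, 0.7, 2.5]
neg = [-2.5, -1.9, -1.2, -0.8, -0.3, 0.2, -2.1, -1.5, -0.6, -3.0]
thresholds = [-1.5 + 0.5 * i for i in range(10)]  # -1.5, -1.0, ..., 3.0

for t, tp, fp in roc_points(pos, neg, thresholds):
    print(f"threshold {t:+.1f}: detection rate {tp:.0%}, false positive rate {fp:.0%}")
```

Raising the threshold can only remove detections, so both rates are non-increasing in the threshold, which is the trade-off visible in Fig. 8.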


Fig. 9. ROC curves for different levels and thresholds.

The detection rate is higher and the system performance is better under the same false positive rate. As can be seen from Fig. 9, as the parameters are adjusted and the threshold changes from −1.5 to 3, the detection performance of the system ranges from t_p = 98.5%, f_p = 3.1% at level 3 to t_p = 69.2%, f_p = 0.01 at level 12. Due to the influence of HOG features, although the overall false positive rates vary widely, they remain within an acceptable range. Parameter selection for this method should focus on the detection rate and treat the false positive rate as a reference indicator. Overall, the detection performance of the proposed algorithm is as expected: the detection rate reaches 95.4% even at the worst classifier threshold. In the low false positive region of Fig. 9, the detection performance improves as the number of levels increases; in the high detection region, however, as the false positive rate increases the curves intersect and the detection rate is restricted to a certain degree.

4.6. Computation efficiency of the improved algorithm

To analyze the performance of the algorithm from multiple perspectives, experiments on time efficiency are also conducted. The test objects are 200 randomly selected high-definition video images, and the size of the images is 1280 ∗ 720 pixels.
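The two-step structure whose timing is evaluated in this section can be sketched as follows. This is illustrative pseudocode-style Python, not the authors' implementation: both stage functions are hypothetical stand-ins for a trained Harr+AdaBoost cascade and a trained HOG+SVM classifier, and windows are reduced to plain dictionaries.

```python
def harr_cascade_score(window):
    # Stand-in for the cascade stage: a cheap, Harr-like intensity score.
    return sum(window["pixels"]) / len(window["pixels"])

def hog_svm_score(window):
    # Stand-in for HOG feature extraction plus the SVM decision value.
    return window["gradient_energy"] - 1.0

def detect(windows, cascade_threshold=0.5, svm_threshold=-1.5):
    # Stage 1: cheap Harr+AdaBoost cascade extracts candidate ROIs.
    rois = [w for w in windows if harr_cascade_score(w) > cascade_threshold]
    # Stage 2 (HOG+SVM) runs only on the surviving ROIs, which is where
    # the time saving over single-HOG detection comes from.
    return [w for w in rois if hog_svm_score(w) > svm_threshold]

windows = [
    {"id": "car",  "pixels": [0.9, 0.8, 0.7], "gradient_energy": 2.0},
    {"id": "sky",  "pixels": [0.1, 0.2, 0.1], "gradient_energy": 0.1},
    {"id": "sign", "pixels": [0.9, 0.9, 0.8], "gradient_energy": -1.5},
]
print([w["id"] for w in detect(windows)])  # → ['car']
```

In this toy run the "sign" window passes the coarse stage 1 but is rejected by stage 2, mirroring how the HOG+SVM stage suppresses the cascade's false positives.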




Fig. 10. Comparison of detection operating times of the system.

Table 1
Operating time of the proposed algorithm at each cascade level, and the performance increase over the single-HOG detector (226 ms/frame).

Level                                   3      4      5      6      7      8      9      10     11     12
Operating time (ms/frame)               246    213    171    160    137    111    110    108    101    96
Performance increase over single-HOG    −9%    6%     24%    29%    39%    51%    51%    52%    55%    57%

Operating times at the different levels are shown in Table 1 and Fig. 10. The analysis reveals that the detection time of the proposed system depends only on the cascade level and is independent of the SVM classifier threshold. It can be seen from Fig. 10 that the detection time is 226 ms when single-HOG features are employed, while the operating time of the proposed algorithm at level 3 is 246 ms, i.e., longer than that of the traditional single-HOG feature. The reason is that the proposed algorithm requires one more integral-image computation than single-HOG feature extraction; moreover, at low levels the algorithm produces more false positives (e.g., the false positive rate is 56.8% at level 3), and these extra false positives require additional HOG feature extraction time. Together, these effects make the extraction time longer than the time for front-view segmentation. As the number of levels increases, the detection efficiency of the algorithm improves significantly: starting from the fourth level of the classifier, the time saving over single-HOG grows from 6% at level 4 to 57% at level 12. Taking both time efficiency and detection accuracy into consideration, the AdaBoost classifier is set to seven levels and the SVM classifier threshold to −1.5, giving the overall system performance indicators t_p = 97.96%, f_p = 1.33%, n_p = 2.04%. The ROC curves of the three methods using the AdaBoost classifier with seven levels are shown in Fig. 11. From Fig. 11, one can find that although the detection efficiency of the algorithm based on Harr+AdaBoost is high, its detection accuracy is worse than that of the method based on HOG+SVM. This also verifies that Harr features are relatively simple for vehicle detection and are suitable for coarse extraction of the target front view.
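The "performance increase" entries and the frame-rate figure quoted for the chosen operating point follow directly from the operating times. A quick check, assuming the 226 ms/frame single-HOG baseline reported for Fig. 10 (small differences from the table's percentages are rounding):

```python
SINGLE_HOG_MS = 226  # ms/frame for the traditional single-HOG detector

# Operating times per cascade level, from Table 1.
times_ms = {3: 246, 4: 213, 5: 171, 6: 160, 7: 137,
            8: 111, 9: 110, 10: 108, 11: 101, 12: 96}

for level, t in times_ms.items():
    saving = (SINGLE_HOG_MS - t) / SINGLE_HOG_MS
    print(f"level {level:2d}: {t:3d} ms/frame, {saving:+.0%} vs single-HOG")

# Level 3 comes out negative (about -9%): the extra integral-image pass and
# the many stage-2 HOG evaluations caused by its 56.8% false positive rate
# make it slower than single-HOG, as discussed above.

# The chosen operating point (seven levels) runs at 137 ms/frame:
print(f"{1000 / times_ms[7]:.1f} frames/s")  # ~7.3, i.e. nearly 8 frames/s
```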
The proposed method is not as good as the other two methods in terms of detection efficiency, but the ROC curves of the system show that its overall performance is excellent, and its time efficiency is 39% better than that of HOG+SVM.

4.7. Comparison and effect of results

This paper uses videos captured on typical roads in Beijing, including express roads, urban roads, tunnels, curves, and ramps, under different lighting and weather conditions, recorded with a high-definition video camera at a resolution of 1280 ∗ 720, which is rarely used in the literature.




Fig. 11. Comparison of the ROC curves of the three methods.

Table 2
Performance comparison of the proposed method with classical algorithms.

Algorithm                                 Detection rate t_p    False positive rate f_p    Operating time (ms/frame)    Image size
Gabor+SVM [38]                            94.81%                2%                         –                            –
GW (Gabor+Wavelet)+SVM [17]               96.97%                2.16%                      –                            –
Symmetry+Wavelet [27]                     97.72%                7.19%                      66.2                         320 ∗ 240
Vehicle shadow+ROI entropy [5]            94.1%                 –                          50                           –
Wavelet+Coarse-to-fine search [26]        86%                   4.7%                       300                          320 ∗ 240
Harr+online boosting [4]                  96%                   8%                         200                          320 ∗ 240
Proposed method                           97.96%                1.33%                      137                          1280 ∗ 720
The resolution in most of the literature is 320 ∗ 240, which greatly simplifies the calculation. The hardware configuration of the testing platform is an Intel Core i7-2600 CPU at 3.4 GHz with 12 GB RAM. The platform is based on OpenCV and adopts multi-threading acceleration. To verify the effectiveness of the method, its detection performance is compared with classical algorithms from the literature, and the result is shown in Table 2. It can be seen that the algorithm is more efficient than the other traditional algorithms and also shows a low false positive rate. In addition, in terms of operating time, the experiments in this paper are performed on high-definition images, and the detection time of 137 ms per frame is shorter than the times reported for SD video in [26] and [4], showing the advantages of this paper in vehicle detection accuracy and efficiency in complex scenes. The results indicate that this method can detect multiple vehicles well in both complex and simple scenes and has a good ability to adapt to real environments. The comprehensive detection accuracy is 97.96%, and the detection efficiency is nearly 8 frames/s. Note that the proposed method works well even for images of increased size.

4.8. Field detection performance of the improved method (Harr+HOG)

Fig. 12 shows the results of continuous multi-frame detection on Lianshi Road in Beijing. This scene contains complex background interference and camera vibration, together with the appearance, disappearance, and occlusion of vehicles. In the fifth frame, the front vehicle in the middle lane is occluded; it is detected again in the 105th frame, and the vehicle in the right lane is also detected. While the front vehicle is occluded by the vehicle in the right lane, the detection of this vehicle fails. Due to interfering targets (such as traffic signs), false positives can be produced, such as the false detections of signs. These false positives exist only in one frame or a few consecutive frames, and would be eliminated when the detection system is combined with a tracking algorithm.
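The tracking-based suppression mentioned above can be sketched as a simple temporal-consistency filter. This is an illustration, not the authors' tracker: a detection is kept only if a sufficiently overlapping box also appeared in the previous frame, so a false positive that lives for a single frame (e.g. a traffic sign picked up once) is discarded; boxes are hypothetical (x, y, w, h) tuples and the IoU threshold is an assumed value.

```python
def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def confirmed(current, previous, min_iou=0.5):
    """Keep only detections that persist across consecutive frames."""
    return [box for box in current
            if any(iou(box, prev) >= min_iou for prev in previous)]

prev_frame = [(100, 200, 80, 60)]                     # a tracked vehicle
curr_frame = [(104, 202, 80, 60), (400, 50, 40, 40)]  # vehicle + one-frame sign
print(confirmed(curr_frame, prev_frame))  # → [(104, 202, 80, 60)]
```

Requiring persistence over several frames rather than one, or combining the filter with a motion model, would make the suppression more robust at the cost of a short confirmation delay.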




Fig. 12. Results of multi-lane road vehicle detection on Lianshi road in Beijing.

Figs. 13 and 14 show the results of vehicle detection in two special scenes. One is the curve condition, in which the target shape changes greatly as the camera angle changes; in terms of detection effect, vehicles are detected well as long as the vehicle frame changes little. The other is the tunnel condition.


Fig. 13. Results of multi-lane road vehicle detection on the south second ring curve in Beijing.

Fig. 14. Results of multi-lane road vehicle detection in Beijing west terminal tunnel conditions.

In the tunnel condition, the light changes severely across the whole scene, and the colors of the vehicles change to different degrees. The testing results show that the system has good adaptability to the environment.

5. Conclusion

Using video cameras to assist drivers is becoming more and more attractive. To date, most published methods address single-vehicle detection under relatively simple scenarios. This is undesirable given the complexity of the driving environment, in which multiple vehicles are generally present, so developing advanced algorithms to meet these requirements has become even more urgent. To tackle the issue of detecting multiple vehicles in a complex driving environment, this paper proposed a two-step vehicle detection algorithm: the region of interest (ROI) is segmented based on Harr features, and targets are then extracted using the strong descriptive ability of HOG, considering both environmental adaptability and time efficiency. The proposed method was applied to real-world videos captured in complex driving environments in Beijing, and the experimental results showed that it can detect multiple vehicles under conditions of multiple lanes, partial occlusion, curves, tunnels, etc., verifying its validity. In addition, the proposed method shows higher detection accuracy and time efficiency than conventional ones, owing to the combined use of Harr and HOG features in the proposed procedure. Given the desirable results reported in this paper, further research will apply the method in real-world driver assistance systems to demonstrate its applicability for improving driving safety, and future work can also test the ability of the proposed method in extended scenarios.

Acknowledgment

This work is supported by the National Natural Science Foundation of China (Grant No. 61573106).

References

[1] J. Arróspide, L. Salgado, M. Camplani, Image-based on-road vehicle detection using cost-effective histograms of oriented gradients, J. Vis. Commun. Image Represent. 24 (2013) 1182–1190.
[2] P.H. Batavia, D.A. Pomerleau, C.E. Thorpe, Overtaking vehicle detection using implicit optical flow, in: IEEE Conference on Intelligent Transportation Systems, 1997, pp. 729–734.
[3] H.Y. Chang, C.M. Fu, C.L. Huang, Real-time vision-based preceding vehicle tracking and recognition, in: IEEE Intelligent Vehicles Symposium, 2005, pp. 514–519.





[4] W.C. Chang, C.W. Cho, Online boosting for vehicle detection, IEEE Trans. Syst. 40 (2010) 893–902.
[5] Y. Chong, W. Chen, Z. Li, W.H. Lam, C. Zheng, Q. Li, Integrated real-time vision-based preceding vehicle detection in urban roads, Neurocomputing 116 (2013) 144–149.
[6] N. Dalal, B. Triggs, Histograms of oriented gradients for human detection, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005, pp. 886–893.
[7] Y. Freund, Boosting a weak learning algorithm by majority, Inform. Comput. 121 (1995) 256–285.
[8] Y. Freund, R.E. Schapire, A decision-theoretic generalization of on-line learning and an application to boosting, J. Comput. System Sci. 55 (1997) 119–139.
[9] M.G. Genton, Classes of kernels for machine learning: A statistics perspective, J. Mach. Learn. Res. 2 (2001) 299–312.
[10] Q. Guo, W.J. Song, L. Hou, Y.L. Zhang, J.G. Liu, Effect of the time window on the heat-conduction information filtering model, Physica A 401 (2014) 15–21.
[11] C. Hoffmann, T. Dang, C. Stiller, Vehicle detection fusing 2D visual features, in: IEEE Intelligent Vehicles Symposium, 2004, pp. 280–285.
[12] C. Hsu, C. Lin, A comparison of methods for multi-class support vector machines, IEEE Trans. Neural Netw. 13 (2002) 415–425.
[13] W. Huang, Introduction to Intelligent Transportation System (ITS), China Communications Press, 2008.
[14] A. Jazayeri, H. Cai, J. Zheng, Vehicle detection and tracking in car video based on motion model, IEEE Trans. Intell. Transp. Syst. 12 (2011) 583–595.
[15] Y. Jia, C. Zhang, Front-view vehicle detection by Markov chain Monte Carlo method, Pattern Recognit. 42 (2009) 313–321.
[16] A. Khammari, F. Nashashibi, Y. Abramson, Vehicle detection combining gradient analysis and AdaBoost classification, in: IEEE Conference on Intelligent Transportation Systems, 2005, pp. 66–71.
[17] Y.C. Kuo, N.S. Pai, Y.F. Li, Vision-based vehicle detection for a driver assistance system, Comput. Math. Appl. 61 (2011) 2096–2100.
[18] L.C. Leon, R. Hirata Jr., Car detection in sequences of images of urban environments using mixture of deformable part models, Pattern Recognit. Lett. 39 (2014) 39–51.
[19] R. Lienhart, J. Maydt, An extended set of Haar-like features for rapid object detection, in: IEEE International Conference on Image Processing, Vol. 1, 2001, pp. 900–903.
[20] N.D. Matthews, P.E. An, D. Charnley, C.J. Harris, Vehicle detection and recognition in greyscale imagery, Control Eng. Pract. 4 (1996) 473–479.
[21] C. Papageorgiou, M. Oren, T. Poggio, A general framework for object detection, in: International Conference on Computer Vision, Vol. 108, 1998, pp. 555–562.
[22] P. Parodi, G. Piccioli, Feature-based recognition scheme for traffic scenes, in: Intelligent Vehicles Symposium, 1995, pp. 229–234.
[23] F. Porikli, Integral histogram: A fast way to extract histograms in Cartesian spaces, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 1, 2005, pp. 826–836.
[24] S. Qin, J. Zhang, X. Chen, F. Chen, Enumeration of spanning trees on contact graphs of disk packings, Physica A 433 (2015) 1–8.
[25] R. Rad, M. Jamzad, Real time classification and tracking of multiple vehicles in highways, Pattern Recognit. Lett. 26 (2005) 1597–1607.
[26] H. Schneiderman, A statistical approach to 3D object detection applied to faces and cars, in: IEEE Conference on Computer Vision and Pattern Recognition, Vol. 1, 2000, pp. 746–751.
[27] Z. Shen, The Target Detection Technology Based on the Visual Navigation Intelligent Vehicle in Complex Urban Scenes (Ph.D. thesis), Chongqing University, Chongqing, 2008.
[28] N. Srinivasa, A vision-based vehicle detection and tracking method for forward collision warning, in: Intelligent Vehicle Symposium, Vol. 2, 2002, pp. 626–631.
[29] Z. Sun, G. Bebis, R. Miller, Improving the performance of on-road vehicle detection by combining Gabor and wavelet features, in: IEEE International Conference on Intelligent Transportation Systems, 2002, pp. 130–135.
[30] Z. Sun, G. Bebis, R. Miller, On-road vehicle detection using Gabor filters and support vector machines, in: International Conference on Digital Signal Processing, Vol. 2, 2002, pp. 1019–1022.
[31] Z. Sun, G. Bebis, R. Miller, On-road vehicle detection: A review, IEEE Trans. Pattern Anal. Mach. Intell. 28 (2006) 694–711.
[32] T.K. ten Kate, M.B. van Leewen, S.E. Moro-Ellenberger, Mid-range and distant vehicle detection with a mobile camera, in: Intelligent Vehicles Symposium, 2004, pp. 72–77.
[33] S. Theodoridis, K. Koutroumbas, Pattern Recognition, fourth ed., Electronic Industry Press, 2010.
[34] P. Viola, M. Jones, Rapid object detection using a boosted cascade of simple features, in: Proceedings of the International Conference on Computer Vision and Pattern Recognition, Vol. 1, 2001, pp. 511–518.
[35] Y. Wei, A Method of Multi-Target Recognition and Tracking in Road Mobile Visual Environment Perception (Ph.D. thesis), Southeast University, Nanjing, 2013.
[36] Y. Xiao, H. Zhao, G. Hu, X. Ma, Enumeration of spanning trees in planar unclustered networks, Physica A 406 (2014) 236–243.
[37] Y. Zheng, Object Detection Techniques and Semi-definite Programming Relaxation Clustering Algorithm (Ph.D. thesis), National University of Defense Technology, Changsha, 2011.
[38] W. Zhu, J. Miao, J. Hu, L. Qing, Vehicle detection in driving simulation using extreme learning machine, Neurocomputing 128 (2013) 160–165.

Please cite this article in press as: Y. Wei, et al., Multi-vehicle detection algorithm through combining Harr and HOG features, Mathematics and Computers in Simulation (2018), https://doi.org/10.1016/j.matcom.2017.12.011.