Neurocomputing 129 (2014) 350–361
Power line detection from optical images
Biqin Song a,b, Xuelong Li a,*
a Center for OPTical IMagery Analysis and Learning (OPTIMAL), State Key Laboratory of Transient Optics and Photonics, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an 710119, Shaanxi, P.R. China
b Graduate University of the Chinese Academy of Science, Beijing 100049, P.R. China
Article info
Article history: Received 11 April 2013; received in revised form 21 September 2013; accepted 26 September 2013; communicated by Dr. Z. Wang; available online 4 November 2013.
Keywords: Power line detection; Threat avoidance; Matched filter; Line segment pool; Graph-cut model

Abstract
Image-based power line detection is highly important for threat avoidance when aerial vehicles fly at low altitude. However, it is very challenging because it requires a high detection rate, a low false alarm rate and real-time operation. In this paper, a sequential local-to-global power line detection algorithm is proposed. In the local criterion, a line segment pool is detected by morphologically filtering an edge map image, which is computed based on the matched filter (MF) and the first-order derivative of Gaussian (FDOG). This step deliberately over-detects in order to guarantee a high detection rate. In the subsequent global criterion, grouping the line segments into whole power lines is formulated as a graph-cut model based on graph theory. The principal advantage of the proposed algorithm is that it can detect not only straight power lines but also curved ones. Experimental results demonstrate that the algorithm performs well in both detection accuracy and processing time.
© 2013 Elsevier B.V. All rights reserved.
1. Introduction
Over the past several years, interest in the development of Unmanned Aerial Vehicles (UAVs) and Micro Air Vehicles (MAVs) has accelerated substantially owing to their wide range of applications, from battlefield surveillance and surveys of natural disaster areas to electrical infrastructure maintenance and traffic monitoring. However, serious hazards exist when aerial vehicles operate at low altitude in cluttered, uncontrolled environments with varying obstacles, lighting effects, weather and so on [1–4]. In particular, power lines are one of the most formidable hazards. According to United States Army records [5], more aerial vehicles were lost to power lines than to enemies in combat. Specific data [6] report 54 so-called power line strikes, which killed 13 military personnel and caused $224 million in damages from 1997 to 2006; during the same period, there were 102 civilian power line strikes that killed 33 people, according to data from the National Transportation Safety Board. Therefore, the navigation system must be able to detect power lines efficiently and in real time before aerial vehicles can execute various intelligent missions safely [7].
Great attention has been paid to the development of visual navigation systems for UAVs and MAVs. Correspondingly, many sensor-based strategies for automatically detecting power lines have been developed. Generally, there are three kinds of systems: infrared systems [8], laser radar systems [9,10] and electromagnetic systems [11]. However, various disadvantages limit the practical application of these strategies. For infrared systems, the contrast between the background and the power lines is usually insufficient for automatic detection even if a high-resolution infrared camera is used; moreover, the technology may fail in the case of a power failure or in environments with other heat sources. Laser radar systems are easily affected by weather conditions because they are highly sensitive to atmospheric attenuation. Electromagnetic systems may be ineffective in the case of an outage. Beyond that, these three kinds of emissive devices are energy consuming, large and heavy.
Recently, optical systems have received more and more attention owing to their low energy consumption, small size and light weight. These advantages help small aircraft overcome significant limitations in electrical power supply and payload. However, perceiving power lines in optical images is challenging. On the one hand, power lines are difficult to detect accurately. Firstly, the contrast between the background and the thin lines is usually low. Secondly, when the background is heavily cluttered, as in urban settings, it is hard to discriminate power lines from similar objects in the background, for instance the edges of buildings. On the other hand, it is hard to estimate the distance to power lines from an optical image. In this paper, we focus on the detection problem. A sequential local-to-global power line detection algorithm is proposed, with the following contributions:
(1) Some non-power-line segments can be suppressed by the introduced edge detection method, which tends to detect symmetrical edges and suppress step edges. This property makes the subsequent global stage easier. More importantly, it yields fewer false alarms than existing methods. A more detailed explanation is given in Section 1.2.
(2) To the best of our knowledge, the use of a graph-theoretical approach to group line segments into whole lines is new.
(3) In contrast to existing power line detection methods, a quadratic polynomial is used to fit each power line instead of restricting power lines to be straight, which enables the proposed method to detect not only straight power lines but also curved ones.

* Corresponding author. E-mail address: [email protected] (X. Li).
0925-2312/$ - see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.neucom.2013.09.023
1.1. Related works
Existing algorithms for power line detection from optical images can be classified into two kinds by application. One is automatic surveillance and inspection of the electrical power infrastructure, and the other is avoidance of threatening obstacles in navigation.
Several algorithms have been proposed for the first application. Golightly and Jones designed an artificial-vision system to inspect overhead power lines based on the Hough transform (HT) [12]. Yan et al. [13] extracted straight line segments by the Radon transform and then employed a Kalman filter to connect segments into whole lines. Li et al. [14,15] removed background noise with a pulse coupled neural filter, then detected straight lines with an HT-based method, and finally used the K-means clustering algorithm to refine the detection results. These algorithms depend on prior knowledge of the power lines, for example, that the number of power lines is known or that the power lines are approximately parallel to each other [13–15].
However, power line detection for obstacle avoidance when aerial vehicles fly at low altitude is much more difficult and challenging than for electrical power infrastructure surveillance, because when aerial vehicles operate in an unknown scenario, little information about the threatening power lines is known beforehand. Besides, the detection algorithms must be fast enough to leave time for avoiding the obstacle. In spite of these challenges, a few researchers have investigated the problem. To the best of our knowledge, the pioneering work was introduced by Kasturi et al. [16]. Steger's detector of curvilinear structures [17] was applied to extract line segments prior to HT. This algorithm achieves subpixel accuracy but is time consuming because of Steger's detector. More recently, Candamo et al. [18–21] stated that not only is the spatial information carried by a single still frame useful, but the temporal information between frames should also be exploited. They proposed a series of algorithms based on low-altitude videos. First, a feature map image is produced by estimating the relative motion of each pixel from consecutive frames with an optical flow method. Then, HT is performed in each subwindow of the feature map image. Finally, the locations of the previously detected lines in the next frame are predicted by tracking the HT parameter space over time with a linear motion model, so the lines are not easily missed even when they are not fully visible. However, the data volume of video is huge and the computational complexity of optical flow algorithms is high, which is unfavorable for real-time processing.
The schemes reported above generally involve two criteria: (i) a local criterion, which creates a line segment pool using a local operator or pixel-wise manipulation, mainly based on the gradient image; and (ii) a global criterion, which groups the “true” candidates in the pool into whole power lines by incorporating additional knowledge about their structure, such as smoothness.
1.2. Detecting line segment pool
In the first, local processing stage of all the above-mentioned methods, an edge map image is computed by one of several pre-existing line feature detection methods, such as the pulse coupled neural network method [14,15], Steger's curvilinear structures detector [16], and the Canny edge detector [18–20]. According to their cross-section shapes, edges can be classified into two categories: step edges and symmetrical edges, the latter having a cross-section that is Gaussian-shaped and symmetric with respect to its peak location. A power line is usually very thin, only two or three pixels wide. It is reasonable to assume that the backgrounds on both sides of it differ little, which means that the edges representing a power line are symmetrical. However, the above methods do not distinguish between the two kinds of edges. As a result, they do not distinguish power lines from other linear objects, such as the edge lines of buildings. This may cause high false alarm rates, because it is harder to filter out non-power-line segments in the subsequent global stage, which operates only on the edge map image without any gray-level information.
In this paper, we propose a new power-line segment detection method based on a variant of the matched filter (MF) with the first-order derivative of Gaussian (FDOG). MF has been used to detect various line features, for instance in retinal blood vessel extraction [22]. It is an effective yet simple method, which detects vessel-like features by filtering the image and thresholding the response image. The original MF for vessel extraction exploits only the fact that the cross-section of a vessel is Gaussian shaped, although the profile is also symmetric about its peak position. Because this symmetry is not used, MF may respond not only to vessel edges but also to non-vessel edges. Making use of this prior knowledge, Zhang et al. [23] proposed a retinal blood vessel detection method based on an extension of MF, called MF-FDOG. Like MF, MF-FDOG detects vessels by thresholding the response image to MF; the improvement is that the threshold is adjusted automatically according to the response image to FDOG. MF-FDOG achieves detection results competitive with state-of-the-art schemes at much lower complexity.
However, MF-FDOG cannot be applied directly to power line detection, for two reasons: the attributes of optical images and medical images are different, and the morphological properties of power lines and vessels are different. In this paper, the edge map image is first computed based on a variant of MF-FDOG, and then a morphological filter is used to filter out as many non-power-line edges as possible. This completes the local criterion and yields a line segment pool.
1.3. Grouping line segments into power lines
Since it is unknown which elements in the pool are the “true” candidates, which are background noise, and which elements make up each power line, the pool is aptly described as a “black box”. The classical method for decoding this black box, HT, is used in most of the above-mentioned algorithms [14–16,18,20]. However, it is hard to control, in that an improper threshold choice may cause significant false negatives or false positives. Its heavy computation is another drawback. Another commonly used family of algorithms [13,19,21] groups the line segments into power lines by their morphological properties, such as parallelism, distance and so on. However, all of these methods assume that power lines are straight. Obviously, this is not always true, because power lines spanning two distant towers may sag under the force of gravity. In order to detect not only straight power lines but also curved ones, a unified grouping framework based on graph cut for linking line segments into whole lines is proposed in this paper.
The graph-theoretical approach has been used to extract different types of line structures, such as roads in satellite images [24,25], tubular structures in medical images [26–28], and others [29]. In all of these works except [24], line feature identification is modeled as an image segmentation problem solved by finding an optimal path in a high-dimensional graph. With every pixel of the image defined as a vertex of the graph, the segmentation problem is formulated as a binary labeling of the vertices as foreground (line features) or background. In [24], the graph-theoretical framework is used to perform the global analysis, with the vertices representing a set of detected road segments and possible connection segments. The road detection problem is then also treated as a binary labeling of the vertices, where the label set indicates whether each segment belongs to a road or not.
The global analysis problem faced in power line detection is different from that in road detection. Although the vertices can likewise be defined as detected line segments, the solution sought is different. Besides deciding whether each segment belongs to a power line, it must also determine which segments belong to which power line. So it is no longer a binary labeling but a multiclass labeling in which the number of classes is unknown. A great challenge arises from two inherent disadvantages of the graph-theoretical approach: (1) a huge computational burden; (2) the lack of efficient solution algorithms for multi-part graph cut. Such drawbacks aside, multi-way graph cut theory, with its potential for grouping both straight and curved line segments in a unified model, is greatly appealing.
In this paper, in order to develop a unified model for detecting both straight power lines and curved ones, a new weight is designed for the graph edge between any two vertices. Based on the new weight, the problem of grouping line segments into power lines is formulated as a normalized graph cut (Ncut) model, which is solved approximately via the graph Laplacian. With the new construction of the weight matrix, the number of partitions can be decided adaptively by thresholding the eigenvalues of the graph Laplacian matrix. In addition, the partition is fast because the Laplacian matrix is low dimensional. To the best of our knowledge, this is the first time a graph-theoretical approach has been used for grouping line segments into whole lines.
The rest of this paper is organized as follows. The proposed power line detection algorithm is described in Section 2. Section 3 presents the experimental results, and conclusions are drawn in Section 4.
2. Power line detection
2.1. Algorithm overview
An overview of the proposed algorithm, which operates in two stages, is shown in Fig. 1. In the local criterion, an edge map based on MF and FDOG is computed to detect all line segments with symmetrical edges in the image; a morphological filter designed specifically for this task then filters out non-power-line candidates and generates a line segment pool. Although some detected line segments are filtered out based on the morphological properties of power lines in this coarse detection step, it is still an over-detection that guarantees a low missed-detection rate, that is, it keeps as many power line candidates as possible. The line segments in the pool therefore need to be further verified. In the subsequent global criterion, the graph-cut model based on graph theory is used to group the line segment pool into a whole-line pool, from which the “true” power lines are then picked out, again by morphological properties.

Fig. 1. The overview of the local-to-global power line detection algorithm. [Block labels: Input image; Gray image; MF response; FDOG response; Edge map image; Morphological filtering; Training segments; Line segment pool; Graph-cut model; Line pool; Output image. Stage labels: Local criteria, line segment pool detection; Global criteria, power lines detection.]

2.2. Line segments detection
For flying aircraft, it is well known that false negatives pose a much greater threat than false positives. Therefore, in the local criterion of the proposed algorithm, the features representing power lines are coarsely over-detected in order to reduce false negatives. These features are obtained by morphologically filtering an edge map image computed based on MF and FDOG.
We first give some definitions, notations and properties of MF and FDOG. MF is a Gaussian-shaped filter, shown in Fig. 3(a), defined as
$$F(x, y) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{x^2}{2\sigma^2}\right) - s, \quad \text{for } |x| \le t\sigma,\ |y| \le L/2 \qquad (1)$$
where $\sigma$ denotes the scale of the filter; $t$ is a constant, usually set to 3 because more than 99% of the area under the Gaussian lies within $[-3\sigma, 3\sigma]$; and $L$ is the length of the neighborhood along the y-axis, used to smooth noise. To remove the smooth background after filtering, the mean value of the filter is normalized to 0 by introducing $s$:
$$s = \left( \int_{-t\sigma}^{t\sigma} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{x^2}{2\sigma^2}\right) dx \right) \Big/ (2t\sigma) \qquad (2)$$
It is straightforward to derive the first-order derivative of the Gaussian function (FDOG) of $F$:
$$f(x, y) = -\frac{x}{\sqrt{2\pi}\,\sigma^3} \exp\left(-\frac{x^2}{2\sigma^2}\right), \quad \text{for } |x| \le t\sigma,\ |y| \le L/2 \qquad (3)$$
The FDOG filter is shown in Fig. 3(c). Given two synthetic signals representing the profiles of the two kinds of edges in Fig. 2, the corresponding responses of MF and FDOG are given in Fig. 3(b) and (d), respectively. They demonstrate that, for MF, the response to the symmetrical edge is still symmetrical and strongly positive in the peak area of the edge, while the response to the step edge is
anti-symmetrical and very low in the jump area; for FDOG, the opposite is true.
With these definitions in place, the idea of the MF-FDOG thresholding scheme proposed by Zhang et al. [23] for retinal blood vessel detection is briefly sketched. Because retinal vessels are symmetrical edges, the original MF method obtains the edge map of the retinal image wherever the response to MF is larger than a given threshold. However, MF also matches non-vessel edges to some extent. In [23], the threshold is adjusted by the response to FDOG: where there is a vessel, the threshold is lowered in accordance with the weak FDOG response in the corresponding area, which favors detecting the vessel; where there is none, the threshold is raised by the strong FDOG response, so non-vessel structures are suppressed.
However, the MF-FDOG method may not be robust enough for power line detection, because the quality of the images used for detection may be seriously degraded compared to retinal images owing to sensor motion and all kinds of noise. This may cause some weak power lines to be ignored and lead to many false negatives. Considering that false negatives are much more hazardous than false positives when aerial vehicles must avoid threatening obstacles in flight, a robust and efficient power line detection algorithm is needed. In this paper, a problem-specific thresholding scheme is proposed to over-detect power line segments.
The proposed thresholding scheme first sets a reference threshold on the response to FDOG and then adjusts the reference threshold by the response to MF. The response images are obtained by convolving the input image with the MF kernel and the FDOG kernel in N orientations, followed by maximizing and minimizing over all N filtered images at each pixel. They are denoted by $M$ and $G$ and shown in Fig. 4(c) and (d), respectively.
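As a rough illustration of Eqs. (1)–(3), the following Python sketch builds 1-D cross-sections of the two kernels and applies them to a synthetic symmetrical edge and an ideal step edge. The function names, the scale value 1.5 and the synthetic profiles are assumptions made for this illustration, not part of the original method; the mean $s$ of Eq. (2) is replaced here by the discrete mean of the sampled kernel.

```python
import numpy as np

def mf_fdog_profiles(sigma=1.5, t=3.0):
    """1-D cross-sections of the MF (Eq. 1) and FDOG (Eq. 3) kernels on |x| <= t*sigma."""
    half = int(np.floor(t * sigma))
    x = np.arange(-half, half + 1, dtype=float)
    gauss = np.exp(-x**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)
    mf = gauss - gauss.mean()      # discrete stand-in for subtracting s of Eq. (2)
    fdog = -x * gauss / sigma**2   # derivative of the Gaussian with respect to x
    return mf, fdog

def filter_1d(signal, kernel):
    """Convolve with edge padding so the signal borders do not act as edges."""
    half = len(kernel) // 2
    padded = np.pad(signal, half, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

def edge_responses(n=41, sigma=1.5):
    """Responses of both kernels at the centre of a symmetrical and a step edge."""
    idx = np.arange(n)
    centre = n // 2
    symmetric = np.exp(-(idx - centre) ** 2 / 2.0)   # thin, line-like profile (assumed)
    step = (idx >= centre).astype(float)             # ideal step profile (assumed)
    mf, fdog = mf_fdog_profiles(sigma)
    return {
        "mf_symmetric": filter_1d(symmetric, mf)[centre],
        "fdog_symmetric": filter_1d(symmetric, fdog)[centre],
        "mf_step": filter_1d(step, mf)[centre],
        "fdog_step": filter_1d(step, fdog)[centre],
    }
```

Under these assumptions, the MF response at the centre of the symmetrical profile is clearly positive while the FDOG response vanishes there, and at the step the FDOG magnitude dominates the MF magnitude, which is the qualitative behavior described for Fig. 3.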
Fig. 2. The profiles of two kinds of edges: a symmetrical edge (left) and an ideal step edge (right).
Fig. 3. Responses of MF and FDOG to the two kinds of edges: (a) MF, (b) the filter response of MF, (c) FDOG, and (d) the filter response of FDOG.

The reference threshold is denoted by $T_G$ and set as follows:
$$T_G = c\,\mu_G \qquad (4)$$
where $c$ is a constant and $\mu_G$ is the mean value of the FDOG response $G$. Next, the reference threshold is adjusted by the following equation:
$$T = (1 + \bar{M}_n)\,T_G \qquad (5)$$
where $\bar{M}_n$ is calculated by normalizing $\bar{M}$, which is the local mean of the response $M$:
$$\bar{M} = M * R \qquad (6)$$
where $*$ denotes convolution and $R$ is a mean filter of size $r \times r$ whose elements are all $1/r^2$. This mean filter reduces the adverse effects of noise and outliers and gives more robust detection results. By thresholding the response $G$ with $T$, the final edge map image $D$ is created as
$$D(x, y) = \begin{cases} 0 & \text{if } G(x, y) \le T(x, y), \\ 1 & \text{otherwise,} \end{cases} \qquad (7)$$
where $(x, y)$ is the index of the pixel. It can be seen from Fig. 3 and Eqs. (5)–(7) that if there is a symmetrical edge in the image, the magnitude of $\bar{M}_n$ will be high in the corresponding area and the threshold $T$ will become larger by Eq. (5), so the edge can be easily detected by Eq. (7); if there is a non-symmetrical edge, the corresponding magnitude of $\bar{M}_n$ will be weak and the threshold $T$ will become smaller, so it can be suppressed adaptively. As shown in Fig. 4(e), the symmetrical power lines are detected and the non-symmetrical horizon is suppressed.

Fig. 4. The example of the proposed approach's flow. (a) The original image with 8 power lines, some branches and a mountain, (b) the gray image, (c) MF response image, (d) FDOG response image, (e) the edge map image by the proposed approach, (f) the line pool with 15 line segments obtained by filtering the edge map image (e), and (g) the finally detected 8 power lines.

Inevitably, some other symmetrical line edges are detected as power line candidates, such as branches, the borders of clouds and so on. These may interfere with the subsequent global operation and even result in significant false positives. A filtering technique is therefore needed to remove as many non-power-line segments as possible from the edge map image. Fortunately, two facts help: power lines usually run across the whole image, which means that they should not appear as short fragments, so the length of their edge map can usually reach dozens of pixels; more importantly, both straight and curved power lines are very smooth, whereas some detected edges may wind severely or even resemble tree structures. In this paper, the non-power-line edges are filtered out by taking full advantage of these facts.
First, over-short edges, i.e., connected components with a pixel count below a threshold, are filtered out. A small threshold may cause high computational complexity, whereas a large one may suppress power line edges and cause many false negatives.
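The thresholding of Eqs. (4)–(7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the response images `M` and `G` are taken as given, the normalization of the local mean is assumed to be a min-max scaling to [0, 1] (the text does not specify it), the window size `r` is assumed odd, and the default values of `c` and `r` are hypothetical. The function applies Eq. (7) literally as written.

```python
import numpy as np

def box_mean(img, r):
    """Local mean with an r x r window whose elements are all 1/r^2 (Eq. 6); r assumed odd."""
    pad = r // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(r):
        for dx in range(r):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (r * r)

def edge_map(M, G, c=2.0, r=3):
    """Binary map D of Eq. (7) from the MF response M and the FDOG response G."""
    M = np.asarray(M, dtype=float)
    G = np.asarray(G, dtype=float)
    T_G = c * G.mean()                                   # Eq. (4): reference threshold
    M_bar = box_mean(M, r)                               # Eq. (6): local mean of M
    span = M_bar.max() - M_bar.min()
    # Assumed normalization: min-max scaling of the local mean to [0, 1].
    M_n = (M_bar - M_bar.min()) / span if span > 0 else np.zeros_like(M_bar)
    T = (1.0 + M_n) * T_G                                # Eq. (5): adjusted threshold
    return (G > T).astype(np.uint8)                      # Eq. (7): 0 if G <= T, 1 otherwise
```

The per-pixel threshold `T` has the same shape as the response images, so the adjustment by the MF response is purely local.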
We make a trade-off and set the threshold to 30 as in [20], since our image sizes (640 × 480 and 720 × 480) are the same as those in [20]. Experiments show that it is an appropriate choice.
Then, a measurement is proposed to quantify the smoothness of the detected edges, which are then filtered simply by thresholding this measurement. Noticing that any power line can be fitted by a quadratic polynomial, whose second-order derivative is a constant, the smoothness measurement $S$ of an edge is defined as the variance of the second-order difference of each point's tangential direction:
$$S = \frac{1}{l - 1} \sum_{(x, y) \in e} \left( D(x, y) - \mu_D \right)^2 \qquad (8)$$
where $(x, y)$ is the index of a point on the edge $e$, $l$ is the length of the major axis of the ellipse that can capture the edge, $D(x, y)$ is the second-order difference of the tangential direction at the point $(x, y)$, and $\mu_D$ is the mean value of the second-order differences of all points belonging to $e$. The range of the tangential direction is $(-\pi, \pi]$. From this definition, the smoother the edge, the smaller the measurement value $S$. The edge map is filtered by thresholding the smoothness measurement:
$$e \text{ is } \begin{cases} \text{kept} & \text{if } S \le t_u, \\ \text{filtered out} & \text{otherwise.} \end{cases} \qquad (9)$$
The upper bound $t_u$, which removes non-smooth edges, is determined by the smoothness measurements of the training line segments. Consequently, the set of candidates is slimmed down and a refined line segment pool is obtained, as shown in Fig. 4(f).
More examples of power line detection by the proposed two-stage algorithm are shown in Fig. 5. It can be seen that the proposed method is more robust than the MF-FDOG method. Comparing the second and fourth columns, when the background is simple or the contrast between it and the power lines is strong, both methods are satisfactory, as in rows (1), (2) and (6). But when the background is cluttered or the power lines resemble the background, the proposed method produces fewer false negatives than the MF-FDOG method, as in rows (3)–(5).

Fig. 5. The examples of power line detection by the proposed two-stage way algorithm. [Rows (1)–(6); columns: original images; line segment pool by MF-FDOG method; detection results by MF-FDOG method; line segment pool by proposed method; detection results by proposed method.]

2.3. Grouping detected line segments into power lines
In this subsection, the grouping of line segments in the pool into whole power lines is introduced. Three questions must be answered: Which elements in the pool are useful components? Which ones are background noise? And which elements is each whole power line composed of? The classical answers are HT and morphological methods. However, these methods assume a line model in which power lines are straight and ignore the curved case, in which power lines sag under the force of gravity. Mathematically, straight and curved lines can be described by first-order and higher-order polynomials, respectively. In fact, a quadratic polynomial, which curves downward like a parabola, is sufficient to represent a power line. Hence, a quadratic polynomial is used to fit every power line instead of assuming a straight line. This makes it possible to detect not only straight power lines but also curved ones.
The problem is formulated as a graph cut model. First, a connected undirected graph is constructed. Every line segment is defined as a vertex of the graph, and the connection between any pair of vertices is an edge. The weight of every edge is computed from the error of fitting a quadratic polynomial to the two corresponding vertices. Then, the Ncut method is used to partition the graph and find out which line segments are collinear. Last, power lines are obtained and localized by fitting a quadratic polynomial to the collinear line segments.
We begin with the definitions of some terminology. Let $G = (V, E)$ be a connected undirected graph where $v_i \in V$ is a vertex corresponding to an element of the line segment pool $V$. $e_{ij} \in E$ is the edge between the line segments $v_i$ and $v_j$, and each edge has a weight $w(v_i, v_j)$ which is defined to quantitatively measure the
probability of collinearity between the two line segments connected by the edge. The greater the probability, the bigger the weight. The line segment pool is to be partitioned into mutually exclusive groups, such that each group is a connected graph $\tilde{G} = (\tilde{V}, \tilde{E})$ representing a complete line, where $\tilde{V} \subseteq V$ and $\tilde{E} \subseteq E$. Taking binary partition as an example, nonempty sets $V_1$ and $V_2$ form a partition of the graph $G$ if $V_1 \cap V_2 = \emptyset$ and $V_1 \cup V_2 = V$. In this paper, the partition is based on the graph cut model. A cut is a set of edges whose removal from the graph $G$ breaks it into two disjoint sets $V_1$ and $V_2$. So the grouping problem can be solved by minimizing the cut cost, which is defined as
$$\min \operatorname{cut}(V_1, V_2) = \min \sum_{v_1 \in V_1,\, v_2 \in V_2} w(v_1, v_2) \qquad (10)$$
Because the number of line segments belonging to each group may vary a lot, this can cause a bias toward small groups. To alleviate this unnatural bias, the objective function (10) is replaced by the Ncut [30]:
$$\min \operatorname{Ncut}(V_1, V_2) = \min \left( \frac{\operatorname{cut}(V_1, V_2)}{\operatorname{vol}(V_1)} + \frac{\operatorname{cut}(V_1, V_2)}{\operatorname{vol}(V_2)} \right) \qquad (11)$$
where $\operatorname{vol}(V_1) = \sum_{v_1 \in V_1,\, v \in V} w(v_1, v)$. The minimal cut bias is thus circumvented, because a small group will not have a small Ncut cost in Eq. (11). The Ncut optimization problem is well studied in spectral graph theory and can be modeled as the following generalized eigenvalue problem:
$$\min_{y} \frac{y^T (D - W) y}{y^T D y} \quad \text{subject to } y(i) \in \{1, -\beta\} \text{ and } y^T D \mathbf{1} = 0 \qquad (12)$$
where $W$ is the weight matrix with $W_{ij} = w(v_i, v_j)$, $D$ is a diagonal matrix with elements $d(i) = \sum_{j=1}^{n} W_{ij}$, and $\beta = \sum_{v_i \in V_1} d(i) / \sum_{v_j \in V_2} d(j)$. According to the Rayleigh–Ritz theorem [31], the eigenvector corresponding to the second smallest eigenvalue of the graph Laplacian matrix $L = D - W$ is the real-valued solution to our Ncut problem. Finally, the graph can be partitioned by thresholding the eigenvalues of the Laplacian matrix.
Now we come to the core component, the design of the weight $w(v_i, v_j)$ of the edge $e_{ij}$. Take Fig. 4(f) as an example to illustrate how the weight measures the probability of collinearity between any two line segments. The line segments labeled “4” and “8” share many column coordinates, which implies that they cannot be collinear, so the weight $w(v_i, v_j)$ is set to 0. The situation is different for the line segments labeled “4” and “14”, or “4” and “15”, which have no column coordinates in common. In this case, the weight measuring the probability of collinearity is defined as
$$w(v_i, v_j) = a \exp\left( -b\, \frac{err_{ij}}{ecc_{ij}} \right) \qquad (13)$$
where $a$ and $b$ are constants, $ecc_{ij}$ is the eccentricity of the minimal ellipse that can capture the region of the line segments $v_i$ and $v_j$, and $err_{ij}$ is the error of fitting them with a quadratic polynomial. With this design of the weight matrix and the optimization method in [30], experiments show that when $a$ and $b$ are both set to 1, the global optimal solution can be achieved by thresholding the eigenvalues of the Laplacian matrix at 0. That is to say, only the eigenvectors corresponding to the positive eigenvalues contribute to the solution. The groups of line segments are thus obtained by partitioning the graph, and each group represents a line. The grouping results for the line segment pool in Fig. 4(f) are given in Table 1. It can be seen that collinear line segments are correctly assigned to the same group.

Table 1
Grouping results of the line segment pool in Fig. 4(f).
Whole lines      1      2    3    4      5    6      7          8      9
Line segments    4,15   8    5    7,12   6    1,14   3,10,13    2,9    11
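To make Eqs. (12) and (13) concrete, the following Python sketch computes a weight matrix for segments given as arrays of (column, row) pixels and performs a single two-way normalized cut with the standard spectral relaxation of [30]. Several choices here are assumptions made for illustration: any shared column sets the weight to 0, the fitting error is taken as the RMS residual of a quadratic fit of row against column (so near-horizontal lines are assumed), and the eccentricity is computed from the second moments of the pixels rather than from a minimal enclosing ellipse. It shows a plain bipartition by the sign of the second eigenvector, not the eigenvalue-thresholding multi-way partition described above.

```python
import numpy as np

def eccentricity(points):
    """Eccentricity of the second-moment ellipse of the points (assumed stand-in
    for the minimal ellipse capturing the two segments)."""
    lam = np.clip(np.linalg.eigvalsh(np.cov(points.T)), 0.0, None)  # ascending order
    return float(np.sqrt(1.0 - lam[0] / lam[1])) if lam[1] > 0 else 0.0

def weight(seg_i, seg_j, a=1.0, b=1.0):
    """Eq. (13) for two segments given as (N, 2) arrays of (col, row) pixels."""
    if np.intersect1d(seg_i[:, 0], seg_j[:, 0]).size > 0:
        return 0.0                                  # shared columns: not collinear
    pts = np.vstack([seg_i, seg_j]).astype(float)
    coeffs = np.polyfit(pts[:, 0], pts[:, 1], 2)    # quadratic fit row = p(col)
    residual = pts[:, 1] - np.polyval(coeffs, pts[:, 0])
    err = np.sqrt(np.mean(residual ** 2))           # assumed error measure (RMS)
    ecc = max(eccentricity(pts), 1e-6)              # guard against division by zero
    return a * np.exp(-b * err / ecc)

def weight_matrix(segments):
    """Symmetric weight matrix W with zero diagonal."""
    n = len(segments)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            W[i, j] = W[j, i] = weight(segments[i], segments[j])
    return W

def ncut_bipartition(W, eps=1e-12):
    """Two-way normalized cut: sign of the second generalized eigenvector of (D - W, D)."""
    d = W.sum(axis=1) + eps
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric form D^{-1/2} (D - W) D^{-1/2} of the generalized eigenproblem.
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)                 # eigenvalues in ascending order
    y = d_inv_sqrt * vecs[:, 1]                     # map back to the generalized eigenvector
    return y > 0
```

For example, four flat segments lying on two different rows should receive weights close to 1 within each row and small weights across rows, so the sign pattern of the second eigenvector separates the two rows.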
After grouping, we fit the line segments in each group with a quadratic polynomial and localize the corresponding line by the fitted coefficients. Then, the orientation of the line is roughly estimated for the subsequent connection. Because a power line is usually flat with small curvature, every detected line can be roughly approximated by a straight line. If the slant angle of the approximating straight line is in $[-\pi/4, \pi/4]$, the detected line is regarded as horizontal; if it is in $(-\pi/2, -\pi/4) \cup (\pi/4, \pi/2]$, it is regarded as vertical. A horizontal line is connected by taking every integer column coordinate from its minimum to its maximum and computing the row coordinate from the column coordinate and the fitted equation. Conversely, a vertical line is connected by setting the row coordinate and computing the column coordinate.

Algorithm 1. Grouping line segments into power lines.
Input: Line segment pool.
Algorithm:
1: Compute the weight matrix W of the graph by Eq. (13).
2: Group line segments by partitioning the graph with Eq. (12) as in [30].
3: Localize and connect lines, selecting sufficiently long lines as power lines.
Output: Power lines.
Finally, a detected line is regarded as a power line if its pixel count is more than $\frac{1}{4}\min(m, n)$, where $m$ and $n$ are the sizes of the original image. Thus, the overly short line “9” in Table 1 is eliminated. The remaining first 8 lines are the detection results, which are shown in Fig. 4(g). More examples verifying the validity of the proposed grouping method are given in Fig. 5. For a clear overview of the grouping algorithm, pseudocode is given in Algorithm 1.
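The localization and connection step can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' code. The slant angle is estimated from the chord between the two end pixels of a line, which is one possible straight-line approximation. The quadratic coefficients are assumed to come from a separate fitting step. The function names are hypothetical.

```python
import math


def slant_class(p_start, p_end):
    """Classify the chord between two (row, col) endpoints as 'horizontal' or 'vertical'."""
    d_row, d_col = p_end[0] - p_start[0], p_end[1] - p_start[1]
    angle = math.atan2(d_row, d_col)
    # Fold the direction into (-pi/2, pi/2], since a line has no heading
    if angle > math.pi / 2:
        angle -= math.pi
    elif angle <= -math.pi / 2:
        angle += math.pi
    return "horizontal" if -math.pi / 4 <= angle <= math.pi / 4 else "vertical"


def connect_line(coeffs, lo, hi, kind):
    """Return the (row, col) pixels of a fitted line.

    coeffs = (c0, c1, c2) of dependent = c0 + c1*t + c2*t^2, where t is the column
    for a horizontal line and the row for a vertical one; lo..hi is the integer
    range of t (minimum to maximum, inclusive).
    """
    c0, c1, c2 = coeffs
    pixels = []
    for t in range(lo, hi + 1):
        v = int(round(c0 + c1 * t + c2 * t * t))
        pixels.append((v, t) if kind == "horizontal" else (t, v))
    return pixels


def is_power_line(pixels, m, n):
    """Keep a connected line only if it has more than min(m, n) / 4 pixels."""
    return len(pixels) > min(m, n) / 4
```

For a 640 × 480 image, the length test keeps lines with more than 120 pixels.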
3. Experiments

3.1. Materials

Three separate experiments are designed and run on a PC with a 2.93-GHz Core 2 CPU and 2 GB of memory. The first explains the parameter settings. The second qualitatively shows the performance of the proposed algorithm on images under different conditions or with different backgrounds. The last quantitatively compares the proposed algorithm with the MF-FDOG method and Kasturi's baseline algorithm [16].

3.1.1. Experimental data

According to whether the power lines are straight or curved, the images used to evaluate the performance of the proposed algorithm are divided into two sets. The dataset with straight power lines includes 160 pictures: 90 were taken with a Sony digital camera in Qin Mausoleum, Beijing, in July 2010; 20 were taken with a Samsung digital camera in Huxian, Xi'an, in November 2011; and the rest were downloaded from [32]. The dataset with curved power lines consists of screenshots from a 160-s video, sampled at one frame per 2 s, taken with a Sony DSC-W55 digital camera in Huining, Gansu province, in July 2011. The resolution of all these images is 640 × 480, except for the 50 images from [32], which are 720 × 480. The datasets are enriched in the following four aspects: (1) different noise levels; (2) different ambiguity by varying the distances of range perspective and the motion of the camera; (3) different weather conditions, including cloud, sunshine, fog and so on; (4) different backgrounds, such as clouds, mountains, buildings, trees and poles. A clutter measurement [20,33] of the complexity of the background is employed to quantitatively show the richness of our datasets. For a given image $I$, the clutter measurement is defined as the square root of the average variance of the intensity over the sub-images:

$$\mathrm{clutter}(I) = \sqrt{\frac{1}{K}\sum_{i=1}^{K}\sigma_i^2} \qquad (14)$$

where $\sigma_i^2$ is the variance of the intensity in the $i$-th sub-image and $K$ is the number of sub-images.
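Eq. (14) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the image is a list of rows of grey-level intensities, the sub-images form a regular grid (a hypothetical `grid` parameter, 4 per side for K = 16), and the population variance is used, which the text does not specify.

```python
import math


def clutter(image, grid=4):
    """Clutter of Eq. (14): root of the mean intensity variance over grid x grid sub-images."""
    n_rows, n_cols = len(image), len(image[0])
    variances = []
    for bi in range(grid):
        r0, r1 = bi * n_rows // grid, (bi + 1) * n_rows // grid
        for bj in range(grid):
            c0, c1 = bj * n_cols // grid, (bj + 1) * n_cols // grid
            block = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            mean = sum(block) / len(block)
            # Population variance of the sub-image (assumed; the paper does not say which)
            variances.append(sum((v - mean) ** 2 for v in block) / len(block))
    return math.sqrt(sum(variances) / len(variances))
```

A perfectly uniform image has zero clutter, and the value grows with the local intensity spread of the background.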
In this paper, the sub-window size is set to 25% of the original image in each dimension with reference to [20], which means that the image $I$ is divided into 4 × 4 sub-images and $K = 16$. For the dataset with straight power lines, the clutter ranges from 1 to 70 with mean 26.1194 and standard deviation 14.0887; the statistical histogram is given in Fig. 6. For the other dataset, the mean and the standard deviation of the background clutter are 26.1494 and 7.7452, respectively. It can be seen that the standard deviation drops by nearly half, which is reasonable. Because the images with curved power lines come
Table 2 The contingency table for power line (PL) detection.
Type              PL present             PL absent
PL detected       True positive (TP)     False positive (FP)
PL not detected   False negative (FN)    True negative (TN)

Fig. 6. The clutter statistics of images with straight lines. (Histogram; horizontal axis: clutter; vertical axis: number of images.)
from a video, the background varies continuously and within a relatively small range compared with frames taken individually in different settings.

3.1.2. Evaluation measures

For each power line, the ground truth is obtained by first manually labeling several points and then approximating them with a single straight line or a quadratic polynomial. To evaluate the proposed algorithm under a reasonable performance metric, a straight power line is considered correctly detected if the differences in angle and y-intercept between it and the ground truth are within 10° and 20 pixels, respectively [34]. A curved power line is considered correctly detected if the average distance between corresponding points is within 20 pixels. The performance is measured by true positives and false positives at the line level. The contingency table for power line detection is given in Table 2. Focusing on the two-by-two matrix in the bottom right corner of the table, the elements on the main diagonal indicate the correct decisions, and the off-diagonal elements indicate the errors. Based on the confusion matrix, the true positive rate and the false positive rate are computed by dividing the numbers of true and false positives, respectively, by the number of ground-truth power lines, which is the sum of true positives and false negatives.

3.2. The parameters

3.2.1. The parameters in the edge map

s: 1, the scale of the MF and FDOG filters. The thinner the power line, the smaller the scale should be; the wider the power line, the larger the scale should be.
L: 10, the length of the kernel for the MF and FDOG filters.
N: 8, the number of orientations of the MF and FDOG filter kernels.
r: 10, the scale of the mean filter.
c: 3, which determines the reference threshold and is the most important parameter in our algorithm. The larger c is, the higher the reference threshold, so edges are easily detected, which may cause high false positives.
Conversely, the smaller c is, the lower the reference threshold, so edges are suppressed
Fig. 7. The different kinds of detected edge samples.
Table 3 The smoothness measurement of the edge samples in Fig. 7.
S1   10.5137    C1   16.9257    B1   100.47
S2   29.7981    C2   14.9460    B2   809.9
S3   11.1864    C3   42.0677    B3   23.1
S4   28.3541    C4   28.7300    B4   802
S5   6.0145     C5   15.2183    B5   845
S6   13.6882    C6   31.4180    B6   54.5
and may cause high false negatives. The performance over c is evaluated for values ranging from 1.4 to 3.2 with a step of 0.2. Because, in threatening-obstacle avoidance for flying aircraft, the damage caused by false negatives is much more hazardous than that caused by false positives, a larger value, c = 3, is chosen.
3.2.2. The parameters in filtering the edge map image

The non-smooth edge samples are filtered out by thresholding the smoothness measurement $S$ in Eq. (9), so a proper choice of the upper bound $t_u$ is very important. Guided by the definition of $S$ and the experimental observation that $S$ is controlled mainly by the curvature of the edge and secondarily by its length, $t_u$ is determined from the smoothness measurements of the training line segments shown in Fig. 7. More specifically, the more severely the edge winds, the larger the smoothness measurement $S$ is. The three symbols S, C and B stand for the following three classes of edge samples: straight and smooth edges, curved and smooth edges, and non-smooth edges, possibly with branches. The samples in the former two classes are smooth and represent straight and curved power lines, respectively; the samples in the last class are non-smooth and should be filtered out. Six representative edge samples of different lengths are picked for each class. The smoothness measurements of these edge samples are given in Table 3. It can be seen that the maximum of the smoothness
Fig. 8. The ten representative detection results by MF-FDOG method and our method are visualized. Rows in each half, from top to bottom: original images, MF-FDOG method, proposed method. (For interpretation of the references to color in this figure caption, the reader is referred to the web version of this paper.)
measurement for the smooth edge samples is 42.0677, corresponding to C3, and the minimum for the non-smooth edge samples is 23.1, corresponding to B3. However, B3 is in fact relatively smooth and very short, so it is hard to say whether it should be filtered out. Therefore, this sample can be regarded as an outlier and removed under certain precision requirements. The second minimum of the smoothness measurement for the non-smooth edge samples, 54.5, corresponding to B6, is taken as the lower bound, which is larger than the maximum 42.0677 of the smooth samples. Based on the above analysis, $t_u$ is set to 45, which lies between these two values.

3.3. Experimental results

To evaluate the efficiency of the improved edge map algorithm, the detection results are visualized and quantified by comparison with the MF-FDOG algorithm. Then, we compare the proposed algorithm with Kasturi's algorithm [16], a typical power line detection method based on single-frame images, on the straight power line dataset only, because to the best of our knowledge there is so far no other detection method for curved power lines. The quantitative comparisons focus on three aspects: true positive rate, false positive rate and time consumption. In Fig. 8, some detection results by the MF-FDOG method and our method are visualized. In the upper half, the three rows are the original images, the MF-FDOG results and our results, from the top
to the bottom. The lower half is arranged in the same way. Considering that either high visual complexity of the background or bad weather may make it difficult to detect power lines, which are thin and not salient, the comparison is mainly conducted on such special images. The characteristics of the ten images are given in Table 4. The word “cluttered” describes the visual complexity of the backgrounds; “small” and “large” refer to the number of power lines; “scattered” and “dense” refer to their distribution. It can be seen that these images cover different aspects of the data and are representative. From the detection results drawn as red lines, it can be concluded that our method is as good as the MF-FDOG method when the detection problem is easy, such as (a1) and (e1) with simple backgrounds or (d2) and (e2) with high contrast. The detection problem in the remaining six pictures is tough because of dense power lines (a2), seriously cluttered backgrounds (b2), poor visibility (c) and so on. In these cases, our method is better than the MF-FDOG method, especially for background-like power lines as shown in (c). Since power line detection is used for threat avoidance, the detection should remain valid under different illumination and weather conditions. Illumination directly influences the contrast between the background and the lines, and accordingly influences the detection. In fact, the proposed method is applicable provided that the contrast is not too low, even under strong light (see (e1) in Fig. 8) or weak light (see (c) in Fig. 8 and (4) in Fig. 5).
Table 4 The specialties of the ten representative images in Fig. 8.
a   Non-cluttered and low contrast: a1 Scattered; a2 Dense
b   Cluttered and low contrast: b1 Non-crossed; b2 Crossed
c   Thick fog
d   Strong interferences (buildings' edges): d1 Clouds and bright; d2 Sky and dark
e   Illumination: e1 Strong light; e2 Reflect light
f   Curve lines
Table 5 Detection results on the dataset with straight power lines.
Line type        Methods          TP rate (%)   FP rate (%)   Computing time (s)
Straight lines   Kasturi's [16]   45.39         71.54         43
                 MF-FDOG          88.44         41.30         0.92
                 Proposed         91.95         38.96         0.93
Curved lines     MF-FDOG          85.67         31.33         1.12
                 Proposed         91.33         29.67         1.09
If there is occlusion, the corresponding region is covered by some barrier and should be marked as a dangerous or no-fly area, so detecting the hidden line matters less. From this point of view, we do not focus on the occlusion problem. However, for a certain degree of occlusion, the proposed method can still localize power lines well and obtain good detection (see (3) in Fig. 5, (b2) and (d1) in Fig. 8). The quantitative comparison between the MF-FDOG method and the proposed method is given in Table 5. It can be seen that the time consumption of the two methods is almost the same, but the proposed method is more effective than the MF-FDOG method, with a higher true positive rate and a lower false positive rate. The Receiver Operating Characteristic (ROC) curves are given in Fig. 9. The scale c in the reference threshold (4) is the control variable. It can be seen that the ROC curves of the proposed method lie above those of the MF-FDOG method, which means that the proposed method is more effective than the MF-FDOG method. The last experiment compares the proposed algorithm with the pre-existing algorithm on the straight power line dataset. Since Kasturi's algorithm [16] is a typical power line detection algorithm for threat avoidance based on single-frame images, it is used as the baseline. The results, covering both accuracy and computational time, are presented in Table 5. The true positive rate of the proposed algorithm is 91.95%, which is more than twice Kasturi's 45.39%. The false positive rate of the proposed algorithm is 38.96%, which is 32.58 percentage points lower than Kasturi's 71.54%. The computational time of each algorithm is the average over all images. For the proposed algorithm, most of the computational time, about 70%, is consumed by the convolution filtering in line segment detection, followed by graph partitioning in line segment grouping, about 15%.
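The line-level rates in Table 5 follow the definitions of Section 3.1.2. A minimal Python sketch for straight lines is given below under stated assumptions: each line is a hypothetical (slope, intercept) pair for row = slope · col + intercept, and detections are matched to ground-truth lines greedily and one-to-one, a matching rule the text does not specify.

```python
import math


def line_matches(det, gt, max_angle_deg=10.0, max_intercept_px=20.0):
    """True if a detected line is within 10 degrees and 20 pixels of a ground-truth line."""
    d_angle = abs(math.degrees(math.atan(det[0]) - math.atan(gt[0])))
    d_angle = min(d_angle, 180.0 - d_angle)  # lines are undirected
    return d_angle <= max_angle_deg and abs(det[1] - gt[1]) <= max_intercept_px


def line_level_rates(detections, ground_truth):
    """Return (TP rate, FP rate), both normalized by the number of ground-truth lines."""
    matched = set()
    tp = fp = 0
    for det in detections:
        # Greedy one-to-one matching (an assumption of this sketch)
        hit = next((i for i, gt in enumerate(ground_truth)
                    if i not in matched and line_matches(det, gt)), None)
        if hit is None:
            fp += 1
        else:
            matched.add(hit)
            tp += 1
    n_truth = len(ground_truth)  # equals TP + FN
    if n_truth == 0:
        raise ValueError("at least one ground-truth line is required")
    return tp / n_truth, fp / n_truth
```

Because false positives are also divided by the number of ground-truth lines, the false positive rate defined this way is not bounded by 1.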
It is worth mentioning that the proposed algorithm consumes only about 2.16% of the time of Kasturi's. However, it still needs about 1 s per image, which is insufficient for real-time processing. One way to relieve this shortcoming is to use a more powerful computer.

3.4. Discussion

Certainly, the proposed method has its weak points. From the results shown in Table 5, it can be seen that the false positive rate
Fig. 9. The ROC curves of power line detection performance with two datasets, comparing MF-FDOG method and the proposed method. (Horizontal axis: false positive rate; vertical axis: true positive rate.)
reaches up to 38.96%, which is far from ideal. In our analysis, there may be two reasons:

Algorithm: In the local criterion of the proposed algorithm, over-detection is adopted to ensure low false negatives, because the damage caused by false negatives may be unacceptable.
Data: Various line structures in images are strongly confusing, which makes the detection much more challenging. In particular, images from urban settings are especially difficult because of the interference of buildings' edges. Two examples are shown in Fig. 10.
4. Conclusions

A sequential local-to-global power line detection method is proposed in this paper, by which not only straight power lines but also curved ones can be detected. In the local criterion, a line segment pool is obtained via over-detection, which is achieved by a problem-specific edge detector followed by filtering of the edge map image. In the global criterion, a unified framework based on a graph-cut model is developed to group the straight or curved line segments into whole power lines. Experimental results
Fig. 10. The example of images with high false positives.
qualitatively and quantitatively demonstrate that the proposed power line detection method is effective. Of course, the proposed method is by no means perfect. When there is strong interference from various kinds of line structures, especially the edges of buildings, the method cannot intelligently discriminate the power lines from the interference. A prospective solution is to introduce more features of the power lines, for example the spatial relationship between them and the wire towers.
Acknowledgments

This work is supported by the National Basic Research Program of China (973 Program) (Grant no. 2011CB707000), by the National Natural Science Foundation of China (Grant nos. 61125106 and 91120302), and by Shaanxi Key Innovation Team of Science and Technology (Grant no. 2012KCT-04). The authors acknowledge Prof. P. Yan for having proofread the first version of this paper.
References

[1] B. Bhanu, S. Das, B. Roberts, D. Duncan, A system for obstacle detection during rotorcraft low altitude flight, IEEE Trans. Aerosp. Electron. Syst. 32 (1996) 875–897.
[2] T. Gandhi, M. Yang, R. Kasturi, O. Camps, L. Coraor, J. McCandless, Detection of obstacles in the flight path of an aircraft, IEEE Trans. Aerosp. Electron. Syst. 39 (2003) 176–191.
[3] J. Byrne, M. Cosgrove, R.K. Mehra, Stereo based obstacle detection for an unmanned air vehicle, in: International Conference on Robotics and Automation, pp. 2830–2835.
[4] E. Hanna, P. Straznicky, R. Goubran, Obstacle detection for low flying unmanned aerial vehicles using stereoscopic imaging, in: Instrumentation and Measurement Technology Conference Proceedings, pp. 113–118.
[5] P. Avizonis, B. Barron, Low cost wire detection system, in: Proceedings of Digital Avionics Systems Conference, vol. 1, pp. 3.C.3–1–3.C.3–4.
[6] 〈http://www.spacemart.com/reports/enstrom_to_use_safe_flight_power_line_unit_999.html〉.
[7] L. Ma, Y. Chen, Aerial Surveillance System for Overhead Power Line Inspection, Center for Self-Organizing and Intelligent Systems (CSOIS), Utah State University, Logan, Technical Report, 2004.
[8] K. Yamamoto, K. Yamada, Analysis of the infrared images to detect power lines, in: Proceedings of Conference on Speech and Image Technologies for Computing and Telecommunications, vol. 1, pp. 343–346.
[9] H. Essen, S. Boehmsdorff, G. Biegel, A. Wahlen, On the scattering mechanism of power lines at millimeter-waves, IEEE Trans. Geosci. Remote Sens. 40 (2002) 1895–1903.
[10] P. Garcia-Pardo, G. Sukhatme, J. Montgomery, Towards vision-based safe landing for an autonomous helicopter, Robot. Auton. Syst. 38 (2002) 19–29.
[11] 〈http://www.safeflight.com/mmain.php?px=1&cm=12&cs=92〉.
[12] I. Golightly, D. Jones, Visual control of an unmanned aerial vehicle for power line inspection, in: Proceedings of International Conference on Advanced Robotics, pp. 288–295.
[13] G. Yan, C. Li, G. Zhou, W. Zhang, X. Li, Automatic extraction of power lines from aerial images, Geosci. Remote Sens. Lett. 4 (2007) 387–391.
[14] Z. Li, Y. Liu, R. Hayward, J. Zhang, J. Cai, Knowledge-based power line detection for UAV surveillance and inspection systems, in: International Conference on Image and Vision Computing New Zealand, pp. 1–6.
[15] Z. Li, Y. Liu, R. Walker, R. Hayward, J. Zhang, Towards automatic power line detection for a UAV surveillance system using pulse coupled neural filter and an improved Hough transform, Mach. Vis. Appl. 21 (2010) 677–686.
[16] R. Kasturi, O. Camps, Y. Huang, A. Narasimhamurthy, N. Pande, Wire Detection Algorithms for Navigation, NASA Technical Report, 2002.
[17] C. Steger, An unbiased detector of curvilinear structures, IEEE Trans. Pattern Anal. Mach. Intell. 20 (1998) 113–125.
[18] J. Candamo, R. Kasturi, D. Goldgof, S. Sarkar, Vision-based on-board collision avoidance system for aircraft navigation, in: Proceedings of SPIE, vol. 6230.
[19] J. Candamo, D. Goldgof, Wire detection in low-altitude, urban, and low-quality video frames, in: International Conference on Pattern Recognition, pp. 1–4.
[20] J. Candamo, R. Kasturi, D. Goldgof, S. Sarkar, Detection of thin lines using low-quality video from low-altitude aircraft in urban settings, IEEE Trans. Aerosp. Electron. Syst. 45 (2009) 937–949.
[21] J. Candamo, D. Goldgof, R. Kasturi, S. Godavarthy, Detecting wires in cluttered urban scenes using a Gaussian model, in: International Conference on Pattern Recognition, pp. 432–435.
[22] S. Chaudhuri, S. Chatterjee, N. Katz, M. Nelson, M. Goldbaum, Detection of blood vessels in retinal images using two-dimensional matched filters, IEEE Trans. Med. Imaging 8 (1989) 263–269.
[23] B. Zhang, L. Zhang, L. Zhang, F. Karray, Retinal vessel extraction by matched filter with first-order derivative of Gaussian, Comput. Biol. Med. 40 (2010) 438–445.
[24] A. del Toro-Almenares, C. Mihai, I. Vanhamel, H. Sahli, Graph cuts approach to MRF based linear feature extraction in satellite images, in: Progress in Pattern Recognition, Image Analysis and Applications, pp. 162–171.
[25] C. Poullis, S. You, U. Neumann, Linear feature extraction using perceptual grouping and graph-cuts, in: Proceedings of the ACM International Symposium on Advances in Geographic Information Systems.
[26] D. Xiang, J. Tian, K. Deng, X. Zhang, F. Yang, X. Wan, Retinal vessel extraction by combining radial symmetry transform and iterated graph cuts, in: International Conference on Engineering in Medicine and Biology Society, pp. 3950–3953.
[27] J. Noble, B. Dawant, A new approach for tubular structure modeling and segmentation using graph-based techniques, in: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 305–312.
[28] N. Honnorat, R. Vaillant, N. Paragios, Graph-based geometric-iconic guidewire tracking, in: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 9–16.
[29] W. Cheng, Z. Song, Power pole detection based on graph cut, in: Congress on Image and Signal Processing, vol. 3, pp. 720–724.
[30] J. Shi, J. Malik, Normalized cuts and image segmentation, IEEE Trans. Pattern Anal. Mach. Intell. 22 (2000) 888–905.
[31] H. Lütkepohl, Handbook of Matrices, John Wiley & Sons, 1996.
[32] 〈http://marathon.cse.usf.edu/dataandcode.php〉.
[33] D. Schmieder, M. Weathersby, Detection performance in clutter with variable resolution, IEEE Trans. Aerosp. Electron. Syst. 19 (1983) 622–630.
[34] R. Cuijpers, A. Kappers, J. Koenderink, Visual perception of collinearity, Atten. Percept. Psychophys. 64 (2002) 392–404.
Biqin Song is a Ph.D. candidate with the Center for OPTical IMagery Analysis and Learning (OPTIMAL), State Key Laboratory of Transient Optics and Photonics, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an, China. Her research interests include computer vision and pattern recognition.
Xuelong Li is a full professor with the Center for OPTical IMagery Analysis and Learning (OPTIMAL), State Key Laboratory of Transient Optics and Photonics, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an, China.