Pattern Recognition, Vol. 29, No. 12, pp. 2017-2023, 1996. Copyright © 1996 Pattern Recognition Society. Published by Elsevier Science Ltd. Printed in Great Britain. All rights reserved. 0031-3203/96 $15.00+.00
0031-3203(95)00168-9
A COMPUTATIONALLY EFFICIENT ALGORITHM FOR ENHANCING LINEAR FEATURES IN IMAGES

A. G. BOLTON,* S. F. BROWN† and W. MORAN†

* Microwave Radar Division, Defence Science and Technology Organisation, Salisbury SA 5108, Australia
† Department of Mathematics and Statistics, Flinders University, GPO Box 2100, Adelaide SA 5001, Australia

(Received 23 March 1995; received for publication 5 December 1995)
Abstract--We present a computationally efficient algorithm for highlighting long linear features in images. The algorithm is based on the recursive binary decomposition of the image into subimages that have been line enhanced along different directions. After a number of successive decompositions, the subimages are recombined to yield a line enhanced image. The performance of the algorithm is similar to that of rotating-kernel-type enhancement routines. However, the new algorithm can be executed much faster, making it ideal for use on large noisy images such as those provided by synthetic aperture radar. Copyright © 1996 Pattern Recognition Society. Published by Elsevier Science Ltd.

Digital image processing    Line detection    SAR images
1. INTRODUCTION
The enhancement of thin linear features is an important first step in many image processing applications, for example, the detection of roads and fences in remotely sensed images. The application that we are particularly concerned with is the rapid identification of roads in synthetic aperture radar (SAR) images, which contain a large amount of speckle noise and background features. Local enhancement operators, such as the application of small orientation selective masks to the image, are a fast and efficient method for enhancing images which have little or no noise present in them. Such methods include the Duda road operator, described by Fischler et al.,(1) and the mask operators described by Rosenfeld and Kak.(2) However, these techniques give an unacceptably large number of false detections when applied to images, such as SAR, that contain a significant amount of noise. The rotating kernel min-max transform (RKMT) discussed by Hou and Bamberger(3) enhances linear features in noisy images by convolving the image with a series of long narrow kernels, with major axes at regularly spaced angles. At each pixel in the image, the maximum and minimum values returned by the convolutions are stored and later used for identifying bright and dark linear features respectively. Skingley and Rye(4) and Sarabandi et al.(5) have used the Hough transform to detect thin lines in SAR imagery. The technique maps each image pixel to a curve in a parameter space, then selects points in parameter space with significantly large or small values; these correspond to bright and dark lines in the image respectively. Hall(6) uses a windowed Radon transform to enhance linear features in SAR images.
Both these methods work well but are very computationally intensive. In this paper we present a fast algorithm for highlighting linear features in noisy images. The basic algorithm is detailed in Section 2 and its performance is examined in Sections 3 and 4.

2. THE ALGORITHM
The algorithm consists of four passes over the image, with each pass comprising a number of iterations of the core algorithm. The basic algorithm enhances lines in a 45° range of orientations, so four passes are necessary to cover the complete range of line angles. The basic algorithm is as follows. Starting with the initial image, we replace it by an image of the same size composed of two parts. The first part consists of sums of vertical pairs of pixels. The second part consists of sums of pairs of diagonal elements. The two subimages are laid out so that the vertical pairs form vertical lines and the diagonal pairs form diagonal lines. The process is shown schematically in Fig. 1. Note that the vertical and diagonal sums can be calculated in parallel if desired. The process can now be repeated, starting with the previous output image. The new image will consist of four parts: the first consisting of vertical sums of groups of four pixels, the second of lines with a vertical distance of four pixels and a diagonal offset of one pixel, the third of lines with a vertical distance of four pixels and a diagonal offset of two pixels, and the fourth of sums of four diagonal pixels. The layout of the four input pixels which map to a pixel in each of the four sections of the second iteration output image is shown in Fig. 2. After a third iteration, the image consists of eight parts
describing lines ranging from eight vertically aligned pixels to eight diagonally aligned pixels. The iterations are repeated until the number of pixels summed is sufficient to give the required amount of noise cancellation, or until the length of the summed segments becomes comparable to the curvature of the lines in the image. The number of pixels summed along each segment is n = 2^m, where m is the number of iterations of the basic algorithm. The length of the summed segments is 2^m/cos θ, where θ is the smaller of the angles that the summed segment makes to the pixel axes. Due to the binary nature of the algorithm, and a desire to minimize edge effects, we require that the number of rows and columns in the image both be divisible by 2^m.

Fig. 1. A schematic diagram of the basic line enhancement algorithm showing how pairs of pixels in the input image are averaged and mapped onto the output image.

Fig. 2. A diagram showing how the input pixels are combined together to form each pixel of the four sections of a second iteration output image. The patterns (a) to (d) represent vertical sum followed by vertical sum, diagonal sum then vertical sum, vertical sum then diagonal sum, and diagonal sum followed by diagonal sum respectively.

The advantage of this binary decomposition scheme is speed of computation. The algorithm is fast because each iteration involves only the addition of pairs of array elements, and because the number of pixels summed increases as a power of two at each iteration. By comparison, for rotating kernel methods the number of convolutions increases linearly with the length of the summed segments. The storage requirements of the algorithm are constant for each iteration.

The main drawback of the binary decomposition method is that the enhancement of a given line depends on the alignment of the line with respect to the binary decomposition. For example, consider pattern (a) in Fig. 2 and an image that contains a set of vertical segments of length four pixels. Only one quarter of the segments in the image will be exactly overlaid by the mask. The remaining segments will be split between two adjacent pixels in the decomposed image, giving a much smaller output in the enhanced image. Figure 3 illustrates this effect. For this reason, the algorithm is only suitable for enhancing structures that are at least twice as long as the length of the summed segments (2^m pixels). There will be an error of approximately half this length in the location of the end points of the enhanced lines.

Fig. 3. A diagram showing how the pixels might be summed along a typical line, shown as gray in the image. The outlined rectangles are sets of four pixels that are summed to form one pixel in the decimated image. Note that the pixels in the decimated image that correspond to the ends of the line will have a smaller response than pixels corresponding to the central parts of the line.

Unless the lines of interest are all brighter or darker than the image background, it is necessary to further process the decomposed image to better enhance the lines. We do this by convolving each row of the decomposed image with a filter designed to enhance peaks above or below the mean value. Typically the filter will give positive pixels at the location of bright points, which correspond to bright lines in the original image, and negative pixels at the location of dark points, which correspond to dark lines. Background areas of the image are set to zero by this filtering. The filter can be tailored to give peak response to features that have the same width as anticipated for the lines in the image.

After the final decomposed and enhanced image has been obtained, it can be transformed back into a single line enhanced image using a transform similar to the initial decimation scheme. At each iteration of the inverse transform, we take the current image and use it to produce two output images. The first is formed by mapping each pixel of the vertically decimated subimage back to the corresponding pixels of the input image, and the second by repeating this process with the diagonally decimated subimage. The mapping can be deduced from Fig. 1 by reversing the direction of the arrows in the diagram. These two intermediate images are now combined to form an enhanced image as follows. In the case of an image which has only bright lines present, we combine the two images by performing a pixel by pixel comparison and retaining the largest value at each pixel. If the decomposed image has been filtered as described above, the enhanced input image is reconstructed by taking the pixel with the largest absolute value during the comparison.

Pixels within a distance of 2^m pixels of the boundary of the image will be contaminated by wrap-around effects that occur when the decimation sums are taken across the boundary of the image. These effects can be avoided by zero padding the image before the decomposition is performed.
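The decimation and recombination steps described above can be sketched in a few lines of array code. This is a minimal illustration rather than the authors' implementation: the exact subimage layout of Fig. 1, the wrap-around boundary handling, and the function names are assumptions.

```python
import numpy as np

def decimate_once(img):
    # One iteration of the binary decomposition. The output is the same
    # size as the input: the top half holds sums of vertical pixel pairs,
    # the bottom half sums of diagonal pairs. Wrap-around at the right
    # edge stands in for the boundary handling discussed in the text.
    h, w = img.shape
    assert h % 2 == 0, "rows must be divisible by 2^m"
    top = img[0::2, :] + img[1::2, :]                          # vertical pairs
    bottom = img[0::2, :] + np.roll(img[1::2, :], -1, axis=1)  # diagonal pairs
    return np.vstack([top, bottom])

def undecimate_max(dec, orig_shape):
    # One iteration of the inverse transform: map each decimated subimage
    # back onto the full grid (reversing the arrows of Fig. 1) and keep,
    # pixel by pixel, the value of largest absolute value, as described
    # for filtered images.
    h, w = orig_shape
    top, bottom = dec[: h // 2, :], dec[h // 2 :, :]
    a = np.empty((h, w))
    a[0::2, :], a[1::2, :] = top, top              # vertical pair members
    b = np.empty((h, w))
    b[0::2, :] = bottom
    b[1::2, :] = np.roll(bottom, 1, axis=1)        # undo the diagonal offset
    return np.where(np.abs(a) >= np.abs(b), a, b)
```

Iterating `decimate_once` m times sums 2^m pixels per segment while each iteration only adds pairs of array elements, which is the source of the speed advantage over rotating-kernel convolutions.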
3. THEORETICAL PERFORMANCE
The theoretical performance of the algorithm can be studied by making some simplifying assumptions. We assume that the image consists of a set of straight lines on a uniform background. The lines have a width of one pixel and their brightness is normally distributed with mean μ_l and standard deviation σ_l. The background has mean brightness μ_b and standard deviation σ_b. When the decimation operation is applied to the image we average n = 2^m pixels. The average of n background pixels will be μ_B = μ_b with standard deviation σ_B = σ_b/√n. The average of the pixels along a decimation mask that lies on a line will depend on how well the mask approximates the true shape of the line. We denote by f the fraction of the sum of the mask pixels which is contributed by the underlying line. The normalized sum of the pixels along the line mask is, therefore:

μ_L = f μ_l + (1 − f) μ_b,    (1)
which has a standard deviation of

σ_L = √(f σ_l² + (1 − f) σ_b²) / √n.    (2)

Note that the given expressions for the standard deviations are only correct when the pixels are independent. In textured regions of a real image, where there are significant correlations between neighbouring pixels, it is necessary to take the pixel correlations into account when calculating the standard deviations.

We assume that the lines are brighter than the background and set a threshold for detection of lines in the decimated image. The threshold is set at:

T = μ_B + α σ_B,    (3)

where the parameter α controls the number of false detections. The probability that a line segment will be detected is assumed to be P(X ≥ T), where X is a random variable drawn from a normal distribution with mean μ_L and standard deviation σ_L. This gives:

P(X ≥ T) = 1 − Φ((T − μ_L)/σ_L)    (4)

= 1 − Φ((α σ_b − f √n (μ_l − μ_b)) / √(f σ_l² + (1 − f) σ_b²)),    (5)
where Φ is the standardized cumulative distribution function. At a given threshold level, the probability depends on the contrast between the line and background, (μ_l − μ_b), the standard deviations of the line and background pixels, σ_l and σ_b, the number of iterations of the decimation algorithm, m = log₂ n, and the match between the decimation mask and the line, f.

To further simplify the calculation, we assume that the standard deviations of the line and background have the same value, σ, and tabulate results in terms of the normalized line contrast, (μ_l − μ_b)/σ. Figure 4 shows contours of probability for different mask angles and normalized contrasts. It was calculated with m = 5 and α = 4.265, which corresponds to a false alarm rate of 10⁻⁵. The match between the decimation masks and the lines was calculated by overlaying each mask on a line which passed exactly through the centres of the two pixels at each end of the mask. Lines aligned along the axes have the largest probability of being detected, but the probability is only a weak function of angle. All lines with a normalized contrast greater than 1.2 have a high probability of being detected.

Fig. 4. Contours of probability for detection of lines at different angles to the axis and different normalized contrasts. This calculation assumes the best possible match between the decimation masks and the lines.

Figure 5 shows another probability map for the same parameters m and α as before. However, this time the line centres fall between adjacent, parallel pixel masks. As expected, the probability of detecting faint lines deteriorates when the lines are not optimally aligned on the image grid, particularly for lines parallel to the image axes. None the less, the probability of detection remains high for all lines with a normalized contrast of more than two. For lines with widths greater than one pixel, the degradation between optimal alignment and worst case alignment will be less pronounced.

Fig. 5. Contours of probability for detection of lines at different angles to the axis and different normalized contrasts. This calculation assumes a poor match between the decimation masks and the lines.

These theoretical results were checked with simulations using simple test images. In each simulation, a single line with a normalized contrast of 1.2 was overlaid on a 512 × 512 image and the image was decomposed five times, thresholded, then reconstructed. The threshold level was set at α = 4.265. Each line spans 14 pixels in the decomposed image, so an optimal reconstruction will consist of 14 segments lying along the line. The expected number of false alarm (background) segments is 10.5. The angle that the line made to the vertical axis was varied from 0 to 45° in 5° steps. Five simulations were performed for each angle, and the line was shifted horizontally by 0.125 pixels between each simulation, giving a total displacement of half a pixel width over the five simulations. The averaged results for each angle are summarized in Table 1.

Table 1. Detected segments in the enhanced simulation images. The columns are as follows: θ is the angle the line makes to the axis, n_l is the average number of line segments detected, n_l^pred1 and n_l^pred2 are the best and worst case predictions for the number of detected line segments respectively, n_b is the number of background segments detected, n_b^mod is the number of background segments detected excluding segments adjacent and approximately parallel to the detected line segments, and n_b^pred is the predicted number of background segments.

θ      n_l     n_l^pred1   n_l^pred2   n_b     n_b^mod   n_b^pred
0      10.8    13.9        2.7         19.0    9.8       10.5
5      12.4    10.6        2.5         36.0    13.8      10.5
10     11.6    10.1        2.6         27.8    11.8      10.5
15     11.6    11.3        3.0         30.8    10.6      10.5
20     11.2    10.6        3.1         25.8    10.0      10.5
25     12.4    11.6        3.8         37.4    10.8      10.5
30     12.2    11.1        4.1         40.4    12.6      10.5
35     11.8    11.4        4.8         41.6    11.2      10.5
40     12.4    11.9        5.9         46.4    10.4      10.5
45     13.4    13.6        9.1         52.8    11.6      10.5

The most striking result from the simulations is that the performance of the enhancement algorithm is comparable to, and for some angles exceeds, the best predicted performance. This performance increase occurs because there are many adjacent and almost parallel segments along each section of the line. The probability that one of these partially overlapping segments is detected is significantly greater than the probability of a particular one being detected. This phenomenon also affects the number of background detections. The average number of false alarms is approximately four times larger than the expected rate. However, if we ignore the multiple detections of segments of the line, false alarm detections occur at a rate only slightly higher than predicted. This small increase is caused by the presence of the line, which increases the probability that segments which cross the line will be detected.

Fig. 6. SAR image of a rural scene.

4. APPLICATION TO SAR IMAGES
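As a concrete check of the Section 3 analysis, the detection probability of equation (5) is straightforward to evaluate numerically. The sketch below assumes σ_l = σ_b = 1, so its contrast argument is the normalized contrast (μ_l − μ_b)/σ; the function names are illustrative only.

```python
import math

def phi(x):
    # Standard normal cumulative distribution function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def detection_probability(contrast, f, m, alpha):
    # P(X >= T) from equation (5), assuming sigma_l = sigma_b = 1 so that
    # `contrast` is the normalized contrast (mu_l - mu_b)/sigma.
    n = 2 ** m                                   # pixels summed per segment
    numer = alpha - f * math.sqrt(n) * contrast  # alpha*sigma_b - f*sqrt(n)*(mu_l - mu_b)
    denom = math.sqrt(f + (1.0 - f))             # sqrt(f*sigma_l^2 + (1-f)*sigma_b^2), = 1 here
    return 1.0 - phi(numer / denom)

# The false alarm rate is the tail probability at the threshold itself:
false_alarm = 1.0 - phi(4.265)                   # approximately 1e-5 for alpha = 4.265
```

With m = 5, α = 4.265 and a perfect mask match (f = 1), a normalized contrast of 1.2 already gives a detection probability above 0.99, in line with the contours of Fig. 4.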
The performance of the decimation on real data is illustrated by processing the image shown in Fig. 6.
This image is a 512 x 512 pixel section of a four look SAR image of a rural area taken with the INGARA radar. The instrument and data processing used to form the image are described by Stacy et al.(7) The pixels are 3 x 3 m and the intensity is given on a decibel (log) scale. The scene consists of a major road with parallel fences and two minor tracks near the bottom of the image. The line enhanced image is shown in Fig. 7. It has been decimated five times before being filtered to enhance line-like structures and then reconstructed. The first feature to note in the enhanced image is the "blocky" nature of the line segments. This effect is caused by the discrete nature of the decimated image and is an intrinsic part of the algorithm's output. If desired, the output can be smoothed to remove some of these discretization effects. The decimation and line filtering have readily enhanced the main road and the track in the lower right of the image. The details of the parallel features making up the main road are preserved in the enhanced image.
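The filtering step applied between the decomposition and the reconstruction can be sketched as a row-wise convolution with a zero-mean kernel. The paper does not give its filter coefficients, so the difference-of-boxes kernel and the function name below are assumptions, chosen only to illustrate the behaviour described in Section 2.

```python
import numpy as np

def peak_filter_rows(dec, width=1):
    # Convolve each row of the decomposed image with a zero-mean kernel:
    # a positive centre matching the anticipated line width and a negative
    # surround. Bright points map to positive values, dark points to
    # negative values, and flat background to (approximately) zero.
    surround = 2 * width
    kernel = np.concatenate([
        -np.ones(surround) / (2 * surround),   # left negative lobe
        np.ones(width) / width,                # positive centre
        -np.ones(surround) / (2 * surround),   # right negative lobe
    ])
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, dec)
```

Because the kernel sums to zero, uniform background regions are suppressed, and the reconstruction can then combine subimages by largest absolute value so that both bright and dark lines survive.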
Fig. 7. Line enhanced version of Fig. 6.
The track in the lower left part of the image is less well defined in the initial image and is correspondingly less well defined in the enhanced image. The shadow of the tower, the bright object visible on the left side of the main road, is significantly enhanced by the algorithm. The only section of the enhanced image showing a significant amount of noise is the lower right section, which corresponds to an area of trees and shadows in the original image. As the results in Section 3 show, the large brightness variations in this region increase the probability of false detections in the enhanced image. Increasing the order of the decimation process to six gives more noise cancellation than in the example shown, but the length of the pixel masks (64 pixels) is so large that the enhanced image is distorted at the curved roads in the image. Conversely, decreasing the order of the decimation process to four improves the tracking of the curves in the image but results in a noisier enhanced image.

5. DISCUSSION

We have shown that the binary decimation algorithm is an efficient technique for rapidly highlighting linear features in large, noisy images. By averaging over a large number of image pixels, the algorithm is more tolerant of noise in the image than mask operators such as the Duda road operator. The algorithm is slightly less efficient at detecting lines than other techniques which average over a large number of pixels, such as the Hough transform or the RKMT method, but it can be implemented much more efficiently than these methods, making it ideal for processing large images. The major drawback of this algorithm is the segmented nature of the enhanced image. Rather than
false detections consisting of random pixels in the enhanced image, they consist of pixels lying along random line segments. It is therefore more difficult to postprocess the enhanced image to remove the false detections. Future work will be directed at the automatic extraction of linked line segments from the decimated and thresholded images.

Acknowledgement--The authors would like to thank the
INGARA team for the provision of the SAR images used to develop and test the algorithm.
REFERENCES
1. M. A. Fischler, J. M. Tenenbaum and H. C. Wolf, Detection of roads and linear structures in low-resolution aerial imagery using a multisource knowledge integration technique, Comput. Graphics Image Process. 15, 201-223 (1981).
2. A. Rosenfeld and A. C. Kak, Digital Picture Processing, 2nd edn. Academic Press, New York (1982).
3. J. Hou and R. H. Bamberger, Orientation selective operators for ridge, valley, edge and line detection in imagery, Proc. ICASSP V, 25-28 (1994).
4. J. Skingley and A. J. Rye, The Hough transform applied to SAR images for thin line detection, Pattern Recognition Lett. 6, 61-67 (1987).
5. K. Sarabandi, L. Pierce, Y. Oh and F. T. Ulaby, Power lines: Radar measurements and detection algorithm for polarimetric images, IEEE Trans. Aerospace Electron. Syst. 30, 632-643 (1994).
6. R. E. Hall, Detection and extraction of quasi-linear features in SAR images using a line segment image transform and inverse transform, Proc. SPIE 1875, Ultrahigh Resolution Radar, 77-83 (1993).
7. N. J. S. Stacy, M. P. Burgess, J. J. Douglass, M. R. Muller and M. Robinson, A real time processor for the Australian synthetic aperture radar, Proc. ICASSP V, 193-196 (1994).
About the Author--ALAN BOLTON is a Senior Research Scientist in the Microwave Radar Division of the Australian Defence Science and Technology Organisation. His main research interests are in the field of signal processing.
About the Author--STEPHEN BROWN received his doctorate in Applied Mathematics from Sydney
University in 1994. He is currently working as a Postdoctoral Research Fellow at Flinders University. His main research interests are image processing, inverse methods and neural networks.
About the Author--WILLIAM MORAN is Professor of Mathematics at Flinders University and leader of
the Analytical Techniques and Medical Diagnostics programs at the Cooperative Research Centre for Sensor Signal and Information Processing (CSSIP).