Pattern Recognition Letters ELSEVIER
Pattern Recognition Letters 16 (1995) 1277-1286
Boundary detection using mathematical morphology *
Jun Yang, Xiaobo Li *
Computing Science Department, University of Alberta, Edmonton, Alberta, Canada T6G 2H1
Received 25 November 1994; revised 5 April 1995
Abstract
Object boundaries contain important shape information in an image. Mathematical morphology is shape sensitive and can be used in boundary detection. In this paper, we propose dynamic mathematical morphology which only operates on the parts of interest in an image and reacts to certain characteristics of the region. The next position of the structuring element is dynamically selected at each step of the operation. The technique is used to detect object boundaries and has produced encouraging results.
Keywords: Shape information; Boundary detection; Dynamic mathematical morphology; Roll dilation
1. Introduction
The closed external boundary is an essential feature of an object. It conveys the shape of the object and can be used to extract the object from an image. Boundary detection often serves as an early stage in image understanding and other high level computer vision applications. Ballard and Brown (1982) summarized some basic boundary detection methods based on the Hough transform, graph searching, dynamic programming and contour following. In recent years, more algorithms have been proposed and many of them concentrate on regular geometric shapes (e.g., Manjunath and Chellappa, 1991; Cooper et al., 1993). Different techniques such as neural networks (Huang and Chen, 1991) and fuzzy clustering (Dave, 1992) are employed to tackle the problem. These methods usually involve complex computations even when the objects in the image are simple, and only a few of them deal with multiple objects. While most methods handle static images, Dubuisson and Jain (1993) proposed an algorithm to process moving objects.

Boundary detection algorithms based on active contour models or snakes have been developed in recent years (e.g., Kass et al., 1987; Williams and Shah, 1992). An active contour is a deformable line modified by internal "forces" from the geometry of the curve and image forces due primarily to the intensity gradient. Active contour methods iteratively compute energy equations and require an initial contour to begin with. The object position and basic shape should be known beforehand. Most other boundary detection algorithms use edge elements and their performance is highly dependent on edge quality. Broken edges create difficulties for many detection routines.

This paper proposes a boundary detection algorithm based on morphology. Section 2 defines dynamic morphological operations. Section 3 gives a detailed description of the proposed algorithm. Section 4 presents some experimental results and analysis, and conclusions are discussed in Section 5.

* This research is supported in part by the Canadian Natural Sciences and Engineering Research Council under Grant OGP9198.
* Corresponding author. Email: [email protected]
Elsevier Science B.V. SSDI 0167-8655(95)00082-8
J. Jang, X. Li / Pattern Recognition Letters 16 (1995) 1277-1286
2. Mathematical morphology
Mathematical morphology plays an important role in computer vision because it deals directly with the geometrical properties of objects. This quality makes it an efficient tool for extracting shape information from an image. We modify the traditional binary morphology definitions so that they can provide more power in shape extraction.
2.1. Traditional morphology
Basic morphological operations include dilation, erosion, opening and closing. In this paper the notations of traditional binary morphology introduced by Haralick et al. (1987) are used. Binary dilation combines two sets using vector addition of the set elements. We use $E^N$ to denote Euclidean N-dimensional space. If $F$ and $S$ are sets in $E^N$ and $f$ and $s$ are their elements, $f = (f_1, \ldots, f_N)$ and $s = (s_1, \ldots, s_N)$, then traditional dilation of $F$ by $S$ is defined as

$$F \oplus S = \{x \mid x \in E^N,\ x = f + s \text{ for some } f \in F \text{ and } s \in S\}.$$

In image processing $N = 2$, $F$ is an image and $S$ is a structuring element. Similarly traditional binary erosion of $F$ by $S$ is defined as

$$F \ominus S = \{x \mid x \in E^N \text{ and } x + s \in F\ \forall s \in S\}.$$

2.2. Proposed dynamic morphology
Since traditional morphology operates on all pixels of an image, it may waste computation time as well as degrade the resulting precision. To solve these problems, we propose dynamic morphological operations which only deal with the pixels of interest and react to certain features of the image. Using the same notations as traditional dilation, dynamic dilation of $F$ by $S$ under a dynamic condition $C$ is defined as

$$F \oplus_C S = \bigcup_i \{x \mid x \in E^N,\ x = o + s\ \forall o \in O_i \text{ and } s \in S,\ O_{i+1} = Y_C(O_i, F, S)\}.$$

Condition $C$ is a control mechanism which is used to direct the dilate operation. It is composed of constraints on the selection of the position for $S$. Set $O_i$ contains all valid positions of the structuring element origin at the $i$th step. $Y_C$ is a function based on condition $C$. It calculates the next position set $O_{i+1}$ from $O_i$ and the relevant information in $F$ under the restriction of $C$. Set $O_0$ contains the starting position of the dilation. During the dilation process, condition $C$ is tested at each potential position of the structuring element origin. The result of the test is used to determine the movement of the structuring element along the pixels with certain properties. Similar to dynamic dilation, dynamic erosion of $F$ by $S$ under condition $C$ is defined as

$$F \ominus_C S = \bigcup_i \{x \mid x \in O_i \subset E^N,\ x + s \in F\ \forall s \in S,\ O_{i+1} = Y_C(O_i, F, S)\}.$$

Based on the definitions of dynamic dilation and dynamic erosion, dynamic opening and dynamic closing are defined as

$$F \circ_C S = (F \ominus_C S) \oplus_C S \quad\text{and}\quad F \bullet_C S = (F \oplus_C S) \ominus_C S,$$

respectively. Dynamic morphological operations on grey-level images can be defined similarly. Using these definitions, the application of morphological operations can be restricted to the parts of interest in an image. This may improve the result while reducing the computational expense. By devising different conditions, dynamic morphology can be employed in various applications.
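The definitions of Section 2 can be illustrated with a minimal Python sketch in which images and structuring elements are sets of (row, column) tuples. The function names, the `next_positions` callback standing in for $Y_C$, the `max_steps` safety cap and the toy condition `step_right` are assumptions made for illustration; they are not part of the original formulation.

```python
from typing import Callable, Set, Tuple

Point = Tuple[int, int]
PointSet = Set[Point]


def dilate(F: PointSet, S: PointSet) -> PointSet:
    """Traditional dilation: all vector sums f + s."""
    return {(f[0] + s[0], f[1] + s[1]) for f in F for s in S}


def erode(F: PointSet, S: PointSet) -> PointSet:
    """Traditional erosion: points x with x + s in F for every s in S.

    Assumes S is non-empty. Only points of the form f - s can qualify,
    so these are the only candidates that are tested.
    """
    candidates = {(f[0] - s[0], f[1] - s[1]) for f in F for s in S}
    return {x for x in candidates
            if all((x[0] + s[0], x[1] + s[1]) in F for s in S)}


def dynamic_dilate(F: PointSet, S: PointSet, O0: PointSet,
                   next_positions: Callable[[PointSet, PointSet, PointSet], PointSet],
                   max_steps: int = 1000) -> PointSet:
    """Dynamic dilation: union over i of S placed at every origin in O_i.

    next_positions plays the role of Y_C; max_steps is a safety cap
    added for this sketch only.
    """
    result: PointSet = set()
    O = set(O0)
    for _ in range(max_steps):
        if not O:
            break
        result |= dilate(O, S)
        O = next_positions(O, F, S)
    return result


def step_right(O: PointSet, F: PointSet, S: PointSet) -> PointSet:
    """Toy condition C: move one pixel right while the pixel below is in F."""
    return {(r, c + 1) for (r, c) in O if (r + 1, c + 1) in F}


# A horizontal line of four pixels in row 1; the origin walks along row 0 above it.
line = {(1, c) for c in range(4)}
coat = dynamic_dilate(line, {(0, 0)}, {(0, 0)}, step_right)
```

Only the positions selected by the condition are ever visited, which is the point of the dynamic definitions: the cost of the operation depends on the length of the path rather than on the size of the image.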
2.3. Roll dilation
We propose a powerful dynamic morphological operation, roll dilation, and apply it to object boundary detection. It operates on the edge segments of an object. During the operation, a structuring element "rolls" along the boundaries of the object to form an external coat. The boundary of an object usually has gaps between edge segments. If the largest gap on the object boundary is smaller than the structuring element, the gaps can be filled by the structuring element and thus a smooth and connected boundary of the object is produced.

To begin the discussion, we define condition C for roll dilation as: the structuring element remains outside of the object and touches some edge pixels. This condition can make the structuring element touch boundary edges of the object, but cannot prevent it from staying at a point or rolling around a small edge fragment. To keep the structuring element moving along the object boundary, a more precise condition is needed. The new condition employs a cost function which is based on an age counter associated with each edge pixel and a history record for each non-edge pixel.

Suppose the original image $F$ is defined on $G = \{(i, j) \mid i = 1, \ldots, M;\ j = 1, \ldots, N\}$ with height $M$ and width $N$. After edge detection, pixels with edge strength $e$ greater than a threshold $T_e$ will form a set $E$,

$$E = \{x \mid x \in G \text{ and } e(x) > T_e\}.$$

Suppose $N_x$ is the set of neighbors of point $x$. Structuring element $S$ consists of two parts:

$$S_I = \{s \mid s \in S \text{ and } N_s \subset S\}$$

and

$$S_B = S - S_I.$$

In other words, $S_I$ is the internal part and $S_B$ is the border of $S$. For each edge pixel $x \in E$, an age counter $A(x)$ is defined as the number of times it has been touched by $S_B$ in the operation. For each non-edge pixel $x$, a history counter $H(x)$ is defined as the number of times that the origin of $S$ has resided at $x$. After each dilation, $S$ needs to be moved from the current position $o$ to the next position, which is selected from set $L_o$:

$$L_o = \{p \mid p \in N_o \cap G,\ (S_B + p) \cap E \neq \emptyset \text{ and } (S_I + p) \cap E = \emptyset\}.$$

This means that a new position for the origin of $S$ must satisfy the following three conditions.
1. This point is a neighbor of $o$.
2. When $S$ resides at this point, $S_B$ touches at least one edge pixel.
3. When $S$ resides at this point, $S_I$ does not cover any edge pixels.

To choose the next position of $o$, a formula is designed to calculate the cost of moving to each candidate and the one with the lowest cost will be selected. The general rule for the cost function is that it should reflect the degree of wear of a candidate position. The more a position is worn, the higher its cost will be. The degree of "wear" depends on the age of each edge pixel being touched and the frequency with which this position has been used as the origin of $S$. Thus the cost function is quantitatively defined by three factors: the total age of the touched edge pixels, the number of new edge pixels among the touched ones, and the history of the position. At each candidate $p \in L_o$, the total age of the edge pixels that would be touched by $S_B$ is calculated as

$$a_p = \sum_{x \in E \cap (S_B + p)} A(x).$$

This is the basic cost of position $p$. Of the touched edge pixels, each new one is given a bonus to reduce the cost by $b$, which is a constant associated with $S$. In this discussion "$\|\cdot\|$" is used to represent the size of a set and $b$ can be expressed as $b = \|S\|$. The cost is then adjusted to

$$c'_p = a_p - bk + bl$$

where $k$ is the number of new edge pixels touched by $S_B$ and $l$ is the size of $S_B$, that is,

$$k = \|\{x \mid x \in E \cap (S_B + p) \text{ and } A(x) = 0\}\|$$

and

$$l = \|S_B\|.$$

This guarantees that a lower cost is assigned to a position which lets $S_B$ touch more new edge pixels. The term $bl$ is used to shift the cost so that $c'_p \ge 0$. If the position has never been used as the origin of $S$ before, the corresponding cost is divided by $b$. Otherwise a penalty $bH(p)$ will be added to the cost. The aim of doing this is to encourage $S$ to move to the least used position. When all candidate positions have the same $c'_p$, this adjustment will assign the lowest cost to the least used one. Finally the cost of selecting $p$ is calculated as

$$c_p = \begin{cases} c'_p / b & \text{if } H(p) = 0, \\ c'_p + bH(p) & \text{if } H(p) > 0. \end{cases}$$
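The candidate set $L_o$ and the cost $c_p$ can be sketched in Python as follows. Several choices here are assumptions rather than statements from the text: coordinates are 0-indexed (row, column) tuples, the neighborhood $N_s$ used to split $S$ is taken to be 4-connected (this choice reproduces the values b = 12 and l = 8 quoted in the legend of Fig. 1 for a 4 x 4 element without corners), the candidate neighborhood $N_o$ is 8-connected, and the origin of $S$ is placed on one of its interior pixels.

```python
from typing import Dict, Set, Tuple

Point = Tuple[int, int]

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def split_element(S: Set[Point]) -> Tuple[Set[Point], Set[Point]]:
    """Split S into its internal part S_I and its border S_B = S - S_I."""
    S_I = {s for s in S if all((s[0] + dr, s[1] + dc) in S for dr, dc in N4)}
    return S_I, S - S_I


def shift(P: Set[Point], p: Point) -> Set[Point]:
    """Translate the point set P by the vector p."""
    return {(q[0] + p[0], q[1] + p[1]) for q in P}


def candidates(o: Point, S_I: Set[Point], S_B: Set[Point], E: Set[Point],
               M: int, N: int) -> Set[Point]:
    """Set L_o: neighbors p of o inside the M x N grid such that the border
    of S at p touches an edge pixel and the interior of S covers none."""
    L = set()
    for dr, dc in N8:
        p = (o[0] + dr, o[1] + dc)
        if not (0 <= p[0] < M and 0 <= p[1] < N):
            continue
        if shift(S_B, p) & E and not shift(S_I, p) & E:
            L.add(p)
    return L


def cost(p: Point, S: Set[Point], S_B: Set[Point], E: Set[Point],
         A: Dict[Point, int], H: Dict[Point, int]) -> float:
    """Cost c_p of moving the origin of S to p (missing counters count as 0)."""
    b = len(S)                                        # b = ||S||
    l = len(S_B)                                      # l = ||S_B||
    touched = shift(S_B, p) & E
    a_p = sum(A.get(x, 0) for x in touched)           # total age
    k = sum(1 for x in touched if A.get(x, 0) == 0)   # new edge pixels
    c_adj = a_p - b * k + b * l                       # c'_p >= 0 because k <= l
    if H.get(p, 0) == 0:
        return c_adj / b
    return c_adj + b * H[p]


# 4 x 4 element without its corners, origin on an interior pixel.
S = {(r, c) for r in range(-1, 3) for c in range(-1, 3)}
S -= {(-1, -1), (-1, 2), (2, -1), (2, 2)}
S_I, S_B = split_element(S)
E = {(5, 5)}                                          # a single edge pixel
fresh = cost((6, 5), S, S_B, E, {}, {})
worn = cost((6, 5), S, S_B, E, {(5, 5): 2}, {(6, 5): 1})
```

With these numbers the unused position touching one new edge pixel costs (0 - 12 + 96) / 12 = 7, while the same position after the pixel has age 2 and the origin has been there once costs 2 + 96 + 12 = 110, which shows how strongly the penalty discourages revisiting.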
Among all the candidates, point $p_s$ with the lowest cost will be selected as the next position of $o$. Explicitly, it has to satisfy

$$p_s \in L_o \quad\text{and}\quad c_{p_s} \le c_q\ \forall q \in L_o.$$

After $S$ is moved to $p_s$, $p_s$ becomes $o$ and the age of each edge pixel touched by $S_B$ and the history of the new position are updated by

$$A(x) = A(x) + 1\ \forall x \in (S_B + o) \cap E \quad\text{and}\quad H(o) = H(o) + 1,$$

respectively. This selection procedure favours new positions and $S$ would touch as many new edge pixels as possible. If $S$ comes to a margin of the image, it will turn around and continue in the opposite direction. The roll dilation process stops when either the structuring element returns to the start point or it reaches image margins two times. For each new position $p$ of the structuring element origin $o$, the complete condition C has three parts and is summarized as:
$C_1$: $p \in N_o \cap G$.
$C_2$: $(S_B + p) \cap E \neq \emptyset$ and $(S_I + p) \cap E = \emptyset$.
$C_3$: $c_p \le c_q\ \forall q \in L_o$.
Thus the function $O_{i+1} = Y_C(O_i, F, S)$ for roll dilation can be detailed as

$$O_{i+1} = \{p \mid p \in L_o,\ o \in O_i \text{ and } c_p \le c_q\ \forall q \in L_o\}.$$

In roll dilation, $O_i$ contains a single point. For other dynamic morphological operators, $O_i$ may have multiple points. The set $O_0$ contains the start point, which can be determined by searching from an image corner.

Fig. 1 shows an example of cost calculation and new position selection in a roll dilation process. In this figure each edge pixel is represented by a square with the age inside. The square for each candidate position is divided into two rectangles showing the $H$ value in the top rectangle and the $c_p$ in the bottom one. The letter at the top-right corner of each image indicates the order in the computation sequence. The trace of the structuring element is shown by the grey area and the trace of the origin $o$ is indicated by the black squares in the last image. An example of roll dilation is shown in Fig. 2 where (a) contains a synthetic edge image of a fish and a disc shaped structuring element, (b) shows the roll dilation result and (c) is the inside contour of the roll dilation trace in (b). As can be seen from the figure, the roll dilation process has filled the gaps on the boundary of the object and produced a smooth and closed curve.

From the above analysis it can be concluded that roll dilation is a dynamic process. The age $A$ and the history $H$ are updated at each step. The path of the structuring element origin is determined by condition C, which can only be tested dynamically. Thus there is no sequence of traditional dilations or erosions which has the equivalent effect.
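Putting conditions $C_1$ to $C_3$ and the counter updates together, one possible reading of the roll dilation loop is sketched below in Python. This is a simplified illustration under stated assumptions, not the authors' implementation: coordinates are 0-indexed, $N_s$ is 4-connected and $N_o$ is 8-connected, ties in cost are broken by sorted order, the start position is counted as visited, the turn-around at image margins is omitted, and a hypothetical `max_steps` cap guarantees termination.

```python
from typing import Dict, List, Set, Tuple

Point = Tuple[int, int]
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def shift(P: Set[Point], p: Point) -> Set[Point]:
    """Translate the point set P by the vector p."""
    return {(q[0] + p[0], q[1] + p[1]) for q in P}


def roll_dilate(E: Set[Point], S: Set[Point], start: Point, M: int, N: int,
                max_steps: int = 500) -> Tuple[List[Point], Set[Point]]:
    """Roll S around the edge set E; return the origin trace and the coat."""
    S_I = {s for s in S if all((s[0] + dr, s[1] + dc) in S for dr, dc in N4)}
    S_B = S - S_I
    b, l = len(S), len(S_B)
    A: Dict[Point, int] = {x: 0 for x in E}   # age of every edge pixel
    H: Dict[Point, int] = {}                  # history of origin positions

    def visit(o: Point) -> None:
        for x in shift(S_B, o) & E:
            A[x] += 1
        H[o] = H.get(o, 0) + 1

    def cost(p: Point) -> float:
        touched = shift(S_B, p) & E
        k = sum(A[x] == 0 for x in touched)
        c_adj = sum(A[x] for x in touched) - b * k + b * l
        return c_adj / b if H.get(p, 0) == 0 else c_adj + b * H[p]

    o = start
    trace = [o]
    coat = shift(S, o)
    visit(o)                                  # assumption: start counts as used
    for _ in range(max_steps):
        L_o = [p for p in ((o[0] + dr, o[1] + dc) for dr, dc in N8)
               if 0 <= p[0] < M and 0 <= p[1] < N          # C1
               and shift(S_B, p) & E and not shift(S_I, p) & E]  # C2
        if not L_o:
            break
        o = min(sorted(L_o), key=cost)        # C3, ties broken by sorted order
        trace.append(o)
        coat |= shift(S, o)
        visit(o)
        if o == start:
            break
    return trace, coat


# 12-pixel element and a 6 x 6 square outline with a one-pixel gap in its top side.
S = {(r, c) for r in range(-1, 3) for c in range(-1, 3)}
S -= {(-1, -1), (-1, 2), (2, -1), (2, 2)}
E = {(r, c) for r in range(4, 10) for c in range(4, 10)
     if r in (4, 9) or c in (4, 9)} - {(4, 7)}
trace, coat = roll_dilate(E, S, (2, 5), 14, 14)
```

Every position in `trace` satisfies $C_1$ and $C_2$ by construction; how closely the path follows the outline depends on the element size relative to the gaps, as discussed in Section 4.2.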
3. Algorithm description
Here we develop a boundary detection algorithm on the basis of edge detection and roll dilation. The method can be used to detect boundaries of more than one object in an image. The background of the image should be simple to allow a reasonable separation between any two objects. The detected edges of each object may be broken. This algorithm is robust within a certain noise level. The whole process can be divided into three phases: edge detection, morphological grouping and boundary formulation.

3.1. Edge detection
Canny edge detection is applied to the input image generating an edge map. In our experiments all thresholds of the edge detection algorithm are fixed and changing these thresholds only slightly affects the result. Noise edges can cause deformation of the detected boundaries. If some scattered edges fill the area between different objects with a certain density, multiple objects may be detected as one object. Small edges near an object boundary can cause shape distortion of the object. To remove noise edges, a low pass filter is applied before edge detection and a thresholding operation is used thereafter.
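The order of operations in this phase (low pass filter, edge strength, threshold) can be sketched in plain Python. Note that the sketch below does not implement Canny edge detection: a 3 x 3 mean filter and a central-difference gradient magnitude are swapped in as simple stand-ins, and the function names and the threshold value are assumptions chosen for illustration.

```python
from typing import List, Set, Tuple

Image = List[List[float]]


def box_blur(img: Image) -> Image:
    """3 x 3 mean filter with border replication (a simple low pass filter)."""
    M, N = len(img), len(img[0])
    out = [[0.0] * N for _ in range(M)]
    for r in range(M):
        for c in range(N):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), M - 1)
                    cc = min(max(c + dc, 0), N - 1)
                    total += img[rr][cc]
            out[r][c] = total / 9.0
    return out


def edge_set(img: Image, T_e: float) -> Set[Tuple[int, int]]:
    """E = {x | e(x) > T_e}, with e(x) a central-difference gradient magnitude.

    Border pixels are skipped because the central difference is undefined there.
    """
    M, N = len(img), len(img[0])
    E = set()
    for r in range(1, M - 1):
        for c in range(1, N - 1):
            gx = img[r][c + 1] - img[r][c - 1]
            gy = img[r + 1][c] - img[r - 1][c]
            if (gx * gx + gy * gy) ** 0.5 > T_e:
                E.add((r, c))
    return E


# A vertical step edge: left half dark, right half bright.
step = [[0.0] * 3 + [90.0] * 3 for _ in range(6)]
E = edge_set(box_blur(step), T_e=50.0)
```

After smoothing, the step spreads over two columns, so the thresholded edge is two pixels wide; a real Canny detector would thin it by non-maximum suppression.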
Fig. 1. Example of computing cost and selecting new position. Legend: structuring element (b = 12, l = 8); edge pixel with age inside; next position candidate.

Fig. 2. A synthetic fish image and the detected boundary. Panels: (a), (b), (c).

3.2. Morphological grouping
To provide a rough estimate of each object position and its area range, a simple morphological dilation is applied to the edge map. Most edges will be connected into several separated groups with each one corresponding to a potential object.
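The grouping step can be sketched as a dilation followed by connected-component labelling. The text does not state the element used for this dilation or the connectivity of the groups; the 3 x 3 square element, the 8-connectivity and the function name below are assumptions, and the sketch further assumes that the element contains its own origin so that every edge pixel lies inside the dilated map.

```python
from collections import deque
from typing import List, Set, Tuple

Point = Tuple[int, int]


def group_edges(E: Set[Point], S: Set[Point]) -> List[Set[Point]]:
    """Dilate E by S, then return the edge pixels of each connected component."""
    dilated = {(e[0] + s[0], e[1] + s[1]) for e in E for s in S}
    label = {}
    groups: List[Set[Point]] = []
    for seed in sorted(dilated):
        if seed in label:
            continue
        gid = len(groups)
        label[seed] = gid
        queue = deque([seed])
        members: Set[Point] = set()
        while queue:                       # breadth-first flood fill
            r, c = queue.popleft()
            if (r, c) in E:
                members.add((r, c))
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    q = (r + dr, c + dc)
                    if q in dilated and q not in label:
                        label[q] = gid
                        queue.append(q)
        groups.append(members)
    return groups


square3 = {(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)}
# Two nearby edge pixels separated by a one-pixel gap, and one far away.
groups = group_edges({(0, 0), (0, 2), (10, 10)}, square3)
```

The two nearby pixels merge into one group because their dilations overlap, while the distant pixel forms its own group.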
3.3. Boundary formulation
Roll dilation is performed around each edge group to form the boundary of an object. The process is similar for different objects. At the beginning, a structuring element moves towards the center of a group from the farthest image corner point. The first position where the structuring element touches some edges is used as the start point. Next, the structuring element rolls along the outside boundary of the object from the start point. The roll dilation process continues until the structuring element returns to its start point. When an object is at the corner or on one side of an image, the process terminates after the structuring element reaches the image margins twice. Since the edge groups are processed one at a time, the separation between them can be smaller than the diameter of S. The following pseudo code describes the algorithm more precisely.

FOR (;;)
    IF there is no more edge group in E
        EXIT
    ELSE
        SELECT one group A
        MARK all edges in A
        FIND a start point (i_s, j_s) from the farthest image corner
        i = i_s, j = j_s
        FOR (;;)
            FOR each neighbor of (i, j)
                IF boundary of the structuring element touches some edges
                    CALCULATE the cost
                ENDIF
            ENDFOR
            SELECT the neighbor (i_n, j_n) with the lowest cost
            i = i_n, j = j_n
            IF (i, j) reaches (i_s, j_s) or reaches image margins twice
                EXIT
            ENDIF
        ENDFOR
        CALCULATE the area of the detected region
        IF size of the region > a predefined threshold
            COUNT it as an object
        ENDIF
    ENDIF
ENDFOR
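The FIND step of the pseudo code is described only briefly, so the following Python sketch is one possible interpretation rather than the authors' procedure. It assumes that the group center is the centroid of its edge pixels, that the structuring element travels along a straight line from the farthest corner, that ties between equally distant corners are broken by list order, and that a position is accepted only if the border $S_B$ touches the group while the interior $S_I$ covers none of it.

```python
from typing import Optional, Set, Tuple

Point = Tuple[int, int]
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def find_start(group: Set[Point], S_I: Set[Point], S_B: Set[Point],
               M: int, N: int) -> Optional[Point]:
    """Walk from the farthest image corner towards the group centroid and
    return the first position where S_B touches the group (None if no such
    position is met on the sampled line)."""
    cr = sum(r for r, _ in group) / len(group)
    cc = sum(c for _, c in group) / len(group)
    corners = [(0, 0), (0, N - 1), (M - 1, 0), (M - 1, N - 1)]
    r0, c0 = max(corners, key=lambda p: (p[0] - cr) ** 2 + (p[1] - cc) ** 2)
    steps = int(max(abs(cr - r0), abs(cc - c0))) + 1
    for t in range(steps + 1):
        p = (round(r0 + (cr - r0) * t / steps),
             round(c0 + (cc - c0) * t / steps))
        border = {(p[0] + s[0], p[1] + s[1]) for s in S_B}
        inner = {(p[0] + s[0], p[1] + s[1]) for s in S_I}
        if border & group and not inner & group:
            return p
    return None


# 12-pixel element (4 x 4 without corners) split with a 4-connected neighborhood.
S = {(r, c) for r in range(-1, 3) for c in range(-1, 3)}
S -= {(-1, -1), (-1, 2), (2, -1), (2, 2)}
S_I = {s for s in S if all((s[0] + dr, s[1] + dc) in S for dr, dc in N4)}
S_B = S - S_I

# A horizontal edge segment in row 6 of a 12 x 12 image.
segment = {(6, c) for c in range(3, 9)}
start = find_start(segment, S_I, S_B, 12, 12)
```

Because the line is sampled at integer positions, a valid touching position can be skipped for some shapes; a more careful implementation might search the neighborhood of each sample.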
4. Experimental results and analysis
The proposed algorithm has been tested on images with varying numbers and types of objects. The results are encouraging.

4.1. Experimental results
Two examples of detecting object boundaries from still images are given in Fig. 3 and Fig. 4. The images are arranged in the same order for both figures. The original image is shown in (a), while the detected edges are in (b). The roll dilation result and the detected boundaries are given in (c) and (d), respectively. The algorithm detected the correct boundaries even when the edges are obviously broken. Fig. 4 shows an example with three objects of irregular shape. Although there is noise in the background, boundaries of these objects have been detected with only minor distortion in two of them. Tests on other images with several objects also produced encouraging results.

We have also modified and used the algorithm to detect the boundaries of moving objects in an image sequence. An edge map is generated for each frame of an image pair from the sequence. An Exclusive-Or operation of the two edge maps is then conducted to obtain the moving edges in each edge image. The remainder of the procedure is the same as in Section 3. Fig. 5 shows the result of applying this method to two frames of an image series of street scenes which contain a pedestrian and three moving vehicles. In the figure, a1 and a2 are input images, b1 and b2 are the corresponding edge images, c1 and c2 show the detected moving edges, and the roll dilation results are shown in d1 and d2 respectively. As can be seen from the result, all four boundaries have been detected with differing degrees of deformation.
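The Exclusive-Or step for moving objects can be sketched with set operations. The text does not spell out how the moving edges of each individual frame are separated from the combined result; intersecting the Exclusive-Or with each frame's own edge map, as done below, is an assumption, as is the function name.

```python
from typing import Set, Tuple

Point = Tuple[int, int]


def moving_edges(E1: Set[Point], E2: Set[Point]) -> Tuple[Set[Point], Set[Point]]:
    """Exclusive-Or of two edge maps, split into the part belonging to each frame."""
    changed = E1 ^ E2              # edges present in exactly one frame
    return changed & E1, changed & E2


# The pixel (0, 0) is a static background edge; the other edges moved one column.
frame1 = {(0, 0), (1, 1), (2, 2)}
frame2 = {(0, 0), (1, 2), (2, 3)}
m1, m2 = moving_edges(frame1, frame2)
```

Static edges cancel out, which is why motion yields a clean background for the subsequent grouping and roll dilation. Edges of a moving object that happen to coincide in both frames cancel as well, which may contribute to the deformation mentioned above.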
Fig. 3. Experimental results on image 1. Panels: (a), (b), (c), (d).

Fig. 4. Experimental results on image 2. Panels: (a), (b), (c), (d).

Fig. 5. An example of detecting boundaries of moving objects. Panels: (a1), (b1), (c1), (d1), (a2), (b2), (c2), (d2).
4.2. Result analysis
The quality of the detected boundary is dependent on the structuring element and the edge image. To quantitatively describe this effect we introduce several variables. Assuming $B_r$ is the real size of an object, $B_d$ is the area within the detected boundary and the object location is correct, size correctness is defined as

$$R_s = \max\left\{0,\ 1 - \frac{|B_d - B_r|}{B_r}\right\}.$$

This simple definition represents the quality of the boundary detection result with respect to the size of the object. $R_s$ is close to 1 when a detected boundary fits the object boundary very well, and approaches 0 when the detected boundary is highly inaccurate. In our experiments we usually have $|B_d - B_r| < B_r$.

Shape and size of the structuring element used in the roll dilation process will affect the detection result. Without any knowledge about an object, disc-shaped structuring elements usually produce better results than other shapes. Only discs are used as structuring elements in the experiments presented here. In the following discussion, structuring element size is defined as its maximal radius and represented using $S_s$. Fig. 6 shows the relationship between $R_s$ and $S_s$ of three images. Similar relations have been observed from other tested images. As can be seen from the figure, structuring elements with $S_s$ larger than 2.5 pixels produce a much higher $R_s$ than smaller ones, but $R_s$ grows slowly after $S_s$ has exceeded 2.5 pixels. In roll dilation, when $S_s$ is smaller than some gaps on the boundary edges of an object, the structuring element will fall inside the object and the detected boundary will not fit the real object. A structuring element which is larger than all gaps on the object boundary can leap over the gaps. The corresponding dilation process rolls along the boundary of the object and the detected boundary will fit the real object. When $S_s$ is increased further, the detected boundary will change only slightly. The turning point of the $R_s$ curve, which occurs when $S_s = 2.5$ in Fig. 6, can be used to determine an optimal structuring element in terms of size correctness.

Structuring element size also affects roll dilation time. Fig. 7 shows the computation time $T_c$ observed by testing three images using different sized structuring elements. It can be concluded that when $S_s$ is incremented, $T_c$ drops first and then begins to increase after a certain structuring element size. The explanation is that for small size structuring elements, roll dilation may fall into the object due to broken edges on the object boundary. Sometimes the process will go around each edge segment in the edge image and consume much computation time. When $S_s$ reaches a certain level, the dilation process will not fall through the edge gaps. The operation will only go along the outside of the object and thus reduce computation time. Computation time $T_c$ is defined as $T_c = D_t \times L_n$, in which $D_t$ is the dilation time at each position and $L_n$ is the total number of dilation operations. $D_t$ can be expressed as $D_t = g S_s^2$ where $g$ is a constant. The relationship can help one select a suitable structuring element in order to minimize computation expense while achieving the correct result. Our experiments, as shown in Fig. 7, suggest that $S_s = 2.5$ is a good choice for these images. Further testing is necessary to determine the optimal structuring element size for general cases.
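The two quantities used in this analysis are simple enough to state as code. The sketch below assumes the reading $D_t = g S_s^2$ of the dilation time (the exponent is taken from the operation count in Section 5, where the per-position work is proportional to $S_s^2$); the function names and the sample numbers are illustrative only and are not measurements from the experiments.

```python
def size_correctness(B_d: float, B_r: float) -> float:
    """R_s = max{0, 1 - |B_d - B_r| / B_r}, with B_r the real object size."""
    return max(0.0, 1.0 - abs(B_d - B_r) / B_r)


def computation_time(S_s: float, L_n: int, g: float = 1.0) -> float:
    """T_c = D_t * L_n with D_t = g * S_s**2 (g is an unspecified constant)."""
    return g * S_s ** 2 * L_n


good_fit = size_correctness(110.0, 100.0)   # detected area 10% too large
poor_fit = size_correctness(250.0, 100.0)   # error exceeds the object size
t_small = computation_time(2.5, 40, g=2.0)
```

The clamp at 0 matters only when the area error exceeds the true object size, which the text reports as unusual in the experiments. The model for $T_c$ also makes the shape of the curve in Fig. 7 plausible: a larger element raises $D_t$ quadratically, but once it stops falling through gaps the number of positions $L_n$ drops sharply.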
5. Conclusions
In this paper, we propose a method to detect object boundaries using dynamic mathematical morphology. The algorithm proves to be robust when applied to different images. The proposed method has the following characteristics:
• Using dynamic morphological operations, the implementation of the algorithm is simple.
• Fixed parameter values work for most images in both edge detection and roll dilation processes. Structuring element size $S_s$ can be predicted from the original image given some prior knowledge, such as object size and noise level.
• The algorithm can process scenes with separate objects of various shapes and sizes in a simple background.
• The computation is fast. Let the detected boundary length of an object be $m$; the total number of operations is

$$t = m(n \times 2\pi S_s^2 + 2\pi S_s^2) = m(n + 1)\,2\pi S_s^2,$$

in which $n$ is the number of neighbors to be searched at each position (in our experiments $n = 8$). Thus the time complexity of the algorithm is $O(mnS_s^2)$.
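As a quick numerical check of the operation count, the formula can be evaluated directly. The function name and the sample values (a boundary of length 100 and the element size 2.5 suggested in Section 4.2) are chosen for illustration and are not results reported in the paper.

```python
import math


def operation_count(m: int, S_s: float, n: int = 8) -> float:
    """t = m (n + 1) 2 pi S_s^2: n candidate evaluations plus one dilation
    at each of the m boundary positions."""
    return m * (n + 1) * 2 * math.pi * S_s ** 2


t = operation_count(100, 2.5)   # 11250 * pi, roughly 3.5e4 operations
```

The count is linear in the boundary length and independent of the image size, in contrast to a traditional dilation, which visits every pixel.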
Fig. 6. Relation between $R_s$ and $S_s$ for three test images (image1, image2, image3). Horizontal axis: $S_s$; vertical axis: $R_s$.

Fig. 7. Relation between $T_c$ and $S_s$ of three test images (image1, image2, image3). Horizontal axis: $S_s$.
Traditional morphology treats all pixels in an image equally and can be implemented in parallel hardware. In roll dilation, the process of selecting the next position from several candidates can be carried out in a parallel mode. A higher level parallelism is that an edge group can be divided into several parts and roll dilation is applied to each segment at the same time. When there are several edge groups in the image, they can also be processed in parallel.

The method presented in this paper can be improved in several ways. First, the roll dilation process is sensitive to noise edges. An object boundary will be deformed if there are noise edges along it. More complicated conditions can be combined into the roll dilation to combat noise. This way, some noise removal preprocessing could be simplified or even omitted. Secondly, additional processing techniques should be employed to handle complex backgrounds. As shown in the example in Section 4, motion environments can provide a clean background, but in still images, certain knowledge is required to handle complex backgrounds.
Acknowledgements
We thank the reviewers and the editors for their helpful comments and suggestions.
References
Ballard, D.H. and C.M. Brown (1982). Computer Vision. Prentice-Hall, Englewood Cliffs, NJ.
Cooper, J., S. Venkatesh and L. Kitchen (1993). Early jump-out corner detectors. IEEE Trans. Pattern Anal. Mach. Intell. 8, 823-828.
Dave, R.N. (1992). Boundary detection through fuzzy clustering. Proc. 1992 IEEE Internat. Conf. on Fuzzy Systems, 127-134.
Dubuisson, M.P. and A.K. Jain (1993). Object contour detection using color and motion. Proc. 1993 Internat. Conf. on Computer Vision and Pattern Recognition, 471-476.
Haralick, R.M., S.R. Sternberg and X. Zhuang (1987). Image analysis using mathematical morphology. IEEE Trans. Pattern Anal. Mach. Intell. 9, 532-549.
Huang, D.C.D. and K.T. Chen (1991). Boundary detection based on neural network. Proc. Third Internat. Conf. on Tools for Artificial Intelligence, 254-268.
Kass, M., A. Witkin and D. Terzopoulos (1987). Snakes: active contour models. Proc. First Internat. Conf. on Computer Vision, 259-268.
Manjunath, B.S. and R. Chellappa (1991). A computational approach to boundary detection. Proc. 1991 IEEE Internat. Conf. on Computer Vision and Pattern Recognition, 358-363.
Williams, D. and M. Shah (1992). A fast algorithm for active contour and curvature estimation. CVGIP: Image Understanding, 14-26.