CVGIP: Graphical Models and Image Processing, Vol. 54, No. 1, January 1992, pp. 56-74

Distance Transforms: Properties and Machine Vision Applications

DAVID W. PAGLIERONI

Received September 13, 1990; accepted September 10, 1991

Distance transforms are shown to be useful for a variety of existing and newly proposed machine vision applications. Their utility is established and reviewed in the context of applications for newly derived distance transform properties. These properties state the effects of binary function geometrical transforms on their distance transforms, quantify effects of translation and rotation on binary function-to-distance transform cross-correlations, and identify the role of distance transforms in adaptive matching of one set of points to another. Several application examples that involve pattern matching, morphology, and interpolation are provided. © 1992 Academic Press, Inc.

1. INTRODUCTION

The ability to extract and process both metric and radiometric information intrinsic to images is required for achievement of computer and machine vision goals. Radiometric or gray-scale information is conveyed, for example, through histogram, edge, and texture data. Alternately, metric information has to do with spatial relationships between image points. Such information is conveyed, for example, by boundary representation parameters such as spline control points, run-length region representations, and shape geometry features such as area and perimeter.

Distance transforms of binary images, first introduced in [9], contain, for each pixel, the distance between that pixel and the pixel of value 1 closest to it. They constitute a very important source of metric information. This paper focuses on distance transform properties and their machine vision applications. Distance transform computation methods are summarized and the unified distance transform algorithm [15] is shown to be an exceptionally feasible technique. A variety of new distance transform properties are stated and derived. Existing and newly proposed machine vision applications are reviewed as usages for these properties.

First, effects of bit map geometrical transforms on their distance transforms are investigated (Section 5). Properties of correlation between bit maps and distance transforms of other bit maps are derived (Section 6). Finally, the role of distance transforms in adaptive matching of one set of points to another is identified (Section 7). Machine vision applications discussed include nearest-neighbor interpolation and planar tesselations, morphological processing (skeletonization, pruning, thinning, and thickening), and pattern matching (object detection, stereo feature matching, and map matching for vehicle navigation). Robot collision avoidance, path finding, and aerial image registration are also mentioned. Edge thinning is treated as a problem of computing thick edge skeletons with distance transforms. Distance transform correlation properties are shown to be useful for accelerating the detection of image objects modeled as edge patterns occurring at arbitrary position and orientation. Distance transform-aided adaptive point matching is also applied to the map-matching problem.

2. DISTANCE AND NEAREST FEATURE TRANSFORMS

Let b be a binary function of two variables x and y. If b is continuous, its domain will be [0, x_f) × [0, y_f), and if b is discrete, its domain will be [0, 1, . . . , x_f − 1] × [0, 1, . . . , y_f − 1], where in the discrete case, x_f and y_f are integers.¹ Feature points are defined as (x, y) : b(x, y) ≠ 0. More specifically, if {(x_0(i), y_0(i)), i = 0, . . . , n − 1} is the set of feature points for b then

    b(x, y) = Σ_{i=0}^{n−1} δ(x − x_0(i), y − y_0(i)),        (2.1)

where δ is the Dirac delta (i.e., unit impulse) function in the continuous case and the Kronecker delta (i.e., unit pulse) function in the discrete case. Thus, for digital images, b is a bit map, and our definition of feature point makes intuitive sense if bit map pixels of value 1 correspond to pixels in some gray-scale image that belong to some set of features (such as edges or regions of specified texture). Note that for both continuous and discrete cases, the set of feature points is finite.

¹ This is tantamount to assuming that the discrete grid is a rectangular lattice. Hexagonal grids are sometimes used in distance transform work but will not be discussed here.
A nearest feature transform (FT) of binary function b is given by

    [N_x(x, y), N_y(x, y)] ≜ an (x′, y′) : b(x′, y′) ≠ 0 closest to (x, y),        (2.2)

where nearness is based on some distance function D and D[(x_1, y_1), (x_2, y_2)] is the distance from (x_1, y_1) to (x_2, y_2). The distance transform (DT) of b is

    d(x, y) ≜ D[(x, y), (N_x(x, y), N_y(x, y))];        (2.3)

i.e., d(x, y) is the distance between (x, y) and an FT of (x, y). Thus, DTs are unique, FTs may not be unique, and DTs can be derived from FTs but not vice versa (Fig. 2.1). The distance functions most often associated with DTs are Euclidean and nonnegatively weighted absolute, i.e.,

    D_E[(x_1, y_1), (x_2, y_2)] = [(x_1 − x_2)² + (y_1 − y_2)²]^{1/2}        (2.4a)
FIG. 2.1. Simple DT and FT examples. (a) Bit map. d, N_x, and N_y based on (b) chessboard, (c) city block, and (d) Euclidean distance.

    D_W[(x_1, y_1), (x_2, y_2)] = { w_0|x_1 − x_2| + w_1|y_1 − y_2|,   |x_1 − x_2| ≥ |y_1 − y_2|
                                  { w_1|x_1 − x_2| + w_0|y_1 − y_2|,   otherwise,        (2.4b)
where w_0, w_1 ≥ 0. Each distance has its own characteristic shape specified by the set of all points (x, y) equidistant from some center point. (w_0, w_1) = (1, 1) and (w_0, w_1) = (1, 0) yield city block (alternately absolute or 4-neighbor) and chessboard (alternately 8-neighbor) distances, whose characteristic shapes are diamonds and squares. Sometimes, Euclidean distance (whose characteristic shape is a circle) is required or preferred. Thus, (w_0, w_1) is often chosen to minimize the discrepancy between Euclidean and weighted absolute distance. The resulting characteristic shape resembles an octagon and thus emulates octagonal distance better than Euclidean distance. The process of choosing (w_0, w_1) to emulate certain distance functions is termed chamfering,² and the distances in (2.4b) are special cases of chamfer distances³ [1]. Optimal chamfer ((w_0, w_1) = (1, 1/√2 − 1 + √(√2 − 1) ≈ 0.351)) and chamfer, or quasi-Euclidean, ((w_0, w_1) = (1, √2 − 1)) distances optimally emulate Euclidean distance [1-3]. Approximations to these distances with integer weights (w_0, w_1), such as chamfer (3, 4) ((w_0, w_1) = (3, 1)) and chamfer (2, 3) ((w_0, w_1) = (2, 1)) distance, are sometimes used instead. Other distances are discussed in [4-5].

Distance functions can be visualized as the DT of a bit map whose only feature pixel is at its center (Fig. 2.2) or by the spread binary function b_s of the bit map (Fig. 2.3). Spread binary functions are nonincreasing functions of DTs that decrease at least once. There is an infinite number of valid choices, but those presented in this paper are chosen because of their computational simplicity and properties. With

    d_c(x, y | w) = min(d(x, y), w),   w > 0,        (2.5)

as the w-clipped distance transform, the triangular pulse spread binary function is

    b_s(x, y | a, w) = a(1 − d_c(x, y | w)/w),   a, w > 0,        (2.6)

where a is the amplitude parameter (in gray-scale units) and w is the spread width parameter (in pixels). Thus, if b(x, y) ≠ 0 then b_s(x, y | a, w) = a; if the distance from (x, y) to the nearest feature point ∈ (0, w), then 0 < b_s(x, y | a, w) < a and b_s(x, y | a, w) decreases linearly with increasing distance; and if the distance is at least w, then b_s(x, y | a, w) = 0.

² Chamfering means grooving or beveling to woodworkers. Presumably, characteristic shapes are being "chamfered" to look like circles.

³ Chamfer distances are often characterized by two numbers, namely a_0 = w_0 and a_1 = w_0 + w_1. Hence the term "chamfer (a_0, a_1) distance."
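The definitions in (2.2)-(2.6) can be illustrated by brute force on a small bit map. The following Python sketch is illustrative only: the function names and the exhaustive O(N²n) search over feature points are choices made here, not the algorithm of Section 3.

    # Brute-force illustration of Eqs. (2.2)-(2.6): for every pixel, search all
    # feature pixels (value 1) for the nearest one under a chosen distance.
    import numpy as np

    def weighted_absolute(p, q, w0=1.0, w1=0.0):
        """Eq. (2.4b).  (w0,w1)=(1,1): city block, (1,0): chessboard,
        (1, sqrt(2)-1): quasi-Euclidean."""
        dx, dy = abs(p[0] - q[0]), abs(p[1] - q[1])
        return w0 * dx + w1 * dy if dx >= dy else w1 * dx + w0 * dy

    def euclidean(p, q):
        return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

    def brute_force_dt_ft(bitmap, dist=euclidean):
        """Return (d, N): the DT (2.3) and an FT (2.2) for every pixel."""
        features = list(zip(*np.nonzero(bitmap)))       # feature points of b
        d = np.empty(bitmap.shape)
        N = np.empty(bitmap.shape + (2,), dtype=int)
        for x in range(bitmap.shape[0]):
            for y in range(bitmap.shape[1]):
                dists = [dist((x, y), f) for f in features]
                i = int(np.argmin(dists))               # an FT (may not be unique)
                d[x, y], N[x, y] = dists[i], features[i]
        return d, N

    def spread(d, a=255.0, w=8.0):
        """Triangular pulse spread binary function, Eqs. (2.5)-(2.6)."""
        return a * (1.0 - np.minimum(d, w) / w)

    if __name__ == "__main__":
        b = np.zeros((9, 9), int)
        b[1, 1] = b[3, 4] = b[8, 8] = 1
        d, N = brute_force_dt_ft(b, dist=lambda p, q: weighted_absolute(p, q, 1, 1))
        print(np.round(d, 1))                           # city-block DT
        print(spread(d, a=255, w=4))                    # spread binary function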
FIG. 2.2. Distance function visualization as DTs of bit maps with one pixel of value 1 at the center based on (a) chessboard, (b) city block, (c) optimal chamfer, and (d) Euclidean distance.
FIG. 2.3. Distance function visualization using b_s(x, y | 255, 128) for a 256 × 256 bit map of 0's except at (128, 128) based on (a) chessboard, (b) city block, (c) optimal chamfer, and (d) Euclidean distance.

Clipped DTs are special cases of constrained DTs, i.e., DTs defined only over arbitrary constraint regions [7]. Constrained DTs have been applied to robot collision avoidance and path finding [8].

FTs apply directly to nearest-neighbor interpolation. Each feature point corresponds to a location at which the value of the quantity to be interpolated (e.g., height or gray level) is known. Nonfeature points are assigned the values of their FTs. If the binary function is a sparse set of points, the nearest-neighbor interpolation pattern can be visualized using a diagram of the dot pattern plane decomposed into disjoint regions containing points all having the same dot as their FT. The resulting tesselations are depicted on Voronoi diagrams that vary with distance function (Fig. 2.4) [6]. FTs can be used to derive such diagrams. Voronoi diagram boundary pixels can be taken as those having FT x or y coordinate values that differ from those of any of their four nearest neighbors, as in the sketch below.
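A rough sketch of FT-driven nearest-neighbor interpolation and Voronoi boundary extraction follows. The brute-force FT and the function names are illustrative assumptions; any FT algorithm (e.g., the unified algorithm of Section 3) could be substituted. Boundary pixels are marked here by comparing FT identity rather than FT coordinates, which serves the same purpose.

    # FT-based nearest-neighbor interpolation and Voronoi boundary pixels.
    import numpy as np

    def brute_force_ft(points, shape):
        """For every pixel, the index of its nearest point (Euclidean)."""
        ys, xs = np.indices(shape)
        d2 = [(ys - py) ** 2 + (xs - px) ** 2 for (py, px) in points]
        return np.argmin(np.stack(d2), axis=0)          # FT stored as a point index

    def nn_interpolate(points, values, shape):
        """Assign every non-feature pixel the value known at its FT."""
        return np.asarray(values)[brute_force_ft(points, shape)]

    def voronoi_boundaries(points, shape):
        """Pixels whose FT differs from that of any of the 4 nearest neighbors."""
        ft = brute_force_ft(points, shape)
        b = np.zeros(shape, bool)
        b[:-1, :] |= ft[:-1, :] != ft[1:, :]
        b[1:, :]  |= ft[1:, :]  != ft[:-1, :]
        b[:, :-1] |= ft[:, :-1] != ft[:, 1:]
        b[:, 1:]  |= ft[:, 1:]  != ft[:, :-1]
        return b

    if __name__ == "__main__":
        dots = [(2, 3), (10, 12), (14, 4)]
        heights = [5.0, 9.0, 7.0]
        print(nn_interpolate(dots, heights, (16, 16)))
        print(voronoi_boundaries(dots, (16, 16)).astype(int))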
FIG. 2.4. Voronoi diagram tesselations of the plane for a dot pattern based on (a) Euclidean and (b) chessboard distance.

3. UNIFIED DISTANCE TRANSFORM ALGORITHM [15]

As special cases, it is possible to easily compute DTs for simple bit maps on the basis of certain distance functions if prior knowledge of bit map characteristics is available. However, DT algorithms must apply to arbitrary bit maps and should ideally produce exact results quickly over a broad class of distance functions. Simultaneous computation of the FT should also be possible.

The standard algorithms for computing bit map DTs are based on the idea that pixel DTs can be approximated in terms of neighbor pixel DTs. In this approach, distance functions are represented by (2b + 1) × (2b + 1) masks, where mask(x, y) = D[(x, y), (0, 0)], x, y = −b, . . . , b. Two-pass sequential local algorithms split these masks in half and pass left-to-right top-to-bottom over the bit map with one half and right-to-left bottom-to-top with the other [9-12]. Speed is directly proportional to the number of bit map pixels. Iterative parallel local algorithms apply the whole mask to each pixel simultaneously and iterate until no further changes occur [10, 12-14]. Expensive special-purpose massively parallel hardware (i.e., one processor per pixel) is required, and the number of iterations is directly proportional to the largest DT value. The masks can be designed to yield approximations to DTs on the basis of various distance functions by choosing mask size and element values in appropriate ways. Perfect emulation can generally be achieved only by choosing b such that the largest mask value is no less
than the largest DT value and mask(x, y) = D[(x, y), (0, 0)] rather than int(D[(x, y), (0, 0)]) (i.e., the masks should be real rather than integer valued when D is real valued). When the mask values are exact, the approximate DTs will generally be exact only at pixels for which the true DT is no larger than the largest mask value. Since one does not have prior knowledge of largest DT values for arbitrary bit maps, b must be chosen equal to the larger of the number of bit map rows or columns for perfect emulation. This is not feasible.

A radically different approach, first suggested in [9] for the special case of city-block distance, is developed in [15] for a broad class of distance functions. These algorithms scan all bit map rows left and right independently and then all bit map columns up and down independently. Because these scans are independent, the rows and columns can be processed either sequentially or in parallel. The corresponding architecture (Fig. 3.1) utilizes at most M + N parallel processors for M × N bit maps and is neither iterative nor massively parallel (i.e., it does not require MN parallel processors). Algorithm speed, though not strongly dependent on bit map size, is dictated by the time it takes to process the worst-case column. Speed is also influenced only weakly by bit map content and choice of distance function. Furthermore, unlike the standard algorithms, this one generates DT and FT data simultaneously.

The four-pass DT algorithm of [15] applies to any distance function in the class for which

    D[(x_1, y_1), (x_2, y_2)] = f(|x_1 − x_2|, |y_1 − y_2|)        (3.1a)
    |x_2| ≤ |x_1| ⇒ f(|x_2|, |y_1|) ≤ f(|x_1|, |y_1|)        (3.1b)
    |y_2| ≤ |y_1| ⇒ f(|x_1|, |y_2|) ≤ f(|x_1|, |y_1|)        (3.1c)

FIG. 3.2. Graphical depiction of distance function properties.

i.e., D[(x_1, y_1), (x_2, y_2)] is a function of the relative values of x_1 and x_2 and of y_1 and y_2, is nondecreasing in |x_1 − x_2|, and is nondecreasing in |y_1 − y_2| (Fig. 3.2). Equation (3.1a) ⇒ D[(x_1, y_1), (x_2, y_2)] = D[(x_2, y_2), (x_1, y_1)]; i.e., our distance functions are symmetric. However, they are not necessarily positive; i.e., there is no requirement that D be nonnegative or even that D[(x, y), (x, y)] be zero. Even if positiveness were made a requirement, D would not necessarily be a metric because the triangle inequality does not always apply to distance functions of class (3.1). For example, D[(x_1, y_1), (x_2, y_2)] = |x_1 − x_2| |y_1 − y_2| satisfies all provisions of (3.1) but does not satisfy the triangle inequality. For example, D[(0, 0), (5, 5)] = 25 exceeds D[(0, 0), (5, 0)] + D[(5, 0), (5, 5)] = 0. Moreover, not all metrics belong to class (3.1) because the triangle inequality does not imply (3.1b) and (3.1c). However, most metrics used in DT work, and specifically those in (2.4), do belong to class (3.1).

The algorithm of [15] uses a distance function look-up table (LUT) where LUT(x, y) = D[(x, y), (0, 0)]. It applies to whatever distance function of class (3.1) is stored there and is thus unified. To guarantee exact DTs, M × N bit maps generally require M × N LUTs (i.e., x = 0, . . . , N − 1, y = 0, . . . , M − 1), although for any bit map, the minimum requirement is that the LUT be just large enough to accommodate all DT values.
FIG. 3.1. Unified distance transform architecture.

FIG. 3.3. Block diagram for row and column scanner.

Row Scan

Each row of bit map b can be processed independently by merging results of simultaneous right (forward) and left (reverse) scans (Fig. 3.3). The row scans produce a nearest horizontal feature transform (HFT) [N_{H,x}(x, y), N_{H,y}(x, y)] and the horizontal distance transform (HDT) d_H(x, y) of b, where if for row y there exists at least one column x : b(x, y) = 1 then
FIG. 3.4. Row scan example (column index; right scan DT and FT; bit map row y; left scan DT and FT; merged scan DT and FT).
    N_{H,x}(x, y) = x coordinate of a row y pixel for which b(x, y) ≠ 0 closest to (x, y)        (3.2a)
    N_{H,y}(x, y) = y        (3.2b)
    d_H(x, y) = |x − N_{H,x}(x, y)|        (3.2c)

and otherwise, N_{H,x}(x, y), N_{H,y}(x, y), and d_H(x, y) are undefined. As shown in the example of Fig. 3.4, the idea behind right scan is that HDT and HFT are null (∅) until a feature pixel is encountered, at which time HDT is set to zero and HFT is set to the column number. HDT is incremented and HFT replicated until a new feature pixel is encountered. Left scan is identical except for scan direction. Scan merge uses right scan HDT and HFT if right scan HDT is less than left scan HDT or left scan HDT is null. Otherwise, left scan HDT and HFT are used.

Column Scan

Once each row of b has been processed, each column can be processed independently by merging results of simultaneous down (forward) and up (reverse) scans (Fig. 3.3). Up scan is identical to down scan except for scan direction. Scan merge for column scan is the same as that for row scan (Fig. 3.5). The key to the unified DT algorithm lies in how down scanning is performed. The idea is that for the class of distance functions (3.1), a feature pixel on row y′ closest to (x, y) is (N_{H,x}(x, y′), N_{H,y}(x, y′)) if it exists.

The basic down scanner uses an M-point buffer called HFTrow, in which HFT row addresses are stored. At any row y on a down scan of column x, there exist HFTrow indices M_0 and M_f such that row addresses of candidate FTs for (x, y) are stored from HFTrow(M_0) to HFTrow(M_f). Initially, M_0 = M_f = 0 and as down scanning progresses, M_0, M_f, and HFTrow contents change. It is natural to load HFTrow by fixing M_0 = 0, initializing M_f = 0, and, each time a new row with feature pixels is encountered on down scan, setting HFTrow(M_f) to that row address and incrementing M_f. The distance function LUT is used to determine which row stored in {HFTrow(y_0), y_0 = M_0, . . . , M_f − 1} contains the feature pixel closest to (x, y). A feature pixel on row HFTrow(y_0) closest to (x, y) is
    (x_1, y_1) = (N_{H,x}(x, HFTrow(y_0)), HFTrow(y_0))        (3.3)

and D[(x, y), (x_1, y_1)] = LUT(|x − x_1|, |y − y_1|).

Down scanning can be accelerated by reducing the number of elements in HFTrow, namely M_f − M_0, that need be considered. This can be achieved by applying three tests. The featureless row test leaves HFTrow, M_0, and M_f unchanged if the row for the column being scanned contains no feature pixels. Otherwise, the distance-to-column test is applied and its effect is to reduce M_f. This test does not reorder HFTrow, but it may recompute M_f and write a new value into HFTrow(M_f − 1). The idea is that for distance functions satisfying (3.1), feature pixels lying on rows further above (x, y) than other feature pixels can be closer to (x, y) only if they are closer to column x. This test thus ensures that the feature pixels referenced by {HFTrow(y_0), y_0 = M_0, . . . , M_f − 1} have strictly increasing distances to column x. Each time a row with feature pixels is encountered, it does this by comparing the HDT at the pixel being processed to the distance between column x and each feature pixel referenced by the HFTrow sequence working backward. When a distance less than this HDT is encountered, M_f is set to one more than that HFTrow index and HFTrow(M_f − 1) is set to the row of the pixel being processed. Finally, the up-look distance test is always applied after the distance-to-column test and its effect is to increase M_0 without reordering HFTrow. The idea is that for distance functions satisfying (3.1), when the pixel on row y of column x is being processed and the distance between its predecessor on row y − 1 and the predecessor's closest feature pixel on down scan is d, feature pixels with row coordinate less than or equal to
    y_min = y − d − D[(0, 0), (0, 1)]        (3.4)

need not be considered, where D[(0, 0), (0, 1)] = LUT(0, 1) = D[(x, y), (x, y + 1)], ∀x, y. Each time a row with feature pixels is encountered, the up-look distance test compares HFTrow(y_0) to y_min for y_0 = M_0, . . . , M_f − 1 and M_0 is replaced by the first y_0 for which HFTrow(y_0) > y_min.

FIG. 3.5. Column scan example based on Euclidean distance.
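The row/column decomposition can be sketched as follows. This simplified Python version omits the up/down merge and the three down-scan acceleration tests, and simply minimizes LUT(|x − N_{H,x}(x, y′)|, |y − y′|) over all rows y′ that contain features; it illustrates the idea rather than reproducing the implementation of [15], and all names are mine.

    # Simplified sketch of the unified DT/FT computation (Section 3).
    import numpy as np

    def row_scan(b):
        """Horizontal DT and nearest-feature column per row (Eq. (3.2));
        np.inf / -1 mark rows with no features."""
        rows, cols = b.shape
        hdt = np.full((rows, cols), np.inf)
        hfx = np.full((rows, cols), -1, dtype=int)
        for y in range(rows):
            last = None                                  # forward (right) scan
            for x in range(cols):
                if b[y, x]:
                    last = x
                if last is not None:
                    hdt[y, x], hfx[y, x] = x - last, last
            last = None                                  # reverse (left) scan + merge
            for x in range(cols - 1, -1, -1):
                if b[y, x]:
                    last = x
                if last is not None and last - x < hdt[y, x]:
                    hdt[y, x], hfx[y, x] = last - x, last
        return hdt, hfx

    def unified_dt_ft(b, lut):
        """DT and FT of b under LUT(dx, dy) = D[(dx, dy), (0, 0)]."""
        rows, cols = b.shape
        hdt, hfx = row_scan(b)
        feature_rows = [y for y in range(rows) if b[y].any()]
        d = np.full((rows, cols), np.inf)
        ft = np.full((rows, cols, 2), -1, dtype=int)
        for y in range(rows):
            for x in range(cols):
                for yp in feature_rows:                  # nearest feature in row yp
                    xp = hfx[yp, x]
                    dist = lut[abs(x - xp), abs(y - yp)]
                    if dist < d[y, x]:
                        d[y, x], ft[y, x] = dist, (xp, yp)
        return d, ft

    if __name__ == "__main__":
        b = np.zeros((8, 8), dtype=int)
        b[1, 2] = b[6, 6] = 1
        dx, dy = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
        d, ft = unified_dt_ft(b, np.hypot(dx, dy))       # Euclidean LUT
        print(np.round(d, 1))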
If DT and FT data are required only for points at which the DT is less than some maximum distance, the column scan algorithm can be adjusted for improved efficiency. While full M × N DTs require M × N LUTs, clipped DTs require LUTs only large enough to accommodate the clipping distance. In fact, LUT size dictates the degree of clip. For M_1 × N_1 LUTs, the featureless row test need only consider rows with features for which the distance between the column of interest and the nearest row feature is less than N_1. Similarly, the up-look distance test need never consider rows more than M_1 pixels above the row being processed. Related work in this area using a completely different procedure applies a uniform cost algorithm to computation of constrained DTs [16].

4. MORPHOLOGICAL APPLICATIONS

In machine vision, mathematical morphology refers to the study of object structure in images. Because of its importance, morphology has been heavily investigated and is fairly well understood. In the classical formulation, object shapes are modified by operating on them with structuring elements. Many morphological operations (e.g., hit-miss, opening, closing, boundary extraction, skeletonization, thinning, and thickening) can be expressed in terms of basic erosion and dilation operators [17]. Several of these processes are iterative but can be implemented in one pass using DTs, as reviewed below [20-24].

Skeletons

The endoskeleton (or simply skeleton), medial axis, or medial axis transform (MAT) of a region is the set of all points interior to the region with more than one nearest boundary point [18]. Exoskeletons are defined similarly except the points must be exterior to the region. Connected approximations to binary region skeletons can be obtained by symmetrical thinning implemented as successive erosions. Exact medial axes of regions with boundaries modeled as n-sided polygons can also be computed in O(n log_2 n) time [19].

The DT approach derives approximate skeletons by extracting edge map DT local maxima within region interiors [20-23]. Techniques for detecting local maxima vary, but the original idea was to use city-block distance and to call d(x, y) a local maximum when d(x, y) ≥ d(x − 1, y), d(x, y − 1), d(x, y + 1), and d(x + 1, y). Crude pruning results when local maxima less than some minimum distance are removed. Such skeletal approximations are not guaranteed to be connected, and pruning can make things worse. Furthermore, these skeletons may be two pixels wide in places. The skeletal approximation to the PCB image of Fig. 4.1a shown in Fig. 4.1c was derived from the city-block DT of the edge map in Fig. 4.1b with a pruning distance of 2. The edges of Fig. 4.2a shown in Fig. 4.2b were used to obtain the DT-based skeleton of the thick edge map in Fig. 4.2c. The thick edge map was obtained by performing adaptive gradient thresholding on the Gaussian gray-scale gradient image (σ = 2) of Fig. 4.2b. For comparison, Laplacian-of-Gaussian (LOG) edges (σ = 2) based on similar adaptive edge strength thresholding are shown in Fig. 4.2e. The skeleton edges contain information missing from the LOG edges but are less detailed.

Thickening

Thickening is classically viewed as iterative dilation and dilation as incremental thickening. In the DT approach, thickened binary functions b(x, y) are obtained in one pass as special cases of spread binary functions (cf. (2.5)-(2.6)), namely bit maps that are 1 wherever d(x, y) ≤ w and 0 elsewhere (4.1). Thickened and triangular pulse spread PCB skeletons based on city-block distance are shown in Figs. 4.1d and 4.1e.

Thinning

Thinning is classically viewed as iterative erosion and erosion as incremental thinning, where care is taken to ensure that areas thinned down to one pixel are thinned no further. The DT approach thins binary images in one pass but can thin areas out of existence. All region pixels with edge map DT values less than some minimum value are eliminated. The result of thinning Fig. 4.1a down by a city-block distance of 2 is shown in Fig. 4.1f.
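Assuming a DT is already available (SciPy's Euclidean DT is used below purely as a stand-in), the one-pass operations of this section can be sketched as follows; the function names and the choice of Euclidean distance are illustrative assumptions.

    # One-pass DT-based morphology sketches (Section 4).
    import numpy as np
    from scipy.ndimage import distance_transform_edt    # illustrative DT choice

    def edge_map_dt(edges):
        """Distance from each pixel to the nearest edge pixel (value 1)."""
        return distance_transform_edt(edges == 0)

    def skeleton(edges, region, prune=0.0):
        """Approximate skeleton: region pixels whose edge-map DT is a 4-neighbor
        local maximum, with maxima not exceeding `prune` removed (crude pruning)."""
        d = edge_map_dt(edges)
        pad = np.pad(d, 1, mode="edge")
        maxima = ((d >= pad[:-2, 1:-1]) & (d >= pad[2:, 1:-1]) &
                  (d >= pad[1:-1, :-2]) & (d >= pad[1:-1, 2:]))
        return region.astype(bool) & maxima & (d > prune)

    def thicken(bitmap, w):
        """One-pass thickening: 1 wherever the DT of the bit map is at most w."""
        return (distance_transform_edt(bitmap == 0) <= w).astype(int)

    def thin(region, edges, dmin):
        """One-pass thinning: remove region pixels whose edge-map DT is below dmin."""
        return (region.astype(bool) & (edge_map_dt(edges) >= dmin)).astype(int)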
FIG. 4.1. (a) Threshold binary PCB image. (b) Edge map of (a). (c) DT-based skeleton of (a) with pruning distance of 2. (d) w = 5 thickened skeleton of (c). (e) w = 20 triangular pulse spread skeleton of (c). (f) Image in (a) thinned to d_min = 2.
FIG. 4.2. (a) Gray-scale image. (b) Gradient strengths. (c) Thresholded gradient strengths. (d) DT-based endoskeleton of (c). (e) LOG edge map (σ = 2).
5. GEOMETRICAL TRANSFORM PROPERTIES

Although bit map DTs and FTs can be computed efficiently using the unified transform algorithm, it can be faster to derive DTs and FTs of one bit map in terms of those of another when the first bit map is a spatially transformed version of the other. Several new properties stating relationships between DTs and FTs of bit maps and scales and translations or rotations of those bit maps are presented in this section and derived in Appendix A.
Though proven for the continuous case, these geometrical transform properties also apply to the discrete case for valid discrete values. For invalid discrete values, continuous results can be quantized to the discrete grid, producing DT and FT approximations accurate down to discrete grid resolution. In fact, geometrically transformed bit maps may themselves be discrete approximations.

With P ≜ [x, y]ᵀ, let ΔP ≜ [Δx, Δy]ᵀ for translation, a ≜ diag(a_x, a_y) for scale, and

    Θ ≜ [cos θ  −sin θ; sin θ  cos θ]        (5.1)

for rotation. Let N(P) = [N_x(P), N_y(P)]ᵀ. For all properties that follow, take

    FT[b(P)] = N(P)        (5.2a)
    DT[b(P)] = d(P), or equivalently min_{P̂ : b(P̂) ≠ 0} D[P, P̂] = D[P, N(P)] = d(P).        (5.2b)

PROPERTY 5.1 (Translation). ∀ distance functions (3.1),

    FT[b(P + ΔP)] = N(P + ΔP) − ΔP,  DT[b(P + ΔP)] = d(P + ΔP).        (5.3)

The analogous scale property does not apply to all distance functions (3.1). However, if these functions are sum or product separable, i.e., if f(x, y) = f_1(x) + f_2(y) or f(x, y) = f_1(x)f_2(y), the following FT property applies:

PROPERTY 5.2 (Scale). ∀ sum or product separable distance functions (3.1), if a is nonsingular then

    FT[b(aP)] = a⁻¹N(aP).        (5.4)

Note that squared Euclidean and nonnegatively weighted absolute distance (2.4b) are sum separable. DT[b(aP)] has no convenient general expression in terms of DT[b(P)], although analytical expressions for DT[b(aP)] = D[P, FT[b(aP)]] can be derived for certain distance functions such as squared Euclidean and nonnegatively weighted absolute. Also, since (5.4) applies to squared Euclidean distance, it applies to Euclidean distance as well. The following rotation property applies only to rotationally symmetric distance functions (3.1), and the only such function is Euclidean:

PROPERTY 5.3 (Rotation). For Euclidean distance functions,

    FT[b(ΘP)] = Θ⁻¹N(ΘP),  DT[b(ΘP)] = d(ΘP).        (5.5)

This property also applies to other distance functions for certain rotations. The number of applicable rotations depends on the degree of symmetry in the distance function characteristic shape. For example, the four rotations applicable to chessboard and city-block distance are 0°, 90°, 180°, and 270° because diamonds and squares are jointly symmetric about the x and y axes. The following corollary is a consequence of Properties 5.1 and 5.2:

COROLLARY 5.1 (Scale and Translation). ∀ sum or product separable distance functions (3.1), if a is nonsingular then

    FT[b(a(P + ΔP))] = a⁻¹N(a(P + ΔP)) − ΔP.        (5.6)

The last corollary is a consequence of Corollary 5.1 and Property 5.3.

COROLLARY 5.2 (Scale, Translation, and Rotation). For Euclidean distance functions, if a is nonsingular then

    FT[b(a(ΘP + ΔP))] = Θ⁻¹[a⁻¹N(a(ΘP + ΔP)) − ΔP].        (5.7)

To summarize, DTs for geometrically transformed binary functions are obtained from the original DTs by transforming the DT arguments in like manner when the transform is translation and the distance functions satisfy (3.1), or the transform is rotation and Euclidean distance is used. FTs are obtained by transforming the FT arguments in like manner and inverse transforming the FTs with transformed arguments. The translation properties apply to all distance functions (3.1). The scale properties do also if the distance functions are sum or product separable. The rotation properties apply only to Euclidean distance and, for certain rotations, to other distance functions with symmetric characteristic shapes.
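Property 5.1 is easy to verify numerically. The sketch below (brute-force Euclidean DT; all names are illustrative) checks that the DT of a translated bit map equals the original DT with translated arguments, away from the image border.

    # Numerical illustration of Property 5.1 (translation).
    import numpy as np

    def euclidean_dt(b):
        feats = np.argwhere(b)                          # feature points (row, col)
        ys, xs = np.indices(b.shape)
        d2 = [(ys - fy) ** 2 + (xs - fx) ** 2 for fy, fx in feats]
        return np.sqrt(np.min(np.stack(d2), axis=0))

    if __name__ == "__main__":
        b = np.zeros((12, 12), int)
        b[3, 4] = b[8, 7] = 1
        dy, dx = 2, 3                                   # translation dP
        b_shift = np.roll(np.roll(b, -dy, axis=0), -dx, axis=1)   # b(P + dP)
        d, d_shift = euclidean_dt(b), euclidean_dt(b_shift)
        # DT[b(P + dP)] = d(P + dP): compare the shifted-bit-map DT with the
        # original DT sampled at translated arguments.
        print(np.allclose(d_shift[:-dy, :-dx], d[dy:, dx:]))      # True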
6. CORRELATION PROPERTIES

Certain machine vision applications, such as object detection and stereo feature matching, make use of edge correlation. Because edges are often only one pixel wide, direct edge map correlation does not work when even slight inconsistencies exist. This problem can be resolved by spreading one of the edge maps out (e.g., as in (2.6)) and correlating it with the other edge map. This technique, known as chamfer matching [3], is equivalent to correlating an edge map with the clipped DT of another edge map. Several new binary function to DT correlation properties are developed below.

Translational Correlation

Consider the cross-correlation R_{T,K}(x, y) of binary function b and the DT of some other binary function with respect to translational offset (x, y). Let {(x_0(i), y_0(i)), i = 0, . . . , n − 1} be the feature points of b. Suppose that these points are channeled into K classes of disjoint sets {(x_{k,0}(i), y_{k,0}(i)), i = 0, . . . , n_k − 1} such that

    Σ_{k=1}^{K} n_k = n.        (6.1)

Then from (2.1), b can be decomposed into K binary function channels b_k as

    b(x, y) = Σ_{k=1}^{K} b_k(x, y)        (6.2a)
    b_k(x, y) = Σ_{i=0}^{n_k−1} δ(x − x_{k,0}(i), y − y_{k,0}(i)).        (6.2b)

Consider another binary function and suppose that its feature points are channeled into the same K classes but as potentially intersecting sets. Let d_k be the DT of the binary function with channel k feature points. Then

    R_{T,K}(x, y) = [Σ_{k=1}^{K} ∫∫ b_k(u, v) d_k(u + x, v + y) du dv] / [Σ_{k=1}^{K} ∫∫ b_k(u, v) du dv]
                 = (1/n) Σ_{k=1}^{K} Σ_{i=0}^{n_k−1} d_k(x_{k,0}(i) + x, y_{k,0}(i) + y),        (6.3)

where in the discrete case, the integrals are equivalent to sums. Thus, the K-channel cross-correlation at translational offset (x, y) between a binary function and a DT can be obtained by summing DT values up at offset binary function feature point locations and averaging the result. This is an inherently vector as opposed to raster process; i.e., it is based on edge lists rather than rasters. Only n additions and 1 division are required per correlation sample.

Physically, R_{T,K}(x, y) is a measure of distance between b translated by (−x, −y) and some other binary function. Specifically, it is the mean distance between class k feature points in b offset by (−x, −y) and their nearest class k feature points in some other binary function over K classes. When K = 1, K-channel translational cross-correlation degenerates to composite (or 1-channel) translational cross-correlation between composite binary function b and the DT of some other composite binary function. In the discrete case, a composite bit map is the logical "or" of its channel 1 to K bit planes. Physically, R_{T,1}(x, y) is the mean distance between feature points in b translated by (−x, −y) and their nearest feature points from any class in some other binary function. Thus, composite translational cross-correlation never exceeds K-channel translational cross-correlation for K > 1. Moreover,

    R_{T,K+1}(x, y) ≥ R_{T,K}(x, y),   K = 1, 2, . . . .        (6.4)

The following property is derived in Appendix B:

PROPERTY 6.1 (K-Channel Translational Cross-Correlation Inequality and Upper Bound).

    |R_{T,K}(x + Δx, y + Δy) − R_{T,K}(x, y)| ≤ D[(Δx, Δy), (0, 0)],   K = 1, 2, . . . .        (6.5)

Rotational Correlation

Let

    [x_0(i), y_0(i)] = [r_0(i) cos θ_0(i), r_0(i) sin θ_0(i)],   i = 0, . . . , n − 1        (6.6a)
    [x_{k,0}(i), y_{k,0}(i)] = [r_{k,0}(i) cos θ_{k,0}(i), r_{k,0}(i) sin θ_{k,0}(i)],   k = 1, . . . , K,  i = 0, . . . , n_k − 1.        (6.6b)

As in (6.2), b can be decomposed into K binary function channels b_k as

    b(r cos θ, r sin θ) = Σ_{k=1}^{K} b_k(r cos θ, r sin θ)        (6.7a)
    b_k(r cos θ, r sin θ) = Σ_{i=0}^{n_k−1} δ(r cos θ − r_{k,0}(i) cos θ_{k,0}(i), r sin θ − r_{k,0}(i) sin θ_{k,0}(i)).        (6.7b)

It can be shown that in both the discrete and continuous case, the K-channel rotational cross-correlation between a binary function and DT is given by

    R_{R,K}(θ | x, y) = [Σ_{k=1}^{K} ∫∫ b_k(x + r cos φ, y + r sin φ) d_k(r cos(φ + θ), r sin(φ + θ)) dr dφ] / [Σ_{k=1}^{K} ∫∫ b_k(x + r cos φ, y + r sin φ) dr dφ]
                     = (1/n) Σ_{k=1}^{K} Σ_{i=0}^{n_k−1} d_k(r_{k,0}(i) cos(θ_{k,0}(i) + θ) − x, r_{k,0}(i) sin(θ_{k,0}(i) + θ) − y),        (6.8)

where b_k is as in (6.7b) and in the discrete case, the integrals are equivalent to sums. As for translational correlation, calculation of R_{R,K}(θ | x, y) is based on edge lists and requires only n additions and 1 division per sample.
Physically, R_{R,K}(θ | x, y) is a measure of distance between b rotated by −θ and then translated by (x, y) and some other binary function. As for R_{T,K}(x, y), it is a mean distance between class k feature points in b after rotation and translation and their nearest class k feature points in some other binary function over K classes. As for R_{T,K}, composite (or 1-channel) rotational cross-correlation never exceeds K-channel rotational cross-correlation for K > 1. Moreover,

    R_{R,K+1}(θ | x, y) ≥ R_{R,K}(θ | x, y),   K = 1, 2, . . . .        (6.9)

The following property is derived in Appendix B:

PROPERTY 6.2 (K-Channel Rotational Cross-Correlation Inequality and Upper Bound). For Euclidean DTs,

    |R_{R,K}(θ + Δθ | x, y) − R_{R,K}(θ | x, y)| ≤ (|Δθ|/n) Σ_{i=0}^{n−1} r_0(i),   K = 1, 2, . . . .        (6.10)

Accelerated Object Detection [25]

Detection of objects at arbitrary position and orientation within images is an important machine vision task. Image analysts face such problems when examining aerial images for man-made objects. The problem of object scale is eliminated by approximate knowledge of aerial image sensor parameters. However, the process of handling object position and orientation jointly is computationally intensive. The foregoing DT correlation properties provide a mechanism for accelerating this process.

Consider an M × N bit map of vertices extracted from an image. Suppose that an object model is constructed with potentially disconnected edge segments and that template vertices are obtained from this model. Template and image vertices can be matched in order to determine optimal object orientation at each position which could potentially contain an object. If

    n ≜ No. of template vertices
    p ≜ No. of template poses (orientations) to consider

then O(MNnp) additions are required even if the image vertex map DTs and template vertex lists at each orientation are precalculated. This enormous computational burden can be reduced by decreasing the contributions of MN and p through translational and rotational accelerations attributable to Properties 6.1 and 6.2.

The optimal template orientation θ*(x, y) at positions (x, y) satisfies

    R_{R,K}(θ*(x, y) | x, y) = min_{θ ∈ Ω} R_{R,K}(θ | x, y),        (6.11)

where Ω is the set of orientation angles to be considered. In this case, b_k of (6.8) is the template channel k vertex map and d_k is the image channel k vertex map DT, where different vertex channels contain vertices with different degrees of acuteness. From (6.9), K > 1 vertex channels are more discriminating than one vertex channel in determining optimal template orientation. (x, y) is a candidate object position if

    R_{R,K}(θ*(x, y) | x, y) ≤ D_crit.        (6.12)

Therefore, θ*(x, y) need be computed only at positions (x, y) for which R_{R,K}(θ*(x, y) | x, y) could potentially be less than D_crit. If R_{R,K}(θ*(x, y) | x, y) > D_crit, Property 6.1 can be used to determine which pixels in the neighborhood of (x, y) need never be processed. Motivated by Property 6.1, the translational acceleration rule is

    R_{R,K}(θ*(x, y) | x, y) > D_crit ⇒ mask out all (x′, y′) : D[(x, y), (x′, y′)] < R_{R,K}(θ*(x, y) | x, y) − D_crit.        (6.13)

Similarly, at all positions (x, y) that are deemed necessary to process, if R_{R,K}(θ | x, y) exceeds the smallest R_{R,K} value, namely R_min(x, y), among orientations already tried, then Property 6.2 can be used to determine which orientations in the neighborhood of θ need never be processed at (x, y). Motivated by Property 6.2, the rotational acceleration rule is

    R_{R,K}(θ | x, y) > R_min(x, y) ⇒ mask out all orientations in [θ − Δθ, θ + Δθ]        (6.14a)
    Δθ = n[R_{R,K}(θ | x, y) − R_min(x, y)] / Σ_{i=0}^{n−1} r_0(i).        (6.14b)

FIG. 6.1. Accelerated position-orientation calculation algorithm.

The accelerated position-orientation calculation algorithm is a feedback system (Fig. 6.1). The feedback element implements translational acceleration rule (6.13)
and masks out certain positions. Feedback results are input to a position traversal controller that boosts translational acceleration potential by traversing successive positions that are as widely spaced as possible and skipping previously masked out positions. A traversal control scheme based on quadrant decomposition is summarized below (Fig. 6.3), followed by the corresponding orientation traversal scheme (Fig. 6.4):

    Δ ← 1 more than 1st power of 2 ≥ M or N
    while Δ ≥ 1
        x ← 0
        while x < N
            . . .

    while i < p
        if orientation i previously unprocessed or unmasked
            process orientation i
        i ← i + Δ
    Δ ← Δ/2

FIG. 6.2. Accelerated rotational correlation algorithm.

FIG. 6.3. Position traversal control scheme example on a 5 × 5 grid.

FIG. 6.4. Orientation traversal control example for eight angle quantizations.

Figure 6.5 provides an example of accelerated position-orientation computation in object detection. Objects detected as the prototype building complex of Fig. 6.5a are circled in Fig. 6.5b. The complexes not circled are related to but different than the template and should go undetected. Using the algorithm just described, only 26,179 of 142,884 positions and an average of only 71 of 120 orientations needed to be tried. The translational, rotational, and overall computational reduction factors to optimal orientation calculation at candidate object positions were 5.46, 1.69, and 9.22. All objects were detected despite variations in position and orientation, and there were no false alarms even in the presence of related objects. By using distance transform properties, the detection rate was accelerated by one order of magnitude. Further examples, with greater acceleration rates and a more thorough treatment, can be found in [25].
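The translational acceleration rule (6.13) amounts to masking a disk of positions around (x, y). A minimal sketch, assuming Euclidean position distance and illustrative names:

    # Sketch of the translational acceleration rule, Eq. (6.13): once the optimal
    # orientation score at (x, y) exceeds D_crit, Property 6.1 guarantees that no
    # position closer than (score - D_crit) can be a candidate.
    import numpy as np

    def mask_positions(mask, x, y, score, d_crit):
        """Set mask[row, col] = True for positions that need never be tried."""
        if score <= d_crit:
            return mask                                  # (x, y) is a candidate; mask nothing
        radius = score - d_crit
        rows, cols = np.indices(mask.shape)
        mask |= np.hypot(rows - y, cols - x) < radius    # Euclidean position distance
        return mask

    if __name__ == "__main__":
        mask = np.zeros((9, 9), dtype=bool)              # False = still needs processing
        mask = mask_positions(mask, x=4, y=4, score=5.0, d_crit=2.0)
        print(mask.astype(int))                          # a disk of radius 3 is skipped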
FIG. 6.5. (a) Template object and vertices. (b) Image with detected objects. (c) Image edge map and vertices.
FIG. 6.6. Two-way correlator.

…the two-way correlations and more sensitive to detection of a sparse edge pattern within a dense edge map. In this case, it might thus be appropriate to base the combination rule on minima of two-way correlation pairs.

Consider the problem of determining which points in an optical image correspond or are conjugate to edge points in a companion optical image acquired from a different frame camera position and orientation. These two images constitute a stereo pair in that they have overlapping fields of view and frame camera focal lengths, as well as positions and orientations at image acquisition time, that are known. The foregoing problem, known as stereo feature matching, is important because knowledge of conjugate point coordinates allows depth measurements to be made for salient image (e.g., edge) points, where depth is the distance between the acquisition camera and imaged point.

It is well known that for stereo pairs acquired by frame cameras, the right image point conjugate to any left image point is constrained, by camera parameters and geometry, to lie on a line in the right image. The correct point on the line must be determined from image contextual information (i.e., by edge matching in this problem). Edge correlation is an effective means of edge matching when shadows, occlusion, and irregularity due to sudden depth variations are modest. This is often the case in images of natural terrain containing few man-made objects (Figs. 6.7a and 6.7b). For each edge point in Fig. 6.7b, a constraint line in the image of Fig. 6.7d was computed from the left and right camera parameters. A block of edges about that point was two-way correlated along that line with right image edges. Composite correlations were based on minima of two-way correlation pairs. Stereo feature matching results are shown in Figs. 6.7e and 6.7f.

7. ADAPTIVE CONVERGENCE
FIG. 6.7. (a) Left stereo image. (b) Left image edges. (c) Right stereo image. (d) Right image edges. (e) Left image edges matched to. (f) Right image matches based on two-way correlation.

In the previous section, DT correlation properties were shown to be useful for edge pattern matching when the
spatial transform parameters are translation or rotation. When other parameters are involved and some of them are known with low accuracy, coarse-to-fine DT-based methods utilizing resolution pyramids, such as hierarchical chamfer matching, apply [3, 26]. The idea is to perform matching with respect to these parameters at reduced resolution using coarse quantizations of parameter space and to use the results in guiding the match at the next higher resolution with ever more narrow parameter ranges and quantizations. This method is, for example, well suited to edge-based automatic registration of aerial imagery. However, in some applications, the parameters are nominally known with fairly high accuracy and the problem is to fine tune them. This is the case in map-matching problems where it is desired to place cars back onto roads when they are incorrectly sensed as being somewhat off. The adaptive technique developed below is ideal for such problems.

Let us extend our definition of distance function D between two points (x_1, y_1) and (x_2, y_2) to accommodate distance between two m-sample lists of points P_k ≜ [x_k, y_k] ≜ {P_k(i) ≜ [x_k(i), y_k(i)], i = 0, . . . , m − 1}, k = 1, 2. Let

    D_m[P_1, P_2] ≜ (1/m) Σ_{i=0}^{m−1} D[P_1(i), P_2(i)]        (7.1)

be the scalar-valued vector distance function and note that D_1 = D. Next, consider the vector-valued spatial transform function g(P, t), where

    P ≜ m × 2 array of points to be spatially transformed
    g ≜ m × 2 array of spatially transformed points
    t ≜ q × 2 array of spatial transform parameters.

Now consider an iterative procedure in which at iteration k + 1, the problem is to compute the spatial transform parameters t_{k+1} minimizing the scalar-valued cost function

    J(t_{k+1}) = D_m[g(P_k, t_{k+1}), N_k],        (7.2)

where P_0 is the initial set of m points to be spatially transformed,

    P_{k+1} = g(P_k, t_{k+1});        (7.3)

i.e., P_{k+1} is P_k spatially transformed by spatial transform parameters t_{k+1}, N_k = FT[P_k], and the FT is taken for the binary function composed of feature points that P_0 is to be matched to. This iterative procedure is adaptive in
that if P_{k+1} has a different FT than P_k, it spatially transforms P_{k+1} so as to cause it to adapt to N_{k+1}. Convergence occurs when N_{k+1} = N_k. This adaptive FT matching algorithm can be formally stated as follows:

    k ← 0
    while k = 0 or N_k ≠ N_{k−1}
        1. Compute t_{k+1} minimizing J(t_{k+1}).
        2. P_{k+1} ← g(P_k, t_{k+1}).
        3. Increment k.

The input point list is P_0 and the spatially transformed point list is P_{k+1}. The following property is proven in Appendix C:

PROPERTY 7.1 (Adaptive Convergence). For any vector distance function D_m, if ∀P, ∃ a t : g(P, t) = P, then

    D_m[P_{k+1}, N_k] ≤ D_m[P_k, N_k]        (7.4a)
    D_m[P_{k+1}, N_{k+1}] ≤ D_m[P_{k+1}, N_k]        (7.4b)
    D_m[P_{k+1}, N_{k+1}] ≤ D_m[P_k, N_k].        (7.4c)

Furthermore, for continuous binary functions, the adaptive FT matching algorithm either produces a closer match at each iteration or converges. This property does not necessarily apply to discrete binary functions because of quantization error. In the discrete case, iterations cease when N_k = N_{k−1} or D_m[P_{k+1}, N_{k+1}] exceeds D_m[P_k, N_k] due to quantization error. This often occurs for D_m on the order of 1 or 2 pixels.

The adaptive convergence property is most useful when efficient analytical solutions to the J minimization problem exist. Existence depends on choice of distance and spatial transform function. If D_m is squared Euclidean distance and g(P, t) = [P, 1]t, where 1 is the m × 1 vector of 1's and t is a 3 × 2 matrix of independent variables, then the J minimization problem is a least-squares problem. If constraints are put on the six t elements such that t can be represented by less than six independent variables, then specific analytical solutions exist.

Figure 7.1 demonstrates Euclidean DT-based adaptive map matching on a USGS Digital Line Graph (DLG) road map section converted to a bit map. The previously discussed spatial transform function g with six independent spatial transform parameters t was used. Six least-squares iterations were required to convert the nominal dot pattern in Fig. 7.1a to the discretely converged dot pattern in Fig. 7.1b. The dots can be viewed as sequentially sensed car positions that need to be corrected, i.e., put back onto the road.
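A minimal sketch of the adaptive FT matching loop, assuming D_m is mean squared Euclidean distance and g(P, t) = [P, 1]t, so that each iteration's minimization is an ordinary least-squares fit. The brute-force FT against a target point list and all function names are illustrative assumptions.

    # Adaptive FT matching (Section 7) with an affine least-squares step.
    import numpy as np

    def nearest_targets(P, targets):
        """N(i) = target point nearest to P(i) (squared Euclidean)."""
        d2 = ((P[:, None, :] - targets[None, :, :]) ** 2).sum(axis=2)
        return targets[np.argmin(d2, axis=1)]

    def adaptive_ft_match(P0, targets, max_iter=20):
        P, prev_N = P0.astype(float), None
        for _ in range(max_iter):
            N = nearest_targets(P, targets)              # N_k = FT[P_k]
            if prev_N is not None and np.array_equal(N, prev_N):
                break                                    # convergence: FT unchanged
            A = np.hstack([P, np.ones((len(P), 1))])     # [P, 1]
            t, *_ = np.linalg.lstsq(A, N, rcond=None)    # t_{k+1} minimizing J (7.2)
            P, prev_N = A @ t, N                         # P_{k+1} = g(P_k, t_{k+1})
        return P

    if __name__ == "__main__":
        road = np.array([[x, 0.5 * x + 3.0] for x in range(40)])   # target map points
        sensed = road[::4] + np.array([1.5, -2.0])                 # nominally off the road
        print(np.round(adaptive_ft_match(sensed, road), 2)[:3])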
FIG. 7.1. (a) Nominal point pattern (smaller dots) and its FT (larger dots) on a DLG road map. (b) Point pattern adjusted by six adaptive least-squares iterations to match the road.

8. CONCLUSION

When distance transforms were first discovered, their usefulness for morphological processing was recognized almost immediately. But until recently, they were not acknowledged as powerful machine vision tools. A number of new distance transform properties were developed in this paper. They were shown to be useful for a wide variety of machine vision applications. Continued work with distance transforms will undoubtedly lead to discoveries of new properties and applications.

APPENDIX A: PROOFS FOR GEOMETRICAL TRANSFORM PROPERTIES

1. Proof of Property 5.1

    min_{P̂ : b(P̂ + ΔP) ≠ 0} D[P, P̂] = min_{P′ : b(P′) ≠ 0} D[P, P′ − ΔP]        (A.1)
    (3.1a) ⇒ D[P, P′ − ΔP] = D[P + ΔP, P′]        (A.2)
    (A.1), (A.2) ⇒ min_{P̂ : b(P̂ + ΔP) ≠ 0} D[P, P̂] = min_{P′ : b(P′) ≠ 0} D[P + ΔP, P′] = D[P + ΔP, N(P + ΔP)] = d(P + ΔP)        (A.3)
    (3.1a) ⇒ D[P + ΔP, N(P + ΔP)] = D[P, N(P + ΔP) − ΔP]        (A.4)
    (A.3), (A.4) ⇒ min_{P̂ : b(P̂ + ΔP) ≠ 0} D[P, P̂] = D[P, N(P + ΔP) − ΔP] = d(P + ΔP).        (A.5)

Equations (A.5) and (5.2) ⇒ DT[b(P + ΔP)] = d(P + ΔP), FT[b(P + ΔP)] = N(P + ΔP) − ΔP, which is Property 5.1.

2. Proof of Property 5.2

    min_{P̂ : b(aP̂) ≠ 0} D[P, P̂] = min_{P′ : b(P′) ≠ 0} D[P, a⁻¹P′], since a is nonsingular        (A.6)
    D[P, a⁻¹P′] = f(|x − x′/a_x|, |y − y′/a_y|) = f((1/a_x)|a_x x − x′|, (1/a_y)|a_y y − y′|).        (A.7)

If f is sum or product separable, then from (3.1b) and (3.1c)

    min_{P′ : b(P′) ≠ 0} f((1/a_x)|a_x x − x′|, (1/a_y)|a_y y − y′|) = f((1/a_x)|a_x x − N_x(aP)|, (1/a_y)|a_y y − N_y(aP)|)        (A.8)
    = f(|x − N_x(aP)/a_x|, |y − N_y(aP)/a_y|) = D[P, a⁻¹N(aP)]        (A.9)
    (A.6)–(A.9) ⇒ min_{P̂ : b(aP̂) ≠ 0} D[P, P̂] = D[P, a⁻¹N(aP)].        (A.10)

Equations (A.10) and (5.2) ⇒ FT[b(aP)] = a⁻¹N(aP), which is Property 5.2.

3. Proof of Property 5.3

    min_{P̂ : b(ΘP̂) ≠ 0} D[P, P̂] = min_{P′ : b(P′) ≠ 0} D[P, Θ⁻¹P′].        (A.11)

For Euclidean distance,

    D[P, Θ⁻¹P′] = D[ΘP, P′]        (A.12)
    (A.11), (A.12) ⇒ min_{P̂ : b(ΘP̂) ≠ 0} D[P, P̂] = min_{P′ : b(P′) ≠ 0} D[ΘP, P′] = D[ΘP, N(ΘP)] = d(ΘP).        (A.13)

For Euclidean distance,

    D[ΘP, N(ΘP)] = D[P, Θ⁻¹N(ΘP)].        (A.14)

Equations (A.13), (A.14), and (5.2) ⇒ DT[b(ΘP)] = d(ΘP), FT[b(ΘP)] = Θ⁻¹N(ΘP), which is Property 5.3.

4. Proof of Corollary 5.1

    Property 5.1 ⇒ FT[b′(P + ΔP)] = N′(P + ΔP) − ΔP,        (A.15)

where N′(P) ≜ FT[b′(P)]. Let b′(P) = b(aP). Then

    b′(P + ΔP) = b(a(P + ΔP))        (A.16)
    Property 5.2 ⇒ N′(P) = a⁻¹N(aP)        (A.17)
    (A.15)–(A.17) ⇒ FT[b(a(P + ΔP))] = a⁻¹N(a(P + ΔP)) − ΔP,        (A.18)

which is Corollary 5.1.

5. Proof of Corollary 5.2

    Property 5.3 ⇒ FT[b′(ΘP)] = Θ⁻¹N′(ΘP),        (A.19)

where N′(P) ≜ FT[b′(P)]. Let b′(P) = b(a(P + ΔP)). Then

    b′(ΘP) = b(a(ΘP + ΔP))        (A.20)
    Corollary 5.1 ⇒ N′(P) = a⁻¹N(a(P + ΔP)) − ΔP        (A.21)
    (A.19)–(A.21) ⇒ FT[b(a(ΘP + ΔP))] = Θ⁻¹[a⁻¹N(a(ΘP + ΔP)) − ΔP],        (A.22)

which is Corollary 5.2.
B: PROOFS FOR CORRELATION
5’
[dk(X
+
Ax
dkb
+
xk,O(i),
+
xk,O(i),
Y
+
AY
i=O
yk,O(i))-
f, 1 g z
D[tAx,
AY),
y
+
yk,O(i))l
(0, 0)l 1
PROPERTIES
= MAX,
1. Proof of Property 6. I
AY),
(0, WI,
03.4)
which is (6.5).
In both the discrete and continuous case,
2.
+
+
Y)
RT,&,
IRT,&
Y
xk,O(i),
'
= ;
APPENDIX
3
+
i 5' n k=l i=Odkt&,oti)
=
03.3)
dktx
(6.3)
Corollary 5.1 3 N’(P) = a-lN(a(P + AP)) - AP (A.21) (A.19)-(A.21) + FT[b(a(OP + AP))]
2
Proof of Property 6.2
In the discrete case, bk is given by (6.7b), where 6 is the Kronecker delta function. Thus, ~~=,&&[ba(x RR.K(~
1 X,
Y)
=
+ r cos $, y + r sin 4) dk (r cos (4 + O),r sin (4 + @)I Xfc,IZ&.bk(x
which is equivalent to the first part of (6.8).
+ r cos 4, y + r sin 4)
9
tB.5)
DISTANCE
73
TRANSFORMS
(6.1), (6.7b) 3 2 c 2 hA(x + r cos dj, y + Y sin 4) h=l Q I
K
~-1
= c
2
k=l rk,o(i)
= i
=$$,=. h=I
03.6)
2T
=
r
-
X,
(&,0(i)
+
8)
-
(B.7)
y)-
Y)
m
dxdy
1
1
- 6(r cos 8, r sin 0) rdrde = 1 (B.8) I 0 I[0 r
we have 6(x, y) in Cartesian coordinates + 6,(r,
= 5 2 2 [bk(r cos (4 - e), r sin (4 - 6)) $
sin
m P --m I I -028x,
dk (r cos (4 + e), Y sin (4 + @)I
k=l
6)
Equations (B.6) and (B.7) yield the second part of (6.8). In the continuous case, consider the change of variables (x, y) += (Ycos 8, r sin 0). Thus, dxdy + rdrd 8. But since
(6.7b) + 2 2 c [bk(x + r cos 4, y + r sin 4) $J
+
T F ‘g: 6(x + r cm $J - m(i) cos
$~~.~(i),y + r sin 4 - rh,OO sin &,0(i))
k=l
dArk,o(i) COS(dk,O(i)
i=O
e)
r
= i 6(r cos 8, r sin /3) in polar coordinates.
dk (r cos C#I- x, r Sin C#I- y)]
(B.9)
Thus, = 2 T c 5’ kw cos (4 - 0) - rk,ow r i=O cos Ok,,(i), r sin (4 - 6) - rk,o(i) sin * dk(r
COS
p-1
bk(r cos 8, r sin 0) = 2 ’ 6(r cos 8 i=O r
ek,o(i))
- r&i)
4 - x, r sin 4 - y)]
COS&$(i>, r Sin 8 - rk,O(i) Sin e@(i))
r cos 4, y + r sin 4) d,(r cos (4 + e), r sin (C#J+ 0)) rdrd$ CF=,Jps;
IRR,K(e+ be 1-6Y>
(6.8) *
= ; / 5 x’ k=l
- RR,K(e
COS
(ek,O(i>
+
6
+
u)
sin
dk(rk,o(i) @k,O(i)
COS +
6)
(ek,O(i) -
+
6)
-
.G
@k,O(i)
+
6)
-
X,
rk,O(i)
sin
(ek,O(i)
(B. 12)
(dk(rk,o(i) Cm (ek,O(i) + 8 f A@ + 8 + Ae) - Y) - d&k,-,(i) x,
rk,O(i)
8
-
x2
(ek,O(i)
+
COS (8k,O(i)
D[(rk,O(i) +
sin
+
Ae)),(rk,O(i)
+
0) 8
cos
+
-
COS -
X,
rk,O(i)
sin
(8k,O(i)
(&,0(i) i- 6)
Y)\ A@,
(ek,O(i)
rk,O(i) +
sin
(ek,O(i)
6)
X, rk,O(i)
sin
(ek,O(i)
+
@)I
5 rk,o(i)(AO( for Euclidean distance D.
(B. 13)
Thus,
rk,o(i)
03.1% 03.13) + IRR,K(e+ A@1XTY) - RR,K(e 1 X, Y>l
Y)ll
5 i 2 z’ ldk(rk,o(i) ~0s@k,0(i)
-
03s
But for distance functions (3. I),
5 [dkh,o(i)
dk(rk,o(i)
+ e) - Y)I.
-
r@(i) sin (&,0(i) + 8 + A@ - y) -
-
1 X, Y)I
i=O
(B.ll)
hL(x + r cos 4, y + r sin e) rdrd$
Equations (B. 10) and (B. 1I) are equivalent to (6.7b) and the first part of (6.8). Furthermore, the second part of (6.8) can be derived by replacing &z, (.) in (B.7) with Jpso” (.) rdrd+. Th’IS establishes the validity of (6.7) and (6.8). In addition,
(B.lO)
+
f3 +
Ae)
<
y
2
z
rk.O(i)
=
y
2
ro(i),
(B. 14)
-x, r@(i) sin (&O(i) + 8 + A@ - y)
which is (6.10)
APPENDIX C: PROOF FOR ADAPTIVE CONVERGENCE PROPERTY

If ∃ a t : g(P, t) = P, ∀P, then

    D_m[P_k, N_k] ≥ min_t D_m[g(P_k, t), N_k].        (C.1)

By definition of t_{k+1} and (7.3),

    min_t D_m[g(P_k, t), N_k] = D_m[g(P_k, t_{k+1}), N_k] = D_m[P_{k+1}, N_k].        (C.2)

Equations (C.1) and (C.2) yield

    D_m[P_{k+1}, N_k] ≤ D_m[P_k, N_k],        (C.3)

which is (7.4a). Also, since N_{k+1} = FT[P_{k+1}],

    min_{N : b(N) ≠ 0} D_m[P_{k+1}, N] = D_m[P_{k+1}, N_{k+1}] ⇒ D_m[P_{k+1}, N_{k+1}] ≤ D_m[P_{k+1}, N_k],        (C.4)

which is (7.4b). Equations (C.3) and (C.4) yield

    D_m[P_{k+1}, N_{k+1}] ≤ D_m[P_k, N_k],        (C.5)

which is (7.4c). In the case of inequality, the match is closer at iteration k + 1 than at iteration k. In the case of equality,

    (C.3) ⇒ D_m[P_{k+1}, N_k] ≤ D_m[P_{k+1}, N_{k+1}]        (C.6)
    (C.4), (C.6) ⇒ D_m[P_{k+1}, N_k] = D_m[P_{k+1}, N_{k+1}] = D_m[P_k, N_k].        (C.7)

Equation (C.7) ⇒ convergence has been reached.
REFERENCES

1. G. Borgefors, Distance transformations in arbitrary dimensions, Comput. Vision Graphics Image Process. 27, 1984, 321-345.
2. U. Montanari, A method for obtaining skeletons using a quasi-Euclidean distance, J. Assoc. Comput. Mach. 15, 1968, 600-624.
3. H. G. Barrow, J. M. Tenenbaum, R. C. Bolles, and H. C. Wolf, Parametric correspondence and chamfer matching: Two new techniques for image matching, in Proceedings, 5th International Joint Conference on Artificial Intelligence, 1977, pp. 659-663.
4. P. P. Das and B. N. Chatterji, Knight's distance in digital geometry, Pattern Recognit. Lett. 7, 1988, 215-226.
5. M. Yamashita and T. Ibaraki, Distances defined by neighborhood sequences, Pattern Recognit. 19, 1986, 237-246.
6. C. Arcelli and G. Sanniti di Baja, Computing Voronoi diagrams in digital pictures, Pattern Recognit. Lett. 4, 1986, 383-389.
7. L. Dorst and P. W. Verbeek, The constrained distance transformation: A pseudo-Euclidean recursive implementation of the Lee algorithm, in Proceedings, European Signal Processing Conference, The Hague, The Netherlands, Sept. 2-5, 1986, pp. 917-920.
8. P. W. Verbeek, L. Dorst, B. J. H. Verwer, and C. A. Groen, Collision avoidance and path finding through constrained distance transformation in robot state space, in Proceedings, International Conference on Intelligent Autonomous Systems, Amsterdam, 1986, pp. 634-641.
9. A. Rosenfeld and J. L. Pfaltz, Sequential operations in digital picture processing, J. Assoc. Comput. Mach. 13, 1966, 471-494.
10. G. Borgefors, Distance transformations in digital images, Comput. Vision Graphics Image Process. 34, 1986, 344-371.
11. P.-E. Danielsson, Euclidean distance mapping, Comput. Graphics Image Process. 14, 1980, 227-248.
12. J. Piper and E. Granum, Computing distance transformations in convex and non-convex domains, Pattern Recognit. 20, 6, 1987, 599-615.
13. A. Rosenfeld and J. L. Pfaltz, Distance functions on digital pictures, Pattern Recognit. 1, 1, 1968, 33-61.
14. H. Yamada, Complete Euclidean distance transform by parallel operation, in Proceedings, 7th International Conference on Pattern Recognition, Montreal, Canada, 1984, pp. 69-71.
15. D. W. Paglieroni, A unified distance transform algorithm and architecture, Mach. Vision Appl., 1991, accepted.
16. B. J. H. Verwer, P. W. Verbeek, and S. T. Dekker, An efficient uniform cost algorithm applied to distance transforms, IEEE Trans. Pattern Anal. Mach. Intelligence 11, 4, Apr. 1989, 425-429.
17. J. Serra, Image Analysis and Mathematical Morphology, Academic Press, New York, 1982.
18. H. Blum, A transformation for extracting new descriptors of shape, in Proceedings, Symposium on Models for the Perception of Speech and Visual Form (W. Wathen-Dunn, Ed.), pp. 362-390, MIT Press, Cambridge, MA, 1967.
19. D. T. Lee, Medial axis transformation of a planar shape, IEEE Trans. Pattern Anal. Mach. Intelligence 4, 4, July 1982, 363-369.
20. C. Arcelli and G. Sanniti di Baja, A width independent fast thinning algorithm, IEEE Trans. Pattern Anal. Mach. Intelligence 7, 4, July 1985, 463-474.
21. C. Arcelli and G. Sanniti di Baja, Finding local maxima in a pseudo-Euclidean distance transform, Comput. Vision Graphics Image Process. 43, 3, Sept. 1988, 361-367.
22. F. Klein and O. Kubler, Euclidean distance transformation and model-guided image interpretation, Pattern Recognit. Lett. 5, 1987, 19-29.
23. C. Lantuejoul and F. Maisonneuve, Geodesic methods in quantitative image analysis, Pattern Recognit. 17, 1984, 177-187.
24. S. Suzuki and K. Abe, New fusion operations for digitized binary images and their applications, IEEE Trans. Pattern Anal. Mach. Intelligence 7, 1985, 638-651.
25. D. W. Paglieroni, G. E. Ford, and E. M. Tsujimoto, The position-orientation masking approach to parametric search for template matching, in revision to IEEE Trans. Pattern Anal. Mach. Intelligence.
26. G. Borgefors, Hierarchical chamfer matching: A parametric edge matching algorithm, IEEE Trans. Pattern Anal. Mach. Intelligence 10, 6, Nov. 1988, 849-865.