Pattern Recognition Letters 8 (1988) 359-360
December 1988
North-Holland
Book Review

"Image Segmentation and Uncertainty" by R. Wilson and M. Spann

A.W.M. SMEULDERS
Dept. of Medical Informatics, Erasmus University, Dr. Molenwaterplein 50, Rotterdam, The Netherlands

Received 20 September 1988
As most of the research in image segmentation seems to have been problem oriented, books offering a general theoretical approach to image segmentation are welcome to broaden the base of our understanding. The book [1] discussed here undertakes such a mainly theoretical analysis of one approach to image segmentation, viz. region detection on the basis of the principle of paired uncertainties.

To illustrate the principle, assume that a local image property $g$ is used to distinguish object $O_1$ from $O_2$ in the image, with $g(O_1) \neq g(O_2)$. For each pixel in the image, a window of pixels $W$ is used to estimate $g$ from the pixel values in $W$: $\hat{g}_W(x,y) = \hat{g}(f(x_W, y_W))$. When the position of a boundary $B$ between $O_1$ and $O_2$ is not yet known, variations in $\hat{g}_W$ may be attributed to noise in $f(x_W, y_W)$ or to the presence of a boundary $B$ in $W$, or both. Hence there is an uncertainty $\Delta\hat{g}_W(x,y)$ in the value of the local property estimate $\hat{g}_W(x,y)$. The complementary uncertainty $\Delta(\hat{x}_B,\hat{y}_B)$ is in the precise location of the boundary estimate $(\hat{x}_B,\hat{y}_B)$. These uncertainties are coupled, as making $W$ larger will decrease the uncertainty $\Delta\hat{g}_W(x,y)$ but increase $\Delta(\hat{x}_B,\hat{y}_B)$, and vice versa. So, when assigning $(x,y)$ to either $O_1$ or $O_2$, a compromise should be found between $\Delta\hat{g}_W(x,y)$, which can be seen as a measure for the accuracy of the boundary estimate, and $\Delta(\hat{x}_B,\hat{y}_B)$, a measure for the bias.

After an excursion through a symbol theory, the authors build from this fundamental and well conceived principle of uncertainty onto a scale invariant theory for segmentation. The main idea is to divide the image by a quadtree into windows $W_k$ of
$L_k \times L_k$ pixels, where $k$ is the level of the quadtree. Using a truncated tree, the size $L_0$ of the top level window with $k = 0$ is typically 16, and $k_{max} = 4$. While proceeding down the tree, the boundary estimate $(\hat{x}_B,\hat{y}_B)$ is refined on the basis of $\hat{g}_{W_k}$, equivalent to decreasing $\Delta(\hat{x}_B,\hat{y}_B)^2$. Simultaneously, $\Delta\hat{g}_W(x,y)^2$ increases with $k$ by a factor $2^{2k}$ for the case of uniform, additive, white noise. For the latter model, the signal-to-noise ratio $\rho$ improves for level $k$ by $\rho_k/\rho_0 = 2^k \cdot P_k$, where $P_k$ is the probability that $W_k$ consists of one object $O_i$ only. The segmentation problem is translated into maximization of $\rho_k/\rho_0$ under (local) variation over $k$, while the question remains how to estimate signal, noise and $P_k$ for each window $W_k$ in practice.

To that end, first the number of objects $O_i$ in the image and their mean signal level are established through an unsupervised mechanism named 'local centroid clustering'. Then, all windows at level $k = 0$ are assigned to an $O_i$. Windows bordering a window of another object class and all their eight-connected neighboring windows are considered for reassignment. Only these windows are divided by descending the quadtree and smoothed with a linear filter whose coefficients depend on the estimated inter-window $\rho$. The smaller windows thus obtained at quadtree level $k$ are assigned to an $O_i$ via an unspecified 'nearest class mean criterion'. The procedure of selecting border windows, smoothing, assignment and refinement of the quadtree is repeated until pixel level is reached. Some results on artificial scenes (with one or two exceptions) with a $\rho$ in the range of 0.3 to 4 are shown.
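The coarse-to-fine procedure summarized above can be sketched roughly as follows. This is a simplified illustration and not the authors' algorithm: the class means are assumed to be given (the 'local centroid clustering' step is skipped), the smoothing filter is omitted, the window property is taken to be the mean grey value, and the nearest class mean rule is assumed to be a plain absolute difference. The names `window_mean`, `nearest_class` and `segment` are hypothetical.

```python
L0 = 16     # top-level window size quoted in the review
K_MAX = 4   # number of refinement levels; 16 / 2**4 = 1, i.e. pixel level


def window_mean(img, r, c, size):
    """Mean grey value of the size x size window with top-left corner (r, c)."""
    total = sum(img[i][j] for i in range(r, r + size) for j in range(c, c + size))
    return total / (size * size)


def nearest_class(value, class_means):
    """Index of the class mean closest to value (assumed form of the rule)."""
    return min(range(len(class_means)), key=lambda i: abs(value - class_means[i]))


def segment(img, class_means, l0=L0, k_max=K_MAX):
    """Label a square image whose side is a multiple of l0, coarse to fine."""
    n = len(img)
    labels = [[0] * n for _ in range(n)]

    def assign(r, c, size):
        # Give every pixel of the window the label of the nearest class mean.
        lab = nearest_class(window_mean(img, r, c, size), class_means)
        for i in range(r, r + size):
            for j in range(c, c + size):
                labels[i][j] = lab

    # Level k = 0: assign every top-level window to a class.
    size = l0
    for r in range(0, n, size):
        for c in range(0, n, size):
            assign(r, c, size)

    for _ in range(k_max):
        m = n // size  # number of windows per side at the current level
        grid = [[labels[R * size][C * size] for C in range(m)] for R in range(m)]

        def neighbours(R, C):
            # Eight-connected neighbouring windows that lie inside the image.
            return [(R + dR, C + dC)
                    for dR in (-1, 0, 1) for dC in (-1, 0, 1)
                    if (dR or dC) and 0 <= R + dR < m and 0 <= C + dC < m]

        # Border windows: at least one neighbour carries another label.
        border = {(R, C) for R in range(m) for C in range(m)
                  if any(grid[a][b] != grid[R][C] for a, b in neighbours(R, C))}
        # Border windows and all their neighbours are reconsidered.
        candidates = set(border)
        for R, C in border:
            candidates.update(neighbours(R, C))

        # Descend one quadtree level for the candidates only.
        half = size // 2
        for R, C in candidates:
            for dr in (0, half):
                for dc in (0, half):
                    assign(R * size + dr, C * size + dc, half)
        size = half
    return labels


# Usage: a noiseless 32 x 32 step edge at column 11, not aligned with any window.
image = [[0 if j < 11 else 10 for j in range(32)] for _ in range(32)]
result = segment(image, class_means=[0, 10])
print("first column labelled as object 2:", result[0].index(1))
```

In this sketch only windows near a label change are ever split, so the interior of each object keeps the label obtained from the largest (least noisy) window, while the boundary position is refined level by level.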
0167-8655/88/$3.50 © 1988, Elsevier Science Publishers B.V. (North-Holland)
Volume 8, Number 5
For the window property $g$, the grey value can be used, or the analysis can be expanded to local texture features. The only class of texture features the authors consider is that of the 'finite prolate spheroidal sequences' (FPSS). They are due to Slepian, and result as a complete set of functions from solving eigenvalue equations of the type $F^{-1}(F(v)) = \lambda v$, where $F$ denotes the Fourier transform. The definition leaves freedom to choose the functions $v$ by a tessellation of either $(x,y)$-space or frequency space. Equating the frequency content of a function $v$ with a texture property, the authors point out another version of the uncertainty principle (this one is well known): the smaller the tile of $v$ is in the frequency domain, the more sharply the texture property is known but the less well known is its location in the image, and vice versa. Some properties of FPSS and tessellation schemes for image properties are discussed at an abstract level.

To arrive at an approach to texture segmentation, a set of 14 FPSS filter functions $v_j$ is formed with varying frequency band responses. Again, the above-mentioned model for segmentation is used, with property $g$ now being a 14-dimensional vector. To compensate for the increased dimensionality of $g$, a model is derived to find the appropriate (larger) window size $W_0$. Also, the rule with which a window is assigned to an object is adjusted to a multidimensional Euclidean classifier with some heuristic weighting scheme. The method is illustrated by images, all with stationary statistical properties (sometimes described as 'wall-to-wall carpets').

In evaluating the book, it is clear that the authors have enough on their minds worth listening to. The book, however, does not release its contents easily. The book is an extension of papers by the authors as these appeared in scientific journals, still carrying
signs of their origin in most of its six chapters. Other criteria apply to theoretical books than to textbooks, but, as it is a book after all, one may expect the authors to take a little more time and a few more pages to better formalize, hone and illustrate their argument and notation. Also, the theory set out in the book would appeal to many more readers if the review of the state of the art in image segmentation were extended to appreciate the value of other approaches: edge estimation techniques such as dynamic programming methods, e.g. [2], are only marginally considered, and even competing object detection and region classification approaches receive no more than two pages. The same holds for the motivation to study only FPSS textures. Finally, the inclusion of more realistic examples, with more scenes from real life or some with non-stationary statistical properties, would have enhanced the power of the book.

These last extensions are not strictly necessary, as the authors intended to write a theoretical book. And that is how the book should be welcomed: as one of the very few taking a theoretical position in image processing. The value of such a proposition should not be underestimated, and it is in this sense that I have found the book inspiring. As far as this reviewer is concerned, the authors did well to end the title of their sixth and final chapter, "Image segmentation a problem solved?", with a question mark indeed.
References
[1] Wilson, R. and M. Spann (1988). Image Segmentation and Uncertainty. Research Press Ltd, England.
[2] Martelli, A. (1976). An application of heuristic search methods to edge and contour detection. Comm. ACM 19, 73-83.