Signal Processing 31 (1993) 133 163 Elsevier
133
Supervised segmentation using a multiresolution data representation Isaac Ng, Josef Kittler and John Illingworth
Department q[ Electronic' and Electrical Engineering, University o[ Surrey, GuildJbrd GU2 5XH. UK Received 17 June 1991 Revised 12 August 1992
Abstract. A method ff)r segmenting multiple feature images is presented. It builds on two well established methodologies: Bayes' decision rule and a multiresolution data representation. The proposed method is aimed at image analysis applications requiring routine processing. The goal is to improve the classification reliability of conventional statistical based pixel classifier by taking account of spatial contextual information conveyed in an image via multiresolution structure. Our method is closely related to the Spann and Wilson quadtree segmentation algorithm. However, we shall demonstrate that a supervised formulation under the assumption that image classes are normally distributed, leads to a significant simplification of Spann and Wilson's algorithm and gives more consistent segmentation results. In order to incorporate Bayes' decision rule with a multiresolution data structure, the class statistics at each resolution level must be available. However, we point out that unbiased estimate cannot be achieved by direct calculation. This leads to the development of an efficient method to acquire the parameters of class distributions at each resolution level. It involves estimating the class statistics on training sites at the full image resolution. The corresponding parameters at lower resolution are computed by predetermined scaling factors. The segmentation scheme is validated on synthetic data. To clarify the applicability of the proposed method a set of experiments of its use in texture image segmentation is illustrated in which Brodatz textures and seismic images are used. Zusammenfassung. Beschrieben wird ein Verfahren zur Segmentation von Bildern mit multiplen Merkmalen. Es basiert auf zwei wohlbekannten Methoden: der Entscheidungsregel von Bayes und einer Multiraten-Darstellung, d.h. einer Mehrfachdarstellung des Bildes mit verschiedener 6rtlicher Aufl6sung. Die vorgeschlagene Methode zielt auf Anwendungen im Bereich der Bildanalyse, die eine Routineverarbeitung erfordern. Das Ziel besteht darin, die Zuverlfissigkeit konventioneller statistischer Klassifizierer auf der Basis einzelner Bildpunkte zu verbessern, in den der rfiumliche Kontext durch die Multiraten-Darstellung mit einbezogen wird. Unsere Methode steht in enger Beziehung zum Segmentationsalgorithmus von Spann und Wilson. Jedoch zeigen wir, daft dieser Algorithmus signifikant vereinfacht werden kann und konsistentere Segmentationsergebnisse zeitigt, wenn ein iiberwachtes Lernverfahren angewendet wird unter der Annahme dab die Bildklassen normalverteilt sind. Um die Entscheidungsregel von Bayes in eine Multiraten-Datenstruktur zu inkorporieren, muB fiir jeden Wert des 6rtlichen Abtastintervails die klasseninterne Statistik zur Verfiigung stehen. Da es sich zeigt daB mittelwertsfreie Schfitzwerte nicht direkt berechnet werden k6nnen, wird eine effiziente Methode zur Bestimmung der Parameter der Klassenverteilungen fiir jeden Wert des Abtastintervalls entwickelt. Hierzu werden die statistischen Eigenschaften jeder Bildklasse anhand von Stichproben bei voller 6rtlicher Aufl6sung geschfitzt. Bei geringerer Aufl6sung werden die entsprechenden Parameter mit Hilfe vorbestimmter Skalierungsfaktoren berechnet. Validiert wird das Segmentationsschema anhand synthetischer Daten. Um die Anwendbarkeit der vorgeschlagenen Methode abzuklfiren, wurden eine Reihe von Experimenten zur Segmentierung von Bildtexturn durchgefiihrt; als Testmaterial hierzu dienten Brodatz-Texturen und seismische Bilder. R~sum& Nouspr6sentons unem~thodepourla segmentationd'images fiattributsmultiples. Ellerepose surdeuxm+thodologies bien &ablies: la r+gle de dScision de Bayes et une repr6sentation multi-r6solutionelle des donn6es. La m+thode propos+e vise fl des applications d'analyse d'images requi6rant un traitement de routine. Le but est d'am61iorer la fiabilit+ de classification de classifieurs statistiquess de pixels conventionnels en tenant compte de l'information spatiale contextuelle transmise dans
Correspondence to: Mr. Isaac Ng, Department of Electronic and Electrical Engineering, University of Surrey, Guildford 6U2 5XH, UK. 0165-1684/93./$06.00 ,t" 1993 Elsevier Science Publishers B.V. All rights reserved
1. Ng et al. / Supervised segmentation
134
une image via une structure multi-r6solution. Notre m6thode est reli6e de pr+s fi l'algorithme de segmentation en arbre quaternaire de Spann et Wilson. Toutefois, nous d+montrerons qu'une formulation supervis6e sous l'hypoth6se que les classes de l'image sont distribu6es selon la loi normale, conduit ~i une simplification significative de l'algorithme de Spann et Wilson et donne des r6sultats de segmentation plus consistents. Dans le but d'incorporer la r6gle de d6cision de Bayes avec un structure de donn6es multi-resolutionnelle, les statistiques de classe ',i chaque niveau de r6solution doivent 6tre disponibles. Toutefois, nous relevons que l'estim6e non biais6e ne peut pas ~tre obtenue par un calcul direct. Ceci conduit au d6veloppement d'une m6thode efficace d'acquisition des param+tres de distributions de classes ~t chaque niveau de r~solution. Ceci implique l'estimation des statistiques de classe sur des zones d'apprentissage ~t la r6solution maximale de I'image. Les param6tres correspondants fi une r6solution plus basse sont calcul6s ~ I'aide de facteurs d'6chelle pr6-d&ermin6s. Cette technique de segmentation est valid6e sur des donn6es synth&iques. Afin de clarifier l'applicabilit6 de la m&hode propos6e, un ensemble d'exp6riences sur son utilisation pour la segmentation d'images textur6es, dans lequel des textures de Brodatz et des images sismiques sont employ6es, est pr6sent6.
Keywords. Multiresolution data representation; Bayes decision rule; supervised segmentation scheme; texture image segmentation; seismic section segmentation.
I. Introduction
Segmenting an image into spatially disjoint regions of uniform property has been the subject of extensive research over the past decades. The prime aim is to provide a symbolic description of the constituent parts of the image for scene interpretation. The many different approaches to this topic can be classified into distinct categories on the basis of either mathematical framework (statistical, fuzzy, knowledge driven approaches) or image phenomenon (region homogeneity, edges) used for segmentation [10, 14]. Among them, the statistical approach such as per-pixel classification has been widely applied in industrial part inspection and to classify remotely sensed multispectral image data for land-cover analysis [8, 9, 18]. However, criticism has been raised against the approach of classifying pixels based solely upon the statistical distribution of individual pixel intensity or pixel features, on the ground that the spatial contextual information conveyed in an image is not taken into consideration. Consequently, any spatial coherence and localization of the regions which result from the classification is entirely fortuitous. Put in another way, noisy features of individual pixels cause erroneous decisions about the pixel labels for these single-pixel classifiers. This may be the reason why these procedures tend to produce an abundance of false regions. SignalProcessing
Hence, incorporating contextual or spatial information for improving classification reliability has been a major aim of recent research into this problem. The resultingmethodologies include contextual decision rules, probabilistic relaxation, maximum a posteriori probability (MAP) estimation, and multiresolution processing. In particular multiresolution processing, in which processing is carried out over a range of spatial scales of the input image, has proved very promising in terms of computational efficiency [ 17]. The advantage of using a multiresolution data structure is twofold. Firstly it can greatly relieve the computational burden in the classification process when the image is large and several variables are involved. Secondly the class membership will be more certain in coarser resolution and this helps to minimize the error rate of the classification process. In this paper we propose a new image segmentation scheme which is based on Bayes' decision rule and a multiresolution data representation. Applying a multiresolution data structure to image segmentation has been reported by various researchers [5, 11, 19, 22, 24]. Our algorithm closely relates to the quadtree segmentation algorithm of Spann and Wilson [23] and shares some similarity with the unsupervised image segmentation approach of Spann and Horne [11, 20]. As the latter approach was also related to Spann and Wilson's algorithm, our discussion will then be referred to Spann and
135
L Ng et al. / Supervised segmentation
Wilson's approach only. The obvious difference between our approach and Spann and Wilson's is that the latter is a non-supervised approach. Hence they require no a priori information about image statistics. This characteristic makes their algorithm ideal for general segmentation problems. However, in many image analysis applications requiring routine processing, each image region represents one of a finite set of imaged phenomena whose statistical properties are approximately invariant over a large set of images. In such situations the general segmentation scheme of Spann and Wilson is not only computationally unnecessarily complicated, but also may, as reported in [6], fail to yield consistent segmentation owing to the unpredictable effect of some of the parameters of their method which are automatically selected in a data dependent manner. Our objective in the present paper is to demonstrate that a supervised formulation leads to a significant simplification of the Spann and Wilson quadtree segmentation algorithm under the assumption that image classes are normally distributed, and this gives more consistent segmentation results. The paper is organised as follows. In Section 2 we begin by describing the image pyramid formulation. Then we discuss the potential problems in estimating the class statistics for a reduced spatial resolution image. This is essential for implementing Bayes' classifier at each resolution level of the pyramid. An efficient method of acquiring the parameters of class distributions at each resolution level is proposed. We also show the validity of the method using numerous statistical tests. Section 3 details the implementation of our supervised segmentation scheme. A numerical example using a synthetic image is given in support of our remarks concerning the Spann and Wilson algorithm. Section 4 shows some results on natural textures obtained from the Brodatz album [3]. Results are also demonstrated on a set of seismic images representing a vertical cross section of the Earth's subsurface geological structures. A detailed discussion of the results is included in this section. Finally we conclude in Section 5 with several remarks about the proposed approach.
2. Estimating the class statistics for reduced resolution image
2.1. Image pyramid An image pyramid [17] is a hierarchical data structure in which successively reduced spatial resolution versions of a given image are stacked to form a pyramid. An example of such a data structure can be seen in Fig. 1. The given image indexed as L~occupies the bottom layer of the pyramid with the highest spatial resolution with respect to individual resolution levels. The progressively ,reduced spatial resolution images // are generated recursively according to the following equation: 2
l~(x,y) =
m(i,j)l~ ,(2x+i, 2y+j),
~ i,j,
(1)
t
where M(i,j) is a 4 x 4 smoothing kernel or the generating kernel according to Butt [4]. The twodimensional kernel M(i,j) is Cartesian separable, i.e., M(i,/)=M(i)M(j). The one-dimensional function is defined subject to the following constraints: 1. Normalization: }~=-t M(i)= 1" 2. Symmetry: M ( - I ) = M ( 2 ) = h and M ( 0 ) = M(l)=a; 3. Unimodality: a)b>~O. The constraints stated above extend to M(i,j) so that M(i,j) is normalized, symmetric and unimodal. The function I~(x, y) denotes the image value at coordinates (x, y) and pyramid level I. It is referred to by the term entity or node in this paper. The value held by an entity can be a scalar or a vector. The following discussion relates to the general case of multidimensional image functions. Each entity at level l is also referred to as the father node of those in level l - 1 which are involved in (1), such as [1/ ~(2x+i, 2 y + j ) : i , j = - l , 0,1, 2}. The associated nodes on the level immediately below are called son nodes of ll(x, y).
2.1.1. The equivalent smoothing kernel, Et(i,j) In (1) we express the pyramid building process as a recursive operation defined in terms Vol, 31. No. 2, March 1993
136
I. N g et aL / S u p e r v i s e d s e g m e n t a t i o n
Level n 1xl Nodes at level (I) of the pyramid are the weighed average of V the 16 related nodes (or son nodes) from labial I1-1 ~ n f the= n v r ~ r n i H
Level 2x2
n-1
•
Fig. 1. Pyramidal image structure.
of the father-son relations between successive layers of the pyramid. This suggests that there exist other ways to formulate this process, Burt [4]. In fact (1) can be rewritten as (2). That is,
Like M(i,j), Ez(i,j) is also Cartesian separable and is given as Ez(i,j) = EI(i)Et(j), where the onedimensional equivalent kernel function E~(i) is related to M(. ) by a recursive expression 2
El(x) = ~
2
II(x,Y) = E
M(i,j)Ii-1(2x+i, 2y+j)
M(i)G l ( x - 2 t - I i ) ,
r~<<.x<~r/-,
i=--I
(3)
i,i= -- I
r{ ¢~ Ii(x,y) = ~ E1(i,j)Io(21x+i, 21y+j), i,j
(2)
where El(i,j) is a kernel which will be referred to as equivalent smoothing kernel hereafter and where r[ and r/- denote the spatial extent of the equivalent kernel. This means that the entity value It(x, y) can be determined directly from the original resolution image I0('," ) by convolving with an appropriate kernel El(i,j), instead of generating all the successive levels. Signal Processing
with the initial conditions stated as
rl
Eo(x) =
{10 f ° r x = O , otherwise.
The indices i a n d j are bound by r~ and r i and can be determined from the following equations: r[
=
r[-l + 21, r~- = r/-, -- 2 l- ',
(4)
with r~- = 2 and r~ = - 1. It should be noted that the equivalent smoothing kernel at level 1 is equal to the generating kernel, so that E~(. ) = M(. ).
L Ng et aL / Supervisedsegmentation
M(O)=M
M(-'I)
=
M(2)
= b
~
-1
_ 0
_
137
1
Level 2; lz(x) Level 1;ll(x)
Spatial Position, x
. . . . . . . . . 7 -6 -5
Level
-4 -3 -2 -1 0
l
Iz(O) =
I1(0+i)
E:=_ z M(i)
= blz(-1)
2 3
+alz(o)
(lob + (ha
) Io(-3) ) 10(-2)
+ (ha + ab + (bb + aa + (aa + ab + (ab + aa + (aa + bb + (ab + ba
) Io(-Z)
+ (ha
) Io(S)
+ (bb
) Io~6)
4
5
6
7 8
9
10
O;lo(x)
.lal
+ a Iz(1) +blz(2)
) IolO) I Io(Z) )/0(2) ) 1o(3) )/o(4)
÷ =
i=r~
E2(i) I ° ( ° + i )
Fig. 2. Graphical illustration of the procedure to determine the node values at various levels of the pyramid.
In Fig. 2 we illustrate how to obtain the coefficients of the one-dimensional equivalent kernel. In the example, a one-dimensional pyramid is generated and these nodes represent sample points of the image function L(x). Node values at any level are given as the weighted averaging of node values from the level immediately below which are connected by lines in the figure. The highlighted lines are used to describe how the node value of node 0 at level 2 can be computed directly from level 0 as expressed in (3). For l = 2, the width of the equivalent kernel k2 is 10. For instance, the node located at the spatial position x = 0 in level 2 is the weighted average of nodes from the spatial range - 3 to 6 at level 0 as shown in the figure. The weighting is given by the equivalent kernel
E2(x). For a = 0.37, we plot in Fig. 3 the characteristic profiles of the equivalent kernel at various levels. Note that as l is increased, the profile of E~(x) resembles a continuous Gaussian function as mentioned in [4].
These one-dimensonal and two-dimensional equivalent kernels inherit all the constraints applied to the original generating kernel. For example, in the one-dimensional case, 1. Normalization • Y ~'~ i=,; Et(t)" - 1; 2. Symmetry: E t ( r ? + i ) = E t ( r ~ - i ) , where i = 0 , 1, 2 . . . . . ½k/- 1; 3. Unimodality: Et(rl ) <<.Et(rt + 1) ~< Et(r~ + 2) ~<- - • <~Ei(ri + ~k~- 1).
2.2. Estimation of the class statistics We proceed to discuss the issue of estimating class statistics at higher levels of the pyramid in terms of the class statistics at the bottom level. We shall assume that the entities of the image function I(x, y), or Io(x, y), constitute a set of random vectors {X} which are statistically independent and identically distributed (iid) according to N(p, I ) or N(0/t, 0I). The multivariate normally distributed function N(p, I ) is specified by mean vector/t and Vol. 31, NO. 2, M a r c h 1993
I. Ng et al. / Supervised segmentation
138
Equivalent smoothing kernel at level 1; El(x)
Equivalent smoothing kernel at level 2; Ez(x)
(a = 0.37)
(a = 0.37)
o.so I
0.40
t
Note:
E~(x) = M(x)
w h e r e {x:
1,0,1,2}
r2 ÷ = 6 0.25
rt ~ = 2 r/ = - 1
r z- = - 3 k z = 6 - ( - 3 ) + 1 = 10
0.20-
._~
~ 0AIO-
0,15-
"2 0.10 0.20 0.05
O.lO
-5
-4
-3
[--
-2
-I
0
i
--7 2
+
+
+
m i
3
0.0"
1
-5 -4 -3 -2 -1
5
2
3
4
5
6
7
8
910
spatial index (x)
spatial index (x)
Equivalent smoothing kernel at level 3; Ej(x)
Equivalent smoothing kernel at level 4; E,(x)
(a = 0.37)
(a = 0.37)
0.080-
0.14 -
r S = 14
0.070 -
0.12
r3 = - 7
0.10
k3 = 14 - ( - 7 ) + 1 = 22
0.o0 2o.o.!
r4* = 3 0 r 4 =-15
0.0"0
k¢ = 3 0 - ( - 1 5 ) +
1 : 46
0.050 -
.~
0,040 0,030 -
0.04
0.020 -
0.02
0.010 +
00" o
15 spatial index (x)
2o
0.0"0
-20
-10
0
10
20
30
41)
spatial index (x)
Fig. 3. Characteristic profile of the equivalent kernel, E~. For a=0.37, the characteristic profile of the equivalent kernels from four successive levels are shown. The profile of Et(x) resemblesa continuous Gaussian function as l-, oc [4].
covariance matrix X defined as p=E{X},
r= E{(X-t,)(X-p)*}, where E { . } is the mathematical expectation operator. The implicit assumption we make here is that all pixels in the image represent the same physical phenomenon. Recalling (1), in the pyramidal data structure, an entity It (x, y) at level l is given by a linear combination of its son nodes at level l - 1. Since the vector I~(x, y) is obtained by a linear combination of normally distributed random Signal Processing
vectors, then It(x, y) itself is normally distributed as N(tp, tX) with mean vector /p and covariance matrix t -r. Before describing the method to acquire the parameters of the class distributions at coarse resolution images, we shall identify the problems we are going to confront.
2.2.1. Potential problems In order to obtain an unbiased estimate of a class statistics, we require that the data be statistically independent and identically distributed. However, as a result of the pyramid construction process the nodes at higher levels of the pyramid are no longer
1. N g
e t al. / S u p e r v i s e d
independent. Recalling Fig. 2, each node at higher level of the pyramid shares half the population of its son nodes one level below with its neighbouring nodes. Hence, the statistical independence assumption holds only when the nodes are separated by a certain distance. For example, at pyramid level 1, alternate nodes are statistically independent. Moving one level upward, that is l = 2, the independence is only valid for nodes which are three units apart. The distance required monotonically increases as we move further up the pyramid. As a result, neither can we determine the class statistics from the data at each level since the estimates would be highly biased nor can we estimate them from the subsampled independent data, because the sample size is not large enough to provide sufficient statistical confidence for the results. So far, we assumed that the entities of the image function I(x, y) are drawn from a single normal distribution. However, the image function is normally composed of several segments each filled with samples drawn from a distinct distribution. This gives rise to yet another problem when the image pyramid is constructed, namely the segments at higher levels of the pyramid will contain mixed population pixels. These originate from the kernel smoothing across the boundary of two adjacent segments. The relative proportion of these mixed pixels rapidly becomes significant as one moves up the pyramid and can seriously affect (bias) the estimated class statistics. Summing up, all these problems reinforce the need to find a way to estimate the class statistics for each level of the pyramid without recourse to the image at that level. Instead, we shall demonstrate how the class statistics at higher levels of the pyramid can be expressed in terms of the statistics estimated from the full resolution data.
established. For example, consider the mean vector tit = E{It(x, y)}. Substituting for l~(x, y) from (2), we have t#=E{i. Zr ' Et(i,j)Io(2tx+i, 2ty+j) t.
To show that both tit and iX can be expressed in terms of oit and oX, respectively, we shall involve the equivalence relation as expressed in (2). In this, a direct relation between the entity value at any level to the original resolution image has been
(5)
Further simplification yields
ttt= ~
E,(i,j)E{Io( 2ix+i, 2Zy+j)}
i,j -: rl rl I
=oit Y', Et(i,j).
(6)
i,j- rl
Since ~r~ ij=~, E I ( i , j ) = 1, we finally have tit =oit=it. Formally speaking, the mean vector is invariant under the weighted averaging process as long as the smoothing kernel is normalized. Similarly, the covariance matrix tX can be expressed as /X = E{ [It(x, y) - ,it] [I, (x, y - ,it]v}
x
E,(i,j)Io(2tx+i, 2~y+j) - , i t • .=
I I
.
rl
(7) Expanding the summation terms in (7) and substituting for tit from (6), we find
iX= E
E,(i,j)Et(i',j') i,l.i',j'=r[
x [10(2~x+ i, 2Zy + j ) - o i t ] x [Io(2 tx + i', 2 t y + j ') -oit] "r) r/
= 2.2.2. Parameter inference
139
segmentation
Z
Et(i,j)E/(i',j')
i,j,i ',j' - rl
x E.{ [I0(2/x + i, 2 ty + j ) -oit] x [Io(2%+ i', 2~'+j')-oit]T}.
(8)
Since I o ( ' , ' ) are independent, the mathematical expectation of the cross product-term deviations Vol. 31, N o . 2. M a r c h 1993
I. Ng et al. / Supervised segmentation
140
have zero value. That is,
Level l
E{[Io(2%+ i, 2~y + j ) -o/t]
{It: N(~z' ~tt) }
x [•o(2 ~x+ i', 2/y + j ' ) - 0/~]v } = 0,
•
ss
for (i,j) ~ t"i' ,J ""). Thus, (8) can be simply expressed
-"
as
-"
,T,= ~ E,(i,j)2E{[Io(2~x+i, 2~y+j)-o~] × [Io(2/x+i, 2~Y+J)-o/~] v} =o~" ~
s
I t
Is
,, tt
,i ' ,
•
El(i,j)
ttt
.
i.j= ri
Ii
" ,~ ~l
Level 0 {lo(x,y): N(Y.,g) }
El(i, j f
i,j = r l
The size of the two dimensional laaice and or
i X = r i o -r,
the equivalent smoothing kernel, E l (i, j)
wherefi=
j
;
El(i,j) 2 W>0,
is given as: ( rt+- rt-)2 (9)
l=0. This shows that tZ; can be obtained directly from o-r by multiplying it with a scaling factor ~ .
Fig. 4. Determining the node value at level 1 direct from the original resolution data.
distributed according to the parameters
2.3. Numerical example In the preceding section we developed a method for estimating the class statistics at higher levels of an image pyramid using zero level pyramid data. In this section, we shall test the validity of the estimation scheme using standard statistical tests. The two hypotheses we test are as follows: 1. Testing the hypothesis that a covariance matrix is proportional to a given matrix (the sphericity test). In this experiment, we test the statistical confidence about the estimate as compared with the theoretical predictions. 2. Testing the hypothesis that a mean vector and a covariance matrix are equal to a given vector and matrix. In particular, we shall test the statistical significance of our estimates of class statistics as compared with the expected statistical parameters, that is, fixed mean vector and scaled covariance matrix. A set of independent and identically distributed random variables are generated for both hypothesis tests. The random variables generated using the N A G routine G05AEF are bivariate normal, Signal Processing
0.0
091
I_0.9 2.0
A sample at a high image pyramid level is obtained from a two-dimensional lattice as shown in Fig. 4. The size of the lattice is equal to the size of the corresponding equivalent convolution mask, E l ( ' , " ). The lattice entities are filled by the random variables drawn independently from the bivariate normal population as given above. The coefficients of the equivalent convolution mask are evaluated via a two-step process. Firstly, a 4 × 4 generating mask M(i,j) is generated using equations described in Section 2.1. Then we apply equations from Section 2.1.1 to obtain the equivalent mask at level l. Burt [4] suggests that a = 0.37 provides the best fit to a Gaussian profile and we use this value for all experiments. It should be noted that each generated sample at level l is determined independently and does not therefore correspond to the actual averaging process in image pyramid construction. This method was adopted in order to avoid spatially correlated data. Alternatively, a normal pyramid smoothing
141
L Ng et al. / Supervised segmentation
could be used, but then it would be necessary to obtain a sample set of independent random variable by means of appropriate subsampling.
2.3.1. The sphericity test." the validity of the estimated scaling factor As mentioned, the statistical test conducted is the sphericity test [1, pp. 427 433] where we test the null hypothesis L'~=~ZT0 against the alternative ZTt-CJ~'0. More specifically, we use a sample of pcomponent vectors yj . . . . . Y N f r o m N ( # , ,-Y'/) t o test the null hypothesis. The hypothesis is tested by the following statistic: Z 2 - t r BI2°J,
(10)
where = sample mean vector, N
B = Z ( y _ j ~ ) ( y _p)v, i=l
N = the sample size, J)= the expected scaling factor,
are summarized in Table 1 with different values for the level of significance a. In all these tests, the 3(2 values obtained are well confined within the critical region established by the specified level of significance. Therefore, we can conclude that the null hypothesis is accepted at significant confidence level. More precisely, the covariance of the samples at higher levels of the pyramid is proportional to the covariance obtained from the original resolution samples. The ratio is given by (9).
2.3.2. Testing specified values Jor the mean vector and covariance matrix From the previous test, we confirmed that the covariance of the samples at higher levels of the pyramid can be obtained directly by multiplying by an appropriate scaling factor J~, the covariance matrix of the original samples, that is at level 0. Now, we conduct the second statistical test to verify our claim that the class statistics at higher levels of the pyramid can be estimated from the original resolution sample set. Given the p-component observation vectors y~ . . . . . yN from NUt, -r/) we test the null hypothesis H0 :/~ =/is, Z'~= ZT~using the following statistic [ 1, pp. 440 446] :
Z'0 = a positive definite matrix. The above statistic is distributed as Z 2 with p ( N - 1) degrees of freedom and the critical regions for such test are as follows: Alternative hypothesis ~'t:~J~'o, Reject null hypothesis if: Z2< Z~-,/2 or
Z 2 = _ 2 I 1 2p2+9p+ll] 6N~p~p + 3) 3 l°ge A''
(12)
where /
\pN/2
2 Z2> Z~,2,
where a is the level of significance. Also, an unbiased estimator o f ~ under the null hypothesis is given by
1 x exp{- 5[tr B~'~-I + N(35- ~,)T~'~?](.15-/Is)] },
-rs, ,u, = the specified mean vector and covariance matrix,
~ = tr BLro '
(11)
p ( N - I)"
.~ = sample mean vector, N
The test is applied to the first three levels of the pyramid, that is, l= 1, 2, 3. At each of these three levels, 50 bivariate data vectors are generated using the procedures described previously. The results
B = 2 (Y,-Y)(Yi-.I~) Ti
I
When the null hypothesis is true, the above statistic is approximately distributed as Z 2 with lp(p + 3) VoL 31, No. 2, March 1993
142
I. Ng et al. / Supervised segmentation
Table 1 Results of the hypothesis test: A covariance matrix is proportional to a given matrix Level of significance a a =0.05
a =0.01
Level
~
f/
X2
2 Z0975
2 )/50.025
2 Z0,995
,9(2.005
1 2 3
0.10154 0.01828 0.00395
0.09462 0.01945 0.00465
105.168 92.083 83.238
72.501 72.501 72.501
127.282 127.282 127.282
65.694 65.694 65.694
137.803 137.803 137.803
is the predicted value which can be evaluated using (9) fi is the estimated value which is obtained using ( 11 ) degrees of freedom = p(N - 1) = 2(50 - 1) = 98
degrees of freedom. The null hypothesis is rejected if Z 2 > Z ] , where a is the level of significance of these tests. We employed the same set of data used in the previous test for the current test. Table 2 shows the
test results. A high level of confidence for accepting the null hypothesis is achieved. Hence, the argument for estimating the class statistics at higher levels of the pyramid from the original resolution sample set is validated.
Table 2 Results of the hypothesis test: Testing specified values for the mean vector and covariance matrix Level
Expected class 2 statistics
1
fl = 0.09462
Estimated class I statistics
I00o001
1-00251]
' = LO.OOOOA
[-0.09462 X~=[0.08516 2
3
0.08516] 0.189243
]o.o18ol
/J = = LO.OOOOJ
Y2 = L0.0292]
0.01751] 0.038903
[-0.01933 $2=L0.01220 I--0.0064] ~ = L 0.0053]
[0.0000] ~,3= LO.OOOO I
S ~
r
0.00419] 0.00931]
F0.00416 $3=[0.00442
B
N-1
2
[0 0000] =/~z = P3 = P° = I_0.0000/
El0 0.9
0"9] 2.0
Degrees of freedom = 0.5p(p + 3) = 5 J} is the predicted value which can be evaluated using (9) Signal Processing
11.070
5.484
11.070
3.897
11.070
0.01220] 0.02995]
f3 = 0.00465
I
3.973
S,=[0.10132 0.07243] I_0.07243 0.16077_]
]o.ooooI
[-0.00465 %3=L0.0o419
2 X0.05
Y' = L-O.0618A
fz = 0.01945
[-0.01945 ~2=L0.01751
a =0.05 Z2
0.00442 l 0.009961
1. Ng et al. / Supervised segmentation
2.4. Summary The key requirement to incorporate Bayes' decision rule with a pyramidal data structure is that the class statistics at each level of the pyramid must be available. However, we have pointed out that at the coarse level it is difficult to obtain sufficient training data to estimate the parameters. Furthermore, the assumption of statistical independence of samples is violated at higher levels of the pyramid. Hence, it is impossible to determine the class statistics directly from each level of the pyramid. The idea of predicting class statistics for reduced resolution image suggested that there may be some ways to express the class statistics of any level of the pyramid in terms of estimated statistics of the original resolution samples. We have described a method for efficiently acquiring the parameters of the class distribution at each resolution level under the assumption that the original data are statistically independent and identically distributed according to N(/t,_r). We have also conducted some statistical hypothesis tests to justify the assertion. The proposed method has been found very effective on simulated data. It will be subjected to further tests in the following sections.
3. Segmentation scheme
143
the computational burden involved in the classification process in addition to having a beneficial effect on reducing image noise. The segmentation scheme is depicted in Fig. 5. The scheme involves three stages of processing. We first represent the input image by a pyramidal data structure as detailed in Section 2.1. In the second stage, the supervised coarse segmentation part takes the top layer of the image pyramid as input and produces the coarse segmentation. The final stage restores the spatial resolution of the segmentation result. In the following, the second and third stages of the segmentation scheme are explained in more detail.
3.1. Implementation 3.1.1. Supervised coarse segmentation scheme A. The Bayes classifier Let d-dimensional observation X belong to one of c possible classes, o91. . . . . co,. A canonical form of the Bayes classifier [7] is illustrated in Fig. 6 where discriminant function g~(.) is associated with class o)i. Assuming that classes are distributed normally with #i and covariance matrix _r~, then the discriminant function takes the tbrm gi (X) = loge P(co,) - ½loge[(2rt)al-ril] - ~1( X - # ~ ) T r g -- ~(X-#i),
In this section, we detail the proposed supervised segmentation scheme which is based on Bayes' decision rule and a pyramidal data representation. The key difference between our approach and a traditional pixel-based Bayes' classification approach is that our classification process is applied to a reduced spatial resolution version of the image to be segmented. The segmentation result, which we call a 'coarse segmentation', is then the input to a coarse-to-fine process to restore the spatial resolution of t h e segmentation result. The favourable features of this approach are that the supervised Bayes' classifier is known to be a very effective tool for classification problems, and that the pyramidal data representation can relieve
(13)
where P(og~) is the a priori probability of class o9~. Formally then, the classifier allocates an observaton X to class cog if and only if
gi(Y)>gj(Y)
Vj¢i.
(14)
B. Implementation of a Bayes classifier for reduced spatial resolution image When applying the Bayes minimum error rule to an image of reduced resolution such as ll(x, y), a new set of discriminant functions g~ defines the decision rule: assign Ii(x, y) to class o)i if
g~(I,(x y))>gJ(I,(x,y))
Vj#i.
(15)
Vol. 31, N o 2, M a r c h 1993
I. Ng et al. / Supervised segmentation
144
Image for Segmentation • the input image; which can be a grey level or multi-feature image I
lira
Image Pyramid•
• the original image is input to the bottom layer of the pyramid In order to Invoke the process • the PROCESS is terminated at level (I) of the pyramid
.g Supervised Coarse Segmentation
• the SEGMENTATION PROCESS is carried out at the top level of the truncated pyramid by using a Bayes classifier
.g Boundary /
• Inherit class label from the father node which has been classified either by the coarse segmentation process or the boundary refining process at the higher level • reclassify all the nodes which lie within the boundary region
J3
Segmented Image Fig. 5. Schematic diagram illustrating the Supervised Segmentation Scheme using pyramidal data representation.
Dlscrlmlnant Calculators
Observation X x2
2~"
'
,
Maxlmum Selector
dll c: number of classes or categories
Fig. 6. A Bayes classifier. Signal Processing
L Ng et aL/ Supervised segmentation In this case glUt(x, y)) is given as
145
Step 4. Generate the equivalent convolution mask
gti(It(x, y)) = logo P(~oi) - ½1ogc[(zx)dltz'il] -½[It(x, y) --,It, lTtE,'[It(x, y) -,#,1. (16)
El (i, j ) by specifying parameter a and then evaluate the corresponding scaling factor J). Step 5. Classify the pixels of the reduced resolution image It(x,y) by evaluating the discriminant functions given in (19).
Applying the results obtained in Section 2, we have
3.1.2. Boundary refining
gl(I/(x, y)) = log~ P(o9i) - ½log~[(2x~)dloZ'~ l] ~{1 l~ - ½[ I/(x, y) - o/li] 5. o27~ ][It(x,y) / (17) where t/li and/-ri are substituted by o/1~and j~ 0-r~, respectively, and the determinant of the matrix, .~oZ'i, is given as (fl)aloXil. Ignoring the zero level indices, we have
gl(I/(x, y)) = log~ P(coi) -½ log~[(2~h)dl_r~l] 1
-2~t [It(x, y) --#ilmX;-'[It(x, y ) - # , 1
(18)
or, equivalently,
gl(It(x, y)) = log~ P(o)~) - ½logc[(2r0dlz'i[] 1
1Oge[j')]
- - - [It(x, y) --#,]T--F,'[Iz(x, y) --/t,]. 2f,
(19)
F o r / = 0 , we have f0 = 1. then (19) is reduced to the expression given in (13). In summary the supervised coarse segmentation procedure involves the following steps as illustrated in Fig. 7. Step 1. Estimate the class statistics/t~, Z'~ for each class from the full spatial resolution training data. Step 2. Build the pyramid for image I(x, y) to be classified. Step 3. Choose the level at which the classification is to be carried out, say at level l.
The boundary estimation process is designed to restore the actual spatial location of the region borders. It closely follows the algorithm suggested by Wilson and Spann [23]. The original algorithm derives from a non-supervised approach. This means that no a priori knowledge of class statistics is required. The boundary node reclassifying process is then based on the minimum Euclidean metric. Modifications have been made in order to turn the refining process into supervised approach as adopted in our framework. In this process, preliminary segmentation has been obtained at the/th level of the pyramid from the last processing stage. Based on the assumption that the relative spatial location of the regions in an image is invariant over the scales of resolution, then we can immediately conclude that the classification introduced at this level of the pyramid is valid at lower levels. This implies that the segmentation process at the successive layers can be reduced to the problem of classifying a small number of nodes which lie near the region borders. To illustrate this, consider an image consisting of three different regions as shown in Fig. 8. Presuming coarse segmentation has been obtained at level l of the pyramid either from the coarse segmentation process or the previous boundary refining procedure applied at level l, then every node at the level immediately below, that is level l - 1 , inherits the class label from its father node at level l. Let us first introduce two terms: boundary node and interior node. A boundary node is a node which is less than two pixels away from the coarse region border inherited from the previous level. An interior node is a node which is not a boundary node. Vol. M. N o 2. March 1993
146
I. Ng et al. / Supervised segmentation Discriminsnt
Classify the coarse
resolution image;
Calculators
it(x#)
In put T
--
image
[~
Maximum Selector
I Image Pyramid Construction ~",specifythe neiqht of the pyr~lmid 1 ,~
[~
specify the kernel value
Coarse Segmentation at level i
a
pass the parameters._ land
a ~ ~ ,"~
t nml.
1
H m , l u . l l , l m a l , ,
\
Evaluate Scaling factor
fl
Class statistics
Training Data
Classification u n m l , , .
Modify the class statistics for the dlsclmlnant functions by using an appropriate scalinglactor ft
i
Fig. 7. Block diagram of the supervised coarse segmentation system. The upper part of the diagram shows a modified version of a typical Bayes classifieras depicted in Fig. 6. The lower part is composed of two mechanisms. The first one is the training mechanism which is used to estimate the class statistics from full spatial resolution training data. The second one is the control mechanism which is used to generate the appropriate class statistics at corresponding pyramid level. The true region boundary must be confined to the boundary region which is shaded as shown in Fig. 8. Thus the true border can be recovered by reclassifying those nodes which are lying within the boundary region. For the classification process, we make use of the classifier implemented in the previous stage of the segmentation scheme. However, some modifications are necessary to adapt it to the boundary refining process. Firstly, instead of considering all the possible class labels, the re-classification process can be viewed as a two-category classification problem as illustrated in Fig. 8. When more than two regions meet, a vote will be held among the eight connected neighbours to decide which two regions are most likely associated with this part of the region border. Furthermore, the classification is operated recursively until the full spatial resolution has been restored. The boundary refining process can be outlined in the following general form:
SignalProcessing
Step 1. At kth level of the pyramid: if nodes do not lie at the region boundaries then give them the same class label as their fathers else label as boundary nodes
Step 2. Nodes in the boundary region are classified by the Bayes classifier with the corresponding class statistics at level k. Step 3. Terminate the process if full resolution has been restored; otherwise proceed downward to ( k - 1)th level.
3.2. Numerical illustration and discussion It was remarked in the introductory section that using the supervised learning approach the class
I, Ng et al, / Supervised segmentation
membership will be more stable at coarser image resolution and this helps to achieve low classification error rate. The relevance o f this can be appreciated if one considers the following simulation study. We return also to the comments made on the S p a n n - W i l s o n algorithm. Consider Fig. 9. This 128 × 128 image is composed o f three concentric squares o f different size. The entities o f these regions are defined by bivariate data vectors which are drawn independently from three normal populations specified in the figure. The pyramid representation o f this image is constructed using the same convolution mask as in
level l i.e. I t (x,y) = 8x8 -- 64
147
Section 2.3.1. Figure 10(a d) shows the scatter plot o f the pixel vectors from four successive layers o f the pyramid starting from level 0 in the feature space. It can be seen from Fig. 10(a) that the original class density functions overlap quite severely. F o r instance only two modes are distinguishable at level 0 since density distribution o f class 1 is embedded within that o f class 3. Despite the fact that no class is well separated, the three clusters can be clearly identified in the scalar plot o f the data from the higher level o f the pyramid. In Fig. 10(b d) we can see that these three clusters become rapidly separated from each other and shrink towards their centre o f mass.
l e v e l 1-1 i.e. 11.1 (x,y) = 16x16 = 256
proceed downward the Image p y r a m i d
1
$¢
class 1 0)t class 2
0) 2
class 3 0)3
,
°
°
inherited from higher level of the p y r a m i d
this boundary node is associated with class i or class 3; hence the classification process can be reduced to a two class problem
true region border at this level
D
boundary nodes
in this e x a m p l e the total n u m b e r of pixels or nodes need to reclassify = 104 : 40.6%
Fig. 8. Boundary refining procedure. In this example, we consider a three-region image. First the image is expanded by a factor 2 on both sides. The shaded region in level l - 1 represents the boundary nodes. All nodes in the boundary region are reclassified in order to recover the true boundary which is shown as a dotted line. At this particular step, only 40.6% of image at level l - 1 needs to be processed. The percentage of reprocessed nodes will be reduced as the process proceeds down the pyramid. V o l . 3 1 , N O . 2, M a r c h
1993
148
I. Ng et al. / Superoised segmentation
The results shown in Fig. 10 clearly justify our opening statement that global information existing in an image is more salient in coarser resolution versions of this image. In other words, even though class identities are not well defined in the original image as in this example, correct classification may still be achieved in a lower resolution version of the same image. Before discussing the segmentation result for this illustration, let us go further to investigate why the clusters become more compact in higher levels of the pyramid. Recalling (1), nodes at one layer are obtained by the weighted averaging of 16 nodes from a local region of the layer immediately below. Moreover, the spatial relationship between nodes is preserved in all layers in this process. So, the relative spatial position of the regions in an image is invariant over a range of resolutions. Consequently, this enables the pyramidal smoothing
process to combine the nodes having the same statistics even in the presence of multiple regions. The rate of condensation of class density function is given by the corresponding scaling factor ft. The effect can also be seen as that of a local noise suppression process or low-pass filtering process. The variation of data within a local region is smoothed out during the pyramid building procedure. Hence, subsampling the filtered image is then permitted. As a result the next layer of the pyramid is produced. Figure 11 (b) shows the restored ground truth of the above bifeature image. The coarse segmentation is carried out at level 3 of the pyramid where the spatial resolution is equal to 16 × 16 pixels in size. It can be seen that the algorithm has produced a segmentation in good agreement with the given ground truth. In comparison, we show in Fig. 11 (a) the result of segmenting the bifeature image in full
Class 1: 32 x 32 square Class 2: (64 x 64 - 32 x 32) square ring Class 3: (128 x 128 - 64 x 64) square ring
Class 1
Class 2
Class 3
Sample size
1024
3072
12288
Mean vector
p1=~126.0~
p2= L162.0~
tj.3= L120.0~
h2601
r14807
Covariance [-324.0 291.61 [-486.0 0 . 0 0 1 [-648.0-324.01 Matrix T-1=L291.6 648.0J T-'2=L 0.0 486, T-'3=~324.0 324.0/ Fig. 9. The ground truth of the multipleclasses bivariate feature image and the statisticalparameters. Signal Processing
1. Ng et al. / Supervised segmentation
(a)
Scatter Diagram of Simulated Data Level 0 (sample size=16384) 1
25O
-
,
V
v
r
w
[
149
(b) Scatter Diagram of Simulated Data Level 1 (sample size=4096) i
r
200
150 '.7,,.
'"3.;",
100
50
0
•
0
- ~. . . . . . .
50
~
.
.
100
.
.
.
150
.
.
.
.
i
~ I
200
250 0
50
Xl
(c)
'
150
200
250
Xl
Scatter Diagram of Simulated Data Level 2 (sample size=1024)
250
i
1 O0
'
(d) Scatter Diagram of Simulated Data Level 3 (sample size=256)
'
200
150
100
50
0
0
i
i
i
i
50
100
150
200
Xi
250 0
i
t
i
i
50
100
150
200
250
X1
Fig. 10. Scatter plot of the data in a two-dimensional feature space (for statistical parameters, see Fig. 9). Data shown in (a), (b), (c) and (d) are from four successive layers of the data pyramid starting at level 0. The sample size for each plot is a quarter of that immediately below. Vol. 31, N o . 2, M a r c h
1993
150
L N g et al. / S u p e r v i s e d
(a)
;~
segmentation
(b)
.,,
,....:~.....~...~...
;~ ~ . ~ ; . . . ;,~', ~..,.'r~.~." ¢ =. ,
...............:
~,
=
• ,, ..
"~.~ , . ~ ' ; . ' " : t
Fig. 1I. Segmentationresults of the synthetic image. (a) Segmentation result obtained by using pixel-based classification method applied to the full resolution input image. (b) Segmentation result obtained using the proposed method. resolution directly. As expected, the segmented regions are extremely 'spotty'. We present also the results of coarse segmentations obtained using the Spann Wilson algorithm. As mentioned, our proposed segmentation scheme shares some similarities with Spann and Wilson's approach. For instance, multiresolution data representation is employed in both methods. However, the fundamental difference lies in the stage performing coarse segmentation. In their approach, a non-parametric classification method known as local centroid clustering is used and requires no a priori information about image statistics. These characteristics make their algorithm ideal for general segmentation problems. The local centroid clustering algorithm is most easily described by considering a single feature image, for example a grey-level image. The image information available for the clustering process is the histogram of the image. The algorithm can be viewed as an iterative process shifting the histogram masses to their nearby local distribution peaks defined within some local window. It is assumed that local maxima represent region identities. The algorithm can be easily extended to handle multiple feature images in which it works on locating the centroids of clusters in hyperspace. However, the local centroid clustering method works efficiently and effectively only under certain Signal Processing
conditions. For example, the algorithm performs well when applied to an image which can be segmented solely by grey-level information. This is because the feature space the algorithm is working in is then one-dimensional with finite resolution, say 256 possible locations for an 8-bit grey-level image. Hence, the performance in terms of computational effort is independent of the image size. However, in the case of a multiple feature image, it is impossible to visit all the possible sites existing in the finite continuous feature space in order to locate the cluster centroids. One can take advantage of the sparsity of the feature space to speed up the process, but one should remember that each pixel in an image holds some values which define a point to the feature space. In the worst case, the total number of distinct points in the feature space matches the size of the image. Since the clustering process is applied to some coarser resolution images, in general, the image size is much smaller than the dimension of the feature space. Hence substantial time reduction can be achieved. In other words, the computational efficiency should be degraded when applying the process to a large size image. Comparatively speaking, our supervised coarse segmentation method is more efficient since it is a one-pass process while the local centroid clustering algorithm needs to proceed iteratively.
151
1. Ng et al. ,; Superuised segmentation
The estimate of local centroid is constrained to some local window size. As reported in [6], the class dominance is determined by the local window size. Too small a window will tend to locate some local peaks which do not represent any distinct regions in the image. On the other hand, too big a window will smooth out small regions. In Wilson and Spann's implementation, they introduced a mechanism to remove the dependence on the local window size in which the window size is increased by some fixed value after each iteration and the clustering process terminates when successive runs produce consistent results. However, Fig. 12 shows the coarse segmentations obtained from the local centroid clustering algorithm with different local window sizes. It can be seen that the initial choice
of the local window size affects the judgement of successive runs and hence gives inconsistent coarse segmentation. Furthermore, the choice in the algorithm, such as the minimum detectable region size, affects qualitatively the segmentation result, yet there is no prior knowledge available to help to make a correct choice. Figure 13 shows the coarse segmentaton results with a different minimum size of the desired regions for the data in Fig. 12.
3.3. Summary This section covered the details of our proposed segmentation scheme. We also demonstrated on simulated data the consistency of our algorithm as
.......
iiiL++, II'I'¸ m I+ m L.W.S = 4
• ~i~]~']~ il~
tll~J~"'] +,-,N i
L.W.S = 5
L.W.S=26
L.W.S=27
mum mmm Emm L.W.S=IO
L.W.S=11
L.W.S=12
L.W.S=16
L.W.S=17
L.W.S=18
L.W.S = 22
L.W.S -- 23
L.W.S = 24
L.W.S=28
L.W.S=29
i
L.W.S = 6
~.::
L.W.S = Local Window
iii!!i!
::.:
i
ii!iiill
...... ~ , ' ~ - . , , ~ ~
::::::.::.
L.W.S=30
L.W.S=31
Size
Fig. 12. Coarse segmentations using local centroid clustering algorithm. Effect of varying local window size with the minimum detectable region size set to 6% of the image size. The segmented regions are grey-level coded. ~,'ol 31, No 2. M a r c h 1993
152
L Ng et al. / Supervised segmentation
mmm B mmm mmN L.W.S = 3
~ i ~ ' ~ ~ il~
LIWIS = 4
L.W.S = 5
L.W.S = 6
L.W.S = 9
L.W.S=IO
LlWlS=11
L.W.S=12
k.W.S -- 15
L.W.S=16
L.W.S=17
L.W.S=18
i,!iiiiiiiiii il;iii!iiii!ii!iiiiiiiiiii,iiiiill !i!!!:!~!i!i i !i!!!!i!i!i!!~ii !i!i i i,i~i 'i i i !i~
m~]~
~ il~
iiiiiiiii!i!i!i!ili!i! !i!iiiii!iii ;!iiiii! :i!i!iiiiiii ii:i!i~i!ii!i!ii!!iii~!~!!!ii!~i!iiii
0
iiiiiiii~!i!~ii!~i~!ii!i!~i!!i!!ii!~i!ii!i~!ii~i! L.W.S = 21
k.W.S=22
L.W.S=23
L.W.S=24 ii:i:~ i:ii:i:!:iii:!:i:!:!:ii:i:!:ii::!:!:
L.W.S=26
L.W.S=27
L.W.S = Local Window
L.W.S=28
i/
k.W.S -- 29
ii•i..... ! . ii ;if!iiili i i i.'i!i!iiiii i !i i i !iiiiiiii!iiii k.W.S = 30
Size
Fig. 13. Coarse segmentations using local centroid clustering algorithm. Effect of varying local window size with the minimum detectable region size set to 10% of the image size. The segmented regions are grey-levelcoded.
compared with the results achieved using Spann and Wilson's method. Some remarks have been made about both approaches especially concerning segmentation consistency. We have shown that the local centroid clustering algorithm may fail to yield consistent segmentation owing to the unpredictable effect of some of the parameters of this method which are automatically selected in a data-dependent manner. However, it cannot be overemphasized that the above findings do not diminish the importance of Spann and Wilson's approach which is designed for nonsupervised segmentation. We have simply shown that when the problem at hand can be cast as a supervised image segmentation task, the prior knowledge can be used to advantage to achieve more consistent results. Signal
Processing
4. Texture segmentation To clarify the applicability of the proposed algorithm, a set of experiments of its use in texture image segmentation is illustrated. The results are demonstrated on two sets of textured images. The first set of images is shown in Fig. 14 and is formed by combining portions of textures from the Brodatz album [3]. The use of such texture composites allows us to perform quantitative evaluation of our algorithm, since the textures are fairly homogeneous and the textured regions are unambiguously defined. Figure 15 shows the second set of images which contains two pieces of seismic sections featuring a complex salt dome structure [ 15]. A seismic section
I. Ng et al. " Supervised segmentatio~
~
(a)
~
.....
!iiiiiii!iii!i!iiiii!i,
i ~ iii!iilili!i!~ 1 : raffia 2: leather
~'~.
153
"~.~
1 : raffia (b) 2: leather
::!iii
!i!i!'
1: leather (C) 2: pigskin 3: raffia
1 : pigskin ( d ) 2: leather 3: paper
Fig. 14. Test images set 1 : Brodatz images.
(a)
. .:
: . , :s -
Fig. 15. Test images set 2: Seismic section. Vol
3 1 , N { ~ 2, M a r c h
1993
L Ng et aL / Supervised segmentation
154 Table 3
intensity corresponds to the amplitude of the recorded reflected acoustic wave from the subsurface strata in response to some artificial near-surface acoustic source excitation. To interpret the geological events, a human interpreter generally starts by seeking visually predominant features in a seismic section. For example, in these two images, the main features are laterally extended lines with varying vertical spacing and intensity. From the geology point of view, they represent various stages of sediment deposition. The continuity of the deposition was disrupted by the uprising of buried salt. The sedimentary overburdens were pulled up alongside the salt dome's flanks. The chaotic configuration inside the salt dome reflects the disruption of the continuity of the sedimentary beds during the salt tectonics. Because of these
A set o f eight 3 × 3 c o n v o l u t i o n m a s k s derived from discrete cosine t r a n s f o r m ( D C T ) Basis vector:
UI:{1,1,1~
U2: { 1 , 0 , - 1 }
U3:{1,
UI U2
U I U3
U2 U1
u2 u2
1
1
1
1
1
1
1 0
-1
0
0
0
2
2
2
1 0
-1
I
1
1
1 0
-1
-I
-1
-I
U3 U1
U2 U3 1 0 -2
1 -2
-I
0
2
I 0
-1
1 0 -1 0 0 0 -I 0 l u3 u3
U3 U2 1
1 -2
1
1 -2
1
1 -2 0 -1
2,1}
1 0
1
0
2
-2
2
I
4
2
1 -2
-1
1
can be considered as an acoustic image of the subsurface geologic structures of the Earth along a vertical cross-section line. In these images, the pixel
Level 0 120x120 ,.
"'
•
.
Level 2 30x30
Level 1 60x60 .
:.
'
'.."
,.,
J~:iiii!it~/t ~i!/~/!i!i/i!ii
-..,
.~
Level 3
IlI/lt
15x15
:::::::::::::::::::::: ::::::::::::::::::::::: :::::::::::::::::::::: ::::::::::::::::::::::: ::::::::::::::::::::::::::::::::::::::::::::::: :+:+:+:+:+:, ::::::::::::::::::::::
Pyramid Construction
::::::::::::::::::::::
..............................................................
::::::::::::::::::::::: :::::::::::::::::::::::
Boundary Refining
~ : ~ : ~ : : ::,::,::~::~,~' ::::::::::::::::::::::: ~ :::::::::::::::::::::::: :::::::::::::::::::::: ::::::::::::::::::::::
~~ ::i:i:i:i:i:i: ::::::::
~.~; ~r
i!iii!ii!!ii~!i!!!ii!!!!
~#~:~:
~.:.$.~
~
iiiiiiiiii! :::::::::::
::
!ii~i!i!ii!~i!i~!!i~ill
~'~
LevelO 120x120 final segmePitation~
.~i~ :~.a..:~!:~
.>.:i::.?~!i:':~!:
~:~:~":~:=~
Level1 60x60
: --
i!iiiiii iii
~i!~i!i!~ii~iii!i!i~!
~
.:~:.,e.i. .::: ~ ~ :
~.{~:.ii.::i!:::
Processing
:i:i:i:i:i:;:i:i:i:i:i: :::::::::::::::::::::::
::~::i!i!;!i!;!;!;!;!;il)i!i !i!iii!;i;!;ili;ii!iiii ::::::::::::::::::::::::::::::::::::::::::::::::::::::: ::::::::::::::::::::::::::::::::::::::::::::::::::
,
"=~:'~
:!:i:i:!:i:i:i:i:!:iii:i?ili~i:iiiii!!!!!!!!!i: !i:!:!:!!!!!!:i!!:: '
Segment,
~'~:~'"~:~:~:~:~'~:~:~:~:~:~:~:~:~:~::
Level2 30x30
Fig. 16. G r a p h i c a l illustration o f the p r o p o s e d s e g m e n t a t i o n on i m a g e (c) of Fig. 14. Signal
:::::::::::::::::::::::
:i~i:~'.i:~:~? t :i~:i:;zi:~i:~;.. . . . . !:?:!:!:!:!:i:!:i:!:!::' :::::::::::::::::::::::
155
L Ng et al. / Supervised segmentation
(a)
(b) ~"~g".~." ~ ' ~ . ~ : "
.~:/~.'~N
(c)
i/
~i!:..
ql
(d)
.....,~=~.~ •
" .~a
Fig. 17. Segmentation results on Brodatz images. From left to right, the images in column 1 show the result of segmenting the multiple feature image in full spatial resolution directly. The results shown in column 2 were obtained from the proposed segmentation scheme. In columns 3 and 4, the images show the estimated region superimposed on the original test image and the corresponding ground truth, respectively.
V o l ~1, NO 2, March 1993
156
L Ng et al. / Supervised segmentation
Table 4 Correct classification rate on Brodatz images Algorithm used
Image (a): raffia/leather Image (b): raffia/leather Image (c): leather/pigskin/raffia Image (d): pigskin/leather/paper
Supervised classification at level 0
Proposed segmentation scheme
Class
Class
I
2
78.69 78.70 59.79 53.01
68.38 68.22 43.70 62.68
3
1
2
3
61.85 53.40
94.97 95.43 90.92 93.95
96.84 89.44 93.42 83.80
95.80 89.40
Correct classification rate as % of the class region sizes
structural characteristics a seismic section can generally be considered as a texture image [24]. The segmentation process operates on a set of 'feature images' which have been extracted from the original textured image. As texture is characterized by the correlation of grey levels of neighbouring pixels, individual pixel values cannot be used directly for the segmentation purpose. Hence, in this paper, texture is represented by a set of filter outputs. More specifically, we convolve the textured images with a set of eight 3 x 3 masks derived from the Discrete Cosine Transform (DCT) and compute the absolute values of the filter outputs as features. These masks are shown in Table 3 which are the product of three basis vectors. Here, we use directly the rectified filter outputs as the basis for collecting the statistics. In contrast to the Laws method [13] which involves a second stage of averaging, our approach is similar to Sher and Rosenfeld [19] in which eight 3 × 3 Laws [13] masks were used instead. The choice of using 3 x 3 D C T masks is made on the basis that they form a complete set of operators from the Fourier domain point of view. As pointed out in [16], the 3 x 3 D C T masks can be considered as bandpass filters and the set of eight 3 x 3 D C T masks covers the Fourier plane except for the centre low-pass region which bears no contribution to texture discrimination. In the following, we detail the experiment procedure which involves obtaining training samples and class statistics in both sets of images, Signal Processing
respectively. Segmentation results will then be presented, followed by a detailed discussion. In addition, we shall point out that the so-called absolute average ABSAVE feature defined by Laws [13] can be effectively represented in our multiresolution data structure.
4.1. The experiment 4.1.1. Brodatz texture
The first set of four textured images are 128 x 128 pixels in size and are composed of various combinations of four Brodatz textures: raffia, leather, pigskin and paper. The Brodatz texture images are independently histogram equalized into 256 grey levels so that they have the same first-order statistics. Hence the low-pass D C T mask can be safely discarded. The test images are then generated according to the corresponding ground truth as shown in Fig. 14. The training data are obtained in the following way. The Brodatz texture images used are of 256 x 256 pixels in size. We divide each image into four non-overlapping subimages of size 128 x 128. F r o m each texture image we extract training samples from one of the subimages which is not used to compose the test images. The training samples are subjected to the same texture extraction process as described above in order to obtain the class statistics for the classifier. In fact, the classifier is designed in an eight-dimensional feature space.
1. Ng et al. / Supervised segmentation
(a)
(b)
(c)
(d)
157
Fig. 18. Segmentation results on seismic image (a) of Fig. 15. Images (a) (d) show the regions with similar textural appearance to the training classes 1 to 4, respectively. The salt dome is delineated as shown in image (d).
T o illustrate the w a y the a l g o r i t h m operates, we refer to a series o f images in Fig. 16 which d e m o n strates e x p e r i m e n t a l results on image (c) o f Fig. 14. The a c t u a l feature images we started with are o f size 120 x 120. The u p p e r p a r t o f the figure shows
the coarse s e g m e n t a t i o n o b t a i n e d at the first f o u r layers o f the p y r a m i d . W e use the same c o n v o l u t i o n m a s k M(i,j) as in the previous experiments for the p y r a m i d building process with the p a r a m e t e r a = 0.37. As can be seen, the region identities b e c o m e Vol. 31, N o 2, March 1993
158
I. Ng et al. / Supervised segmentation
more certain when moving up the pyramid. As a compromise between the degree of noise immunity and the ability to retain small distinct regions, the pyramid is terminated at level 3 with the size of 15 x 15 at which we obtained the coarse segmentation. Boundary refining is then progressed until full spatial resolution is restored as shown in the lower part of Fig. 16. The same process is applied to the rest of the test images which include the seismic sections. For quantitative comparison, the segmentation results for these four images are given in the series of images in Fig. 17. From left to right, the images in the first column show the result of segmenting the multiple feature image in full spatial resolution directly. The results shown in the second column were obtained from our segmentation scheme. In the last two columns, the images show the estimated region superimposed on the original test image and the corresponding ground truth, respectively. It can be seen that our segmentation scheme has produced very good segmentation on these four images. This finding is supported by the numerical evaluation of the correct classification rate of all images as shown in Table 4. It is notable that the classification accuracy goes up substantially compared with the classification based on full resolution feature images.
4.1.2. Seismic section The two images used in the second set of experiments feature a complex salt dome structure of size 512 x 512 and 476 x 500, respectively. The experimental results are demonstrated on segregating the salt structure from the sedimentary basin in both images. Alongside the centre salt structure we visually perceive three other texturally distinct regions in the seismic image in Fig. 15(a). Training samples were acquired from these regions as highlighted in the image with the corresponding class labels. The training samples are then preprocessed by convolving with the eight DCT masks followed by taking absolute values as the basis for collecting the class statistics. These class statistics are used in segmenting both images. In order to ensure that the class Signal Processing
statistics apply to the other image, we histogramequalized the second image in Fig. 15(b) with the intensity histogram of the first image as reference. For each test image, we divide it into nine overlapped 242 x 242 images. These images are then segmented using the proposed segmentation scheme. The coarse segmentation is obtained from level 4 of the pyramid with the spatial resolution of the feature images reduced down to 15 x 15 pixels. The results are then pasted together to show the segmented regions. Figures 18 and 19 show the segmentation results. As can be seen, the salt structure is clearly delineated in both images with good perceptual accuracy. Furthermore, the training patterns obtained in the first image seems well applied to the second image with similar textured composites. The results confirm that the algorithm is suitable for routine applications where each image region represents one of a finite set of image phenomena whose statistical properties are approximately invariant over a large set of images.
4.2. Discussion 4.2.1. Relationship to Laws' ABSA VE features In his work on texture analysis, Laws [13] developed a set of textural properties based on average values of a set of small convolution mask outputs. He also concluded that one of the most useful statistics was the sum of the absolute values of the image over a large moving window after the set of small size masks, called Laws masks, are convolved with it. The ABSAVE feature f(i,j) defined in a moving window, say Wx W, is given as 1
f(i,J)=~2
W-I
Z ly(i-p,j-q)[, p,q=0
where y(-, • ) denote the Laws mask outputs. As pointed out by Hsiao and Sawchuk [ 12] the choice of averaging window size affects the classification and segmentation quality. They indicated that when calculating statistical features the moving averaging window may sometimes overlap two types of textures. When this occurs, the resulting
1. N g et al.
/Supervised s e g m e n t a t i o n ~:
159
......
.~.
.'-.:~L
.........T*.~72T:..,,.~
•~ ~=C~ .~-.. ~ -
;5<:,,:,~.~,;-
tc)
-
::?:----~-~-:-~
.-: .
_~.~c.__ ~ ' ; ~
-:,, ...... ~'~sL-.;U':.~- ~
-
•
,~,:"..~
"........
(d)
:,:,.,;:::.~:.
;":':.~- 7
~
<-~
.
- ~.,-': "
"~.~..,,:,~ ...~..~...~...:T,.~;z,5"
~'~'~. " - " : " ; . - ~ L ~
.'--'~,'~-'~-,,.~
Fig. 19. Segmentation results on seismic image (b) of Fig. 15. Images (a) (d) show the regions with similar textural appearance to the training classes 1 to 4, respectively. The salt dome is delineated as shown in image (d).
statistics b e c o m e the mixture o f two sets o f statistics. They then introduced an additional step using a w i n d o w o f smaller size to reduce the possibility o f mixing statistics along region boundaries. The
estimates are further s m o o t h e d in a w i n d o w o f larger size by an edge-preserving noise s m o o t h i n g filter. However, the size of the additional w i n d o w is still chosen empirically. Vol.
~1, No.
2, M a r c h
1993
160
L Ng et al. / Supervised segmentation
To show that our multiresolution data structure provides an effective representation of the ABSAVE features, we consider (2). It can easily be observed that individual pyramid layers hold the equivalent of the ABSAVE feature obtained from varying averaging window size. As a result, we have estimates of ABSAVE corresponding to different moving window size concurrently available for use by the segmentation purpose. Thus various degrees of within-region homogeneity are used in a cooperative way to achieve segmentation. The highly smoothed features are used to decide the region identities, while the less smoothed features can be used to fine tune the accuracy of the region boundary location. Furthermore, the statistics for various smoothed features can be estimated efficiently using (6) and (9).
4.2.2. Segmentation quality To evaluate the segmentation accuracy, we consider the experimental results from the Brodatz texture images. The discussion is also applied to the seismic sections. The proposed supervised segmentation scheme using multiresolution data representation seems to work satisfactorily in all experiments and produce very good segmentation. Although we cannot recover the exact ground truth of the Brodatz test images, the classification accuracy achieved is still high. However, it should be pointed out that there exist a lot of factors which may affect the classification accuracy. In our case, for instance, the segmentation algorithm relies on the performance of the texture characterisation/representation method used. Since the texture property is defined in terms of a patch of neighbouring pixels rather than in single pixel scale as in grey-level image, uncertainty affects the texture features extracted near the region border. Such feature vectors are prone to misclassification. Another factor affecting the classification accuracy is the height of the pyramid. On the one hand, to cope with high levels of noise, it is desirable to start the coarse segmentation process at a high level of the pyramid affording Signal Processing
a good separation of image segment classes. However, operating at a high pyramid level limits unfavourably the minimum size of detectable segments. In addition, this is associated with a large proportion of boundary pixels for which the assumption about scaleability of class statistics does not hold. A suitable compromise, therefore, is necessary and it is of course easier to reach it in the supervised segmentation mode than in the non-supervised one. Even though our proposed algorithm does not perform so well near region boundaries, it performs much better than a pixel-based supervised classification algorithm when considering the classification accuracy of the region interiors. In general, a post-processing step must be used to improve the consistency of the region interiors in order to obtain similar results. Such post-processing algorithms include probabilistic relaxation techniques [12] and maximum a posteriori probability estimation [2, 21]. However, they are computationally demanding.
4.2.3. Computational efficiency We proceed to discuss the computational efficiency of our scheme. Table 5 lists the computational times required in the Brodatz texture experiment. In our case, all algorithms are implemented in FORTRAN on a MicroVaxII. From Table 5 we can see that the computational time required for our algorithm is at most half the time required to classify the feature images at full spatial resolution. Most of the computational time of the proposed algorithm is spent on the boundary refining process. For more complex images, more boundary nodes are generated, and as a result it takes longer to reclassify them (see image (b) of Fig. 14). The computational efficiency is also slightly affected by the total number of texture classes involved. For example, the time taken to compute the coarse segmentation on images (c) and (d) of Fig. 14 is about 0.12 seconds of CPU time longer than on images (a) and (b) of Fig. 14 which contain
L Ng el al, : Supervised segmentation
161
Table 5 Times taken for the complete process of image segmentation on Brodatz images (see Fig. 14)
Coarse segmentation at levcl 3 (15x 15j
Image (a)
Image (b)
Image (c)
Image (d)
CPU time ~ Classified (sec) nodes
CPU time' Classified (sec) nodes
CPU time t Classified (sec) nodes
CPU time Classified (sec) nodes
0.48
225
0.48
225
0.70
225
(t.71
225
Boundary refining 0.57 at level 2 (30 × 30) at level 1 (60 x 60) 1.23 at level 0 (120x 120) 4.71
404 876 1988
0.98 2.(11 8.12
640 1424 3420
(I.74 1.58 5.79
512 1100 2440
0.75 1.72 6.74
496 1188 2836
3493 14400 ~
11.59 30.64 ~
5709 14400 ~
8.81 45.71 ~
4277 14400:
9.92 45.69 ~
4745 144002
Total
6.99 30.642
The experiment is executed on a MicroVaxll • Computational time taken for the coarse segmentation algorithm applied to the full resolution image, i.e. 120 × 120 pixels
two distinct textures and the former have three distinct textures. However, the increase of distinct classes will not slow down the boundary refining process as the reclassification of boundary nodes is considered as a two-category classification problem between two predominant regions around the boundary nodes. The reduction of computational time compared with full spatial resolution pixel-based classification methods is not only determined by the image complexity, it also depends on the choice of pyramid level at which the coarse segmentation is taking place. The computational time required will be reduced substantially if the whole segmentation process starts from a higher level of the image pyramid. To demonstrate this, consider the example shown in Fig. 20. In this example, a 32 x 32 image with two regions separated by a vertical edge is used. A three-layer pyramid is constructed in which the coarsest resolution is 4 x 4. In the coarse segmentation process, there are 32, 64 and 128 boundary nodes identified at level 2, level 1 and level 0 of the pyramid, respectively. Summing up, the number of nodes being processed in both procedures is 16 nodes from coarse segmentation process and total of 224 from the boundary refining process. Hence, to obtain the final segmentation, it is only required to classify 240 nodes in total. On
the other hand, if we apply the pixel-based classification algorithm to the full spatial resolution image, we need to process a total of 1024 nodes. About 76%, of computational time is saved if we exclude the overhead of generating the pyramid. Suppose that we start the whole process far from the highest level of the pyramid, say level 1: in other words, 256 nodes are classified to obtain the coarse segmentation result. Presuming the estimated border is located at the same position as in the previous case and at the same pyramid level as well, then we have 128 boundary nodes being generated at level 0 of the pyramid. We will now need to classify 384 nodes to obtain the final segmentation. As a result, 144 extra nodes are involved in the whole process compared with the previous case. Moreover, the segmentation quality will be downgraded (more 'noisy') as the coarse segmentation process starts fl-om a much lower level of the pyramid. As can been seen, in order to obtain higher computational efficiency, we should start with the coarsest resolution which we can afford. Furthermore, as mentioned before, the certainty of class membership will be strengthened in a lower-resolution representation of an image. This means that better segmentation quality may be obtained. However, we should note that small regions tend Vol :~1,Nt!. 2, March 1993
162
1. Ng et al. /Supervised segmentation
!ii iii!ii:iii i~;ii:!!i !!i iiiiiii i~iii
Boundary Refining
iiii;ii I
!iii!iil
- Level 2 No. of n o d e s = 8 x 8 = 6 4 No. of b o u n d a r y n o d e s = 32
- Level 1 No. of n o d e s = 1 6 x 1 6 = 2 5 6 No. of b o u n d a r y n o d e s = 64
- Level 0 No. of n o d e s = 3 2 x 3 2 = 1 0 2 4 No. of b o u n d a r y n o d e s = 1 2 8
Fig. 20. A four-layer pyramid.
to disappear when moving up the pyramid. This is a side effect of the noise suppression process as discussed in Section 3.2. Thus, there is a trade-off between the degree of noise immunity and the ability to retain small distinct regions. The choice of the pyramid must be made with these trade-offs in mind.
5. Conclusion A supervised segmentation scheme has been described in which a Bayesian approach incorporating a pyramidal data structure is used. The favourable features of this approach are that the supervised Bayes classifier is known to be a very Signal Processing
effective tool for classification problems and that the pyramidal data representation can relieve the computational burden involved in the classification process. Furthermore, spatial information can be incorporated into the classification process to overcome the weaknesses of purely statistical methods. The key question in incorporating Bayes decision rule with pyramidal data structure is that the class statistics at each level of the pyramid must be available. However, we have pointed out that unbiased estimate cannot be achieved by direct calculation. Instead, a method for efficiently acquiring the parameters of class distributions at each resolution level has been introduced. Experimental results have been presented which demonstrate the power of the method. However, this performance
L Ng et al. / Supervised segmentation
can only be expected when class statistics are normally distributed and image segments are of commensurate size to the level of noise corrupting the image.
Acknowledgments The first author was supported by a scholarship from the Croucher Foundation. We gratefully acknowledge the comments of the reviewers who helped to improve the paper. Furthermore, we wish to thank BP Exploration for providing seismic data.
References [ 1] T.W. Anderson, An Introduction to Multivariate Statistical Analysis, Wiley, New York, 2nd Edition, 1984. [2] J. Besag, "On the statistical analysis of dirty pictures", J. R. Statist. Soc. B, Vol. 48, 1986, pp. 259 302. [3] P. Brodatz, Textures A Photographic Album Jor Artists and Designers, Dover, New York, 1966. [4] P.J. Burt, "Fast filter transforms for image processing", Comput. Graph. Image Process., Vol. 16, 1981, pp. 20 51. [5] P.J. Burt, T.H. Hong and A. Rosenfeld, "Segmentation and estimation of region properties through co-operative hierarchical computation", IEEE Trans. Systems Man Qvbernet., Vol. SMC-I 1, 1981, pp. 802 809. [6] J. Daida, R. Samadani and J.F. Vesecky, "Object-oriented feature-tracking algorithms for SAR images of the marginal ice zone", 1EEE Trans. Geosci. Remote Sensing, Vol. 28, 1990, pp. 573 589. [7] O.R. Duda and P.E. Hart, Pattern Classification and Scene Analysis, Wiley, New York, 1973. [8] K.S. Fu, "'Pattern recognition in sensing of the earth's resources", IEEE Trans. Geosci. Electronics. Vol. 14, 1976, pp. 10 18. [9] M. Goldberg and S. Shlien, "'A clustering scheme for multispectral images", IEEE Trans. Systems Man Cybernet., Vol. SMC-8, 1978, pp. 86 92.
163
[10] R.M. Haralick, "Image segmentation survey", in: O.D. Faugeras, ed., Fundamentals in Computer Vision, Cambridge Univ. Press, Cambridge, 1983, pp. 209 224. [ I 1] C. Home, Unsupervised Image Segmentation, PhD Thesis, Ecole Polytechnique Fbd6rale de Lausanne, Lausanne, Switzerland, 1990. [12] J.Y. Hsiao and A.A. Sawchuk, "Supervised textured image segmentation using feature smoothing and probabilistic relaxation techniques", 1EEE Trans. Pattern Anal. Machinc Intell., Vol. 11, 1989, pp. 1279 1292. [13] K.I. Laws. "Textured image segmentation", USCIPI report no. 940, Image Processing Institute. University of Southern California, January 1980. [14] M.D. Levine and A.M. Nazif, "Rule-based image segmentation", Comput. Vision Graph. Image Process., Vol. 32, 1985, pp. 104 126. [15] R. McQuillin, M. Bacon and W. Barclay, An Introduction to Seismic Interpretation, Graham and Trotman, London, 1979. [16] 1. Ng, T. Tan and J. Kittler, "On local linear transform and gabor filter representation of texture" in: Proc. llth lnternat. ('on/i Pattern Recognition, Vol Ill, The Hague, The Netherlands, 30 August 3 September 1992, pp. 627 631. [17] A. Rosenfeld, ed., Multiresolution Image Processing and Analysis, Springer, Berlin, 1984. [18] A. Rosenfeld and A.C. Kak, Digital Picture Processing, Vol. 2, Academic Press, New York, 1982. [19] C.A. Sher and A. Rosenfeld, "'Detecting and extracting compact textured regions using pyramids", hnage Vision Computing, Vol. 7, 1989, pp. 129 134. [20] M. Spann and C. Home, "Image segmentation using a dynamic thresholding pyramid", Pattern Recognition, Vol. 22, 1989, pp. 719 732. [21] C. W. Therrien, "'An estimation-theoretic approach to terrain image segmentation", Comput. Vision Graph. Image Process., Vol. 22, 1983, pp. 313 326. [22] M. Unser and M. Eden, "Multiresolution feature extraction and selection for texture segmentation", IEEE Trans. Pattern Anal. Machine hztell., Vol. PAMI-11, 1989, pp. 717 728. [23] R. Wilson and M. Spann, Image Segmentation and UncertainO', Research Studies Press, 1988. [24] Z. Zhang and M. Simaan. "A knowledge-based system controlled by an iterative quadtree splitting scheme for segmentation of seismic sections", IEEE Trans. Geosci. Remote Sensing, Vol. GE-26, 1988, pp. 518 524.
V o l 51. No. 2, March 1993