Pattern Recognition 33 (2000) 1819}1833
A multiresolution texture gradient method for unsupervised segmentation Tao-I. Hsu *, Jiann Ling Kuo , R. Wilson Chung Cheng Institute of Technology, Department of Computer Science and Engineering, Tai-Shi, Tao-Yuan 335, Taiwan, ROC Department of Computer Science, University of Warwick, Coventry CV4 7AL, UK Received 7 January 1999; received in revised form 25 June 1999; accepted 25 June 1999
Abstract A texture segmentation algorithm is described, which is designed using a cooperative algorithm within the Multiresolution Fourier Transform (MFT) (Wilson et al., IEEE Trans. Inform. Theory 38(2) (1992) 674}690) framework. The magnitude spectrum of the MFT is employed as feature space wherein the detection of texture boundaries is estimated by means of the combination of boundary information and region properties. A pre-smoothing process is "rst applied to the MFT magnitudes to reduce the texture #uctuations followed by Sobel operator performed on the MFT magnitudes to give a texture boundary estimate. The re"nement of the boundary estimate is accomplished by utilizing both boundary link probabilities and region link probability in an iterative manner. Results on a number of synthetic and natural textures that illustrate the e!ectiveness of the scheme are presented. 2000 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved. Keywords: Texture segmentation; Unsupervised; Multiresolution Fourier transform
1. Introduction Segmenting an image into uniformly textured regions is a prerequisite for many computer vision and image understanding tasks [2]. Texture segmentation can be achieved either by extracting uniform texture regions or detecting textural discontinuities (texture edge detection) between image regions. It can be performed either in a supervised or an unsupervised mode, depending on whether a priori knowledge about the type and number of textures in an image is given or not. 1.1. Problems in texture segmentation Natural textures generally contain an inherent randomness or variability. This variation a!ects any local
* Corresponding author. Tel.: #886-3-3805249; fax: #8863-3894770. E-mail address:
[email protected] (T.-I. Hsu)
texture measure used in segmentation. The MFT coe$cients clearly provide a rich set of features for discrimination as described in Ref. [3]. What is required is a way of identifying region boundaries without making unrealistic assumptions about the homogeneity of the texture within a given region. 1.2. Uncertainty and segmentation To date, automatic segmentation has only met with limited success [2]. A fundamental di$culty in automatic segmentation centres around "nding an adequate measurement of texture properties and the compatibility of simultaneously measuring texture features and their spatial position, both of which are essential in segmentation. Uncertainty arises from the con#ict between these two measurements [2]. In order to segment textured images into meaningful regions, a su$ciently large area averaging process is needed to reduce the #uctuations in texture properties. A consequence of the averaging is that the textured image becomes more homogeneous, but at the cost of blurring the boundaries. On the other hand,
0031-3203/00/$20.00 2000 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved. PII: S 0 0 3 1 - 3 2 0 3 ( 9 9 ) 0 0 1 7 7 - 6
1820
T.-I. Hsu et al. / Pattern Recognition 33 (2000) 1819}1833
boundary detection methods having high spatial resolution are likely to generate spurious responses within the textured regions. To resolve this &what is it' and &where is it' problem, Wilson and Spann advocated the use of a multiresolution approach [2]. A useful characteristic of multiresolution approaches is region property invariance: those nodes in an estimate considered as interior to a region are given as the same class as their &fathers' at lower resolution. This assumption of consistency of region properties across scales is a fundamental principle, which is related to the scale-space ideas introduced by Witkin [4] and re"ned by Koenderink [5]. 1.3. Approaches to texture segmentation Texture segmentation can be achieved either by detecting texture discontinuity between texture image regions or by extracting uniform texture regions. Therefore, segmentation methods are often categorised as boundarybased, region-based or as a hybrid of the two [6,7]. The aim of boundary-based approach is to detect feature inhomogeneities [8}11]. Particularly, Bovik et al. [10] used 2-D Gabor functions as localized "lters to encode images into multiple narrow spatial frequency and orientation channels. By analyzing the responses of the "lters, it was possible to segregate textural regions of di!erent spatial frequency, orientation or phase characteristics. On the other hand, the wavelet transform is obviously adequate for dealing with scale-invariant structures such as luminance edges, it seems unlikely that the standard wavelet transform would provide a very e!ective means of discriminating textures. As reported in Ref. [12], the author views the texture element as a particular function of the wavelet orthonormal basis and uses the orientation and narrow frequency tuning to discriminate the textured image. This method has achieved limited success. Furthermore, the wavelet transform are limited in their resolution of key texture properties, such as orientation and granularity [13]. This is an inevitable result of their representation of frequency with just three orientations per scale within the wavelet dyadic transform. Therefore, unlike the Multiresolution Fourier Transform, there are many textures which cannot be discriminated on the basis of their wavelet transform energy distributions. The basic premise of region-based methods is to seek feature homogeneities [14,15]. In their work, a method based on pyramid node linking is used [14]. By linking regions (nodes) of one level with the most similar regions in adjacent levels, segmentation can be achieved. Particularly, a combination of &bottom-up' and &top-down' linking was found e!ective. Hybrid Approaches combined region and boundary information in image segmentation [16}21]. Work reported by Spann and Wilson [18] uses a quadtree method using classi"cation at the top level of the tree, followed by boundary re"nement.
A non-parametric clustering algorithm [19] is used to perform classi"cation at the top level, yielding an initial boundary, followed by downward boundary estimation to re"ne the estimate. It is e$cient and requires no a priori knowledge of the image regions. A generalization of this work was applied to texture segmentation [20,21].
2. Texture segmentation using the MFT The MFT is a generalized form of wavelet transform and Gabor transform, which has the ability to represent arbitrary signals of compact support over a range of resolutions in an interference-free manner [1]. In this hierarchical representation form, the MFT can trade o! the resolution in spatial and spatial frequency domains, by increasing the resolution in the spatial domain and decreasing the resolution in the frequency domain. In the 2-D case, the top level of the MFT representation is the discrete Fourier transform of the image, and the intermediate levels resemble the windower Fourier Transform with decreasing window size. The window function used in the MFT is a band-limited Finite Prolate Spheroidal Sequence (FPSS). The FPSS is used as a windowing function with maximal simultaneous energy concentration in both spatial and frequency domains [22]. The bottom level is the original image. In segmentation, the class resolution is determined by the bandwidth of the "lters, a smaller bandwidth corresponding to a larger impulse response, giving higher class resolution and lower spatial resolution. Thus, the use of FPSS's leads to optimum class and spatial resolution in the segmentation. There is good reason to think that the Fourier phase carries relatively little useful information for segmentation [2,23], although its use in positioning the prototype in synthesis is clearly essential [3]. The Fourier phase depends on the relative position of the features. As an illustration, in Fig. 1, the "gures in the left column show the single feature case; the right column shows a case of two features. The phase plots in (c) and (d) show no major di!erence between the two cases, whilst the magnitude plots in (e) and (f ) show clearly that the presence of a pair of features in a given orientation has a large e!ect on the Fourier magnitude spectrum. Therefore, while the phase carries some information about the structure of the texture element, a signi"cant amount is carried in the magnitude. Since averaging is necessary to reduce the e!ects of variations in the texture, it can only be e!ective if the linear phase information is removed. The simplest way to achieve this is to use magnitudes as feature set in the segmentation. Bovik et al. [10] adopted a simple peak "nding scheme and used 2-D Gabor functions as localized "lters to represent images. The Gabor amplitude and phase are used to locate the texture
T.-I. Hsu et al. / Pattern Recognition 33 (2000) 1819}1833
1821
Fig. 1. Fourier phase for various feature combinations: top "gures are input images; middle "gures are the corresponding Fourier phase plots in the horizontal direction; bottom "gures show their amplitude spectra.
boundary. Their method has achieved very good results if the simple test images are considered. Moreover, as reported in Buf 's work [25] the quality of the phase information degrades signi"cantly if the textures are disturbed by a small amount of jitter or by additive noise which limits phase spectrum appicability. Additionally, the energy spectrum has been shown to be a useful texture measurement [18,24]. This suggests that the modulus of the MFT coe$cient "x( (n, x, p )" is used as the J
feature set to be used in texture segmentation. A given scale l of the MFT is a windowed FT (WFT) of the signal, with the window scale (and sampling intervals) a function of l x( (n (l), x (l), l)" w (n !n (l)) x (n ) G H J I G I I ;exp [!jx (l) . n ] H I
(1)
1822
T.-I. Hsu et al. / Pattern Recognition 33 (2000) 1819}1833
where n (l) are the spatial sampling points and x (l) the G H frequency sampling points at scale l of the transform. The sampling intervals follow the dyadic scheme: n (l)"2n (l#1) G G
(2)
x (l)"2x (l!1) H H
(3)
In other words, the scale is given by p "2+\J for an J image of size 2+;2+ pixels. Correspondingly, the window on level l, w (n), is twice as big as that on level l#1. J Use of the Fourier energy spectrum is of course equivalent to using the autocorrelation of the signal. 2.1. Outline of segmentation algorithm In analogy with the synthesis work presented in Ref. [3], the segmentation work makes use of the spatial localization of the MFT to measure texture properties. In this process, both the boundary and region information are combined to place the texture boundary accurately. A pre-smoothing process is applied to the MFT magnitudes to reduce the texture #uctuations. Edge detection is performed by a simple pair of Sobel operators on the MFT magnitudes to give a texture boundary estimate. To reduce false edges and gaps, the estimated boundary is enhanced by making use of estimated boundary link probabilities. The gradient vectors are averaged to yield an iteratively updated boundary estimate. An iterative scheme is also adopted for local smoothing within regions. Similarly, the region properties are updated after each iteration by averaging with neighbors, based on region link probabilities. The re"ned boundary estimate obtained at a given level from the averaging process is interpolated using a low pass kernel [26]. Following interpolation, the information is propagated down to the next resolution in a multiresolution framework, whereby both the needed boundary and region information are used successively until the "nest spatial resolution is reached. Interaction between the region and boundary processes is based on link probability estimates derived from the gradient estimation. An overview of the scheme is shown in Fig. 2.
Fig. 2. Outline of the texture segmentation at single level.
If the pre-smoothing operator S is given as
1 1 1 1 S" 1 1 1 9 1 1 1
(4)
then the smoothing operations are accomplished by the convolution operation among the blocks and within the blocks and implemented in a cartesian separable manner
3. Pre-smoothing process n
Local magnitude spectra are often highly variable. Under these circumstances, segmenting the image into regions is di$cult. This can be alleviated by smoothing the MFT coe$cient "eld prior to the classi"cation process. The smoothing can be accomplished by simple local averaging of the MFT magnitude spectra, i.e. replacing the local magnitude spectrum at each MFT block by an average of neighboring MFT blocks.
x
a (n , x , l)"S(n ) * S(x ) * "x( (n , x , l)" G H S T S T " S(n !n ) S(x !x ) "x( (n , x , l)" (5) G S H T S T ST where "x( (n , x , l)" is the MFT coe$cient magnitude at G H n x a given level l, * denotes spatial convolution and * denotes convolution in the frequency domain.
T.-I. Hsu et al. / Pattern Recognition 33 (2000) 1819}1833
Within a given region, averaging reduces local #uctuations and hence the variability of the local spectrum, while preserving the mean value. Inevitably, averaging also blurs the borders of the region, but the boundary process will still extract the regions more or less correctly, although it will smooth out irregularities in their borders.
1823
with a block size of N ;N , the vertical and horizontal J J gradient components of texture boundary are given by n
a4(n , x , l)"a (n , x , l) * S (n) K H K H 4
(9)
and n
4. Gradient vector estimate
a&(n , x , l)"a (n , x , l) * S (n) K H K H &
The texture boundary position between di!erent regions can be considered as a location where texture properties change abruptly. This transition can be measured in terms of the gradient of a function x along r in a direction h, that is, *x *x *m *x *f " # "x cos h#x sin h. K D *r *m *r *f *r
(6)
The magnitude of the gradient (x#x corresponds to K D the amount of texture change across the boundary. In order to use gradient methods, two problems must be solved. First, a suitable de"nition of gradient must be used for the MFT. To solve this, a vector image has to be employed. Secondly, an appropriate representation of the gradient must be used. 4.1. Orientation representation To deal with the second point, the orientation estimate is to be used as contextual information for further processing (cf. Section 6). In averaging operations, it is important to "nd a suitable orientation representation, which will prevent orientation sign ambiguity. Furthermore, unlike scalar boundaries, vector boundaries are not oriented (i.e. there is no ordering in RL, n'1). This leads to the requirement for an orientation representation which takes a line with orientation h as the same as a line with orientation h#p. As noted by Granlund [9], a &double angle' representation satis"es this requirement; 2h"2 (h#p) mod 2p.
n
where * is convolution over spatial co-ordinates as in Eq. (5). In this process, a Sobel operator [28] is employed, with a pair of kernels S and S de"ned as 4 &
1 0 !1
S " 2 0 !2 , 4 1 0 !1
S " &
x!x cos 2h D "r KD h " K (8) KD KD 2x x sin 2h K D KD where x and x are the gradient components, r " K D KD x#x. This is the representation used in the segmentaK D tion algorithm. 4.2. Gradient operation The initial texture boundary detection is obtained by block-based gradient estimation. Using the presmoothed MFT local magnitude spectrum a (n , x , l) G H
1
2
1
0
0
0 .
(11)
(12)
!1 !2 !1
Based on these two kernels, the gradients in vertical and horizontal directions can be obtained respectively, for each Fourier magnitude component a (x ). These must G then be combined to give an overall estimate of the spatial variation of the coe$cients. Fortunately, there is a simple way to accomplish this, which uses the double angle representation. Since the representation of [8] can be applied independently to each MFT coe$cient, all that is needed to obtain an overall estimate is to sum over all frequencies, giving
h(n, l)"
(7)
A gradient vector estimate h (n) in terms of the &double angle' representation [27] is obtained as follows:
(10)
a (n, l)!a (n, l) cos 2h (n, l) 4 & "r(n) , 2a (n, l) a (n, l) sin 2h (n, l) 4 &
(13)
where ,\ a (n, l)" a4(n, x , l)a4(n, x , l), 4 G G G
(14)
,\ a (n, l)" a&(n, x , l) a&(n, x , l), & G G G
(15)
,\ a (n, l) a (n, l)" a4(n, x , l) a&(n, x , l), 4 & G G G
(16)
where N is the input MFT block size. As a consequence of the inner product with respect to the frequency coordinates, the orientation vector estimate is a subsampled image corresponding to the block size used. Each
1824
T.-I. Hsu et al. / Pattern Recognition 33 (2000) 1819}1833
Fig. 3. One level of the orientation pyramid is created corresponding to the MFT block-based gradient operation.
orientation vector corresponds to one MFT block of size N;N acting as the overall estimate of the spatial variation of the coe$cients. Fig. 3 shows this mapping on one level, where M;M is the total number of MFT coe$cients in a given level and N;N is the MFT block size followed by the block-based innner product yielding a level of orientation pyramid size of N/M;N/M. The smaller block size N is the larger orientation vertor estimate will be and vice versa. This process is repeated over di!erent scales of the MFT spectrum, creating an orientation pyramid [29]. Averaging the orientation vector estimate can reduce the #ucturation among the coe$cients signi"cantly which is essential in segmentation. As an illustrating example, Fig. 13 shows the processing results of the test images in Figs. 7}10.
5. Boundary enhancement 5.1. Constrained enhancement The enhancement of the detected texture boundary makes use of a sigmoid function to give a notional &link probability function P ' de"ned as follows. Suppose for @ a given gradient vector estimate, h(n, l), a weighting function f can be formed to measure the interaction between
di!erent vectors, given by (h(n, l)#h(f, l)) ) u(n!f) f (n, f, l)" p J where
(17)
1 p" ""h(n , l)"" (18) J G M G and u(n!f) is the unit vector in the direction of the di!erence vector (f!n), represented in double angle form. In e!ect, f (n, f, l) can be thought of as the interaction energy between pixels n and f, which is maximum when the link vector (n!f) is collinear with the local orientation estimate. Averaging the orientation vectors at the two points ensures that the energy is symmetric, f (n, f, l)"f (f, n, l). This form of energy function has been used by Yang and Wilson in a Hop"eld neural network for edge detection [30]. Given this energy function, a simple way to estimate the link probability P (n, f, l) is @ via the logistic function, which is analogous to the Gibbs distribution approach used in Markov Random Field theory [31]. Speci"cally, if a neighborhood system is in thermal equilibrium with its surroundings, then the probability of molecular con"guration H is given by G P (i)"e\&G / e\&G @ G
(19)
T.-I. Hsu et al. / Pattern Recognition 33 (2000) 1819}1833
which is a Gibbs distribution with respect to the neighborhood system. Analogously, the &boundary link probability' function is written as 1 P (n, f, l)" @ 1#exp [!f (n, f, l)]
(20)
where 1 in the dominator represents a normalization factor. Boundary links are established between a given &pixel' on some level l and its 8-neighbors, as shown in Fig. 4. For each iteration n, the gradient vector h(n, l) is updated by weighted averaging, based on the link strengths 1 P (n, f, l) hL\(n, l). (21) hL(n, l)" n P (n, f, l) n @ fZ, @ fZ, As a result of constrained enhancement, the link probability P (n, f, l) will be large if the boundary goes @ through blocks n and f and low otherwise. The energy along the boundary will tend to be redistributed along the direction of the boundary.
6. Region averaging process 6.1. Region linking
1825
Rosenfeld et al. [32]. Probabilistic relaxation is an iterative approach for using neighborhood properties and compatibilities to reduce local ambiguity. The scheme used here is similar to that used in Ref. [29] in two respects: it is an iterative scheme in which region properties are smoothed by neighborhood operations; boundary and region processing is co-operative. In this operation, the local links between blocks are iteratively updated. In analogy with the interaction coef"cient in the boundary enhancement algorithm of Section 5, for a given gradient h(n, l) at n, a region link probability of n and its four neighbors f3N, n P (n, f"h(n, l)), is approximated by 0 P (n, f"h(n, l))Jexp [!( f (n, f, l)] ∀f3N 0 n
(22)
where 1 p" ""h(n , l)"" J G M G
(23)
and f (n, f, l) is de"ned in Eq. (17). If the average vector in terms of double angle representation is given as u(n, f, l)"h(n, l)#h(f, l)
The use of cooperative processing through the mechanism of probabilistic relaxation was "rst introduced by
"""u(n, f, l) ""
cos 2h(n, f) sin 2h(n, f)
(24)
then f (n, f, l)"""u(n, f, l)"" cos (h(n, f)! (n, f))
(25)
and
(n, f)"Arg (n!f)
(26)
In other words, blocks n and f at level l are considered to be part of the same region unless the local gradient has large component parallel to the vector joining the pixels. Note that this is not the same as saying that there is a boundary link between them. The new local spectrum in the region averaging process at iteration n is obtained by weighting averaging aL\(f, x, l) PL (n, f"h(n, l)) 0 aL(n, x, l)" fZ,n n PL (n, f"h(n, l)) fZ, 0
Fig. 4. Boundary enhancement using 8 direct neighbor gradient vectors.
(27)
To compare the boundary and region links, Fig. 5 illustrates the case in which both links are high, because the boundary is parallel to the vector joining the two blocks. After a number of iterations, the change in the boundary estimate should fall within some bound, terminating the process.
1826
T.-I. Hsu et al. / Pattern Recognition 33 (2000) 1819}1833
Fig. 5. Case where adjacent pixels have high region and boundary link probabilities (cf. Fig. 15). Fig. 6. Consistency check among 4 neighbour gradient vectors against their father node.
7. Propagation process 7.1. Propagation of estimates to higher resolutions
higher levels can be propagated down to lower levels. An orientation pyramid is created, based on the process described above, over a range of scales. Further enhancement of the boundary estimate is obtained on the basis of the multiresolution features, in a form of coarse-to-"ne approach that assumes the invariance of region properties over a range of scales. In this multiresolution approach, evidence of the region properties at one resolution is passed to following resolutions. Based on the spatial consistency of the region, the accuracy of the boundary estimate can be improved (cf. Ref. [2]). Speci"cally, a boundary region is de"ned at the coarsest level by thresholding the region link probabilities. At successive levels, the blocks not classi"ed as boundary region are assigned to the same class as their father. Otherwise, further estimation is performed to re"ne the candidate boundary by a factor of two using the MFT hierarchical structure. As a result, at the "nest resolution a boundary at the full image size can be produced. The following algorithm is used for propagation. stage 1: Construct the region link probability on the largest scale of the orientation vector estimate. stage 2: Set link weighting function ¹ by comparing J with a given threshold value
l if P (n, f"h(n, l))'THRESHOLD 0 ¹ (n, f)" J 0 otherwise
(28)
where the THRESHOLD is given as 0.75 which assumes that 75% of the links are &on'. stage 3: Construct the region link probability at one level down only when their father's link weighting functions with its 4 neighbors are not all mutual &on' shown as Fig. 6. Those nodes whose father's link weighting functions with their 4 neighbors are all mutual &on' are forced to be &on' too. Hence, the more reliable estimate from
l if (¹ (n, f)"1), J\ ¹ (2n#m, 2f#n)" J 0 otherwise.
(29)
where m and n represent the relative addresses of the children on level l of a node on level l!1. stage 4: Re"ne the boundary estimate by further checking the nodes where their fathers' link weighting function is not on. Because the strength of region links along the boundary is less than that in the homogeneous area, further iterations are performed on these links. Meanwhile, boundary enhancement is also performed only over those blocks where the region links are not on to avoid the emergence of spurious boundary nodes. stage 5: Generate a new version of the MFT magnitude spectrum as in Eq. (27). As a result of one level of this process, the regions are averaged and the boundary enhanced. 7.2. Boundary propagation In the multiresolution approach, features detected at higher levels are generally more reliable than those from lower levels, as the bigger window size reduces #uctuations. Thus, interpolation of the boundary estimated from higher levels may be used to improve the accuracy of the gradient estimate at following levels. This principle of propagating gradient estimates from low to high resolution was "rst established by Clippingdale [33] and has been used by Bhalerao [29] and Yang [30]. The gradient estimate resulting from a higher level is interpolated and combined with the gradient estimate from the current level as follows: ) ) h(n, l)"b(l) w(m, n) h(m /2#m, m /2 K\) L\) #n, l!1)#(1!b(l)) h(m , m , l)
(30)
T.-I. Hsu et al. / Pattern Recognition 33 (2000) 1819}1833
1827
where n"(m , m )2 is the spatial co-ordinate, w (m, n) is the low pass kernel used as the interpolation "lter [26], and h(n, l) is the gradient estimate before propagation. A 4;4 version of this lowpass "lter is employed in this work, so that the computational cost is kept at a minimum. The coe$cient b(l) is the variability ratio of the gradient estimate, which controls the amount of information propagated down to the lower level and is based on the measured gradient variances at the levels, p J\ b(l)" p#p J J\
(31)
where p is the average value of the magnitude square of J the gradient estimate (cf. 23). If the value of b(l) is large, which indicates larger variation of the gradient estimate obtained at the lower level, then more information from the higher level is needed. This will typically occur when the block size of the MFT is smaller than the texture element size. Fig. 7. Synthetic structured texture pair.
8. Presentation and description of results Various experiments have been performed on both synthetic and natural images of size 256;256 pixels to test the e$cacy of the method. Note that in these experiments extrapolation of the blocks on the image boundary was used to minimize edge artifacts. The testing textured images are synthesized based on composition of the various natural images. The texture boundary is formed using a linear recursion equation. For a sample of white noise u(n), the boundary position b(n) on line n is de"ned as b(n)"b(n!1)#w(n)
(32)
where w(n) is the increment, de"ned by w(n)"0.9w (n!1)#0.43u (n).
(33)
As a result of the linear recursion, the current position on the boundary depends on the last position of the boundary so that the boundary line is generated smoothly. The experiments are performed over 5 scales and 4 texture pairs with results of 1, 2, 4 and 8 iterations. The 4 texture pairs are depicted in Figs. 7}10. 8.1. Gradient pyramid The orientation is represented in double angle such that one color corresponds to an angle h and its complement to h#n/2. The color reference map is displayed in Fig. 11. The gradient estimate of the texture boundary consists of two components, the magnitude and the orientation.
Fig. 8. Synthetic random texture pair.
Fig. 12 shows the orientation estimates of four test images over MFT levels 3, 4, 5 and 6, with MFT block sizes of 16;16, 32;32, 64;64 and 128;128. As a consequence of the random #uctuations in the textures, there exist spurious edges in the orientation estimates. These spurious edges are reduced gradually after applying the iterative estimation process, while the boundary magnitude is preserved (cf. Section 8). The magnitudes of the
1828
T.-I. Hsu et al. / Pattern Recognition 33 (2000) 1819}1833
Fig. 9. Natural random texture pair.
Fig. 11. Synthesized reference map for orientation.
strength on the "rst iteration; the enhanced boundary link after three iterations is shown in (b); (c) and (d) show the results with di!erent scales. The boundary link probabilities depicted in Fig. 14 show that the link strength between two points along the boundary has been enhanced if their orientations are similar, but otherwise the link strength has been reduced. In particular, the boundary breaks which occur at the image edge have been reconnected. 8.3. Region link probability
Fig. 10. Natural structured texture pair.
preliminary estimate of the texture boundary are shown in Fig. 13. 8.2. Boundary link probability Fig. 14 displays the boundary link probability of the grass}water texture pairs at the MFT block scale of 16;16 and scale of 32;32: (a) shows the boundary link
The region blocks are averaged using the region link probability as the weighting function during the iteration process. Fig. 15 shows the region link probabilities of the grass}water texture pair at the MFT block size of 16;16. (a) depicts the region link strength of the "rst relaxation process; two ambiguous regions of the right hand homogeneous region are joined after eight iterations as shown in (b). Thus, the iterative smoothing removes much of variation in the original data, while preserving the boundary. 8.4. Segmentation results Since the segmentation results are produced from one level and propagated down to the next level, it is useful to show the generating order of the results, as in Fig. 16. The results of the iterative estimation of the synthetic random texture of Fig. 8 are shown in Fig. 17. The results of the iterative estimation of the natural texture of Figs. 9 and
T.-I. Hsu et al. / Pattern Recognition 33 (2000) 1819}1833
1829
Fig. 12. Orientation pyramids of testing texture pairs: (a) Synthetic structured texture, (b) synthetic random texture, (c) natural random texture grass water, (d) natural structured texture burlap reptile skin.
Fig. 13. Magnitudes of orientation estimates correspond to those in Fig. 12: (a) Synthetic structured texture, (b) synthetic random texture, (c) natural random texture grass water, (d) natural structured texture burlap reptile skin.
10 are shown in Figs. 18 and 19, respectively, where the spurious edges have been removed. For the burlap-reptile texture pair in Fig. 19, the detected boundary reveals two high magnitude boundaries at high resolutions. For
the 128;128 boundary estimate, the MFT block size of 4;4 is much smaller than the size of texture element of both burlap and reptile. The result is that the MFT at this scale acts primarily as a luminance edge detector, so
1830
T.-I. Hsu et al. / Pattern Recognition 33 (2000) 1819}1833
Fig. 14. Boundary estimate is enhanced via boundary link probability iteratively: (a) Boundary links at MFT block scale of size 16;16, (b) boundary links after 3 iterations, (c) boundary links at MFT block size 8;8, (d) boundary links after 3 iterations.
Fig. 15. Region connectivity is described via region link probability iteratively: (a) is the region link probabilities of the grass-water texture pair at the MFT block size of 16;16 in the "rst relaxation process, (b) depicts region link strength after 8 iterations.
T.-I. Hsu et al. / Pattern Recognition 33 (2000) 1819}1833
1831
Fig. 16. The generating order of the segmentation results illustrated in Figs. 17}19.
Fig. 18. Iteration results of orientation pyramid of grass-water texture pair.
9. Conclusions
Fig. 17. Iteration results of orientation pyramid of synthetic random texture.
that the action of the Sobel operators on this detected edge gives a double response. Otherwise, the results show that the boundary can be located quite readily.
A segmentation algorithm based on the MFT has been presented in this paper. The uni"cation of both boundary and region information is employed in the algorithm. A block-based Sobel operator is "rst implemented to yield a boundary gradient estimate. The use of the double angle representation to indicate the orientation is an e!ective method to detect boundaries in vector "elds. The boundaries obtained here show some uncertainty because they are derived from area properties. If the MFT block size is smaller than the size of the texture element, spurious edges are obtained in the initial estimate. Hence, cooperative processing of the boundary information is employed. This uses estimated link probabilities to enhance the initial boundary estimate. Also, boundary vectors that are generated at the coarse resolutions are propagated through to the next level to constrain the estimate of the texture boundary. This has
1832
T.-I. Hsu et al. / Pattern Recognition 33 (2000) 1819}1833
Fig. 19. Iteration results of orientation pyramid of burlapreptile texture pair.
the advantage of giving a sharp and accurate boundary, while using only local operations, resulting in computational savings. Another advantage of using iterative linking is that it does not require initial parameter settings or prior knowledge of the number or size of the regions. By basing all the processing on local operations, no assumption of uniformity of region properties is required, unlike the work of Spann and Wilson [2,18], for example. The encouraging results demonstrated illustrate the e!ectiveness of the scheme.
References [1] R. Wilson, A. Calway, E.R.S. Pearson, A generalized wavelet transform for fourier analysis: the mult iresolution fourier transform and its application to image and audio signal analysis, IEEE Trans. Inform. Theory 38 (2) (1992) 674}690.
[2] R.G. Wilson, M. Spann, Image Segmentation and Uncertainty, Pattern Recognition and Image Processing Series. Research Studies Press Ltd, England, 1988. [3] Taoi Hsu, R.G. Wilson, A two-component model of texture for analysis and synthesis, IEEE Trans. Image Process. 7 (10) (1998) 1466}1476. [4] A. Witkin, Scale-space "ltering, Proc. IEEE ICASSP-84, 1984. [5] J.J. Koenderink, The structure of images, Biol. Cybernet 50 (1984) 363}370. [6] J.M.H. du Buf, M. Kardan, M. Spann, Texture feature performance for image segmentation, Pattern Recognition 23 (1990) 291}308. [7] T.R. Reed, J.M.H. du Buf, A review of recent texture segmentation and feature extraction techniques, Image Understanding 57 (1993) 359}372. [8] D. Wermser, C. Liedtke, Texture gradient: a new tool for the unsupervised segmentation. The Third Scand. Conf. on Image Analysis, July 1983, pp. 85}89. [9] G.H. Granlund, In search of a general picture processing operator, Comput Graphics Image Process. 8 (1978) 155}173. [10] A. Bovik, M. Clark, W. Geisler, Multichannel texture analysis using localized spatial "lters, IEEE Trans. Pattern Anal. Mach. Intell. 12 (1) (1990) 55}72. [11] K. Fukunaga, Introduction to Statistical Pattern Recognition, Academic Press, New York, 1979. [12] S.G. Mallat, Multifrequency channel decompositions of images and wavelet models, IEEE Trans. Acoust. Speech Signal Process. 37 (12) (1989) 2091}2110. [13] S. Livens, P. Scheunders, G. Van de Wouwer, D. Van Dyck, Wavelets for texture analysis, an overview, Proc. 6th Int. Conf. on Image Processing and its Aplplications, Vol. 2, Dublin, Ireland, July 1997, pp. 581}585. [14] M. Pietikainen, A. Rosenfeld, Image segmentation by texture using pyramid node linking, IEEE Trans. Systems Man Cybernet SMC-11 (1981) 822}825. [15] A. Teuner, O. Pichler, B.J. Hosticka, Unsupervised texture segmentation algorithm with feature space reduction and knowledge feedback, IEEE Trans. Image Process. 7 (1998) 53}61. [16] J. Haddon, J. Boyce, Image segmentation by unifying region and boundary information, IEEE Trans. Patt. Anal. Machine Intell., 12(10) : 929}948, October 1990. [17] P. Andrey, P. Tarroux, Unsupervised segmentation of Markov random "eld modeled textu red images using selectionist relaxation, IEEE Trans. Pattern Anal. Mach Intell. 20 (1998) 252}262. [18] M. Spann, R.G. Wilson, A quad-tree approach to image segmentation which combines statistical and spatial information, Pattern Recognition 18 (3/4) (1985) 257}269. [19] R. Wilson, M. Spann, A new approach to clustering, Pattern Recognition 23 (12) (1990) 1413}1425. [20] R. Wilson, M. Spann, Finite prolate spheroidial sequence and their applications II: Image feature description and segmentation, IEEE Trans. Pattern Anal. Mach. Intell. 10 (1988) 193}203. [21] M. Spann, Texture description and segmentation in image processing, Ph.D. Thesis, Department of Electrical and Electronic Engineering, University of Aston in Birmingham, 1985.
T.-I. Hsu et al. / Pattern Recognition 33 (2000) 1819}1833 [22] A.D. Calway, The multiresolution fourier transform: a general purpose tool for image analysis, Ph.D. Thesis, Department of Computer Science, The University of Warwick, UK, September 1989. [23] J. Eklundh, On the use of fourier phase features for texture discrimination, Comput. Graphics Image Process 9 (1979) 199}201. [24] C.H. Chen, A study of texture classi"cations using spectral features, Proc. 6th Int. Conf. on Pattern Recognition, Munich, 1982, pp. 1064}1067. [25] J.M.H. du Buf, P. Heitkamper, Texture features based on Gabor phase, Signal Processing 23 (1991) 227}244. [26] R. Wilson, A.H. Bhalerao, Kernel designs for e$cient multiresolution edge detection and orientation estimation, IEEE Trans. Pattern Anal. Mach. Intell. 14 (1992) 384}390. [27] H. Knutsson, Filtering and reconstruction in image processing, Ph.D. Thesis, University of LinkoK ping, Sweden, 1982.
1833
[28] W. Pratt, Digital image processing, Wiley, New York, 1976. [29] A.H. Bhalerao, Multiresolution image segmentation, Ph.D. Thesis, The University of Warwick, UK, 1991. [30] H.-C. Yang, R. Wilson, Orientation-directed edge detection using a Hop"eld neural net work, in: Proc. 8th IEEE Workshop on Image and Multidimensional Signal Processing, Cannes, France, 1993. [31] S. Geman, D. Geman, Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Trans. Pattern Anal. Mach. Intell. 6 (6) (1984) 721}741. [32] A. Rosenfeld, R.A. Hummel, S.W. Zucker, Scene labelling by relaxation operations, IEEE Trans. Sys Man Cyber SMC-6 (1976) 420}433. [33] S.C. Clippingdale, R. Wilson, Quad-tree image estimation: a new image model and its applicat ion to MMSE image restoration, Proc. 5th Scand. Conf. Image Anal. Stockholm, Sweden, 1987, pp. 699}706.
About the Author*TAO-I HSU received the B.S. degree in Applied Mathematics from Chung Cheng Institute of Technology, Tao-Yuan, Taiwan, in 1982, the M.S. degree in Electrical and Computer Engineering from Naval Postgraduate School, Monterey, CA, USA in 1988, and the Ph.D. degree in Computer Science from the University of Warwick, Coventry, UK, in 1994. He was an Instructor in Computer Science at Chung Cheng Institute of Technology from 1988 to 1991. Since 1994, he has been with the Computer Science Department, Chung Cheng Institute of Technology, where he is currently an Associate Professor and Director of the Image Analysis and Multimedia Communication Laboratory. His research interests include wavelets, image processing, data compression and multimedia communication. Dr. Hsu has acted as a reviewer in a number of conferences, mainly in computer vision and image processing. He has led a number of projects funded from the National Science Council of Taiwan, ROC, and other sources. About the Author*JIANN-LING KUO received the B.Sc. degree in the Applied Mathematics from Chung Cheng Institute of Technology, Tao-Yuan, Taiwan, in 1986, the M.Sc. degree in Information Technology for Manufacturing from the Department of Engineering, University of Warwick, Coventry, UK, in 1993. He is now a Ph.D. candidate in the Graduate School of Electrical Engineering, Chung Cheng Institute of Technology. His main research interests are Image Processing and Multimedia Communication. About the Author*ROLAND WILSON received the B.Sc. and Ph.D. degrees from the Department of Electrical and Electronic Engineering, University of Glasgow, UK, in 1971 and 1978, respectively. From 1978 to 1985, he was a Lecturer in the Department of Electronic and Electrical Engineering, University of Aston. From 1982 to 1983, he was a Visiting Professor at Linkoping University, Sweden. In 1985, he was appointed to a Senior Lectureship in the Department of Computer Science, University of Warwick, Coventry, UK In 1992, he was promoted to a Readership. He has published more than 80 papers in the areas of Communication Theory, Image and Audio Signal Processing and Neural Networks. Dr. Wilson was jointly awarded the Pattern Recognition Society Medal for Best Paper (published in Pattern Recognition), with M. Spann (1985). He is an Editorial Board member for the journal Pattern Recognition and an Associate Editor of the IEEE TRANSACTIONS ON IMAGE PROCESSING.