Neurocomputing 99 (2013) 46–58
Granule-view based feature extraction and classification approach to color image segmentation in a manifold space

Tingquan Deng a,b, Wei Xie b,*

a College of Science, Harbin Engineering University, Harbin 150001, PR China
b College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, PR China
Article history: Received 30 September 2011; received in revised form 2 May 2012; accepted 5 June 2012; available online 3 July 2012. Communicated by B. Apolloni.

Abstract
Sound segmentation technology is central to image analysis and computer vision. In this paper, a granule-based approach to color image segmentation is introduced, formulated by combining color jump connection-based granule construction with a manifold learning-based feature extraction technique. Aiming at characterizing irregular objects in an image, a granulitized model is established by using techniques of jump connected segmentation and morphological reconstruction to represent objects accurately. The Laplacian eigenmaps (LE) manifold learning technique is applied to extract features of granules automatically, which allows smooth and texture information to be taken into consideration effectively. The Markov chain Monte Carlo (MCMC) method is explored for the process of granule merging. Experiments demonstrate that the proposed approach to color image segmentation reaches high precision of granulitized representation and reliable feature characterization of objects, and yields promising segmentation results.
Keywords: Granulation; Jump connection; Laplacian eigenmaps (LE); Feature extraction; Markov chain Monte Carlo (MCMC)
1. Introduction

Image segmentation describes the act of defining what is an object in an image and determining to which object, referred to as a class, each pixel belongs. Although the notion of an object often involves semantic information depending on the considered application, in general, an object contains similar pixels, whereas pixels from different objects are dissimilar. As segmentation aims at dividing the whole image into several meaningful parts, it is the premise of image understanding. An efficient image segmentation technique is beneficial to almost all fields related to image understanding and computer vision, such as object recognition, multi-scale image registration and piecewise rendering, and content-based image retrieval.

Formally, in the image segmentation problem, we are dealing with a pixel space $X$ and a space of class labels $\Omega$. All pixels in $X$ are potential candidates to be labeled, and $X$ can be any kind of space (e.g. the scalar or real vector space $\mathbb{R}^n$). Obviously, an overly complex classifier system $f: X \to \Omega$ may allow a perfect partitioning of the collection of all pixels $\{x_i\}_{i=1,\dots,n}$ into $k$ meaningful groups $C_1, \dots, C_k$. These groups are commonly referred to as objects, and the process of finding a natural division of pixels into homogeneous regions is referred to as segmentation.
* Corresponding author. E-mail address: [email protected] (W. Xie).
Various techniques of image segmentation have been developed, based on active contours and level sets [9], region splitting and merging [23,15,10,5], mean shift [8], graph cuts [17], and so on. Although the literature on image segmentation is vast, almost all techniques fall into three categories: pixel-oriented, edge-oriented and region-oriented. The pixel-oriented techniques usually model the feature vectors (e.g. color) associated with each pixel as samples from an unknown probability density function and then try to find clusters (modes) in this distribution. The mean shift algorithm, K-means and mixtures of Gaussians are the main pixel-clustering methods for segmentation. The edge-oriented techniques often detect boundaries iteratively by moving towards their final solution under the combination of image forces and optional user-guidance forces. The key point in these methods is to define an energy function such that segmentation is achieved by minimizing it. A typical example of edge-oriented methods is active contours. In contrast, more recent region-oriented algorithms often evaluate regions with measures such as intra-similarity and inter-dissimilarity, and iteratively combine adjacent regions whose inter-dissimilarity is below a given threshold or whose areas are too small, or split regions whose intra-similarity is below a given threshold. At the initial stage, each pixel is taken as a region in region merging algorithms, while the whole image is taken as a region in region splitting algorithms. Here, we go into region splitting and merging techniques in detail.

The graph-based region merging [10] and probabilistic multiscale approach [15] are typical examples of region merging
techniques. The former, considered as a kind of bottom-up technique, defines an image as a graph in which pixels are nodes and the edges are weighted by measuring the relative dissimilarities between nodes, and merges two regions (nodes) if the edge weight between them is less than the minimum internal difference. The latter is based on gray-level (or color) similarity and texture similarity and treats an object as a statistical region in which color and texture have the same statistics [15,5]. The key to this approach is the merging criterion and merging order. The pairwise statistics between regions are used to calculate the likelihoods that two regions should be merged based on different criteria. The accuracy of segmentation can be evaluated based on the law of large numbers.

Splitting an image into meaningful finer regions is also one of the oldest region-based techniques in computer vision. The oldest splitting technology obtains the segmentation from a quadtree representation of images. If the intra-region similarity is less than a given threshold, the region is separated into four subregions of the same size. This method lacks flexibility in segmenting images. More recent splitting algorithms often optimize some global criteria, such as intra-region consistency and inter-region dissimilarity. Normalized cuts is a typical region splitting based method. As opposed to graph-based segmentation, the normalized cuts approach is considered as a kind of top-down technique [23]. It also defines an image as a graph in which pixels are nodes and edges are formed by examining the affinities (similarities) between nodes (e.g. color differences and spatial distances), and attempts to partition groups that are connected by weak affinities by considering edges in non-decreasing order of weight. In order to yield a segmentation which is neither too coarse nor too fine, the splitting and merging techniques can be combined, and regions are allowed to be split or merged in two-way movements.

The key concern in region-based segmentation algorithms is how to extract the features of regions and measure the similarity or distance between them, since regions are merged or split in terms of their distances in feature spaces. Thus, automatically extracting features is crucial to image segmentation. Unfortunately, a commonly used version of region representation, which takes all color values of pixels in a region as the regional feature, lacks precision and efficiency.

Furthermore, the aforementioned three kinds of segmentation techniques, pixel-based, edge-based and region-based, take each pixel as a processing unit, namely any two pixels can possibly be separated. The use of pixel-level technology implies three limitations. First, a single pixel-based technique is sensitive to noise and rarely sufficient for preserving the shape of an object. Because neighborhood structures of pixels carry plenty of information, techniques of image segmentation should directly concern information in regions, rather than in pixels, for effective characterization of objects. Second, if a pixel is directly considered as a processing unit, a pixel-level technique may cluster similar pixels into different objects during merging and splitting processes. Third, computational complexity may increase significantly in comparison with granule-level segmentation. To solve these problems, regular block-based segmentation methods arise naturally.
A typical technique was developed based on partitioning an original image into n × n square blocks (patches) overlapping by a width of m (m < n) pixels, and all patches are clustered into distinct classes [26]. However, no matter what methods for feature extraction and classification of blocks are used, edge-blocking effects cannot be avoided, see Fig. 1, for example. In this figure, the concerned object appears in pink. We choose n = 8 and m = 2. The non-overlapping pixels are indicated in blue and the result of segmentation is shown in red. Because each 4 × 4 block of non-overlapping pixels has only one
Fig. 1. Edge-blocking effects of image segmentation based on regular partition. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
label, the 16 pixels in such a block necessarily belong to the same class. As long as the edge of an object lies within certain blocks, the pixels crossing the edge cannot be divided into different classes. Thus, edge-blocking effects are unavoidable. It seems that taking the neighborhood of each pixel as a sample in the data set for clustering can not only consider the region information around the pixel but also eliminate edge-blocking effects. However, in that case, there is redundant information among samples (pixels). The huge number of points in the sample space or manifold space makes extracting features and clustering pixels computationally very costly. More importantly, regular blocking is not a granule-level technique in nature since it cannot ensure that pixels in the same block are homogeneous, that is, that they can be considered as an entirety (granule) in processing.

Motivated by the above facts, a granule-view based image segmentation approach is proposed based on feature extraction and classification in a manifold space. The granule-view based method reflects the cognitive ability of humans [24,31,32]. In segmentation, it considers a group of homogeneous points, rather than a single pixel, as a processing unit and thereby improves pixel-level segmentation schemes. The technique of morphological connectivity segmentation [11,21,22] is extended to color images and all connected components are regarded as a family of basic granules. The modified color jump connection takes all pixels belonging to the same local extremum as a basic granule, which is equipped with great consistency. In order to solve the problem of automatic feature extraction, the basic granules are embedded in a low-dimensional feature space after size normalization. Manifold learning is a widely accepted method for feature extraction and dimensionality reduction which embeds high-dimensional samples into a low-dimensional feature space by preserving some local or global geometric structures [20,18,25,3,7,30]. The popular manifold learning algorithm LE is utilized to extract features of basic granules in this paper. It can not only fuse the information of color intensity and texture in basic granules, but also take into account the geometric structure of the set of basic granules. Thus, the manifold learning-based feature representation of basic granules is much better than an intensity- and color-only representation in clustering procedures. Further, an unsupervised granule synthesis algorithm, the Markov chain Monte Carlo (MCMC) method [4,1], is applied to segment the image by clustering the basic granules, in which the state and transition probabilities are revised based on the regional similarity and size in granule synthesis and fission. The MCMC algorithm outperforms some well-known clustering algorithms, such as FCM, in overcoming the shortage of choosing clustering centers.

The rest of the paper is structured as follows. Section 2 proposes a method of granulitizing color images based on morphological
connectivity theory. A family of basic granules is constructed by a color morphological jump technique. Section 3 investigates the issue of feature characterization and extraction of basic granules. An improved LE manifold learning strategy is put forward to reduce the dimensionality of basic granules. Section 4 explores the MCMC algorithm to merge basic granules, which leads to an effective technique for color image segmentation. Experiments and analysis are presented in Section 5 and conclusions follow in Section 6.
2. Color image granulization based on morphological connections

2.1. Granulization of color images

Given a data set on a nonempty finite set $E \subseteq \mathbb{R}^D$, called a universe of discourse, how can we cover or partition this domain by finite sets such that the data set is a meaningful concept on every finite set? The whole contents of every meaningful concept are indiscernible and referred to as a granule in the field of granular computing [31,24,32]. The choice of an appropriate granulization strategy strongly depends on specific purposes and applications.

In image segmentation, one of the main aims focuses on discovering or emphasizing meaningful objects in an image. Each of such objects is referred to as a granule constituted by grouping pixels with homogeneous color or structural descriptions. Typically, the granules are founded on the basis of some similarity or distance measures. A basic granule can be considered as a fundamental and minimal unit in image processing and segmentation.

There exist various approaches to the construction of basic granules. Each pixel (voxel) in an image can be considered as a granule in image processing. Here, we do not focus our attention on individual pixels or regular blocks. Instead, in the view that the morphological jump connection is one of the important and effective techniques to acquire a family of basic granules, we segment an image into a family of meaningful components. Each component possesses an explicit interpretation as representing an object in an image and is referred to as a basic granule.

2.2. Construction of basic granules by color jump connection

Each voxel of a color image can be regarded as a basic granule. However, such granules not only fail to take neighborhood information of voxels into consideration but also bring high computational complexity. Another approach to the construction of basic granules is to segment a color image $f$ into a set of n × n regular square blocks overlapping by a width of m voxels. Each block is regarded as a basic granule. This method is simple but inefficient. It is likely that voxels in different objects will fall into the same block, which will lead to an inaccurate segmentation result. To avoid such a situation, the morphological connected segmentation method based on criteria, namely the jump connection, is employed to construct the family of basic granules [22].

Definition 2.1 (Serra [21]). Let $E$ and $G$ be two arbitrary nonempty sets and let $\mathcal{F}$ be a family of functions from $E$ to $G$. A criterion $\sigma$ on the class $\mathcal{F}$ is a binary function from $\mathcal{F} \times \mathcal{P}(E)$ into $\{0,1\}$ such that, for each function $f \in \mathcal{F}$ and for each set $A \in \mathcal{P}(E)$, either $\sigma[f,A] = 1$ or $\sigma[f,A] = 0$. If $\sigma[f,A] = 1$, then $f$ is said to satisfy the criterion $\sigma$ on $A$, while $\sigma[f,A] = 0$ means that the criterion $\sigma$ is refused on $A$.
Definition 2.2 (Serra [21]). A criterion $\sigma: \mathcal{F} \times \mathcal{P}(E) \to \{0,1\}$ is called connective if, for each $f \in \mathcal{F}$, all the sets $A$ in $\mathcal{P}(E)$ satisfying $\sigma[f,A] = 1$ form a connection, i.e.

(1) $\forall f \in \mathcal{F}$ and $\{x\} \in S \Rightarrow \sigma[f,\{x\}] = 1$, where $S = \{\emptyset\} \cup \{\{x\} \mid x \in E\}$;
(2) for any $f \in \mathcal{F}$ and any family $\{A_i\}$ of $\mathcal{P}(E)$, if $f$ satisfies the criterion $\sigma$ on each $A_i$, i.e. $\sigma[f,A_i] = 1$, and $\bigcap_i A_i \neq \emptyset$, then $\sigma[f, \bigcup_i A_i] = 1$.

Every set $A$ satisfying $\sigma[f,A] = 1$ is referred to as a zone or a connected component of $f$. Every connected component $A$ is homogeneous with respect to $f$ under the criterion $\sigma$, while different zones have distinct properties.

Proposition 2.3 (Serra [21]). Consider two arbitrary nonempty sets $E$ and $G$, and a family $\mathcal{F}$ of functions $f: E \to G$. Let $\sigma$ be a criterion on $\mathcal{F}$; then the following three statements are equivalent:

1. The criterion $\sigma$ is connective.
2. For each function $f \in \mathcal{F}$, the sets on which the criterion $\sigma$ is satisfied constitute a connection $\mathcal{C}$:

\[ \mathcal{C} = \{ A \in \mathcal{P}(E) \mid \sigma[f,A] = 1 \} \tag{1} \]

3. The criterion $\sigma$ segments all functions of the family $\mathcal{F}$.

Definition 2.4 (Serra [21]). Let $A \subseteq E$ be a $\mathcal{C}$-connected set, and $k$ be a positive constant. $\sigma$ is called a jump connection criterion if
$\sigma[f,A] = 1 \Leftrightarrow A \in \mathcal{C}$, and $\forall x \in A$, either of the following conditions holds:

- $l$ is a minimum of $f$ in $A$ satisfying $0 \le f(x) - l < k$ and $|f(x) - l| \le |f(x) - m|$ for all $m \in M$,
- $n$ is a maximum of $f$ in $A$ satisfying $0 \le n - f(x) < k$ and $|f(x) - n| \le |f(x) - m|$ for all $m \in M'$,

where $M$ and $M'$ stand for the sets of all minima and maxima of $f$, respectively. The parameter $k$ is referred to as the step length of jump connection.

It is well known that an object always has at least one local extremum. The well-known watershed algorithm and the jump connection method both take local extrema as seeds for image segmentation. A remarkable difference lies in that the former chooses only the minima as seeds, while the latter takes the minima and maxima simultaneously as seeds. What is more, the latter allows combination of multiple local extrema whose differences are less than the step length $k$ of jump connection. With the jump connection method, the phenomenon of over-segmentation is eliminated to some extent. Fig. 2 shows a draft of jump connection in the one-dimensional case [21]. With iterations, the result of the first step (one jump) is depicted by the '1' arrows and that of the second (the last) step of the iteration is indicated by the '2' arrows. It is observed that the regions labeled '1' are induced by real local extrema and should be retained, while the regions labeled '2' are not induced by local extrema and must be merged into regions '1'.

For color images, illumination will cause visible contrast in hue and lightness. If only the gray information is concerned, the edges between objects will appear inaccurately. To solve this problem, the classical jump connection algorithm has to be revised. In the jump connection segmentation algorithm for color images, a vector ordering and a distance between two voxels in a color space have to be prescribed. In the following, we analyze how to define them in detail.

The extension of mathematical morphology to color images is an open problem due to the difficulty in introducing a general definition of vector ordering [11,29]. Despite the absence of a
widely accepted solution, the hue-based model has become frequent practice to rank vectors due to the intuitiveness of luminance, hue and saturation for describing color vectors. In this paper, the reference-hues based method is investigated to rank vectors in the HSI space. The derived ordering is established based on the distance of a vector from a family of predefined reference values, representing the dominant wavelengths of color vectors [2]. Since reference hues are the leading hues of the given image, we seek them by manipulating the normalized hue histogram. The normalized hue histogram is thresholded by its average value. The maximum of the histogram over every connected section of the thresholded histogram is regarded as one of the dominant hues. Here it is assumed that the range of hues is $[0, 2\pi)$ and that there are $l$ such connected sections. The hues at the $l$ peaks are taken as the reference hues, denoted by $h_1, h_2, \dots, h_l$. Given two voxels $u = (u_H, u_S, u_I)$ and $v = (v_H, v_S, v_I)$ in the HSI space, the ordering $\le_H$ between $u$ and $v$ is defined based on their hue distances from each of the $l$ references as follows:

\[ u \le_H v \Leftrightarrow u_H \le v_H \Leftrightarrow \min_{i=1,\dots,l}\{ v_H \ominus h_i \} \le \min_{i=1,\dots,l}\{ u_H \ominus h_i \} \tag{2} \]
where

\[ h \ominus h' = \begin{cases} |h - h'| & \text{if } |h - h'| < \pi \\ 2\pi - |h - h'| & \text{if } |h - h'| \ge \pi \end{cases} \]

and $u_H = v_H$ if and only if $u_H \le v_H$ and $v_H \le u_H$. The ordering defined by (2) is a reduced ordering; it is just a quasi-ordering. One may combine it with the idea of lexicographical ordering to derive a total ordering for color vectors:

\[ u \le_H v \Leftrightarrow \begin{cases} u_H < v_H, \text{ or} \\ u_H = v_H \wedge u_S \le v_S, \text{ or} \\ u_H = v_H \wedge u_S = v_S \wedge u_I \le v_I \end{cases} \tag{3} \]

Note that in the HSI space for color image processing, the first component of a vector represents an angle in $[0, 2\pi)$, so the distance between two color vectors is substituted by

\[ \| u - v \|_2 = \sqrt{ (u_I - v_I)^2 + u_S^2 + v_S^2 - 2 u_S v_S \cos(u_H \ominus v_H) } \tag{4} \]

Having defined the vector ordering and distance in color space, the color jump-$k$ connected component $g_x$ is introduced as follows. For a regional extremum (maximum or minimum) point $x$ of a color image $f$ in region $E$ and a step length $k$ for jump connection, let

\[ g_x^0(f(E)) = \{ y \in E \mid \| f(y) - f(x) \|_2 < k \} \]

and

\[ g_x^l(f(E)) = \{ y \in E \mid \| f(y) - f(x^*) \|_2 < k,\ x^* \in M \cap g_x^{l-1}(f(E)) \} \tag{5} \]
where $M$ stands for the set of all extremum (maximum or minimum) points of $f$ on $E$. Whenever $g_x^l(f(E)) = g_x^{l-1}(f(E))$ for some $l$, $g_x(f(E)) = g_x^l(f(E))$ is set to be the color jump-$k$ connected component of $x$ for image $f$ on $E$. Based on the aforementioned discussion, the revised jump connection technique for color image segmentation can be implemented according to the following algorithm.

Algorithm 2.5. The jump connection segmentation for a color image $f$.
Fig. 2. Draft of jump connection segmentation in one dimension case.
(1) Choose the step length k for color jump connection and find the sets of regional minima and maxima (vectors) of f based
Fig. 3. Granulation of a synthetic image. (a) Original image; (b) jump step = 0.25, regional number = 19; (c) jump step = 0.15, regional number = 22; (d) jump step = 0.1, regional number = 29; (e) jump step = 0.075, regional number = 31; (f) jump step = 0.05, regional number = 33; (g) jump step = 0.025, regional number = 36. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 4. The results of jump connection segmentation. (a) Original image, (b) k = 0.15, (c) k = 0.25, (d) k = 0.35, (e) k = 0.45. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 5. The tendency of some indexes regarding the change of step length.
on the vector ordering $\le_H$ in the HSI space, denoted by $X$ and $Y$, respectively.
(2) For each extremum point $x \in X \cup Y$ and the unlabeled region $E$ of the image $f$, determine and label its color jump-$k$ connected component $g_x(f(E))$. Delete from $X \cup Y$ all voxels which belong to $g_x$.
(3) Repeat step (2) until the set $X \cup Y$ is empty.
(4) Output the segmentation result by collecting all the connected components after merging excessively small subregions.
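To make the flow of Algorithm 2.5 concrete, the following Python sketch grows spatially connected jump-k components by flood filling from regional extrema, re-seeding the comparison whenever another extremum is absorbed (the jump of Eq. (5)). It is a minimal illustration, not the authors' implementation: the detection of regional extrema under the hue ordering (2)-(3) and the final merging of excessively small subregions in step (4) are assumed to be done elsewhere, and `hsi_distance` implements the distance of Eq. (4).

```python
import numpy as np
from collections import deque

def hue_diff(h1, h2):
    """Circular hue difference (the circled-minus operator of the text), in [0, pi]."""
    d = abs(h1 - h2)
    return d if d < np.pi else 2.0 * np.pi - d

def hsi_distance(u, v):
    """Color distance of Eq. (4) between two HSI voxels u = (H, S, I) and v = (H, S, I)."""
    return np.sqrt((u[2] - v[2]) ** 2 + u[1] ** 2 + v[1] ** 2
                   - 2.0 * u[1] * v[1] * np.cos(hue_diff(u[0], v[0])))

def jump_connection_segment(img_hsi, extrema, k):
    """Sketch of Algorithm 2.5: grow a spatially connected jump-k component around
    every unlabeled regional extremum.  img_hsi is an H x W x 3 HSI array, extrema a
    list of (row, col) seeds (regional minima/maxima under the hue ordering), k the
    jump step length.  Returns a label map; -1 marks pixels left for the final merge."""
    h, w, _ = img_hsi.shape
    labels = -np.ones((h, w), dtype=int)
    extrema_set = set(extrema)
    label = 0
    for seed in extrema:
        if labels[seed] != -1:            # this extremum was absorbed by an earlier granule
            continue
        labels[seed] = label
        queue = deque([(seed, seed)])     # (pixel to expand, reference extremum), cf. Eq. (5)
        while queue:
            (r, c), ref = queue.popleft()
            if (r, c) in extrema_set:     # an absorbed extremum becomes a new reference (the jump)
                ref = (r, c)
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w and labels[rr, cc] == -1 \
                        and hsi_distance(img_hsi[rr, cc], img_hsi[ref]) < k:
                    labels[rr, cc] = label
                    queue.append(((rr, cc), ref))
        label += 1
    return labels
```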
Fig. 3 shows the granulation of a synthetic image which is composed of thirty-six 20 × 20 blocks with different colors. The basic granules become finer as the jump step decreases. Fig. 4 presents some results of image segmentation for a natural color image obtained by the jump connection algorithm, where Fig. 4(a) is the original color image and Fig. 4(b)–(e) shows the results of segmentation using different jump step lengths, 0.15, 0.25, 0.35 and 0.45, respectively.

It is observed from Fig. 4 that as the jump step length increases, more voxels are grouped together and the number of connected components decreases, or alternatively the connected regions enlarge in area. That is, an increment of the jump step length usually leads to an extension of connected components in area. As a consequence, the intra-variances of connected components and the inter-variances between the connected components will have increasing tendencies following the increase of step length, see Fig. 5. Fig. 5(a) plots the relationship between the number of connected regions and the jump step length, while Fig. 5(b) and (c) shows the change of intra-variance of connected components and the change of inter-variance between connected components following the increase of step length, respectively.

With the jump connectivity segmentation algorithm, each of the jump connected parts is referred to as a basic granule. Generally, the basic granules are not regular (square or rectangular) blocks, but have different sizes and shapes. What is more, each of the granules has its own intrinsic characteristics. Before processing of the basic granules, the intrinsic features of basic granules have to be characterized.
3. Feature description of basic granules by manifold learning technique

The considered domain on which the basic granules are defined is a multidimensional space, namely a subset of $\mathbb{R}^D$. It is difficult to describe the features of basic granules in a high-dimensional space. A dimensionality reduction technique is required to acquire an intrinsic characterization of basic granules. Manifold learning techniques are popular approaches to dimensionality reduction and feature extraction. They assume that high-dimensional data (basic granules) are embedded in a low-dimensional manifold space. The premise of manifold learning algorithms is that all the basic granules have uniform dimensionality. However, the basic granules obtained by the color jump connection segmentation technique are of irregular shapes and of different sizes. Thus, their intrinsic features cannot be precisely extracted by manifold learning techniques directly. In this section, all the basic granules are regularized with the help of a size histogram and a morphological reconstruction technique. The regularized basic granules make up a patch manifold $\mathcal{M}$ embedded in $\mathbb{R}^D$. LE, a typical manifold learning algorithm, is investigated to meet the purpose of feature extraction from the regularized basic granules.

3.1. Regularization of basic granules

Before feature extraction, all the basic granules have to be converted into granules with square domains. In other words, we want to seek a family of blocks of size n × n that cover the basic granules as much as possible, such that the loss of information of every basic granule lying outside its related regular block is as small as possible. A key point of regularizing the basic granules is to keep as much information as possible and to fill in pertinent information by a morphological reconstruction technique.

3.1.1. Size optimization of patch

Let $x_i$ denote a basic granule, or more exactly the domain of a basic granule, and $s(x_i)$ denote the side length of the smallest square
covering $x_i$ totally. The histogram of the family of smallest squares related to the basic granules of an image, with $L$ totally possible size levels, is defined as a discrete function

\[ h(r_k) = n_k \quad \text{or} \quad p(r_k) = \frac{n_k}{n} \]

where $r_k$ is the $k$th size level of smallest squares, $n_k$ is the number of smallest squares related to basic granules whose size levels are $r_k$, and $n$ is the total number of basic granules. Fig. 6 shows a draft of the size histogram of smallest squares for basic granules of the image shown in Fig. 4(a). In comparison with the size histogram $h(r_k)$, $p(r_k)$ is called the normalized size histogram. From basic probability theory, $p(r_k)$ can be considered as an estimation of the probability of occurrence of smallest squares with size $r_k$ covering the basic granules. The kurtosis and skewness of the size histogram are described by the following expressions:

\[ K = \frac{\sum_{k=1}^{L} n_k (r_k - \bar{r})^4}{n \sigma^4}, \qquad S = \frac{\sum_{k=1}^{L} n_k (r_k - \bar{r})^3}{n \sigma^3} \]

where $\sigma$ is the standard deviation of the size histogram and $\bar{r}$ is the average size of squares. The optimal size of squares is evaluated based on the measures of kurtosis and skewness by

\[ [s_l, s_h] = \left[ s_0 - \frac{1-K}{1-S}\,\varphi^*,\ s_0 + \frac{1-K}{1+S}\,\varphi^* \right] \tag{6} \]

where $s_0$ is the mode of the size histogram and $\varphi$ is the standard normal cumulative distribution function. For the purpose of effectively evaluating the optimal size of squares we set $\varphi^* = \varphi(0.25) = 0.59$, which makes regular blocks properly cover half of all basic granules, namely the granular size for regularization depends on most of the middle basic granules.

Suppose an optimal size of squares is determined and let $\mathrm{loss}(s, x_i)$ and $\mathrm{fill}(s, x_i)$ denote the quantity of information loss and of filled information when the basic granule $x_i$ is covered by the determined square $\hat{x}_i$ with size $s$. They are defined by

\[ \mathrm{loss}(s, x_i) = |x_i \setminus \hat{x}_i|, \qquad \mathrm{fill}(s, x_i) = |\hat{x}_i \setminus x_i| \]

where $|X|$ stands for the cardinality of a set $X$ and $\hat{x}_i$ is the regularized granule of $x_i$ with size $s$. To find the optimal squares to cover all basic granules such that both $\sum_{i=1}^{L} \mathrm{loss}(s, x_i)$ and $\sum_{i=1}^{L} \mathrm{fill}(s, x_i)$ are as small as possible, a feasible solution to the optimal size of squares is given by

\[ s^* = \arg\min_{s \in [s_l, s_h]} \sum_{i=1}^{L} \big( \alpha_i\,\mathrm{loss}(s, x_i) + \beta_i\,\mathrm{fill}(s, x_i) \big) \tag{7} \]

where $\alpha_i = \mathrm{fill}(s, x_i)/s^2$ and $\beta_i = \mathrm{loss}(s, x_i)/s^2$ are used to evaluate the importance degrees of $\mathrm{loss}(s, x_i)$ and $\mathrm{fill}(s, x_i)$. For the synthetic image shown in Fig. 3, with respect to different jump steps, the optimal sizes of squares calculated by (7) fall in [20, 23], which is near to the real size of the basic granules.
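The patch-size selection of Eqs. (6)-(7) can be sketched as follows. The sketch assumes each granule is given as a boolean mask, and it centres the s × s window on the granule's bounding box, which is a simplification of the placement rule described in Section 3.1.2; it is an illustration under these assumptions, not the authors' code.

```python
import numpy as np
from scipy.stats import norm

def candidate_size_interval(sizes):
    """Eq. (6): derive the search interval [s_l, s_h] from the histogram of the smallest
    covering-square sizes s(x_i), using its mode, kurtosis K and skewness S."""
    sizes = np.asarray(sizes, dtype=float)
    levels, counts = np.unique(sizes, return_counts=True)
    n = counts.sum()
    r_bar = np.average(levels, weights=counts)
    sigma = np.sqrt(np.average((levels - r_bar) ** 2, weights=counts))
    K = np.sum(counts * (levels - r_bar) ** 4) / (n * sigma ** 4)
    S = np.sum(counts * (levels - r_bar) ** 3) / (n * sigma ** 3)
    s0 = levels[np.argmax(counts)]          # mode of the size histogram
    phi_star = norm.cdf(0.25)               # = 0.59, as chosen in the text
    return s0 - (1 - K) / (1 - S) * phi_star, s0 + (1 - K) / (1 + S) * phi_star

def loss_and_fill(mask, s):
    """loss(s, x_i) and fill(s, x_i) for one granule, with the s x s window centred
    on the granule's bounding box (simplified placement)."""
    rows, cols = np.nonzero(mask)
    rc, cc = (rows.min() + rows.max()) // 2, (cols.min() + cols.max()) // 2
    r0, c0 = rc - s // 2, cc - s // 2
    inside = ((rows >= r0) & (rows < r0 + s) & (cols >= c0) & (cols < c0 + s)).sum()
    return mask.sum() - inside, s * s - inside

def optimal_patch_size(granule_masks, s_low, s_high):
    """Eq. (7): weighted trade-off between lost and filled information over [s_l, s_h]."""
    best_s, best_cost = None, np.inf
    for s in range(max(1, int(np.floor(s_low))), int(np.ceil(s_high)) + 1):
        cost = 0.0
        for mask in granule_masks:
            loss, fill = loss_and_fill(mask, s)
            alpha, beta = fill / s ** 2, loss / s ** 2   # importance degrees of Eq. (7)
            cost += alpha * loss + beta * fill
        if cost < best_cost:
            best_s, best_cost = s, cost
    return best_s
```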
3.1.2. Color morphological reconstruction-based basic granule regularization

Having determined the optimal size of square to cover all the basic granules, there are two issues to be addressed. One is where the square block is shifted so as to cover a basic granule appropriately, and the other is how the basic granules are extended if necessary. If the domain of a basic granule is smaller than the square, the center of the square is located at the center of the basic granule, or at a position where the basic granule is totally contained in the square. Otherwise, the square is located at the position where the square and the domain of the basic granule have the largest amount of intersection in area, or alternatively where both the lost area and the filled area are as small as possible. When there exist one or more filled areas, the method of morphological reconstruction is introduced to assign color and intensity to those areas. For a granule $x_i$ of an image $f$ covered by a square $\hat{x}_i$, let

\[ m(t) = \begin{cases} f(t), & t \in x_i \\ \bigvee_{q \in x_i} f(q), & t \in \hat{x}_i \setminus x_i \end{cases} \qquad g^0(t) = \begin{cases} f(t), & t \in x_i \\ \bigwedge_{q \in x_i} f(q), & t \in \hat{x}_i \setminus x_i \end{cases} \]

and

\[ g^k(t) = \delta(g^{k-1})(t) \wedge m(t), \quad k = 1, 2, \dots \tag{8} \]

where $\delta$ is the color morphological geodesic dilation, defined by

\[ \delta(g)(t) = \bigvee_{b \in B} g(t - b) \tag{9} \]
with $B$ a 3 × 3 flat structuring element. Note that the supremum and infimum used in the above equations are induced by the vector ordering defined by (3). Whenever $g^k = g^{k-1}$ for some $k$, $g^* = g^k$ is set to be the regularized granule of $x_i$. Thus the regularization of the $i$th basic granule $x_i$ of a color image $f$ is an $s^* \times s^*$ color image, denoted by $\hat{f}_i$.
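The reconstruction of Eqs. (8)-(9) can be sketched for a single channel as below. The paper applies it with the vector (hue-based) ordering of Eq. (3); treating each channel independently as a scalar image, as here, is only an approximation of that ordering, so this is an illustrative sketch rather than the authors' exact procedure.

```python
import numpy as np
from scipy.ndimage import grey_dilation

def regularize_granule(patch, mask, n_iter_max=100):
    """Fill the pixels of an s* x s* window that lie outside the granule by iterating
    a geodesic dilation bounded above by the marker m (Eqs. (8)-(9)), single channel.

    patch : s* x s* array, the window cut around the granule
    mask  : boolean array, True on the granule's own pixels
    """
    inside_max = patch[mask].max()          # supremum of f over the granule
    inside_min = patch[mask].min()          # infimum of f over the granule
    m = np.where(mask, patch, inside_max)   # marker (upper bound) of Eq. (8)
    g = np.where(mask, patch, inside_min)   # initial image g^0
    footprint = np.ones((3, 3))             # the 3 x 3 flat structuring element B
    for _ in range(n_iter_max):
        g_new = np.minimum(grey_dilation(g, footprint=footprint), m)  # delta(g) ^ m
        if np.array_equal(g_new, g):        # stability: g^k = g^{k-1}
            break
        g = g_new
    return g
```

For an HSI granule one would run this per channel, or replace the per-pixel minimum/maximum by the supremum/infimum of the vector ordering to stay closer to the text.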
Fig. 6. The histogram of size distribution of smallest squares for basic granules.
3.2. Manifold learning-based feature extraction of basic granules

Manifold learning is a technique to map a high-dimensional data set into a surrogate low-dimensional space. The low-dimensional description of the high-dimensional data can be taken as its feature. A large number of manifold learning techniques have been proposed recently, including isometric feature mapping (ISOMAP), locally linear embedding (LLE), Laplacian eigenmaps (LE), local tangent space alignment (LTSA), diffusion maps and so forth [25,18,3,19,7,30]. Each manifold learning algorithm attempts to preserve a different geometrical property
of the underlying manifold structure, for instance, proximity relationships among data or metrics at all scales. It is well known that each granule can be considered as a high-dimensional data point and its lower dimensional structure can be embedded into a lower dimensional manifold, called a patch manifold. In this paper, LE manifold learning-based dimensionality reduction serves as an automatic approach to feature extraction by combining all important local information of the regularized basic granules.
3.2.1. Dimensionality estimation

Each data set has its own intrinsic dimensionality. In manifold learning techniques for nonlinear feature extraction, dimensionality estimation is an important issue for feature description of the patch manifold and granule fusion for image segmentation:

- The intrinsic dimensionality can be interpreted as the least number of parameters required to account for the data.
- The intrinsic dimensionality can be viewed as the number of degrees of freedom of the data set and can be interpreted as a measure of data complexity.

Recent work by Peyré [16] showed that in gray-scale image space, the dimensionality of the patch manifold of locally smooth regions of images is three, while that of parallel textures is five. Evidently, the patches will exhibit varying dimensionality since they usually lie in multiple manifolds for each image. Therefore, a key to feature extraction is to estimate the intrinsic dimensionality of the patch manifold. The local and global intrinsic manifold dimensionalities are estimated, respectively, by

\[ \hat{m}(x_i) = \left[ \frac{1}{K-2} \sum_{j=1}^{K-1} \log \frac{T_K(x_i)}{T_j(x_i)} \right]^{-1} \tag{10} \]

and

\[ \hat{M} = \frac{\sum_i \hat{m}(x_i)}{|X|} \tag{11} \]

where $K$ is the number of nearest neighbors of the point $x_i \in X$ and $T_j(x_i)$ is the distance from $x_i$ to its $j$th nearest neighbor in the patch manifold $X$ [6].
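A minimal sketch of the estimators (10)-(11), assuming the regularized granules are given as flattened row vectors; the neighbor distances T_j(x_i) are obtained by brute force here, whereas any k-nearest-neighbor structure would do.

```python
import numpy as np

def local_intrinsic_dims(X, K=8):
    """Eq. (10): local dimension estimate at every sample from the log-ratios of
    nearest-neighbor distances.  X is an (n_samples, n_features) array of regularized
    granules; K is the neighborhood size (K = 8 in the paper's example)."""
    dists = np.sqrt(np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1))
    m_hat = np.empty(X.shape[0])
    for i in range(X.shape[0]):
        # T_1(x_i), ..., T_K(x_i): distances to the K nearest neighbors, excluding x_i
        T = np.maximum(np.sort(dists[i])[1:K + 1], 1e-12)
        m_hat[i] = 1.0 / (np.sum(np.log(T[-1] / T[:-1])) / (K - 2))
    return m_hat

def global_intrinsic_dim(m_hat):
    """Eq. (11): average of the local estimates over the whole family of granules."""
    return float(np.mean(m_hat))
```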
Fig. 7 shows the local manifold dimension distribution of basic granules for the image shown in Fig. 4(a), where K = 8 and the step length in jump connection is 0.15. The estimated global dimension $\hat{M} = 5.8701$ is obtained by Eq. (11).

The dimension of the feature space is relevant to the number of basic granules and the intra-similarity of basic granules. Note that the dimensionality estimation algorithm requires that the data have a dense distribution and present certain structures. If the step length in jump connection is too large, namely the number of basic granules is too small, the estimated manifold dimensionality $\hat{M}$ of the family of basic granules is possibly not the real feature dimensionality.

3.2.2. Feature extraction of basic granules by LE

Our mental representations of images are formed by processing large numbers of sensory inputs, for example, the voxel intensities of images. While complex stimuli of this kind are represented by points in a high-dimensional vector space, they typically have a much more compact description [20]. A description making use of color density directly cannot reliably classify granules since it does not capture the intrinsic features of granules. Coherent structure in granules leads to strong correlations between inputs, which lie in or close to a smooth low-dimensional manifold. The judgement of similarity and the clustering of granules depend crucially on a feature representation in accordance with the nonlinear geometry of low-dimensional manifolds. Dimensionality reduction based on manifold learning serves as an automatic learning approach to extract features of granules by combining all important cues. Feature characterization of basic granules by a manifold learning technique brings a strong rationale to overcome the disadvantages of describing granules by color intensity directly.

Each basic granule is treated as a point in a high-dimensional Euclidean space. The intrinsic features of granules are extracted by preserving geometric properties between any two granules. Feature extraction based on the LE manifold learning technique allows taking more neighborhood and structural information of basic granules into consideration, which is quite effective in synthesizing basic granules.

Let $X = \{x_i\}$ be the family of basic granules; an adjacency graph $G$ related to $X$ is constructed by using KNN. Each basic granule is regarded as a vertex of $G$, and two vertexes, $x_i$ and $x_j$, are connected if and only if $x_i$ is among the $k$ nearest neighbors of $x_j$ and vice versa. Let $Y = \{y_i\}$ be the low-dimensional representation of $X$. A reasonable criterion of feature extraction is minimizing the cost function

\[ \sum_{i,j} \| y_i - y_j \|_2^2\, W_{ij} \tag{12} \]
where the weight $W_{ij}$ is in accordance with the distance between $x_i$ and $x_j$. The criterion ensures that the LE map is distance-preserving. The optimization problem (12) is equivalent to the following optimization problem

\[ \text{minimize } Y^{\mathrm{T}} L Y \quad \text{subject to } Y^{\mathrm{T}} M Y = 1 \tag{13} \]

where $M$ is a diagonal matrix with $m_{ii} = \sum_j W_{ij}$, $W = (W_{ij})$ is the weight matrix and $L = M - W$ is the Laplacian matrix of the graph $G$. The optimization problem (12) or (13), i.e. the problem of the low-dimensional feature representation $Y$ of $X$, can be calculated by solving the generalized eigenvalue problem

\[ L u = \lambda M u \tag{14} \]
Fig. 7. The histogram of local manifold dimension.
Let $u_i$ be the eigenvector corresponding to the $i$th smallest eigenvalue; then the low-dimensional representation $Y$ can be described as $Y = \{u_2, u_3, \dots, u_{d+1}\}$.
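As an illustration of Eqs. (12)-(14), the sketch below builds a mutual k-NN graph over the flattened regularized granules, forms the graph Laplacian and solves the generalized eigenvalue problem, discarding the trivial first eigenvector. The Gaussian weights used here are a placeholder; the paper's dimension-aware weights of Eqs. (15)-(16) are described next.

```python
import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmaps(X, d=6, k=8, eta=1.0):
    """Laplacian eigenmaps embedding (Eqs. (12)-(14)).
    X: (n_samples, n_features) array; d: target dimension; k: neighborhood size."""
    n = X.shape[0]
    dists = np.sqrt(np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1))
    # mutual k-NN adjacency: i and j connected iff each is among the other's k nearest
    order = np.argsort(dists, axis=1)[:, 1:k + 1]
    knn = np.zeros((n, n), dtype=bool)
    for i in range(n):
        knn[i, order[i]] = True
    adjacency = knn & knn.T
    # heat-kernel weights on connected pairs (placeholder for Eq. (16))
    W = np.where(adjacency, np.exp(-dists ** 2 / (2.0 * eta ** 2)), 0.0)
    M = np.diag(W.sum(axis=1) + 1e-12)   # degree matrix m_ii = sum_j W_ij (jittered)
    L = M - W                            # graph Laplacian
    # generalized eigenproblem L u = lambda M u; skip the constant eigenvector u_1
    eigvals, eigvecs = eigh(L, M)
    return eigvecs[:, 1:d + 1]           # Y = {u_2, ..., u_{d+1}}
```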
Fig. 8. Feature representation of basic granules based on LE. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
The key to the LE algorithm is the determination of the weight matrix $W$, which depends on the choice of distance in the manifold space. Here, a revised expression of the manifold distance is proposed that incorporates the local dimensionality of the manifold:

\[ d_M(x_i, x_j) = \begin{cases} \mathrm{dist}(\hat{f}_i, \hat{f}_j)\, e^{(\hat{m}(x_i) - \hat{m}(x_j))^2 / 2} & \text{if } x_i \text{ and } x_j \text{ are neighboring} \\ 0 & \text{otherwise} \end{cases} \tag{15} \]

where $\mathrm{dist}(\hat{f}_i, \hat{f}_j)$ is measured by the Euclidean distance or Mahalanobis distance of the feature vectors of the basic granules, and $\hat{m}(x_i)$ denotes the local dimension of the basic granule $x_i$. Thus, the weight $W_{ij}$ between $x_i$ and $x_j$ is defined by

\[ W_{ij} = e^{-d_M(x_i, x_j) / 2\eta^2} \tag{16} \]
Here, we choose the standard Gaussian kernel of samples as the weight, namely $\eta = 1$. Based on the above statements, the improved LE algorithm for feature extraction of basic granules can be summarized as follows.

Algorithm 3.1. The improved LE algorithm of dimensionality reduction for basic granules.

(1) Construct the adjacency graph $G$ using KNN. Two nodes are connected if and only if $x_i$ is among the $k$ nearest neighbors of $x_j$ and vice versa.
(2) Estimate the local dimension $\hat{m}(x_i)$ of each $x_i$.
(3) Compute the weight matrix $W$ by (16).
(4) Obtain the low-dimensional representations of the $x_i$ by solving the generalized eigenvalue problem (14).

Fig. 8 illustrates an example of the feature representation of basic granules of the color image shown in Fig. 4(a), where the step length in jump connection is 0.15. The first (resp. second) subfigure shows the two- (resp. three-) dimensional feature representation of basic granules by LE.
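The weight construction of Eqs. (15)-(16) can be slotted into the LE solver sketched earlier by replacing its placeholder heat-kernel weights. A minimal version, assuming the local dimensions have already been estimated with Eq. (10):

```python
import numpy as np

def dimension_aware_weights(dists, adjacency, local_dims, eta=1.0):
    """Eqs. (15)-(16): scale the pairwise distances of neighboring granules by the
    discrepancy of their local intrinsic dimensions, then apply an exponential kernel.
    dists: (n, n) distances between regularized granules; adjacency: boolean mutual-kNN
    matrix; local_dims: local dimension estimates m_hat(x_i) from Eq. (10)."""
    dim_gap = (local_dims[:, None] - local_dims[None, :]) ** 2
    d_M = np.where(adjacency, dists * np.exp(dim_gap / 2.0), 0.0)   # Eq. (15)
    W = np.where(adjacency, np.exp(-d_M / (2.0 * eta ** 2)), 0.0)   # Eq. (16)
    return W
```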
4. Image segmentation by fusing basic granules
4.1. Problem formulation

We formulate image segmentation as a result of granule fusion. The whole set of basic granules is taken as the universe of discourse, denoted by $V_0$. To reduce the computational complexity of fusing granules, we define a neighborhood graph $G_0 = (V_0, E_0)$ for the granule set. Each node in the graph represents a basic granule, which cannot be divided further, and edges between neighboring nodes are defined as connections of granules based on some measures like similarity or distance.

Suppose the neighborhood graph is $\omega = \{\tau_1, \tau_2, \dots, \tau_P\}$ after some procedures of fusion, where $\tau_k = \{x_{k1}, x_{k2}, \dots, x_{k|\tau_k|}\}$ denotes the $k$th synthesized granule. The globally optimal image segmentation can be obtained by maximizing the posterior probability of $\omega$:

\[ \omega^* = \arg\max P(\omega \mid V_0) \tag{17} \]

where the posterior probability is taken in the Gibbs distribution form

\[ P(\omega \mid V_0) = \frac{1}{Z} e^{-U(\omega)/T} \tag{18} \]

In (18), $Z$ is a normalizing parameter, $T$ is a given parameter, called the controlling temperature in the process of granule fusion, and

\[ U(\omega) = \sum_{k=1}^{P} U(\tau_k) - \sum_{i \neq j}^{P} V(\tau_i, \tau_j) \tag{19} \]

is the energy function of $\omega$. $U(\tau_k)$ denotes the potential of object $\tau_k$ based on intra-feature variance compatibility and $V(\tau_i, \tau_j)$ is the potential based on feature compatibility between different objects $\tau_i$ and $\tau_j$.

The potential $U(\tau_k)$ of an object $\tau_k$ is composed of two parts, $U_{\mathrm{size}}$ and $U_{\mathrm{variation}}$. The first term is a penalizing term for small objects and the latter describes the uniformity of features within an object. In detail,

\[ U(\tau_k) = U_{\mathrm{size}}(\tau_k) + U_{\mathrm{variation}}(\tau_k) \tag{20} \]

Let $S(\tau_k)$ be the size of $\tau_k$, and $S_{\min}$ be the assumed minimum size of basic granules. $U_{\mathrm{size}}$ is defined as

\[ U_{\mathrm{size}}(\tau_k) = \begin{cases} a\,K_1(S(\tau_k)) & \text{if } S(\tau_k) < S_{\min} \\ a & \text{otherwise} \end{cases} \tag{21} \]

with $K_1(\cdot)$ a monotonically decreasing function and $a > 0$ a given constant. Here, we let $K_1(s) = S_{\min} - s$. Evidently, all the objects whose sizes are smaller than $S_{\min}$ will be penalized by (21). The variation $U_{\mathrm{variation}}$ can be characterized by considering the distances between basic granules in the object $\tau_k$:

\[ U_{\mathrm{variation}}(\tau_k) = b \sum_{x_i, x_j \in \tau_k,\ i \neq j} \| y_i - y_j \|_2 \tag{22} \]

where $y_i$ is the feature expression of basic granule $x_i$ in the feature space. The variation $U_{\mathrm{variation}}$ has the effect of penalizing synthesized granules with severe feature deviation; that is, a synthesized granule should vary only slightly in the feature space. The potential term $V$, denoted by

\[ V(\tau_i, \tau_j) = g \sum_{\tau_i, \tau_j \in \omega,\ i \neq j} \| r_i - r_j \|_2 \tag{23} \]
prevents a single object from being split into several granules, where $r_i$ is the feature expression of the synthesized granule $\tau_i$ in the feature space.

4.2. MCMC-based granule synthesis

The fusing process of granules has two types of moves, split (splitting a single granule into two or more granules) and merge (merging two or more similar neighboring granules into one). Obviously, the fusing process is a Markov process. Its globally optimal solution, i.e. (17), can hardly be computed analytically. In this section, the MCMC method is introduced to seek the optimal solution of (17) in a simulated way [4,1].

For the splitting process, let $\tau_i \in \omega$ be split into two new granules, $\tau_{i1}$ and $\tau_{i2}$, with a probability $p_{\mathrm{split}}(\tau_{i1}, \tau_{i2})$. The new state of fused granules is denoted by $\omega' = \{\tau_1, \tau_2, \dots, \tau_{i1}, \tau_{i2}, \dots, \tau_P\}$. For the merging process, the new state is represented by $\omega' = \{\tau_1, \dots, \tau_{i-1}, \tau_{i+1}, \dots, \tau_{j-1}, \tau_{j+1}, \dots, \tau_P\}$ when a new granule $\tau_p$ is merged from two granules $\tau_i, \tau_j \in \omega$ with a probability $p_{\mathrm{merge}}(\tau_i, \tau_j)$. Both the splitting probability and the merging probability are required to be evaluated so as to reach an optimal fusion. Let

\[ p_{\mathrm{asso}}(\tau_i, \tau_j) = e^{-\| r_i - r_j \|_2 / 2\sigma^2} \tag{24} \]

be the joint likelihood estimation between two granules $\tau_i$ and $\tau_j$. The splitting probability and merging probability can then be defined by, respectively,

\[ p_{\mathrm{split}}(\tau_{i1}, \tau_{i2}) = \frac{1 - p_{\mathrm{asso}}(\tau_{i1}, \tau_{i2})}{\sum_{i=1}^{|\omega|} \big( 1 - p_{\mathrm{asso}}(\tau_{i1}, \tau_{i2}) \big)} \tag{25} \]

\[ p_{\mathrm{merge}}(\tau_i, \tau_j) = \frac{p_{\mathrm{asso}}(\tau_i, \tau_j)}{\sum_{\tau_i, \tau_j \in \omega} p_{\mathrm{asso}}(\tau_i, \tau_j)} \tag{26} \]
The process of splitting and merging is referred to as the MCMC process. It can be implemented by the popular Metropolis–Hastings algorithm. For simplicity, the transition probability of states from $\omega$ to $\omega'$ is denoted by $p(\omega \to \omega')$.

Algorithm 4.1. The Metropolis–Hastings algorithm for fusing basic granules.
- Initialize the segmentation state $\omega = \omega^{(0)}$, set the initial temperature $T = T_{\mathrm{init}}$ and the terminating temperature $T_{\mathrm{final}}$.
- Do iterations. Decrease $T$ by a small scale $\Delta T$ every $N$ iterations. For each iteration step:
(1) randomly choose a move type: splitting or merging;
(2) compute the transition probability $p(\omega \to \omega')$ by (25) or (26);
(3) compute the reverse transition probability $p(\omega' \to \omega)$;
(4) determine the objective probabilities $P(\omega \mid V_0)$ and $P(\omega' \mid V_0)$ through (18);
(5) compute the acceptance ratio of the state transition from $\omega$ to $\omega'$ by

\[ R(\omega \to \omega') = \min\left\{ 1,\ \frac{P(\omega' \mid V_0)}{P(\omega \mid V_0)} \cdot \frac{p(\omega' \to \omega)}{p(\omega \to \omega')} \right\} \tag{27} \]

(6) take a random number $g \sim U[0,1]$ and update the state

\[ \omega = \begin{cases} \omega' & \text{if } g \le R(\omega \to \omega') \\ \omega & \text{otherwise} \end{cases} \tag{28} \]

(7) if $T = T_{\mathrm{final}}$, set $\omega^* = \omega$ and terminate the iteration.
The final state $\omega^*$ is regarded as the optimal segmentation result.
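A compact sketch of the accept/reject core of Algorithm 4.1 is given below. Everything problem-specific, namely how a split or merge proposal is drawn with the probabilities of Eqs. (25)-(26) and how the energy of Eqs. (19)-(23) is evaluated for a state, is passed in as callables since it depends on the granule features; only the Metropolis–Hastings acceptance of Eqs. (27)-(28) and the cooling schedule are spelled out, and the default parameters mirror the settings listed in Section 5.2.

```python
import math
import random

def mcmc_granule_fusion(omega0, propose, energy,
                        t_init=500.0, t_final=1.0, dt=0.5, n_per_level=5):
    """Metropolis-Hastings fusion of basic granules (Algorithm 4.1, sketched).

    omega0  : initial state (e.g. the jump connection segmentation)
    propose : function(state) -> (new_state, p_forward, p_backward), drawing a split
              or merge move with the probabilities of Eqs. (25)-(26)
    energy  : function(state) -> U(state), the energy of Eq. (19)
    """
    omega, T = omega0, t_init
    u_omega = energy(omega)
    while T > t_final:
        for _ in range(n_per_level):
            omega_new, p_fwd, p_bwd = propose(omega)
            u_new = energy(omega_new)
            # Eq. (27): posterior ratio exp(-(U' - U)/T) times the proposal ratio
            R = min(1.0, math.exp(-(u_new - u_omega) / T) * (p_bwd / p_fwd))
            if random.random() <= R:        # Eq. (28)
                omega, u_omega = omega_new, u_new
        T -= dt                             # cooling: decrease T by dt every N iterations
    return omega
```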
5. Experiments and analysis

In this section, a series of experiments is implemented to test the effectiveness of color image segmentation using the proposed granule-view feature classification in a manifold space. For the purpose of comparison, we also implement some other state-of-the-art algorithms, including the mean shift algorithm [8], statistical region merging [15] and the regular blocking method [26], for image segmentation on images from the well-known Berkeley Segmentation DataSet (BSDS) [12].

5.1. Quantitative evaluation

For quantitative comparison of the experimental results of different approaches, the accuracy of segmentation with respect to the human ground truth is analyzed by the Probabilistic Rand Index (PRI) [27,28], Variation of Information (VI) [14] and Global Consistency Error (GCE) [13]. The calculation of these performance measures can be explained as follows. Given a set of ground-truth segmentations $\{G_k\}$, the Probabilistic Rand Index is defined as

\[ \mathrm{PRI}(S, \{G_k\}) = \frac{1}{T} \sum_{i < j} \big[ c_{ij} p_{ij} + (1 - c_{ij})(1 - p_{ij}) \big] \tag{29} \]

where $c_{ij}$ is the event that pixels $i$ and $j$ have the same label and $p_{ij}$ its probability, and $T$ is the total number of pixel pairs.

The variation of information is a measure of the distance between two clusterings. It is also used to evaluate the performance of segmentation. Consider a ground-truth segmentation (a division of a set of pixels into several regions) $X = \{X_1, \dots, X_k\}$ and an experimental segmentation $Y = \{Y_1, \dots, Y_l\}$, where $p_i = |X_i|/n$ and $n = \sum_{i=1}^{k} |X_i|$. Then the variation of information between $X$ and $Y$ is

\[ \mathrm{VI}(X, Y) = H(X) + H(Y) - 2 I(X; Y) \tag{30} \]

where $H(X)$ is the entropy of $X$ and $I(X; Y)$ is the mutual information between $X$ and $Y$.

The Global Consistency Error permits all local refinements to be in the same direction and is mathematically expressed as

\[ \mathrm{GCE}(S_1, S_2) = \frac{1}{n} \min\left\{ \sum_i E(S_1, S_2, p_i),\ \sum_i E(S_2, S_1, p_i) \right\} \tag{31} \]

where $E(S_1, S_2, p_i)$, called the local refinement error, measures the degree to which two segmentations $S_1$ and $S_2$ agree at pixel $p_i$ and is defined as

\[ E(S_1, S_2, p_i) = \frac{|R(S_1, p_i) \setminus R(S_2, p_i)|}{|R(S_1, p_i)|} \]

where $R(S_1, p_i)$ is the set of pixels in segmentation $S_1$ which are in the same region as pixel $p_i$.
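As a small worked example of Eq. (30), VI can be computed from the contingency table of two label maps; the sketch below assumes the segmentations are given as integer label arrays of the same shape.

```python
import numpy as np

def variation_of_information(seg_x, seg_y):
    """VI(X, Y) = H(X) + H(Y) - 2 I(X; Y), Eq. (30), for two integer label maps."""
    x = np.asarray(seg_x).ravel()
    y = np.asarray(seg_y).ravel()
    n = x.size
    # joint distribution p(X_i, Y_j) from the contingency table of the two labelings
    labels_x, inv_x = np.unique(x, return_inverse=True)
    labels_y, inv_y = np.unique(y, return_inverse=True)
    joint = np.zeros((labels_x.size, labels_y.size))
    np.add.at(joint, (inv_x, inv_y), 1.0)
    joint /= n
    px, py = joint.sum(axis=1), joint.sum(axis=0)
    h_x = -np.sum(px * np.log(px, where=px > 0, out=np.zeros_like(px)))
    h_y = -np.sum(py * np.log(py, where=py > 0, out=np.zeros_like(py)))
    nz = joint > 0
    mi = np.sum(joint[nz] * np.log(joint[nz] / (px[:, None] * py[None, :])[nz]))
    return h_x + h_y - 2.0 * mi
```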
5.2. Parameter settings

We employ granule-based feature extraction for image segmentation. There are a number of parameters that must be quantified, besides those determined in the corresponding sections above. The rest are addressed as follows. A 3 × 3 square flat structuring element is selected in reconstructing the basic granules by the color morphological geodesic dilation. The optimal size of basic granules and the feature dimensionality are calculated by Eqs. (7), (10) and (11), respectively. The result of jump connection segmentation is chosen as the initial segmentation $\omega$ in MCMC, and the Euclidean distance is used in calculating the similarity between granules. When
implementing the MCMC scheme, the time interval, time step-length and the number of steps are always fixed as $T_{\mathrm{init}} = 500$, $T_{\mathrm{final}} = 1$, $\Delta T = 0.5$, $N = 5$. The assumed minimum size of regions and its penalty factor in Eq. (21) are fixed as 20 and 50, respectively. The default values of the scale and the weighting factors in the variance functions are fixed as $\sigma = 1$, $a = 50$, $b = g = 1$.
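For reference, the default settings listed above can be collected in one place; the parameter names below are illustrative only and do not come from the authors' code.

```python
# Default parameter settings reported in Section 5.2 (names are illustrative)
MCMC_PARAMS = {
    "T_init": 500,      # initial controlling temperature
    "T_final": 1,       # terminating temperature
    "delta_T": 0.5,     # temperature decrement
    "N": 5,             # iterations per temperature level
}
ENERGY_PARAMS = {
    "S_min": 20,        # assumed minimum granule size, Eq. (21)
    "a": 50,            # penalty factor for small regions, Eq. (21)
    "b": 1, "g": 1,     # weights of Eqs. (22)-(23)
    "sigma": 1,         # scale of the joint likelihood, Eq. (24)
}
STRUCTURING_ELEMENT = (3, 3)   # flat square SE for the geodesic dilation, Eq. (9)
```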
5.3. Comparative experiments and discussions

In order to evaluate the effectiveness of the proposed approach to color image segmentation, it is compared with state-of-the-art algorithms, including the mean shift algorithm [8], statistical region merging [15], and the method combining regular blocking, LE and FCM proposed by Tziakos [26].
Fig. 9. The comparative results from different color image segmentation methods. Input images are shown in the first column, results yielded by the mean shift, the statistical region merging and the regular blocking method are shown from the second to the fourth column, respectively, and the last column demonstrates the result of segmentation by the proposed method. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 10. Calculations of segmentation accuracy with respect to three indexes (PRI, VI, GCE) for some experimental images, comparing the mean shift, statistical region merging, regular blocking+LE+FCM and the proposed method.
Fig. 9 shows the experimental results obtained for the segmentation of natural color images from the Berkeley Segmentation Database. In the mean shift color segmentation algorithm, we fix the bandwidth parameter as 30 for simplicity. In the results obtained with the mean shift, many small regions are generated, particularly in texture images. In statistical region merging the objects are not preserved very well, because the difficulties in determining the merging probability, the merging order and the termination principles allow the occurrence of overmerging, undermerging and mismerging.

In order to further demonstrate the superiority of the granule-view technique over the method proposed by Tziakos [26], a number of comparative experiments are shown in Fig. 10. In that method, the considered image is partitioned regularly into 8 × 8 blocks with an overlap of two pixels. Then, all blocks are reshaped into d-dimensional vectors by the LE algorithm for feature extraction. The classical FCM algorithm is used to cluster all blocks into c classes to realize image segmentation. It is observed that poor edge localization is caused by the regular blocking, see the images in the fourth column of Fig. 9. As is well known, in the FCM algorithm the number of classes c must be determined beforehand, which has a great impact on the clustering result. The choice of clustering centers is also a serious question. In addition, the set of granule features obtained by LE is 'crescent' in shape and is in general endowed with strong linearity. So the classical clustering algorithms including FCM cannot draw a quite accurate distinction of basic granules between different objects. On the contrary, MCMC does not suffer from the above puzzles and can result in a satisfactory clustering of granules. The proposed granule-view segmentation method avoids the block effect at edges and leads to smooth edges, owing to the contribution of the color jump connection segmentation and to making full use of the neighborhood information of voxels.

It is interesting to compare the segmentations of different methods quantitatively. The measures of Probabilistic Rand Index (PRI), Variation of Information (VI) and Global Consistency Error (GCE) are employed to evaluate the segmentation quality according to ground truth. In Fig. 10, the x-coordinate indicates the index of the image and the y-coordinates indicate the values of the indexes. The comparative results in Fig. 10 demonstrate that the proposed segmentation method achieves the best accuracy according to the PRI and GCE indexes on average and is just slightly less accurate than the statistical region merging algorithm with respect to the VI index. On average, the proposed method reaches results more accurate than those obtained by employing mean shift and those obtained by combining regular blocking, LE and FCM. When taking computational efficiency into consideration, in the proposed method, most of the computational time is used for the MCMC process. After optimization the computational time of
the proposal becomes acceptable, though the proposal is at present slower than some well-known algorithms.

5.4. Parameter analysis

The proposed algorithm involves several parameters, such as the step length $k$ for jump connection, the feature dimension $d$, and the parameters in MCMC. Some of them play important roles in the proposal, while others are inessential. Now, we analyze their influences on the result of segmentation in detail.

5.4.1. The step length for jump connection

As discussed before, the family of basic granules is generated by the color jump connected segmentation. In the jump connection, a change of the step length $k$ has a severe impact on the size of basic granules because the jump connection is a multiscale segmentation method: the larger the step length, the larger the basic granules. With the increase of the step length, many voxels belonging to different objects are likely to be clustered into one basic granule, which contravenes the requirement on basic granules. Thus, the jump step length cannot be too large. However, a small step length is also not advisable, as it may yield over-segmentation and the number of basic granules will be large. Although basic granules with small size can be merged into other granules during the fusing process by MCMC, the small granules need to be filled by morphological dilation for regularization. Pseudo color and inaccurate segmentation might be produced. Many factors, including the tendency of the variances of segmentation results following the changes of step length and visual perception, can be used to determine the step length for jump connection so as to achieve a satisfactory result of color image segmentation.

5.4.2. Feature dimensionality

It is known that the feature dimensionality of basic granules is required to be estimated in the LE algorithm. Although there exist various approaches to estimating the feature dimensionality of a manifold, it is an open problem to estimate the dimensionality of the patch manifold. It is verified by experimental results (shown in Fig. 11) that neither a smaller feature dimensionality nor a larger one leads to a desired performance of color image segmentation. In Fig. 11, when the feature dimensionality for the original image shown in Fig. 4(a) varies from 2 to 12, it is observed that the segmentation quality of the color image becomes better and then worse. It is satisfactory when the feature dimensionality is around 6, which approaches the intrinsic dimensionality of the patch manifold. It is further verified that the evaluation of the feature dimensionality of the patch manifold (basic granules) in the LE
Fig. 11. The results of image segmentation according to different feature dimensionalities. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
algorithm defined by (10) (for locally intrinsic dimension estimation) and (11) (for globally intrinsic dimension estimation) is feasible.

5.4.3. Other parameters in MCMC

To analyze the influence of the parameters $a$, $b$, $g$, $\eta$ and $T$ on the performance of image segmentation in the proposed algorithm, assume $a = 0$; then the acceptance ratio $R(\omega \to \omega')$ from the state $\omega$ to the state $\omega'$ is

\[ \min\left\{ 1,\ C_0 \frac{P(\omega' \mid V_0)}{P(\omega \mid V_0)} \right\} = \min\{ 1,\ C_0\, e^{(b/gT)(C - C')} \} \]

where $C_0$ is irrelevant to both $b$ and $g$, and $C$ (resp. $C'$) stands for the ratio between the intra-variance and inter-variance of the state $\omega$ (resp. $\omega'$). Clearly, the acceptance ratio $R$ is irrelevant to $\eta$. For given ratios between intra- and inter-variance, $C$ and $C'$, if $C - C' > 0$ ($C - C' < 0$, resp.), the larger $b/gT$ is, the larger (smaller, resp.) the acceptance ratio $R$ is. Thus, a region with a large variance ratio $C$ has a big possibility to be retained with increasing $b/gT$. Similarly, when $b = 0$, it is verified that a larger $a$ will bring a larger influence on suppressing small regions.

Moreover, we consider the influence of the parameter $\sigma$ in the MCMC algorithm on the performance of image segmentation. Because the acceptance ratio $R(\omega \to \omega')$ is relevant to the proportion between the state transition probabilities $p(\omega' \to \omega)$ and $p(\omega \to \omega')$, the parameter $\sigma$ acts with the same effect simultaneously on the forward transition and the backward transition in MCMC. Thus the acceptance ratio $R(\omega \to \omega')$ and the final segmentation are not sensitive to the choice of $\sigma$, though it has influence on the posterior probability $P(\omega \mid V_0)$ (or $P(\omega' \mid V_0)$) and the state transition probability $p(\omega \to \omega')$, respectively.

As the weight parameters $a$, $b$, $g$ bring little effect on the final results of segmentation due to the randomness of MCMC, we experimentally set their default values as mentioned above. A very large default value of the time interval and a small default value of the time step are fixed in order to force the convergence of the algorithm. In practice, the time parameters may be relaxed.
6. Conclusions

A granule-view based feature extraction and classification method is developed in this paper to segment color images. All connected components are formulated by extending the popular jump connection segmentation technique to color images. Each connected component is referred to as a basic granule that is discernible in an intuitive sense. The basic granules are regularized to meet the requirements of the LE manifold learning method, so that their intrinsic features can be extracted effectively and the classification performance of basic granules is enhanced. The MCMC method is investigated to fuse basic granules in order to yield satisfactory image segmentation results. Experimental results demonstrate that the proposal leads to correct granulization of images,
effective feature representation of basic granules, a sound granule fusion strategy, and acceptable accuracy of color image segmentation. In contrast to some supervised image segmentation algorithms (e.g. graph cut) that have been successfully applied to special problems (e.g. finding the same object in a set of images with different backgrounds), the developed method is an unsupervised technique and is suitable for segmenting general natural images where no prior knowledge is available. Note that the proposed method can handle images whose texture regions have similar color and intensity (e.g. grass), but it cannot deal with images whose texture regions show large internal differences (e.g. zebra), because texture information is not taken into consideration in the model of the proposal. Pixels in a region of similar texture can be grouped into one basic granule, whereas pixels in a texture region with large differences will be separated. The issue of adding texture information into the proposal to segment natural images with complex textures will be explored in the future.
Tingquan Deng received his BS degree in Mathematics, MS degree in Applied Mathematics, and PhD degree in Fundamental Mathematics from the Harbin Institute of Technology, Harbin, China, in 1987, 1990 and 2002, respectively. He was a visiting scholar at the Center for Mathematics and Computer Science, Amsterdam, the Netherlands, from 1999 to 2000, and a postdoctoral research fellow in the Department of Automation, Tsinghua University, Beijing, China, from 2003 to 2005. He is currently a professor in the College of Science as well as the College of Computer Science and Technology, Harbin Engineering University, Harbin, China. His research interests include uncertainty theory, image processing and pattern recognition, and data mining and machine learning.
Wei Xie received her BS degree in Information and Computing Science and MS degree in Systems Theory from the Harbin Engineering University, Harbin, China, in 2008 and 2011, respectively. She is currently a PhD candidate in the College of Computer Science and Technology, Harbin Engineering University. Her research interests include image processing, machine learning and uncertainty theory.