Information Fusion xxx (2013) xxx–xxx
Contents lists available at SciVerse ScienceDirect
Information Fusion journal homepage: www.elsevier.com/locate/inffus
Full Length Article
SAR image multiclass segmentation using a multiscale and multidirection triplet Markov fields model in nonsubsampled contourlet transform domain
Yan Wu a,⇑, Peng Zhang b, Ming Li b, Qiang Zhang a, Fan Wang a, Lu Jia b
a Remote Sensing Image Processing and Fusion Group, School of Electronic Engineering, Xidian University, P.O. Box 140, Xi’an, Shaanxi 710071, China
b National Key Lab. of Radar Signal Processing, Xidian University, Xi’an 710071, China
Article info
Article history:
Received 7 April 2012
Received in revised form 10 December 2012
Accepted 10 December 2012
Available online xxxx
Keywords:
SAR image multiclass segmentation
NSCT-TMF model
NSCT-HMT
NSCT-TMF energy function
Multiscale information fusion
Abstract
The triplet Markov fields (TMF) model was recently proposed to deal with nonstationary image segmentation and has achieved promising results. In this paper, we propose a multiscale and multidirection TMF model for nonstationary synthetic aperture radar (SAR) image multiclass segmentation in the nonsubsampled contourlet transform (NSCT) domain, named the NSCT-TMF model. The NSCT-TMF model captures the contextual information of image content in the spatial and scale spaces effectively through the construction of multiscale energy functions. The derived multiscale and multidirection likelihoods of the NSCT-TMF model capture the dependencies of NSCT coefficients across scales and directions. In this way, the proposed model achieves multiscale information fusion in terms of image configuration and features in the underlying labeling process. Experimental results demonstrate that, owing to the effective propagation of contextual information, the NSCT-TMF model is more robust against speckle noise and improves the performance of nonstationary SAR image segmentation. © 2012 Elsevier B.V. All rights reserved.
⇑ Corresponding author. E-mail address: [email protected] (Y. Wu).
1566-2535/$ - see front matter © 2012 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.inffus.2012.12.001
Please cite this article in press as: Y. Wu et al., SAR image multiclass segmentation using a multiscale and multidirection triplet Markov fields model in nonsubsampled contourlet transform domain, Informat. Fusion (2013), http://dx.doi.org/10.1016/j.inffus.2012.12.001
1. Introduction
Synthetic aperture radar (SAR) is a coherent microwave imaging system. Owing to the presence of coherent speckle, SAR images are difficult to interpret both visually and automatically. SAR image segmentation, which provides the spatial structures of the imaged region and thus reveals the nature of SAR images, is a significant step towards SAR image interpretation. Consequently, SAR image segmentation has promoted SAR applications in many fields, such as geological exploration, ocean research and disaster monitoring.
The purpose of SAR image segmentation is to partition an image into regions with different characteristics. During the last decades, SAR image segmentation has been widely studied [1–6]. For nonstationary image segmentation, D’Elia et al. propose a tree-structured Markov random field (MRF) model and introduce several edge-penalty parameters to define different potential functions for each pair of classes [7]. This model describes the hidden structure of the data by a sequence of binary MRFs, each corresponding to a node in the associated tree, with all parameters defined locally at each node. It therefore accounts for the nonstationary property of images only implicitly. To account for the nonstationary property explicitly, Benboudjema and Pieczynski recently proposed the triplet Markov fields (TMF) model, which is suitable for nonstationary SAR image segmentation [2]. In this model, the nonstationarities of the images are described by a third field U. Moreover, the TMF model can adopt diverse statistical models for SAR data related to diverse radar backscattering sources. TMF models give satisfactory results in nonstationary image segmentation [2,4–6].
However, the classical TMF model has a limited ability to describe contextual behavior over a range of scales. Consequently, lacking guidance from the larger segmented structures at coarser scales, it usually cannot avoid some mis-segmentations at the pixel level. To incorporate the global and local information of images at different scales while labeling the pixels, several hierarchical Markov models have been proposed recently, in which the segmentation of larger-scale structures provides the global spatial information [8–10]. However, some of these models do not take the global image features into account adequately. In such hierarchical Markov models, data are only available at the finest level; consequently, only the pixel-level SAR intensity is used in the likelihood as the feature that distinguishes the targets [8]. Other models do not adopt a multiscale and multidirection transform, such as the wavelet transform, the contourlet transform or the nonsubsampled contourlet transform (NSCT), to decompose the observed images [9,10]. In fact, most gray-scale texture images are well characterized by their singularity structure, so it is natural to model texture images with wavelets and contourlets [11,12]. Consequently, transform-domain models have been proposed recently and have been successfully applied to image denoising, texture retrieval and image segmentation [11–16].
The model proposed in [16] extends the hidden MRF model to the wavelet domain for SAR image segmentation. In this model, the multiscale image features are considered based on the wavelet hidden Markov tree (HMT), while the inter-scale and intra-scale dependencies between the class labels are captured by the hidden MRF model. The segmentation results in [16] verify that coarser-scale information helps guide finer-scale decisions, which improves the robustness and accuracy of segmentation. However, the wavelet transform is not a true two-dimensional representation, since it cannot model the dependency across directions, and the nonstationary property of SAR images is not taken into account.
Motivated by the discussion above, and considering the nonstationary property of SAR images explicitly, we propose in this paper a multiscale and multidirection TMF model in the NSCT domain (NSCT-TMF) for SAR image multiclass segmentation. The NSCT-TMF model constructs a new causal energy function to capture the contextual information of image content in the spatial and scale spaces; that is, the inter-scale dependencies and intra-scale interactions within the label fields (X, U) are taken into account. In the NSCT-TMF model, the distribution of the triplet fields at each scale consists of the causal energy function and the multiscale and multidirection likelihood computed from the NSCT hidden Markov tree (NSCT-HMT). In this way, the NSCT-TMF model represents the multiscale information of images in terms of both image configuration and features, and can thus implement multiscale Bayesian decision fusion in segmentation. In addition, we analyze the nonstationarity of SAR images from the textural point of view in a multiscale framework and present a reasonable initialization of the U field based on the research of D’Hondt et al. [17].
In [4], we fuse the traditional energy function of the TMF model with the principle of edge penalty to enhance edge location in the segmentation. In [5], the SAR image is mapped into an edge-based pixon representation by quad-tree decomposition, and the extended TMF model is constructed on the obtained pixon representation. These extended TMF models do not belong to the family of multiscale transform-domain models [9,10]. Consequently, they exploit the multiscale information of image content less effectively than the proposed NSCT-TMF model. The NSCT-TMF model, a multiscale and multidirection context model in the NSCT domain, captures the contextual information of image content within the spatial and scale spaces effectively in the underlying labeling process through a new causal energy function, and has a high degree of directionality and anisotropy. Thus, in nonstationary SAR image segmentation, it is robust against speckle noise and can obtain better segmentation results. The idea of incorporating the local feature of edge strength into the energy function can also be easily extended to the NSCT-TMF model to improve the edge location of the segmentation.
The paper is structured as follows. Section 2 gives a brief review of the TMF model. Section 3 presents the proposed model and the parameter estimation method, together with a novel initialization of the U field. The experimental results are presented and compared in Section 4. Finally, Section 5 gives the concluding remarks.
2. Analysis of TMF model
In this section, we give a brief review of the TMF model. Let S denote a finite set of sites with a neighborhood system defined on it. In image segmentation, a Markov random field (MRF) is defined as a set of discrete-valued random variables, $X = (X_s)_{s \in S}$, defined on S. Each variable takes a value in $\Omega = \{1, \ldots, K\}$, representing the class to which the site s belongs, and has to be recovered. The observed field $Y = (Y_s)_{s \in S}$ can be seen as a noisy version of $X = (X_s)_{s \in S}$. The pairwise field $(X, Y) = (X_s, Y_s)_{s \in S}$ is considered nonstationary in the probabilistic and spatial configuration sense when the energy function depends on the position of the cliques. In the TMF model, the possible nonstationarity of the distribution p(x, y) of (X, Y) is managed by introducing a third random field $U = (U_s)_{s \in S}$, each $U_s$ taking its value from a finite set. In this paper, for simplicity and without loss of generality, we consider two different stationarities, and the random field $U = (U_s)_{s \in S}$ takes its values from the set $\Lambda = \{a, b\}$. In the TMF model, the random fields (X, U) are assumed to be Markovian. The distribution of the Markov fields (X, U) is defined by the energy function [2]:
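\[
\begin{aligned}
W(x, u \mid \theta) &= \sum_{(s,t) \in C_H} G^H_{(s,t)}(x, u \mid \theta) + \sum_{(s,t) \in C_V} G^V_{(s,t)}(x, u \mid \theta) \\
&= \sum_{(s,t) \in C_H} \Bigl[ \alpha^1_H \bigl(1 - 2\delta(x_s, x_t)\bigr) - \bigl(\alpha^{2a}_H \delta^*(u_s, u_t, a) + \alpha^{2b}_H \delta^*(u_s, u_t, b)\bigr)\bigl(1 - \delta(x_s, x_t)\bigr) \Bigr] \\
&\quad + \sum_{(s,t) \in C_V} \Bigl[ \alpha^1_V \bigl(1 - 2\delta(x_s, x_t)\bigr) - \bigl(\alpha^{2a}_V \delta^*(u_s, u_t, a) + \alpha^{2b}_V \delta^*(u_s, u_t, b)\bigr)\bigl(1 - \delta(x_s, x_t)\bigr) \Bigr]
\end{aligned}
\tag{1}
\]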
where $\theta = \{\alpha^1_H, \alpha^{2a}_H, \alpha^{2b}_H, \alpha^1_V, \alpha^{2a}_V, \alpha^{2b}_V\}$ are the TMF model parameters defining the energy function; $\delta(x_s, x_t) = 1$ for $x_s = x_t$ and $\delta(x_s, x_t) = 0$ for $x_s \neq x_t$; $\delta^*(u_s, u_t, a) = 1$ for $u_s = u_t = a$ and $\delta^*(u_s, u_t, a) = 0$ otherwise; $\delta^*(u_s, u_t, b) = 1$ for $u_s = u_t = b$ and $\delta^*(u_s, u_t, b) = 0$ otherwise. $C_H$ is the set of horizontal cliques and $C_V$ is the set of vertical cliques. $G^H_{(s,t)}(x, u \mid \theta)$ is the horizontal clique potential and $G^V_{(s,t)}(x, u \mid \theta)$ is the vertical clique potential. According to Eq. (1), the energy function is therefore a sum of clique potentials over all possible cliques, and the value of each clique potential depends on the local configuration on the clique. If we assume that the random variables $Y = (Y_s)_{s \in S}$ are conditionally independent given $X = (X_s)_{s \in S}$ and satisfy the hypothesis $p(y_s \mid x, u) = p(y_s \mid x_s)$, the distribution of the triplet (X, U, Y) can be defined as [2]:
\[
p(x, u, y) = \gamma_{TMF} \exp\Bigl[ -W(x, u \mid \theta) + \sum_{s \in S} \log p(y_s \mid x_s) \Bigr]
\tag{2}
\]
where $\gamma_{TMF}$ is the partition function of the TMF model.
TMF models have achieved satisfactory results in nonstationary SAR image segmentation [2,4–6]. However, the classical TMF model has a limited ability to capture contextual information in scale space. Consequently, without the segmentation of larger structures at coarser scales to guide the pixel-level segmentation, some mis-segmentations usually cannot be avoided. Moreover, Ref. [2] does not discuss a reasonable initialization of the random field U; instead, U is obtained during the iterations from the posterior $p(u_s \mid y) = \sum_{x_s \in \Omega} p(x_s, u_s \mid y)$. However, some energy minimization methods, such as iterated conditional modes (ICM), depend strongly on the initial estimate [18]. Consequently, a proper initialization of the random field U, based on an analysis of the nonstationarities of SAR images, is needed.
3. SAR image segmentation using NSCT-TMF model
In this section, to resolve the problems in nonstationary SAR image segmentation discussed above, the NSCT-TMF model is proposed to represent the multiscale information of images more precisely in terms of image configuration and features. We present in detail how to construct this multiscale context model and how to estimate its parameters. For this nonstationary statistical model, a reasonable initialization method for the U field based on nonstationary texture analysis is also presented.
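The classical TMF energy of Eq. (1) in Section 2 can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the dictionary keys for θ, and the integer codes `A` and `B` for the two values of U are assumptions made here for illustration only.

```python
import numpy as np

A, B = 0, 1  # assumed integer codes for the two stationarities a and b


def clique_energy(xs, xt, us, ut, a1, a2a, a2b):
    """Sum of the clique potentials of Eq. (1) over one family of cliques."""
    d = (xs == xt).astype(float)                    # delta(x_s, x_t)
    da = ((us == A) & (ut == A)).astype(float)      # delta*(u_s, u_t, a)
    db = ((us == B) & (ut == B)).astype(float)      # delta*(u_s, u_t, b)
    potentials = a1 * (1 - 2 * d) - (a2a * da + a2b * db) * (1 - d)
    return float(potentials.sum())


def tmf_energy(x, u, theta):
    """W(x, u | theta) for 2-D integer label arrays x (classes) and u (A or B)."""
    horizontal = clique_energy(x[:, :-1], x[:, 1:], u[:, :-1], u[:, 1:],
                               theta["a1H"], theta["a2aH"], theta["a2bH"])
    vertical = clique_energy(x[:-1, :], x[1:, :], u[:-1, :], u[1:, :],
                             theta["a1V"], theta["a2aV"], theta["a2bV"])
    return horizontal + vertical


# Illustrative parameter values only; they are not taken from the paper.
theta = {"a1H": 1.0, "a2aH": 0.5, "a2bH": 0.25,
         "a1V": 1.0, "a2aV": 0.5, "a2bV": 0.25}
x = np.array([[1, 1], [1, 2]])
u = np.full((2, 2), A)
energy = tmf_energy(x, u, theta)
```

Because the distribution in Eq. (2) is proportional to exp(−W), configurations with lower energy, such as a homogeneous label field, are more probable a priori.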
3.1. Nonsubsampled contourlet transform
NSCT is a fully shift-invariant, multiscale and multidirection transform proposed by Cunha et al. as an extension of the contourlet transform [19]. NSCT offers a high degree of directionality and anisotropy, and it can model the dependencies across directions, scales and space. NSCT is therefore a true two-dimensional representation of images. The structure of NSCT consists of two filter banks: (1) a nonsubsampled pyramid that generates the multiscale representation of images; (2) a nonsubsampled directional filter bank (DFB) that gives directional information at each scale level. An overview of NSCT is shown in Fig. 1. The nonsubsampled directional filter bank can generate any power-of-two number of directions at each scale. Moreover, the inter-direction dependency can be captured by NSCT-HMT, and the basis functions have elongated supports that capture linear segments of contours more efficiently [12]. The research in Ref. [19] has shown that NSCT is efficient in image denoising and image enhancement.
Fig. 1. Nonsubsampled contourlet transform: (a) nonsubsampled filter bank structure; (b) frequency partitioning.
3.2. NSCT-TMF likelihoods computation based on NSCT-HMT
To apply Bayesian segmentation, it is necessary to compute the distribution of the triplet fields in the NSCT-TMF model. Here we present the computation of the NSCT-TMF likelihoods. The probability density function (PDF) of nonsubsampled contourlet coefficients is non-Gaussian, as demonstrated by the kurtosis and the histogram shown in Fig. 2: it exhibits a sharp peak at zero amplitude and heavy tails. In this paper, each nonsubsampled contourlet coefficient is modeled by a two-state, zero-mean Gaussian mixture model. As shown in Fig. 2, it is empirically verified that the marginal distributions of NSCT coefficients can be modeled accurately by this simple model.
With each nonsubsampled contourlet coefficient, we associate a discrete hidden state z that takes a value $m \in \{0, 1\}$, denoting small and large variance respectively. A coefficient associated with state ‘‘1’’ is regarded as belonging to an edge area, while a coefficient associated with state ‘‘0’’ is regarded as belonging to a smooth area. The Markovian dependency between the hidden state variables of nonsubsampled contourlet coefficients across scales and directions can be effectively captured by NSCT-HMT. NSCT-HMT is a tree-structured statistical model that captures the dependencies across scales, space and directions. The inter-direction dependency property of NSCT-HMT is a major advantage: the dependencies in NSCT-HMT can span several adjacent directions in the finer scales. According to the research in [12], this can be illustrated by Fig. 3. NSCT-HMT is parameterized by the following parameters: (a) $\pi_{L,d}$: the state probability vector for the root state variable at the coarsest scale L and in direction d; (b) $A_{l,d}$: the state transition probability matrix at scale l and in direction d; (c) $\sigma_{l,d}$: the Gaussian standard deviation of the subband at scale l and in direction d.
To train the NSCT-HMT parameter set $\lambda = \{\pi_{L,d}, A_{l,d}, \sigma_{l,d}\}$, we resort to the iterative expectation–maximization (EM) algorithm [13]. Note that the parent–child dependency is typically the most significant according to the properties of contourlet coefficients [12], so in this paper only the parent–child interactions across scales and directions are taken into account in NSCT-HMT, and the hidden states of parent and child across scales are connected in a Markov chain. The training images can be obtained by picking homogeneous regions from the SAR image. In the training, tying is performed.
Fig. 2. (a) The finest subband in direction D0. Kurtosis is 16.7462. (b) Corresponding histogram and estimated PDF.
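The two-state, zero-mean Gaussian mixture used for each coefficient, and the kurtosis check reported for Fig. 2, can be sketched as follows. This is a minimal illustration on synthetic samples, not on real NSCT coefficients; the function names and the numerical values of the mixing probability and standard deviations are assumptions chosen for the example.

```python
import numpy as np


def mixture_pdf(c, p_large, sigma_small, sigma_large):
    """Density of a two-state, zero-mean Gaussian mixture at values c."""
    def gauss(v, sigma):
        return np.exp(-0.5 * (v / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)
    return ((1.0 - p_large) * gauss(c, sigma_small)
            + p_large * gauss(c, sigma_large))


def kurtosis(c):
    """Sample kurtosis (fourth standardized moment); equals 3 for a Gaussian."""
    c = np.asarray(c, dtype=float)
    centered = c - c.mean()
    return np.mean(centered ** 4) / np.mean(centered ** 2) ** 2


def sample_mixture(n, p_large, sigma_small, sigma_large, seed=0):
    """Draw n samples: hidden state 1 (large variance) with probability p_large."""
    rng = np.random.default_rng(seed)
    state = rng.random(n) < p_large
    sigma = np.where(state, sigma_large, sigma_small)
    return rng.standard_normal(n) * sigma


# Illustrative values only: 20% "edge" coefficients with a 5x larger spread.
samples = sample_mixture(100_000, p_large=0.2, sigma_small=1.0, sigma_large=5.0)
sample_kurtosis = kurtosis(samples)
```

A kurtosis well above 3 indicates the sharp peak and heavy tails described above, which is the behavior a single Gaussian cannot reproduce.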
Fig. 3. (a) Parent–children relationship for a possible NSCT decomposition. (b) Dependency links between subbands of the NSCT decomposition with 4, 4, 8, and 8 directions.
Given the NSCT-HMT parameters, the conditional likelihood of the coefficients in each NSCT-HMT subtree, $f(T^d_{s,l} \mid z^d_{s,l} = m, \lambda^d)$, can be calculated by sweeping up the NSCT-HMT. Here $z^d_{s,l}$ is the hidden state variable of the coefficient at site s and scale l, $\lambda^d$ denotes the NSCT-HMT parameters in direction d, and $T^d_{s,l}$ denotes the subtree of nonsubsampled contourlet coefficients in direction d, rooted at site s and at scale l. Let $\{y^l_s;\ L \geq l \geq 1\}$ be the nonsubsampled contourlet coefficients at scale l in each subband; the multiscale and multidirection likelihood is then given by:
\[
f(y^l_s \mid \lambda) = \prod_{d=1}^{D} f(T^d_{s,l} \mid \lambda^d) = \prod_{d=1}^{D} \sum_{m} p(z^d_{s,l} = m \mid \lambda^d)\, f(T^d_{s,l} \mid z^d_{s,l} = m, \lambda^d)
\tag{3}
\]
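Eq. (3) combines, for one site and one scale, the hidden-state probabilities and the conditional subtree likelihoods of all D directions. The sketch below evaluates it in the log domain; it assumes that both quantities have already been produced by the upward sweep of a trained NSCT-HMT, which is not implemented here, and the function name and array layout are illustrative choices.

```python
import numpy as np


def multidirection_log_likelihood(state_prob, cond_lik):
    """log f(y | lambda) of Eq. (3) for a single site and scale.

    state_prob[d, m]: p(z = m | lambda^d) for direction d and hidden state m
    cond_lik[d, m]:   f(T | z = m, lambda^d), the conditional subtree likelihood
    """
    state_prob = np.asarray(state_prob, dtype=float)
    cond_lik = np.asarray(cond_lik, dtype=float)
    per_direction = (state_prob * cond_lik).sum(axis=1)   # sum over states m
    return float(np.log(per_direction).sum())             # product over d


# Toy numbers for D = 2 directions and two hidden states (not from the paper).
state_prob = [[0.5, 0.5], [0.25, 0.75]]
cond_lik = [[0.2, 0.4], [0.8, 0.4]]
log_lik = multidirection_log_likelihood(state_prob, cond_lik)
```

Evaluating this quantity once per class, with that class's trained parameters, and taking the class with the largest value corresponds to the maximum-likelihood initialization of the X field.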
With the multiscale and multidirection likelihoods at hand, both the global and local image features are considered, and the inter-scale dependencies between NSCT coefficients are captured using a Markov chain. In the segmentation procedure, the initial segmentation of the X field at scale l can be obtained through the maximum-likelihood method.
3.3. Novel initialization of the third random field U in NSCT-TMF model
In the TMF model, the possible nonstationarity of the distribution p(x, y) of the fields (X, Y) is managed by the third random field $U = (U_s)_{s \in S}$. The distribution p(x, u, y) then does not depend on the position of the cliques, and the triplet (X, U, Y) is a stationary Markov field. The TMF model is a natural choice for implementing hidden models of the segmented region and its corresponding stationarity, because it can incorporate the spatial correlations of images into the segmentation process and take the nonstationarities of images into account. Pixel-labeling problems based on a statistical model can be expressed in terms of energy minimization. An important factor is therefore how to obtain a reasonable initialization of the third random field U, because some energy minimization methods depend strongly on the initial estimate [18]. Motivated by this problem, we analyze the nonstationarity of SAR images from the textural point of view and propose a novel initialization of the third random field U according to the parameters of the nonstationary anisotropic Gaussian kernel (NAGK), based on the research in [17]. To analyze the nonstationary texture of SAR images, a nonstationary Gaussian process modeled by the NAGK is introduced to represent SAR texture:
\[
T(\ell) = \int_{\mathbb{R}^2} k_\ell(w)\, g(w)\, dw
\tag{4}
\]
where $k_\ell(\cdot)$ is the NAGK, $g(\cdot)$ denotes Gaussian white noise, and $\ell$ denotes the 2-D spatial location.
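A discrete version of Eq. (4) can be sketched as a location-dependent weighted sum of white noise. The sketch rests on assumptions not stated at this point in the text: the kernel is taken to be a Gaussian centered at the location ℓ, parameterized by an orientation angle and two standard deviations, and normalized so that the output has unit variance. The function names are hypothetical.

```python
import numpy as np


def gaussian_kernel(dy, dx, angle, sigma_h, sigma_v):
    """Anisotropic Gaussian weights for offsets (dy, dx), rotated by angle."""
    c, s = np.cos(angle), np.sin(angle)
    along = c * dx + s * dy        # offset along the kernel's main axis
    across = -s * dx + c * dy      # offset across the main axis
    k = np.exp(-0.5 * ((along / sigma_h) ** 2 + (across / sigma_v) ** 2))
    # Assumed normalization: unit output variance for unit-variance noise.
    return k / np.sqrt((k ** 2).sum())


def simulate_texture(noise, angle_map, sigma_h, sigma_v):
    """Discrete Eq. (4): T(l) = sum over w of k_l(w) g(w), kernel centered at l."""
    rows, cols = noise.shape
    yy, xx = np.mgrid[0:rows, 0:cols]
    texture = np.empty((rows, cols), dtype=float)
    for i in range(rows):
        for j in range(cols):
            k = gaussian_kernel(yy - i, xx - j, angle_map[i, j], sigma_h, sigma_v)
            texture[i, j] = (k * noise).sum()
    return texture


rng = np.random.default_rng(0)
noise = rng.standard_normal((16, 16))
# Nonstationary example: the orientation differs between the two image halves.
angle_map = np.zeros((16, 16))
angle_map[:, 8:] = np.pi / 2
texture = simulate_texture(noise, angle_map, sigma_h=3.0, sigma_v=1.0)
```

Letting the kernel parameters change with location is what makes the simulated texture nonstationary; with a constant `angle_map` the sketch reduces to ordinary filtering of white noise.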
The NAGK $k_\ell(\cdot)$ is oriented at the angle $\omega$ and characterized by its spatial standard deviations $\{\sigma_h, \sigma_v\}$. These NAGK parameters represent the nonstationary texture statistics of SAR images. The local autocovariance, a second-order statistic, is regarded as the statistical descriptor of textures that describes the spatial correlations within a certain spatial extent. The local autocovariance can be expressed as a function of the kernel, so the NAGK parameters and the local autocovariance of the texture are directly related. In this case, the local autocovariance of the texture is given by:
\[
C_T(d) = \sigma_T^2 \exp\bigl(-d^{T} \Sigma_T^{-1} d\bigr)
\tag{5}
\]
where $d$ denotes the 2-D displacement vector between the two sites, $\sigma_T^2$ is the (0, 0) coefficient of $C_T$ and can be retrieved from the autocovariance of the SAR intensity, and $\Sigma_T$ is decomposed as follows:
\[
\Sigma_T = M_\omega^{T} N M_\omega
\tag{6}
\]
where
\[
M_\omega = \begin{pmatrix} \cos\omega & \sin\omega \\ -\sin\omega & \cos\omega \end{pmatrix}, \qquad
N = \begin{pmatrix} (2\sigma_h)^2 & 0 \\ 0 & (2\sigma_v)^2 \end{pmatrix}.
\]
Based on Eqs. (5) and (6), the parameters $\{\omega, \sigma_h, \sigma_v\}$ of the NAGK can be derived from the local autocovariance by means of geometric moments. Given the NAGK parameters, K-means is used to initialize the random field U. The proposed initialization of the U field based on NAGK parameters is shown in Fig. 4. The simulated SAR image is shown in Fig. 4a, and the hand segmentation of the random field U is shown in Fig. 4b. Compared with the hand segmentation, the initialization of the random field U obtained by the proposed method, shown in Fig. 4c, is reasonable.
Fig. 4. Initialization of U field based on NAGK parameters: (a) simulated SAR image; (b) U field; (c) novel initialization of U field.
3.4. NSCT-TMF energy function based on multiscale Bayesian decision fusion
Our aim is to enable the NSCT-TMF model to capture contextual information in terms of spatial configuration, thus achieving multiscale Bayesian decision fusion. To this end, the coarser-scale segmentations $X^{l+1}$ and $U^{l+1}$ are used to guide the segmentations $X^l$ and $U^l$ at the finer scale, and $(X^{l+1}, U^{l+1}) \rightarrow (X^l, U^l)$ forms a Markov chain. Fig. 5 illustrates this case. Note that the guidance of large-scale information exists in both random fields (X, U). In fact, the initialization of the U field at a coarser scale takes longer-range interactions into account when the autocovariance of the SAR texture is calculated.
Fig. 5. Multiscale labeling structure in NSCT-TMF model.
When defining the multiscale energy function, two factors should be taken into account: (1) the spatial interaction between pixels within the same scale, which models the local spatial correlation of the image structure; (2) the causal relationship between parent and child across scales, which provides the contextual information in scale space for the segmentation. At scale l, $0 \leq l \leq L-1$, the new multiscale causal energy function can then be expressed as:
\[
\begin{aligned}
Q^l(x^l, u^l \mid x^{l+1}, u^{l+1}, \eta) ={}& \sum_{(s,t) \in C^l} \Bigl[ \kappa^l_1 \alpha^1_{st} \bigl(1 - 2\delta(x^l_s, x^l_t)\bigr) - \bigl(\alpha^{2a}_{st} \delta^*(u^l_s, u^l_t, a) + \alpha^{2b}_{st} \delta^*(u^l_s, u^l_t, b)\bigr)\bigl(1 - \delta(x^l_s, x^l_t)\bigr) \Bigr] \\
&+ \sum_{s \in S^l} \kappa^l_2 \Bigl[ \beta^1 \bigl(1 - 2\delta(x^l_s, x^{l+1}_{\mathrm{parent}(s)})\bigr) + \beta^{2a} \bigl(1 - 2\delta^*(u^l_s, u^{l+1}_{\mathrm{parent}(s)}, a)\bigr) + \beta^{2b} \bigl(1 - 2\delta^*(u^l_s, u^{l+1}_{\mathrm{parent}(s)}, b)\bigr) \Bigr]
\end{aligned}
\tag{7}
\]
where $\eta = \{\alpha^1_{st}, \alpha^{2a}_{st}, \alpha^{2b}_{st}, \beta^1, \beta^{2a}, \beta^{2b}\}$ denotes the NSCT-TMF model parameters and, for consistency with Eq. (1), $\alpha^1_{st} = \{\alpha^1_H, \alpha^1_V\}$, $\alpha^{2a}_{st} = \{\alpha^{2a}_H, \alpha^{2a}_V\}$, $\alpha^{2b}_{st} = \{\alpha^{2b}_H, \alpha^{2b}_V\}$. $\kappa^l_1$ and $\kappa^l_2$ are weighting parameters of the NSCT-TMF model; according to [9], $\kappa^l_1 = 2^l + 2(2^l - 1)$ and $\kappa^l_2 = 4^l$. $\alpha^1_{st}, \alpha^{2a}_{st}, \alpha^{2b}_{st}$ are the intra-scale model parameters $\eta_{intra}$, while $\beta^1, \beta^{2a}, \beta^{2b}$ are the inter-scale fusion parameters $\eta_{inter}$. $C^l$ is the set of cliques at scale l. At the coarsest scale, the energy function is defined as:
\[
W^L(x^L, u^L) = \sum_{(s,t) \in C^L} \Bigl[ \kappa^L_1 \alpha^1_{st} \bigl(1 - 2\delta(x^L_s, x^L_t)\bigr) - \bigl(\alpha^{2a}_{st} \delta^*(u^L_s, u^L_t, a) + \alpha^{2b}_{st} \delta^*(u^L_s, u^L_t, b)\bigr)\bigl(1 - \delta(x^L_s, x^L_t)\bigr) \Bigr]
\tag{8}
\]
Finally, at scale l, the posterior distribution of the triplet $(X^l, U^l, Y^l)$ of the NSCT-TMF model is defined as:
\[
P_{X^l, U^l \mid X^{l+1}, U^{l+1}, Y^l}(x^l, u^l \mid x^{l+1}, u^{l+1}, y^l) \propto \gamma^l_{NSCT\text{-}TMF} \exp\Bigl[ -Q^l(x^l, u^l \mid x^{l+1}, u^{l+1}, \eta) + \sum_{s \in S^l} h_s(x^l_s, y^l_s) \Bigr]
\tag{9}
\]
where $\gamma^l_{NSCT\text{-}TMF}$ is the partition function of the NSCT-TMF model and $h_s(x^l_s, y^l_s)$ is the NSCT-TMF likelihood, which can be expressed as:
\[
h_s(x^l_s, y^l_s) =
\begin{cases}
\log p(y^l_s \mid x^l_s), & l = 0 \\[4pt]
\log \Bigl( \prod_{d=1}^{D} \sum_{m} p(z^d_{s,l} = m)\, f(T^d_{s,l} \mid z^d_{s,l} = m, \lambda^d) \Bigr), & l = 1, 2, \ldots, L
\end{cases}
\tag{10}
\]
At the coarsest scale, the posterior distribution is given by:
\[
P_{X^L, U^L \mid Y^L}(x^L, u^L \mid y^L) \propto \gamma^L_{NSCT\text{-}TMF} \exp\Bigl[ -W^L(x^L, u^L) + \sum_{s \in S^L} h_s(x^L_s, y^L_s) \Bigr]
\tag{11}
\]
Note that the segmentation is at pixel level when l = 0, and the pixel-level segmentation cannot be obtained directly from NSCT-HMT. To carry the segmentation down to pixel level, a statistical model for the pixel brightness of each class is required. In this paper, the generalized Gamma (GGamma) model is adopted for the SAR backscattered signal. The feasibility of this model for characterizing the statistics of SAR data is investigated in Ref. [20]. As shown in Fig. 6a, the chosen regions correspond to three classes, and the estimated PDFs and histograms for the three classes are shown in Fig. 6b. This shows that the model fits the histograms of SAR images well. The parameters of the GGamma model are estimated according to the method proposed by Song [21]. Given these model parameters, the likelihood of each pixel can be computed. The pixel-level segmentation can then be obtained by extending the inter-scale transition to pixel level. Based on the discussion above, the segmentation of SAR images is stated as the conditional MAP estimations:
\[
\hat{x}^L = \arg\max_{X^L} \sum_{U^L \in \Lambda} P_{X^L, U^L \mid Y^L}(x^L, u^L \mid y^L)
\tag{12}
\]
\[
\hat{x}^l = \arg\max_{X^l} \sum_{U^l \in \Lambda} P_{X^l, U^l \mid X^{l+1}, U^{l+1}, Y^l}(x^l, u^l \mid x^{l+1}, u^{l+1}, y^l), \qquad 0 \leq l \leq L-1
\tag{13}
\]
Because it is difficult to maximize the posterior probability of the triplet (X, U, Y) directly, iterated conditional modes (ICM) is used to maximize the local conditional probabilities sequentially.
3.5. Estimation of NSCT-TMF model parameters
In this paper, to estimate the NSCT-TMF model parameters, we resort to the iterative conditional estimation (ICE) method, which has been successfully applied to the parameter estimation of hidden MRF and TMF models [2,22,23]. The NSCT-TMF model is a multiscale framework whose parameters capture the dependencies across scales, so the ICE method used in non-multiscale models has to be extended. This extension is implemented by running the Gibbs sampler at the current scale conditioned on the labeling of the coarser scale. When updating the model parameters, either the least square error method proposed by Derin and Elliott [24] or the stochastic gradient (SG) algorithm [22,23] can be used. The least square error method requires solving, in the least-squares sense, the over-determined linear systems corresponding to the different scales. In this paper, the SG algorithm is used. After Gibbs sampling from coarse scale to fine scale, motivated by the simplicity-stability tradeoff, the samples at scale 0 and scale 1 (which in fact fuse the global structure information and features from the coarser scales) are used to estimate the inter-scale fusion parameters and the intra-scale model parameters based on SG. The procedure of parameter estimation is shown in Table 1. Finally, the complete steps of the proposed multiscale SAR image segmentation algorithm using the NSCT-TMF model are shown in Table 2.
Fig. 6. (a) SAR image and (b) estimated PDFs and histograms for three classes.
Table 1
The procedure of parameter estimation.
1. Take $\eta^{(0)}$ as the initial value;
2. $\eta^{(t+1)}$ is computed from $\eta^{(t)}$ and $(Y^L, Y^{L-1}, \ldots, Y^0)$ in the following way:
– Using the Gibbs sampler, simulate q realizations $(x^l_1, u^l_1), (x^l_2, u^l_2), \ldots, (x^l_q, u^l_q)$ of $(X^l, U^l)$ according to the posterior distributions given by Eqs. (11) and (9), corresponding to $\eta^{(t)}$ and $(Y^L, Y^{L-1}, \ldots, Y^0)$, from coarse scale to fine scale;
– For each obtained set $\{(x^L_j, u^L_j), (x^{L-1}_j, u^{L-1}_j), \ldots, (x^0_j, u^0_j)\}$, estimate the intra-scale model parameters and inter-scale fusion parameters $\eta_j = \{\eta_{intra}, \eta_{inter}\}_j$ with the SG algorithm (which requires n new realizations simulated by the Gibbs sampler according to the conditional prior distribution derived from Eq. (7), from coarse scale to fine scale) by the following formulas:
\[
\eta_{intra}(i+1) = \eta_{intra}(i) + \frac{A}{i+1} \Bigl[ \nabla_{\eta_{intra}} Q(\tilde{x}^0_{i+1}, \tilde{u}^0_{i+1} \mid \tilde{x}^1_{i+1}, \tilde{u}^1_{i+1}) - \nabla_{\eta_{intra}} Q(x^0, u^0 \mid x^1, u^1) \Bigr], \qquad 0 \leq i \leq n-1
\]
\[
\eta_{inter}(i+1) = \eta_{inter}(i) + \frac{A}{i+1} \Bigl[ \nabla_{\eta_{inter}} Q(\tilde{x}^0_{i+1}, \tilde{u}^0_{i+1} \mid \tilde{x}^1_{i+1}, \tilde{u}^1_{i+1}) - \nabla_{\eta_{inter}} Q(x^0, u^0 \mid x^1, u^1) \Bigr], \qquad 0 \leq i \leq n-1
\]
where A is a constant; $\nabla_{\eta_{intra}} Q(\cdot)$ and $\nabla_{\eta_{inter}} Q(\cdot)$ are the gradients of $Q(\cdot)$ with respect to $\eta_{intra}$ and $\eta_{inter}$; $(\tilde{x}^0_{i+1}, \tilde{u}^0_{i+1})$ and $(\tilde{x}^1_{i+1}, \tilde{u}^1_{i+1})$ are realizations of $(X^0, U^0)$ and $(X^1, U^1)$ simulated by the Gibbs sampler according to the conditional prior distribution, from coarse scale to fine scale, using the current model parameters $\{\eta_{intra}(i), \eta_{inter}(i)\}$; and $(x^0, u^0)$ and $(x^1, u^1)$, which vary with the iterations, are realizations of $(X^0, U^0)$ and $(X^1, U^1)$ simulated by the Gibbs sampler according to the posterior distribution, from coarse scale to fine scale, using the current model parameters. Then $\eta_j$ is obtained;
– $\eta^{(t+1)}$ is given by $\eta^{(t+1)} = \frac{1}{q}(\eta_1 + \eta_2 + \cdots + \eta_q)$;
3. If the sequence $\eta^{(t)}$ becomes steady, parameter estimation is finished.
Table 2
The procedure of the proposed algorithm.
1. Train the NSCT-HMT parameters using the EM algorithm and the GGamma model parameters using the method proposed by Song [21].
2. Perform the initial segmentation of (X, U).
– Initialize the X field using the ML method at each scale.
– Initialize the U field using K-means according to the NAGK parameters at each scale.
3. Initialize the NSCT-TMF model parameters $\eta^{(0)}$.
4. If l = L, segment the SAR image at the coarsest scale according to Eq. (12); otherwise, segment the SAR image at the current scale according to Eq. (13). Set l = l − 1;
5. If l ≥ 0, go to step 4; otherwise go to step 6.
6. Estimate the NSCT-TMF model parameters.
7. Segment the SAR image as in step 4 with the new NSCT-TMF model parameters obtained in step 6 to obtain the final pixel-level segmentation.
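One SG step of Table 1 for the inter-scale fusion parameters can be sketched as follows. Since the energy of Eq. (7) is linear in (β1, β2a, β2b), its gradient with respect to these parameters is simply the weighted sum of the corresponding agreement terms. The sketch assumes that the parent labels have already been mapped onto the child lattice (so `x_parent` and `u_parent` have the same shape as `x` and `u`), uses integer codes `A` and `B` for the two values of U, and leaves out the Gibbs sampler that would produce the realizations; all names are illustrative.

```python
import numpy as np

A, B = 0, 1  # assumed integer codes for the two stationarities a and b


def kappa1(l):
    """Intra-scale weight at scale l: 2^l + 2(2^l - 1)."""
    return 2 ** l + 2 * (2 ** l - 1)


def kappa2(l):
    """Inter-scale weight at scale l: 4^l."""
    return 4 ** l


def inter_scale_gradient(x, u, x_parent, u_parent, l):
    """Gradient of Q^l in Eq. (7) with respect to (beta1, beta2a, beta2b)."""
    same_x = (x == x_parent).astype(float)
    both_a = ((u == A) & (u_parent == A)).astype(float)
    both_b = ((u == B) & (u_parent == B)).astype(float)
    return kappa2(l) * np.array([(1 - 2 * same_x).sum(),
                                 (1 - 2 * both_a).sum(),
                                 (1 - 2 * both_b).sum()])


def sg_step(eta, grad_prior_sample, grad_posterior_sample, i, step_constant):
    """eta(i+1) = eta(i) + A/(i+1) * (gradient at prior sample - gradient at posterior sample)."""
    eta = np.asarray(eta, dtype=float)
    return eta + step_constant / (i + 1) * (np.asarray(grad_prior_sample)
                                            - np.asarray(grad_posterior_sample))


# Toy 2x2 fields in which every child agrees with its parent (illustration only).
x = np.ones((2, 2), dtype=int)
u = np.full((2, 2), A)
grad = inter_scale_gradient(x, u, x, u, l=0)
eta_next = sg_step([1.0, 0.5, 0.25], grad, np.zeros(3), i=1, step_constant=0.1)
```

In the ICE loop, the estimates obtained from the q posterior realizations would then be averaged to give the next parameter value, as in the last sub-step of step 2 in Table 1.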
4. Experiments and discussions
4.1. Data sets description and reference models
In this paper, an optical image and four real SAR images are used to illustrate the validity and general applicability of the proposed NSCT-TMF model. The prior information on the real SAR images is shown in Table 3. The optical image used in the experiments covers an area of Oberpfaffenhofen near Munich, Germany, and is shown in Fig. 7a. In the experiments, the optical image is corrupted by multiplicative speckle noise with different variances. The corresponding effective numbers of looks (ENL) of the corrupted images are 2, 4, 6, 8, and 10, respectively.
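The corruption of a clean image by multiplicative speckle at a prescribed ENL can be sketched as below. The paper does not state which speckle distribution was used; the sketch assumes the common model for multilook intensity images, unit-mean Gamma-distributed speckle with shape equal to the number of looks, and estimates the ENL over a homogeneous region as the squared mean divided by the variance. Function names are hypothetical.

```python
import numpy as np


def add_speckle(image, looks, seed=0):
    """Multiply an image by unit-mean Gamma speckle with the given number of looks."""
    rng = np.random.default_rng(seed)
    speckle = rng.gamma(shape=looks, scale=1.0 / looks, size=np.shape(image))
    return np.asarray(image, dtype=float) * speckle


def estimate_enl(region):
    """ENL of a homogeneous region: mean^2 / variance."""
    region = np.asarray(region, dtype=float)
    return region.mean() ** 2 / region.var()


# A flat test image stands in for a homogeneous area (illustration only).
clean = np.full((256, 256), 100.0)
noisy = add_speckle(clean, looks=4)
enl = estimate_enl(noisy)
```

Under this assumption, lower ENL values correspond to stronger speckle, so the 2-look image is the hardest case among those considered.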
Table 3
The prior knowledge of the real SAR images.
Image              Size       Class  ENL  Band  Image source
Oberpfaffenhofen   512 × 512  3      2    L     ESAR
Sanfrancisco       512 × 512  3      4    L     AIRSAR
Chinalake Airport  256 × 256  3      6    Ku    SNL
Niigata            512 × 512  3      2    L     PISAR
Two reference models are used for SAR image segmentation in comparison with our proposed NSCT-TMF model: (1) the novel MRF model proposed by Deng and Clausi [1], in which a function-based parameter is utilized to weight the image spatial relationships against the image features; (2) the TMF model proposed by Benboudjema and Pieczynski [2].

4.2. Analysis of performance in robustness against speckle

To analyze the robustness of NSCT-TMF model against speckle, the optical images corrupted by multiplicative speckle noise with different ENLs are segmented into three classes using the novel MRF model, TMF model and NSCT-TMF model, respectively. Given the hand segmentation as reference, we evaluate the segmentation performance with two global quality indicators: the overall accuracy s and the Kappa parameter [25]. The two indicators obtained by the proposed and the reference segmentation methods are shown in Fig. 7. As shown in Fig. 7b and c, for each ENL, the values of s and the Kappa parameter obtained with TMF model are mostly higher than those obtained with the novel MRF model, while NSCT-TMF model achieves the best values of both indicators. This demonstrates the superiority of NSCT-TMF model in segmentation.

On the other hand, the steepness of the s and Kappa curves reveals the robustness against speckle noise: the flatter the curve, the less the result depends on the noise level. As shown in Fig. 7b, the s curve obtained with NSCT-TMF model is flatter than the others. Its amplitude difference is 0.0657, while those of the novel MRF model and TMF model are 0.2284 and 0.2008, respectively. The Kappa parameter behaves analogously. For the Kappa curve obtained with NSCT-TMF model, the amplitude difference between ENL = 2 and ENL = 10 is 0.0961, while those of the novel MRF model and TMF model are 0.3374 and 0.2946, respectively.
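For reference, the two global indicators used above can be computed from a confusion matrix between the hand segmentation and a segmentation result. The sketch below assumes that the Kappa parameter is Cohen's kappa coefficient; the function names and the six-pixel example are hypothetical and serve only to show the arithmetic.

```python
import numpy as np


def confusion_matrix(reference: np.ndarray, predicted: np.ndarray, n_classes: int) -> np.ndarray:
    """Rows index the reference class, columns the predicted class."""
    idx = reference.ravel() * n_classes + predicted.ravel()
    return np.bincount(idx, minlength=n_classes**2).reshape(n_classes, n_classes)


def overall_accuracy(cm: np.ndarray) -> float:
    """Fraction of pixels whose predicted class equals the reference class."""
    return float(np.trace(cm) / cm.sum())


def cohen_kappa(cm: np.ndarray) -> float:
    """Agreement corrected for the agreement expected by chance."""
    n = cm.sum()
    observed = np.trace(cm) / n
    expected = float(cm.sum(axis=0) @ cm.sum(axis=1)) / n**2
    return float((observed - expected) / (1.0 - expected))


reference = np.array([0, 0, 1, 1, 2, 2])
predicted = np.array([0, 0, 1, 2, 2, 2])
cm = confusion_matrix(reference, predicted, n_classes=3)
print(cm)
print(overall_accuracy(cm), cohen_kappa(cm))
```

The amplitude difference quoted for each curve is then simply the difference between the indicator values at the two ends of the ENL range.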
Thus, by both indicators, NSCT-TMF model turns out to be more robust against speckle noise.

4.3. Experimental results on SAR images

The real SAR images are shown in Fig. 8 case1(a), case2(a), case3(a) and case4(a), respectively. The corresponding segmentation results are shown in Fig. 8 case1(b)–(d), case2(b)–(d), case3(b)–(d) and case4(b)–(d), respectively. According to the segmentation results, the method based on TMF model achieves better performance than the novel MRF model. The reason is that the recently proposed TMF model takes into account the nonstationarity of SAR images, and thus turns out to be a more accurate statistical model of their spatial structures. As shown in Fig. 8, TMF model suppresses speckle well and obtains more accurate edges than the novel MRF model. However, there are still some defects in the results obtained by TMF model, such as mis-segmentations, because TMF model lacks coarser-scale structure information as guidance. As shown in Fig. 8 case2(c), many of the urban areas are mis-segmented according to the corresponding optical image.

NSCT-TMF model reduces these mis-segmentations effectively and achieves better segmentation. Compared to Fig. 8 case1–4(c), the regions in Fig. 8 case1–4(d) are clearly more homogeneous, which shows that NSCT-TMF model is highly robust against speckle. Mis-segmentations and ambiguities also decrease in the results based on NSCT-TMF model. For instance, the segmentation of urban areas in Fig. 8 case2(d) is more accurate than the result obtained with TMF model. Moreover, the edge locations in Fig. 8 case1(d) are more accurate.

These improvements can be attributed to the multiscale information fusion in NSCT-TMF model. In NSCT-TMF model, both the dependencies across scales and the intra-scale interactions in the label fields (X, U) are captured by multiscale energy functions. In addition, the SAR intensity and the nonsubsampled contourlet coefficients are used as features to represent the difference between the targets at different scales. Consequently, according to the distribution of the triplet fields (Xl, Ul, Yl), NSCT-TMF model can capture multiscale image information in terms of image configuration and features.
In this way, the contextual information is propagated effectively. The larger structures segmented at a coarser scale can serve as a mask for segmenting the finer structures at a finer scale, and thus the mis-segmentations are reduced. Moreover, this multiscale context model in NSCT domain has a high degree of directionality and anisotropy and can capture the inter-direction dependencies of NSCT coefficients, so the segmentation results exhibit better edge locations. Note that the nonstationarities of SAR image texture at different scales are also considered: when we initialize the U field, we derive the NAGK parameters from the local autocovariance, so longer-range interactions between pixels (the spatial correlation) are in fact taken into account at coarser scales.

Now, we proceed with the numerical evaluation of the segmentation performance on the real SAR images. Two objective evaluation criteria [26] are utilized: the variance RIvar of the ratio image and the normalized log measure D of the ratio image. To perform the objective evaluation, the ratio
Fig. 7. (a) Optical image. (b) Analysis of performance in accuracy. (c) Analysis of performance in Kappa parameter.
Fig. 8. The segmentation results of real SAR images. Case1: Oberpfaffenhofen; Case2: Sanfrancisco; Case3: Chinalake Airport; Case4: Niigata. For each case: (a) SAR image, (b) novel MRF, (c) TMF, (d) NSCT-TMF.
image is needed. The ratio image is obtained by dividing the original image by its segmentation, that is, each pixel value is divided by the mean of the segment it belongs to. The normalized log measure D is defined as:

$$D = \sum_{k=1}^{K} \frac{n_k}{n} D_k \qquad (14)$$

where $n_k$ is the number of pixels that belong to segment k, n is the total number of pixels of the observed image, and $D_k$ is the normalized log measure in segment k, which can be written as follows:

$$D_k = \frac{1}{n_k} \sum_{s \in S_k} \ln \frac{I_s}{I_k} = \overline{\ln r_k} \qquad (15)$$

where $I_s$ is the value of a pixel that belongs to segment k, $I_k$ is the mean of segment k, r denotes the ratio image, and the overbar denotes the average over the corresponding pixels. Substituting Eq. (15) into Eq. (14), we obtain:

$$D = \sum_{k=1}^{K} \frac{n_k}{n} \overline{\ln r_k} = \overline{\ln r} \qquad (16)$$

The value of |D| should be close to zero. In theory, the smaller the value of |D| is, the better the segmentation performance is. Unusually large values of |D| indicate heterogeneity within the regions of the segmented image.
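The computation of the ratio image and of D described above can be sketched as follows. This is a minimal sketch rather than the evaluation code used in the experiments: the function names are hypothetical, the test image is synthetic, and gamma-distributed speckle is assumed for the demonstration only.

```python
import numpy as np


def ratio_image(image: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Divide each pixel by the mean intensity of the segment it belongs to."""
    ratio = np.empty(image.shape, dtype=float)
    for k in np.unique(labels):
        mask = labels == k
        ratio[mask] = image[mask] / image[mask].mean()
    return ratio


def normalized_log_measure(image: np.ndarray, labels: np.ndarray) -> float:
    """D as the n_k/n-weighted sum of the per-segment means of ln r."""
    log_r = np.log(ratio_image(image, labels))
    n = labels.size
    d = 0.0
    for k in np.unique(labels):
        mask = labels == k
        d += mask.sum() / n * log_r[mask].mean()
    return float(d)


rng = np.random.default_rng(0)
labels = np.zeros((64, 64), dtype=int)
labels[:, 32:] = 1
clean = np.where(labels == 0, 50.0, 200.0)           # two homogeneous regions
image = clean * rng.gamma(4.0, 1.0 / 4.0, clean.shape)  # assumed 4-look speckle

d = normalized_log_measure(image, labels)
# The weighted sum over segments equals the plain mean of ln r over the image.
d_direct = float(np.log(ratio_image(image, labels)).mean())
print(d, d_direct)
```

Because the ratio has unit mean inside every segment, the mean of its logarithm cannot be positive, so D itself is never above zero and |D| measures how far the segment-wise ratio departs from a constant.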
The variance RIvar of the ratio image can be defined as:

$$RI_{var} = \sum_{k=1}^{K} \frac{n_k - 1}{n - 1} v_k \qquad (17)$$

$$v_k = \frac{n_k}{n_k - 1} \left( \overline{r_k^2} - \bar{r}_k^2 \right) \qquad (18)$$

where $\bar{r}_k$ denotes the average ratio and $\overline{r_k^2}$ the average squared ratio over segment k. RIvar represents the variation of the pixel values in the ratio image. In theory, the smaller the value is, the better the segmentation performance is. The results are shown in Table 4. Compared to the values obtained by the two reference models, the values of |D| and RIvar obtained by the proposed model are lower for all four images, which indicates that the segmentations are more accurate and the segmented regions are more homogeneous. Consequently, the objective evaluation also demonstrates that the proposed NSCT-TMF model improves on recently proposed models, including the novel MRF model and TMF model.

Table 4
The comparison of segmentation performance.

Image               Novel MRF           TMF                 NSCT-TMF
                    |D|      RIvar      |D|      RIvar      |D|      RIvar
Oberpfaffenhofen    0.1494   0.2626     0.1020   0.2563     0.0859   0.2015
Sanfrancisco        0.1636   0.4338     0.0966   0.2690     0.0800   0.1856
Chinalake Airport   0.1910   0.2813     0.1838   0.2369     0.1793   0.2292
Niigata             0.0961   0.2045     0.0896   0.1938     0.0767   0.1538

5. Conclusions

In this paper, we have proposed a multiscale and multidirection NSCT-TMF model for SAR image multiclass segmentation. The new model is capable of capturing the multiscale image information, including image configuration and features. The inter-scale dependencies and intra-scale interactions in the label fields (X, U) are captured by NSCT-TMF model, while the dependencies of NSCT coefficients across scales and directions are taken into account by NSCT-HMT. Moreover, the local autocovariance (regarded as a statistical descriptor of textures) is utilized to analyze the nonstationarity of SAR images, and we have proposed a novel initialization of the third random field U according to NAGK parameters. In this way, the nonstationarities of SAR image texture within a range of scales are considered.

NSCT-TMF model has been applied to SAR image segmentation. Experimental results show that the multiscale decision fusion implemented by NSCT-TMF model drives the propagation of contextual information of image content effectively, and that the larger structures segmented at coarser scales can provide a mask for the pixel-level segmentation. Compared to recently proposed models, NSCT-TMF model achieves multiscale information fusion in segmentation, and thus takes the contextual information and nonstationary property into account more effectively. Consequently, in nonstationary SAR image segmentation, NSCT-TMF model is more robust against speckle noise and obtains better region homogeneity and more accurate edge locations.

In this paper, we consider two different stationarities of SAR images, which is only a simple case. More complex images may need a larger set of possible stationarities to describe their nonstationary property. In the future, a larger set of possible stationarities can be incorporated into NSCT-TMF model to improve this multiscale context model.

Acknowledgements

The authors would like to thank the anonymous reviewers for their constructive comments. This work was supported by the Natural Science Foundation of China (Nos. 61271297; 61272281), the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20110203110001), and the Program for Changjiang Scholars and Innovative Research Team in University under Grant No. IRT 0954.

References
[1] H.W. Deng, D.A. Clausi, Unsupervised segmentation of synthetic aperture radar sea ice imagery using a novel Markov random field model, IEEE Trans. Geosci. Remote Sens. 43 (3) (2005) 528–538.
[2] D. Benboudjema, W. Pieczynski, Unsupervised statistical segmentation of nonstationary images using triplet Markov fields, IEEE Trans. Pattern Anal. Mach. Intell. 29 (8) (2007) 1367–1378.
[3] F. Galland, J.M. Nicolas, H. Sportouche, M. Roche, F. Tupin, P. Refregier, Unsupervised synthetic aperture radar image segmentation using Fisher distributions, IEEE Trans. Geosci. Remote Sens. 47 (8) (2009) 2966–2972.
[4] Y. Wu, M. Li, P. Zhang, H.T. Zong, P. Xiao, C.Y. Liu, Unsupervised multi-class segmentation of SAR images using triplet Markov fields models based on edge penalty, Pattern Recogn. Lett. 32 (2011) 1532–1540.
[5] Y. Wu, X. Wang, P. Xiao, L. Gan, C.Y. Liu, M. Li, Fast algorithm based on triplet Markov fields for unsupervised multi-class segmentation of SAR images, Sci. China Ser. F: Inform. Sci. 40 (12) (2011) 1636–1645.
[6] P. Zhang, M. Li, Y. Wu, L. Gan, M. Liu, F. Wang, G.F. Liu, Unsupervised multiclass segmentation of SAR images using fuzzy triplet Markov fields model, Pattern Recogn. 45 (2012) 4018–4033.
[7] C. D’Elia, G. Poggi, G. Scarpa, A tree-structure Markov random field model for Bayesian image segmentation, IEEE Trans. Image Process. 12 (10) (2003) 1259–1273.
[8] C. Collet, F. Murtagh, Multiband segmentation based on a hierarchical Markov model, Pattern Recogn. 37 (2004) 2337–2347.
[9] M. Mignotte, C. Collet, P. Perez, P. Bouthemy, Sonar image segmentation using an unsupervised hierarchical MRF model, IEEE Trans. Image Process. 9 (7) (2000) 1216–1231.
[10] A. Katartzis, I. Vanhamel, H. Sahli, A hierarchical Markovian model for multiscale region-based classification of vector-valued images, IEEE Trans. Geosci. Remote Sens. 43 (3) (2005) 548–558.
[11] H. Choi, R.G. Baraniuk, Multiscale image segmentation using wavelet-domain hidden Markov models, IEEE Trans. Image Process. 10 (9) (2001) 1309–1321.
[12] D.D.Y. Po, M.N. Do, Directional multiscale modeling of images using the contourlet transform, IEEE Trans. Image Process. 15 (6) (2006) 1610–1620.
[13] M.S. Crouse, R.D. Nowak, R.G. Baraniuk, Wavelet-based statistical signal processing using hidden Markov models, IEEE Trans. Signal Process. 46 (4) (1998) 886–902.
[14] Z.L. Long, N.H. Younan, Statistical image modeling in the contourlet domain using contextual hidden Markov models, Signal Process. 89 (2009) 946–951.
[15] N. Signolle, M. Revenu, B. Plancoulaine, P. Herlin, Wavelet-based multiscale texture segmentation: application to stromal compartment characterization on virtual slides, Signal Process. 90 (2010) 2412–2422.
[16] M. Li, Y. Wu, Q. Zhang, SAR image segmentation based on mixture context and wavelet hidden-class-label Markov random field, Comput. Math. Appl. 57 (2009) 961–969.
[17] O. D’Hondt, C. Lopez-Martinez, L. Ferro-Famil, E. Pottier, Spatially nonstationary anisotropic texture analysis in SAR images, IEEE Trans. Geosci. Remote Sens. 45 (12) (2007) 3905–3918.
[18] S.Z. Li, Markov Random Field Modeling in Image Analysis, third ed., Springer-Verlag, London, 2009.
[19] A.L.D. Cunha, J.P. Zhou, M.N. Do, The nonsubsampled contourlet transform: theory, design, and applications, IEEE Trans. Image Process. 15 (10) (2006) 3089–3101.
[20] H.C. Li, W. Hong, Y.R. Wu, P.Z. Fan, An efficient and flexible statistical model based on generalized Gamma distribution for amplitude SAR images, IEEE Trans. Geosci. Remote Sens. 48 (6) (2010) 2711–2722.
[21] K.S. Song, Globally convergent algorithms for estimating generalized Gamma distributions in fast signal and image processing, IEEE Trans. Image Process. 17 (8) (2008) 1233–1250.
[22] F. Salzenstein, W. Pieczynski, Parameter estimation in hidden fuzzy Markov random fields and image segmentation, Graph. Models Image Process. 59 (4) (1997) 205–220.
[23] D. Benboudjema, W. Pieczynski, Unsupervised image segmentation using triplet Markov fields, Comput. Vision Image Understand. 99 (2005) 476–498.
[24] H. Derin, H. Elliott, Modeling and segmentation of noisy and textured images using Gibbs random fields, IEEE Trans. Pattern Anal. Mach. Intell. 9 (1) (1987) 39–55.
[25] G. Poggi, G. Scarpa, J.B. Zerubia, Supervised segmentation of remote sensing images based on a tree-structured MRF model, IEEE Trans. Geosci. Remote Sens. 43 (8) (2005) 1901–1911.
[26] R. Caves, S. Quegan, R. White, Quantitative comparison of the performance of SAR segmentation algorithms, IEEE Trans. Image Process. 7 (11) (1998) 1534–1546.