ISPRS Journal of Photogrammetry and Remote Sensing 66 (2011) 103–114
A fuzzy topology-based maximum likelihood classification
Kimfung Liu a, Wenzhong Shi a,b,∗, Hua Zhang a,c
a Advanced Research Centre for Spatial Information Technology, Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong
b Key Laboratory of Land Use, Ministry of Land and Resources, Beijing, China
c School of Environment Science and Spatial Informatics, China University of Mining & Technology, Xuzhou, China
Article info
Article history: Received 29 April 2009; received in revised form 21 September 2010; accepted 22 September 2010; available online 3 December 2010
Keywords: Fuzzy topology; Maximum likelihood classification (MLC); Thresholding; Remote sensing; Land cover mapping
Abstract
Classification is one of the most widely used remote sensing analysis techniques, and maximum likelihood classification (MLC) is a major tool for classifying the pixels of an image. Fuzzy topology generalizes ordinary topology by extending the set concept from the two values {0, 1} to the continuous interval [0, 1], and it is used to solve many GIS problems, such as spatial information management and analysis. A fuzzy topology can be induced by traditional thresholding and, as such, gives a decomposition of MLC classes. This paper presents a modified image classification method (FTMLC) in which the induced threshold fuzzy topology is integrated into the MLC method. Using the induced threshold fuzzy topology, each image class in spectral space is decomposed into three parts: an interior, a boundary and an exterior. The connection theory of the induced fuzzy topology then enables the boundary to be combined with the interior. As a result, fuzzy boundary pixels, which contain many misclassified and over-classified pixels, can be re-classified, which improves the classification accuracy. © 2010 Published by Elsevier B.V. on behalf of International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS).
1. Introduction
Classification is a widely used remote sensing (RS) analysis technique, and its accuracy often relies on a good understanding of the system of spatial classes. There are three main types of classification method: supervised, unsupervised and hybrid. Supervised classification requires "training pixels" to define each class (Bruzzone and Carlin, 2006; Tseng et al., 2008). Unsupervised classification does not require extraneous data, as classes are determined purely in accordance with the differences in spectral values (Liu et al., 2004). Hybrid classification combines the unsupervised and supervised methods (Tang et al., 2005). Various fuzzy set theories have been proposed for dealing with fuzzy classification problems in remote sensing and image processing as well as for land cover classification (Wu and Zheng, 1991; Andrew et al., 2001; Tang et al., 2005; Tseng et al., 2008), high-resolution image segmentation and urban area mapping (Bruzzone and Carlin, 2006; Gamba et al., 2007) and image classification and modulation classification (Wei and Mendel, 1999).
Mathematically, fuzzy topology is generalized from ordinary topology by introducing the concept of membership value in a fuzzy set, as introduced by Zadeh (1965). The theory of fuzzy topology (Zadeh, 1965; Chang, 1968; Wong, 1974; Wu and Zheng, 1991; Liu and Luo, 1997) has been developed on the basis of this fuzzy set concept. Consequently, fuzzy topology provides an elementary tool for the development of fuzzy classification.
Maximum likelihood classification (MLC) (Yang, 1993), a classification method based on multivariate normal distribution theory (Abkar, 1999), has found wide application in the remote sensing field. For example, the MLC of fused images and sub-pixel classification has been used to classify logged points and an unlogged forest in Indonesia (Santosh and Yousif, 2003). Mesev et al. (2001), using MLC, noted that if the pixels are adequately trained, the standard per-pixel classification achieves the same classification accuracy as textural and contextual methods and the use of fuzzy datasets. Abkar (1999) further tried to use an a priori probability distribution function to improve the accuracy of the MLC, and Yang (1993) extended the MLC to include the fuzzy sense.
∗ Corresponding author at: Advanced Research Centre for Spatial Information Technology, Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong. E-mail addresses: [email protected] (K. Liu), [email protected] (W. Shi), [email protected] (H. Zhang).
0924-2716/$ – see front matter © 2010 Published by Elsevier B.V. on behalf of International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). doi:10.1016/j.isprsjprs.2010.09.007
K. Liu et al. / ISPRS Journal of Photogrammetry and Remote Sensing 66 (2011) 103–114
Thresholding is important for image enhancement and is widely used; threshold determination is the first step in image processing for many applications. Chang et al. (2000) used wavelet thresholding for image denoising and compression. In addition, a number of valuable image enhancement methods have been proposed for the purpose of improving the accuracy of image classification (Otsu, 1979; Kittler and Illingworth, 1986; Belkasim et al., 2003; Liu et al., 2004; Shi et al., 2010). For example, Belkasim et al. (2003) presented a new technique for the automatic thresholding of images, based on maximizing the correlation between the phase of the gray-level image and the phase of the thresholded version of the image. Automatic thresholding is a challenging issue for classification. The aim of thresholding is to produce an optimum threshold value for an image, separating objects from their backgrounds so that object extraction and classification accuracy can be enhanced. Boulder mapping is an example of an object extraction application conducted using image processing technologies (Ng et al., 2003). Another example, given by Shi et al. (2010), is the use of threshold theory to create an induced fuzzy topological space in object extraction. Snyder et al. (1990) presented a technique to determine the value of a global optimal threshold by using a Gaussian sum to fit a multimodal histogram. For a multimodal histogram, a standard technique for finding the best threshold is to fit the histogram with the sum of n probability density functions (pdfs), that is, $d(z) = \sum_{i=1}^{n} P_i(z)\,p_i(z)$ (Chow and Kaneko, 1972; Liu et al., 2004), where $P_i$ is the ith a priori probability and $p_i$ is the pdf associated with the ith a priori probability.
As stated above, mathematically, fuzzy topology is generalized from ordinary topology and this generalization enables a clearer understanding of the fuzzy relations between spatial objects in GIS, and is applied to the numerical description of those fuzzy relations (Egenhofer and Franzosa, 1991; Cohn and Gotts, 1996; Winter, 2000; Tang and Kainz, 2001; Liu and Shi, 2006). It can be further applied for GIS spatial queries and topological consistency checking. Every interior or closure operator can essentially define a fuzzy topology (Liu and Luo, 1997) separately. Based on this understanding, a fuzzy topology can be defined by the interior and closure operators which are, in turn, defined by a suitable level cut. At present, to the writer’s knowledge, no method exists to implement the theoretical concept of fuzzy topology in classification. To do so, it is necessary to determine how a particular class of image can be treated in a fuzzy set in a fuzzy topological space, to enable the induced level cutting fuzzy topology to facilitate the decomposition of classes into the three parts: interior, boundary and exterior. This approach to image classification is reported and demonstrated in this paper. In keeping with the theory of fuzzy topology, the classification methods of classes are concentrated on the operation of interior and boundary and the exterior parts can be ignored. The objective of the study presented in this paper, and as indicated above, is to enhance the recognition of certainty in the classification of pixels possessing a degree of uncertainty. The new classification method is derived by integrating induced fuzzy topology and the MLC method. The focus of the study is the fuzzy topology space making the boundary and the interior the significant classification parts. The exterior is excluded. The paper is organized as follows. The MLC method is described and the concept of fuzzy topology-based classification is introduced in Section 2. 
An experimental study in which the developed method is applied to land cover classification is provided in Section 3. This is followed, finally, by an analysis and discussion of the newly proposed method.
2. The fuzzy topology-based maximum likelihood classification (FTMLC)
The MLC method is described in Section 2.1. The notion that an image class can be viewed as a fuzzy set in a fuzzy topological space is presented in Section 2.2. An inter-correlation coefficient threshold method is introduced in Section 2.3. This method creates a suitable fuzzy topological space, enabling the theory described in Section 2.2 to be applied, and allows the classes in the image to be well separated. To handle boundary pixels, a method based on connectivity properties and the principle of maximum likelihood is proposed and described in Section 2.4. The logic flow of the newly developed fuzzy classification method is presented in Section 2.5.
2.1. Maximum likelihood classification
Maximum likelihood classification (MLC) is a method for determining a known class distribution as the maximum for a given statistic (Scott and Symons, 1971). MLC is widely used in remote sensing: a pixel is classified into the class for which its likelihood is the maximum. Suppose there are m predefined classes; the class a posteriori probability is defined as
$$P(k \mid x) = \frac{P(k)\,P(x \mid k)}{\sum_{i=1}^{m} P(i)\,P(x \mid i)}, \qquad (1)$$
where $P(k)$ is the prior probability of class k and $P(x \mid k)$ is the conditional probability of observing x from class k (probability density function). In the case of normal distributions, the likelihood function, $P(x \mid k)$, can be expressed as
$$L_k(x) = \frac{1}{(2\pi)^{n/2}\,|\Sigma_k|^{1/2}} \exp\left[-\frac{1}{2}(x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k)\right], \qquad (2)$$
where $x = [x_1\ x_2\ \ldots\ x_n]^T$ is the vector of a pixel with n bands; $L_k(x)$ is the likelihood membership function of x belonging to class k; $\mu_k = [\mu_{k1}\ \mu_{k2}\ \ldots\ \mu_{kn}]^T$ is the mean of the kth class; and
$$\Sigma_k = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \sigma_{nn} \end{bmatrix}$$
is the variance-covariance matrix of class k.
In the real case, one pixel may relate to more than one class. By introducing the concept of fuzzy theory into the classification, each pixel may relate to several classes, where the percentage of each class within a pixel is represented by the corresponding membership value. That is, a pixel is considered as having different membership values, which depend on the area proportion of each class within the mixed pixel.
2.2. A class as a fuzzy set
Fuzzy topology is an extension of ordinary topology. According to Chang (1968), if X is a non-empty ordinary set and I is an I-lattice, then $\delta \subset I^X$ is known as an I-fuzzy topology on X, and $(I^X, \delta)$ is called an I-fuzzy topological space (I-fts), if δ satisfies the following conditions: (i) $0, 1 \in \delta$; (ii) if $A, B \in \delta$, then $A \wedge B \in \delta$; (iii) if $\{A_i : i \in J\} \subset \delta$, where J is an index set, then $\bigvee_{i \in J} A_i \in \delta$. For any fuzzy set A, the interior of A is defined as the join of all the open subsets contained in A, denoted by $A^{o}$. The closure of A is the meet of all the closed subsets containing A, denoted by $\bar{A}$. The boundary of A is defined as $\partial A = \bar{A} \wedge \overline{A^{c}}$.
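As an illustration of Section 2.1, Eqs. (1) and (2) can be sketched in a few lines of Python. This is a minimal sketch rather than the implementation used in the study: the function names `solve_and_det`, `gaussian_likelihood` and `posterior` are hypothetical, only the standard library is used to keep the example dependency-free, and each covariance matrix is assumed to be non-singular.

```python
import math

def solve_and_det(a, b):
    """Solve a*y = b by Gaussian elimination with partial pivoting.

    Returns (y, det(a)). Assumes `a` is square and non-singular.
    """
    n = len(b)
    m = [[float(v) for v in row] + [float(bi)] for row, bi in zip(a, b)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("covariance matrix is singular")
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * y[c] for c in range(r + 1, n))
        y[r] = (m[r][n] - tail) / m[r][r]
    return y, det

def gaussian_likelihood(x, mean, cov):
    """Eq. (2): multivariate normal likelihood L_k(x) of one pixel vector."""
    n = len(x)
    diff = [xi - mi for xi, mi in zip(x, mean)]
    y, det = solve_and_det(cov, diff)          # y = cov^{-1} (x - mean)
    maha = sum(d * yi for d, yi in zip(diff, y))
    norm = (2.0 * math.pi) ** (n / 2.0) * math.sqrt(det)
    return math.exp(-0.5 * maha) / norm

def posterior(x, priors, means, covs):
    """Eq. (1): a posteriori probability P(k | x) for every class k."""
    joint = [p * gaussian_likelihood(x, m, c)
             for p, m, c in zip(priors, means, covs)]
    total = sum(joint)
    return [j / total for j in joint]

# Toy two-band example with two classes (illustrative values only).
identity = [[1.0, 0.0], [0.0, 1.0]]
post = posterior([0.2, 0.1], [0.5, 0.5],
                 [[0.0, 0.0], [3.0, 3.0]], [identity, identity])
label = post.index(max(post))  # index of the most probable class
```

In practice a numerical library would normally replace the hand-written elimination; it is spelled out here only so that the sketch runs without extra packages.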
Fig. 1(a). The relationship between two classes.
Fig. 1(b). The interior of these two classes.
A fuzzy topological space structure, $\{(X_\alpha, \tau_\alpha, \tau^{1-\alpha}) : \alpha \in (0, 1)\}$, is induced by two threshold values, α and 1 − α (Liu and Shi, 2006). These two threshold values can also be thought of as the interior and closure operators, respectively. This fuzzy topological space has many interesting properties in the field of mapping. Several are listed below. Let $A \in I^X$ and $B \in I^Y$, and let $(I^X, \delta)$ and $(I^X, \mu)$ be the I-fts's induced by the interior operator and the closure operator. Let $f^{\rightarrow} : (I^X, \delta) \to (I^Y, \mu)$ and $f^{\leftarrow} : (I^X, \mu) \to (I^Y, \delta)$ be the fuzzy and fuzzy reverse mappings; then both open and closed fuzzy sets are preserved by the fuzzy and fuzzy reverse mappings. That is, (i) $f^{\rightarrow}(A_\alpha) = [f^{\rightarrow}(A)]_\alpha$, (ii) $f^{\rightarrow}(A^{1-\alpha}) = [f^{\rightarrow}(A)]^{1-\alpha}$, (iii) $f^{\leftarrow}(B_\alpha) = [f^{\leftarrow}(B)]_\alpha$, and (iv) $f^{\leftarrow}(B^{1-\alpha}) = [f^{\leftarrow}(B)]^{1-\alpha}$. For the proofs related to the above see Liu and Shi (2006).
The next step is to determine the relations between the classes, and between the classes and their background. If there are m predefined classes, $c_1, c_2, \ldots, c_m$, with likelihood functions $L_{c_1}, L_{c_2}, \ldots, L_{c_m}$, respectively, then for each class $c_i$, $L_{c_i}$ is the membership function of $c_i$. The interior membership function of $c_i$ is $(L_{c_i})_\alpha$ and the boundary membership function of $c_i$ is $\partial L_{c_i} = L_{c_i}^{1-\alpha} \wedge ((L_{c_i})_\alpha)^{c}$. This means that each class in the image can be regarded as a fuzzy set in a fuzzy topological space. If there are m predefined classes, each class then has its interior and its boundary. The relationship between classes may be disjoint, overlap, contain, touch and so on. An overlap relationship between a class A and a class B in the spectral space defined by Band 1 and Band 2 is shown in Figs. 1(a) and 1(b).
2.3. Determining the thresholding value
Techniques for determining the correlation between two or more distributions have been developed. An intercorrelation coefficient,
$$r_{xy} = \frac{\sum_k (x_k - \mu_x)^T (y_k - \mu_y)}{\sqrt{\sum_k |x_k - \mu_x|^2 \, \sum_k |y_k - \mu_y|^2}}, \qquad (3)$$
is an index that measures the magnitude and the direction of the relationship between two distributions. It is designed for the range [0, 1] and is designed to measure the intercorrelation between two or more distributions. Variables are highly correlated when the value of the intercorrelation coefficient between two variables is closer to unity, while the intercorrelation between the variables is low when the intercorrelation coefficient is closer to zero. Where the intercorrelation is positive, if one distribution increases, the other also increases; where it is negative, if one distribution increases, the other decreases.
Thresholding is a process that splits a class into regions (or clusters). With a threshold value, as stated above, a region can be split into three parts: an interior, a boundary and an exterior. As discussed in Section 2.2, each class, $c_i$, in the image can be regarded as a fuzzy set in a fuzzy topological space, denoted by $L_i$. Therefore, for each $\alpha \in [0, 1)$ and a class $c_i$, $(L_i)_\alpha$ is the interior of $L_i$. For two classes, $c_i$ and $c_j$, the inter-correlation coefficient relating to these two classes is defined by Eq. (3). The thresholding value of these two classes can then be defined as $\alpha = r_{ij}$ (or $\alpha = r_{ij}^{2}$ or $\alpha = \sqrt{r_{ij}}$). The inter-correlation thresholding coefficient for these two classes, within the spectral space of two bands, is illustrated in Fig. 2. Suppose there are m predefined classes, $c_1, c_2, \ldots, c_m$, and $L_1, \ldots, L_m$ are the corresponding likelihood membership functions; the threshold value for the overall predefined m classes is then defined as:
$$\alpha = \max_{i,j} r_{ij}. \qquad (4)$$
A fuzzy topological space is then obtained from the value $\alpha = \max_{i,j} r_{ij}$ (Liu and Shi, 2006) and each class is now split into three parts: an interior, a boundary and an exterior. The interior of class $c_i$ is denoted by $(L_{c_i})_\alpha$ and the boundary of $c_i$ is denoted by $\partial L_{c_i} = L_{c_i}^{1-\alpha} \wedge ((L_{c_i})_\alpha)^{c}$. The interior part of the class represents the precise part of this class; the boundary is the imprecise part of this class. This imprecise part (or fuzzy region) is discussed further in the next section.
Fig. 2(a). Intercorrelation coefficient thresholding.
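The threshold of Eqs. (3) and (4) can be sketched as follows. This is only an illustrative reading of the formulas: the names `intercorrelation` and `threshold` are hypothetical, and the sketch assumes that the two classes being compared supply equally many sample vectors that are paired by their index k, which the text does not state explicitly.

```python
import math
from itertools import combinations

def intercorrelation(xs, ys):
    """Eq. (3) for two equally long lists of sample vectors (paired by index)."""
    bands = range(len(xs[0]))
    mu_x = [sum(v[b] for v in xs) / len(xs) for b in bands]
    mu_y = [sum(v[b] for v in ys) / len(ys) for b in bands]
    num = sx = sy = 0.0
    for x, y in zip(xs, ys):
        dx = [xi - m for xi, m in zip(x, mu_x)]
        dy = [yi - m for yi, m in zip(y, mu_y)]
        num += sum(a * b for a, b in zip(dx, dy))   # (x_k - mu_x)^T (y_k - mu_y)
        sx += sum(a * a for a in dx)                # |x_k - mu_x|^2
        sy += sum(b * b for b in dy)                # |y_k - mu_y|^2
    return num / math.sqrt(sx * sy)

def threshold(class_samples):
    """Eq. (4): alpha is the largest r_ij over all pairs of classes."""
    return max(
        intercorrelation(class_samples[i], class_samples[j])
        for i, j in combinations(sorted(class_samples), 2)
    )

# Illustrative training samples for three classes in two bands.
samples = {
    "a": [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]],
    "b": [[2.0, 4.0], [4.0, 8.0], [6.0, 12.0]],
    "c": [[1.0, 0.0], [0.0, 0.0], [-1.0, 0.0]],
}
alpha = threshold(samples)
```

Because classes "a" and "b" in this toy data vary in exact proportion, their coefficient is 1 and dominates the maximum; real training samples would give a value strictly inside the unit interval.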
Fig. 2(b). The interior of the two classes ci and cj.
Fig. 2(c). The boundary of the two classes ci and cj.
Fig. 3. The logic flow of the classification method.
2.4. Determination of class boundary
Fuzzy topology can be used to describe and study the structure of a neighbourhood and the levelling of spaces. Thus, if the interior and boundary are assumed to be limited by a certain relationship, they can then be reconstructed by using this relationship. Connectivity is one such feature. The usual definition of connection for a fuzzy subset A in fuzzy topology is that A cannot be separated by two non-zero open or closed fuzzy sets, called open connected and closed connected, respectively. According to the characteristics of fuzzy topology, this kind of connection also contains two types of structure, neighbourhood and levelling (Martin, 1980; Liu and Luo, 1997; Luo, 1988). The connectivity of a spatial object depends on the neighbourhood structure of the object itself, rather than the levelling structure. Thus, the ordinary definition of a fuzzy topology connection is not appropriate for this application, and it is necessary to provide a new connection definition for spatial objects. The concept of connection for spatial objects is newly defined to indicate whether a spatial object is connected to another spatial object, in the sense of background space only. This definition is associated with the concept of neighbourhood, rather than the concept of levelling. Therefore, the concept of connection for spatial objects has to be defined on the basis of background topology only.
Definition 2.4.1 (Support of A). Support(A) or Supp(A) is equal to the set $\{x \in X : A(x) > 0\}$. The closure of Supp(A) in background topology is denoted by $\overline{\mathrm{Supp}(A)}$.
Definition 2.4.2 (Supported Connected Fuzzy Set). Let $(X, I^X, \delta)$ be an I-fts and $(X, \beta)$ be its background topology space, $A, B \in I^X$. A and B are called supported separated if $\overline{\mathrm{Supp}(A)} \cap \mathrm{Supp}(B) = \mathrm{Supp}(A) \cap \overline{\mathrm{Supp}(B)} = \emptyset$. A is called supported connected in $(X, I^X, \delta, \beta)$ if there do not exist supported separated $C, D \in I^X \setminus \{0\}$ such that $A = C \vee D$ and $\mathrm{Supp}(A) = \mathrm{Supp}(C) \cup \mathrm{Supp}(D)$.
Actually, the supported connectivity of a fuzzy topological space $(X, I^X, \delta, \beta)$ does not depend on the topological space $(X, I^X, \delta)$; this supported connectivity depends on its background topological space $(X, \beta)$ only (Liu and Shi, 2006). Thus, the concept of a supported connection makes the handling of the fuzzy topological space easier when modeling topological relationships. In addition, the connectivity can be used to support spatial analysis related to the class boundary. The concept of supported connection is used to classify the boundary pixels in the next step of object extraction.
2.5. The logic flow of the classification method
Connectivity of spatial objects depends on the neighbourhood structure of the objects themselves. For example, classes of vegetation and water are assumed to be a neighbourhood. Thus, the boundary and interior of a class should be connected, and the concept of connection is used to determine the class of a boundary. In other words, a boundary pixel of class A is classified as class A only if this boundary pixel is connected to the interior of class A. Otherwise, it is classified as a null class pixel. The following is a detailed description of the generation of the interior and the boundary. In addition, the boundary of each class is re-classified based on the concept of supported connection. Fig. 3 shows the logic flow of this classification method.
Step 1: Define the classes
The classes to be classified are first defined. For this experimental study, four classes, namely building, woodland, water and farmland, are defined. For each predefined class $c_i$, the mean $\mu_{c_i}$ was computed as well as the variance matrix $\sigma_{c_i}$ and the variance–covariance matrix $\Sigma_{c_i}$, where i = 1, 2, 3, ..., n.
Step 2: Compute the inter-correlation coefficients
Fig. 4. The 4- and 8-connections used in connectivity analysis.
The inter-correlation coefficients are calculated according to Eq. (3); the threshold value is then determined by $\alpha = \max_{x,y} r_{xy}$.
Step 3: Calculate the likelihood value for each pixel
For each pixel $x_o$, the likelihood value (or probability distribution function value at $x_o$) is calculated by using Eq. (2).
Step 4: Determine the interior of class for each pixel
For each pixel $x_o$, let $L_{k_o}(x_o) = \max_k L_k(x_o)$. If $L_{k_o}(x_o) > \alpha$, then pixel $x_o$ belongs to the interior of class $k_o$. If $L_{k_o}(x_o) \le \alpha$, then pixel $x_o$ belongs to the boundary of a certain class, and those boundary pixels have to undergo further treatment.
Step 5: Determine the class for the boundary pixel
For each pixel of the boundary, search its 8-connected pixels. If the greatest number of the connected pixels belong to the interior of a certain class, this boundary pixel is assigned to that class. Instead of the 8-connection, the 4-connection can also be applied here as an alternative. Repeat Step 5 until no more boundary pixels are to be processed.
Remark. In Step 5, the 4- or 8-connection is used for connectivity analysis in the object extraction. These two connections are illustrated in Fig. 4, which shows the concept of the boundary pixel classification process.
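Steps 4 and 5 can be sketched as follows. This is a minimal sketch under stated assumptions, not the code used for the experiments: the names `BOUNDARY`, `split_interior_boundary` and `reclassify_boundary` are hypothetical; pixels re-labelled in an earlier pass are assumed to count as members of their class in later passes; ties between classes are broken by whichever class is met first; and a boundary pixel that never acquires a labelled neighbour keeps the null label.

```python
from collections import Counter

BOUNDARY = -1  # hypothetical null label for pixels awaiting re-classification

NEIGHBOURS_4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
NEIGHBOURS_8 = NEIGHBOURS_4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

def split_interior_boundary(likelihoods, alpha):
    """Step 4: likelihoods[r][c] holds one L_k value per class for that pixel."""
    labels = []
    for row in likelihoods:
        out = []
        for values in row:
            k = max(range(len(values)), key=values.__getitem__)
            # Interior of class k if the largest likelihood exceeds alpha.
            out.append(k if values[k] > alpha else BOUNDARY)
        labels.append(out)
    return labels

def reclassify_boundary(labels, offsets=NEIGHBOURS_8):
    """Step 5: give each boundary pixel the majority class of its neighbours."""
    labels = [row[:] for row in labels]  # work on a copy
    rows, cols = len(labels), len(labels[0])
    changed = True
    while changed:
        updates = {}
        for r in range(rows):
            for c in range(cols):
                if labels[r][c] != BOUNDARY:
                    continue
                votes = Counter(
                    labels[r + dr][c + dc]
                    for dr, dc in offsets
                    if r + dr in range(rows) and c + dc in range(cols)
                    and labels[r + dr][c + dc] != BOUNDARY
                )
                if votes:
                    updates[(r, c)] = votes.most_common(1)[0][0]
        for (r, c), k in updates.items():
            labels[r][c] = k
        changed = bool(updates)  # stop once a pass assigns nothing new
    return labels

# Toy 3 x 3 image with two classes; the centre pixel is ambiguous.
lik = [
    [[0.9, 0.1], [0.9, 0.1], [0.1, 0.8]],
    [[0.9, 0.1], [0.4, 0.3], [0.1, 0.8]],
    [[0.9, 0.1], [0.9, 0.1], [0.1, 0.8]],
]
initial = split_interior_boundary(lik, alpha=0.5)
final = reclassify_boundary(initial)
```

Passing `NEIGHBOURS_4` as `offsets` switches the search to the 4-connection mentioned in Step 5.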
3. Experimental study
Two experimental studies have been conducted to test the performance of the fuzzy topology-based maximum likelihood classification (FTMLC) proposed in this paper. The two test areas are located in Xuzhou City, China. The classification scheme included four categories: building, woodland, water and farmland. Landsat TM satellite images acquired in 2000 for two different sites were used as data for the experimental study, except for TM6. In other words, Landsat images with seven spectral bands were taken, and bands 1, 2, 3, 4, 5 and 7 were used for image classification. Moreover, a ground survey was carried out for accuracy assessment and for the kappa statistics. Firstly, random points (test points) were picked from the classification map. A 1:2000 land use map, produced in around the year 2000, the same date as the TM images, was then used as the reference data to test the accuracy of those test points. Finally, with the aid of field surveying, the land use type of each of the test points was obtained. The medium-level resolution of the satellite image is one of the sources of the problem of mixed pixels, and the proposed FTMLC was applied to improve the accuracy of the classification. In these two studies, the pixels were first trained. That is, classifiers were trained based on selected sample pixels for each of the four classes. The membership function for each class was then generated. The classification was then made by assigning each pixel on the image to a specific class. The following test results were obtained from the experimental studies.
First experiment
The first test area is located in the north of Xuzhou City. Fig. 5(a) shows the Landsat satellite image of the experimental area, taken in 2000. The Landsat satellite image was analyzed in terms of the density distributions of each predefined class. A threshold value α, which "induced" a fuzzy topology for the segmentation, was computed. Hence, each class can be decomposed into two parts, an interior and a boundary. If the classes were connected, the connectivity theory introduced in Section 2.4 and also by Liu and Shi (2006) was applied in order to classify the boundary pixels. Figs. 5(b) and 5(c) respectively show the results of MLC and FTMLC. It is suggested that readers look at the classification result maps together with the corresponding statistical results, listed in Tables 1 and 2.
In the MLC, for each pixel x, the maximum likelihood classifier creates a membership function $L_k(x)$ (Eq. (2)). Based on the most possible principle, an unknown pixel is classified as class A if its class A likelihood value $L_A(x)$ is larger than its class B likelihood value $L_B(x)$. Fig. 5(b) shows the result of the application of the MLC technique.
In the FTMLC technique proposed in this study, for each pixel x, the maximum likelihood classifier creates a membership function. If the membership value is greater than α (the greatest inter-correlation coefficient), then pixel x is set to be in the interior of class $k_o$. Otherwise, pixel x belongs to the boundary of a certain class, and those boundary pixels have to undergo further processing. As each class is assumed to be connected, the connectivity theory introduced in Section 2.4 is applied. Furthermore, for each boundary pixel x, if a search of its 8-connected pixels reveals that the greatest number of connected pixels belong to the interior of a certain class, then that boundary pixel belongs to that class. This step is repeated until no more boundary pixels remain. Following the application of the FTMLC method, the final classification result is illustrated in Fig. 5(c).
Second experiment
The second test area for the experimental study is located in the south-west of Xuzhou City. Fig. 6(a) shows the Landsat satellite image of the experimental area, taken in 2000. Following the procedure adopted in the first experiment, Figs. 6(b) and 6(c) show the results of MLC and FTMLC respectively. Tables 3 and 4 are equivalent to Tables 1 and 2 of the first experiment.
4. Accuracy assessment by the users, producers and kappa statistics
The accuracy assessment was carried out for both the MLC and the FTMLC. The accuracy assessments were divided into the producer and user aspects. In addition, the kappa statistics and overall accuracy for classes were computed. Kappa statistics quantify the level of agreement between two maps when compared with the null hypothesis, which states that the maps do not differ by chance from a random map. In general, kappa statistics values of less than 0.4 reflect poorly performing models, 0.4–0.6 are fair and 0.6–0.8 are good. Kappa statistics values greater than 0.8 represent excellent agreement between the model and observed datasets. The following is the kappa statistics equation used:
$$K = \frac{P(A) - P(E)}{1 - P(E)} = \frac{\sum_{i=1}^{n} P_{ii} - \sum_{i=1}^{n}\sum_{j=1}^{n} P_{ij} \cdot P_{ji}}{1 - \sum_{i=1}^{n}\sum_{j=1}^{n} P_{ij} \cdot P_{ji}} \qquad (5)$$
where P(A) is the observed proportion in agreement and P(E) is the proportion of agreement expected by chance.
In the first experimental study, a total of 8953 accuracy assessment pixels were sampled and the MLC accuracy assessments are
Fig. 5(a). Landsat satellite image of the north of Xuzhou City taken in 2000.
Fig. 5(b). The result of MLC.
Fig. 5(c). The result of FTMLC.
presented in Table 1. The producer accuracies for building, woodland, water and farmland are 0.82, 0.54, 0.81 and 0.84 respectively. The user accuracies for building, woodland, water and farmland are 0.88, 0.63, 0.45 and 0.92 respectively. The overall accuracy is 0.80. The kappa statistics values for building, woodland, water and farmland are 0.81, 0.59, 0.39 and 0.87 respectively. The overall kappa statistics value is 0.70. A summary is shown in Table 1. As for the FTMLC, a total of 8953 accuracy assessment pixels were sampled and the accuracy assessments are presented in
Table 2. The producer accuracies for building, woodland, water and farmland are 0.90, 0.52, 0.80 and 0.82 respectively. The user accuracies for building, woodland, water and farmland are 0.82, 0.70, 0.58 and 0.94 respectively. The overall accuracy is 0.82. The kappa statistics values for building, woodland, water and farmland are 0.70, 0.66, 0.53 and 0.90 respectively. The overall kappa statistics value is 0.73. A summary is shown in Table 2. In the second experimental study, a total of 8721 accuracy assessment pixels were sampled and the MLC accuracy assessments
Fig. 6(a). Landsat satellite image of the south-west of Xuzhou City taken in 2000.
Fig. 6(b). The result of MLC.
Fig. 6(c). The result of FTMLC.
Table 1
The confusion matrix of using MLC. First experiment's accuracy assessment by the users, producers and kappa statistics.
| | Building | Woodland | Water | Farmland | Total | Producer's accuracy | Kappa statistics |
|---|---|---|---|---|---|---|---|
| Building | 2967 | 170 | 381 | 88 | 3606 | 0.82 | 0.81 |
| Woodland | 164 | 526 | 174 | 116 | 980 | 0.54 | 0.59 |
| Water | 92 | 25 | 698 | 52 | 867 | 0.81 | 0.39 |
| Farmland | 133 | 109 | 313 | 2945 | 3500 | 0.84 | 0.87 |
| Total | 3356 | 830 | 1566 | 3201 | 8953 | | |
| User's accuracy | 0.88 | 0.63 | 0.45 | 0.92 | | | |
Overall accuracy = 0.80; overall kappa = 0.70
Table 2
The confusion matrix of using FTMLC. First experiment's accuracy assessment by the users, producers and kappa statistics.
| | Building | Woodland | Water | Farmland | Total | Producer's accuracy | Kappa statistics |
|---|---|---|---|---|---|---|---|
| Building | 3244 | 101 | 211 | 50 | 3606 | 0.90 | 0.70 |
| Woodland | 307 | 511 | 75 | 87 | 980 | 0.52 | 0.66 |
| Water | 108 | 25 | 692 | 42 | 867 | 0.80 | 0.53 |
| Farmland | 301 | 96 | 220 | 2883 | 3500 | 0.82 | 0.90 |
| Total | 3960 | 733 | 1198 | 3062 | 8953 | | |
| User's accuracy | 0.82 | 0.70 | 0.58 | 0.94 | | | |
Overall accuracy = 0.82; overall kappa = 0.73
Table 3
The confusion matrix of using MLC. Second experiment's accuracy assessment by the users, producers and kappa statistics.
| | Building | Woodland | Water | Farmland | Total | Producer's accuracy | Kappa statistics |
|---|---|---|---|---|---|---|---|
| Building | 4392 | 333 | 34 | 866 | 5625 | 0.78 | 0.95 |
| Woodland | 20 | 1038 | 3 | 17 | 1078 | 0.96 | 0.66 |
| Water | 10 | 26 | 694 | 12 | 742 | 0.94 | 0.94 |
| Farmland | 42 | 79 | 2 | 1153 | 1276 | 0.90 | 0.49 |
| Total | 4464 | 1476 | 733 | 2048 | 8721 | | |
| User's accuracy | 0.98 | 0.70 | 0.95 | 0.56 | | | |
Overall accuracy = 0.83; overall kappa = 0.73
Table 4
The confusion matrix of using FTMLC. Second experiment's accuracy assessment by the users, producers and kappa statistics.
| | Building | Woodland | Water | Farmland | Total | Producer's accuracy | Kappa statistics |
|---|---|---|---|---|---|---|---|
| Building | 4919 | 147 | 10 | 549 | 5625 | 0.87 | 0.78 |
| Woodland | 59 | 1017 | 0 | 2 | 1078 | 0.94 | 0.80 |
| Water | 27 | 21 | 691 | 3 | 742 | 0.93 | 0.98 |
| Farmland | 322 | 45 | 0 | 909 | 1276 | 0.71 | 0.56 |
| Total | 5327 | 1230 | 701 | 1463 | 8721 | | |
| User's accuracy | 0.92 | 0.83 | 0.99 | 0.62 | | | |
Overall accuracy = 0.86; overall kappa = 0.76
are presented in Table 3. The producer accuracies for building, woodland, water and farmland are 0.78, 0.96, 0.94 and 0.90 respectively. The user accuracies for building, woodland, water and farmland are 0.98, 0.70, 0.95 and 0.56 respectively. The overall accuracy is 0.83. The kappa statistics values for building, woodland, water and farmland are 0.95, 0.66, 0.94 and 0.49 respectively. The overall kappa statistics value is 0.73. A summary is shown in Table 3. The second experimental study was also tested with FTMLC. A total of 8721 accuracy assessment pixels were sampled and the FTMLC accuracy assessments are presented in Table 4. The producer accuracies for building, woodland, water and farmland are 0.87, 0.94, 0.93 and 0.71 respectively and the user accuracies are 0.92, 0.83, 0.99 and 0.62 respectively. The overall accuracy is 0.86 and the overall kappa statistics value is 0.76. A summary is given in Table 4.
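The summary figures reported for Table 1 can be checked with a short sketch. Two assumptions are made here and should be read as such: the matrix orientation (the four class rows sum to the values under Total and yield the producer accuracies, while the column sums yield the user accuracies) is inferred from the totals printed in Table 1, and the chance agreement P(E) is computed in the usual Cohen's kappa form, as the sum of products of matching row and column proportions, which is one reading of Eq. (5). The name `accuracy_report` is hypothetical.

```python
def accuracy_report(matrix):
    """Overall, producer and user accuracy plus kappa for a confusion matrix."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_tot = [sum(row) for row in matrix]
    col_tot = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    diag = [matrix[i][i] for i in range(n)]
    observed = sum(diag) / total                                   # P(A)
    chance = sum(r * c for r, c in zip(row_tot, col_tot)) / total ** 2  # P(E)
    return {
        "overall": observed,
        "kappa": (observed - chance) / (1.0 - chance),
        "producer": [d / r for d, r in zip(diag, row_tot)],
        "user": [d / c for d, c in zip(diag, col_tot)],
    }

# Class order: building, woodland, water, farmland (counts from Table 1).
TABLE_1 = [
    [2967, 170, 381, 88],
    [164, 526, 174, 116],
    [92, 25, 698, 52],
    [133, 109, 313, 2945],
]
report = accuracy_report(TABLE_1)
```

Rounded to two decimals, `report` gives an overall accuracy of 0.80 and a kappa of 0.70, in agreement with the values stated for the MLC result of the first experiment.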
5. Discussion and analysis of experimental results
In this section, the experimental results are discussed and the improvements in accuracy and kappa values are noted. The methods compared are MLC and FTMLC. In addition, the visual improvement of FTMLC over MLC is examined: several small parts of the classified images have been extracted to compare the visual quality of the MLC and FTMLC results.
Accuracy and kappa improvement
The distributions in Fig. 7 show the overall accuracies and the kappa values of the MLC and FTMLC methods for both locations. For the first map, the overall accuracy increases from 0.80 to 0.82 and the kappa value increases from 0.70 to 0.73. For the second map, the overall accuracy increases from 0.83 to 0.86 and the kappa value increases from 0.73 to 0.76. Thus the performance of FTMLC is superior to that of MLC on both maps.
This result arises because the fuzzy boundary pixels, which contain many misclassified and over-classified pixels, are re-classified using the concept of connection in fuzzy topological space (Liu and Shi, 2006). FTMLC attains the higher overall classification accuracy and the higher overall kappa statistics, although its accuracy for individual classes is not always the highest. In fact, the smaller the probability a classification is based on, the vaguer the classification result for the pixel becomes. The re-classification carried out for the pixels at the boundary of the objects therefore leads to an improvement in the classification results, especially for those pixels located at object boundaries. Figs. 8(a) and 8(b) show the boundary pixels of both maps studied in this paper. In the first experimental map (see Fig. 8(a)), a total of 7304 boundary pixels needed to be re-classified, that is about 16.3% of the total number of pixels. About 20% of those pixels are re-classified correctly (see Table 5). In the second experimental map (see Fig. 8(b)), a total of 3334 boundary pixels or 7.55% of all pixels needed re-classifying. About 45% of those pixels are re-classified correctly (see Table 6).
Visual analysis
In order to analyze the improvement in classification quality obtained by applying FTMLC rather than MLC, two images are classified by using FTMLC and MLC respectively. The classification results of the first image are shown in Fig. 9. The upper image (Fig. 9(a)) is the result classified by using the MLC method and the lower image (Fig. 9(b)) is the result classified by the FTMLC
Fig. 7. Statistical comparison of the (a) accuracies, and (b) kappas.
Fig. 8(a). Boundary pixel distribution of the first map.
Fig. 8(b). Boundary pixel distribution of the second map.
method. Many misclassified pixels appear in the image classified by MLC, for example in the areas marked by the ellipses in Fig. 9(a)(I)–(IV). In the image classified by FTMLC, these areas are significantly improved (see the areas marked by the ellipses in Fig. 9(b)(I)–(IV)). Taking the water body as an example, the area in Fig. 9(a)(II) has many misclassified pixels, whereas the corresponding area classified by FTMLC (Fig. 9(b)(II)) gives a much better classification result, in which the shape of the water body is clearly visible.
The classification results of the second image are shown in Fig. 10. They demonstrate the same finding graphically: there are many misclassified pixels in the image classified by MLC (see the areas marked in Fig. 10(a)(I)–(IV)), but fewer misclassified pixels in the image classified by FTMLC (see the corresponding areas marked in Fig. 10(b)(I)–(IV)).
Fig. 9. Visual comparison of (a) MLC and (b) FTMLC of the first map.
Table 5 The boundary pixels of the first map.
Class no.  Class                   FTMLC
1          Building                13 990
2          Woodland                 5 775
3          Water                    4 218
4          Farmland                13 593
0          Boundary class pixels    7 304
Table 6 The boundary pixels of the second map.
Class no.  Class                   FTMLC
1          Building                18 094
2          Woodland                 7 200
3          Water                    3 915
4          Farmland                12 337
0          Boundary class pixels    3 334
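The share of boundary pixels in each map can be checked from the counts in Tables 5 and 6. The sketch below assumes that the five rows of each table partition the whole image, so that their sum is the total number of pixels; this is an assumption, although it is supported by the fact that both tables sum to the same total. The dictionary keys and the function name are illustrative.

```python
# Pixel counts copied from Tables 5 and 6 (class 0 is the boundary class).
TABLE_5_FIRST_MAP = {
    "Building": 13990,
    "Woodland": 5775,
    "Water": 4218,
    "Farmland": 13593,
    "Boundary": 7304,
}
TABLE_6_SECOND_MAP = {
    "Building": 18094,
    "Woodland": 7200,
    "Water": 3915,
    "Farmland": 12337,
    "Boundary": 3334,
}


def boundary_share(counts, boundary_key="Boundary"):
    """Fraction of all pixels that fall in the boundary class.

    Assumes the counts cover every pixel of the image exactly once.
    """
    total = sum(counts.values())
    return counts[boundary_key] / total


for label, counts in (("first map", TABLE_5_FIRST_MAP),
                      ("second map", TABLE_6_SECOND_MAP)):
    print(label,
          "total pixels =", sum(counts.values()),
          "boundary share = %.1f%%" % (100 * boundary_share(counts)))
```

Both tables sum to 44 880 pixels. On that basis the boundary class covers about 16.3% of the first map, which agrees with the percentage quoted in the discussion, and about 7.4% of the second map, which is close to but not identical with the quoted 7.55%. The small difference may come from a different pixel total being used in the text.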
6. Conclusions
This paper presents an improved MLC obtained by introducing fuzzy topology theory into the design of the classification method. Firstly, each class in the image is treated as a fuzzy set in a fuzzy space, giving a natural representation of spatial objects. The fuzzy set (class) is then decomposed into two parts, an interior and a boundary, based on the idea of the maximum intercorrelation coefficient thresholding of all classes. The interior represents the core of a class and the boundary represents the overlapping area between classes. Finally, the two parts, the boundary and the interior of the object, are combined by using the property of spatial connectivity in fuzzy topology. The latter is the major contribution of the work presented in this paper.
In the experimental study, integrating the MLC method with the fuzzy topological space (FTMLC) improved upon the original MLC method by providing better classification accuracy. The improved accuracy is due to the introduction of fuzzy topology and specifically the connection between classes, based on that topological approach. The improvement in the classification results relates especially to the re-classification of pixels at the object boundaries.
The contributions of the newly proposed method described in this paper include: (a) the treatment of classes in the image as fuzzy sets in a fuzzy topological space; (b) the development of a method for determining a suitable threshold value so that classes can be viewed as fuzzy sets in a fuzzy topological space; (c) the introduction of fuzzy topology into the MLC method, leading to a number of effective properties of the FTMLC method.
Further work is needed in the area of pixel certainty because in this study the ‘‘uncertain’’ pixels are assumed to be the source of error. However, there are many other sources of error, such as fitting errors and sampling errors. When the distribution of a class does not fit the normal distribution, the maximum likelihood method may not be good enough or may not even be applicable. The inverse of the variance–covariance matrix becomes unstable in cases where a very high correlation exists between two bands or the ground truth data are homogeneous. Hence an interesting possibility for further research lies in dealing with class distributions that are non-normal and those with highly correlated bands.
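The remark about the unstable inverse of the variance–covariance matrix can be illustrated with a small numerical sketch. It is not taken from the paper: it assumes only two bands with unit variances and a correlation coefficient rho, for which the covariance matrix [[1, rho], [rho, 1]] has eigenvalues 1 + rho and 1 - rho. All function names are illustrative.

```python
def covariance_2x2(sd1, sd2, rho):
    """Covariance matrix of two bands with standard deviations sd1, sd2
    and correlation coefficient rho."""
    cov = rho * sd1 * sd2
    return [[sd1 * sd1, cov], [cov, sd2 * sd2]]


def inverse_2x2(m):
    """Closed-form inverse of a 2x2 matrix; fails if the matrix is singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0.0:
        raise ZeroDivisionError("covariance matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def condition_number_unit_variance(rho):
    """Ratio of largest to smallest eigenvalue of [[1, rho], [rho, 1]]."""
    return (1 + abs(rho)) / (1 - abs(rho))


# As rho approaches 1 the determinant 1 - rho**2 approaches zero, so the
# entries of the inverse and the condition number both grow without bound.
for rho in (0.0, 0.5, 0.9, 0.99, 0.999):
    inv = inverse_2x2(covariance_2x2(1.0, 1.0, rho))
    print("rho = %.3f  inverse[0][0] = %.2f  condition number = %.1f"
          % (rho, inv[0][0], condition_number_unit_variance(rho)))
```

In this simplified setting a correlation of 0.999 already inflates the diagonal of the inverse to roughly 500 times its uncorrelated value, so small errors in the estimated covariance are strongly amplified in the likelihood computed from it.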
Fig. 10. Visual comparison of (a) MLC and (b) FTMLC of the second map.
Acknowledgements
The work presented in this paper was supported by grants from the National Key Basic Research and Development Program (973 Program, No. 2006CB701305), The Hong Kong Polytechnic University (G-YX0P, G-YF24, G-YG66, 1-ZV4F), the Hong Kong RGC General Research Fund (Project No. 5276/08) and the National Natural Science Foundation (40629001).
References
Abkar, A., 1999. Likelihood-based segmentation and classification of remotely sensed images. Ph.D. Dissertation, University of Twente, ITC, Enschede, The Netherlands.
Andrew, J.T., Hugh, G.L., Peter, M.A., Mark, S.N., 2001. Super-resolution target identification from remotely sensed images using a Hopfield neural network. IEEE Transactions on Geoscience and Remote Sensing 39 (4), 781–796.
Belkasim, S., Ghazal, A., Basir, O.A., 2003. Phase-based optimal image thresholding. Digital Signal Processing 13 (4), 636–655.
Bruzzone, L., Carlin, L., 2006. A multilevel context-based system for classification of very high spatial resolution images. IEEE Transactions on Geoscience and Remote Sensing 44 (5), 2587–2600.
Chang, C.L., 1968. Fuzzy topological spaces. Journal of Mathematical Analysis and Applications 24 (1), 182–190.
Chang, S.G., Yu, B., Vetterli, M., 2000. Adaptive wavelet thresholding for image denoising and compression. IEEE Transactions on Image Processing 9 (9), 1532–1546.
Chow, C.K., Kaneko, T., 1972. Automatic boundary detection of the left ventricle from cineangiograms. Computers and Biomedical Research 5 (4), 388–409.
Cohn, A.G., Gotts, N.M., 1996. The ‘egg-yolk’ representation of regions with indeterminate boundaries. In: Burrough, P., Frank, A. (Eds.), Geographic Objects with Indeterminate Boundaries. Taylor & Francis, London, pp. 171–187.
Egenhofer, M., Franzosa, R., 1991. Point-set topological spatial relations. International Journal of Geographical Information Systems 5 (2), 161–174.
Gamba, P., Dell’Acqua, F., Lisini, G., Trianni, G., 2007. Improved VHR urban area mapping exploiting object boundaries. IEEE Transactions on Geoscience and Remote Sensing 45 (6), 1513–1514.
Kittler, J., Illingworth, J., 1986. Minimum error thresholding. Pattern Recognition 19 (1), 41–47.
Liu, Y.M., Luo, M.K., 1997. Fuzzy Topology. World Scientific, Singapore.
Liu, K.F., Shi, W.Z., 2006. Computation of fuzzy topological relations of spatial objects based on induced fuzzy topology. International Journal of Geographical Information Science 20 (8), 857–883.
Liu, K.F., Shi, W.Z., Huang, C.Q., 2004. Fuzzy image processing method in GIS. In: Proceedings of the Greater China GIS Conference 2004, 9–11 December, Hong Kong, China.
Luo, M.K., 1988. Paracompactness in fuzzy topological spaces. Journal of Mathematical Analysis and Applications 130 (1), 55–77.
Martin, H.W., 1980. Weakly induced fuzzy topological spaces. Journal of Mathematical Analysis and Applications 78 (2), 634–639.
Mesev, V., Gorte, B., Longley, P.A., 2001. Modified maximum-likelihood classifications algorithms and their application to urban remote sensing. In: Donnay, J.P., Barnsley, M., Longley, P.A. (Eds.), Remote Sensing and Urban Analysis. Taylor and Francis, London, pp. 71–94.
Ng, K.C., Li, X.C., Shi, W.Z., Zhu, C.Q., Shum, W.L., 2003. Detection of boulders on natural terrain using image processing techniques. In: Proceedings of Intelligent Engineering Application of Digital Remote Sensing Technology, Hong Kong Institution of Engineers, 11 April, Hong Kong, pp. 55–63.
Otsu, N., 1979. A threshold selection method from grey-level histograms. IEEE Transactions on Systems, Man and Cybernetics 9 (1), 62–66.
Santosh, P.B., Yousif, A.H., 2003. A comparison of sub-pixel and maximum likelihood classification of Landsat ETM+ images to detect illegal logging in the tropical rain forest of Berau, east Kalimantan, Indonesia. In: Map Asia Conference, 13–15 October, Malaysia.
Scott, A.J., Symons, M.J., 1971. Clustering methods based on likelihood ratio criteria. Biometrics 27 (2), 387–397.
Shi, W.Z., Liu, K.F., Huang, C.Q., 2010. Fuzzy image processing method in GIS. IEEE Transactions on Geoscience and Remote Sensing 48 (1), 147–154.
Snyder, W., Bilbro, G., Logenthiran, A., Rajala, S., 1990. Optimal thresholding: a new approach. Pattern Recognition Letters 11 (12), 803–810.
Tang, X.M., Kainz, W., 2001. Analysis of topological relations between fuzzy regions in general fuzzy topological space. In: The SDH Conference 02’, 9–12 July, Ottawa Congress Centers, Ottawa, Canada, pp. 114–123.
Tang, X.M., Kainz, W., Fang, Y., 2005. Reasoning about changes of land covers with fuzzy settings. International Journal of Remote Sensing 26 (14), 3025–3046.
Tseng, M.H., Chen, S.J., Hwang, G.H., Shen, M.Y., 2008. A genetic algorithm rule-based approach for land-cover classification. ISPRS Journal of Photogrammetry and Remote Sensing 63 (2), 202–212.
Wei, W., Mendel, J.M., 1999. A fuzzy logic method for modulation classification in nonideal environments. IEEE Transactions on Fuzzy Systems 7 (3), 333–344.
Winter, S., 2000. Uncertain topological relations between imprecise regions. International Journal of Geographical Information Science 14 (5), 411–430.
Wong, C.K., 1974. Fuzzy points and local properties of fuzzy topology. Journal of Mathematical Analysis and Applications 46 (2), 316–328.
Wu, G., Zheng, C., 1991. Fuzzy boundary and characteristic properties of order-homomorphisms. Fuzzy Sets and Systems 39 (3), 329–337.
Yang, M.S., 1993. On a class of fuzzy classification maximum likelihood procedures. Fuzzy Sets and Systems 57 (3), 365–375.
Zadeh, L.A., 1965. Fuzzy sets. Information and Control 8 (3), 338–353.