Infrared Physics & Technology 81 (2017) 79–88
Classification of visible and infrared hyperspectral images based on image segmentation and edge-preserving filtering

Binge Cui a,*, Xiudan Ma a, Xiaoyun Xie a, Guangbo Ren b, Yi Ma b
a College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
b The First Institute of Oceanography (FIO), State Oceanic Administration (SOA), Qingdao 266061, China
Highlights
- Edge-preserving filtering is applied to remove noise while preserving boundaries.
- Three kinds of spatial information are combined to improve the classification result.
- Visible and infrared bands are jointly used to improve the classification accuracies.
- The proposed method achieves high classification accuracy with only a few labeled samples.
Article info
Article history: Received 18 October 2016; Revised 7 December 2016; Accepted 16 December 2016; Available online 27 December 2016.
Keywords: Hyperspectral image classification; Image segmentation; Edge-preserving filtering; Feature extraction
Abstract
The classification of hyperspectral images with only a few labeled samples is a major challenge that is difficult to meet unless spatial characteristics can be exploited. In this study, we propose a novel spectral-spatial hyperspectral image classification method that exploits the spatial autocorrelation of hyperspectral images. First, image segmentation is performed on the hyperspectral image to assign each pixel to a homogeneous region. Second, the visible and infrared bands of the hyperspectral image are partitioned into multiple subsets of adjacent bands, and each subset is merged into one band. Recursive edge-preserving filtering, which utilizes the spectral information of neighboring pixels, is then performed on each merged band. Third, the resulting spectral and spatial feature band set is classified using the SVM classifier. Finally, bilateral filtering is performed to remove "salt-and-pepper" noise from the classification result. To preserve the spatial structure of the hyperspectral image, edge-preserving filtering is applied independently before and after the classification process. Experimental results on different hyperspectral images show that the proposed spectral-spatial classification approach is robust and achieves higher classification accuracy than state-of-the-art methods when the number of labeled samples is small.
© 2016 Elsevier B.V. All rights reserved.
1. Introduction

Hyperspectral remote sensing is a major breakthrough in remote sensing technology: it can simultaneously acquire hundreds of spectral bands from the visible through the infrared, each of which spans approximately 10 nm. Hyperspectral images contain a wealth of spectral information that makes it possible to finely classify ground objects. However, as described by the Hughes phenomenon [1], when the number of samples is limited, the classification accuracy decreases as the number of bands increases. Thus, dimensionality reduction is critical to improving the classification accuracy of hyperspectral images. To achieve this,
feature selection and feature extraction have been widely studied in recent years [2–4]. The goal of feature selection is to find the subsets of the spectral bands that provide the highest class separability [2]. Depending on whether labeled samples are required or not, feature selection algorithms can be broadly classified into two categories: supervised methods and unsupervised methods [5]. Unsupervised methods such as simplex-based feature selection methods [6] and cluster-based feature selection methods [7] select the most informative and distinctive features, while supervised methods typically require a search strategy [8–10] and a criterion function [11–14]. The feature subset with the best criterion function value determines the output of the feature search method. Unlike feature selection, feature extraction can effectively remove noise in the extracted features using certain types of linear transformations. Feature-extraction algorithms can also be
classified as supervised or unsupervised. Principal component analysis (PCA) [15], maximum noise fraction (MNF) [16] and independent component analysis (ICA) [17] are unsupervised methods, while linear discriminant analysis (LDA) [18] is a supervised method. Both supervised and unsupervised methods can transform a hyperspectral image from a high-dimensional space to a low-dimensional space while preserving most of the desired information in a few principal components.

In remote sensing images, adjacent pixels are likely to belong to the same category. Inspired by this, many researchers have worked on spectral-spatial classification methods that incorporate spatial information into the classification process [19,20]. Spectral-spatial classification methods fall into several categories, including fixed-window based methods (e.g., morphological filtering [21], Markov random fields (MRFs) [22] and texture measures [23]); spectral-spatial kernel based methods (e.g., the morphological kernel [24], the composite kernel [25] and the graph kernel [26]); and image segmentation based methods (e.g., partitional clustering [27], watershed transformation [28] and hierarchical segmentation [20]). These methods can be combined to further improve the classification accuracy of hyperspectral images [29]. Spectral-spatial classification methods can eliminate "salt-and-pepper" noise on classification maps and can markedly improve image classification accuracy.

Recently, edge-preserving filtering has been applied successfully in many applications, such as image fusion [30], image denoising [31] and image classification [32,33]. In principle, edge-preserving filtering is a type of low-pass filtering that smooths out small changes while making the edges of objects clearer. Unlike traditional low-pass filters, edge-preserving filters jointly use spatial and spectral distances to define the weights of each pixel's neighborhood pixels. Thus, neighborhood pixels that lie on the same side of a strong edge have larger weights, while those that lie on the opposite side of a strong edge have negligible weights. In hyperspectral image classification, edge-preserving filtering can be applied to improve the image quality or to improve the classification map.

In this study, image segmentation and edge-preserving filtering technologies are jointly utilized to improve the classification accuracy of hyperspectral images. The proposed method is based on two facts: (1) the pixels belonging to a homogeneous region should be classified as one class; and (2) neighboring pixels usually have strong correlations with each other. Based on the first fact, object-oriented image segmentation is performed to produce many homogeneous regions, and the pixels within one region are given the same region number. Based on the second fact, edge-preserving filtering is utilized to ensure that neighboring pixels on the same side of an edge have similar features and belong to the same class. To achieve this, different types of edge-preserving filtering are applied independently before and after the classification process. Experimental results show that the proposed method improves the classification accuracy of the SVM significantly.

The remainder of this paper is organized as follows. Section 2 introduces an object-oriented image segmentation algorithm and two widely used edge-preserving filters (EPFs). Section 3 describes the proposed spectral-spatial classification approach.
The experimental results are presented in Section 4, and conclusions are given in Section 5.
2. Image segmentation and edge-preserving filtering

Over the past ten years, image segmentation and edge-preserving filtering techniques have made marked progress. In this section, some typical image segmentation and edge-preserving filtering techniques are introduced.

2.1. Object-oriented image segmentation

Image segmentation attempts to divide an image into spatially continuous and homogeneous regions. Each pixel belongs to a region, and all the pixels belonging to one region are spectrally similar and spatially adjacent. Among the various segmentation algorithms, object-oriented image segmentation has been widely used to analyze remote sensing images; it produces image objects known as "patches" in landscape ecology. One of the most noteworthy object-oriented segmentation methods, multi-resolution segmentation, was proposed and popularized by the well-known commercial eCognition software [34,35]. A color criterion and a shape criterion are jointly used to create homogeneous image objects. The segmentation function is constructed as follows:

$$S_f = w_{color} \cdot h_{color} + (1 - w_{color}) \cdot h_{shape} \qquad (1)$$

where $h_{color}$ is the color criterion that measures changes in spectral heterogeneity, $h_{shape}$ is the shape criterion that measures changes in shape heterogeneity, $w_{color}$ is the weight of $h_{color}$, and $1 - w_{color}$ is the weight of $h_{shape}$. Spectral heterogeneity is calculated as the standard deviation of the spectral values multiplied by the corresponding weights, while shape heterogeneity is calculated using two landscape-ecology measures: compactness and smoothness. For a complete description of multi-resolution image segmentation, please refer to the original study in Ref. [35].
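To make Eq. (1) concrete, the following minimal Python sketch evaluates the merge criterion for a pair of single-band regions. The function name, the size-weighted standard-deviation form of the color criterion, and the omitted shape term are illustrative assumptions; eCognition's actual implementation also incorporates region geometry (compactness and smoothness).

```python
import numpy as np

def merge_criterion(vals_a, vals_b, w_color=0.9):
    """Evaluate the merge cost S_f of Eq. (1) for two single-band regions.

    vals_a, vals_b: 1-D arrays holding the pixel values of each region.
    """
    merged = np.concatenate([vals_a, vals_b])
    n_a, n_b, n_m = len(vals_a), len(vals_b), len(merged)
    # Color criterion: increase in the size-weighted standard deviation
    # caused by merging the two regions.
    h_color = n_m * merged.std() - (n_a * vals_a.std() + n_b * vals_b.std())
    # Shape criterion: would combine compactness and smoothness of the
    # region borders; it needs region geometry, so it is a placeholder here.
    h_shape = 0.0
    return w_color * h_color + (1.0 - w_color) * h_shape
```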
2.2. Edge-preserving filtering

During the last decade, many different EPFs (e.g., the joint bilateral filter [36], the weighted least-squares (WLS) filter [37], the guided filter [38] and the domain transform filter [39]) have been proposed. The primary advantage of edge-preserving filtering is that edges become clearer after filtering rather than blurred. Most of these EPFs provide a weight calculation method under which neighborhood pixels at about the same distance from the current pixel can receive different weights: pixels on the same side of an edge as the current pixel receive larger weights, while pixels on the other side receive smaller weights. In the following sections, two widely used edge-preserving filtering methods, joint bilateral filtering and domain transform recursive filtering, are described in detail.

2.3. Joint bilateral filtering

Compared to other low-pass filters, bilateral filtering can maintain the edge information of images while smoothing. In addition to the geometric distance, the color distance between pixels is used to calculate the neighboring pixel weights; thus, the bilateral filter has two weights, one based on the geometric distance and the other on the color distance. Joint bilateral filtering (JBF) is an improved bilateral filtering method in which the color distance is calculated from the color differences between pixels in a guidance image that has lower noise and sharper edges. Specifically, a joint bilateral filter applied to an input image f(x) and its corresponding guidance image g(x) produces an output image h(x) defined as follows:

$$h(x) = \frac{1}{k(x)} \sum_{n \in \Omega} f(n)\, c(n, x)\, s(g(n), g(x)) \qquad (2)$$

with the normalization

$$k(x) = \sum_{n \in \Omega} c(n, x)\, s(g(n), g(x)) \qquad (3)$$
where x is the neighborhood center; n is a nearby point; $\Omega$ is a local neighboring window; $c(n, x) = e^{-\frac{1}{2}\left(\frac{\|n - x\|}{\sigma_d}\right)^2}$ is the closeness, a Gaussian function of the Euclidean distance between n and x; $s(g(n), g(x)) = e^{-\frac{1}{2}\left(\frac{\|g(n) - g(x)\|}{\sigma_r}\right)^2}$ is the color similarity, a Gaussian function of the color difference between n and x in the guidance image g(x); $\|\cdot\|$ denotes the vector 2-norm; and $\sigma_d$ and $\sigma_r$ control the width of the Gaussian filter in the spatial and spectral domains, respectively. Thus, neighborhood pixels that are closer to and more similar to the center pixel in the guidance image contribute more to the filtering result.
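A direct, unoptimized NumPy sketch of Eqs. (2) and (3) is given below; the function name and parameter defaults are assumptions, and real implementations are typically vectorized or otherwise accelerated.

```python
import numpy as np

def joint_bilateral_filter(f, g, radius=3, sigma_d=2.0, sigma_r=0.1):
    """Filter single-band image f per Eqs. (2)-(3), measuring spatial
    closeness in the image plane and color similarity on guidance g.
    f, g: 2-D float arrays of equal shape."""
    h = np.zeros_like(f, dtype=float)
    rows, cols = f.shape
    for i in range(rows):
        for j in range(cols):
            i0, i1 = max(i - radius, 0), min(i + radius + 1, rows)
            j0, j1 = max(j - radius, 0), min(j + radius + 1, cols)
            ii, jj = np.mgrid[i0:i1, j0:j1]
            # Closeness c(n, x): Gaussian of the Euclidean distance.
            c = np.exp(-0.5 * ((ii - i) ** 2 + (jj - j) ** 2) / sigma_d ** 2)
            # Similarity s(g(n), g(x)): Gaussian of the guidance difference.
            s = np.exp(-0.5 * ((g[i0:i1, j0:j1] - g[i, j]) / sigma_r) ** 2)
            w = c * s
            h[i, j] = np.sum(w * f[i0:i1, j0:j1]) / np.sum(w)  # Eqs. (2)-(3)
    return h
```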
2.4. Domain transform recursive filtering

Most edge-preserving filtering technologies operate in two-dimensional space; when the filter window is large, the efficiency of these algorithms is low. To perform high-quality edge-preserving filtering of images in real time, Gastal and Oliveira [39] proposed a novel edge-preserving filtering method that works in one-dimensional space based on a domain transform. The domain transform defines a distance-preserving transformation that preserves the geodesic distance between pixels. For a 1D signal I with a given sampling interval h, the domain transform ct must satisfy the following equality (in the $\ell_1$ norm):
$$ct(x + h) - ct(x) = h + |I(x + h) - I(x)| \qquad (4)$$
After the domain transformation, uniformly sampled signals become non-uniformly sampled signals, as shown in Fig. 1. Because an inherently 2D domain transform generally does not exist, a 1D domain transform is used to perform 2D filtering. Specifically, horizontal filtering is performed along each image row; then, vertical filtering is performed along each image column. To eliminate the stripes introduced by a single iteration of a two-pass 1D filtering process, multiple iterations are required to achieve good results. A recursive edge-preserving filter can be defined in the transformed domain as:
$$J[x_n] = (1 - a^d)\, I[x_n] + a^d\, J[x_{n-1}] \qquad (5)$$
where J is the output signal, $a \in [0, 1]$ is a feedback coefficient, and $d = ct(x_n) - ct(x_{n-1})$ is the distance between the neighboring samples $x_n$ and $x_{n-1}$ in the transformed domain. As the gray-level difference between adjacent pixels increases, d increases and $a^d$ approaches zero, stopping the propagation chain and thus preserving the edges in the image. A recursive filter is unidirectional; thus, it must be applied twice (e.g., left-to-right and then right-to-left) to achieve a symmetric filtered result. A minimal sketch of this recursion follows Section 2.5.

2.5. Comparison of the three types of spatial information

All of the above techniques (i.e., image segmentation, bilateral filtering and recursive filtering) can be used for hyperspectral image classification. Using neighborhood information, certain pixel classification errors can be corrected. However, the scope of the neighborhood is difficult to determine in practice. For efficiency, the neighborhood range cannot be infinite; therefore, it is necessary to find the optimal combination from a variety of neighborhood ranges. As shown in Fig. 2, image segmentation introduces spatial contextual information from an adaptive neighborhood region; bilateral filtering introduces it from a fixed-size window; and recursive filtering introduces it from the same line or column as the current pixel. These techniques are complementary in their use of spatial information; therefore, they can be combined to improve image classification results.
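Here is the promised sketch of Eqs. (4) and (5) on a 1-D signal. The parameterization of the feedback coefficient (a = exp(-1/sigma)) and the fixed number of iterations are assumptions; the published filter additionally shrinks the filter scale across iterations [39].

```python
import numpy as np

def domain_transform_rf_1d(I, sigma=0.5, iterations=3):
    """Recursive edge-preserving filtering of a 1-D float signal I,
    following Eqs. (4)-(5) with sampling interval h = 1."""
    a = np.exp(-1.0 / sigma)         # assumed feedback parameterization
    d = 1.0 + np.abs(np.diff(I))     # Eq. (4): transformed-domain distances
    J = np.asarray(I, dtype=float).copy()
    for _ in range(iterations):
        for n in range(1, len(J)):   # left-to-right pass of Eq. (5)
            J[n] = (1 - a ** d[n - 1]) * J[n] + a ** d[n - 1] * J[n - 1]
        for n in range(len(J) - 2, -1, -1):   # right-to-left pass
            J[n] = (1 - a ** d[n]) * J[n] + a ** d[n] * J[n + 1]
    return J
```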
Fig. 1. Domain transform. (Left) Input signal I. (Right) Signal I in the transformed domain.
Fig. 2. Examples of spatial contextual information introduced by (a) Image segmentation. (b) Bilateral filtering. (c) Recursive filtering. Neighboring pixels are shown in green; the pixel of interest is in red. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
3. Proposed method

As mentioned earlier, the spatial contextual information introduced by the various techniques can be utilized to improve image classification accuracy. Thus, we combine the above techniques to further improve the classification accuracy of hyperspectral images. Specifically, segmentation and edge-preserving filtering are used during feature extraction to obtain high-quality features, and edge-preserving filtering is used again to refine the classification map. The hyperspectral image classification process can be divided into three phases: feature extraction, image classification and result refinement.

3.1. Hyperspectral image feature extraction

Hyperspectral sensors capture the reflectance and radiation signals of ground objects in the visible, near-infrared and shortwave-infrared spectral ranges. Since most objects have diagnostic spectral features, they become easier to distinguish if the corresponding feature bands are used. To analyze the effect of different feature bands on different ground objects, a classification experiment was performed on the Indian Pines hyperspectral image. This scene was acquired in June, when some of the crops present (corn, soybeans) were in early stages of growth. The experimental results are presented in Table 1.

From Table 1, we can see that the visible bands have advantages in classifying grass/pasture, grass/trees, wheat, woods and stone-steel towers; the near-infrared bands have advantages in classifying hay-windrowed, wheat and woods; and the shortwave-infrared bands have advantages in classifying grass/pasture-mowed, wheat and woods. A further analysis of the infrared bands shows that they perform better on dried crops and woods. Notably, the classification accuracy obtained with all spectral bands is almost always better than that obtained with any subset of the bands, which indicates that the band subsets are complementary in nature. Thus, both visible and infrared bands should be utilized in the classification of hyperspectral images.

Based on the above analysis, hyperspectral image feature extraction can be described as follows (a code sketch follows Eq. (6)):

(1) Band partitioning: partition the visible, near-infrared and shortwave-infrared bands separately into multiple smaller subsets of adjacent bands. Each subset contains up to K bands, where K can be set to a number between 5 and 10.
(2) Band merging: to reduce the dimensionality and noise of the hyperspectral image, the adjacent bands in each subset are merged into one band by averaging.
(3) Recursive filtering: edge-preserving recursive filtering is performed on each merged band to obtain one high-quality feature band.
(4) Spatial feature band generation: to utilize irregular neighborhood information, image segmentation is performed on the hyperspectral image to obtain a segmentation band.

Finally, the resulting hyperspectral dataset is constructed as follows:

$$HD_{i,j}^{(k)} = \begin{cases} SB_{i,j}, & k = 1 \\ FB_{i,j}^{(k-1)}, & 2 \le k \le n + 1 \end{cases} \qquad (6)$$

where HD is the feature band set that contains n + 1 bands, SB is the segmentation band, FB is the filtered feature band set that contains n bands, and i, j and k denote the pixel's line, column and band numbers, respectively.
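The following sketch strings the four steps together, reusing the domain_transform_rf_1d function sketched in Section 2.4. The helper recursive_filter_2d, the uniform subset size, and the single pass over all bands (rather than partitioning the visible, near-infrared and shortwave-infrared ranges separately) are simplifying assumptions.

```python
import numpy as np

def recursive_filter_2d(band, sigma=0.5):
    """Two-pass 2-D filtering: apply the 1-D recursive filter sketched in
    Section 2.4 along every row, then along every column."""
    J = np.apply_along_axis(domain_transform_rf_1d, 1, band, sigma)
    return np.apply_along_axis(domain_transform_rf_1d, 0, J, sigma)

def extract_features(hsi, seg_band, subset_size=8):
    """Build the feature band set HD of Eq. (6).

    hsi: (rows, cols, bands) hyperspectral cube;
    seg_band: (rows, cols) region labels from the segmentation step.
    """
    feature_bands = [seg_band.astype(float)]     # k = 1: segmentation band
    for start in range(0, hsi.shape[2], subset_size):
        # Band partitioning and merging by averaging.
        merged = hsi[:, :, start:start + subset_size].mean(axis=2)
        # Recursive filtering of each merged band.
        feature_bands.append(recursive_filter_2d(merged))
    return np.dstack(feature_bands)              # (rows, cols, n + 1)
```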
3.2. Hyperspectral image classification

The SVM (support vector machine) is a commonly used pixel-wise classifier that provides good classification accuracy. Compared to its counterparts, the SVM has two major advantages: (1) it requires only a few samples (i.e., the support vectors) to locate the optimal separating hyperplane; and (2) a Gaussian-kernel SVM can manage infinite-dimensional classification problems and is thus robust to the spectral dimension of a hyperspectral image. Therefore, we used an SVM classifier to classify the hyperspectral dataset and provide the best pixel-wise classification result.
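The paper uses the LIBSVM library for this step; the sketch below uses scikit-learn's SVC, which wraps libsvm, and the kernel parameters shown are assumptions.

```python
import numpy as np
from sklearn.svm import SVC

def classify_pixels(features, train_mask, train_labels):
    """Pixel-wise Gaussian-kernel SVM classification of the feature band set.

    features: (rows, cols, n_bands); train_mask: boolean (rows, cols);
    train_labels: class ids of the pixels selected by train_mask.
    """
    rows, cols, n_bands = features.shape
    X = features.reshape(-1, n_bands)
    clf = SVC(kernel="rbf", C=100.0, gamma="scale")   # assumed parameters
    clf.fit(X[train_mask.ravel()], train_labels)
    return clf.predict(X).reshape(rows, cols)
```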
3.3. Classification map refinement

The pixel-wise classification map obtained from the SVM classifier usually contains salt-and-pepper noise because the spectrum of a pixel is volatile and vulnerable to environmental noise. Determining the class of a pixel should depend both on its spectrum and on the classes of the pixels around it. In the post-processing phase, the multi-class classification map is first decomposed into multiple single-class classification maps, in which pixels belonging to the class are assigned the value 1 (and 0 otherwise). Then, joint bilateral filtering is performed on each single-class map to generate a smoothed abundance map for that class; the first one or three principal components, which contain most of the spatial information, are used as the guidance image. Finally, the multiple abundance maps are fused into a single classification map in which each pixel is given the class label with the maximum abundance value. Formally, the final result is calculated as:

$$L_{i,j} = \arg\max_{k}\ p_{i,j}^{(k)} \qquad (7)$$
where L is the final classification map, $p^{(k)}$ is the abundance map of the k-th class, and i and j denote the pixel's line and column numbers, respectively. Salt-and-pepper noise in the classification map can be effectively removed by this sequence of operations.
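A compact sketch of this refinement phase follows, reusing the joint_bilateral_filter function sketched in Section 2.3; the function name and default filter parameters are assumptions.

```python
import numpy as np

def refine_map(class_map, guidance, n_classes):
    """Refine a pixel-wise classification map following Eq. (7).

    class_map: (rows, cols) integer labels from the SVM;
    guidance: (rows, cols) principal-component image used as guidance.
    """
    abundance = []
    for k in range(n_classes):
        binary = (class_map == k).astype(float)          # single-class map
        abundance.append(joint_bilateral_filter(binary, guidance))
    # Eq. (7): assign each pixel the class with the maximum abundance.
    return np.argmax(np.stack(abundance, axis=0), axis=0)
```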
Table 1
Classification accuracies (in %) of the SVM classifier for different band subsets of the Indian Pines hyperspectral image. The training set accounts for 10% of the ground truth.

Class | Train | Test | Visible light | Near infrared | Shortwave infrared | All spectral bands
Alfalfa | 23 | 23 | 71.11 | 48.07 | 59.12 | 71.08
Corn-no till | 95 | 1333 | 56.71 | 56.96 | 67.64 | 78.29
Corn-min till | 81 | 749 | 55.99 | 49.08 | 63.67 | 73.24
Corn | 65 | 172 | 42.71 | 28.91 | 37.24 | 52.52
Grass/pasture | 73 | 410 | 87.91 | 82.95 | 84.57 | 88.98
Grass/trees | 73 | 657 | 94.53 | 86.21 | 88.16 | 94.82
Grass/pasture-mowed | 14 | 14 | 34.44 | 57.99 | 79.81 | 88.10
Hay-windrowed | 70 | 408 | 94.06 | 97.87 | 95.93 | 98.98
Oats | 10 | 10 | 56.18 | 33.37 | 63.07 | 64.14
Soybeans-no till | 80 | 892 | 52.02 | 43.65 | 55.99 | 73.43
Soybeans-min till | 105 | 2350 | 74.57 | 57.75 | 70.34 | 83.65
Soybeans-clean till | 74 | 519 | 42.41 | 39.46 | 61.33 | 71.65
Wheat | 68 | 137 | 89.84 | 97.65 | 89.56 | 96.34
Woods | 80 | 1185 | 94.76 | 91.78 | 94.03 | 96.00
Buildings-grass-tree-drives | 67 | 319 | 57.13 | 54.32 | 48.82 | 58.44
Stone-steel towers | 46 | 47 | 95.41 | 85.83 | 81.48 | 85.12

The bold values in the table highlight the best score for each class.
4. Experimental results and discussion

4.1. Experimental setup

4.1.1. Dataset
The proposed method was evaluated on three hyperspectral datasets: the Indian Pines image, the University of Pavia image, and the Salinas image.
Fig. 3. (a) Three-band color composite of the Indian Pines image (bands 43, 30, and 21); (b) and (c) ground truth data of the Indian Pines image.
The Indian Pines image, which depicts the agricultural Indian Pines test site in northwestern Indiana, was acquired by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) sensor. The image has 220 bands of size 145 × 145 pixels with a spatial resolution of 20 m per pixel and a spectral coverage ranging from 0.4 to 2.5 μm. Note that 20 water-absorption bands (nos. 104–108, 150–163, and 220) were removed before the experiments were performed. Sixteen types of ground objects have been identified in this hyperspectral image. Fig. 3 shows a color composite of the Indian Pines image and the corresponding ground truth data.

The University of Pavia image was captured by the ROSIS-03 airborne sensor over the University of Pavia. The image has 115 bands of size 610 × 340 pixels with a spatial resolution of 1.3 m per pixel and a spectral coverage ranging from 0.43 to 0.86 μm. Note that the 12 noisiest bands were removed before the experiments were performed. Nine types of ground objects have been identified in this hyperspectral image. Fig. 4 shows a color composite of the University of Pavia image and the corresponding ground truth data.

The Salinas image was captured by the AVIRIS sensor over Salinas Valley with a spatial resolution of 3.7 m per pixel. The image has 224 bands of size 512 × 217 pixels. As with the Indian Pines image, 20 water-absorption bands (nos. 108–112, 154–167, and 224) were discarded. Sixteen types of ground objects have been identified in this hyperspectral image. Fig. 5 shows a color composite of the Salinas image and the corresponding ground truth data. For the three images, the numbers of training and test samples for each class are listed in Tables 2–4, respectively.
4.1.2. Quality indices
Three widely used quality indices, i.e., the overall accuracy (OA), the average accuracy (AA), and the kappa coefficient, were used to evaluate the performance of the proposed method. OA is the percentage of correctly classified pixels, and AA is the mean of the per-class percentages of correctly classified pixels. The kappa coefficient gives the percentage of correctly classified pixels corrected by the number of agreements that would be expected purely by chance. All results are averaged over 30 Monte Carlo runs.
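For reference, the three indices can be computed from a confusion matrix as follows; the function name is illustrative.

```python
import numpy as np

def accuracy_metrics(conf):
    """Compute OA, AA and kappa from a confusion matrix conf, where
    conf[i, j] counts pixels of true class i predicted as class j."""
    conf = np.asarray(conf, dtype=float)
    total = conf.sum()
    oa = np.trace(conf) / total                     # overall accuracy
    aa = (np.diag(conf) / conf.sum(axis=1)).mean()  # average accuracy
    chance = (conf.sum(axis=0) * conf.sum(axis=1)).sum() / total ** 2
    kappa = (oa - chance) / (1.0 - chance)          # chance-corrected accuracy
    return oa, aa, kappa
```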
Fig. 4. (a) Three-band color composite of the University of Pavia image (bands 56, 33, and 13); (b) and (c) ground truth data of the University of Pavia image.
Fig. 5. (a) Three-band color composite of the Salinas image (bands 30, 21, and 10); (b) and (c) ground truth data of the Salinas image.
Table 2
Classification accuracies (in percent) and κ statistic (standard deviation included) of the SVM, EPF, IFRF and SnEPF methods for the Indian Pines image. The training set accounts for 5% of the ground truth.

Class | Train | Test | SVM | EPF | IFRF | SnEPF
Alfalfa | 23 | 23 | 52.31 ± 13.20 | 82.34 ± 15.79 | 97.84 ± 5.15 | 97.90 ± 4.21
Corn-no till | 37 | 1391 | 70.64 ± 5.15 | 85.79 ± 5.95 | 92.80 ± 4.63 | 93.36 ± 4.01
Corn-min till | 36 | 794 | 62.78 ± 6.57 | 81.08 ± 8.36 | 91.38 ± 5.47 | 94.75 ± 3.05
Corn | 34 | 203 | 44.70 ± 5.87 | 54.67 ± 7.95 | 88.98 ± 6.25 | 92.56 ± 5.77
Grass/pasture | 33 | 450 | 85.39 ± 4.09 | 93.48 ± 3.01 | 94.16 ± 5.12 | 94.14 ± 6.04
Grass/trees | 32 | 698 | 92.20 ± 1.99 | 95.51 ± 1.66 | 97.57 ± 2.45 | 99.55 ± 1.79
Grass/pasture-mowed | 14 | 14 | 74.86 ± 11.12 | 95.78 ± 7.12 | 83.68 ± 6.13 | 84.47 ± 5.62
Hay-windrowed | 33 | 445 | 98.60 ± 0.88 | 99.72 ± 0.51 | 100.00 ± 0.00 | 100.00 ± 0.00
Oats | 10 | 10 | 48.64 ± 8.95 | 76.36 ± 10.40 | 74.92 ± 4.59 | 87.79 ± 2.32
Soybeans-no till | 38 | 934 | 66.30 ± 5.43 | 74.55 ± 6.46 | 88.84 ± 7.18 | 89.73 ± 7.02
Soybeans-min till | 48 | 2407 | 78.69 ± 3.28 | 86.42 ± 4.22 | 96.16 ± 2.32 | 97.53 ± 1.53
Soybeans-clean till | 36 | 557 | 57.06 ± 5.63 | 70.43 ± 8.11 | 89.85 ± 6.58 | 92.86 ± 4.96
Wheat | 33 | 172 | 90.93 ± 5.26 | 97.00 ± 3.89 | 95.47 ± 5.53 | 97.02 ± 4.04
Woods | 36 | 1229 | 94.62 ± 1.54 | 95.61 ± 1.64 | 99.36 ± 0.78 | 99.34 ± 0.57
Buildings-grass-tree-drives | 36 | 350 | 54.81 ± 6.62 | 69.18 ± 8.16 | 90.55 ± 6.72 | 95.99 ± 4.73
Stone-steel towers | 33 | 60 | 77.20 ± 9.93 | 86.92 ± 9.22 | 97.37 ± 2.60 | 97.76 ± 2.27
OA | – | – | 74.61 ± 1.61 | 83.93 ± 1.70 | 94.02 ± 1.62 | 95.55 ± 1.13
AA | – | – | 71.86 ± 1.41 | 84.05 ± 1.90 | 92.43 ± 2.50 | 94.67 ± 2.16
Kappa | – | – | 71.20 ± 1.74 | 81.70 ± 1.88 | 93.16 ± 1.84 | 94.91 ± 1.28

The bold values in the table highlight the best score for each class.
4.2. Image segmentation

The image segmentation process is as follows. First, PCA is performed to transform the hyperspectral image into a gray-scale image and a color image: the gray-scale image corresponds to the first principal component, and the color image is composed of the first three principal components. The multi-resolution image segmentation algorithm [35] is then applied, with the scale set to 10 and the spectral weight set to 0.9. Fig. 6 shows the gray-scale and color images of the Indian Pines hyperspectral image together with the corresponding segmentation results. The segmentation was performed on the gray-scale and color images using the eCognition software [34].
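A sketch of this preprocessing step using scikit-learn's PCA is shown below; the function name and reshaping conventions are assumptions, and the segmentation itself is performed externally in eCognition.

```python
import numpy as np
from sklearn.decomposition import PCA

def pca_guidance_images(hsi, n_components=3):
    """Project the hyperspectral cube onto its first principal components,
    returning a gray-scale image (PC1) and a color image (PC1-PC3)."""
    rows, cols, bands = hsi.shape
    pcs = PCA(n_components=n_components).fit_transform(hsi.reshape(-1, bands))
    pcs = pcs.reshape(rows, cols, n_components)
    return pcs[:, :, 0], pcs[:, :, :3]
```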
Fig. 6. Indian Pines hyperspectral image segmentation: (a) gray-scale image segmentation; (b) color-image segmentation.
Table 3
Classification accuracies (in percent) and κ statistic (standard deviation included) of the SVM, EPF, IFRF and SnEPF methods for the University of Pavia image. The training set accounts for 1% of the ground truth.

Class | Train | Test | SVM | EPF | IFRF | SnEPF
Asphalt | 48 | 6583 | 95.31 ± 1.89 | 97.42 ± 1.42 | 95.95 ± 1.22 | 96.69 ± 1.33
Meadows | 48 | 18,601 | 95.18 ± 1.02 | 97.74 ± 0.81 | 99.30 ± 0.37 | 99.36 ± 0.28
Gravel | 47 | 2052 | 67.74 ± 4.76 | 84.39 ± 5.49 | 75.34 ± 5.72 | 90.91 ± 3.68
Trees | 47 | 3017 | 74.65 ± 7.95 | 80.29 ± 9.74 | 90.57 ± 7.46 | 91.94 ± 6.14
Metal sheets | 47 | 1298 | 96.01 ± 2.05 | 97.04 ± 2.01 | 95.38 ± 4.49 | 98.64 ± 2.02
Bare soil | 49 | 4980 | 62.96 ± 5.69 | 73.73 ± 7.11 | 90.22 ± 6.70 | 91.81 ± 5.56
Bitumen | 47 | 1283 | 58.71 ± 7.76 | 75.31 ± 10.39 | 83.12 ± 5.15 | 89.47 ± 5.04
Bricks | 47 | 3635 | 79.96 ± 3.46 | 86.66 ± 3.19 | 84.39 ± 4.97 | 88.82 ± 4.97
Shadows | 47 | 900 | 99.90 ± 0.12 | 99.38 ± 0.40 | 99.53 ± 1.16 | 99.87 ± 0.12
OA | – | – | 83.28 ± 1.45 | 89.68 ± 1.60 | 93.51 ± 1.19 | 95.51 ± 1.14
AA | – | – | 81.16 ± 1.13 | 88.00 ± 1.72 | 90.42 ± 1.38 | 93.90 ± 1.40
Kappa | – | – | 78.50 ± 1.73 | 86.61 ± 1.98 | 91.46 ± 1.52 | 94.08 ± 1.49

The bold values in the table highlight the best score for each class.
Table 4
Classification accuracies (in percent) and κ statistic (standard deviation included) of the SVM, EPF, IFRF and SnEPF methods for the Salinas image. The training set accounts for 1% of the ground truth.

Class | Train | Test | SVM | EPF | IFRF | SnEPF
Weeds1 | 33 | 1976 | 99.41 ± 0.77 | 99.99 ± 0.03 | 100.00 ± 0.00 | 99.99 ± 0.01
Weeds2 | 34 | 3692 | 99.28 ± 0.41 | 99.86 ± 0.13 | 100.00 ± 0.00 | 100.00 ± 0.00
Fallow | 35 | 1941 | 93.19 ± 2.04 | 94.76 ± 1.80 | 99.85 ± 0.08 | 99.84 ± 0.15
Fallow-P | 33 | 1361 | 97.68 ± 0.68 | 97.94 ± 0.69 | 96.74 ± 1.13 | 97.40 ± 0.82
Fallow-S | 33 | 2645 | 98.26 ± 0.74 | 99.30 ± 0.49 | 99.95 ± 0.06 | 99.83 ± 0.23
Stubble | 35 | 3924 | 99.97 ± 0.08 | 99.99 ± 0.05 | 100.00 ± 0.01 | 100.00 ± 0.00
Celery | 33 | 3546 | 98.88 ± 0.95 | 99.26 ± 0.81 | 99.57 ± 0.23 | 99.71 ± 0.20
Grapes | 37 | 11,234 | 74.65 ± 2.98 | 81.10 ± 4.75 | 99.19 ± 1.05 | 98.92 ± 1.22
Soil | 34 | 6169 | 99.41 ± 0.19 | 99.45 ± 0.13 | 99.95 ± 0.11 | 99.98 ± 0.05
Corn | 34 | 3244 | 81.53 ± 4.66 | 86.24 ± 4.56 | 99.73 ± 0.17 | 99.76 ± 0.17
Lettuce 4 wk | 33 | 1035 | 88.02 ± 4.71 | 93.77 ± 3.97 | 98.84 ± 0.98 | 99.04 ± 0.94
Lettuce 5 wk | 34 | 1893 | 96.08 ± 1.43 | 97.98 ± 1.03 | 99.63 ± 0.45 | 99.50 ± 0.78
Lettuce 6 wk | 33 | 883 | 95.58 ± 2.13 | 97.36 ± 1.58 | 97.65 ± 1.84 | 98.22 ± 1.95
Lettuce 7 wk | 33 | 1037 | 89.28 ± 7.10 | 94.14 ± 3.96 | 97.88 ± 1.29 | 97.69 ± 1.63
Vineyard-U | 34 | 7234 | 58.52 ± 4.26 | 69.70 ± 7.51 | 92.95 ± 4.20 | 94.11 ± 3.96
Vineyard-T | 33 | 1774 | 96.33 ± 1.97 | 98.52 ± 1.07 | 99.72 ± 0.78 | 99.87 ± 0.41
OA | – | – | 86.34 ± 1.26 | 89.99 ± 1.68 | 98.52 ± 0.58 | 98.67 ± 0.51
AA | – | – | 91.63 ± 0.87 | 94.33 ± 0.91 | 98.85 ± 0.33 | 98.96 ± 0.31
Kappa | – | – | 84.82 ± 1.38 | 88.87 ± 1.86 | 98.35 ± 0.64 | 98.52 ± 0.57

The bold values in the table highlight the best score for each class.
The segmentation result was exported as a TIF file, in which each pixel contains the label of the region to which it belongs. Because the first three principal components contain more information than the first principal component alone, we select the color-image segmentation result as the new feature band.

4.3. Classification results

In this section, the proposed method, which uses segmentation and edge-preserving filtering, is compared to several methods: the SVM method, the EPF (edge-preserving filtering) method [33], and the IFRF (image fusion and recursive filtering) method [32]. Both the EPF and the IFRF methods use the SVM as the classifier. The SVM algorithm is implemented using the LIBSVM library [40]. For the EPF and IFRF methods, the code used in this study is available on Dr. Kang's homepage (http://xudongkang.weebly.com/index.html).

The first experiment was performed on the Indian Pines dataset. Table 2 presents the number of training and test samples. The training set, which accounts for 5% of the ground truth, was chosen randomly, and all classification methods used the same labeled and test samples. Fig. 7 shows the classification results obtained by the different methods for the Indian Pines image with their corresponding OA values. As shown in Fig. 7, the
EPF method performs better than the SVM method, and the IFRF method performs better than the EPF method. For brevity, the method proposed in this paper is named the SnEPF (segmentation and edge-preserving filtering) method. By incorporating all three kinds of spatial information into the classification process, the SnEPF method outperforms the other methods in terms of OA. The classification map of SnEPF is more homogeneous and smooth and is very close to the ground truth. Table 2 shows the classification accuracies of the different methods, including the overall accuracy, the average accuracy and the kappa coefficient, as well as the accuracies for each type of ground object. The table shows that the proposed SnEPF method increases the OA of the SVM from 74.61% to 95.55% and the AA from 71.86% to 94.67%; the kappa coefficient also increases significantly.

The second and third experiments were performed on the University of Pavia image and the Salinas image, respectively. Tables 3 and 4 show the numbers of training and test samples. The training set, which accounts for 1% of the ground truth, was chosen randomly, and all classification methods used the same labeled and test samples. Figs. 8 and 9 show the classification results obtained by the different methods for the University of Pavia image and the Salinas image, respectively, along with their corresponding OAs. As shown in Figs. 8 and 9,
Fig. 7. Classification results (Indian Pines image) obtained by (a) the SVM method (OA = 73.28%), (b) the EPF method (OA = 88.23%), (c) the IFRF method (OA = 92.42%), and (d) the SnEPF method (OA = 97.02%).
Fig. 8. Classification results (University of Pavia image) obtained by (a) the SVM method (OA = 79.78%), (b) the EPF method (OA = 91.57%), (c) the IFRF method (OA = 93.23%), and (d) the SnEPF method (OA = 96.57%).
Fig. 9. Classification results (Salinas image) obtained by (a) the SVM method (OA = 84.80%), (b) the EPF method (OA = 90.52%), (c) the IFRF method (OA = 98.52%), and (d) the SnEPF method (OA = 99.57%).
adding the image-segmentation band can help to eliminate most of the "noisy pixels" produced by the SVM method, and the overall classification accuracy improves by more than 12%. For example, in the brown and green regions at the center and bottom of Fig. 8, respectively, and in the two larger blocks at the top-left corner of Fig. 9, misclassified pixels were corrected, and the classification map became smoother. Tables 3 and 4 show the classification accuracies for the different methods tested in this study. For these two examples, the proposed SnEPF method always outperforms the EPF and IFRF methods in terms of OA, AA and kappa.
Compared to the SVM method, the proposed method improves the image classification accuracies significantly. For example, in Table 3 the classification accuracy of the Bare soil class increases from 62.96% to 91.81%, and in Table 4 the classification accuracy of the Grapes class increases from 74.65% to 98.92%. These two sets of results further demonstrate the advantages of the proposed method.

From the above three experiments, we can see that the SnEPF method always performs better than the other three classification methods in terms of OA, AA and kappa. However, SnEPF is less accurate for some individual classes, such as "Asphalt" and "Shadows" in Table 3. The reason is that inappropriate merging of adjacent spectral bands may eliminate some diagnostic spectral features of ground objects. Moreover, even though the image segmentation scale is set very small, pixels belonging to different classes may still be grouped into the same patch, which misleads the SVM classifier to some extent. Thus, preserving the diagnostic spectral features of objects during band merging and avoiding under-segmentation of patches are two important issues to be addressed in future work.

5. Conclusion

This study proposed a new hyperspectral image classification approach called SnEPF. When only a few labeled samples are available, spatial information is required to improve the classification accuracy. Existing spectral-spatial classification methods use only one kind of spatial information, which is usually insufficient to depict the complex spatial autocorrelation in real hyperspectral images. The SnEPF method introduces three kinds of spatial information, which increase the similarity between pixels of the same class and remove noise in the selected feature bands and the classification map. The method not only improves the classification accuracy of hyperspectral images but also makes the classification map smoother. Experiments were performed on three real hyperspectral images. Comparisons with other state-of-the-art methods showed that the proposed SnEPF method produces more accurate results in terms of OA, AA and the kappa coefficient.

Acknowledgements

This work was co-supported by the National Natural Science Foundation of China (NSFC) (41406200) and the Shandong Province Natural Science Foundation of China (ZR2014DQ030). The authors would like to thank the anonymous reviewers of this manuscript for their constructive comments, which helped improve this study significantly. The authors would also like to thank Prof. C. Lin for providing the LIBSVM software and Dr. X. Kang for sharing the MATLAB source code for the EPF and IFRF methods.

References

[1] G. Hughes, On the mean accuracy of statistical pattern recognizers, IEEE Trans. Inf. Theory IT-14 (1968) 55–63.
[2] S.B. Serpico, G. Moser, Extraction of spectral channels from hyperspectral images for classification purposes, IEEE Trans. Geosci. Remote Sens. 45 (2007) 484–495.
[3] M. Pal, G.M. Foody, Feature selection for classification of hyperspectral data by SVM, IEEE Trans. Geosci. Remote Sens. 48 (2010) 2297–2307.
[4] Q. Zhang, Y. Tian, Y. Yang, C. Pan, Automatic spatial-spectral feature selection for hyperspectral image via discriminative sparse multimodal learning, IEEE Trans. Geosci. Remote Sens. 53 (2015) 261–279.
[5] Q. Du, H. Yang, Similarity-based unsupervised band selection for hyperspectral image analysis, IEEE Geosci. Remote Sens. Lett. 5 (2008) 564–568.
[6] L. Wang, X. Jia, Y. Zhang, A novel geometry-based feature-selection technique for hyperspectral imagery, IEEE Geosci. Remote Sens. Lett. 4 (2007) 171–175.
[7] P. Mitra, C. Murthy, S.K. Pal, Unsupervised feature selection using feature similarity, IEEE Trans. Pattern Anal. Mach. Intell. 24 (2002) 301–312.
[8] L. Zhang, Y. Zhong, B. Huang, J. Gong, P. Li, Dimensionality reduction based on clonal selection for hyperspectral imagery, IEEE Trans. Geosci. Remote Sens. 45 (2007) 4172–4186.
[9] M.L. Raymer, W.F. Punch, E.D. Goodman, L.A. Kuhn, A.K. Jain, Dimensionality reduction using genetic algorithms, IEEE Trans. Evol. Comput. 4 (2000) 164–171.
[10] L.N. De Castro, F.J. Von Zuben, Learning and optimization using the clonal selection principle, IEEE Trans. Evol. Comput. 6 (2002) 239–251.
[11] L. Bruzzone, F. Roli, S.B. Serpico, An extension of the Jeffreys-Matusita distance to multiclass cases for feature selection, IEEE Trans. Geosci. Remote Sens. 33 (1995) 1318–1321.
[12] R. Huang, M. He, Band selection based on feature weighting for classification of hyperspectral data, IEEE Geosci. Remote Sens. Lett. 2 (2005) 156–159.
[13] T. Kailath, The divergence and Bhattacharyya distance measures in signal selection, IEEE Trans. Commun. Technol. COM-15 (1967) 52–60.
[14] H. Peng, F. Long, C. Ding, Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy, IEEE Trans. Pattern Anal. Mach. Intell. 27 (2005) 1226–1238.
[15] S. Prasad, L.M. Bruce, Limitations of principal components analysis for hyperspectral target recognition, IEEE Geosci. Remote Sens. Lett. 5 (2008) 625–629.
[16] A.A. Green, M. Berman, P. Switzer, M.D. Craig, A transformation for ordering multispectral data in terms of image quality with implications for noise removal, IEEE Trans. Geosci. Remote Sens. 26 (1988) 65–74.
[17] A. Villa, J.A. Benediktsson, J. Chanussot, C. Jutten, Hyperspectral image classification with independent component discriminant analysis, IEEE Trans. Geosci. Remote Sens. 49 (2011) 4865–4876.
[18] C.-I. Chang, H. Ren, An experiment-based quantitative and comparative analysis of target detection and image classification algorithms for hyperspectral imagery, IEEE Trans. Geosci. Remote Sens. 38 (2000) 1044–1063.
[19] M. Fauvel, Y. Tarabalka, J.A. Benediktsson, J. Chanussot, J.C. Tilton, Advances in spectral-spatial classification of hyperspectral images, Proc. IEEE 101 (2013) 652–675.
[20] A. Plaza, J.A. Benediktsson, J.W. Boardman, J. Brazile, L. Bruzzone, G. Camps-Valls, J. Chanussot, M. Fauvel, P. Gamba, A. Gualtieri, Recent advances in techniques for hyperspectral image processing, Remote Sens. Environ. 113 (2009) S110–S122.
[21] A. Plaza, P. Martinez, J. Plaza, R. Perez, Dimensionality reduction and classification of hyperspectral image data using sequences of extended morphological transformations, IEEE Trans. Geosci. Remote Sens. 43 (2005) 466–479.
[22] Y. Tarabalka, M. Fauvel, J. Chanussot, J.A. Benediktsson, SVM- and MRF-based method for accurate classification of hyperspectral images, IEEE Geosci. Remote Sens. Lett. 7 (2010) 736–740.
[23] M. Kim, M. Madden, T.A. Warner, Forest type mapping using object-specific texture measures from multispectral IKONOS imagery: segmentation quality and image classification issues, Photogramm. Eng. Remote Sens. 75 (2009) 819–829.
[24] M. Fauvel, J. Chanussot, J.A. Benediktsson, A spatial-spectral kernel-based approach for the classification of remote-sensing images, Pattern Recogn. 45 (2012) 381–392.
[25] G. Camps-Valls, L. Gomez-Chova, J. Muñoz-Marí, J. Vila-Francés, J. Calpe-Maravilla, Composite kernels for hyperspectral image classification, IEEE Geosci. Remote Sens. Lett. 3 (2006) 93–97.
[26] G. Camps-Valls, N. Shervashidze, K.M. Borgwardt, Spatio-spectral remote sensing image classification with graph kernels, IEEE Geosci. Remote Sens. Lett. 7 (2010) 741–745.
[27] Y. Tarabalka, J.A. Benediktsson, J. Chanussot, Spectral-spatial classification of hyperspectral imagery based on partitional clustering techniques, IEEE Trans. Geosci. Remote Sens. 47 (2009) 2973–2987.
[28] Y. Tarabalka, J. Chanussot, J.A. Benediktsson, Segmentation and classification of hyperspectral images using watershed transformation, Pattern Recogn. 43 (2010) 2367–2379.
[29] Y. Tarabalka, J.A. Benediktsson, J. Chanussot, J.C. Tilton, Multiple spectral-spatial classification approach for hyperspectral data, IEEE Trans. Geosci. Remote Sens. 48 (2010) 4122–4132.
[30] S. Li, X. Kang, J. Hu, Image fusion with guided filtering, IEEE Trans. Image Process. 22 (2013) 2864–2875.
[31] C.-H. Lin, J.-S. Tsai, C.-T. Chiu, Switching bilateral filter with a texture/noise detector for universal noise removal, IEEE Trans. Image Process. 19 (2010) 2307–2320.
[32] X. Kang, S. Li, J.A. Benediktsson, Feature extraction of hyperspectral images with image fusion and recursive filtering, IEEE Trans. Geosci. Remote Sens. 52 (2014) 3742–3752.
[33] X. Kang, S. Li, J.A. Benediktsson, Spectral-spatial hyperspectral image classification with edge-preserving filtering, IEEE Trans. Geosci. Remote Sens. 52 (2014) 2666–2677.
[34] M. Baatz, A. Schäpe, Multiresolution segmentation: an optimization approach for high quality multi-scale image segmentation, in: Angewandte Geographische Informationsverarbeitung XII, Wichmann Verlag, Heidelberg, 2000, pp. 12–23.
[35] U.C. Benz, P. Hofmann, G. Willhauck, I. Lingenfelder, M. Heynen, Multi-resolution, object-oriented fuzzy analysis of remote sensing data for GIS-ready information, ISPRS J. Photogramm. 58 (2004) 239–258.
[36] C. Tomasi, R. Manduchi, Bilateral filtering for gray and color images, in: Proc. 6th Int. Conf. Comput. Vis., IEEE, Bombay, India, 1998, pp. 839–846.
[37] Z. Farbman, R. Fattal, D. Lischinski, R. Szeliski, Edge-preserving decompositions for multi-scale tone and detail manipulation, ACM Trans. Graphics 27 (2008) 67:1–67:10.
[38] J.-M. Yang, B.-C. Kuo, P.-T. Yu, C.-H. Chuang, A dynamic subspace method for hyperspectral image classification, IEEE Trans. Geosci. Remote Sens. 48 (2010) 2840–2853.
[39] E.S. Gastal, M.M. Oliveira, Domain transform for edge-aware image and video processing, ACM Trans. Graphics 30 (2011) 69:1–69:11.
[40] C.-C. Chang, C.-J. Lin, LIBSVM: a library for support vector machines, ACM Trans. Intell. Syst. Technol. 2 (2011) 27:1–27:27.