Dimensionality reduction and derivative spectral feature optimization for hyperspectral target recognition

Optik 130 (2017) 1349–1357. http://dx.doi.org/10.1016/j.ijleo.2016.11.143


Original research article

Yufu Qu ∗, Ziyue Liu
School of Instrumentation Science and Opto-Electronics Engineering, Beihang University, Beijing 100191, China
∗ Corresponding author. E-mail address: [email protected]

Article history: Received 8 October 2016; Accepted 26 November 2016

Keywords: Hyperspectral image; Dimensionality reduction; Derivative; Target recognition

Abstract

Huge data volumes and redundant information are common problems in hyperspectral target recognition. In this study, we propose a method that preserves the accuracy of target recognition while reducing the amount of data: the effective bands of the hyperspectral data are selected as those for which the third-order derivative spectrum approaches zero. Next, a feature optimization method based on a combination of derivative spectra is proposed for hyperspectral target recognition, where a combination of the derivative spectra and the original spectrum is used as the basic vector after dimensionality reduction. The number of analyzed bands is decreased to reduce spectral interference and the data volume, while the dimension of the combinatorial derivative spectrum is increased to obtain more spectral information from the effective bands of the hyperspectral data. Thus, the proposed method can identify the target more accurately with fewer bands. Our experiments showed that the proposed method outperformed principal component analysis, local discriminant analysis, and kernel principal component analysis in low dimensions.

© 2016 Elsevier GmbH. All rights reserved.

1. Introduction

As a type of image-spectrum data [1], hyperspectral images contain high-quality spectra covering a wide wavelength region with hundreds of spectral bands [2]. These images carry a wealth of information about the target, and they are highly useful for real-time identification, maritime surveillance [3], environmental monitoring [4], and many other novel applications. The spectral data volume grows rapidly with the number of bands [5], which allows targets to be characterized more accurately but also creates new challenges in data processing [6].

Many approaches have been proposed to reduce the dimensionality of hyperspectral data; they fall into two main categories: band selection [7] and feature extraction [8]. Band selection picks a number of bands directly from the original feature space for subsequent processing, yielding a subset of the original features in which the target can be maximally distinguished from other targets. Typical band selection methods include selection based on joint entropy [9], the optimal index factor [10], and selection based on dispersion [11]. However, dimensionality reduction for hyperspectral data more often employs feature extraction because of the severe information loss that occurs during band selection. Feature extraction transforms the original data according to a certain function so that the original feature space is projected into a new, low-dimensional feature space. Principal component analysis (PCA) [12] is the feature extraction method used most often for hyperspectral images [13].



In addition, methods such as locality preserving projection [14], local discriminant analysis (LDA) [15], and kernel PCA (KPCA) [16] have been applied successfully in many cases of dimensionality reduction. Feature extraction retains some of the necessary features of the original data, thereby largely avoiding the curse of dimensionality. The methods described above can effectively reduce the dimensionality of the data, but some loss in the accuracy of subsequent target recognition is inevitable. In order to reduce the data volume without losing target recognition precision, we combine dimensionality reduction and derivative spectral feature optimization in the present study. Derivative features are extracted and extended on the selected bands of the spectral data, which ensures the accuracy of target recognition at a lower data dimension.

2. Related work

The derivative spectrum originates in spectrometry and often plays an important role in the analysis of spectral signatures, where it characterizes the variation in hyperspectral data using the gradient difference between adjacent bands. Based on the differential values of different orders, the derivative spectrum can rapidly locate the special points that highlight the characteristics and trends of the variations in the spectra. Thus, the derivative spectrum has stronger resolving power than the original. From a mathematical viewpoint, the first derivative at a specific point on a curve is the tangent slope at that point. The second-order derivative spectrum is the derivative of the derivative of the raw data, and higher order derivative spectra are obtained in the same manner. The first-order to third-order derivatives of a hyperspectral signature f(λi) can be computed by:

$$f'(\lambda_i) = \frac{\mathrm{d}f(\lambda_i)}{\mathrm{d}\lambda} = \frac{f(\lambda_{i+1}) - f(\lambda_i)}{\lambda_{i+1} - \lambda_i} = \frac{f(\lambda_{i+1}) - f(\lambda_i)}{\Delta\lambda}, \qquad (1)$$

$$f''(\lambda_i) = \frac{\mathrm{d}^2 f(\lambda_i)}{\mathrm{d}\lambda^2} = \frac{f'(\lambda_{i+1}) - f'(\lambda_i)}{\Delta\lambda}, \qquad (2)$$

$$f'''(\lambda_i) = \frac{\mathrm{d}^3 f(\lambda_i)}{\mathrm{d}\lambda^3} = \frac{f''(\lambda_{i+1}) - f''(\lambda_i)}{\Delta\lambda}, \qquad (3)$$

where λi is the wavelength at band i; f'(λi), f''(λi), and f'''(λi) are the first-order to third-order derivative spectra at wavelength λi; f(λi) is the reflectance value measured at wavelength λi; and Δλ is the wavelength difference between two adjacent bands λi+1 and λi.

In spectral data processing, the first derivative is often described as the rate of change of the original spectral curve, whose zero points correspond to the wave crests and troughs. The second derivative measures the concavity of the original spectral curve: a point with a positive second derivative lies where the original curve is concave up, and a point with a negative second derivative lies where the curve is concave down. The third derivative is the rate at which the second derivative changes, which can be regarded as the rate of change of the original curviness at a certain wavelength.

In recent decades, derivative spectra have been used widely in spectroscopy for chemical analysis [17]. They have also been applied in the field of hyperspectral remote sensing. In 1989, Wessman et al. used derivative techniques adapted from laboratory spectroscopy to evaluate the potential of remote sensing for estimating forest canopy chemistry, and found the combinations of wavelengths that correlated most strongly with canopy chemistry by stepwise regression [18]. In 1990, Demetriades-Shah et al. reviewed methods for generating derivatives of high spectral resolution data and their applications, indicating the superior performance of derivative spectral indices and their potential in remote sensing [19]. In 1997, Gong et al. used smoothed reflectance and first derivative spectra as input features of an artificial neural network to identify six conifer tree species, and showed that a small set of selected bands can allow more accurate identification than all the spectral bands [20]. Tsai and Philpot adapted a derivative algorithm to hyperspectral data and demonstrated the use of derivative spectra for extracting subtle information [21]. In 1998, Blackburn used spectral derivatives to estimate chlorophyll concentrations at the plant canopy and leaf scales [22]. Based on texture features, Chang et al. proposed spectral derivative feature coding for hyperspectral signature analysis in 2009, which is effective for capturing spectral characteristics [23]. In 2012, Li described a band selection method for hyperspectral images using the hyperspectral derivative on the Clifford manifold, which is highly efficient in computation and time [24]. In 2013, Bao et al. showed that the classification of hyperspectral remote sensing data can be improved using derivative features, demonstrating that additional first-order derivatives improve classification accuracy, especially with few training data [25]. In 2014, Wang et al. applied a first-order derivative transformation to a PROBA CHRIS hyperspectral image of the Yellow River Estuary coastal wetland and showed the beneficial effect of derivative features on the classification results [26]. In 2015, Wang et al. presented a spatial-spectral derivative-aided kernel joint sparse representation for hyperspectral image classification that considers the first and second derivative features, and experiments confirmed the effectiveness of the approach [27].
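Before turning to the proposed method, here is a concrete numerical sketch of Eqs. (1)–(3) in Python/NumPy. The function name and the assumption of evenly spaced bands are ours, not part of the original paper.

```python
import numpy as np

def derivative_spectra(f, wavelengths):
    """Finite-difference derivative spectra of Eqs. (1)-(3).

    f: (B,) reflectance values over B bands
    wavelengths: (B,) band centers, assumed evenly spaced
    """
    dlam = wavelengths[1] - wavelengths[0]  # wavelength step between adjacent bands
    d1 = np.diff(f) / dlam                  # Eq. (1): slope of the spectral curve
    d2 = np.diff(d1) / dlam                 # Eq. (2): concavity
    d3 = np.diff(d2) / dlam                 # Eq. (3): rate of change of concavity
    return d1, d2, d3
```

Each differencing step shortens the vector by one band, so the third derivative is defined on B − 3 bands.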
These studies show that derivative spectra can deliver good hyperspectral band selection and target classification results. Building on them, we propose a method in which the effective bands of the hyperspectral data are selected by limiting the third-order derivative spectrum. We perform feature optimization by combining the derivative spectrum with the band selection method, which reduces the amount of redundant data while increasing the dimension of the combinatorial derivative spectrum, so that sufficient spectral information is preserved as the data dimension is reduced. The rich derivative spectral characteristics help to identify the target more accurately and quickly with a few bands.


Table 1 Overall accuracy (OA) and kappa coefficient for different dimensionality reduction methods.

Method       HIS     PCA     LDA     KPCA    Proposed
Dimensions   250     2       2       2       2
OA (%)       76.88   81.12   69.78   61.45   78.68
kappa        0.357   0.578   0.273   0.286   0.418

3. Proposed method

3.1. Dimensionality reduction

The aim of dimensionality reduction is to address the huge data volumes and redundant information encountered in hyperspectral data processing. In this study, we use band selection based on derivative spectrum theory to reduce the dimensionality of the original hyperspectral data, where the optimal bands are selected by finding the wavelengths that correspond to a specific curve shape.

In derivative spectral analysis, derivatives of second or higher order are relatively insensitive to variations in illumination intensity caused by changes in sun angle, cloud cover, or topography [28], which helps to make full use of the light intensity information in the original spectrum. The first and second order derivatives both reflect the positions of extreme values in the original spectra, which makes them susceptible to noise. The third derivative, however, describes the rate of change of the curviness at a certain point, which highlights the trend in the variation of the target's spectral curve. It is therefore more effective at weakening the influence of spectral band overlap between different targets, and it offers more resolving power than lower order derivatives during target recognition. When the third derivative is close to zero, the slope of the original spectral curve tends to be either unchanged or abruptly changing, so the spectral data of the corresponding bands better represent the typical spectral characteristics of the target and allow it to be distinguished from other objects or background regions.

In this study, the spectral points whose third-order derivative is close to zero are taken as the locations of the selected bands. Based on standard samples, band selection is conducted without denoising in order to avoid weakening the shape features of the original spectral curve. For the average spectra of the target and the previously obtained background, thresholds infinitely close to zero are selected to restrict the third derivative spectral range of the target and the background, respectively; the threshold values can be fine-tuned to reach the desired dimension. The union of the bands selected from the target and the non-target is used as the optimal set of bands, and the dimensionality of the original spectrum is reduced by extracting the data of the optimal bands for analysis (a sketch of this procedure is given after Eq. (4)).

To facilitate comparison, we use a simple and effective classification method, the minimum distance classifier, to obtain an intuitive hyperspectral dimensionality reduction result. The average spectra of the target and the background obtained from the standard samples are taken as the centers of the two classes in the feature space. By computing the distance between each testing point and the centers of the different classes, every testing point is assigned to the nearest class. We use the Euclidean distance as the distance criterion function. Let xa = (xa1, xa2, ..., xan)^T and xt = (xt1, xt2, ..., xtn)^T be the standard vector and testing vector for a single band, respectively. Then the Euclidean distance between the two is:

$$d(x_a, x_t) = \|x_a - x_t\| = \sqrt{\sum_{k=1}^{n} (x_{ak} - x_{tk})^2}, \qquad (4)$$
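Putting the third-derivative band selection and the minimum distance rule together, here is a minimal NumPy sketch under stated assumptions: the function names are ours, the near-zero threshold eps must be fine-tuned as described above, and bands are assumed evenly spaced.

```python
import numpy as np

def select_bands(mean_spectrum, dlam, eps):
    """Indices of bands where the third-order derivative (Eq. (3)) is near zero.

    eps is the near-zero threshold, fine-tuned to reach the desired number of bands.
    """
    d3 = np.diff(mean_spectrum, 3) / dlam**3
    return np.flatnonzero(np.abs(d3) < eps)

def classify(pixels, target_mean, bg_mean, bands):
    """Minimum-distance rule of Eq. (4) on the selected bands.

    pixels: (N, B) reflectance spectra; returns a boolean target mask.
    """
    d_t = np.linalg.norm(pixels[:, bands] - target_mean[bands], axis=1)
    d_b = np.linalg.norm(pixels[:, bands] - bg_mean[bands], axis=1)
    return d_t < d_b

# The optimal bands are the union of those selected from the target and
# background average spectra:
# bands = np.union1d(select_bands(target_mean, dlam, eps),
#                    select_bands(bg_mean, dlam, eps))
```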

The Euclidean distances to the different classes are compared during classification: a point is classified as a target if its Euclidean distance to the standard target sample is smaller; otherwise, it is not a target.

To demonstrate the effect of the proposed dimensionality reduction method, we imported the original hyperspectral image sample (HIS) and the spectral data obtained after dimensionality reduction by PCA, LDA, KPCA, and the proposed method into the classifier for comparison. The standard samples (often called training samples in other studies) and the test samples came from different hyperspectral image cubes captured within a certain interval of time. The initial spectral dimension was 250, and we used the overall accuracy (OA) [29] and kappa coefficient [30] to evaluate the recognition results.

Fig. 1 shows the hyperspectral images at the 677-nm band, with targets marked in red and non-targets or background marked in green. Fig. 1(a) presents the selected standard samples of 1019 points, i.e., 526 target points and 493 background points. The test samples of 1393 points are shown in Fig. 1(b), i.e., 455 target points and 938 background points. The sample size was not a concern here because the subsequent classification was based on the average spectrum of the samples. Two bands were selected by the proposed method, i.e., the 33rd (390 nm) and 241st (974 nm) bands.

The recognition results obtained by the different dimensionality reduction methods are shown in Table 1. The OA using LDA and KPCA was 7.10% and 15.43% lower, respectively, than with HIS, whereas with PCA and the proposed band selection method the OA was slightly higher than with HIS. The advantage of PCA was apparent, as its OA was 2.44% higher than that


Fig. 1. Hyperspectral data set 1. (a) Locations of the selected standard samples. (b) Locations of the selected test samples.

Table 2 Overall accuracy (OA) and kappa coefficient for different combinations of spectral features for hyperspectral data set 2.

Bands                     250     2       2       2       2       2
Combinatorial dimension   1       1       2       3       4       5
OA (%)                    74.76   89.15   90.80   90.92   91.16   90.68
kappa                     0.339   0.679   0.740   0.738   0.738   0.731

Table 3 Overall accuracy (OA) and kappa coefficient for different combinations of spectral features for hyperspectral data set 3.

Bands                     250     2       2       2       2       2
Combinatorial dimension   1       1       2       3       4       5
OA (%)                    69.90   76.29   76.29   76.91   83.51   83.51
kappa                     0.429   0.540   0.540   0.561   0.679   0.679

of the proposed method. These results show that the proposed band selection method can achieve recognition accuracy similar to the original data but with much lower dimensionality. However, the results were still unsatisfactory, and the spectral features needed to be optimized further.
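For reference, the two evaluation measures used throughout can be computed from the true and predicted labels of the classification map; a minimal sketch (function name ours):

```python
import numpy as np

def oa_and_kappa(y_true, y_pred):
    """Overall accuracy [29] and kappa coefficient [30] of a classification map."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    oa = np.mean(y_true == y_pred)                 # fraction correctly classified
    # chance agreement computed from the marginals of the confusion matrix
    pe = sum(np.mean(y_true == c) * np.mean(y_pred == c) for c in np.unique(y_true))
    kappa = (oa - pe) / (1 - pe)
    return oa, kappa
```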

3.2. Derivative spectral feature optimization

Spectral feature optimization aims to retain as much intrinsic information as possible in relatively few data dimensions. In two-dimensional spectrum analysis, the derivative spectrum can strengthen the spectral characteristics and eliminate systematic errors in the spectral data. The influences of atmospheric scattering, background noise, and spectral band overlap are also weakened, which helps to improve the target recognition accuracy.

The derivative spectra of different orders jointly reflect the spectral characteristics of the objects of interest. In extensive real experiments, we found that the first-order to third-order derivative spectra achieved higher recognition accuracy when combined with the original hyperspectral data, whereas adding higher order (fourth-order or above) derivative spectra had little effect on the results; the experimental data are shown in Tables 2 and 3. Thus, in the proposed method, we combine the first-order to third-order derivative spectra with the original hyperspectral data to form a combinatorial spectrum, which optimizes the spectral features in an effective manner.

The raw spectral data are the reflectance of the target and the background at different wavelengths, so the original reflectance values lie between 0 and 1, but the derivative spectra of different orders are relatively much smaller. In order to amplify the role of the derivative spectra in the overall combinatorial spectrum, normalization is applied to the derivative spectra to increase their weights in the combinatorial spectrum.


Fig. 2. Hyperspectral data set 2. (a) Locations of the selected standard samples. (b) Locations of the selected test samples.

Assuming that the dimension of the spectral data is s, the combinatorial spectrum is defined as follows:

$$X = \begin{bmatrix} f(\lambda_1) & \dfrac{f'(\lambda_1)}{\max f'(\lambda_i)} & \dfrac{f''(\lambda_1)}{\max f''(\lambda_i)} & \dfrac{f'''(\lambda_1)}{\max f'''(\lambda_i)} \\ f(\lambda_2) & \dfrac{f'(\lambda_2)}{\max f'(\lambda_i)} & \dfrac{f''(\lambda_2)}{\max f''(\lambda_i)} & \dfrac{f'''(\lambda_2)}{\max f'''(\lambda_i)} \\ \vdots & \vdots & \vdots & \vdots \\ f(\lambda_s) & \dfrac{f'(\lambda_s)}{\max f'(\lambda_i)} & \dfrac{f''(\lambda_s)}{\max f''(\lambda_i)} & \dfrac{f'''(\lambda_s)}{\max f'''(\lambda_i)} \end{bmatrix}, \qquad (5)$$
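A minimal NumPy sketch of the construction in Eq. (5), together with the discrimination parameter that Eq. (6) below formalizes. The paper does not state whether the normalization uses the signed maximum or the maximum magnitude, so the absolute value here is an assumption, and all function names are ours.

```python
import numpy as np

def combinatorial_spectrum(f, dlam):
    """Per-band 4-D feature vectors of Eq. (5): reflectance plus the
    normalized first- to third-order derivative spectra."""
    d1, d2, d3 = (np.diff(f, n) / dlam**n for n in (1, 2, 3))
    s = len(f) - 3                         # bands where all three derivatives exist
    return np.column_stack([
        f[:s],
        d1[:s] / np.max(np.abs(d1)),       # assumed: normalize by maximum magnitude
        d2[:s] / np.max(np.abs(d2)),
        d3[:s] / np.max(np.abs(d3)),
    ])

def discrimination(Xa, Xt):
    """Eq. (6) below: sum over the selected bands of the 4-D Euclidean
    distances between standard (Xa) and testing (Xt) vectors, each (s, 4)."""
    return np.sum(np.linalg.norm(Xa - Xt, axis=1))

# A pixel is labeled as target when discrimination(X_target, X_test) is
# smaller than discrimination(X_background, X_test).
```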

Using this method, single-band information is expanded from one dimension to four, containing both the original data and the shape features of the spectral curve. We thus no longer focus on a single point but rather on the integrity of the spectral data. The spectral characteristics of a single band are optimized using the combinatorial spectrum, which allows a few bands to represent more of the intrinsic information of the target of interest.

In this study, the combinatorial spectrum has a single-band vector of dimension n = 4. For each class, the sum of the Euclidean distances between the standard vector and the testing vector over the selected bands is taken as the discrimination parameter, which determines whether the sample under test belongs to the target class. If the number of selected bands is s, the discrimination parameter is:

$$\sum_{i=1}^{s} d(x_{ai}, x_{ti}) = \sum_{i=1}^{s} \sqrt{\sum_{k=1}^{4} (x_{aik} - x_{tik})^2}, \qquad (6)$$

where xai is the four-dimensional vector at band i in the standard combinatorial spectrum, comprising the original reflectance value and the three derivative spectrum dimensions; xti is the corresponding four-dimensional test combinatorial vector at band i; and d(xai, xti) is the Euclidean distance between the two vectors at band i. Finally, the discrimination parameters of the different classes are compared for classification: a point is classified as a target if its distance to the standard target sample is smaller; otherwise, it is not a target.

We compared the results obtained using different combinations of spectral features for target identification on two hyperspectral data sets, shown in Figs. 2 and 3, with targets marked in red and non-target or background locations marked in green. In hyperspectral data set 2, we selected 842 random standard samples, comprising 212 target points and 630 background points, as shown in Fig. 2(a). The test samples of 848 points are shown in Fig. 2(b), i.e., 218 target points and 630 background points. Two bands were selected by the proposed band selection method, i.e., the 92nd (556 nm) and 220th (915 nm) bands. For hyperspectral data set 3, the total number of standard samples was 465 because of the greater distance between the target and the detector. There were 269 target points and 196 background points in the standard samples, as shown in Fig. 3(a), and 289 target points and 196 background points in the test samples in Fig. 3(b). The 35th (396 nm) and the 247th (991 nm) bands were selected as the optimal bands.


Fig. 3. Hyperspectral data set 3. (a) Locations of the selected standard samples. (b) Locations of the selected test samples.

Next, the original spectrum and different combinations of the original and derivative spectral features were imported into the minimum distance classifier; the target recognition results are shown in Tables 2 and 3. When the combinatorial dimension was one, the recognition vector contained only the original spectrum. When the combinatorial dimension was two, it contained the original spectrum and the first derivative feature. Similarly, the original spectrum, the first-order derivative, and the second-order derivative features were all included when the combinatorial dimension was three, and so on.

The tables show that the OA for hyperspectral data set 2 reached 91.16% when the dimension of the combinatorial spectrum was four, an improvement of 16.4% over the original spectrum. The OA for hyperspectral data set 3 was 83.51% under the same conditions, 13.61% higher than the original. When the combinatorial dimension was five, the recognition accuracy of the two sets did not improve any further, indicating that the fourth-order derivative feature played a relatively small role in target recognition and could safely be disregarded. The OA of the original data was increased both by the band selection process and by optimizing the spectral features, and there were no further improvements after adding higher order derivative features, thereby verifying the optimization effect of the combinatorial spectrum proposed in this study.

4. Experimental results

4.1. Experimental setup

To verify the proposed method, we conducted several experiments at a port in Tianjin. The hyperspectral imager used in the measurements was a Hyperspec VNIR-N (Headwall), which is based on grating dispersion and has a slit width of 25 μm; it was equipped with a CCD detector (Falcon285, Raptor). Real sea vessels were used as the experimental objects, and the hyperspectral image cubes were collected at different distances and times, with 500 × 502 pixels and 250 spectral bands in the 300–1000 nm region. During classification, the spectral data were divided into two classes: target and non-target (background).

4.2. Experimental results

We imported the combinatorial derivative spectral features and the features extracted from the original hyperspectral image by PCA [12], LDA [15], and KPCA [16] into the classifier for comparison. Different data cubes collected at specific time intervals were used to obtain the standard and testing samples, and the Matlab Toolbox for Dimensionality Reduction was used to apply the three classical methods. Under the experimental conditions described above, we collected and processed three hyperspectral data sets of vessels at long range. In Figs. 4–6, image (a) shows the selected standard samples for each data set (targets marked in red, non-target or background locations in green), image (b) shows the test samples, and image (c) shows the average spectral curves of the vessels and the background, with the positions of eight optimal bands marked in each sample. From hyperspectral data set 4, we randomly selected 387 standard samples with 191 target points and 196 background points.
The total number of test samples was 392, i.e., 196 target points and 196 background points.


Fig. 4. Hyperspectral data set 4. (a) Locations of the selected standard samples. (b) Locations of the selected test samples. (c) Average spectral curves.

Fig. 5. Hyperspectral data set 5. (a) Locations of the selected standard samples. (b) Locations of the selected test samples. (c) Average spectral curves.

Fig. 6. Hyperspectral data set 6. (a) Locations of the selected standard samples. (b) Locations of the selected test samples. (c) Average spectral curves.

Table 4 Band selection for hyperspectral data set 4.

Number of Bands   Selected Bands
2                 178, 238
4                 31, 178, 215, 238
6                 9, 31, 54, 178, 215, 238
8                 9, 31, 51, 54, 92, 178, 215, 238
10                9, 31, 51, 54, 76, 92, 178, 179, 215, 238
12                9, 31, 51, 54, 76, 92, 130, 178, 179, 215, 220, 238
14                9, 15, 31, 51, 54, 76, 92, 130, 178, 179, 215, 220, 238, 246
16                9, 11, 15, 31, 51, 54, 76, 92, 130, 178, 179, 183, 215, 220, 238, 246

The combinatorial derivative spectral features and the features extracted by PCA, LDA, and KPCA were imported into the minimum distance classifier. Table 4 shows the bands selected based on derivative spectrum theory, and Table 5 shows the results obtained at different data dimensions, including the overall classification accuracy and kappa coefficient. In this comparison, the dimension of the proposed method is the number of selected bands multiplied by four, because the combinatorial spectrum has a four-dimensional vector per band. Table 5 shows that the proposed method outperformed the other three methods from eight to 64 dimensions, with the most obvious advantages at lower dimensions.


Table 5 Overall accuracy (OA) and kappa coefficient for different feature optimization methods with hyperspectral data set 4.

Dimensions   PCA OA (%)   PCA kappa   LDA OA (%)   LDA kappa   KPCA OA (%)   KPCA kappa   Proposed OA (%)   Proposed kappa
8            59.95        0.199       56.12        0.122       34.95         0.301        91.33             0.829
16           59.95        0.199       53.83        0.077       34.69         0.306        93.37             0.867
24           59.95        0.199       61.73        0.235       28.83         0.424        96.94             0.939
32           59.95        0.199       56.63        0.133       28.32         0.434        84.69             0.694
40           59.95        0.199       40.82        0.184       27.30         0.454        81.10             0.602
48           58.93        0.179       43.11        0.138       28.32         0.434        68.37             0.367
56           58.93        0.179       46.94        0.061       28.32         0.434        68.37             0.367
64           58.93        0.179       38.01        0.240       28.32         0.434        70.66             0.413

Table 6 Overall accuracy (OA) for different feature optimization methods with hyperspectral data set 5.

Dimensions   PCA OA (%)   LDA OA (%)   KPCA OA (%)   Proposed OA (%)
8            58.14        45.18        64.45         85.71
16           58.80        43.85        62.46         80.73
24           57.14        66.78        63.12         65.78

Table 7 Overall accuracy (OA) for different feature optimization methods with hyperspectral data set 6.

Dimensions   PCA OA (%)   LDA OA (%)   KPCA OA (%)   Proposed OA (%)
8            67.20        69.33        47.73         73.60
16           67.20        69.33        46.93         69.60
24           66.67        69.33        46.67         69.87

When the data dimension was only eight, the proposed method achieved an OA of 91.33%, more than 30% higher than PCA and LDA.

In hyperspectral data set 5, the total number of standard samples was 572, i.e., 248 target points and 324 background points, and we selected 301 points as test samples, i.e., 105 target points and 196 background points. For hyperspectral data set 6, we selected 532 standard samples, i.e., 178 target points and 357 background points; the test samples comprised 375 points, i.e., 115 target points and 260 background points. We report only the recognition results obtained at lower dimensions, shown in Tables 6 and 7. For hyperspectral data set 5, the proposed method had obvious advantages at relatively low data dimensions, with an OA more than 20% higher than the other three approaches. For hyperspectral data set 6, it was difficult to accurately identify one of the two vessels located far out at sea. The proposed method, PCA, and LDA performed approximately the same, with the proposed method slightly better, while the OA of KPCA was about 20% lower than the other methods, indicating comparatively low potential.

In summary, the target recognition performance of the proposed method was better than that of the other three methods across the different scenarios: it consistently obtained significantly higher accuracy at relatively low data dimensions, and it shows good potential for recognizing other targets at long distances.

5. Conclusion

In this study, we combined band selection based on the zero points of the third-order derivative spectrum with combinatorial derivative feature optimization for hyperspectral target recognition. The main aim of the method is to make full use of the combination of multiple derivative spectral features to retain as much intrinsic information as possible in relatively few bands. To assess its effectiveness in target recognition, we compared it with three other feature optimization approaches: PCA, LDA, and KPCA. Our experiments showed that the proposed method achieved good optimization and outperformed the other three methods, especially at relatively low data dimensions. In addition, the comprehensive use of multiple spectral features can reduce the demands on the classifier while achieving higher accuracy.

Acknowledgement

This work was supported by the National Natural Science Foundation of China (Grant No. 51675033).


References

[1] S. Mahesh, D.S. Jayas, J. Paliwal, N.D.G. White, Hyperspectral imaging to classify and monitor quality of agricultural materials, J. Stored Prod. Res. 61 (2015) 17–26.
[2] A. Plaza, J.A. Benediktsson, J.W. Boardman, J. Brazile, L. Bruzzone, G. Camps-Valls, J. Chanussot, M. Fauvel, P. Gamba, A. Gualtieri, Recent advances in techniques for hyperspectral image processing, Remote Sens. Environ. 113 (2009) 110–122.
[3] K.P. Judd, J.M. Nichols, J.G. Howard, J.R. Waterman, K.M. Vilardebo, Passive shortwave infrared broadband and hyperspectral imaging in a maritime environment, Opt. Eng. 51 (2012) 419–423.
[4] S. Sabbah, R. Harig, P. Rusch, J. Eichmann, A. Keens, J.H. Gerhard, Remote sensing of gases by hyperspectral imaging: system performance and measurements, Opt. Eng. 51 (2012) 1371–1379.
[5] G. Camps-Valls, D. Tuia, L. Bruzzone, J.A. Benediktsson, Advances in hyperspectral image classification, IEEE Signal Process. Mag. 31 (2014) 45–54.
[6] H. Lv, X. Lu, Y. Yuan, Data-dependent semi-supervised hyperspectral image classification, 2013 IEEE China Summit & International Conference on Signal and Information Processing (2013) 664–668.
[7] S. Patra, P. Modi, L. Bruzzone, Hyperspectral band selection based on rough set, IEEE Trans. Geosci. Remote Sens. 53 (2015) 5495–5503.
[8] J. Zabalza, J. Ren, Z. Wang, S. Marshall, J. Wang, Singular spectrum analysis for effective feature extraction in hyperspectral imaging, IEEE Geosci. Remote Sens. Lett. 11 (2014) 1886–1890.
[9] L. Lin, S. Li, Y. Zhu, L. Xu, A novel approach to band selection for hyperspectral image classification, Chinese Conference on Pattern Recognition (2009) 1–6.
[10] H. Su, Y. Sheng, P. Du, A new band selection algorithm for hyperspectral data based on fractal dimension, Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci. (2008) 279–283.
[11] X. Miao, P. Gong, S. Swope, R.L. Pu, R. Carruthers, G.L. Anderson, Detection of yellow starthistle through band selection and feature extraction from hyperspectral imagery, Photogramm. Eng. Remote Sens. 73 (2007) 1005–1015.
[12] C. Rodarmel, J. Shan, Principal component analysis for hyperspectral image classification, Surv. Land Inf. Sci. 62 (2002) 115.
[13] J.A. Benediktsson, J.A. Palmason, J.R. Sveinsson, Classification of hyperspectral data from urban areas based on extended morphological profiles, IEEE Trans. Geosci. Remote Sens. 43 (2005) 480–491.
[14] X. Niyogi, Locality preserving projections, Neural Inf. Process. Syst. 45 (2004) 186–197.
[15] M. Loog, D. De Ridder, Local discriminant analysis, Int. Conf. Pattern Recogn. (2006) 328–331.
[16] M. Fauvel, J. Chanussot, J.A. Benediktsson, Kernel principal component analysis for the classification of hyperspectral remote sensing data over urban areas, EURASIP J. Adv. Signal Process. 2009 (2009) 1–14.
[17] L.M. Bruce, J. Li, Wavelets for computationally efficient hyperspectral derivative analysis, IEEE Trans. Geosci. Remote Sens. 39 (2001) 1540–1546.
[18] C.A. Wessman, J.D. Aber, D.L. Peterson, An evaluation of imaging spectrometry for estimating forest canopy chemistry, Int. J. Remote Sens. 10 (1989) 1293–1316.
[19] T.H. Demetriades-Shah, M.D. Steven, J.A. Clark, High resolution derivative spectra in remote sensing, Remote Sens. Environ. 33 (1990) 55–64.
[20] P. Gong, R. Pu, B. Yu, Conifer species recognition: an exploratory analysis of in situ hyperspectral data, Remote Sens. Environ. 62 (1997) 189–200.
[21] F. Tsai, W. Philpot, Derivative analysis of hyperspectral data for detecting spectral features, Remote Sens. A Sci. Vision Sustain. Dev. 66 (1998) 41–45.
[22] G.A. Blackburn, Quantifying chlorophylls and caroteniods at leaf and canopy scales: an evaluation of some hyperspectral approaches, Remote Sens. Environ. 66 (1998) 273–285.
[23] C.-I. Chang, S. Chakravarty, H.-M. Chen, Y.-C. Ouyang, Spectral derivative feature coding for hyperspectral signature analysis, Pattern Recogn. 42 (2009) 395–408.
[24] Y. Li, A new bands selection algorithm for hyperspectral image using hyperspectral derivative on Clifford manifold, Inf. Technol. J. 11 (2012) 904–909.
[25] J. Bao, M. Chi, J.A. Benediktsson, Spectral derivative features for classification of hyperspectral remote sensing images: experimental evaluation, IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 6 (2013) 594–601.
[26] X. Wang, J. Zhang, G. Ren, Y. Ma, Yellow River Estuary typical wetlands classification based on hyperspectral derivative transformation, Selected Proceedings of the Photoelectronic Technology Committee Conferences Held July–December 2013 (2014) 914210.
[27] J. Wang, L. Jiao, H. Liu, S. Yang, F. Liu, Hyperspectral image classification by spatial-spectral derivative-aided kernel joint sparse representation, IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 8 (2015) 2485–2500.
[28] F. Tsai, W. Philpot, Derivative analysis of hyperspectral data, Remote Sens. Environ. 66 (1998) 41–51.
[29] C. Liu, P. Frazier, L. Kumar, Comparative assessment of the measures of thematic classification accuracy, Remote Sens. Environ. 107 (2007) 606–616.
[30] R.E. McGrath, Kappa Coefficient, Wiley StatsRef: Statistics Reference Online, 1982.