G Model
ARTICLE IN PRESS
ASOC 2747 1–13
Applied Soft Computing xxx (2015) xxx–xxx
Contents lists available at ScienceDirect
Applied Soft Computing journal homepage: www.elsevier.com/locate/asoc
Using an unsupervised approach of Probabilistic Neural Network (PNN) for land use classification from multitemporal satellite images
Jawad Iounousse ∗ , Salah Er-Raki, Ahmed El Motassadeq, Hassan Chehouani LP2M2E, Faculty of Sciences and Techniques, Cadi Ayyad University, Marrakesh, Morocco
Article info

Article history: Received 22 July 2013; Received in revised form 6 November 2014; Accepted 21 January 2015; Available online xxx

Keywords: Unsupervised classification; Probabilistic Neural Network; Ward’s method; Cluster validity index; Land use; LANDSAT and SPOT images; NDVI

Abstract

The aim of this work is to develop an unsupervised approach based on the Probabilistic Neural Network (PNN) for land use classification. A time series of high spatial resolution images acquired by LANDSAT and SPOT was first used to generate profiles of the Normalized Difference Vegetation Index (NDVI), which then served as input to the classification procedure. The proposed method implements a cluster validity technique in the PNN, using Ward’s method to obtain the clusters. The procedure is completely automatic, requires no parameter adjustment, trains instantaneously, produces good estimates of the number of clusters and offers a new way of using the PNN as an unsupervised classifier. Compared with the real land use, the approach gives an accurate classification, with an error of about 3.44%, and it performs better than the usual unsupervised classification methods (fuzzy c-means (FCM) and K-means). © 2015 Published by Elsevier B.V.

1. Introduction
Classification is one of the most useful tasks in human activity. It aims to identify groups of similar objects according to a homogeneity criterion and therefore helps to discover the distribution of patterns and interesting correlations in large data sets. It plays an important role in solving many problems in pattern recognition [1], imaging, color image segmentation [2] and data mining [3], and in domains such as medicine [4], biology [5], marketing [7], energy [8] and remote sensing, especially land use [9]. There are two main approaches to classification: supervised and unsupervised. In the first, the user defines the classes, which can be conceived as a finite set. The main task is to search for the patterns and then construct their corresponding mathematical models. The consistency of those models is evaluated against the actual data. The most widely used supervised classification methods are maximum likelihood classification (MLC) [12], the parallelepiped method (PP) [13], fuzzy sets [14], neural networks (NNs) [19,20], support vector machines (SVM) [21,22] and computational intelligence [23]. On the other hand, the basic task of unsupervised
∗ Corresponding author. Tel.: +212 667660347. E-mail addresses:
[email protected] (J. Iounousse),
[email protected] (S. Er-Raki),
[email protected] (A. El Motassadeq),
[email protected] (H. Chehouani).
learning methods is to develop classification labels automatically. Unsupervised algorithms look for similarity between pieces of data in order to determine whether they can be characterized as forming groups, called clusters. In remote sensing, for example, the commonly used unsupervised methods are split-and-merge [24], ISODATA [25], K-means, fuzzy c-means (FCM) [26,27], NN-based methods [28,29] and scale space techniques [30]. Zhang [31] reported that classification is the most investigated topic in NNs. Furthermore, NNs have been noted to be a promising alternative to various conventional classification methods. The advantages of NNs stem from the following theoretical aspects. First, NNs are self-adaptive methods: they can adjust themselves to the data without any explicit specification of a functional or distributional form for the underlying structure. The user can adjust the learning parameters by setting the initial weights of the network and selecting the number of hidden layers and the number of nodes in each layer. Second, NNs can approximate any function with arbitrary accuracy [32–34]. Since any classification procedure seeks a functional relationship between group membership and the attributes of the object, having different networks with a variety of methods and multivariate training data formats makes it easier to identify this underlying function accurately. Finally, NNs are able to estimate posterior probabilities using the Bayes rule. These probabilities provide the basis for establishing the classification rule and performing statistical analysis [35]. For classification tasks, the Probabilistic Neural Network (PNN) is one of the most widely used NNs. It is
http://dx.doi.org/10.1016/j.asoc.2015.01.037 1568-4946/© 2015 Published by Elsevier B.V.
Please cite this article in press as: J. Iounousse, et al., Using an unsupervised approach of Probabilistic Neural Network (PNN) for land use classification from multitemporal satellite images, Appl. Soft Comput. J. (2015), http://dx.doi.org/10.1016/j.asoc.2015.01.037
a special form of radial basis function NN (RBFNN). In addition, it is considered an implementation of the Bayes optimal decision rule in NN form, based on nearest neighbor classifiers [36,37]. Several recent studies [4,8,38–46] used PNN for classification and showed that the method provides satisfactory results if the initial target classes are defined correctly. Finding the basis function centers (classes) and their appropriate number is therefore an important step toward a suitable classification. Tsekouras and Tsimikas [47] cite several reasons for this. First, the activation of each hidden node depends exclusively on the distance between the center and the current input vector. Second, in the neuron construction, the distribution of the neurons’ receptive fields across the feature space is strongly linked to the locations of the respective centers. Third, the underlying data structure is revealed by these centers, which directly affect the output of the following neurons. Fourth, the estimation of the widths depends directly on the locations of the centers. The classification performance depends heavily on selecting appropriate spread values: spread values that are too small give very spiky Probability Density Functions (PDFs), whereas spread values that are too large smooth out the details. The idea of using clustering algorithms in RBFNN training and design has been addressed by several authors [47–55]. Pedrycz [50] applied conditional fuzzy clustering (a modified FCM) in the input space. This method embeds the output data into the input mechanism, using the calculated cluster weights as feedback information. Uykan et al. [55] employed the K-means model and showed that the main impact of input–output clustering is the minimization of an upper bound of the network’s mean square error. Staiano et al. 
[54] used fuzzy clustering to generate the clusters in the input space and, for each cluster, established an input–output relationship through a local linear regression model. Tsekouras and Tsimikas [47] proposed an algorithm to select the optimal values for the basis function centers of an RBFNN. This algorithm uses the output space to adjust the input partition by combining input–output fuzzy clustering and particle swarm optimization. Based on the state of the art cited above, the major challenge in clustering appears to be determining the optimal number of clusters that best fits a data set. In most clustering methods, experimental evaluations on 2D/3D data sets are used to check the validity of the results visually (i.e. how well the clustering algorithm discovers the clusters of the data set). For large multidimensional data sets (more than three dimensions), such as multidimensional remote sensing images, effective visualization of the data set is difficult. Moreover, perceiving clusters with the available visualization tools is a difficult task for humans who are not accustomed to higher dimensional spaces and complex data sets. To overcome this problem, many techniques based on cluster analysis have been developed to group either the data or the variables into clusters. Several families of methods have been described for this purpose, such as partitioning methods and hierarchical clustering. One of the most widespread hierarchical clustering methods is Ward’s method [56–64]. According to Hands and Everitt [64], this method achieves better results than other hierarchical methods (single linkage, complete linkage, median, average linkage, etc.), especially when the group proportions are approximately equal. In this paper, we design an unsupervised approach for land classification. It is based on a different way of implementing clustering in the PNN (RBFNN design). Ward’s method [56] is used in training the input targets. 
A cluster validity function, generally applied to fuzzy clustering [65–67], is developed in the hidden layer output space of the PNN, and the number of classes is varied to find the optimal number of clusters. The proposed model is first tested on Fisher’s Iris data set [75,76] and on synthetic grayscale and RGB digital images. The consistency of the approach is assessed through a comparison with FCM clustering using the concept of cluster analysis. The approach is then applied to time series of remote sensing images acquired
Fig. 1. Overview of the study area (false color composition).
by LANDSAT and SPOT to build a land use map. Finally, the results are validated against the real land use and compared with the results of the usual classification methods (FCM and K-means).
2. Study area and data description
The region of interest is an irrigated area located in the Haouz plain in the center of the Tensift basin (Central Morocco), 40 km east of Marrakech. The climate is semi-arid Mediterranean, with an average annual precipitation of about 250 mm, of which 70% falls during winter and spring. The area covers about 2800 ha and is mostly flat. It was studied extensively during the 2002–2003, 2003–2004 and 2005–2006 agricultural seasons [69–72]. The main land cover is cereals, mostly wheat and then barley, and a significant portion is left fallow or uncultivated (Fig. 1). More details about the study area and the climate of the region can be found in [68–72]. Vegetation development in this area shows great inter-annual and/or intra-annual heterogeneity [72], so the land cover maps require annual updating. The effort was therefore directed toward the development of land cover classification methods based on remote sensing data. A time series of images acquired by SPOT and LANDSAT was collected during the wheat growing season (November 2002–June 2003) in order to extract vegetation profiles. Because of cloudiness or uncertainty in the atmospheric corrections, only seven images were used in this study. These images, each of 122,500 pixels arranged in 350 columns and 350 rows, were radiometrically calibrated, atmospherically corrected using the reflectance of invariant objects, and transformed into NDVI maps [72]. The NDVI was derived from the red and near-infrared reflectance bands as follows:

\[ \mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{RED}}{\mathrm{NIR} + \mathrm{RED}} \tag{1} \]

where NIR and RED are the reflectances measured in the near-infrared and red bands, respectively.
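As an illustration of Eq. (1), the following minimal sketch computes NDVI pixel by pixel. The function names `ndvi` and `ndvi_map` are hypothetical, and returning 0.0 when both reflectances are zero is an assumption made here to avoid a division by zero; the paper does not say how such pixels were handled.

```python
def ndvi(nir, red):
    """Eq. (1) for a single pixel.

    Assumption: a pixel with NIR + RED == 0 is mapped to 0.0.
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


def ndvi_map(nir_band, red_band):
    """Apply Eq. (1) to two equally sized bands given as lists of rows."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]
```

Dense vegetation reflects strongly in the near infrared and weakly in the red, so its NDVI is close to 1, while bare soil gives values near 0.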
Fig. 2. Architecture of the PNN.
Fig. 3. Flowchart of the automation procedure for PNN.
3. Description of Probabilistic Neural Network

Introduced in 1990 by Specht [36,37], Probabilistic Neural Networks (PNNs) are based on the use of a nonparametric estimator (the Parzen window) to obtain multivariate probability density estimates. In contrast to classical RBFs, PNNs are used only for classification, and they compute the conditional class probabilities p(class k | x) for each of the C classes. A typical PNN consists of an input layer, a pattern layer (hidden layer) and a competitive output layer. The structure of a PNN is shown in Fig. 2. Like RBFs, PNNs receive D-dimensional feature vectors \(x = (x_1, \ldots, x_D)\) as input. The input vector is applied to the input neurons \(x_i\) (1 ≤ i ≤ D) and passed to the neurons in the hidden layer. Here, the hidden nodes are collected into groups, one group for each of the C classes. Each hidden node in the group for class k (1 ≤ k ≤ C) corresponds to a Gaussian function, called the Probability Density Function (PDF), centered on its associated feature vector in the kth class (there is one Gaussian for each exemplar feature vector). The PDF for a single sample \(x_k\) is written as follows:

\[ f_k(x) = \frac{1}{(2\pi)^{D/2}\,\sigma^{D}}\; e^{-\|x - x_k\|^{2}/(2\sigma^{2})} \tag{2} \]

where \(\sigma\) is the smoothing parameter of the Gaussians, D is the dimension of the input vector x and \(\|x - x_k\| = \sqrt{\sum_i (x_i - x_{k,i})^2}\) is the Euclidean distance between the vectors x and \(x_k\). All of the Gaussians in a class group feed their functional values to the same output layer node for that class, so there are C output nodes. The kth output node sums these multivariate densities to produce a vector of probabilities representing the average of the PDFs for C samples:

\[ p_k(x) = \frac{1}{(2\pi)^{D/2}\,\sigma^{D}\,C} \sum_{k=1}^{C} e^{-\|x - x_k\|^{2}/(2\sigma^{2})} \tag{3} \]

Finally, a competitive transfer function gives 1 for the input class that has the maximum joint PDF and 0 for all other classes. An unknown input x belongs to class k if \(p_k(x) > p_{k'}(x)\) for all \(k' \neq k\). Therefore, the neuron in the decision layer determines the class membership of the pattern x by Eq. (4), in accordance with Bayes’s decision rule under the following assumption:

\[ c(x) = \arg\max_{k}\,\{p_k(x)\}, \quad k = 1, 2, \ldots, C \tag{4} \]

where c(x) is the estimated class of the pattern x. PNN is commonly used as a supervised classifier in various applications, but it is less exploited in remote sensing. Foody [73] showed that a PNN mapped land cover for an agricultural site more accurately than other networks such as the Multilayer Perceptron (MLP) and RBFNN. Furthermore, the accuracy of the PNN classification could be increased by incorporating prior probabilities of class membership. However, the accuracy of each classification could also be degraded by the presence of an untrained class [73]. Thus, it is essential to choose the appropriate classes.

4. Automation of PNN classification

The PNN algorithm requires the modes (centers of the Gaussian functions) to be set initially, and these are not easy to find. The modes and their number must be chosen correctly. An evaluation methodology is therefore required to determine the optimal number of clusters C*. This methodology is usually called cluster validity. To make the PNN automatic, we used the summation of PDFs at the output of its hidden layer, which takes the form of a matrix of probabilities. This matrix makes it possible to calculate the validity index (V) as the class number C varies in a given interval [Cmin; Cmax], in order to determine the adequate number of clusters. Cmin and Cmax are respectively the minimum and maximum numbers of possible classes, fixed beforehand by the user. The optimal number of classes is obtained when V reaches its maximum value. The flowchart (Fig. 3) illustrates the developed automation
Fig. 4. Flowchart describing the functional steps of the automation procedure for PNN.
procedure for PNN. Fig. 4 describes its functional stages as summarized in the following steps:
(1) Perform a hierarchical agglomerative classification of the input data using Ward’s method to obtain the C clusters.
(2) Apply the PNN algorithm, using the C clusters found in step 1 as target inputs.
(3) Calculate V for the resulting classification. V requires the values of the probability matrix produced at the output of the PNN’s hidden layer (see Section 4.2).
(4) Return to step 1 for each further value of C. The number of classes C can be chosen in an interval [Cmin; Cmax]. Otherwise, all possible numbers of classes are taken.
(5) Select the optimal number of clusters C*, which corresponds to the maximum value of V.
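The building blocks of these steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors’ implementation: all function names are hypothetical; each class score is taken as the average Gaussian response over the members of a cluster (Eqs. (2) and (3)); and each column of the probability matrix P is rescaled to sum to 1, an assumption made here so that the index V of Eq. (7) stays within [0; 1]. The sketch also assumes C ≥ 2 and a user-supplied spread `sigma`.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def centroid(points):
    """Gravity center of a non-empty list of vectors."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def ward_distance(cluster_a, cluster_b):
    """Step 1 merge cost, Eq. (5): p_A p_B / (p_A + p_B) * d^2(g_A, g_B)."""
    pa, pb = len(cluster_a), len(cluster_b)
    return pa * pb / (pa + pb) * sq_dist(centroid(cluster_a), centroid(cluster_b))


def gaussian_pdf(x, center, sigma):
    """Eq. (2): isotropic Gaussian kernel centered on one exemplar."""
    norm = (2 * math.pi) ** (len(x) / 2) * sigma ** len(x)
    return math.exp(-sq_dist(x, center) / (2 * sigma ** 2)) / norm


def membership_matrix(samples, clusters, sigma):
    """Step 2: C x N matrix P of class scores, one column per sample.

    Assumption: columns are normalized to sum to 1.
    """
    columns = []
    for x in samples:
        scores = [
            sum(gaussian_pdf(x, e, sigma) for e in cl) / len(cl)
            for cl in clusters
        ]
        total = sum(scores)
        # Fall back to uniform memberships if every kernel underflows to 0.
        columns.append(
            [s / total if total > 0 else 1 / len(clusters) for s in scores]
        )
    return [list(row) for row in zip(*columns)]


def validity_index(P):
    """Step 3, Eq. (7): V = (C * sum_j max_k p_kj - N) / (N * (C - 1))."""
    C, N = len(P), len(P[0])
    peak_sum = sum(max(P[k][j] for k in range(C)) for j in range(N))
    return (C * peak_sum - N) / (N * (C - 1))


def select_optimal_c(v_by_c):
    """Step 5: candidate number of classes with the largest V."""
    return max(v_by_c, key=v_by_c.get)
```

A caller would run steps 1 to 3 for each C in [Cmin; Cmax], store the resulting values in a dictionary such as `{C: V}` and pass it to `select_optimal_c`.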
4.1. Ward’s method for defining the centers of Gaussian functions
In statistics, Ward’s method [56] is a criterion applied in hierarchical agglomerative clustering. The method produces a sequence of partitions into progressively less detailed classes by successively merging groups. The idea is to build a dendrogram, a tree of the data that successively merges similar groups of points. The dendrogram is built bottom-up: the two closest elements are merged first to form a node, which leaves (N − 1) objects, and the process is iterated until a single group remains. The general pseudocode of hierarchical agglomerative clustering is written as follows:
(1) Begin with N clusters, each containing one object, and number the clusters 1 through N.
(2) Compute the between-cluster distance dist(A, B) as the between-object distance of the two objects in A and B respectively, with A, B = 1, 2, ..., N. Let the square matrix D = dist(A, B). If the objects are represented by vectors, use the Euclidean distance.
(3) Find the most similar pair of clusters A and B, such that the distance dist(A, B) is minimal among all the pairwise distances.
(4) Merge A and B into a new cluster C and compute the between-cluster distance dist(C, k) for every existing cluster k ≠ A, B. Once the distances are obtained, delete the rows and columns corresponding to the old clusters A and B in the D matrix, since A and B no longer exist. Then add a new row and column in D corresponding to cluster C.
(5) Repeat steps 3 and 4 a total of N − 1 times, until only one cluster is left.
Ward’s method is distinct from other methods because it uses an analysis of variance approach to evaluate the distances between clusters, and it is therefore very efficient. At each stage, the Ward objective is to find the two clusters whose merger gives the minimum increase in the total within-group error sum of squares (or in the distances between the centroids of the merged clusters). 
The Ward distance between two classes is the squared distance between their centroids, weighted by the sizes of the two clusters. It is defined as follows:

\[ \mathrm{dist}(A, B) = \frac{p_A\, p_B}{p_A + p_B}\; d^{2}(g_A, g_B) \tag{5} \]

where \(g_A\) and \(g_B\) are the gravity centers of classes A and B, with weights \(p_A\) and \(p_B\). Because Ward’s method minimizes the sum of within-group sums of squares (squared error criterion), the clusters tend to be hyperspherical, i.e. spherical in multidimensional D-space, and to contain roughly equal numbers of objects if the observations
are evenly distributed through D-space. This criterion is the most accurate in hierarchical ascending clustering on Euclidean data, particularly when the elements are close. In this paper, we used Ward’s method to obtain the centers of the Gaussian functions in the hidden layer. To reduce the overlap between the centers, the widths of the radial basis functions are determined locally, using a spread equal to half the minimum distance between neighboring centers.

4.2. Proposed cluster validity index for the optimal number of modes

Cluster analysis aims at identifying groups of similar objects and therefore helps to discover interesting distributions of patterns and correlations in large data sets. Most clustering algorithms need to know the right number of classes C*. However, it is generally difficult to predict this number for an accurate separation of the data set. If it is too large, one or more good compact clusters may be broken. In contrast, if it is too small, more than one separate cluster may be merged. The problem of finding C* is usually called cluster validity. A large number of cluster validity indices are available in the literature [65–67,74]. In this paper, the proposed cluster validity function is inspired by Dave’s Modified Partition Coefficient (MPC), used for fuzzy partitions [74]. MPC is defined as:

\[ \mathrm{MPC}(C, U, N) = \frac{C \sum_{j=1}^{N} \sum_{i=1}^{C} (u_{ij})^{m} - N}{N(C - 1)} \tag{6} \]

where m is the fuzzification coefficient, N the number of vectors to be classified, C the number of classes and \(u_{ij}\) the element of the partition matrix U of size C × N representing the membership of the pattern \(x_j\) to the cluster \(C_i\). Before introducing the proposed cluster validity index V, we first use the summation of Gaussians produced by the computed clusters at the output of the PNN’s hidden layer (see Section 3). This summation yields the probability matrix \(P = [p_{kj}]_{C \times N}\), which represents the membership of the jth input vector to the kth cluster. As P takes the same form as U in Eq. (6) and the PNN’s competitive function selects the maximum of these probabilities, V is given by the following equation:

\[ V(C, P, N) = \frac{C \sum_{j=1}^{N} \max_{1 \le k \le C}(p_{kj}) - N}{N(C - 1)} \tag{7} \]

where \(P = [p_{kj}]_{C \times N}\) is the membership matrix at the output of the PNN’s hidden layer, whose jth column holds the probabilities of the jth data input for the C clusters, and max(P) is the maximum value of P associated with each input. In other words, max(P) identifies the cluster closest to the input. The values of V range in [0; 1]. As C varies, the maximum of the proposed index corresponds to the optimal distribution of clusters and gives the best clustering performance for the data set.

4.3. Tests and comparison

We carried out tests on different types of data. We started with the well-known Fisher’s Iris dataset, then tested the method on the simple case of a synthetic grayscale image and finally on digital RGB images. All results are compared with those of the FCM clustering algorithm using the same concept of cluster validity.

4.3.1. Test using Fisher’s Iris dataset

This dataset contains random samples of flowers belonging to three species of iris: setosa, versicolor and virginica [75,76]. For each species, fifty observations of four features (sepal length, sepal width, petal length and petal width) are recorded. We applied the proposed algorithm and FCM clustering by choosing the
Table 1
Variability of cluster validity indexes with C for Fisher’s Iris dataset.

C classes    2      3      4      5      6
V (PNN)      0.681  0.697  0.591  0.605  0.628
MPC (FCM)    0.663  0.675  0.609  0.531  0.528

Table 2
The correct samples of iris flowers detected and the accuracy.

Methods        Setosa  Versicolor  Virginica  Accuracy
Automatic PNN  50      48          36         89.33%
Automatic FCM  50      47          33         86.66%

Table 3
Variability of cluster validity indexes with C for synthetic grayscale image.

C classes    3      4      5      6      7      8
V (PNN)      0.721  0.752  0.741  0.734  0.844  0.969
MPC (FCM)    0.706  0.728  0.701  0.785  0.894  0.878

Table 4
Variability of cluster validity indexes with C for image of Moroccan tile.

C classes    3      4      5      6      7      8
V (PNN)      0.878  0.881  0.882  0.792  0.753  0.777
MPC (FCM)    0.810  0.844  0.829  0.812  0.799  0.791
number of classes C in the range [Cmin = 2; Cmax = 6]. Table 1 summarizes the results. Both methods give the optimal cluster number estimate C* = 3 for the Iris data set, but they differ in classification accuracy. Table 2 shows the correctly detected samples of the three iris species and the classification accuracy of the two algorithms, with a notable advantage for the proposed PNN classifier.
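The figures in Tables 1 and 2 can be checked with a short worked example. The sketch below is only an illustration: the constants are transcribed from the two tables, the helper names `best_c` and `accuracy_percent` are hypothetical, and accuracy is assumed to be the share of correctly detected samples out of the 150 observations (fifty per species).

```python
# Values transcribed from Table 1 (validity indexes) and Table 2 (correct samples).
V_PNN = {2: 0.681, 3: 0.697, 4: 0.591, 5: 0.605, 6: 0.628}
MPC_FCM = {2: 0.663, 3: 0.675, 4: 0.609, 5: 0.531, 6: 0.528}
CORRECT = {"Automatic PNN": (50, 48, 36), "Automatic FCM": (50, 47, 33)}
SAMPLES_PER_SPECIES = 50


def best_c(index_by_c):
    """Number of classes at which a validity index is largest."""
    return max(index_by_c, key=index_by_c.get)


def accuracy_percent(correct_counts):
    """Correctly detected samples as a percentage of all samples."""
    total = SAMPLES_PER_SPECIES * len(correct_counts)
    return 100.0 * sum(correct_counts) / total


if __name__ == "__main__":
    print("C* (PNN):", best_c(V_PNN), " C* (FCM):", best_c(MPC_FCM))
    for name, counts in CORRECT.items():
        print(name, round(accuracy_percent(counts), 2))
```

Both indexes peak at C = 3, and the accuracies come out as 134/150 ≈ 89.33% and 130/150 ≈ 86.67%; the value 86.66% reported in Table 2 appears to be truncated rather than rounded.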
4.3.2. Test using synthetic grayscale image

We tested the proposed method on a synthetic image representing a gradient of eight gray levels. In this case, we chose the number of classes C in the range [Cmin = 3; Cmax = 10] to see whether the algorithm is able to determine the exact number of classes. Table 3 summarizes the results obtained by the unsupervised PNN and FCM. The maximum validity index (0.969) corresponds to C* = 8 classes for the proposed approach, whereas FCM clustering gives C* = 7. Fig. 5 shows the original image and the images classified by the two methods. FCM clearly detected a wrong number of classes.
4.3.3. Test using digital RGB image

In this case, we increase the color space to three channels (Red, Green and Blue). We used an RGB image of a Moroccan tile containing five colors to check whether the proposed algorithm is able to give the exact number of colors and to perform a meaningful classification. The chosen range of C is [Cmin = 2; Cmax = 8]. The results are given in Table 4 and shown in Fig. 6.
4.3.4. Comparison between clustering using FCM and PNN

We tested the same concept of cluster validity for FCM and PNN on different types of data. The results (Figs. 5 and 6 and Tables 2–4) showed that the proposed method gives the appropriate number of classes where the FCM technique fails. Regardless of the number of channels in an image, the proposed method was able to distinguish between the different classes. These tests indicate that the unsupervised PNN is a valid and reliable classifier.

5. Application and results

After testing and comparing the proposed approach with FCM clustering over several data sets (Fisher’s Iris data, grayscale and RGB digital images), the approach is applied to a sequence of seven NDVI remote sensing images acquired by LANDSAT and SPOT to build a land use map. The resulting land cover is compared with the real data collected by land sampling in the framework of the VALERI Program [72,77]. For large data sets such as multi-layer remote sensing images, it is desirable to first apply a spatial classification scene by scene in order to reduce the number of colors. The results are then classified in time. To use an image as the feature vector at the PNN input, a serialization procedure transforms the image matrix into a vector (taken row by row or column by column), provided that the opposite transformation is applied to restore the classified output image.

5.1. Spatial classification

We applied the proposed model to each of the seven NDVI scenes for numbers of classes C in the range [Cmin = 5; Cmax = 15]. We chose 5 as the minimum number of classes according to the minimum diversity of the land in the study area [72]: bare soil, cereals, trees, trees with herbs, fallow, etc. The maximum number of classes was set to 15 in order to represent more NDVI levels and to keep most of the information from each scene.
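The serialization procedure described at the start of Section 5 can be sketched as follows, taking the row-by-row variant. The function names `serialize` and `deserialize` are hypothetical, and the image is assumed to be a list of equally long rows; the paper does not describe its own implementation.

```python
def serialize(image):
    """Flatten a 2-D image (list of rows) into one vector, row by row."""
    return [value for row in image for value in row]


def deserialize(vector, n_cols):
    """Opposite transformation: rebuild rows of n_cols values each."""
    if n_cols <= 0 or len(vector) % n_cols != 0:
        raise ValueError("vector length must be a positive multiple of n_cols")
    return [vector[i:i + n_cols] for i in range(0, len(vector), n_cols)]
```

For the scenes used here, a 350 × 350 NDVI map would become a vector of 122,500 values, and the classified vector would be reshaped with `n_cols = 350`.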
Table 5 shows, for each scene, the values of V from which the optimal number of classes C* is obtained. Table 6 presents the number of NDVI levels in each scene before and after spatial classification. After the classification, the scenes with a narrow histogram (7 Nov 2002, 25 Dec 2002 and 27 Jun 2003) took 5, the minimum number of classes, while the scenes with a wide histogram (26 Jan 2003, 11 Feb 2003,
Fig. 5. (a) synthetic grayscale image. Classified images: (b) using automatic FCM, (c) using automatic PNN.
Fig. 6. (a) RGB image of Moroccan tile. Classified images: (b) using automatic FCM, (c) using automatic PNN.
Table 5
Variability of V with C for each NDVI scene.

Number of     V for each scene
classes C     7 Nov   25 Dec  26 Jan  11 Feb  31 Mar  18 May  27 Jun
5             0.882   0.755   0.693   0.710   0.656   0.630   0.853
6             0.772   0.670   0.632   0.646   0.682   0.648   0.690
7             0.705   0.671   0.684   0.666   0.683   0.669   0.650
8             0.673   0.656   0.695   0.693   0.702   0.663   0.705
9             0.709   0.664   0.707   0.699   0.707   0.667   0.700
10            0.771   0.696   0.717   0.708   0.716   0.698   0.696
11            0.711   0.663   0.711   0.714   0.711   0.670   0.685
12            0.724   0.668   0.712   0.705   0.714   0.689   0.690
13            0.684   0.677   0.721   0.695   0.721   0.683   0.690
14            0.677   0.710   0.713   0.706   0.711   0.716   0.667
15            0.659   0.694   0.698   0.695   0.714   0.715   0.721
Table 6
The effect of classification on number of levels in NDVI values for the 7 scenes.

NDVI scenes                             7 Nov 02  25 Dec 02  26 Jan 03  11 Feb 03  31 Mar 03  18 May 03  27 Jun 03
Number of levels in the original scene  73        75         77         75         82         77         87
Number of levels after classification   5         5          13         11         13         14         5
31 Mar 2003 and 18 May 2003) took a number of classes greater than 10 (Fig. 7). This is reasonable because, during the wheat agricultural season, vegetation is less dense in the period from 7 November to 25 December, which corresponds to the cultivation period, and at harvest (after 27 June), whereas the period from 26 January to 18 May, which represents the growth phase, shows denser vegetation and several types of crops (wheat, barley, fallow, etc.). The spatial classification adopted here is a compression strategy that reduces the number of levels of NDVI values in each scene without affecting the information it contains. The number of NDVI temporal combinations is thereby reduced from 121,493 to 4619, which shortens the running time of the following stage.
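The count of NDVI temporal combinations mentioned above can be illustrated with a small sketch. It assumes that each scene is given as a serialized vector of per-pixel values (raw NDVI levels before spatial classification, class labels after it) and that a temporal combination is the tuple of one pixel’s values across all scenes. The function name `temporal_combinations` is hypothetical.

```python
def temporal_combinations(label_maps):
    """Set of distinct per-pixel tuples across scenes.

    label_maps: list of T serialized scenes, each a sequence with one
    value per pixel; all scenes must cover the same pixels.
    """
    lengths = {len(scene) for scene in label_maps}
    if len(lengths) != 1:
        raise ValueError("all scenes must have the same number of pixels")
    # zip(*label_maps) yields, for each pixel, its tuple of T values.
    return set(zip(*label_maps))
```

Only the distinct tuples need to be passed to the temporal classification, which is why reducing the number of levels per scene shortens the next stage.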
5.2. Temporal classification

To extract the different temporal behaviors of NDVI, we applied the proposed algorithm to the time series of the seven spatially classified scenes. The cluster validity index V obtained by varying C in the range [Cmin = 5; Cmax = 15] is given in Table 7. As the table shows, the maximum value of V is about 0.99, which corresponds to fifteen classes.
Fig. 7. Histograms of the 7 scenes.
Table 7 Variability of V with C for multitemporal NDVI scenes.
Number of classes C        5      6      7      8      9      10     11     12     13     14     15
Cluster validity index V   0.893  0.889  0.930  0.951  0.971  0.962  0.969  0.977  0.983  0.986  0.990
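The selection rule used in both stages, keeping the class number C whose validity index V is largest, can be sketched as follows. The V values are copied from Table 7 and from the per-scene values of the spatial stage; the function names `best_class_count` and `best_per_scene` are illustrative only, and the computation of V itself is not reproduced here.

```python
def best_class_count(v_by_c):
    """Return the class number C with the largest validity index V."""
    return max(v_by_c, key=v_by_c.get)


def best_per_scene(scenes, c_values, rows):
    """Apply the same rule column by column (one column per scene)."""
    result = {}
    for j, scene in enumerate(scenes):
        column = {c: row[j] for c, row in zip(c_values, rows)}
        result[scene] = best_class_count(column)
    return result


# Temporal stage: V for C = 5..15, copied from Table 7.
V_TEMPORAL = {5: 0.893, 6: 0.889, 7: 0.930, 8: 0.951, 9: 0.971, 10: 0.962,
              11: 0.969, 12: 0.977, 13: 0.983, 14: 0.986, 15: 0.990}

# Spatial stage: one row per C = 5..15, one column per scene.
SCENES = ["7 Nov", "25 Dec", "26 Jan", "11 Feb", "31 Mar", "18 May", "27 Jun"]
C_VALUES = list(range(5, 16))
V_SPATIAL = [
    [0.882, 0.755, 0.693, 0.710, 0.656, 0.630, 0.853],
    [0.772, 0.670, 0.632, 0.646, 0.682, 0.648, 0.690],
    [0.705, 0.671, 0.684, 0.666, 0.683, 0.669, 0.650],
    [0.673, 0.656, 0.695, 0.693, 0.702, 0.663, 0.705],
    [0.709, 0.664, 0.707, 0.699, 0.707, 0.667, 0.700],
    [0.771, 0.696, 0.717, 0.708, 0.716, 0.698, 0.696],
    [0.711, 0.663, 0.711, 0.714, 0.711, 0.670, 0.685],
    [0.724, 0.668, 0.712, 0.705, 0.714, 0.689, 0.690],
    [0.684, 0.677, 0.721, 0.695, 0.721, 0.683, 0.690],
    [0.677, 0.710, 0.713, 0.706, 0.711, 0.716, 0.667],
    [0.659, 0.694, 0.698, 0.695, 0.714, 0.715, 0.721],
]

temporal_choice = best_class_count(V_TEMPORAL)
spatial_choice = best_per_scene(SCENES, C_VALUES, V_SPATIAL)
print(temporal_choice)
print(spatial_choice)
```

Applied to the per-scene values, the rule returns 5, 5, 13, 11, 13, 14 and 5 classes, which agrees with the number of levels after classification reported in Table 6.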
Fig. 8. The 15 obtained NDVI profiles.
Fig. 8 illustrates the temporal evolution of the fifteen obtained NDVI profiles, which are used next to identify the main crop types.
5.3. Crop types identification using NDVI profiles
The crop identification method was designed based on field observations. These field data were made up of several thematic classes, including all the species encountered and their combinations. Based on their temporal evolution (Fig. 8), the fifteen obtained NDVI profiles can be merged into the following six main classes:
- Bare soil class (profile 4) is easy to identify. This class has a nearly constant NDVI value of around 0.15, which corresponds to clay soil [71]. Some fluctuations of NDVI can be explained by variations in soil moisture and by small herbs growing after rainfall events.
- Tree classes have an NDVI profile that is relatively constant over time and above 0.18, given that the majority of them are evergreen trees (olive and citrus trees). There are two tree classes. The first is the trees on bare soil class (profile 5), which is clearly identified by NDVI values higher than 0.43 with variations limited to a range of 0.17. The other class (profiles 6 and 7) is characterized by tree profiles with high NDVI range variations (>0.17) and is labeled trees with herbs (i.e. trees on an annual understory).
- Annual crop (cereal) classes are defined by NDVI values rising above 0.18, which indicates significant vegetation biomass. These classes are also characterized by NDVI values below 0.18 at the beginning and at the end of the growth phase (i.e. a period of bare soil), which distinguishes them from the evergreen tree classes. Annual crops include mainly cereals such as wheat and barley, which can be divided into early and late classes according to their temporal NDVI profiles [71]. Five profiles (8, 9, 10, 11 and 12) represent early (wheat/barley), sown before 15 December, and three others (13, 14 and 15) correspond to late (wheat/barley), sown after 15 January, with a narrow growth phase.
- Fallow land class can be defined as land with almost no vegetation or with very poorly developed wheat and low NDVI values (i.e. rainfed wheat). This class is characterized by NDVI values below 0.4 in the growth phase (profiles 1, 2 and 3).
Table 8 shows the land cover class assigned to each NDVI profile after identification.
Table 8 NDVI profiles merging and their interpretations.
NDVI profiles                                            Interpretation of classes
7 Nov   25 Dec  26 Jan  11 Feb  31 Mar  18 May  27 Jun
0.13    0.24    0.24    0.26    0.28    0.17    0.08     Fallow
0.18    0.26    0.34    0.39    0.39    0.26    0.27     Fallow
0.18    0.21    0.22    0.23    0.36    0.25    0.27     Fallow
0.12    0.15    0.20    0.19    0.13    0.17    0.14     Bare soil
0.43    0.47    0.48    0.55    0.60    0.49    0.49     Trees on bare soil
0.37    0.42    0.39    0.39    0.53    0.44    0.47     Trees with herbs
0.28    0.40    0.43    0.50    0.62    0.36    0.27     Trees with herbs
0.14    0.17    0.45    0.60    0.78    0.26    0.18     Early (wheat/barley)
0.14    0.17    0.38    0.51    0.58    0.24    0.17     Early (wheat/barley)
0.13    0.15    0.26    0.41    0.79    0.27    0.16     Early (wheat/barley)
0.16    0.19    0.27    0.35    0.60    0.28    0.27     Early (wheat/barley)
0.15    0.39    0.52    0.55    0.49    0.24    0.19     Early (wheat/barley)
0.13    0.15    0.28    0.34    0.36    0.17    0.08     Late (wheat/barley)
0.12    0.15    0.14    0.23    0.61    0.27    0.16     Late (wheat/barley)
0.14    0.18    0.09    0.12    0.42    0.30    0.11     Late (wheat/barley)
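The descriptors used in the interpretation (NDVI level and range of variation) are straightforward to compute from a profile. The sketch below does so for the bare soil and trees on bare soil profiles, with values taken from Table 8; the helper name `describe` is hypothetical and the sketch is not meant as a complete labeling procedure.

```python
def describe(profile):
    """Summarise an NDVI time profile by its minimum, maximum, range and mean."""
    lo, hi = min(profile), max(profile)
    return {
        "min": lo,
        "max": hi,
        "range": round(hi - lo, 2),
        "mean": round(sum(profile) / len(profile), 2),
    }


# Values for the 7 dates (7 Nov .. 27 Jun), taken from Table 8.
bare_soil = [0.12, 0.15, 0.20, 0.19, 0.13, 0.17, 0.14]
trees_on_bare_soil = [0.43, 0.47, 0.48, 0.55, 0.60, 0.49, 0.49]

bare_stats = describe(bare_soil)
tree_stats = describe(trees_on_bare_soil)
print(bare_stats)
print(tree_stats)
```

The tree profile never drops below 0.43 and spans a range of 0.17, while the bare soil profile averages about 0.16, in line with the descriptions given for these two classes.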
Fig. 9. Land cover map obtained after classification and merging.
After merging, the obtained classes give the land cover map illustrated in Fig. 9, with the following percentages: 17.24% of bare soil, 12.14% of fallow, 39.47% of late (wheat/barley), 22.44% of early (wheat/barley), 2.57% of trees on bare soil and 6.13% of trees with herbs. These results agree with previous studies that used the same data set with other classification techniques [71,72]. Er-Raki et al. [71] used K-means to classify the cereals and found two main classes, early and late sowing wheat, as in this work. Simonneaux et al. [72] used a supervised classification method based on simple phenological criteria for each crop. This method, called a decision tree [78–80], uses the minimum, the maximum or the range of NDVI as phenological criteria. They obtained a general land cover (annual crops, trees, annual crops + trees, bare soils). Compared with the classification presented here, they did not divide the annual crops class into early and late sowing cereals and did not separate it from the fallow land class. In Spain, Julien et al. [9] used the Yearly Land Cover Dynamics (YLCD) approach, based on the annual behavior of LST (Land Surface Temperature) and NDVI. A time series of LANDSAT-5 images was used to classify an agricultural area into crop types using maximum likelihood classification. They obtained the following main classes: cereals, irrigated crops and non-irrigated crops. As in this work, wheat and barley were merged into a single class (cereals) because of their NDVI similarity, while the irrigated and non-irrigated crops were separated into different classes because of strong differences in their annual NDVI and LST behaviors.
5.4. Validation of the obtained results
In order to check the accuracy of our approach, we compared the obtained land use with the real one established in the study region.
During the 2002–2003 season, data sets were collected by the VALERI Program [72,77] on a series of 450 sample plots distributed across the plain (Fig. 10 and Table 9). To allow a comparison with the classification results, we merged the classes representing the same type of cover: building is added to the bare soil class, olive trees to the trees on bare soil class and barley to the cereals class. To visualize the performance of the proposed algorithm, a matching matrix is presented in Table 10. This matrix was obtained by comparing the proposed automatic PNN classification with the validation data mentioned above. Table 11 and Fig. 11 show the results of this comparison. The overall accuracy is computed as the proportion of correct predictions (samples correctly classified) [81]. The obtained classes shown in Table 11 are recognized with an overall accuracy of 96.56%, which is higher than in other studies [9,72]. This high accuracy
Fig. 10. Mapping of vegetation types survey in the region by sampling during 2002–2003 season.
Table 9 Land cover of the region by sampling in 2002–2003 season.
Classes                  Number of parcels   Percentage
Cereals                  234                 52.00%
Barleys                  29                  6.45%
Fallow/not cultivated    59                  13.11%
Alfalfa                  4                   0.89%
Olive trees              5                   1.11%
Building                 3                   0.67%
Bare soil – fallow       77                  17.11%
Trees on bare soil       11                  2.44%
Trees with herbs         28                  6.22%
Total                    450                 100%
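The class merging applied to the survey data can be written as a small mapping. The sketch below uses the percentages of Table 9 and the three merges stated in the text; the dictionary and function names are illustrative, and the merged labels follow those used in Table 11.

```python
# Survey percentages per class, copied from Table 9.
SURVEY = {
    "Cereals": 52.00,
    "Barleys": 6.45,
    "Fallow/not cultivated": 13.11,
    "Alfalfa": 0.89,
    "Olive trees": 1.11,
    "Building": 0.67,
    "Bare soil - fallow": 17.11,
    "Trees on bare soil": 2.44,
    "Trees with herbs": 6.22,
}

# Merges stated in the text; classes not listed keep their own name.
MERGE = {
    "Cereals": "Cereals (wheat + barley)",
    "Barleys": "Cereals (wheat + barley)",
    "Building": "Bare soil",
    "Bare soil - fallow": "Bare soil",
    "Olive trees": "Trees on bare soil",
}


def merge_classes(percentages, mapping):
    """Sum the percentages of the classes that map to the same merged label."""
    merged = {}
    for name, value in percentages.items():
        label = mapping.get(name, name)
        merged[label] = merged.get(label, 0.0) + value
    return {label: round(total, 2) for label, total in merged.items()}


merged = merge_classes(SURVEY, MERGE)
for label, value in merged.items():
    print(f"{label}: {value:.2f}%")
```

The merged values (58.45% cereals, 17.78% bare soil, 3.55% trees on bare soil) are the ones listed in the "% by sampling" column of Table 11.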
Fig. 11. Comparison of land cover results using the proposed classification and by sampling.
Fig. 12. Land cover map obtained after classification and merging using FCM.
Table 10 Results of matching matrix using the proposed method.
                     Actual classes
Predicted classes    Cereals   Fallow   Trees on bare soil   Trees with herbs   Bare soil   Alfalfa
Cereals              100%      5.91%    5.08%                0.6%               0.4%        0%
Fallow               0%        92.6%    0%                   0%                 2.58%       0%
Trees on bare soil   0%        0%       72.39%               0.85%              0%          0%
Trees with herbs     0%        0%       22.53%               98.55%             0%          0%
Bare soil            0%        1.49%    0%                   0%                 97.02%      0%
Alfalfa              0%        0%       0%                   0%                 0%          0%
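A short sketch shows how the per-class precision can be read off the matching matrix. It assumes that rows are predicted classes and columns are actual classes, an orientation inferred from the fact that each non-empty column sums to 100%; the variable names are illustrative.

```python
CLASSES = ["Cereals", "Fallow", "Trees on bare soil",
           "Trees with herbs", "Bare soil", "Alfalfa"]

# Values in percent, copied from Table 10.
# Assumed orientation: MATRIX[i][j] = share of actual class j predicted as class i.
MATRIX = [
    [100.0, 5.91, 5.08, 0.6, 0.4, 0.0],
    [0.0, 92.6, 0.0, 0.0, 2.58, 0.0],
    [0.0, 0.0, 72.39, 0.85, 0.0, 0.0],
    [0.0, 0.0, 22.53, 98.55, 0.0, 0.0],
    [0.0, 1.49, 0.0, 0.0, 97.02, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]

# The diagonal holds the share of each actual class that was correctly labeled.
precision = {name: MATRIX[i][i] for i, name in enumerate(CLASSES)}

# Each actual class should be fully distributed over the predicted classes.
column_totals = [round(sum(row[j] for row in MATRIX), 2)
                 for j in range(len(CLASSES))]

print(precision)
print(column_totals)
```

The diagonal reproduces the classification precision column of Table 11, and the empty alfalfa column reflects the fact that this class was not recognized.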
Fig. 13. Land cover map obtained after classification and merging using K-means.
Table 11 Comparison of land cover results.
Classes                    % by sampling   % by PNN classification   Classification precision
Cereals (wheat + barley)   58.45           61.91                     100%
Fallow/not cultivated      13.11           12.14                     92.60%
Trees on bare soil         3.55            2.57                      72.39%
Trees with herbs           6.22            6.13                      98.55%
Bare soil                  17.78           17.25                     97.02%
Alfalfa                    0.89            –                         0%
Total                      100             100                       Accuracy(a) = 96.56%
(a) Accuracy = Σ (precision × % sampling).

Table 12 Performance comparison between unsupervised PNN, FCM and K-means.
                           Classification precision
Classes                    Unsupervised PNN   FCM                              K-means
Cereals (wheat + barley)   100%               70.15%                           74.82%
Fallow/not cultivated      92.6%              99.16%                           95.35%
Trees on bare soil         72.39%             92.12% (tree classes merged)     84.51%
Trees with herbs           98.55%             92.12% (tree classes merged)     93.09%
Bare soil                  97.02%             89.98%                           95.61%
Alfalfa                    0%                 0%                               0%
Overall accuracy           96.56%             79%                              82.02%
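The overall accuracy in the footnote of Table 11 can be recomputed as a sampling-weighted sum of the per-class precisions. The sketch below assumes that the precision is taken as a fraction of the sampled share of each class, and that the single FCM value for trees applies to both merged tree classes; both points are interpretations of the tables rather than statements from the paper, and the names are illustrative.

```python
# Share of each class in the field survey, in percent (Table 11).
SAMPLING = {
    "Cereals": 58.45,
    "Fallow": 13.11,
    "Trees on bare soil": 3.55,
    "Trees with herbs": 6.22,
    "Bare soil": 17.78,
    "Alfalfa": 0.89,
}

# Per-class precision in percent (Table 12).
PRECISION = {
    "PNN": {"Cereals": 100.0, "Fallow": 92.6, "Trees on bare soil": 72.39,
            "Trees with herbs": 98.55, "Bare soil": 97.02, "Alfalfa": 0.0},
    # Assumption: the merged tree class precision is used for both tree classes.
    "FCM": {"Cereals": 70.15, "Fallow": 99.16, "Trees on bare soil": 92.12,
            "Trees with herbs": 92.12, "Bare soil": 89.98, "Alfalfa": 0.0},
    "K-means": {"Cereals": 74.82, "Fallow": 95.35, "Trees on bare soil": 84.51,
                "Trees with herbs": 93.09, "Bare soil": 95.61, "Alfalfa": 0.0},
}


def overall_accuracy(precision, sampling):
    """Sampling-weighted sum of per-class precisions, returned in percent."""
    return sum(precision[name] * sampling[name] for name in sampling) / 100.0


accuracy = {method: round(overall_accuracy(p, SAMPLING), 2)
            for method, p in PRECISION.items()}
print(accuracy)
```

With the rounded table values this gives 79.0% for FCM and 82.02% for K-means, as reported, and about 96.54% for PNN; the small gap to the published 96.56% presumably comes from rounding in the tabulated figures.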
demonstrates that the proposed approach is globally able to retrieve the existing crop types in the region automatically and accurately. The alfalfa class is characterized by an NDVI profile with frequent variations due to repeated cuttings, so it was not recognized. More successive cloud-free scenes could overcome this misclassification.
5.5. Performance comparison between unsupervised PNN, FCM and K-means
To highlight the performance of the proposed method, a comparative study with other usual classification methods (FCM, K-means) was carried out using the same time series of seven NDVI images. The land cover maps obtained with FCM and K-means are shown in Figs. 12 and 13, respectively. The performance of the three methods is compared in Table 12. As expected, the FCM method gave a lower accuracy (79%) and a poorer cluster number estimate: two classes (trees with herbs and trees on bare soil) were merged because of their similar cluster distributions. The K-means method did a reasonable job, with 82.02% accuracy and detailed classes (correct number and type). In conclusion, the proposed approach using PNN provides better results, with a higher overall accuracy (96.56%), than the other methods.
6. Conclusion
In this work, we have proposed an unsupervised approach based on the Probabilistic Neural Network, in which a cluster validity technique is implemented using Ward’s method. This technique was first validated through a series of tests on Fisher’s Iris data set and on synthetic grayscale and RGB digital images. A comparison with classical automatic clustering by FCM, using the same concept of cluster validation, showed that the proposed algorithm was more accurate. The strength of this approach is its capability to solve a classification problem in which the number of classes is unknown. This is precisely the case of land use classification, which deals with large multidimensional data sets such as multidimensional remote sensing images. Here, effective visualization of the data set and prediction of the class number are difficult. The developed approach was therefore applied to a time series of seven NDVI remote sensing images acquired by LANDSAT and SPOT to build a land use map. Spatial and temporal classifications were adopted. The procedure has proven its efficiency in distinguishing between different classes and in determining the land cover, especially for large surfaces where the available information on soil and crops is limited. The obtained results were compared with the real land use and showed an overall accuracy of 96.56%, which is higher than that of other usual methods such as FCM and K-means. Thus, implementing the cluster validity technique in PNN yields a reliable tool for data classification, especially for massive data such as multilayer images.
The principal advantages of the proposed approach are: (1) it is completely automatic, with no parameter adjustment and instantaneous training; (2) it has a high ability to produce good cluster number estimates; (3) it provides a new point of view on using PNN as an unsupervised classifier; and (4) it is rapid and easy to implement in soft computing for classification.
Uncited references
[6,10,11,15–18].
Acknowledgements
The authors are grateful to the International Joint Laboratory TREMA (http://trema.ucam.ac.ma/) and CNES for providing the satellite data.
References
[1] L. Zheng, X. He, Classification techniques in pattern recognition, in: Proceedings of the 13th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision (WSCG 2005), 2005, pp. 77–78.
[2] V. Mohan, A. Kannan, Color image classification and retrieval using image mining techniques, Int. J. Eng. Sci. Technol. 2 (5) (2010) 1014–1020.
[3] T.N. Phyu, Survey of classification techniques in data mining, in: Proceedings of the International MultiConference of Engineers and Computer Scientists (IMECS 2009), 2009, pp. 978–988.
[4] J.S. Wang, W.C. Chiang, Y.L. Hsu, Y.T.C. Yang, ECG arrhythmia classification using a probabilistic neural network with a feature reduction method, Neurocomputing 116 (2013) 38–45.
[5] J.I. Arribas, G.V. Sánchez-Ferrero, G. Ruiz-Ruiz, J. Gómez-Gil, Leaf classification in sunflower crops by computer vision and neural networks, Comput. Electron. Agric. 78 (1) (2011) 9–18.
[6] R. Raghuraj, S. Lakshminarayanan, Variable predictive model based classification algorithm for effective separation of protein structural classes, Comput. Biol. Chem. 32 (4) (2008) 302–306.
[7] F. Kaefer, C.M. Heilman, S.D. Ramenofsky, A neural network application to consumer classification to improve the timing of direct marketing activities, Comput. Oper. Res. 32 (10) (2005) 2595–2615.
[8] N. Huang, D. Xu, X. Liu, L. Lin, Power quality disturbances classification based on S-transform and probabilistic neural network, Neurocomputing 98 (2012) 12–23.
[9] Y. Julien, J.A. Sobrino, J.-C. Jiménez-Munoz, Land use classification from multitemporal Landsat imagery using the Yearly Land Cover Dynamics (YLCD) method, Int. J. Appl. Earth Obs. Geoinf. 13 (2011) 711–720.
[10] R. Geerken, B. Zaitchik, J.P. Evans, Classifying rangeland vegetation type and coverage from NDVI time series using Fourier Filtered Cycle Similarity, Int. J. Remote Sens. 26 (24) (2005) 5535–5554.
[11] A. Halder, A. Ghosh, S. Ghosh, Supervised and unsupervised land use map generation from remotely sensed images using ant based systems, Appl. Soft Comput. 11 (2011) 5770–5781.
[12] J. Sun, J. Yang, C. Zhang, W. Yun, J. Qu, Automatic remotely sensed image classification in a grid environment based on the maximum likelihood method, Math. Comput. Modell. 58 (3–4) (2013) 573–581.
[13] Q. Lü, M. Tang, Detection of hidden bruise on kiwi fruit using hyperspectral imaging and parallelepiped classification, Procedia Environ. Sci. 12 (B) (2012) 1172–1179.
[14] C. Chen, Fuzzy training data for fuzzy supervised classification of remotely sensed images, in: Proceedings of 20th Asian Conference on Remote Sensing (ACRS 1999), 1999, pp. 460–465.
[15] A. Ghosh, S. Meher, B.U. Shankar, A novel fuzzy classifier based on product aggregation operator, Pattern Recognit. 41 (6) (2008) 961–971.
[16] F. Maselli, A. Rodolfi, C. Copnese, Fuzzy classification of spatially degraded Thematic Mapper data for the estimation of sub-pixel components, Int. J. Remote Sens. 17 (3) (1996) 537–551.
[17] F. Melgani, B.A. Hashemy, S. Taha, An explicit fuzzy supervised classification method for multispectral remote sensing images, IEEE Trans. Geosci. Remote Sens. 38 (1) (2000) 287–295.
[18] Y. Liu, B. Zhang, L.-m. Wang, N. Wang, A self-trained semisupervised SVM approach to the remote sensing land cover classification, Comput. Geosci. 59 (2013) 98–107.
[19] D.M. Miller, E.J. Kaminsky, S. Rana, Neural network classification of remote-sensing data, Comput. Geosci. 21 (1995) 377–386.
[20] J. Zeng, H.-f. Guo, Y.-m. Hu, Artificial neural network model for identifying taxi gross emitter from remote sensing data of vehicle emission, J. Environ. Sci. 19 (2007) 427–431.
[21] M. Brown, H. Lewis, S. Gunn, Linear spectral mixture models and support vector machines for remote sensing, IEEE Trans. Geosci. Remote Sens. 38 (5) (2000) 2346–2360.
[22] C. Huang, L. Davis, J. Townshend, An assessment of support vector machines for land cover classification, Int. J. Remote Sens. 23 (2002) 725–749.
[23] D. Stathakis, A. Vasilakos, Comparison of computational intelligence based classification techniques for remotely sensed optical image classification, IEEE Trans. Geosci. Remote Sens. 44 (8) (2008) 2305–2318.
[24] R. Laprade, Split-and-merge segmentation of aerial photographs, Comput. Vis. Graphics Image Process. 48 (1) (1988) 77–86.
[25] B.J. Irvin, S.J. Ventura, B.K. Slater, Fuzzy and isodata classification of landform elements from digital terrain data in Pleasant Valley Wisconsin, Geoderma 77 (2–4) (1997) 137–154.
[26] S. Pal, A. Ghosh, B. Uma Shankar, Segmentation of remotely sensed images with fuzzy thresholding and quantitative evaluation, Int. J. Remote Sens. 21 (11) (2000) 2269–2300.
[27] R. Cannon, R. Dave, J. Bezdek, M. Trivedi, Segmentation of a Thematic Mapper image using fuzzy c-means clustering algorithm, IEEE Trans. Geosci. Remote Sens. 24 (1) (1986) 400–408.
[28] M.N. Kurnaz, Z. Dokur, T. Olmez, Segmentation of remote-sensing images by incremental neural network, Pattern Recogn. Lett. 26 (8) (2005) 1104–1316.
[29] Z. Zhou, S. Wei, X. Zhang, X. Zhao, Remote sensing image segmentation based on self-organizing map at multiple scale, in: Proceedings of SPIE Geoinformatics: Remotely Sensed Data and Information, USA, 2007, pp. 122–126.
[30] Y. Wong, E. Posner, A new clustering algorithm applicable to polarimetric and SAR images, IEEE Trans. Geosci. Remote Sens. 31 (3) (1993) 634–644.
[31] G.P. Zhang, Neural networks for classification: a survey, IEEE Trans. Syst. Man Cybernet. C: Appl. Rev. 30 (4) (November 2000).
[32] G. Cybenko, Approximation by superpositions of a sigmoidal function, Math. Control. Signals Syst. 2 (1989) 303–314.
[33] K. Hornik, Approximation capabilities of multilayer feedforward networks, Neural Networks 4 (1991) 251–257.
[34] K. Hornik, M. Stinchcombe, H. White, Multilayer feedforward networks are universal approximators, Neural Networks 2 (1989) 359–366.
[35] M.D. Richard, R. Lippmann, Neural network classifiers estimate Bayesian a posteriori probabilities, Neural Comput. 3 (1991) 461–483.
[36] D.F. Specht, Probabilistic neural networks for classification, mapping, or associative memory, in: IEEE International Conference on Neural Networks 1, July, 1988, pp. 525–532.
[37] D.F. Specht, Probabilistic neural networks, Neural Networks 3 (1) (1990) 109–118.
[38] T.D. Gancheva, D.K. Tasoulisb, M.N. Vrahatisb, N.D. Fakotakis, Generalized locally recurrent probabilistic neural networks with application to text-independent speaker verification, Neurocomputing 70 (2007) 1424–1438.
[39] X. Fu, Y. Ying, Y. Zhou, H. Xu, Application of probabilistic neural networks in qualitative analysis of near infrared spectra: determination of producing area and variety of loquats, Anal. Chim. Acta 598 (2007) 27–33.
[40] A.L.I. Oliveira, F.R.G. Costa, C.O.S. Filho, Novelty detection with constructive probabilistic neural networks, Neurocomputing 71 (2008) 1046–1053.
[41] J. Grim, J. Hora, Iterative principles of recognition in probabilistic neural networks, Neural Networks 21 (2008) 838–846.
[42] H. Adeli, A. Panakkat, A probabilistic neural network for earthquake magnitude prediction, Neural Networks 22 (2009) 1018–1024.
[43] F. Ozturk, F. Ozen, A new license plate recognition system based on probabilistic neural networks, Procedia Technol. 1 (2012) 124–128.
[44] O. Er, A.C. Tanrikulu, A. Abakay, F. Temurtas, An approach based on probabilistic neural network for diagnosis of Mesothelioma’s disease, Comput. Electr. Eng. 38 (2012) 75–81.
[45] J. Jia, C. Liang, J. Cao, Z. Li, Application of probabilistic neural network in bacterial identification by biochemical profiles, J. Microbiol. Methods 94 (2013) 86–87.
[46] S. Timung, T.K. Mandal, Prediction of flow pattern of gas–liquid flow through circular microchannel using probabilistic neural network, Appl. Soft Comput. 13 (2013) 1674–1685.
[47] G.E. Tsekouras, J. Tsimikas, On training RBF neural networks using input–output fuzzy clustering and particle swarm optimization, Fuzzy Sets Syst. 221 (2013) 65–89.
[48] J. González, I. Rojas, H. Pomares, J. Ortega, A. Prieto, A new clustering technique for function approximation, IEEE Trans. Neural Networks 13 (1) (2002) 132–142.
[49] H.-S. Park, W. Pedrycz, S.-K. Oh, Granular neural networks and their development through context-based clustering and adjustable dimensionality of receptive fields, IEEE Trans. Neural Networks 20 (10) (2009) 1604–1616.
[50] W. Pedrycz, Conditional fuzzy clustering in the design of radial basis function neural networks, IEEE Trans. Neural Networks 9 (4) (1998) 601–612.
[51] W. Pedrycz, H.S. Park, S.K. Oh, A granular-oriented development of functional radial basis function neural networks, Neurocomputing 72 (2008) 420–435.
[52] H.-S. Park, Y.-D. Chung, S.-K. Oh, W. Pedrycz, H.-K. Kim, Design of information granule-oriented RBF neural networks and its application to power supply for high-field magnet, Eng. Appl. Artif. Intell. 24 (2011) 543–554.
[53] S.B. Roh, T.C. Ahn, W. Pedrycz, The design methodology of radial basis function neural networks based on fuzzy K-nearest neighbors approach, Fuzzy Sets Syst. 161 (13) (2010) 1803–1822.
[54] A. Staiano, R. Tagliaferri, W. Pedrycz, Improving RBF networks performance in regression tasks by means of a supervised fuzzy clustering, Neurocomputing 69 (2006) 1570–1581.
[55] Z. Uykan, C. Guzelis, M.E. Celebi, H.N. Koivo, Analysis of input–output clustering for determining centers of RBFNN, IEEE Trans. Neural Networks 11 (4) (2000) 851–858.
[56] J.H. Ward, Hierarchical grouping to optimize an objective function, J. Am. Stat. Assoc. 58 (1963) 236–244.
[57] A.M. Dillner, J.J. Schauer, W.F. Christensen, G.R. Cass, A quantitative method for clustering size distributions of elements, Atmos. Environ. 39 (2005) 1525–1537.
[58] H.C. Lu, C.L. Chang, J.C. Hsieh, Classification of PM10 distributions in Taiwan, Atmos. Environ. 40 (2006) 1452–1463.
[59] T. Varin, R. Bureau, C. Mueller, P. Willett, Clustering files of chemical structures using the Székely–Rizzo generalization of Ward’s method, J. Mol. Graphics Modell. 28 (2009) 187–195.
[60] N. Picard, F. Mortier, V. Rossi, S. Gourlet-Fleury, Clustering species using a model of population dynamics and aggregation theory, Ecol. Modell. 221 (2010) 152–160.
[61] A. Carteron, M. Jeanmougin, F. Leprieur, S. Spatharis, Assessing the efficiency of clustering algorithms and goodness-of-fit measures using phytoplankton field data, Ecol. Inform. 9 (2012) 64–68.
[62] C.S. Malley, C.F. Braban, M.R. Heal, The application of hierarchical cluster analysis and non-negative matrix factorization to European atmospheric monitoring site classification, Atmos. Res. 138 (2014) 30–40.
[63] Y. Xiao, C. Mignolet, J.-F. Mari, M. Benoît, Modeling the spatial distribution of crop sequences at a large regional scale using land-cover survey data: a case from France, Comput. Electron. Agric. 102 (2014) 51–63.
[64] S. Hands, B. Everitt, A Monte Carlo study of the recovery of cluster structure in binary data by hierarchical clustering techniques, Multivar. Behav. Res. 22 (1987) 235–243.
[65] A. Ghosh, N.S. Mishra, S. Ghosh, Fuzzy clustering algorithms for unsupervised change detection in remote sensing images, Inform. Sci. 181 (2011) 699–715.
[66] W. Wang, Y. Zhang, On fuzzy cluster validity indices, Fuzzy Sets Syst. 158 (2007) 2095–2117.
[67] K.-L. Wu, M.-S. Yang, A cluster validity index for fuzzy clustering, Pattern Recogn. Lett. 26 (2005) 1275–1291.
[68] A. Chehbouni, R. Escadafal, G. Boulet, B. Duchemin, V. Simonneaux, G. Dedieu, B. Mougenot, S. Khabba, H. Kharrou, O. Merlin, A. Chaponnière, J. Ezzahar, S. Er-Raki, J. Hoedjes, R. Hadria, H. Abourida, A. Cheggour, F. Raibi, L. Hanich, N. Guemouria, Ah. Chehbouni, A. Lahrouni, A. Olioso, F. Jacob, J. Sobrino, The use of remotely sensed data for integrated hydrological modeling in arid and semiarid regions: the SUDMED program, Int. J. Remote Sens. 29 (2008) 5161–5181.
[69] R. Hadria, B. Duchemin, A. Lahroun, S. Khabba, S. Er-Raki, G. Dedieu, A. Chehbouni, Monitoring of irrigated wheat in a semi-arid climate using crop modelling and remote sensing data: impact of satellite revisit time frequency, Int. J. Remote Sens. 27 (2006) 1093–1117.
[70] B. Duchemin, R. Hadria, S. Erraki, G. Boulet, P. Maisongrande, A. Chehbouni, R. Escadafal, J. Ezzahar, J.C.B. Hoedjes, M.H. Kharrou, S. Khabba, B. Mougenot, A. Olioso, J.C. Rodriguez, V. Simonneaux, Monitoring wheat phenology and irrigation in Central Morocco: on the use of relationships between evapotranspiration, crops coefficients, leaf area index and remotely sensed vegetation indices, Agric. Water Manage. 79 (2006) 1–27.
[71] S. Er-Raki, A. Chehbouni, N. Guemouria, B. Duchemin, J. Ezzahar, R. Hadria, Combining FAO-56 model and ground-based remote sensing to estimate water consumptions of wheat crops in a semi-arid region, Agric. Water Manage. 87 (2007) 41–54.
[72] V. Simonneaux, B. Duchemin, D. Helson, S. Er-Raki, A. Olioso, A.G. Chehbouni, The use of high-resolution image time series for crop classification and evapotranspiration estimate over an irrigated area in central Morocco, Int. J. Remote Sens. 29 (2008) 95–116.
[73] G.M. Foody, Thematic mapping from remotely sensed data with neural networks: MLP, RBF and PNN based approaches, J. Geograph. Syst. 3 (2001) 217–232.
[74] R.N. Dave, Validating fuzzy partition obtained through c-shells clustering, Pattern Recogn. Lett. 17 (1996) 613–623.
[75] R.A. Fisher, The use of multiple measurements in taxonomic problems, Ann. Eugenics 7 (2) (1936) 179–188.
[76] E. Anderson, The species problem in Iris, Ann. MO Bot. Gard. 23 (3) (1936) 457–509.
[77] VALERI (Validation of Land European Remote sensing Instruments) Program, http://w3.avignon.inra.fr/valeri/.
[78] D. Lloyd, A phenological classification of terrestrial vegetation cover using shortwave vegetation index imagery, Int. J. Remote Sens. 11 (1990) 2269–2279.
[79] R.S. De Fries, M. Hansen, J.R.G. Townshend, R. Sohlberg, Global land cover classifications at 8 km spatial resolution: the use of training data derived from Landsat imagery in decision tree classifiers, Int. J. Remote Sens. 19 (1998) 3141–3168.
[80] S.S. Ray, V.K. Dadhwal, Estimation of crop evapotranspiration of irrigation command area using remote sensing and GIS, Agric. Water Manage. 49 (2001) 239–249.
[81] R. Congalton, K. Green, Assessing the Accuracy of Remotely Sensed Data, Lewis Publications, Boca Raton, FL, 1999.