Using an unsupervised approach of Probabilistic Neural Network (PNN) for land use classification from multitemporal satellite images


Applied Soft Computing xxx (2015) xxx–xxx

Contents lists available at ScienceDirect

Applied Soft Computing journal homepage: www.elsevier.com/locate/asoc


Jawad Iounousse ∗, Salah Er-Raki, Ahmed El Motassadeq, Hassan Chehouani
LP2M2E, Faculty of Sciences and Techniques, Cadi Ayyad University, Marrakesh, Morocco

Article history: Received 22 July 2013; received in revised form 6 November 2014; accepted 21 January 2015; available online xxx

Keywords: Unsupervised classification; Probabilistic Neural Network; Ward's method; Cluster validity index; Land use; LANDSAT and SPOT images; NDVI

Abstract

The aim of this work is to develop an unsupervised approach based on the Probabilistic Neural Network (PNN) for land use classification. A time series of high spatial resolution images acquired by LANDSAT and SPOT was used first to generate profiles of the Normalized Difference Vegetation Index (NDVI), which then served as input to the classification procedure. The proposed method implements a cluster validity technique in the PNN, using Ward's method to obtain the clusters. The procedure is completely automatic, requires no parameter adjustment, trains instantaneously, produces good estimates of the number of clusters, and offers a new way to use the PNN as an unsupervised classifier. Compared with the real land use, the approach gives an accurate classification with an error of about 3.44%, and it performs better than the usual unsupervised classification methods (fuzzy c-means (FCM) and K-means). © 2015 Published by Elsevier B.V.

1. Introduction

Classification is one of the most useful tasks in human activity. It aims at identifying groups of similar objects in the sense of a homogeneity criterion, and therefore helps to discover the distribution of patterns and interesting correlations in large data sets. It plays an important role in solving many problems in pattern recognition [1], imaging, color image segmentation [2] and data mining [3], and in domains such as medicine [4], biology [5], marketing [7], energy [8] and remote sensing, especially land use [9].

There are two main methods of classification: supervised and unsupervised. In the first, the user defines the classes, which can be conceived as a finite set. The main task is to search for the patterns and then construct their corresponding mathematical models, whose consistency is evaluated against the actual data. The most widely used supervised classification methods are maximum likelihood classification (MLC) [12], the parallelepiped method (PP) [13], fuzzy sets [14], neural networks (NNs) [19,20], support vector machines (SVM) [21,22] and computational intelligence [23]. On the other hand, the basic task of unsupervised learning methods is to develop classification labels automatically. Unsupervised algorithms seek out similarity between pieces of data in order to determine whether they can be characterized as forming groups, called clusters. In remote sensing, for example, the commonly used unsupervised methods are split-and-merge [24], ISODATA [25], K-means, fuzzy c-means (FCM) [26,27], NN-based methods [28,29] and scale-space techniques [30].

Zhang [31] reported that classification is the most investigated topic of NNs. Furthermore, NNs have been noted to be a promising alternative to various conventional classification methods. Their advantages follow from three theoretical aspects. First, NNs are self-adaptive methods: they can adjust themselves to the data without any explicit specification of a functional or distributional form for the underlying structure. The user can adjust the learning parameters by setting the initial weights of the network and selecting the correct number of hidden layers and of nodes in each layer. Second, NNs can approximate any function with arbitrary accuracy [32–34]. Since any classification procedure seeks a functional relationship between the group membership and the attributes of the object, a user who has different networks, a variety of methods and multivariate training data at hand can readily obtain an accurate identification of this underlying function. Finally, NNs are able to estimate the posterior probabilities using the Bayes rule. These probabilities provide the basis for establishing the classification rule and performing statistical analysis [35].

For classification tasks, the Probabilistic Neural Network (PNN) is one of the most widely used NNs. It is a special form of radial basis function NN (RBFNN). In addition, it is considered an implementation of the Bayes optimal decision rule in NN form, based on nearest neighbor classifiers [36,37]. Several recent studies [4,8,38–46] used the PNN for classification and showed that it provides satisfactory results if the initial target classes are defined correctly. Finding the basis function centers (classes) and their appropriate number is therefore an important step toward a suitable classification. Tsekouras and Tsimikas [47] cite several reasons for this. First, the activation of each hidden node depends exclusively on the distance between the center and the current input vector. Second, in the neuron construction, the distribution of the neurons' receptive fields across the feature space is strongly linked to the locations of the respective centers. Third, these centers reveal the underlying data structure and directly affect the output of the following neurons. Fourth, the estimation of the widths depends directly on the locations of the centers. The classification performance depends heavily on selecting appropriate spread values: spread values that are too small give very spiky Probability Density Functions (PDFs), whereas values that are too large smooth out the details.

The idea of using clustering algorithms in the training stage of RBFNN design has been addressed by several authors [47–55]. Pedrycz [50] applied conditional fuzzy clustering (a modified FCM) in the input space; this method embeds the output data into the input mechanism, using the calculated cluster weights as feedback information. Uykan et al. [55] employed the K-means model and showed that the main impact of input–output clustering is the minimization of an upper bound of the network's mean square error. Staiano et al. [54] used fuzzy clustering to generate the clusters in the input space and, for each cluster, established an input–output relationship through local linear regression models. Tsekouras and Tsimikas [47] proposed an algorithm to select the optimal values of the basis function centers of an RBFNN. This algorithm uses the output space to adjust the input partition by combining input–output fuzzy clustering and particle swarm optimization.

From the state of the art cited above, the major challenge in clustering appears to be determining the optimal number of clusters that best fits a data set. In most clustering methods, experimental evaluations on 2D/3D data sets are used to check the validity of the results visually (i.e. how well the clustering algorithm discovers the clusters of the data set). For large multidimensional data sets (more than three dimensions), such as multidimensional remote sensing images, effective visualization of the data set is difficult. Moreover, perceiving clusters with the available visualization tools is a difficult task for people who are not accustomed to higher dimensional spaces and complex data sets. To overcome this problem, many techniques based on cluster analysis have been developed to group either the data or the variables into clusters, using criteria such as partitioning methods and hierarchical clustering. One of the most widespread hierarchical clustering methods is Ward's method [56–64]. According to Hands and Everitt [64], this method achieves better results than other hierarchical methods (single linkage, complete linkage, median, average linkage, etc.), especially when the group proportions are approximately equal.

In this paper, we design an unsupervised approach for land classification. It is based on a different way of implementing clustering in the PNN (RBFNN design). Ward's method [56] is used to train the input targets. A cluster validity function, generally applied to fuzzy clustering [65–67], is developed in the hidden layer output space of the PNN, and the number of classes is varied to find the optimal number of clusters. The proposed model is first tested on Fisher's Iris data set [75,76] and on synthetic grayscale and RGB digital images. The consistency of the approach is assessed through a comparison with FCM clustering using the concept of cluster analysis. The approach is then applied to a time series of remote sensing images acquired by LANDSAT and SPOT to build a land use map. Finally, the obtained results are validated against the real land use and compared with the results of the usual classification methods (FCM and K-means).

∗ Corresponding author. Tel.: +212 667660347. E-mail addresses: [email protected] (J. Iounousse), [email protected] (S. Er-Raki), [email protected] (A. El Motassadeq), [email protected] (H. Chehouani).

http://dx.doi.org/10.1016/j.asoc.2015.01.037 1568-4946/© 2015 Published by Elsevier B.V.

Please cite this article in press as: J. Iounousse, et al., Using an unsupervised approach of Probabilistic Neural Network (PNN) for land use classification from multitemporal satellite images, Appl. Soft Comput. J. (2015), http://dx.doi.org/10.1016/j.asoc.2015.01.037

Fig. 1. Overview of the study area (false color composition).

2. Study area and data description

The region of interest is an irrigated area located in the Haouz plain in the center of the Tensift basin (Central Morocco), 40 km east of Marrakech city. The climate is of semi-arid Mediterranean type, with an average annual precipitation of about 250 mm, of which 70% falls during winter and spring. The area covers about 2800 ha and is mostly flat. It was extensively studied during the 2002–2003, 2003–2004 and 2005–2006 agricultural seasons [69–72]. The main land cover classes are cereals (mostly wheat, then barley), and a significant portion is left fallow or not cultivated (Fig. 1). More details about the study area and the climate of the region can be found in [68–72].

The vegetation development in this area is affected by great inter-annual and/or intra-annual heterogeneity [72]. Land cover maps therefore require annual updating, and the effort was directed toward the development of land cover classification methods based on remote sensing data. A time series of images acquired by SPOT and LANDSAT was collected during the growing season of wheat (November 2002–June 2003) in order to extract vegetation profiles. Because of cloudiness or uncertainty in the atmospheric corrections, only seven images were used in this study. These images, each of 122,500 pixels arranged in 350 columns and 350 rows, were radiometrically calibrated, atmospherically corrected based on the reflectance of invariant objects, and transformed to NDVI maps [72]. The NDVI was derived from the red and near-infrared reflectance bands as follows:

$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{RED}}{\mathrm{NIR} + \mathrm{RED}} \tag{1}$$

where NIR and RED are the reflectances measured in the near-infrared and red bands, respectively.
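Eq. (1) can be applied pixel by pixel to a pair of reflectance grids. The following is a minimal sketch in plain Python; the function names `ndvi` and `ndvi_map` and the 2 × 2 reflectance values are hypothetical and are not taken from the study data. Returning `None` where NIR + RED is zero is an assumption of this sketch, since the paper does not say how such pixels are handled.

```python
def ndvi(nir, red):
    """Return NDVI = (NIR - RED) / (NIR + RED) for one pixel.

    Returns None when NIR + RED is zero, where the index is undefined
    (an assumption of this sketch).
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


def ndvi_map(nir_band, red_band):
    """Apply ndvi() pixel by pixel to two equally sized 2-D reflectance grids."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical 2 x 2 reflectances (not taken from the study data).
nir = [[0.50, 0.30], [0.20, 0.0]]
red = [[0.10, 0.30], [0.25, 0.0]]
result = ndvi_map(nir, red)
```

Dense vegetation reflects strongly in the near infrared and weakly in the red, so its NDVI is close to 1, while bare soil gives values near 0.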


3. Description of Probabilistic Neural Network

Introduced in 1990 by Specht [36,37], Probabilistic Neural Networks (PNNs) are based on the use of a nonparametric estimator (the Parzen window) to obtain multivariate probability density estimates. In contrast to classical RBFs, PNNs are used only for classification, and they compute the conditional class probabilities p(class k | x) for each of the C classes. A typical PNN consists of an input layer, a pattern layer (hidden layer) and a competitive output layer. The structure of a PNN is shown in Fig. 2.

Fig. 2. Architecture of the PNN.

Similar to RBFs, PNNs receive D-dimensional feature vectors x = (x_1, ..., x_D) as input. This input vector is applied to the input neurons x_i (1 ≤ i ≤ D) and is passed to the neurons in the hidden layer. Here, the hidden nodes are collected into groups, one group for each of the C classes. Each hidden node in the group for class k (1 ≤ k ≤ C) corresponds to a Gaussian function centered on its associated feature vector in the kth class (there is a Gaussian for each exemplar feature vector), called the Probability Density Function (PDF). The PDF for a single sample x_k is written as follows:

$$f_k(x) = \frac{1}{(2\pi)^{D/2}\,\sigma^{D}} \exp\!\left(-\frac{\|x - x_k\|^2}{2\sigma^2}\right) \tag{2}$$

where σ is the smoothing parameter of the Gaussians, D is the dimension of the input vector x, and $\|x - x_k\|^2 = \sum_i (x_i - x_{k,i})^2$ is the squared Euclidean distance between the vectors x and x_k. All of the Gaussians in a class group feed their functional values to the same output layer node for that class, so there are C output nodes. The kth output node sums these multivariate densities to produce a vector of probabilities representing the average of the PDFs for C samples:

$$p_k(x) = \frac{1}{(2\pi)^{D/2}\,\sigma^{D}\,C} \sum_{k=1}^{C} \exp\!\left(-\frac{\|x - x_k\|^2}{2\sigma^2}\right) \tag{3}$$

Finally, a competitive transfer function gives 1 for the input class that has the maximum joint PDF and 0 for all other classes. An unknown input x belongs to class k if p_k(x) > p_k'(x) for all k' ≠ k. Therefore, in accordance with Bayes's decision rule, the neuron in the decision layer determines the class membership of the pattern x by:

$$c(x) = \arg\max_{k}\,\{p_k(x)\}, \qquad k = 1, 2, \ldots, C \tag{4}$$

where c(x) is the estimated class of the pattern x.

The PNN is commonly used as a supervised classifier in various applications, but it is less exploited in remote sensing. Foody [73] showed that a PNN was able to map land cover accurately for an agricultural site, better than other networks such as the Multilayer Perceptron (MLP) and RBFNN. Furthermore, the accuracy of the PNN classification could be increased through the incorporation of prior probabilities of class membership. However, the accuracy of each classification could also be degraded by the presence of an untrained class [73]. Thus, it is essential to choose the appropriate classes.

4. Automation of PNN classification

The PNN algorithm first requires the modes (the centers of the Gaussian functions) to be set, and these are not easy to find. Both the modes and their number must be chosen correctly. An evaluation methodology is therefore required to determine the optimal number of clusters C*; this is usually called cluster validity. To make the PNN automatic, we used the summation of PDFs at the output of its hidden layer, which takes the form of a matrix of probabilities. This matrix is used to calculate the validity index (V) as the number of classes C varies in a given interval [Cmin; Cmax], in order to determine the adequate number of clusters. Cmin and Cmax are, respectively, the minimum and maximum numbers of possible classes, fixed beforehand by the user. The optimal number of classes is obtained when V reaches its maximum value. The flowchart (Fig. 3) illustrates the developed automation procedure for PNN. Fig. 4 describes its functional stages, as summarized in the following steps:

Fig. 3. Flowchart of the automation procedure for PNN.

Fig. 4. Flowchart describing the functional steps of the automation procedure for PNN.

(1) Perform a hierarchical agglomerative classification of the input data using Ward's method to obtain the C clusters.
(2) Apply the PNN algorithm, using the C clusters found in step 1 as input targets.
(3) Calculate V for the obtained classification. V requires the values of the probability matrix produced at the output of the PNN's hidden layer (see Section 4.2).
(4) Return to step 1 and repeat for different values of C. The number of classes C can be chosen in an interval [Cmin; Cmax]. Otherwise, all possible numbers of classes are taken.
(5) Select the optimal number C* of clusters, which corresponds to the maximum value of V.
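The five steps above can be sketched in plain Python as follows. This is a minimal illustration under several assumptions that are not stated in the paper: one Gaussian is centered on each Ward centroid, the hidden-layer outputs are normalized per sample so that the memberships of each sample sum to 1, each spread is half the distance to the nearest other center (the rule described in Section 4.1), and V takes the form written in the `validity_index` docstring. All function names and the sample points are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def ward_centers(points, n_clusters):
    """Step 1: agglomerative clustering with the Ward distance of Eq. (5).

    Clusters are kept as (centroid, size) pairs; at every pass the pair with
    the smallest size-weighted squared centroid distance is merged.
    """
    clusters = [(list(p), 1) for p in points]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ga, pa), (gb, pb) = clusters[a], clusters[b]
                d = pa * pb / (pa + pb) * sq_dist(ga, gb)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        (ga, pa), (gb, pb) = clusters[a], clusters[b]
        centroid = [(pa * x + pb * y) / (pa + pb) for x, y in zip(ga, gb)]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append((centroid, pa + pb))
    return [g for g, _ in clusters]


def memberships(points, centers):
    """Step 2: Gaussian hidden-layer outputs, normalized per sample (assumed).

    Returns rows[j][k], the membership of sample j in class k, i.e. the
    transpose of the C x N matrix P. Normalization works on log-densities
    for numerical stability; the (2*pi)^(D/2) factor cancels out.
    Requires at least two distinct centers.
    """
    dim = len(centers[0])
    spreads = [
        0.5 * min(math.sqrt(sq_dist(c, o)) for j, o in enumerate(centers) if j != i)
        for i, c in enumerate(centers)
    ]
    rows = []
    for x in points:
        logs = [
            -sq_dist(x, c) / (2 * s * s) - dim * math.log(s)
            for c, s in zip(centers, spreads)
        ]
        top = max(logs)
        weights = [math.exp(v - top) for v in logs]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def validity_index(rows):
    """Step 3: V = (C * sum_j max_k p_kj - N) / (N * (C - 1)) (assumed form)."""
    n, c = len(rows), len(rows[0])
    return (c * sum(max(r) for r in rows) - n) / (n * (c - 1))


def select_cluster_number(points, c_min, c_max):
    """Steps 4-5: scan C in [c_min, c_max] and keep the C with the largest V."""
    scores = {}
    for c in range(c_min, c_max + 1):
        scores[c] = validity_index(memberships(points, ward_centers(points, c)))
    return max(scores, key=scores.get), scores


# Hypothetical 2-D samples forming three loose groups (not data from the paper).
samples = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0), (20, 0), (20, 1), (21, 0)]
best_c, scores = select_cluster_number(samples, 2, 4)
```

With these hypothetical samples, `ward_centers(samples, 3)` recovers the centroids of the three groups. Which C maximizes V depends on the data and on the spread rule, so the sketch should not be read as reproducing the results reported in the paper.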

4.1. Ward's method for defining the centers of Gaussian functions

In statistics, Ward's method [56] is a criterion applied in hierarchical agglomerative clustering. This method provides a set of partitions into progressively less detailed classes, obtained by successively combining the parts. The idea is to build a dendrogram, or tree of the data, that successively merges similar groups of points. This dendrogram is obtained by ascending hierarchy: we first combine the two closest elements, which form a "summit". Only N − 1 objects then remain, and we iterate the process until a single complete group is obtained. The general pseudocode of hierarchical agglomerative clustering is written as follows:

(1) Begin with N clusters, each containing one object, and number the clusters 1 through N.
(2) Compute the between-cluster distance dist(A, B) as the between-object distance of the two objects in A and B respectively, with A, B = 1, 2, ..., N. Let the square matrix D = dist(A, B). If the objects are represented by vectors, use the Euclidean distance.
(3) Find the most similar pair of clusters A and B, such that the distance dist(A, B) is minimal among all the pairwise distances.
(4) Merge A and B into a new cluster C and compute the between-cluster distance dist(C, k) for every existing cluster k ≠ A, B. Once the distances are obtained, delete the rows and columns corresponding to the old clusters A and B in the D matrix, since A and B no longer exist. Then add a new row and column to D corresponding to cluster C.
(5) Repeat steps 3 and 4, N − 1 times in total, until only one cluster is left.

Ward's method is distinct from other methods because it uses an analysis of variance approach to evaluate the distances between clusters, and it is therefore very efficient. At each stage, the Ward objective is to find the two clusters whose merger gives the minimum increase in the total within-group error sum of squares (expressed through the distance between the centroids of the merged clusters). The Ward distance between two classes is the squared distance between their centroids, weighted by the sizes of the two clusters. It is defined as follows:

$$\mathrm{dist}(A, B) = \frac{p_A\, p_B}{p_A + p_B}\; d^2(g_A, g_B) \tag{5}$$

where g_A and g_B are the gravity centers of classes A and B, with weights p_A and p_B. Because the Ward method minimizes the sum of within-group sums of squares (squared error criterion), the clusters tend to be hyperspherical, i.e. spherical in multidimensional D-space, and to contain roughly equal numbers of objects if the observations are evenly distributed through D-space. This criterion is the most accurate one in hierarchical ascending clustering of Euclidean data, particularly when the elements are close.

In this paper, we used Ward's method to obtain the centers of the Gaussian functions in the hidden layer. In order to reduce the overlap of the centers, the widths of the radial basis functions are determined locally, using a spread equal to half of the minimum distance between neighboring centers.

4.2. Proposed cluster validity index for the optimal number of modes

Cluster analysis aims at identifying groups of similar objects, and therefore helps to discover interesting distributions of patterns and correlations in large data sets. Most clustering algorithms need to know the right number of classes C*. However, it is generally difficult to predict the number that gives an accurate separation of the data set. If it is too large, one or more good compact clusters may be broken up. In contrast, if it is too small, more than one separate cluster may be merged. The problem of finding C* is usually called cluster validity. A large number of cluster validity indices are available in the literature [65–67,74]. In this paper, the proposed cluster validity function is inspired by Dave's Modified Partition Coefficient (MPC), used for fuzzy partitions [74]. MPC is defined as:

$$\mathrm{MPC}(C, U, N) = \frac{C \sum_{j=1}^{N} \sum_{i=1}^{C} (u_{ij})^{m} - N}{N(C - 1)} \tag{6}$$

where m is the fuzzification coefficient, N the number of vectors to be classified, C the number of classes, and u_ij the element of the partition matrix U of size C × N representing the membership of the pattern x_j in the cluster C_i.

Before introducing the proposed cluster validity index V, we first use the summation of Gaussians produced by the computed clusters at the output of the PNN's hidden layer (see Section 3). This summation yields the probability matrix P = [p_kj] of size C × N, which represents the membership of the jth input vector in the kth class. As P takes the same form as U in Eq. (6) and the PNN's competitive function selects the maximum of these probabilities, V is given by the following equation:

$$V(C, P, N) = \frac{C \sum_{j=1}^{N} \max_{1 \le k \le C} (p_{kj}) - N}{N(C - 1)} \tag{7}$$

where P = [p_kj] of size C × N is the membership matrix at the output of the PNN's hidden layer, whose jth column is the vector of class probabilities for the jth data input, and max(P) is the maximum value of P associated with each input. In other words, max(P) represents the cluster closest to the input. The values of V lie in [0; 1]. As C varies, the maximum of the proposed index corresponds to the optimal distribution of clusters and produces the best clustering performance for the data set.

4.3. Tests and comparison

We carried out tests on different types of data. We started with the well-known Fisher's Iris dataset, then tested the method on the simple case of a synthetic grayscale image, and finally on digital RGB images. All results are compared with those of the FCM clustering algorithm using the same concept of cluster validity.

4.3.1. Test using Fisher's Iris dataset

This dataset contains random samples of flowers belonging to three species of iris: setosa, versicolor and virginica [75,76]. For each species, fifty observations of four features (sepal length, sepal width, petal length and petal width) are recorded. We applied the proposed algorithm and FCM clustering with the number of classes C chosen in the range [Cmin = 2; Cmax = 6]. Table 1 summarizes the obtained results. Both methods give the optimal cluster number estimate C* = 3 for the Iris data set. The difference lies in the classification accuracy. Table 2 shows the correctly detected samples of the three iris species and the classification accuracy of the two algorithms, with a notable advantage for the proposed PNN classifier.

Table 1
Variability of cluster validity indexes with C for Fisher's Iris dataset.

C classes    2      3      4      5      6
V (PNN)      0.681  0.697  0.591  0.605  0.628
MPC (FCM)    0.663  0.675  0.609  0.531  0.528

Table 2
The correct samples of iris flowers detected and the accuracy.

Methods         Setosa  Versicolor  Virginica  Accuracy
Automatic PNN   50      48          36         89.33%
Automatic FCM   50      47          33         86.66%

4.3.2. Test using synthetic grayscale image

We tested the proposed method on a synthetic image representing a gradient of eight levels of gray. In this case, we chose the number of classes C in the range [Cmin = 3; Cmax = 10] to see whether the algorithm is able to determine the exact number of classes. Table 3 summarizes the results obtained by the unsupervised PNN and FCM. The maximum validity index (0.969) corresponds to a class number of C* = 8 for the proposed approach, against C* = 7 for FCM clustering. Fig. 5 shows the original image and the images classified by the two methods. It is easy to see that FCM detected a wrong number of classes.

Table 3
Variability of cluster validity indexes with C for synthetic grayscale image.

C classes    3      4      5      6      7      8
V (PNN)      0.721  0.752  0.741  0.734  0.844  0.969
MPC (FCM)    0.706  0.728  0.701  0.785  0.894  0.878

4.3.3. Test using digital RGB image

In this case, we increase the color space to three channels (red, green and blue). We used an RGB image of a Moroccan tile containing five colors to see whether the proposed algorithm is able to give the exact number of colors and to perform a meaningful classification. The chosen range of C is [Cmin = 2; Cmax = 8]. The results are given in Table 4 and shown in Fig. 6.

Table 4
Variability of cluster validity indexes with C for image of Moroccan tile.

C classes    3      4      5      6      7      8
V (PNN)      0.878  0.881  0.882  0.792  0.753  0.777
MPC (FCM)    0.810  0.844  0.829  0.812  0.799  0.791
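The selection of C* in the three tests, and the accuracies of Table 2, can be rechecked directly from the tabulated values. The sketch below only transcribes Tables 1–4 and takes the maximum of each index; the helper name `best_c` and the dictionary layout are hypothetical.

```python
# Validity-index values transcribed from Tables 1, 3 and 4:
# {C: (V for PNN, MPC for FCM)}.
tables = {
    "Iris (Table 1)": {2: (0.681, 0.663), 3: (0.697, 0.675), 4: (0.591, 0.609),
                       5: (0.605, 0.531), 6: (0.628, 0.528)},
    "Grayscale (Table 3)": {3: (0.721, 0.706), 4: (0.752, 0.728), 5: (0.741, 0.701),
                            6: (0.734, 0.785), 7: (0.844, 0.894), 8: (0.969, 0.878)},
    "Tile (Table 4)": {3: (0.878, 0.810), 4: (0.881, 0.844), 5: (0.882, 0.829),
                       6: (0.792, 0.812), 7: (0.753, 0.799), 8: (0.777, 0.791)},
}


def best_c(values, column):
    """Return the C whose index in the given column (0 = PNN, 1 = FCM) is largest."""
    return max(values, key=lambda c: values[c][column])


# For each test: (C* selected by PNN, C* selected by FCM).
selected = {name: (best_c(v, 0), best_c(v, 1)) for name, v in tables.items()}

# Correctly detected samples from Table 2 (50 samples per species, 150 in total).
correct = {"Automatic PNN": (50, 48, 36), "Automatic FCM": (50, 47, 33)}
accuracy = {name: 100 * sum(hits) / 150 for name, hits in correct.items()}
```

Here `selected` gives (3, 3) for the Iris data, (8, 7) for the grayscale image and (5, 4) for the tile, in line with the text. The accuracies are 134/150 ≈ 89.33% and 130/150 ≈ 86.67%; Table 2 prints the latter truncated as 86.66%.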

4.3.4. Comparison between clustering using FCM and PNN We tested the same concept of cluster validity for FCM and PNN on different types of data. The results (Figs. 5 and 6 and Tables 2–4) showed that the proposed method gives the appropriate number of classes where the FCM technique fails. Regardless the number of channels in an image, the proposed method was able to distinguish between different classes. From these performed tests, we can see that the unsupervised PNN is a valid reliable classiﬁer. 5. Application and results After testing and comparing the proposed approach with FCM clustering over several data sets (Fischer’s iris data, grayscale and RGB digital images), this approach is applied for a sequence of seven time series of NDVI remote sensing images acquired by LANDSAT and SPOT to build land use map. The obtained results of land cover are compared with the real data collected by land sampling in the framework of VALERI Program [72,77]. For large data sets like multi-layer remote sensing images, it is desirable to ﬁrstly apply spatial classiﬁcation scene by scene in order to reduce the number of color. Then the results are classiﬁed in time. To use an image as feature vector of PNN input, a serialization procedure is applied to transform the matrix image to a vector (taken row by row or column by column) providing that the opposite transformation is done to restore the output classiﬁed image. 5.1. Spatial classiﬁcation We applied the proposed model to each image of the seven NDVI scenes for different number of classes C in the range [Cmin = 5; Cmax = 15]. We chose the value 5 as the minimum of classes according to the minimum diversity of the land in the studied area [72]: bare soil, cereals, trees, trees with herbs, fallow, etc. The maximum number of classes chosen is the value 15 in order to represent more levels of NDVI and to keep the majority of the information from each scene. 
Table 5 showed for each scene the optimal number of classes C* obtained by comparing V values. Table 6 presents the number of classes obtained in each scene before and after spatial classiﬁcation. The obtained results showed that after the classiﬁcation, the scenes with a narrow histogram (7 Nov 2002, 25 Dec 2002 and 27 Jun 2003) took 5 as the minimum number of classes while the scenes with a large histogram (26 Jan 2003, 11 Feb 2003,

Fig. 5. (a) synthetic grayscale image. Classiﬁed images: (b) using automatic FCM, (c) using automatic PNN.

Please cite this article in press as: J. Iounousse, et al., Using an unsupervised approach of Probabilistic Neural Network (PNN) for land use classiﬁcation from multitemporal satellite images, Appl. Soft Comput. J. (2015), http://dx.doi.org/10.1016/j.asoc.2015.01.037

365 366 367 368 369 370 371 372

373

374 375 376 377 378 379 380 381 382 383 384 385 386 387 388

389

390 391 392 393 394 395 396 397 398 399 400 401 402 403

G Model

ARTICLE IN PRESS

ASOC 2747 1–13

J. Iounousse et al. / Applied Soft Computing xxx (2015) xxx–xxx

6

Fig. 6. (a) RGB image of Moroccan tile. Classiﬁed images: (b) using automatic FCM, (c) using automatic PNN.

Table 5 Variability of V with C for each NDVI scene. Number of classes C

V for each scene 7 Nov 25 Dec 26 Jan 11 Feb 31 Mar 18 May 27 Jun

5

6

7

8

9

10

11

12

13

14

15

0.882 0.755 0.693 0.710 0.656 0.630 0.853

0.772 0.670 0.632 0.646 0.682 0.648 0.690

0.705 0.671 0.684 0.666 0.683 0.669 0.650

0.673 0.656 0.695 0.693 0.702 0.663 0.705

0.709 0.664 0.707 0.699 0.707 0.667 0.700

0.771 0.696 0.717 0.708 0.716 0.698 0.696

0.711 0.663 0.711 0.714 0.711 0.670 0.685

0.724 0.668 0.712 0.705 0.714 0.689 0.690

0.684 0.677 0.721 0.695 0.721 0.683 0.690

0.677 0.710 0.713 0.706 0.711 0.716 0.667

0.659 0.694 0.698 0.695 0.714 0.715 0.721

Table 6 The effect of classiﬁcation on number of levels in NDVI values for the 7 scenes.

404 405 406 407 408 409 410 411 412 413 414

NDVI scenes

7 Nov 02

25 Dec 02

26 Jan 03

11 Feb 03

31 Mar 03

18 May 03

27 Jun 03

Number of levels in the original scene Number of levels after classiﬁcation

73 5

75 5

77 13

75 11

82 13

77 14

87 5

31 Mar 2003 and 18 May 2003) required more than 10 classes (Fig. 7). This is reasonable: in the wheat agricultural season, vegetation density is lower from 7 November to 25 December, which corresponds to the cultivation period, and after harvest (after 27 June), whereas the period from 26 January to 18 May is the growth phase, with denser vegetation and several cover types (wheat, barley, fallow, etc.). The spatial classification adopted here acts as a compression strategy that reduces the number of NDVI levels in each scene without affecting the information it contains. As a result, the number of NDVI temporal combinations is reduced from 121,493 to 4619, which shortens the running time of the following stage.
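The compression effect described above can be illustrated on toy data. The sketch below is not the authors' PNN-based spatial classification: as a stand-in, it assumes that each raw NDVI value is replaced by the nearest class center of its scene (a hypothetical nearest-center rule), and it then counts the distinct temporal combinations before and after. All numbers and names are illustrative.

```python
def quantize(value, centers):
    """Replace a raw NDVI value by the nearest class center (assumed rule)."""
    return min(centers, key=lambda c: abs(value - c))


def count_combinations(series):
    """Count the distinct temporal combinations among per-pixel time series."""
    return len({tuple(s) for s in series})


# Toy data: six pixels observed on two dates (the paper uses seven scenes).
pixels = [
    (0.10, 0.52), (0.11, 0.50), (0.12, 0.51),
    (0.30, 0.70), (0.31, 0.72), (0.29, 0.71),
]
# Hypothetical class centers for each date after spatial classification.
centers_per_scene = [[0.11, 0.30], [0.51, 0.71]]

# Quantize every pixel date by date, keeping the temporal order.
compressed = [
    tuple(quantize(v, centers_per_scene[t]) for t, v in enumerate(p))
    for p in pixels
]

before = count_combinations(pixels)
after = count_combinations(compressed)
print(before, after)
```

On this toy set the six distinct raw combinations collapse to two; fewer combinations means fewer patterns to cluster in the temporal stage.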

5.2. Temporal classification

To extract the different temporal behaviors of NDVI, we applied the proposed algorithm to the time series of the seven spatially classified scenes. Table 7 gives the cluster validity index V for C varying in the range [Cmin = 5; Cmax = 15]. As shown in this table, the maximum value of V is about 0.99, which corresponds to fifteen classes.


Fig. 7. Histograms of the 7 scenes.

Table 7. Variability of V with C for multitemporal NDVI scenes.

Number of classes C   Cluster validity index V
5                     0.893
6                     0.889
7                     0.930
8                     0.951
9                     0.971
10                    0.962
11                    0.969
12                    0.977
13                    0.983
14                    0.986
15                    0.990


Fig. 8. The 15 obtained NDVI profiles.


Fig. 8 illustrates the temporal evolution of the fifteen obtained NDVI profiles, which are used next to identify the main crop types.

5.3. Crop types identification using NDVI profiles

The crop identification method was designed based on field observations. These field data consisted of thematic classes covering all the species encountered and their combinations. Based on the temporal evolution of the fifteen obtained NDVI profiles (Fig. 8), the profiles can be merged into the following six main classes:

- Bare soil class (profile 4) is easy to identify. This class has a nearly constant NDVI of around 0.15, which corresponds to clay soil [71]. Some fluctuations of NDVI could be explained by variations in soil moisture and by small herbs growing after rainfall events.
- Tree classes are identified by NDVI profiles that are relatively constant over time and above 0.18, given that the majority of them are evergreen trees (olive and citrus trees). There are two tree classes. The first is the tree on bare soil class (profile 5), which is clearly identified by NDVI values higher than 0.43 with variations limited to a range of 0.17. The other class (profiles 6 and 7) is characterized by tree profiles with large NDVI range variations (>0.17) and is labeled as trees with herbs (i.e. over an annual understory).
- Annual crop (cereal) classes are defined by NDVI values rising above 0.18, showing significant vegetation biomass. These classes are also characterized by NDVI values below 0.18 at the beginning and at the end of the growth phase (i.e. a period of bare soil), which distinguishes them from the evergreen tree classes. Annual crops include mainly cereals such as wheat and barley, which can be divided into early and late classes according to their temporal NDVI profiles [71]. Five profiles (profiles 8, 9, 10, 11 and 12) represent early (wheat/barley) sown before 15 December, and three others (profiles 13, 14 and 15) correspond to late (wheat/barley) sown after 15 January, with a narrow growth phase.
- Fallow land class can be defined as land with almost no vegetation or very poorly developed wheat with low NDVI values (i.e. rain-fed wheat). This class is characterized by NDVI values less than 0.4 in the growth phase (profiles 1, 2 and 3).

Table 8 shows the land cover class assigned to each NDVI profile after identification.

Table 8. NDVI profiles merging and their interpretations.

Profile  7 Nov  25 Dec  26 Jan  11 Feb  31 Mar  18 May  27 Jun  Interpretation of classes
1        0.13   0.24    0.24    0.26    0.28    0.17    0.08    Fallow
2        0.18   0.26    0.34    0.39    0.39    0.26    0.27    Fallow
3        0.18   0.21    0.22    0.23    0.36    0.25    0.27    Fallow
4        0.12   0.15    0.20    0.19    0.13    0.17    0.14    Bare soil
5        0.43   0.47    0.48    0.55    0.60    0.49    0.49    Trees on bare soil
6        0.37   0.42    0.39    0.39    0.53    0.44    0.47    Trees with herbs
7        0.28   0.40    0.43    0.50    0.62    0.36    0.27    Trees with herbs
8        0.14   0.17    0.45    0.60    0.78    0.26    0.18    Early (wheat/barley)
9        0.14   0.17    0.38    0.51    0.58    0.24    0.17    Early (wheat/barley)
10       0.13   0.15    0.26    0.41    0.79    0.27    0.16    Early (wheat/barley)
11       0.16   0.19    0.27    0.35    0.60    0.28    0.27    Early (wheat/barley)
12       0.15   0.39    0.52    0.55    0.49    0.24    0.19    Early (wheat/barley)
13       0.13   0.15    0.28    0.34    0.36    0.17    0.08    Late (wheat/barley)
14       0.12   0.15    0.14    0.23    0.61    0.27    0.16    Late (wheat/barley)
15       0.14   0.18    0.09    0.12    0.42    0.30    0.11    Late (wheat/barley)
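The threshold rules given for the tree classes can be written down as a small function. This is a simplified reading of the text, not the authors' procedure: it only covers the two tree classes, it treats "higher than 0.43" as "at least 0.43" because profile 5 starts exactly at 0.43 in Table 8, and it assumes that profile numbers follow the row order of Table 8. The cereal and fallow rules depend on sowing dates and field knowledge and are deliberately left out. All names are hypothetical.

```python
# NDVI profiles transcribed from Table 8
# (7 Nov, 25 Dec, 26 Jan, 11 Feb, 31 Mar, 18 May, 27 Jun).
PROFILES = {
    4: [0.12, 0.15, 0.20, 0.19, 0.13, 0.17, 0.14],  # bare soil
    5: [0.43, 0.47, 0.48, 0.55, 0.60, 0.49, 0.49],  # trees on bare soil
    6: [0.37, 0.42, 0.39, 0.39, 0.53, 0.44, 0.47],  # trees with herbs
    7: [0.28, 0.40, 0.43, 0.50, 0.62, 0.36, 0.27],  # trees with herbs
    8: [0.14, 0.17, 0.45, 0.60, 0.78, 0.26, 0.18],  # early (wheat/barley)
}

EVERGREEN_MIN = 0.18   # tree profiles stay above this value all season
DENSE_TREE_MIN = 0.43  # lower bound for trees on bare soil (assumed inclusive)
MAX_TREE_RANGE = 0.17  # limited seasonal variation for trees on bare soil


def tree_label(profile):
    """Return a tree class label, or None when the profile is not a tree class."""
    lowest, highest = min(profile), max(profile)
    # Round to two decimals, the precision of the published values.
    ndvi_range = round(highest - lowest, 2)
    if lowest <= EVERGREEN_MIN:
        return None  # falls to bare-soil levels at some date
    if lowest >= DENSE_TREE_MIN and ndvi_range <= MAX_TREE_RANGE:
        return "trees on bare soil"
    return "trees with herbs"


labels = {number: tree_label(values) for number, values in PROFILES.items()}
print(labels)
```

For the five profiles listed, the labels agree with the interpretation column of Table 8, with bare soil and cereal profiles returned as `None`.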


Fig. 9. Land cover map obtained after classification and merging.


After merging, the obtained classes give the land cover map illustrated in Fig. 9, with the following percentages: 17.24% bare soil, 12.14% fallow, 39.47% late (wheat/barley), 22.44% early (wheat/barley), 2.57% trees on bare soil and 6.13% trees with herbs. These results agree with previous studies that used the same data set with other classification techniques [71,72]. Er-Raki et al. [71] used K-means to classify the cereals and found two main classes, early and late sowing wheat, as in this work. Simonneaux et al. [72] used a supervised classification method based on simple phenological criteria for each crop. This method, called a decision tree [78–80], uses the minimum, the maximum or the range of NDVI as the phenological criteria. They obtained a general land cover (annual crops, trees, annual crops + trees, bare soils). Compared with the classification presented here, they did not divide the annual crops class into early and late sowing cereals and did not separate it from the fallow land class. In Spain, Julien et al. [9] used the Yearly Land Cover Dynamics (YLCD) approach, based on the annual behavior of LST (Land Surface Temperature) and NDVI. A time series of LANDSAT-5 images was used to classify an agricultural area into crop types using maximum likelihood classification. They obtained the main classes: cereals, irrigated and non-irrigated crops. As in this work, wheat and barley were merged into a single class (cereals) due to their NDVI similarity, while the irrigated and non-irrigated crops were separated into different classes due to strong differences in their NDVI and LST annual behaviors.

5.4. Validation of the obtained results

In order to check the accuracy of our approach, we compared the obtained land use with the real one established in the study region. During the 2002–2003 season, data sets were collected by the VALERI Program [72,77] on a series of 450 sample plots distributed across the plain (Fig. 10 and Table 9). We merged the classes representing the same type of cover: building was added to the bare soil class, olive trees to the trees on bare soil class and barley to the cereals class, so that the survey could be compared with the classification results. To visualize the performance of the proposed algorithm, a matching matrix is presented in Table 10. This matrix was obtained by comparing the proposed automatic PNN classification with the validation data mentioned above. Table 11 and Fig. 11 show the results of this comparison. The overall accuracy is computed as the proportion of true prediction results (samples correctly classified) [81]. The classes shown in Table 11 are recognized with an overall accuracy of 96.56%, which is higher than in other studies [9,72]. This high accuracy

Fig. 10. Mapping of vegetation types survey in the region by sampling during 2002–2003 season.

Table 9. Land cover of the region by sampling in 2002–2003 season.

Classes                  Number of parcels   Percentage
Cereals                  234                 52.00%
Barleys                  29                  6.45%
Fallow/not cultivated    59                  13.11%
Alfalfa                  4                   0.89%
Olive trees              5                   1.11%
Building                 3                   0.67%
Bare soil – fallow       77                  17.11%
Trees on bare soil       11                  2.44%
Trees with herbs         28                  6.22%
Total                    450                 100%
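The merging of survey classes described in Section 5.4 amounts to summing parcel counts before converting them to percentages. The sketch below uses the counts of Table 9; the mapping `MERGE_INTO` and the helper names are hypothetical, and folding "Bare soil – fallow" into a single "Bare soil" class is an assumption made here because it matches the 17.78% reported for bare soil in Table 11.

```python
from collections import defaultdict

# Parcel counts transcribed from Table 9 (450 sample plots in total).
SURVEY = {
    "Cereals": 234,
    "Barleys": 29,
    "Fallow/not cultivated": 59,
    "Alfalfa": 4,
    "Olive trees": 5,
    "Building": 3,
    "Bare soil - fallow": 77,
    "Trees on bare soil": 11,
    "Trees with herbs": 28,
}

# Merging rules stated in the text, plus the assumed bare soil renaming.
MERGE_INTO = {
    "Barleys": "Cereals",
    "Olive trees": "Trees on bare soil",
    "Building": "Bare soil",
    "Bare soil - fallow": "Bare soil",
}


def merge_classes(counts, merge_into):
    """Sum parcel counts of classes that represent the same type of cover."""
    merged = defaultdict(int)
    for name, parcels in counts.items():
        merged[merge_into.get(name, name)] += parcels
    return dict(merged)


def percentages(counts):
    """Convert parcel counts into percentages of the total."""
    total = sum(counts.values())
    return {name: 100.0 * parcels / total for name, parcels in counts.items()}


merged = merge_classes(SURVEY, MERGE_INTO)
shares = percentages(merged)
for name, share in shares.items():
    print(f"{name}: {merged[name]} parcels, {share:.2f}%")
```

Shares recomputed from the counts can differ by about 0.01 from the "% by sampling" column of Table 11, presumably because the published column adds percentages that were already rounded.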


Fig. 11. Comparison of land cover results using the proposed classification and by sampling.

Fig. 12. Land cover map obtained after classification and merging using FCM.

Table 10. Results of matching matrix using the proposed method (rows: actual classes; columns: predicted classes).

Actual classes        Cereals   Fallow   Trees on bare soil   Trees with herbs   Bare soil   Alfalfa
Cereals               100%      0%       0%                   0%                 0%          0%
Fallow                5.91%     92.6%    0%                   0%                 1.49%       0%
Trees on bare soil    5.08%     0%       72.39%               22.53%             0%          0%
Trees with herbs      0.6%      0%       0.85%                98.55%             0%          0%
Bare soil             0.4%      2.58%    0%                   0%                 97.02%      0%
Alfalfa               0%        0%       0%                   0%                 0%          0%
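The per-class precision values reported later in Table 11 are the diagonal of this matching matrix. As a hedged illustration, the sketch below reads the matrix with actual classes as rows and predicted classes as columns (an interpretation supported by the rows summing to 100%), extracts the diagonal and reports the largest confusion of a class. The variable and function names are hypothetical.

```python
CLASSES = [
    "Cereals", "Fallow", "Trees on bare soil",
    "Trees with herbs", "Bare soil", "Alfalfa",
]

# Table 10 in percent: rows are actual classes, columns are predicted classes.
MATRIX = [
    [100.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [5.91, 92.6, 0.0, 0.0, 1.49, 0.0],
    [5.08, 0.0, 72.39, 22.53, 0.0, 0.0],
    [0.6, 0.0, 0.85, 98.55, 0.0, 0.0],
    [0.4, 2.58, 0.0, 0.0, 97.02, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]

# Per-class precision as used in the paper: the diagonal entry of each row.
precision = {name: MATRIX[i][i] for i, name in enumerate(CLASSES)}

# Each recognized class should account for 100% of its validation samples.
row_totals = {name: round(sum(MATRIX[i]), 2) for i, name in enumerate(CLASSES)}


def main_confusion(actual):
    """Return the predicted class that absorbs most errors of one actual class."""
    i = CLASSES.index(actual)
    errors = [(MATRIX[i][j], CLASSES[j]) for j in range(len(CLASSES)) if j != i]
    share, predicted = max(errors)
    return predicted, share


print(precision)
print(main_confusion("Trees on bare soil"))
```

Read this way, the weakest class, trees on bare soil, is mostly confused with trees with herbs, while the alfalfa row is empty because that class was not recognized.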


Fig. 13. Land cover map obtained after classification and merging using K-means.

Table 11. Comparison of land cover results.

Classes                    % by sampling   % by PNN classification   Classification precision
Cereals (wheat + barley)   58.45           61.91                     100%
Fallow/not cultivated      13.11           12.14                     92.60%
Trees on bare soil         3.55            2.57                      72.39%
Trees with herbs           6.22            6.13                      98.55%
Bare soil                  17.78           17.25                     97.02%
Alfalfa                    0.89            –                         0%
Total                      100             100                       Accuracy(a) = 96.56%

(a) Accuracy = Σ (precision × % sampling).

Table 12. Performance comparison between unsupervised PNN, FCM and K-means (classification precision).

Classes                    Unsupervised PNN   FCM                    K-means
Cereals (wheat + barley)   100%               70.15%                 74.82%
Fallow/not cultivated      92.6%              99.16%                 95.35%
Trees on bare soil         72.39%             92.12% (merged class)  84.51%
Trees with herbs           98.55%             (merged class)         93.09%
Bare soil                  97.02%             89.98%                 95.61%
Alfalfa                    0%                 0%                     0%
Overall accuracy           96.56%             79%                    82.02%
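The accuracy footnote can be checked numerically by weighting each class precision with its share of the validation samples. The sketch below uses the published values of Tables 11 and 12; the dictionary and function names are hypothetical. For FCM, the single precision reported for the merged tree class is applied to both tree classes, which is an assumption made here.

```python
# Share of each class in the validation sample (Table 11, % by sampling).
SAMPLING = {
    "Cereals": 58.45,
    "Fallow": 13.11,
    "Trees on bare soil": 3.55,
    "Trees with herbs": 6.22,
    "Bare soil": 17.78,
    "Alfalfa": 0.89,
}

# Classification precision per class and method (Table 12, in percent).
PRECISION = {
    "PNN": {
        "Cereals": 100.0, "Fallow": 92.60, "Trees on bare soil": 72.39,
        "Trees with herbs": 98.55, "Bare soil": 97.02, "Alfalfa": 0.0,
    },
    # Assumption: the merged FCM tree class uses 92.12 for both tree classes.
    "FCM": {
        "Cereals": 70.15, "Fallow": 99.16, "Trees on bare soil": 92.12,
        "Trees with herbs": 92.12, "Bare soil": 89.98, "Alfalfa": 0.0,
    },
    "K-means": {
        "Cereals": 74.82, "Fallow": 95.35, "Trees on bare soil": 84.51,
        "Trees with herbs": 93.09, "Bare soil": 95.61, "Alfalfa": 0.0,
    },
}


def overall_accuracy(precision, sampling):
    """Sum of precision x sampling share, both in percent, rescaled to percent."""
    return sum(precision[name] * sampling[name] for name in sampling) / 100.0


accuracy = {
    method: overall_accuracy(values, SAMPLING)
    for method, values in PRECISION.items()
}
for method, value in accuracy.items():
    print(f"{method}: {value:.2f}%")
```

With these inputs the weighted sums come out at about 96.54% for PNN, 79.00% for FCM and 82.02% for K-means, close to the reported figures; the small gap for PNN is presumably due to rounding in the published values.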

demonstrates that the proposed approach is globally able to retrieve the existing crop types in the region automatically and accurately. The alfalfa class is characterized by an NDVI profile with frequent variations due to repeated cutting, and thus it was not recognized. More successive cloud-free scenes could overcome this misclassification.

5.5. Performance comparison between unsupervised PNN, FCM and K-means

To highlight the performance of the proposed method, a comparative study with other usual classification methods (FCM, K-means) was carried out using the same time series of seven NDVI images. The land cover maps obtained using FCM and K-means are shown in Figs. 12 and 13, respectively. The performance of the three methods is compared in Table 12. As expected, the FCM method gave a lower accuracy (79%) and a lower estimate of the cluster number: two classes (trees with herbs and trees on bare soil) were merged due to their similar cluster distributions. The K-means method did a reasonable job, with 82.02% accuracy and detailed classes (correct number and type). In conclusion, the proposed approach using PNN provides better results, with a higher overall accuracy (96.56%), than the other methods.

6. Conclusion

In this work, we have proposed an unsupervised approach based on Probabilistic Neural Network with the implementation of a cluster validity technique using Ward's method. This technique was first validated through a series of tests including Fisher's Iris data set and synthetic grayscale and RGB digital images. A comparison with classical automatic clustering by FCM using the same concept of cluster validation showed that the proposed algorithm was more accurate. The strength of this approach is its capability to solve a classification problem with an unknown number of classes. This is precisely the case of land use classification, which deals with large multidimensional data sets such as multidimensional remote sensing images, for which effective visualization of the data set and prediction of the class number are difficult. The developed approach was therefore applied to a time series of seven NDVI remote sensing images acquired by LANDSAT and SPOT to build a land use map. Spatial and temporal classifications were adopted. The procedure has proven its efficiency in distinguishing between different classes and in determining the land cover, especially for large surfaces where the available information on soil and crops is limited. The obtained results were compared with the real land use and showed an overall accuracy of 96.56%, which is higher than that of other usual methods such as FCM and K-means. Thus, the implementation of a cluster validity technique in PNN gives rise to a reliable tool for data classification, especially for massive data such as multilayer images.


The principal advantages of the proposed approach are: (1) it is completely automatic, with no parameter adjustment and instantaneous training, (2) it is highly capable of producing good cluster number estimates, (3) it provides a new point of view on using PNN as an unsupervised classifier, and (4) it is rapid and easy to implement in soft computing for classification.

Uncited references

[6,10,11,15–18].

Acknowledgements

The authors are grateful to the International Joint Laboratory TREMA (http://trema.ucam.ac.ma/) and CNES for providing the satellite data.

References

[1] L. Zheng, X. He, Classification techniques in pattern recognition, in: Proceedings of the 13th International Conference in Central Europe on Computer Graphics, Visualization and computer vision (WSCG 2005), 2005, pp. 77–78.
[2] V. Mohan, A. Kannan, Color image classification and retrieval using image mining techniques, Int. J. Eng. Sci. Technol. 2 (5) (2010) 1014–1020.
[3] T.N. Phyu, Survey of classification techniques in data mining, in: Proceedings of the International MultiConference of Engineers and Computer Scientists (IMECS 2009), 2009, pp. 978–988.
[4] J.S. Wang, W.C. Chiang, Y.L. Hsu, Y.T.C. Yang, ECG arrhythmia classification using a probabilistic neural network with a feature reduction method, Neurocomputing 116 (2013) 38–45.
[5] J.I. Arribas, G.V. Sánchez-Ferrero, G. Ruiz-Ruiz, J. Gómez-Gil, Leaf classification in sunflower crops by computer vision and neural networks, Comput. Electron. Agric. 78 (1) (2011) 9–18.
[6] R. Raghuraj, S. Lakshminarayanan, Variable predictive model based classification algorithm for effective separation of protein structural classes, Comput. Biol. Chem. 32 (4) (2008) 302–306.
[7] F. Kaefer, C.M. Heilman, S.D. Ramenofsky, A neural network application to consumer classification to improve the timing of direct marketing activities, Comput. Oper. Res. 32 (10) (2005) 2595–2615.
[8] N. Huang, D. Xu, X. Liu, L. Lin, Power quality disturbances classification based on S-transform and probabilistic neural network, Neurocomputing 98 (2012) 12–23.
[9] Y. Julien, J.A. Sobrino, J.-C. Jiménez-Munoz, Land use classification from multitemporal Landsat imagery using the Yearly Land Cover Dynamics (YLCD) method, Int. J. Appl. Earth Obs. Geoinf. 13 (2011) 711–720.
[10] R. Geerken, B. Zaitchik, J.P. Evans, Classifying rangeland vegetation type and coverage from NDVI time series using Fourier Filtered Cycle Similarity, Int. J. Remote Sens. 26 (24) (2005) 5535–5554.
[11] A. Halder, A. Ghosh, S. Ghosh, Supervised and unsupervised land use map generation from remotely sensed images using ant based systems, Appl. Soft Comput. 11 (2011) 5770–5781.
[12] J. Sun, J. Yang, C. Zhang, W. Yun, J. Qu, Automatic remotely sensed image classification in a grid environment based on the maximum likelihood method, Math. Comput. Modell. 58 (3–4) (2013) 573–581.
[13] Q. Lü, M. Tang, Detection of hidden bruise on kiwi fruit using hyperspectral imaging and parallelepiped classification, Procedia Environ. Sci. 12 (B) (2012) 1172–1179.
[14] C. Chen, Fuzzy training data for fuzzy supervised classification of remotely sensed images, in: Proceedings of 20th Asian Conference on Remote Sensing (ACRS 1999), 1999, pp. 460–465.
[15] A. Ghosh, S. Meher, B.U. Shankar, A novel fuzzy classifier based on product aggregation operator, Pattern Recognit. 41 (6) (2008) 961–971.
[16] F. Maselli, A. Rodolfi, C. Copnese, Fuzzy classification of spatially degraded thematic Mapper data for the estimation of sub-pixel components, Int. J. Remote Sens. 17 (3) (1996) 537–551.
[17] F. Melgani, B.A. Hashemy, S. Taha, An explicit fuzzy supervised classification method for multispectral remote sensing images, IEEE Trans. Geosci. Remote Sens. 38 (1) (2000) 287–295.
[18] Y. Liu, B. Zhang, L.-m. Wang, N. Wang, A self-trained semisupervised SVM approach to the remote sensing land cover classification, Comput. Geosci. 59 (2013) 98–107.
[19] D.M. Miller, E.J. Kaminsky, S. Rana, Neural network classification of remote-sensing data, Comput. Geosci. 21 (1995) 377–386.
[20] J. Zeng, H.-f. Guo, Y.-m. Hu, Artificial neural network model for identifying taxi gross emitter from remote sensing data of vehicle emission, J. Environ. Sci. 19 (2007) 427–431.
[21] M. Brown, H. Lewis, S. Gunn, Linear spectral mixture models and support vector machines for remote sensing, IEEE Trans. Geosci. Remote Sens. 38 (5) (2000) 2346–2360.

[22] C. Huang, L. Davis, J. Townshend, An assessment of support vector machines for land cover classification, Int. J. Remote Sens. 23 (2002) 725–749.
[23] D. Stathakis, A. Vasilakos, Comparison of computational intelligence based classification techniques for remotely sensed optical image classification, IEEE Trans. Geosci. Remote Sens. 44 (8) (2008) 2305–2318.
[24] R. Laprade, Split-and-merge segmentation of aerial photographs, Comput. Vis. Graphics Image Process. 48 (1) (1988) 77–86.
[25] B.J. Irvin, S.J. Ventura, B.K. Slater, Fuzzy and isodata classification of landform elements from digital terrain data in Pleasant Valley Wisconsin, Geoderma 77 (2–4) (1997) 137–154.
[26] S. Pal, A. Ghosh, B. Uma Shankar, Segmentation of remotely sensed images with fuzzy thresholding and quantitative evaluation, Int. J. Remote Sens. 21 (11) (2000) 2269–2300.
[27] R. Cannon, R. Dave, J. Bezdek, M. Trivedi, Segmentation of a thematic Mapper image using fuzzy c-means clustering algorithm, IEEE Trans. Geosci. Remote Sens. 24 (1) (1986) 400–408.
[28] M.N. Kurnaz, Z. Dokur, T. Olmez, Segmentation of remote-sensing images by incremental neural network, Pattern Recogn. Lett. 26 (8) (2005) 1104–1316.
[29] Z. Zhou, S. Wei, X. Zhang, X. Zhao, Remote sensing image segmentation based on self-organizing map at multiple scale, in: Proceedings of SPIE Geoinformatics: Remotely Sensed Data and Information, USA, 2007, pp. 122–126.
[30] Y. Wong, E. Posner, A new clustering algorithm applicable to polarimetric and SAR images, IEEE Trans. Geosci. Remote Sens. 31 (3) (1993) 634–644.
[31] G.P. Zhang, Neural networks for classification: a survey, IEEE Trans. Syst. Man Cybernet. C: Appl. Rev. 30 (4) (November 2000).
[32] G. Cybenko, Approximation by superpositions of a sigmoidal function, Math. Control. Signals Syst. 2 (1989) 303–314.
[33] K. Hornik, Approximation capabilities of multilayer feedforward networks, Neural Networks 4 (1991) 251–257.
[34] K. Hornik, M. Stinchcombe, H. White, Multilayer feedforward networks are universal approximators, Neural Networks 2 (1989) 359–366.
[35] M.D. Richard, R. Lippmann, Neural network classifiers estimate Bayesian a posteriori probabilities, Neural Comput. 3 (1991) 461–483.
[36] D.F. Specht, Probabilistic neural networks for classification, mapping, or associative memory, in: IEEE International Conference on Neural Networks 1, July, 1988, pp. 525–532.
[37] D.F. Specht, Probabilistic neural networks, Neural Networks 3 (1) (1990) 109–118.
[38] T.D. Ganchev, D.K. Tasoulis, M.N. Vrahatis, N.D. Fakotakis, Generalized locally recurrent probabilistic neural networks with application to text-independent speaker verification, Neurocomputing 70 (2007) 1424–1438.
[39] X. Fu, Y. Ying, Y. Zhou, H. Xu, Application of probabilistic neural networks in qualitative analysis of near infrared spectra: determination of producing area and variety of loquats, Anal. Chim. Acta 598 (2007) 27–33.
[40] A.L.I. Oliveira, F.R.G. Costa, C.O.S. Filho, Novelty detection with constructive probabilistic neural networks, Neurocomputing 71 (2008) 1046–1053.
[41] J. Grim, J. Hora, Iterative principles of recognition in probabilistic neural networks, Neural Networks 21 (2008) 838–846.
[42] H. Adeli, A. Panakkat, A probabilistic neural network for earthquake magnitude prediction, Neural Networks 22 (2009) 1018–1024.
[43] F. Ozturk, F. Ozen, A new license plate recognition system based on probabilistic neural networks, Procedia Technol. 1 (2012) 124–128.
[44] O. Er, A.C. Tanrikulu, A. Abakay, F. Temurtas, An approach based on probabilistic neural network for diagnosis of Mesothelioma's disease, Comput. Electr. Eng. 38 (2012) 75–81.
[45] J. Jia, C. Liang, J. Cao, Z. Li, Application of probabilistic neural network in bacterial identification by biochemical profiles, J. Microbiol. Methods 94 (2013) 86–87.
[46] S. Timung, T.K. Mandal, Prediction of flow pattern of gas–liquid flow through circular microchannel using probabilistic neural network, Appl. Soft Comput. 13 (2013) 1674–1685.
[47] G.E. Tsekouras, J. Tsimikas, On training RBF neural networks using input–output fuzzy clustering and particle swarm optimization, Fuzzy Sets Syst. 221 (2013) 65–89.
[48] J. González, I. Rojas, H. Pomares, J. Ortega, A. Prieto, A new clustering technique for function approximation, IEEE Trans. Neural Networks 13 (1) (2002) 132–142.
[49] H.-S. Park, W. Pedrycz, S.-K. Oh, Granular neural networks and their development through context-based clustering and adjustable dimensionality of receptive fields, IEEE Trans. Neural Networks 20 (10) (2009) 1604–1616.
[50] W. Pedrycz, Conditional fuzzy clustering in the design of radial basis function neural networks, IEEE Trans. Neural Networks 9 (4) (1998) 601–612.
[51] W. Pedrycz, H.S. Park, S.K. Oh, A granular-oriented development of functional radial basis function neural networks, Neurocomputing 72 (2008) 420–435.
[52] H.-S. Park, Y.-D. Chung, S.-K. Oh, W. Pedrycz, H.-K. Kim, Design of information granule-oriented RBF neural networks and its application to power supply for high-field magnet, Eng. Appl. Artif. Intell. 24 (2011) 543–554.
[53] S.B. Roh, T.C. Ahn, W. Pedrycz, The design methodology of radial basis function neural networks based on fuzzy K-nearest neighbors approach, Fuzzy Sets Syst. 161 (13) (2010) 1803–1822.
[54] A. Staiano, R. Tagliaferri, W. Pedrycz, Improving RBF networks performance in regression tasks by means of a supervised fuzzy clustering, Neurocomputing 69 (2006) 1570–1581.
[55] Z. Uykan, C. Guzelis, M.E. Celebi, H.N. Koivo, Analysis of input–output clustering for determining centers of RBFNN, IEEE Trans. Neural Networks 11 (4) (2000) 851–858.


[56] J.H. Ward, Hierarchical grouping to optimize an objective function, J. Am. Stat. Assoc. 58 (1963) 236–244.
[57] A.M. Dillner, J.J. Schauer, W.F. Christensen, G.R. Cass, A quantitative method for clustering size distributions of elements, Atmos. Environ. 39 (2005) 1525–1537.
[58] H.C. Lu, C.L. Chang, J.C. Hsieh, Classification of PM10 distributions in Taiwan, Atmos. Environ. 40 (2006) 1452–1463.
[59] T. Varin, R. Bureau, C. Mueller, P. Willett, Clustering files of chemical structures using the Székely–Rizzo generalization of Ward's method, J. Mol. Graphics Modell. 28 (2009) 187–195.
[60] N. Picard, F. Mortier, V. Rossi, S. Gourlet-Fleury, Clustering species using a model of population dynamics and aggregation theory, Ecol. Modell. 221 (2010) 152–160.
[61] A. Carteron, M. Jeanmougin, F. Leprieur, S. Spatharis, Assessing the efficiency of clustering algorithms and goodness-of-fit measures using phytoplankton field data, Ecol. Inform. 9 (2012) 64–68.
[62] C.S. Malley, C.F. Braban, M.R. Heal, The application of hierarchical cluster analysis and non-negative matrix factorization to European atmospheric monitoring site classification, Atmos. Res. 138 (2014) 30–40.
[63] Y. Xiao, C. Mignolet, J.-F. Mari, M. Benoît, Modeling the spatial distribution of crop sequences at a large regional scale using land-cover survey data: a case from France, Comput. Electron. Agric. 102 (2014) 51–63.
[64] S. Hands, B. Everitt, A Monte Carlo study of the recovery of cluster structure in binary data by hierarchical clustering techniques, Multivar. Behav. Res. 22 (1987) 235–243.
[65] A. Ghosh, N.S. Mishra, S. Ghosh, Fuzzy clustering algorithms for unsupervised change detection in remote sensing images, Inform. Sci. 181 (2011) 699–715.
[66] W. Wang, Y. Zhang, On fuzzy cluster validity indices, Fuzzy Sets Syst. 158 (2007) 2095–2117.
[67] K.-L. Wu, M.-S. Yang, A cluster validity index for fuzzy clustering, Pattern Recogn. Lett. 26 (2005) 1275–1291.
[68] A. Chehbouni, R. Escadafal, G. Boulet, B. Duchemin, V. Simonneaux, G. Dedieu, B. Mougenot, S. Khabba, H. Kharrou, O. Merlin, A. Chaponnière, J. Ezzahar, S. Er-Raki, J. Hoedjes, R. Hadria, H. Abourida, A. Cheggour, F. Raibi, L. Hanich, N. Guemouria, Ah. Chehbouni, A. Lahrouni, A. Olioso, F. Jacob, J. Sobrino, The use of remotely sensed data for integrated hydrological modeling in arid and semiarid regions: the SUDMED program, Int. J. Remote Sens. 29 (2008) 5161–5181.
[69] R. Hadria, B. Duchemin, A. Lahrouni, S. Khabba, S. Er-Raki, G. Dedieu, A. Chehbouni, Monitoring of irrigated wheat in a semi-arid climate using crop modelling and remote sensing data: impact of satellite revisit time frequency, Int. J. Remote Sens. 27 (2006) 1093–1117.
[70] B. Duchemin, R. Hadria, S. Erraki, G. Boulet, P. Maisongrande, A. Chehbouni, R. Escadafal, J. Ezzahar, J.C.B. Hoedjes, M.H. Kharrou, S. Khabba, B. Mougenot, A. Olioso, J.C. Rodriguez, V. Simonneaux, Monitoring wheat phenology and irrigation in Central Morocco: on the use of relationships between evapotranspiration, crops coefficients, leaf area index and remotely sensed vegetation indices, Agric. Water Manage. 79 (2006) 1–27.
[71] S. Er-Raki, A. Chehbouni, N. Guemouria, B. Duchemin, J. Ezzahar, R. Hadria, Combining FAO-56 model and ground-based remote sensing to estimate water consumptions of wheat crops in a semi-arid region, Agric. Water Manage. 87 (2007) 41–54.
[72] V. Simonneaux, B. Duchemin, D. Helson, S. Er-Raki, A. Olioso, A.G. Chehbouni, The use of high-resolution image time series for crop classification and evapotranspiration estimate over an irrigated area in central Morocco, Int. J. Remote Sens. 29 (2008) 95–116.
[73] G.M. Foody, Thematic mapping from remotely sensed data with neural networks: MLP, RBF and PNN based approaches, J. Geograph. Syst. 3 (2001) 217–232.
[74] R.N. Dave, Validating fuzzy partition obtained through c-shells clustering, Pattern Recogn. Lett. 17 (1996) 613–623.
[75] R.A. Fisher, The use of multiple measurements in taxonomic problems, Ann. Eugenics 7 (2) (1936) 179–188.
[76] E. Anderson, The species problem in Iris, Ann. MO Bot. Gard. 23 (3) (1936) 457–509.
[77] VALERI (Validation of Land European Remote sensing Instruments) Program, http://w3.avignon.inra.fr/valeri/.
[78] D. Lloyd, A phenological classification of terrestrial vegetation cover using shortwave vegetation index imagery, Int. J. Remote Sens. 11 (1990) 2269–2279.
[79] R.S. De Fries, M. Hansen, J.R.G. Townshend, R. Sohlberg, Global land cover classifications at 8 km spatial resolution: the use of training data derived from Landsat imagery in decision tree classifiers, Int. J. Remote Sens. 19 (1998) 3141–3168.
[80] S.S. Ray, V.K. Dadhwal, Estimation of crop evapotranspiration of irrigation command area using remote sensing and GIS, Agric. Water Manage. 49 (2001) 239–249.
[81] R. Congalton, K. Green, Assessing the Accuracy of Remotely Sensed Data, Lewis Publications, Boca Raton, FL, 1999.
