Hierarchical clustering of self-organizing maps for cloud classification

Hierarchical clustering of self-organizing maps for cloud classification

Neurocomputing 30 (2000) 47}52 Hierarchical clustering of self-organizing maps for cloud classi"cation Christophe Ambroise *, Genie`ve Se`ze, Fouad...

83KB Sizes 0 Downloads 21 Views

Neurocomputing 30 (2000) 47}52

Hierarchical clustering of self-organizing maps for cloud classi"cation Christophe Ambroise *, Genie`ve Se`ze, Fouad Badran, Sylvie Thiria UMR CNRS, 6599 Heudiasyc, 60205 Compie% gne, France E! cole Polytechnique, 91128 Palaiseau Cedex, France Universite& Pierre et Marie Curie, 75005 Paris, France Conservatoire National des Arts et Me& tiers, 75003 Paris, France Received 7 October 1998; revised 13 February 1999; accepted 10 March 1999

Abstract This paper presents a new method for segmenting multispectral satellite images. The proposed method is unsupervised and consists of two steps. During the "rst step the pixels of a learning set are summarized by a set of codebook vectors using a Probabilistic SelfOrganizing Map (PSOM, Statistique et meH thodes neuronales, Dunod, Paris, 1997). In a second step the codebook vectors of the map are clustered using Agglomerative Hierarchical Clustering (AHC, Pattern Recognition and Neural Networks, Cambridge University Press, Cambridge, 1996). Each pixel takes the label of its nearest codebook vector. A practical application to Meteosat images illustrates the relevance of our approach.  2000 Elsevier Science B.V. All rights reserved. Keywords: Kohonen maps; Image segmentation; Hierarchical clustering; Cloud classi"cation

1. Introduction Classi"cation of remote sensed imagery is usually supervised. One of the problems with cloud classi"cation is the di$culty of getting a great number of labeled samples which represent the diversity of all possible situations. As supervised learning uses only labeled samples for determining the classi"er parameters and in assessing the

* Corresponding author. E-mail address: [email protected] (C. Ambroise) 0925-2312/00/$ - see front matter  2000 Elsevier Science B.V. All rights reserved. PII: S 0 9 2 5 - 2 3 1 2 ( 9 9 ) 0 0 1 4 1 - 1

48

C. Ambroise et al. / Neurocomputing 30 (2000) 47}52

classi"er performance, it can lead to biased results: some of the classes may not be represented or under sampled. Based on the dynamic cluster analysis method [3], an unsupervised cloud classi"cation algorithm using visible and infrared radiance "elds, has been developed by Se`ze and Desbois [2]. This classi"cation technique, used in a cloud and climate research context, has been tested and compared with other techniques during the preliminary phase of the International Satellite Cloud Climatology Program [8]. Further improvements, in particular, the introduction of two textural parameters, and processing of large data sets at di!erent times, have been made [9]. In this paper, we consider that the classi"cation obtained with this technique is globally well representative of the observed cloud "elds. An attempt to reproduce this classi"cation with a method based on Probabilistic Self-organizing Maps, is presented. This method works in two steps. The "rst step approximates the distribution of the pixels to be classi"ed using a topological map. Hierarchical clustering is then used for labeling the neurons. The advantage of such an approach compared to the dynamic cluster method is the stability of the solution and the possibility of cutting the hierarchy at the desired level of cluster agglomeration.

2. Combining probabilistic SOM with hierarchical clustering Agglomerative Hierarchical Clustering can be used for classifying several thousand objects, but is unable to deal with millions of objects (i.e. pixels of a METEOSAT image). Using probabilistic topological algorithms in the "rst stage allows to summarize the initial data set into a smaller set of codebook vectors which can be clustered using a hierarchical clustering algorithm [5]. The topological algorithm typically provides a partition of a thousand odd clusters, which is used as a starting point for hierarchical clustering. Let us detail the two steps of this procedure. 2.1. Probabilistic SOM Vector quantization algorithms are classically used to produce an approximation to the distribution of a data set. These algorithms "nd a set of codebook vectors to code compactly all vectors of the dataset. Compared to classical vector quantization methods, topological maps introduce an additional relation between the codebook vectors. This `topologicala relationship is a constraint which has been proved to produce robust results [4]. The most well-known topological algorithm is the Kohonen Self-Organizing Maps Algorithm. It assumes implicitly that the distribution to approximate is a Gaussian mixture where each component is de"ned by a mean vector (the codebook vector) and a covariance matrix of the form R "jI (where j is a scalar and I the identity matrix). I All the components are supposed to have the same covariance matrix. Generalizations of this algorithm have recently arisen in the framework of this probabilistic interpretation of Kohonen algorithm [1,10]. These generalizations relax the hypothesis concerning the covariance matrices and have been used in this paper.

C. Ambroise et al. / Neurocomputing 30 (2000) 47}52

49

2.2. Agglomerative hierarchical clustering Agglomerative Hierarchical Clustering is an iterative procedure which is based on the following principle : (1) Find the two closest clusters according to some dissimilarity. (2) Agglomerate them to form a new cluster. The "rst iteration considers the partition induced by the codebook vectors of the map. The following iterations require us to de"ne a dissimilarity between the merged clusters and all other clusters. There are many ways to do so. The most widely considered rule for merging clusters may be the method proposed by Ward [7]. This method measures the cluster dissimilarity using a weighted squared distance between the two cluster centers kl and k I n nl I ""k !kl"", n #nl I I where n and nl represent the cardinalities of clusters k and l. This distance has been I used in this paper. The labeled map can then be used for classifying a new set of objects resulting from similar measurements: Each object inherits the label of the nearest codebook vector.

3. Cloud classi5cation experiment The data set we consider here, is extracted from a much larger data set built in the framework of the POLDER (Polarization and Directionality of the Earth's Re#ectances) data cloud detection algorithm validation. This larger data set provided by MeH teH o France consists of METEOSAT full-"eld view visible and infrared radiance "eld, at 5 km resolution and one half-hour time period between 06 : 00Z and 15 : 00Z for the 30 October}10 November 1996 period. This data set was calibrated and the visible radiance was transformed into re#ectance by applying a correction for the solar zenith angle. The dynamic cluster analysis method was applied using four parameters, two spectral ones, the visible and the infrared and the two structural ones, the local spatial standard deviation of the visible and infrared radiances (3;3 neighborhood). The data for the di!erent days and times were processed following Ra!aelli and Se`ze [6], for "ve latitudinal regions over Atlantic. In our present study, only one region was retained, the Tropical Atlantic region and one time, 12 : 00Z. This set, which consists of 3520000 pixels, was then randomly sampled. The "nal sub-set, used to apply the technique presented in Section 2, is a collection of 70000 pixels. A visual comparison of histograms computed with the whole data set and histograms computed using the randomly sampled pixels shows that the original population and the sample have close distributions. Each pixel is characterized by the four variables mentioned above.

50

C. Ambroise et al. / Neurocomputing 30 (2000) 47}52

At the end of the clustering process, the whole set of pixels is classi"ed and classi"ed images for the 10 day period were built up. At that step, a choice must be made about the number of classes. We chose to partition the pixels into 20 and 11 classes. The number of classes given for this region by the dynamic cluster is 11. For each of the 10 processed days, we have 3 classi"ed images: The image obtained with the proposed method and a choice of 20 classes (SOM20), the image obtained with the proposed method and a choice of 11 classes (SOM11) and the reference classi"cation image (DCAM Dynamic Cluster Analysis Method). In the DCAM classi"cation, a cloud type label has been manually attributed to each class. This labeling is based on the analysis of the parameter value of the gravity centers and the analysis of local pixel neighborhood in the classi"ed images. Labeling in cloud type of SOM20 and SOM11 is "rst performed using the cooccurence matrix between these classi"cations and DCAM. The labeling obtained in this way is in agreement with the analysis of the center of gravity value. A "rst observation relates to the large di!erence in the clear sky percentage between DCAM (43%) and SOM20 (63%). The 43% of clear DCAM pixels are also detected as clear by the SOM20 technique. The remaining 20% of clear pixels in the SOM20 technique are all detected as thin edges or partly clear or cloudy pixels (cf. Tables 1 and 2). Except for that discrepancy, the comparison is satisfactory. Considering only the pixels declared as cloudy in the two classi"cations (37%), the agreement in the cloud type comparison is very high. Only 6% of the pixels are in strong disagreement (confusion between pixels partially covered by low clouds and very thin cirrus) and 71% are in full agreement. For the 6% of pixels in disagreement, analysis of the local spatial neighboring pixel label in the two classi"cations indicates the presence of the two cloud types in the close vicinity of those pixels. Another indicator of the resemblance between the two classi"cations has been to compare the spatial `compactnessa of classes in DCAM (connectedness of pixels belonging to the same class). This `compactnessa is slightly better in SOM20 than in DCAM.

Table 1 Breakdown of the SOM20 labels in the DCAM classes DCAM Classes

SOM20 Classes Clear

Clear Nearly clear Thin edges Partly cloudy Low Multi layers Thin cirrus Cirrus High thick

99 92 47 22 4

Nearly Thin clear edges

26

11 1

Partly cloudy

7 11 55 4 2 10

Low

20 90 23

Multi layers

Thin cirrus

Cirrus

High thick

1 1 15 2 1 15

2 62 7

57 16 91 37

63

C. Ambroise et al. / Neurocomputing 30 (2000) 47}52

51

Table 2 Breakdown of the DCAM labels in the SOM20 classes SOM20 Classes

Clear Nearly clear Thin edges Partly cloudy Low Multi layers Thin cirrus Cirrus High thick

DCAM Classes Clear

Nearly Thin clear edges

68

22

6

4

1

7 1

3 13 1

66 10

1 3 74 17

2

21

2 67 22 2 4

5

Partly cloudy

Low

1

Multi layers

3 71 1 8

Thin cirrus

Cirrus

19 6

3

3 59 9

6 8 71 2

High thick

10 98

A quick analysis of the DCAM and SOM11 comparison shows disappointing results. Di!erent cloud types which are well separated in SOM20 are mixed in SOM11.

4. Conclusion We have applied a new clustering algorithm for segmenting METEOSAT images. The technique does not improve the quality of the segmentation compared to a classical technique, but shows comparable results. However the main advantage of the presented method consists of the possibility of easily having di!erent levels of classi"cation. This feature can be very useful in an exploratory data analysis context. Moreover, it allows the use of classical techniques based on hierarchical clustering to automatically determine the `naturala number of classes.

Acknowledgements This research was supported by the NEUROSAT European research project (PL952049).

References [1] C. Ambroise, G. Govaert, Constrained clustering and kohonen self-organizing maps, J. Classi"cation 13 (2) (1996) 299}313.

52

C. Ambroise et al. / Neurocomputing 30 (2000) 47}52

[2] M. Desbois, G. Seze, G. Szejwach, Automatic classi"cation of clouds on meteosat imagery: Application to high-level clouds, J. Climatol. Appl. Meteorol. 21 (1882) 401}412. [3] E. Diday, La meH thode des nueH es dynamiques, Revue de Statistiques appliqueH es 19 (2) (1971) 19}34. [4] S.P. Luttrell, Derivation of a class of training algorithms, IEEE Trans. Neural Networks 1 (2) (1990). [5] F. Murtagh, Interpreting the kohonen self-organization feature map using contiguity-constrained clustering, Pattern Recognition Lett. 16 (1995) 399}408. [6] J.L. Ra!aelli, G. Se`ze, Cloud type separation using local correlation between visible and infrared satellite images, in: Passive Infrared Remote sensing of clouds and the atmosphere III. SPIE, Proceedings Series, vol. 2578, 1995. [7] B.D. Ripley, Pattern Recognition and Neural Networks, Cambridge University Press, Cambridge, 1996. [8] W.B. Rossow, F. Mosher, E. Kinsella, A. Arking, M. Desbois, E. Harrison, P. Minnis, E. Ruprecht, G. Se`ze, C. Simmer, E. Smith, Isccp cloud algorithm intercomparaison, J. Climatol. Appl. Meteorol. 24 (1985) 877}903. [9] G. Se`ze, M. Desbois, Cloud cover analysis from satellite imagery using spatial and temporal characteristics of the data, J. Climatol. Appl. Meteorol. 26 (1987) 287}303. [10] S. Thiria, Y. Lechevallier, O. Gascuel, S. Canu, Statistique et meH thodes neuronales, Dunod, Paris, 1997.