International Journal of Applied Earth Observation and Geoinformation 18 (2012) 69–81
Contents lists available at SciVerse ScienceDirect
International Journal of Applied Earth Observation and Geoinformation journal homepage: www.elsevier.com/locate/jag
Flood mapping of Danube River at Romania using single and multi-date ERS2-SAR images T.Y. Gan a,∗ , F. Zunic b , C.-C. Kuo a , T. Strobl b a b
Department of Civil & Environmental Engineering, University of Alberta, Canada Technische Universität München, Lehrstuhl für Wasserbau und Wasserwirtschaft, Arcisstraße 21, 80333 München, Germany
a r t i c l e
i n f o
Article history: Received 25 May 2011 Accepted 5 January 2012 Keywords: Flooding of Danube River ERS2-SAR images Permanent water Flooded area Dry land Single-image un-supervised, contextual, PCA and Isodata classifications
a b s t r a c t Several flood mapping classification techniques, applied to single-date and multi-date SAR images of ERS2, based on the Danube River flooding of 2006 in Romania are compared, as part of an effort to explore the feasibility of mapping flooded areas by SAR images acquired through radar sensors. Among 7 SAR images analyzed for the same study site located around Bistret of Romania, several represent “dry” and several “wet” conditions, where the latter represent the major Danube flooding event of 2006. The images were classified into (1) permanent water (Danube River and lakes), (2) flooded area, and (3) dry land, using single image, pixel-based classification, frequency-based contextual classification, and principal component analysis (PCA) combined with Isodata classification. The flooded areas delineated from the above procedures for the study site at Bistret are visually compared with that of Landsat-TM images and MODIS mosaic and digitally compared with referenced flooded area produced by the DEM data of SRTM. Apparently there is no one technique that is clearly better partly because of the nature of SAR data (radar echoes) and partly because of data noise even though the images were first subjected to speckle filtering and geometric corrections, and partly because SAR images could appear dark not only because of flooding but also because of smooth surfaces, target sizes, etc. However, if multi-date SAR images of both DRY and WET (flooding) conditions are available, it seems that PCA combined with the Isodata classifier would give better defined flooded areas of the Danube River than the simple single image, pixel-based classification or the contextual classification. © 2012 Elsevier B.V. All rights reserved.
1. Introduction Because of the general high density of urban population, and some European cities are partly built on flood plains, about 10% of Europeans live or work on flood plains. Central European rivers (including the Rhine, Danube, Odra, and Wisla) have suffered from many major floods in recent years. Between 1971 and 1995, 154 major floods occurred in Europe. The cost of flood damage in Europe between 1971 and 1995 has been estimated at about D100 billion. Flooding has been a problem partly because of the occurrences of extreme precipitation, ad hoc floodplain developments in urban areas, reduced river bed capacity and the impact of deforestation. Major floods often occur from the superposition of multiple peak floods triggered by successive episodes of major storms over a prolonged period on watersheds with high antecedent moisture conditions. In the spring of 2006, heavy floods inundated central and Eastern Europe because of a combination of intensive snowmelt and heavy rainfall (Younis et al., 2008). High water levels
∗ Corresponding author. Tel.: +1 780 492 9376; fax: +1 780 492 8289. E-mail address:
[email protected] (T.Y. Gan). 0303-2434/$ – see front matter © 2012 Elsevier B.V. All rights reserved. doi:10.1016/j.jag.2012.01.012
lasted for over 6 weeks (April–June 2006) in Danube, over 30,000 people were displaced in the entire Danube basin, and the overall damages estimated were over 500 million Euro (D). Since the beginning of the satellite era in the 1960s, mankind has been trying to map large scale floods using remotely sensed data from optical sensors of Landsat-TM5, Landsat-ETM7, NOAA’s AVHRR sensors, MODIS, ASTER, and synthetic aperture radar (SAR) sensors of European Remote Sensing Satellites (ERS-1 and ERS-2), ENVISAT, Canadian Radarsat-1 and Radarsat-2, and the Japanese JERS-1 and JERS-2 (e.g., Hess et al., 1995; Wang et al., 1995; Bates, 2004; Bates and De Roo, 2000; Jodouin et al., 2003; Horritt, 1999, 2006), and sometimes at global scale (e.g., Prigent et al., 2007). Optical sensors are weather dependent, and cloudy conditions could prevent us from identifying flooded areas from such satellite data. On the other hand, radar sensors are weather independent but in the past SAR sensors only produce data in one spectral channel, e.g., Radarsat-1, ERS-1 and ERS-2 even though some sensors have cross-polarization capabilities. Lately, the PALSAR (Phased Array type L-band Synthetic Aperture Radar) sensor carried by the ALOS (Advanced Land Observing Satellite) and the latest Canadian Radarsat-2 satellites launched in 2007 have quadruple polarizations (HH, VV, HV, VH).
70
T.Y. Gan et al. / International Journal of Applied Earth Observation and Geoinformation 18 (2012) 69–81
Table 1 Means and standard deviations of pixel values of 7 selected ERS2-SAR images for the RCB reach of Danube River. Acquired date
February 26, 2006 April 2, 2006 May 7, 2006 June 11, 2006 July 16, 2006 August 20, 2006 September 24, 2006
Orbit #
56746 57247 57748 58249 58750 59251 59752
Pixel values Mean
Standard deviation
306.08 348.03 306.36 380.65 351.09 312.63 332.70
156.68 170.71 157.12 206.39 138.71 147.79 153.55
Unlike visual interpretation of flooded areas from images acquired from optical sensors, interpretation of SAR images involves more ambiguities. For example, speckles and pixel values of SAR images are dependent not only on the dielectric constant, (due to water), but also on surface roughness, geometric distortions, layover effects, and other factors. Even though a wet surface could appear bright because of high surface roughness, it generally appears dark because when soil gets wet, its increases substantially to about 30 (Hendrickx et al., 2003). The brightness of a SAR image also depends on the incidence angle, surface roughness and polarization. Typically for flood mapping (smooth surface), copolarization (HH or VV) has a 0◦ phase difference and so is preferred over cross-polarization (VH or HV) which has a phase difference uniformly distributed between −180◦ and +180◦ , and so the latter does not contain much target-specific information. The amount of returned radiation (as backscatter echoes) that dictates the brightness of the image depends on factors such as the roughness, size of the target relative to the signal’s wavelength, volumetric and diffused scattering. Objects with dimensions similar to the wavelength appear bright (as though rough), while smaller objects may appear dark (smooth). Flat surfaces such as paved roads or calm water normally appear dark since most of the incident radar pulses are reflected away from the radar sensor. Therefore, interpreting radar echoes are more complicated compared to interpreting satellite data from optical sensors. 2. Research objective and methods The key objective of this study is to compare the effectiveness of several classification techniques in mapping the Danube River flood of 2006 in Romania using single-date and multi-date SAR images acquired by the radar sensor of ERS2 (European Resource Satellite-2 launched in April 1995). Three un-supervised classification techniques were used to map the dynamics or changes to the flooded areas of Danube during April to May of 2006, around the Rast-Catane-Bistret (RCB) area of Romania (Fig. 1) from seven ERS2-SAR images (Table 1) of a C-band radar (5.3 GHz and 5.66 cm in wavelength). These images are of VV polarization and 30 m in resolution (achieved via the principle of synthetic aperture radar, SAR). The classification techniques are: 1 Classify individual ERS2-SAR images on the basis of individual spectral radiances or pixel values. 2 Classify multi-date ERS2-SAR Images using a the contextual classification technique of Gong and Howarth (1992); and b principal component analysis (PCA) combined with the Isodata classifier. Using these classification techniques, the images are delineated to permanent water (part of the Danube River and nearby lakes), flooded area and dry land which are compared with the corresponding three landuses derived from the 90 m resolution DEM
(Digital Elevation Model) data of the Shuttle Radar Topography Mission (SRTM) of the National Geospatial-Intelligence Agency (NGA) and the National Aeronautics and Space Administration (NASA) (Farr et al., 2007), since the water level of floods and rivers depends on the terrain features specified by DEM data. According to Rodriguez et al. (2005), the mean, standard deviation, and 90% absolute height error of the SRTM-based DEM for Eurasia are −0.7 m, 3.7 m, and 6.6 m, respectively. This means that 90% of the absolute height errors of this DEM data for Eurasia are below 6.6 m. Further, according to the report, 50% of the absolute height errors are approximately below 1.5 m. Despite of uncertainties expected with the SRTM-DEM data (or any DEM data), this product should be adequate for delineating the study sites into permanent water, flooded areas, and dry land partly because elevation ranges for permanent water (15–26 m), flooded area (26–33 m), and dry land (greater than 33 m) are significantly (several meters) larger than the 50% (90%) absolute height error of SRTM-DEM. By using the above ranges of elevation, the Danube River channel delineated in Fig. 4 appear distinct and morphologically reasonable, especially the downstream reaches. Further, the Danube River channel and the flooded areas of Fig. 4 agree with the 2006 Danube flood map prepared by the WWF Danube Carpathian Programme (Bratrich and Moroz, 2006). Even though the boundaries between the three landuses delineated from the SRTM-DEM data are subjected to uncertainties, on the basis of the aforementioned explanations, the flood map produced out of the SRTM-DEM data should be acceptable even though SRTM-DEM it is more of a DSM than a DEM data.
3. Danube River basin The Danube River, the second longest river in Europe (2778 km), drains a watershed of 817,000 km2 in size, and flow from west to east with a mean annual discharge () of 5630 m3 /s into the Black Sea (Belz et al., 2004). The Danube River basin is of great economic importance to nine countries that border it and the Danube River serves as vital commercial transportation from Germany at the west to the Black Sea at the east. Given that the River has been substantially altered by a massive system of hydraulic structures, it will be a challenge to model the 2006 flood of Danube using a hydrologic/hydraulic model. This study is limited to the lower part of Danube formed by the Romanian-Bulgarian lowlands since this part of Danube was badly flooded in April to May of 2006 (Fig. 1) because extreme storms occurred after the spring snowmelt. Further, since 1970s, the lower Danube River has lost 70–90% of its morphological flood plains due to construction of dikes and high water regulation works, causing it to be almost completely disconnected from its floodplains (Schwarz et al., 2006). In April 2006, the peak flood of the Danube River recorded near the Serbian-Romanian border was about 15,800 m3 /s (≈3 and about 8 standard deviations () beyond , given that of the annual flow is about 1027 m3 /s), which was estimated as a 100year flood (Schwarz et al., 2006). By late April, the Danube River flooded 5000 homes in 12 counties of Romania, and over 15,000 people were evacuated. The 2006 flood damaged 500 km of roads, 255 bridges, over 80,000 hectares of farmland, and broke 13 main dikes. The damage and the number of people evacuated surpassed the 1970 flood which, until 2006, was considered the worst flood ever recorded in Romania. The Danube is of huge significance to Romania given it supply about 44% of its water resources and the country (237,391 km2 in area) is 97.4% within the Danube Basin. Our focus will be on mapping the low plains of Romania centered at Rast-Catane-Bistret (RCB) (Fig. 1).
T.Y. Gan et al. / International Journal of Applied Earth Observation and Geoinformation 18 (2012) 69–81
71
Fig. 1. The study site, the Rast-Catane-Bistret (RCB) reach of Danube River, Romania, histograms of annual runoff (m3 /s) fitted with a Gaussian normal probability distribution, N [mean () = 5630, standard deviation () = 1027], and multi-date, Landsat-TM5 images showing temporal changes of the 2006 flooding of the RCB reach of Danube River. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
4. Unsupervised classifications
4.1. Classify ERS2-SAR data using single images
The three un-supervised classification techniques, ranging from the simplest to the most complex techniques, are described below.
In this simple approach, the distributions of spectral radiances of seven images are individually examined (Table 2), which all show double-peak distributions, but the relative
72
T.Y. Gan et al. / International Journal of Applied Earth Observation and Geoinformation 18 (2012) 69–81
Histogram of Pixels of Window B of Bistret, 2006
Histogram of Pixels of Window A at Bistrel 4000
3000 59752
2700
(a)
59251
3200
2100
2800 57748
1800 1500
Number
Number
(b)
3600
2400
59251
1200
58750
58249
900
59752
2400 2000
57748
1600 57247
1200
600
800
57247
300 0
100
200
58750
58249
400
56746
0
56746
300
400
500
600
0
700
0
Pixel
100
200
300
400
500
600
700
Pixel PC1
Histograms of ERS2-SAR image at Bistret 2006 40000
40000
36000
(c)
59251
59752
32000
(d)
35000
57247
30000
24000 57748
20000
58249
16000
25000
Number
Number
28000
20000 15000
12000
10000
8000
5000
58750
4000
0
0 0
100
200
300
400
500
600
-800 -700 -60 0 -500 -40 0 -300 -20 0 -100
PC2
200
300
40 0
500
60 0
700
800
PC3
35000
(e)
30000
(e)
35000
20000
(f)
30000
Number
25000
(f)
25000 20000
15000
15000
10000
10000
5000
5000 0
0 -800 -700 -600 -500 -400 -300 -200 -100
10 0
40000
40000
Number
0
Pixel Anomaly
Pixel
0
100 200 300 400 500 600 700 800
Pixel Anomaly
-800 -700 -600 -500 -400 -300 -200 -100
0
100
200
300
400
500
600
700
800
Pixel Anomaly
Fig. 2. Histograms of (a) Window A, (b) Window B, (c) full image, (d) PC1, (e) PC2, and (f) PC3 of the multi-temporal ERS2-SAR images of the RCB reach of Danube River at Romania. Proportionally the number of darker pixel values in the two windows (a) and (b) are higher than that of the whole image (c) which is to be expected because the two windows are selected on the flooded zones located beside the Danube River (see Fig. 1). The first peaks on the left-hand-side of plots (a)–(c) likely represent pixel values of permanent water body (mostly the river) and some flooded area. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
magnitudes of both peaks vary from image to image (Fig. 2). Because of relatively high dielectric constant (ε) of water, areas under the first peak from the left should approximately represent pixel values of permanent water bodies which should be among the lowest values, while areas between the first and the second peaks represent pixel values of the flooded areas since these pixel values are generally in-between that of the permanent water bodies and that of dry ground, which should be the highest and located on
the right side of the second peak (Fig. 3). Unfortunately, because of complicating factors that affect the pixel values (Section 1) and data noise, there are overlaps between these three clusters of pixel values. On the basis of the 90 m resolution DEM data of SRTM (Farr et al., 2007) for the RCB reach of the Danube River, several flood stages were assumed and from the flooded area identified, it seems that setting a flood stage of 33 m (Fig. 4) matches well with that of Bratrich and Moroz (2006). From Fig. 3
T.Y. Gan et al. / International Journal of Applied Earth Observation and Geoinformation 18 (2012) 69–81
73
Fig. 3. Histograms of DRY (Orbit #57247) and WET (Orbit #57748) images of the RCB reach of Danube River at Romania for (a) the full image (window size 6400 × 1750), and (b) a sub-window (size 2048 × 1024). Reduced grey ranges for Permanent Water, Flooded Area, and Dry Land, are selected on the basis of the locations of either Peak-1 or Peak-2 for the full window [6400 × 1750], or approximately the intersections of DRY and WET histograms. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
Table 2 The percentage (%) of permanent water (part of Danube River and lakes), flooded area and dry land derived from DRY or April 2, 2006 (Orbit #57247) and WET or May 7, 2006 (Orbit #57748) images of ERS2-SAR satellite of reduced grey levels (0–50) for two windows of Bistret, Romania of window sizes [6400 × 1750] and [2048 × 1024], respectively (see Fig. 3).
Column Percentage (%) [reduced pixel grey] Permanent water (Danube River and lakes) Flooded area Dry land
Permanent water (Danube River and lakes) Flooded area Dry land a b
Reduced grey range
Full-window (6400 pixels × 1750 lines)
(1)
(3)
(4)
(5)
(6)
Sag Point 7.85 [7a ] 10.27 [7]
Peak-1
Peak-2
Intersection of DRY and WET
33.71 [18]
52.17b [22] 62.65b [22] 39.99 27.08
(2)
0-7 0-7 8-18 8-19 19-50 20-50
DRY WET DRY WET DRY WET
38.62 [19] 58.44 51.11
Reduced grey range
Sub-Window (2048 pixels × 1024 lines)
(1)
(2)
(3)
0-8 0-8 9-17 9-17 18-50 18-50
DRY WET DRY WET DRY WET
11.42 [8] 25.95 [8]
(4)
25.33 [20] 42.80 48.72
(5)
(6)
45.78[20]
25.53 [17] 34.07 [17] 63.05 39.97
The upper limit of reduced grey (pixel) values used to estimate the respective % of land area. Estimated flooded areas based on the intersection point of DRY and WET histograms for the large window are too high.
Fig. 4. The RCB reach of the Danube River at Romania [6400 × 1750] of elevation ranges: (1) 15–26 m representing Permanent Water of Danube River and lakes (dark brown), (2) 26–33 m representing flooded area (brown), and (3) greater than 33 m representing dry land (light brown to dark green), derived from the high resolution DEM (Digital Elevation Model) data of the Shuttle Radar Topography Mission (SRTM) (Farr et al., 2007). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
74
T.Y. Gan et al. / International Journal of Applied Earth Observation and Geoinformation 18 (2012) 69–81
Fig. 5. The classification of single SAR images on the basis of pixel spectral radiance, for the April 2, 2006 (DRY, #57247) ERS2-SAR Image of the RCB reach of Danube River at Romania [A window size of 6400 × 1750] for: (a) reduced grey values (0-50), and for (b) Permanent Water/Danube River (Dark color), Flooded area (Grey color), and Dry land (light Grey color); and similarly in (c) and (d) for the May 7, 2006 (WET, #57748) ERS2-SAR Image of Bistret. Each image represents an area of about 192 km × 52.5 km in size. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
which shows the possible range of pixel values for permanent water body, flooded areas and dry land of individual images, we converted individual images to Fig. 5 which shows permanent water body (Danube River), flooded areas and dry land of RCB under Dry (Fig. 5a and b) and Wet (Fig. 5c and d) conditions, respectively. More discussions of the results obtained from this simple technique based on pixel values alone are given in Section 5. 4.2. Frequency-based contextual classification A frequency-based contextual classification method is a form of class discrimination based on the spatial relationships between each individual pixel, local and global configurations of
neighboring pixels. So, instead of using conventional classifiers that employ only spectral information on a single-pixel basis which tends to overlook a large amount of spatial information, the key idea of this classifier is to classify landuse patterns based on multispatial information obtained from multi-date images acquired over the same site. The resulting maps that show flooding in RCB (Fig. 6) are only qualitatively compared with images acquired by the Landsat-ETM sensor (Fig. 1) since acquisition dates and map scales between them are different and so a more objective comparison will be difficult and may not lead to any dependable conclusion. By assuming each image of ERS2-SAR as a single spectral channel, and using multi-date images, we applied the REDUCE and CONTEXT modules of the PCI Remote Sensing Corp based on the multi-spectral framework of Gong and Howarth (1992).
T.Y. Gan et al. / International Journal of Applied Earth Observation and Geoinformation 18 (2012) 69–81
75
Fig. 6. One (a) DRY image resulted from combining 4 Reduced Grey ERS2-SAR images (#56746, #57247, #59251, #59752) of the RCB reach of Danube River at Romania [6400 × 1750] by the REDUCE of PCI, and its (b) contextually classified version by the CONTEXT of PCI where Permanent Water/Danube River is of White color, Flooded area is of light Grey color, and Dry land is of Dark color; similarly, (c) one WET image resulted from combining 3 reduced grey images (#57748, #58249, #58750) and its contextually classified version (d). In contrast to Fig. 5, white color represents permanent water or flooded areas because pixel values were transformed through the clustering procedure of REDUCE, PCI. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
In this contextual classifier of Gong and Howarth (1992), which is more efficient than the conventional maximum likelihood classifier for landuse classifications and most other classifiers computationally, two basic procedures are involved. First, the grey level vector of ERS2-SAR is reduced by partitioning the eigen space (eigenvalues and eigen vectors) derived from the covariance greylevel matrix into an expected number of pieces (new grey level vector). Suppose a grey level vector R in a k multi-date space, where Ri is the grey level with its mean, Ri , removed.
V −1 [Cov(R)]V = []
R = {R1 , R2 , . . . , Ri , . . . , Rk }T
The grey levels were transformed to a new set of grey-level values that range from 0 to 50, by an increment of 1, to significantly reduce the number of grey levels which for an ERS2-SAR image could exceed 30,000. The grey level interval is estimated
(1)
The grey values of every pixel in the “multi-spectral” image, R, are transformed to a grey level vector in the eigen space Ek , such that Y = {V1 , V2 , . . . , Vk }T R
(2)
where V1 , V2 , . . ., and Vk are the eigenvectors, and {1 , 2 , . . ., k } are the eigenvalues so that (3)
76
T.Y. Gan et al. / International Journal of Applied Earth Observation and Geoinformation 18 (2012) 69–81
according to a Gaussian normal distribution such that 97% of the grey level vectors fall within the multi-spectral space. To allow for easy access of the reduced grey data while operating the second step of the contextual algorithm, the multi-image data is stored in one image only. Next the reduced grey level vector is subjected to a frequencybased, un-supervised classification based on some selected bitmap segments (training site) and a small window of pixels (e.g., 7 × 7) and the results are shown in Fig. 6. It is essentially a classification procedure based on a multi-resolution extraction and a neural network of a multilayer perceptron model. The effectiveness of the contextual classification of SAR data is expected to be limited by data noise, pattern complexity, and problem dependency and the errors caused by this classification technique usually are found between class boundaries. Based on reduced pixel values ranging between 0 and 50, this approach is an efficient clustering procedure, but its accuracy is expected to be lower than other major clustering algorithms, such as the K-means and ISODATA. However, given that SAR images tend to have a higher amount of data noise than spectral data in the visible and IR range, and require speckle filtering, this is a quick exploratory approach for classifying SAR images which can usually be achieved with fast convergence. Comparing Figs. 5 and 6, especially Fig. 5d versus Fig. 6d, it is apparent that the multi-date contextual classification approach (Fig. 6) produced more superior flood maps than the subjective, single-image classification approach of Fig. 5 (Section 4.1), given that in Fig. 5 there is a fairly significant amount of overlap between permanent water and flooded area, as shown by the dark patches above the Danube river (see Fig. 5d), and flooded areas of Fig. 5d (black) are not as clearly defined as that of Fig. 6d (white). 4.3. Principal component analysis (PCA) and Isodata classification A flowchart of PCA combined with the Isodata classifier is shown in Fig. 7. Since PCA is well known, only a brief discussion of PCA is herein given. Through a linear transformation, PCA rotates the axis of multi-spectral images along lines of maximum variance, essentially to reduce the dimension of vectors that are highly correlated to each other. From applying PCA to the seven selected SAR images of ERS2 acquired over RCB, we only retained the first three PCs, PC1, PC2 and PC3 for further analysis. Since the first three PCs explained almost 70% of the total variance (40% by PC1, 15.47% by PC2 and 12.79% by PC3), it indicates that the 7 images are quite strongly correlated to each other (Table 3 shows cross-correlation between the 7 ERS2-SAR images). Since PC1 explained the most amount of the total variance, we expect PC1 to predominantly represent spectral changes between different images mainly caused by the flooding of flood plains of RCB, since major changes between different SAR images should be mainly caused by flooding. Conversely, by PC1, we should be able to approximately identify flooded land areas of RCB along the Danube River of Romania (Fig. 8). In Fig. 8a for PC1, the predominantly dark patches above the Danube River likely represent the extent of flooded area. Even though more or less the same patches above the River appears in Fig. 8b (for PC2) and 8c (for PC3), Table 3 Cross correlations () between the 7 images (specified in terms of Orbit #) of ERS2SAR of the RCB reach of Danube River listed in Table 1.
56746
57247
57784
58249
58750
59251
59752
56746 57247 57784 58249 58750 59251 59752
1 0.60 0.74 0.62 0.42 0.52 0.49
1 0.52 0.55 0.40 0.51 0.51
1 0.60 0.39 0.47 0.44
1 0.36 0.49 0.49
1 0.51 0.45
1 0.68
1
grey levels change from dark to light color, which suggest that PC2 and PC3 may only partially represent the effect of flooding. The Isodata classifier (Tou and Gonzalez, 1974; Nolin and Payne, 2007; Seiler et al., 2009), a popular and simple unsupervised classification method, consists of three steps: (1) Various centers of clusters are randomly assigned, (2) On the basis of Euclidean distances computed between each pixel and the center of all clusters, each pixel is assigned to the cluster where the Euclidean distance between the pixel and the center of that cluster is the shortest. (3) After assigning all the pixels to various clusters, the center of each cluster is re-estimated. Furthermore, procedures (2) and (3) are repeated until the number of pixels assigned to each cluster converges (minimal or no change), at which the iterative process stops.
5. Discussions of results By only inspecting changes to flood maps of the RCB reach of Romania entirely based on individual pixel values (Fig. 3), it is apparent that this simple technique could result in some flooded area classified as permanent water and vice versa. A perusal of Fig. 3 shows that identifying permanent water bodies, flooded areas and dry grounds from these images involve ambiguities caused by overlapping pixel values between the three landuses, and possibly data noise, even though these images of Romania were first subjected to speckle filtering using a gamma function to smooth out high frequency noise while retaining shape features in the images (e.g., Shi and Fung, 1994). In addition, when subjecting these images to more recently developed filters such as the coherence reduction speckle noise algorithm (CRSN) of Huang et al. (2009) or the wavelet transform algorithm of Solbø and Eltoft (2008), we still cannot eliminate the noise. It may be beneficial to explore other filters designed to remove data noise of SAR data effectively. The problem of classification could be further compounded by the possible effects of wind, intensive rainfall during floods, and emerging vegetation on SAR images. By applying the reduced grey range (Column 1) of the Subwindow to the Full-Window of the DRY image (Orbit #57247), the respective percentage of Permanent Water, Flooded area and Dry land of the image are 9.19, 27.68 and 63.13% (Fig. 3) which agree closely with 11.42, 25.53 and 63.05% listed under the sub-window in Table 2. Similarly, applying the reduced grey range of the Subwindow to the Full-Window of the WET (Orbit #57748) image, the respective landuse are 12.04, 36.84 and 51.11%, which agree closely with 10.27, 38.62 and 51.11% listed under the full-window in Table 2. From Fig. 3 it is obvious that the Wet image acquired on May 7, 2006 (Orbit #57748) shows an overall lower pixel values than the Dry counterpart acquired on April 2, 2006 (Orbit #57247) which is expected, e.g., the histogram of the WET is left to that of the Dry. Further, the first peak of WET for the subset exceeds that of the second peak partly because the subset is centered at the flooded area of Bistret, Romania, even though some pixels under the first peak may represent flooded area instead of permanent water body only. We further assessed the accuracy of the three landuse classes obtained by the pixel-based, unsupervised classification of single reduced grey image by comparing them with the corresponding landuse classes classified from the DEM data of SRTM (the contingency of Table 4). Table 4 shows the levels of agreement between pixels for the 3 landuse classes identified by this simple approach and that of the reference data. For the User’s accuracy, out of X pixels classified by this classification method under a particular landuse, Y correspond to the number of classified pixels agreed with the reference data and so the accuracy is Y/X %, e.g., for dry land,
T.Y. Gan et al. / International Journal of Applied Earth Observation and Geoinformation 18 (2012) 69–81
77
Obt ain PC1 from th e Principal Compon ent Analysis
1.Specify cluster centers zj
2.Distribu te th e N samples to cluster centers by th e sho rtest distance
3.Discard sample subsets with fewer than Θ N members
4.Upd a te ea ch cluster center zj , j=1-Nc by average. Nc is the nu mber of cluster center.
5.Compu te the a verage distance Dj of sa mples in cluster do main Sj
6.Comput e th e ov erall a verage distance of the samples
7.If (a)Nc ≤K/2; (b)this is an even-numbered itera tion or Nc ≥ 2K. otherwise, cont inue. (c)If this is the last iteration , (a )
8.Find the stand ard deviation vector σj
9.Find the maximum of each σj (σj max )
12.If Dij > Θc, lump zi and zj
Conv erge or last iteration ? No, go to step 2
10. If σj max > Θ s , and (a) Dj >Davg and Nj >2( ΘN +1), No (b) Nc≤ K/2. Split zj into two new clusters, then go to step 2
(b) (c)
11. Comput e the pairwise Dij between all cluster centers
Yes 13.End of the process
Note: Nc = Number of cluster centers Fig. 7. Flowchart of the principal component analysis (PCA) combined with the Isodata method.
X is 5088594 (=3557525 + 1203906 + 327163) while Y is 3557525 and so the accuracy is about 69.9%. On the other hand, based on the DEM data, the total number of pixels identified as dry land X is 7152947 (=3557525 + 3362633 + 232789), and so from the “producer’s” perspective, the accuracy of this category is 49.74%. The overall accuracy and the Hanssen–Kuiper (HK) score (Hanssen and Kuipers, 1965) were further computed. In Table 4, the overall accuracy of 45.36% is the percentage of correct classification, or the total number of correctly classified pixels divided by the total number of pixels. As a stringent assessment to the classification method, the HK score (see Appendix A) turns out to be only 0.08, which is very low. This means that even though the total number of pixels classified to a particular landuse agrees reasonably well with that of the reference data, e.g., 69.9% or 49.74% of correct classification of the dry land depending on the type of accuracy computed (user’s or producer’s accuracy), in term of HK score, the skill of this classification method is limited. HK is a skill measure commonly used to assess weather forecast based on a 2 × 2 contingency table (hit rate versus false alarm), while in this study a 3 × 3 contingency table is necessary given that we have three types of landuse. HK is a fundamental measure of skill relative to a baseline of random forecasts, such that a perfect skill yields a HK Score of 1, and a forecast can only be considered skillful if it receives a value greater than zero.
We only qualitatively compared Landsat-TM5 images with ERS2-SAR images of Bistret, Romania since these images were acquired at different dates and by different types of sensors. It is obvious that we can more easily identify flooded area from multispectral Landsat-TM5 images visually (Fig. 1) than single-band, speckle filtered, ERS2-SAR images (Fig. 3). Landsat-TM5 has 3 visible, three NIR and one mid-IR bands. In the three visible bands (0.45–0.69 m) of TM5, the resultant reflectance depends upon the reflectance of the water surface, suspended and bottom materials. However, band 4 (0.76–0.90 m) is generally the best band for detecting flooded areas. According to Landsat-TM5 images (Fig. 1), the Danube River near Bistret was still flooded until around July of 2006. One popular approach to detect flood from multi-images is the simple change detection technique, by subtracting pixel values of an image acquired during flooding from that acquired during dry condition. However, by subtracting the pixel values of two SAR images, we found that this simple technique did not work well partly because of speckles or data noise of SAR images. On the other hand, the effectiveness of clustering multi-date SAR images to one image and then performing the contextual classification can be limited by data noise and pattern complexity. Even though the overall feature of the classified flood map reflects the flooded areas of the RCB (Rast-Catane-Bistret) reach, we found that
Table 4 A contingency table of 3-landuses classified by a single image, pixel-based unsupervised classification. HK = 0.08
No. of referenced pixel
Overall accuracy = 45.36%
Dry land
Flooded area
Permanent water
3557525 3362633 232789 49.74
1203906 1066341 670261 36.26
327163 293784 438395 41.38
Dry land Flooded area Permanent water Producer’s accuracy (%)
No. of classified pixel
User’s accuracy (%)
69.91 22.58 32.68
78
T.Y. Gan et al. / International Journal of Applied Earth Observation and Geoinformation 18 (2012) 69–81
Fig. 8. Plots of: (a) PC1 that explains 39.81% of the total variance, (b) PC2 that explains 15.47% of the total variance, and (c) PC3 that explains 12.79% of the total variance, of the PCA of 7 ERS2-SAR images of original pixel values for the RCB reach of Danube River at Romania (#56746, #57247, #57748, #58249, #58750, #59251, #59752), where dark color represents permanent water (Danube River) and part of flooded area which can also be represented by grey color. Light grey and white likely represent the dry land; and (d) the classification of PC1 of 7 ERS2-SAR images by the Isodata classifier (green: permanent water; blue: flooded area; yellow: dry land). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
because of combining several images together, the Danube river becomes not as clearly defined (compare Figs. 5 and 6), and boundaries between flooded areas and permanent water bodies could overlap each other, dry land could be classified as flooded area and vice versa (Fig. 6b), even though all the images used were already geometrically corrected with respect to the image #56746. This is a drawback of clustering multi-date SAR images even after speckle filtering because not all the data noise can be removed by speckle filtering. We also used the three landuses classified from the DEM data of SRTM (Table 5) to assess the accuracy of the multi-date SAR images merged to a single image and classified to the three landuses by the contextual classification technique. Compared to the results of Table 4 based on the pixel-based unsupervised
classification of single reduced grey image, there is an increase in the overall accuracy (45.36% to 51.93%) and the HK score increases from 0.08 to 0.2. Overall, the contextual classification is more effective than the pixel-based unsupervised classification of the single reduced grey image in classifying the dry land and probably the permanent water, but marginally poorer in classifying flooded area partly because between flooded area and permanent water, pixels could be wrongly classified as permanent water and vice versa. Given that the contexture of flooded area may change from image to image, using multi-date ERS2-SAR images could have some drawback. From PC1 of the 7 multi-date ERS2-SAR images of Bistret, the dark color likely represents the permanent water and part of the flooded area, which is also partly represented by grey color (Fig. 8a).
T.Y. Gan et al. / International Journal of Applied Earth Observation and Geoinformation 18 (2012) 69–81
79
Table 5 A contingency table of 3-landuses classified by the contextual classification. HK = 0.2
No. of referenced pixel
Overall accuracy = 51.93%
Dry land
Flooded area
Permanent water
4650453 1262425 1221719 65.18
1163676 391615 1385615 13.32
187015 122289 750057 70.80
Dry land Flooded area Permanent water
Producer’s accuracy (%)
Since permanent waters are persistent features in these multiple images, changes of pixel values of locations subjected to flooding in one image and dry conditions in another image gives rise to data variability. Therefore, PC1 that explains 40% of the total data variability, should approximately represents the permanent waters and flooded areas. However, there is an overlap between permanent water bodies and flooded area, as evident in its histogram showing a modest bimodal feature (Fig. 2d), where the first peak on the left represents the permanent water and the more pronounced second peak, represents both flooded area and permanent water. In PC1, the light grey and white areas predominantly represent dry land. Even though Fig. 6 derived from contextual classification shows light color as flooded areas and permanent water bodies, their patterns of flooded area are similar to that of PC1 with minor differences. Other than the Danube River as permanent water, the patterns of PC2 and PC3 are less clear cut and possibly more affected by data noise and radar speckles than PC1. Even though PC1 alone could show patterns that can be visually identified as permanent water, flooded area, and dry land, this “visual” application of PC1 is prone to ambiguities partly because visual interpretation of landuse patterns can be subjective. In this study, another unsupervised classification technique called Isodata was chosen to classify the PC1 image. Isodata could classify an image into various assigned categories. Even though our objective is to classify the SAR images into 3 dominant landuses (categories) only, in Isodata, it is better to assign more than three categories (preferred categories) in its search of patterns given that the reflectance of dry land are expected to vary from location to location because of differences in surface properties, vegetation cover, roughness, etc. Assigning insufficient number of categories for the dry land may result in an overlapping of the dry land with flooded areas as the classified flooded areas. By trial and error, the number of categories that will optimally classify the PC1 image by Isodata turns out to be when the number of pixels classified as permanent water asymptotically approaches a lower limit, which agrees with the number of pixels from the DEM data of SRTM identified as permanent water. In our experiment with Isodata, the PC1 image was classified to between 3 and 15 categories. Categories 1 and 2 were defined as permanent water and flooded area given that they should have the lowest and the second lowest DN (Digital Number). After classification, those higher categories were artificially merged to form one category that represents dry land. As we increase the number of landuse categories from three to eight, we found that the number of pixels classified as permanent water decreased (Fig. 9). For
77.49 22.05 22.34
4.5 Classifed
4
No. of pixel Millions
No. of classified pixel
User’s accuracy (%)
DEM Reference
3.5 3 2.5 2 1.5 1 0.5 0
3
4
5
6
7
8
9
10
Category
11
12
13
14
15
Fig. 9. The number of pixels of PC1 of 7 ERS2-SAR images classified by Isodata as permanent water with respect to the number of categories selected for the Isodata classifier. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
Table 6 The performance of the Isodata classifier with respect to the number of categories applied to the PC1 of 7 ERS2-SAR images. No. of sub-category
7
8
9
10
HK Overall accuracy (%) Dry landa (%) Flooded areaa (%) Permanent watera (%)
0.19 51.21 55.33 37.49 61.49
0.24 61.23 75.61 29.93 50.99
0.24 63.06 79.65 27.85 48.79
0.25 63.75 81.20 26.99 47.93
a
Producer’s accuracy.
PC1, the point of inflexion occurs at about the 8-category because beyond which the number of pixels classified by Isodata as permanent water asymptotically approaches a lower limit, which is also closest to the number of pixels of permanent water identified from the DEM data of SRTM (Fig. 9). The performance of the Isodata classifier for 7–10 categories (Table 6) was compared with that of the 8-category (Table 7). Using 7 categories, the permanent water and flooded area were more accurately classified by Isodata than with 8 categories but at the expense of the accuracy of dry land. Conversely, when the number of categories was increased to 9 and 10, the overall performance of Isodata increased marginally but the accuracy of flooded area and permanent water classified by Isodata slowly decreases. Given that our focus is to optimize the classification of permanent water and flooded area, it seems that the best strategy is to classify the
Table 7 A contingency table of 3-landuses classified by the Isodata classifier with 8-category applied to PC1 of 7 ERS2-SAR images. HK = 0.24
No. of referenced pixel
Overall accuracy = 61.23%
Dry land
Flooded area
Permanent water
5411971 1606210 139669 75.61
1612182 880424 449458 29.93
190141 329060 540160 50.99
No. of classified pixel
Producer’s accuracy (%)
Dry land Flooded area Permanent water
User’s accuracy (%)
75.02 31.27 47.83
80
T.Y. Gan et al. / International Journal of Applied Earth Observation and Geoinformation 18 (2012) 69–81
PC1 image with 8 categories, whose results are shown in Table 7 (overall accuracy = 61.23%, HK score = 0.24). The same process was repeated for the combined images of PC2 and PC3, but the overall performance achieved by Isodata was marginally poorer than that achieved with the PC1 image. Therefore the most accurate flood map for the RCB of Romania achieved in this study is by classifying the PC1 of 7 multi-date ERS2-SAR images using the Isodata classifier with 8-category. 6. Summary and conclusions Among 7 ERS2-SAR images selected to map the flooding of Bistret of Romania, several images represent “dry” and several “wet” conditions, and the latter were due to the major Danube flooding event of April-May, 2006. The images were classified into (a) permanent water (mainly Danube River), (b) flooded area, and (c) dry land, using (1) a single image, pixel-based unsupervised classification, (2) frequency-based contextual classification, and (3) principal component analysis (PCA) combined with the Isodata classifier, which represent the simplest to the most complex techniques used in this study. The flooded areas delineated from the above procedures are visually compared with Landsat-TM and MODIS mosaic images of the study site at Bistret, and quantitatively assessed with the classifications obtained from the DEM data of SRTM. Apparently there is no one technique that is consistently better partly because of the nature of SAR data (radar echoes), and partly because of data noise even though the images were first subjected to speckle filtering, and partly because SAR images could appear dark not only because of flooding but also because of smooth surfaces or other complicating factors. However, if multidate SAR images of both DRY and WET (flooding) conditions are available, it seems that PCA combined with the Isodata classification would give better defined flooded areas of the Danube River than the simple, single image, pixel-based classification or the multi-image, contextual classification. For the two methods involving multi-date SAR images, the flooded areas extracted for the Rast-Catane-Bistret (RCB) reach of Danube River are representative partly because the flood water persisted for weeks. It seems that classification methods involving multi-date images can generally address the operational needs of mapping major floods. It should be noted that methods involving multi-date images should be more effective in addressing the operational needs of mapping major than minor floods that may not persist beyond a few days, or more effective when applied to multi-band images acquired at the same time, or to those geographical movements that are less time sensitive, such as lake area or crop mapping. 7. Recommendation for future research A possible future extension of this flood mapping effort using satellite data may be done by combining images of optical and SAR sensors, such as Radarsat-2 given that the ERS2-SAR satellite has been shut down recently (September 5, 2011), and Radarsat-2 SAR data are quad-polarization data (HH, VV, HV, and VH). The idea is to exploit the respective strength of SAR data of multiple polarizations, and that of optical sensors in an optimum manner. For example, Prigent et al. (2007) used multi-satellite method employing passive microwave land surface emissivities calculated from SSM/I and ISCCP observations, ERS scatterometer responses, and AVHRR visible and near-infrared reflectance to map inundations at global scale. Second, in terms of techniques, the active contours of areal features such as water bodies defined according to the geometry of the
target (Horritt, 1999) can be modified or “deformed” using statistical and boundary information derived from multispectral images (e.g., Jodouin et al., 2003). Lastly, the Surrey Satellite Technology Ltd. (SSTL) has developed the Disaster Monitoring Constellation (DMC) of micro-satellites at low cost, to provide some landfocused data products in the visible and near IR wavebands with a highly attractive daily access capability. Future work will explore the possibility of using such data in flood mapping of European rivers. Acknowledgements The first author was jointly supported by the German Academic Exchange Service (DAAD) and the Technische Universität München, Germany during his sabbatical leave from the University of Alberta. The ERS2-SAR data was provided through a research proposal funded by the European Space Agency (ESA) under its Category 1 project program. Image analysis was done using the PCI software. The third author was partly funded by the Canadian Foundation for Climate and Atmospheric Science (CFCAS). Appendix A. Hanssen–Kuiper (HK) score To compute the HK skill score, the classified and referenced landuse are grouped into three categories: Dry Land, Flooded Area, and Permanent Water. Then a contingency table is established to calculate HK score. For an L × L contingency table, the HK score may be expressed in terms of probabilities as
L
HK =
i=1
p(refi , clasi ) − 1−
L
L
j=1
i=1
p(refi ) × p(clasi )
[p(refj )]2
(A1)
where refi and clasi are the ith referenced and classified values. HK score values range from −1 to +1. Perfect classifications receive a score of 1, random classifications receive a score of 0, and classifications inferior to random classifications receive negative scores. In this paper L equals to three. References Bates, P.D., 2004. Remote sensing and flood inundation modeling. Hydrol. Processes 18, 2593–2597. Bates, P.D., De Roo, A.P.J., 2000. A simple raster based model for flood inundation simulation. J. Hydrol. 236, 54–77. Belz, J.U., Goda, L., Buzis, Z., Domokos, M., Engel, H., Weber, J., 2004. Flow regime of River Danube and its catchment. The Danube and its catchment – a hydrological monograph, follow-up, vol. VIIIl2. Koblenz and Baja, 152 pp. Bratrich, C., Moroz, S., 2006. 2006 Floods in the Danube River basin. Working Paper of WWF, Vienna. Farr, T.G., et al., 2007. The shuttle radar topography mission. Rev. Geophys. 45, RG2004. Gong, P., Howarth, P., 1992. Frequency-based contextual classification and graylevel vector reduction for land-use identification. Photogrammetr. Eng. Remote Sens. 58 (4), 423–437. Hanssen, A.W., Kuipers, W.J.A., 1965. On the relationship between the frequency of rain and various meteorological parameters. Med. Verh 81, 2–15. Hendrickx, J.M.H., Remke, L. van D., Borchers, B., Curtis, J., Lensen, H.A., Harmon, R., 2003. Worldwide distribution of soil dielectric and thermal properties. In: Proceedings of SPIE – The International Society for Optical Engineering (SPIE). Hess, L.L., Melack, J.M., Solange, F., Wang, Y., 1995. Delineation of Inundated area and vegetation along the Amazon floodplain with the SIR-C Synthetic Aperture Radar. IEEE Trans. Geosci. Remote Sensing 33 (4), 896–905. Horritt, M.S., 1999. A statistical active contour model for SAR image segmentation. Image Vision Comput. 17 (3–4), 213–224. Horritt, M.S., 2006. A methodology for the validation of uncertain flood inundation models. J. Hydrol. 326, 153–165. Huang, S.Q., Liu, D., Gao, G., Guo, X., 2009. A novel method for speckle noise reduction and ship target detection in SAR images. Pattern Recogn. 42, 1533–1542. Jodouin, S., Bentabeta, L., Zioua, D., Vaillancourta, J., Armenakis, C., 2003. Spatial database updating using active contours for multispectral images: application with Landsat 7. ISPRS J. Photogrammetr. Remote Sensing 57, 346–355.
T.Y. Gan et al. / International Journal of Applied Earth Observation and Geoinformation 18 (2012) 69–81 Nolin, A.W., Payne, M.C., 2007. Classification of glacier zones in western Greenland using albedo and surface roughness from the Multi-angle Imaging SpectroRadiometer (MISR). Remote Sensing Environ. 107, 264–275. Prigent, C., Papa, F., Aires, F., Rossow, W.B., Matthews, E., 2007. Global inundation dynamics from multiple satellite observations, 1993–2000. J. Geophys. Res. 112, D12107. Rodriguez, E., Morris, C.S., Belz, J.E., Chapin, E.C., Martin, J.M., Daffer, W., Hensley, S., 2005. An assessment of the SRTM topographic products. Technical Report JPL D-31639. Jet Propulsion Laboratory, Pasadena, California, 143 pp. Schwarz, U., Bratrich, C., Hulea, O., Moroz, S., Pumputyte, N., Rast, G., Bern, M.R., Siposs, V., 2006. Floods 2006 in the DRB: Restoration of Floodplains. WWF, Vienna. Seiler, R., Schmidt, J., Diallo, O., Csaplovics, E., 2009. Flood monitoring in a semi-arid environment using spatially high resolution radar and optical data. J. Environ. Manage. 90, 2121–2129.
81
Shi, Z., Fung, K.B., 1994. A comparison of digital speckle filters. In: Proceedings of IGRASS 94, August 8–12. Solbø, S., Eltoft, T., 2008. A stationary wavelet-domain Wiener filter for correlated speckle. IEEE Trans. Geosci. Remote Sensing 46 (4). Tou, J.T., Gonzalez, R.C., 1974. Pattern Recognition Principles. Addison-Wesley Publishing Co. Wang, Y., Koopmans, B.N., Pohl, C., 1995. The 1995 flood in the Netherlands monitored from space – a multi-sensor approach. Int. J. Remote Sensing. 16 (15), 2735–2739. Younis, J., Ramos, M.H., Thielen, J., 2008. EFAS forecasts for the March–April 2006 flood in the Czech part of the Elbe river basin – a case study. Atm. Sci. Lett. 9, 88–97.