Journal Pre-proofs
Research papers
Flood susceptibility mapping using convolutional neural network frameworks
Yi Wang, Zhice Fang, Haoyuan Hong, Ling Peng
PII: S0022-1694(19)31217-X
DOI: https://doi.org/10.1016/j.jhydrol.2019.124482
Reference: HYDROL 124482
To appear in: Journal of Hydrology
Received Date: 24 September 2019
Revised Date: 2 December 2019
Accepted Date: 16 December 2019
Please cite this article as: Wang, Y., Fang, Z., Hong, H., Peng, L., Flood susceptibility mapping using convolutional neural network frameworks, Journal of Hydrology (2019), doi: https://doi.org/10.1016/j.jhydrol.2019.124482
This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
© 2019 Published by Elsevier B.V.
Flood susceptibility mapping using convolutional neural network frameworks

Yi Wang1,*, Zhice Fang1, Haoyuan Hong2,3,4,5,*, Ling Peng6

1 Institute of Geophysics and Geomatics, China University of Geosciences, Wuhan 430074, China
2 Department of Geography and Regional Research, University of Vienna, Universitätsstraße 7, 1010 Vienna, Austria
3 Key Laboratory of Virtual Geographic Environment (Nanjing Normal University), Ministry of Education, Nanjing, 210023, China
4 State Key Laboratory Cultivation Base of Geographical Environment Evolution (Jiangsu Province), Nanjing, 210023, China
5 Jiangsu Center for Collaborative Innovation in Geographic Information Resource Development and Application, Nanjing, Jiangsu 210023, China
6 China Institute of Geo-Environment Monitoring, Beijing 100081, China

*Correspondence Author: Yi Wang ([email protected]); Haoyuan Hong ([email protected])
1. Introduction
Floods are among the most common and catastrophic geo-hazards, yet they remain poorly understood (Tehrany et al., 2014a). Their severe impact on natural ecosystems and human activities has greatly affected economic and social sustainability (Rahmati et al., 2015). Therefore, it is essential to identify flood-prone zones to prevent or mitigate the adverse effects of flooding (Bathrellos et al., 2017; Zhu et al., 2015).
In the past few years, various statistical and machine learning (ML) methods have been successfully applied to flood susceptibility mapping (FSM) (Bui et al., 2019b; Mojaddadi et al., 2017). Statistical methods, which are mainly based on the assumption that historical flood events are closely related to flood predisposing factors, include frequency ratio (Rahmati et al., 2016; Tehrany et al., 2015a), logistic regression (Youssef et al., 2015a), weights of evidence (Tehrany et al., 2014b; Youssef et al., 2015b), analytic hierarchy process (Kazakis et al., 2015) and multiple criteria decision analysis (Liu et al., 2018; Santos et al., 2019a; Santos et al., 2019b; Wang et al., 2019b). Recently, ML techniques have also been applied to FSM, such as artificial neural networks (Bui et al., 2019a; Kia et al., 2012), decision trees (Choubin et al., 2019; Khosravi et al., 2018; Tehrany et al., 2013), random forest (Rizeei et al., 2018) and support vector machine (SVM) (Choubin et al., 2019; Tehrany et al., 2014b; Tehrany et al., 2015a; Tehrany et al., 2015b).
In general, when applying an ML framework to a particular problem, feature engineering is a critical step that derives an appropriate feature representation from the raw data (e.g., pixel values of an image) prior to data modeling. Furthermore, the performance of ML algorithms depends to a large extent on the representation of the raw data (Goodfellow et al., 2016). In this sense, conventional ML methods can neither directly uncover instructive representations from raw data nor obtain new insights from these representations to further improve predictive capability (LeCun et al., 2015). More recently, deep learning, as one of the most popular ML techniques, has received widespread attention and has obtained reliable results comparable or superior to those of conventional ML methods (Schmidhuber, 2015). This technique includes a broad class of methods for computer vision, signal processing and natural language processing with multiple network architectures (Graves et al., 2013; Hinton et al., 2012; Sutskever et al., 2014). Among various deep learning methods, a convolutional neural network (CNN), whose architecture is inspired by biological visual perception, can identify representations with extreme variability through convolutional and pooling layers (Wang et al., 2019a). Gebrehiwot et al. (2019) successfully applied a CNN to identify flood extents using unmanned aerial vehicle data. Moreover, several feature engineering methods, such as correlation-based analysis, information gain ratio and multi-collinearity analysis, have been used in flood susceptibility analysis (Bui et al., 2019b; Khosravi et al., 2019; Zhao et al., 2019). Since CNNs can dispense with these tedious feature engineering steps and extract useful information directly from the original data, they can play an important role in FSM. Although CNNs are more accurate than state-of-the-art ML methods in many fields (Ghorbanzadeh et al., 2019; Salamon and Bello, 2017; Simard et al., 2003; Yu et al., 2017), few studies have applied this technique to FSM.
Choosing mapping units is a key step in flood susceptibility modelling. When ML methods are applied to FSM, grid cells (pixels) are commonly used as mapping units because they are easy to process (Bui et al., 2019a; Das, 2019a; Termeh et al., 2018). Essentially, FSM is a binary classification task that determines whether a pixel will be a flood point and predicts its probability of occurrence. In the FSM process, each grid cell is considered as a feature vector that contains information about different triggering factors, which can be processed by ML models. Therefore, the entire study area can be regarded as a multi-channel “image”, where each channel represents a flood triggering factor layer, which makes it possible to apply the CNN technique to flood susceptibility analysis. In this paper, the CNN framework is introduced to assess flood susceptibility in Shangyou County, China. CNNs can use different convolutional operations to extract various information from different data modalities. The three main contributions of this paper are as follows. First, to the best of our knowledge, the application of CNNs in FSM is still very rare, and we explore its feasibility in this study. Second, two CNN frameworks for FSM are presented. In the first framework, the CNN model is used directly as a classifier to assess flood susceptibility. In the second framework, the CNN is integrated with the SVM classifier in a hybrid manner: (1) the CNN model is used to extract powerful and useful representations of flood triggering factors, and (2) the SVM classifier performs classification with the extracted features. Third, three data representation methods are designed for the CNN architectures to suit classification and feature extraction in the FSM process. To validate and compare the predictive capability of the proposed CNN-based methods against SVM, several objective criteria were used: overall accuracy (OA), kappa coefficient (κ), receiver operating characteristic (ROC) and area under the ROC curve (AUC).
2. Study area and available data

2.1. Study area

Shangyou County is located in the southern part of Jiangxi Province, China, in the northern hilly district of Nanling. The study site covers an area of approximately 1543 km² between 25°42′N and 26°01′N and between 114°00′E and 114°40′E. The Shangyou district has a total population of approximately 3.22 million people, and the elevation ranges between 110 and 1901 m above sea level (Fig. 1). Shangyou County is located in the southern subtropical zone and belongs to the humid monsoon climate zone of the subtropical hilly region. In this study area, the climate is mild, rainfall is abundant, sunshine is sufficient, the four seasons are distinct, and the frost-free period is long. Specifically, the average annual rainfall during 1959–2014 ranged between 933.7 and 2147.6 mm, according to the Shangyou Meteorological Bureau. Rainfall differs greatly between spring and summer, and the annual rainy season in Shangyou district lasts from April to August. The average annual sunshine time is 1708.3 h, which is an intermediate level in Jiangxi Province, and the average annual temperature is 18.6 ℃. Moreover, January and July are the coldest and hottest months, with average temperatures of -2.7 ℃ and 38 ℃, respectively. Vegetation coverage accounts for more than 95% of the study area, mainly including farmland, forests and grasslands. Meanwhile, the study area is rich in water resources. The annual average runoff of surface water in Shangyou County is 3.52 billion cubic meters. The groundwater is mainly loose pore water, bedrock fissure water and underground hot water. In short, from a climate perspective, the study area is rich in precipitation and extreme weather events. As a result, the area is vulnerable to climatic factors that cause flooding.
Fig. 1. Location of the study area.
2.2. Flood inventory mapping
A flood inventory map records the locations of inundated areas and provides detailed information about the characteristics of historical flood events (Santos et al., 2019a; Zazo et al., 2018). In this study, a flood inventory map with 108 historical flood events was provided by the Jiangxi Meteorological Bureau¹ and the Department of Civil Affairs of Jiangxi Province², as shown in Fig. 1. The locations of flood events were obtained through historical records, extensive field surveys and visual interpretation of Google Earth images. All the sampling points in Fig. 1 represent locations where floods occurred and were used to produce training and validation sets: 76 locations (70%) were randomly selected for training, whereas 32 locations (30%) were used for validation. Moreover, the same number of non-flood locations (76 and 32) were randomly selected from the flood-free area to construct the training and validation sets.

¹ http://www.weather.org.cn
² http://www.jxmzw.gov.cn
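The 70/30 split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the fixed seed and the idea of drawing non-flood points from a pre-built candidate list are hypothetical, and the paper does not describe its sampling code.

```python
import random


def split_inventory(flood_points, non_flood_candidates, train_fraction=0.7, seed=0):
    """Return (train, validation) lists of (point, label) pairs.

    Label 1 marks a flood location and label 0 a non-flood location.
    The seed and the candidate list are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    flood = list(flood_points)
    rng.shuffle(flood)
    n_train = round(train_fraction * len(flood))  # 76 when there are 108 events
    # Draw exactly as many non-flood locations as there are flood locations.
    non_flood = rng.sample(list(non_flood_candidates), len(flood))
    train = [(p, 1) for p in flood[:n_train]] + [(p, 0) for p in non_flood[:n_train]]
    valid = [(p, 1) for p in flood[n_train:]] + [(p, 0) for p in non_flood[n_train:]]
    return train, valid
```

With 108 flood events this yields 152 training samples (76 per class) and 64 validation samples (32 per class), matching the counts given in the text.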
2.3. Flood triggering factors

Based on a literature review (Arabameri et al., 2019; Khosravi et al., 2019; Shafizadeh-Moghadam et al., 2018) and the characteristics of the study area, 13 flood triggering factors were selected for FSM: altitude, aspect, curvature, slope, stream power index (SPI), sediment transport index (STI), topographic wetness index (TWI), lithology, land use, normalized difference vegetation index (NDVI), soil, distance to rivers and rainfall.
Altitude is a frequently used flood triggering factor, and lower altitudes are usually accompanied by higher river discharge (Bout and Jetten, 2018; Sowmya et al., 2015; Wang et al., 2016). Aspect is the orientation of the maximum slope of the terrain surface; it influences hydrologic conditions because different orientations are exposed to different amounts of precipitation and solar radiation (Bathrellos et al., 2018; Xu et al., 2016). Curvature describes the degree of distortion of the slope surface and reflects the morphology of the topography in a given area (Mojaddadi et al., 2017). The curvature layer can be computed from digital elevation model (DEM) data using a fourth-order surface model (Zevenbergen and Thorne, 1987). Slope directly affects surface runoff and vertical percolation since water flows from higher to lower altitude (Chaabani et al., 2018). SPI represents the capacity for river sediment transport and fluvial channel erosion (Pamučar et al., 2017) and is calculated as follows (Moore and Wilson, 1992):

$$\mathrm{SPI} = A_s \tan\beta \qquad (1)$$

where $A_s$ is the area of the basin and $\beta$ is the slope gradient (in degrees). STI qualitatively explains the process of erosion and deposition (Tehrany et al., 2015a) and is calculated as follows (Moore et al., 1993):

$$\mathrm{STI} = \left(\frac{A_s}{22.13}\right)^{0.6} \left(\frac{\sin\beta}{0.0896}\right)^{1.3} \qquad (2)$$

where the parameters in Eq. (2) are defined as for SPI. TWI is a physical attribute that reflects the wetness condition of the ground (Chapi et al., 2017), and it is computed as follows (Beven and Kirkby, 1979):

$$\mathrm{TWI} = \ln\left(\frac{\alpha}{\tan\beta}\right) \qquad (3)$$

where $\alpha$ and $\beta$ represent the upslope area per unit contour length and the slope angle, respectively. Lithology determines the geological engineering characteristics of the study area (Arabameri et al., 2019; Hong et al., 2018b), and the lithological classes of the study area are described in Table 1. Land use contributes to flood occurrence because it may alter flood runoff and sediment transport (Das, 2019b; Rizeei et al., 2018). NDVI indicates surface vegetation coverage (Liang et al., 2017) and is calculated as follows:

$$\mathrm{NDVI} = \frac{R_{NIR} - R_R}{R_{NIR} + R_R} \qquad (4)$$

where $R_{NIR}$ and $R_R$ represent the spectral reflectance acquired in the near-infrared region and in the red region, respectively. Different soil surface characteristics have different water storage potential, which significantly affects the water balance (Hong et al., 2018a). The distance to rivers indicates the distance from river networks, which are the main pathways for flood discharge and expansion (Gigović et al., 2017; González-Arqueros et al., 2018; Qazi et al., 2016). Continuous rainfall events recharge basins and aquifers, and thus the average annual rainfall is commonly considered as a flood triggering factor (Arnell and Gosling, 2016; Tehrany et al., 2015b).
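As a worked illustration of Eqs. (1) to (4), the following minimal Python sketch evaluates the four indices for a single grid cell. The function and argument names are hypothetical, and converting the slope from degrees to radians before calling the trigonometric functions is an implementation detail assumed here rather than stated in the paper.

```python
import math


def spi(area, slope_deg):
    """Stream power index, Eq. (1): A_s * tan(beta)."""
    return area * math.tan(math.radians(slope_deg))


def sti(area, slope_deg):
    """Sediment transport index, Eq. (2)."""
    beta = math.radians(slope_deg)
    return (area / 22.13) ** 0.6 * (math.sin(beta) / 0.0896) ** 1.3


def twi(alpha, slope_deg):
    """Topographic wetness index, Eq. (3); undefined for a perfectly flat cell."""
    return math.log(alpha / math.tan(math.radians(slope_deg)))


def ndvi(r_nir, r_red):
    """Normalized difference vegetation index, Eq. (4)."""
    return (r_nir - r_red) / (r_nir + r_red)
```

In practice these functions would be applied cell by cell (or vectorized) over the 30 m DEM derivatives and Landsat reflectance bands; a slope of exactly zero needs special handling in `twi` because the denominator vanishes.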
The related information on all the triggering factors is listed in Table 2. The land use factor was obtained by classifying a Landsat 7 ETM+ satellite image (Scene ID: LE71220422001324SGS00), acquired on November 20, 2001, using the maximum likelihood algorithm with an OA above 85%. The river networks were extracted from the topographic map, and the distance to rivers factor was calculated using the Euclidean distance tool. All the factor layers were converted into a raster format with a grid size of 30 × 30 m, which corresponds to the spatial resolution of the DEM data. All the continuous factors were reclassified into categorical classes using ArcGIS software, based on previous studies (Chen et al., 2019; Gigović et al., 2017; González-Arqueros et al., 2018; Mojaddadi et al., 2017), the characteristics of flood spatial distributions and the numerical range of these factors. The reclassification maps of all the flood triggering factors are shown in Fig. 2.
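The reclassification of a continuous factor into categorical classes can be illustrated with a short Python sketch. The break values below are purely hypothetical placeholders (the actual class limits are those shown in Fig. 2), and the paper performed this step in ArcGIS rather than in code.

```python
from bisect import bisect_right

# Hypothetical class breaks for slope in degrees; NOT the values used in the paper.
SLOPE_BREAKS_DEG = [5, 10, 20, 30]


def reclassify(value, breaks):
    """Return the 1-based class index of value for ascending break points.

    A value equal to a break point falls into the higher class.
    """
    return bisect_right(breaks, value) + 1
```

With four break points the function returns classes 1 to 5, for example class 1 for a slope of 3° and class 5 for a slope of 45°.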
Fig. 2. Thematic maps of the study area. (a) Altitude, (b) aspect, (c) curvature, (d) distance to rivers, (e) land use, (f) lithology, (g) NDVI, (h) rainfall, (i) slope, (j) soil, (k) SPI, (l) STI and (m) TWI.
Table 1 Lithological types of the study area.

| Group name | Unit name | Lithology |
|---|---|---|
| A | Yang Jiayuan group; Zishan group | Dolomite, calcareous siltstone, carbonaceous siltstone |
| B | Zhongpeng group; Mashan group | Sandstone, silty shale, siltstone |
| C | Huangxie unit; Changshan unit; Xia Haihui unit | Monzonitic granite, potassium granite |
| D | Tangbian group; Hekou group | Conglomerate, mudstone, pebbly sandstone |
| E | Xia Huangkeng group; Chang kengshui group; Dui Ershi group | Sericite slate, siliceous slate |
| F | Nan Pingshan unit; Tanghu super unit; Fufang super unit | Tonalite, phyric granodiorite, porphyry monzogranite |
| G | Yingqian super unit; Huang Tiankeng unit | Granodiorite |
| H | Bali group; Lao Hutang group | Phyllite, sandy slate, feldspathic quartz sandstone |
| I | Shuishi group; Gaotan group | Silty slate, carbonaceous slate |
Table 2 Data sources and the associated factors in the study area.

| Flood triggering factors | Source of data | Scale/Resolution |
|---|---|---|
| Altitude, aspect, curvature, slope, SPI, STI, TWI | ASTER GDEM Version 2 | 30 m |
| Lithology | China Geology Organization | 1:2,000,000 |
| Land use, NDVI | Landsat 7 ETM+ images | 30 m |
| Soil | Institute of Soil Science, Chinese Academy of Sciences | 1:1,000,000 |
| Distance to rivers | ASTER GDEM Version 2 | 30 m |
| Rainfall | Jiangxi Meteorological Bureau | 1:50,000 |

3. Methodology
The flowchart of the proposed methodology for identifying flood susceptibility zones is illustrated in Fig. 3. Specifically, a flood inventory map and flood triggering factors are first prepared to construct training and validation sets. Then, three data representation forms are used to train CNN architectures of different dimensions. Next, the trained CNN architectures are applied for classification and feature extraction in the FSM process. Finally, the prediction results obtained by the different CNN-based methods are quantitatively evaluated using several objective criteria.
Fig. 3. Flowchart of the proposed methodology.
3.1. SVM
SVM maps the original data into a high-dimensional feature space by using kernel functions (Cortes and Vapnik, 1995) and attempts to find an optimal hyperplane that separates the true and false classes (Vapnik, 1999). Recently, it has been widely used in flood susceptibility assessment (Choubin et al., 2019; Tehrany et al., 2019; Tehrany et al., 2014b; Tehrany et al., 2015b).
Assuming that a training set $x = \{x_1, x_2, \ldots, x_N\} \in \mathbb{R}^{B \times N}$ consists of two classes denoted as $L = \{1, -1\}$, the aim of SVM is to ensure hyperplane margin maximization and error minimization. The error minimization is expressed as follows:

$$\min_{w,\,\xi_i,\,b} \left\{ \frac{1}{2}\|w\|^2 + C \sum_i \xi_i \right\} \qquad (1)$$

$$\text{s.t.}\quad y_i (w \cdot x_i + b) \ge 1 - \xi_i,\quad \xi_i \ge 0,\quad i = 1, 2, \ldots, l$$

where $w$ is the weight vector, $b$ denotes the bias, $\xi_i$ are the slack variables for non-separable data and $C$ denotes the penalty parameter. The optimization (1) is solved as follows:

$$\max \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1,\,j=1}^{l} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \qquad (2)$$

$$\text{s.t.}\quad 0 \le \alpha_i \le C,\quad \sum_{i=1}^{l} \alpha_i y_i = 0$$

where $\alpha_i$ are the Lagrange multipliers and $K(x, x_i)$ is the kernel function. The decision function is written as follows:

$$f(x) = \sum_{i=1}^{l} \alpha_i y_i K(x, x_i) + b \qquad (3)$$
The SVM with the radial basis function (RBF) kernel can achieve excellent performance in classification and prediction tasks for the following reasons. First, the RBF kernel function can map a sample to a higher-dimensional space, and the linear kernel function is essentially a special case of RBF. Meanwhile, the RBF and sigmoid kernels have similar performance for certain parameters. Second, compared with the polynomial kernel function, RBF has only a few parameters to tune, and the number of kernel function parameters directly influences the complexity of the function. Finally, the RBF kernel has fewer numerical difficulties.
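To make the decision function in Eq. (3) concrete, the sketch below evaluates it with an RBF kernel in plain Python. The kernel is written in its standard form $K(x, z) = \exp(-\gamma \|x - z\|^2)$; the value of `gamma`, the function names and the toy support vectors mentioned afterwards are assumptions for illustration, not the settings used in the paper.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """Standard RBF kernel exp(-gamma * ||x - z||^2); gamma is an assumed value."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def decision_function(x, support_vectors, alphas, labels, b, gamma=0.5):
    """Evaluate f(x) = sum_i alpha_i * y_i * K(x, x_i) + b, as in Eq. (3)."""
    return sum(
        alpha * y * rbf_kernel(x, sv, gamma)
        for alpha, y, sv in zip(alphas, labels, support_vectors)
    ) + b
```

For two toy support vectors at (0, 0) with label +1 and (2, 2) with label -1, equal multipliers and zero bias, the function is positive near the first point and negative near the second, so the sign of f(x) gives the predicted class.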
3.2. Data representations for CNNs

CNN structures of different dimensions can be constructed to adapt to input data of different dimensions, and this choice can greatly affect the final classification and prediction results (Kussul et al., 2017). In addition, the representation of raw data is crucial because CNNs are based on the principle of locality (Li et al., 2019). In this section, three data representations are introduced to fit the novel CNN architectures, as illustrated in Fig. 4. Initially, the 13 flood triggering factor layers are stacked together to build a multi-channel “image” with a spatial resolution of 30 m, which is regarded as the input data for further data transformation. The three data representations are briefly described as follows:
(1) For the one-dimensional data form in Fig. 4 (a), each pixel vector is considered as an image with several flood triggering attributes hidden in each pixel. Specifically, the input data consist of a set of column vectors whose length is determined by the number of flood triggering factors. Each element in the vector represents the attribute value of the corresponding flood triggering factor.

(2) As shown on the right side of Fig. 4 (a), the two-dimensional data form is derived from the one-dimensional data. The most critical issue is how to construct a bridge from a one-dimensional vector to a two-dimensional matrix. In this study, each pixel vector in the multi-channel “image” is converted into an n × n matrix, where n is the larger of the number of flood triggering factors and the maximum number of categories into which any factor is reclassified. For example, there are 13 flood triggering factors in the study area, which is larger than the maximum number of categories (9) of the aspect and lithology factors. Thus, n is set to 13. Then, each value in the pixel vector (one-dimensional data) is expanded into a row vector of length 13. Specifically, the fourth element of “6” (the attribute value of the distance to rivers factor) in the pixel vector is converted into the row vector [0 0 0 0 0 1 0 0 0 0 0 0 0], in which the sixth element is assigned a value of 1 and all other elements are assigned 0. Finally, all row vectors form a 13 × 13 matrix (two-dimensional data).

(3) In the case of three-dimensional data, each pixel and its adjacent pixels form a three-dimensional window patch, as shown in Fig. 4 (b). The patch has the same label as the current pixel. Assuming the window size is 5, each data sample is 13 × 5 × 5.
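The three representations listed above can be sketched in plain Python as follows. The one-dimensional form is simply the pixel vector of class values, so only the 2D and 3D conversions need code. The function names are hypothetical, and the zero padding used for windows that extend beyond the raster edge is an assumption of this sketch; the paper does not state how border pixels are treated.

```python
def to_2d(pixel_vector, n=13):
    """Expand each 1-based class value into a one-hot row of length n."""
    return [[1 if col == value - 1 else 0 for col in range(n)]
            for value in pixel_vector]


def to_3d(image, row, col, window=5):
    """Cut a channels x window x window patch centred on (row, col).

    `image` is indexed as image[channel][row][col]. Cells outside the raster
    are filled with 0 (an assumption made for this sketch).
    """
    half = window // 2
    n_rows, n_cols = len(image[0]), len(image[0][0])
    patch = []
    for channel in image:
        plane = []
        for r in range(row - half, row + half + 1):
            plane.append([
                channel[r][c] if 0 <= r < n_rows and 0 <= c < n_cols else 0
                for c in range(col - half, col + half + 1)
            ])
        patch.append(plane)
    return patch
```

For a pixel vector whose fourth element is 6, `to_2d` produces a 13 × 13 matrix whose fourth row has a 1 in the sixth position, reproducing the example in item (2); `to_3d` on a 13-channel raster returns a 13 × 5 × 5 sample as in item (3).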
Fig. 4. Different data forms. (a) One-dimensional and two-dimensional data forms, (b) three-dimensional data form.

3.3. CNN architectures
CNNs, which exhibit robust performance in computer vision and image processing, are essentially multilayer feed-forward neural networks that can automatically extract valuable features from raw data (Zhang et al., 2018). The CNN architecture is mainly composed of an input layer, multiple hidden layers and an output layer, and the hidden layers consist of one or more convolutional and pooling layers (Ghorbanzadeh et al., 2019; LeCun et al., 2015). The convolutional layer, which consists of several convolution kernels, iteratively extracts sophisticated and effective features from the original data (Canziani et al., 2016; Mallat, 2016). The pooling (sub-sampling) layer usually follows a convolutional layer to reduce the dimensionality of feature maps through a down-sampling algorithm, which can avoid overfitting and reduce computational cost (Chen et al., 2016). Moreover, CNNs have strong adaptive capability and can achieve excellent performance on data of different dimensions (Shin et al., 2016). In this section, three CNN architectures of different dimensions are developed for regional flood susceptibility analysis.
3.3.1. 1D-CNN

The 1D-CNN structure comprises five layers: an input layer, a convolutional layer, a max pooling layer, a fully connected layer and an output layer. It was developed and trained using one-dimensional data. Assuming that there are n flood triggering factors in the input data and that N convolutional kernels with a size of a × 1 are used to filter the input data, the resultant layer has N feature vectors. Each element of a feature vector is connected to an a × 1 neighborhood in the input vector. A b × 1 max pooling layer is applied immediately after the convolutional layer, which reduces the length of the feature vectors. A fully connected layer with k neural units follows the max pooling layer to reorganize the extracted representations. Finally, m neural units form the output of the network. Fig. 5 shows the architecture of the 1D-CNN.
Fig. 5. 1D-CNN architecture (n = 13, N = 20, a = 3, b = 2, k = 20 and m = 2).
3.3.2. 2D-CNN

The two-dimensional CNN structure consists of two convolutional layers, two max pooling layers and one fully connected layer. Assume that the input data is an n × n image and that the first convolutional layer has N kernels with a size of a × a, which produce N feature maps. After a b × b max pooling operation, the size of the feature maps is reduced by half while the depth remains unchanged. The resulting feature maps are sent to the second convolutional layer, which has M kernels with a size of a × a. Then, the resultant M feature maps are sent to the second b × b max pooling layer, and a fully connected layer with k neural units is used to recognize these extracted features. Finally, m neural units form the output of the 2D-CNN architecture. Fig. 6 shows the structure of the 2D-CNN.
Fig. 6. 2D-CNN architecture (n = 13, a = 3, N = 20, M = 10, b = 2, k = 50 and m = 2).
3.3.3. 3D-CNN
The 3D-CNN structure contains one convolutional layer, one max pooling layer and two fully connected layers. Assuming that each input sample can be represented as a c × n × n matrix and that N convolutional kernels with a size of a × a × a are used to filter the input data, the resultant layer has N feature maps. The b × b × b max pooling layer is used to reduce the size of the feature maps by half. Then, two fully connected layers with k neural units are used to reorganize the extracted information. Finally, the output layer, which contains m neural units, represents the final result of the whole network. Fig. 7 shows the structure of the 3D-CNN.
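The layer sizes implied by the three architectures can be checked with simple shape arithmetic. The sketch below assumes stride-1 convolutions without padding and non-overlapping pooling that drops incomplete windows; these settings, like the helper names, are assumptions, since the paper only gives the kernel and pooling sizes listed in the figure captions.

```python
def conv_out(size, kernel):
    """Output length of a stride-1 convolution without padding (assumed)."""
    return size - kernel + 1


def pool_out(size, pool):
    """Output length of non-overlapping pooling; incomplete windows are dropped."""
    return size // pool


def flattened_features(input_shape, stages, n_kernels):
    """Apply ("conv", k) / ("pool", k) stages to every axis of input_shape.

    Returns the final per-map shape and the number of values passed to the
    first fully connected layer (n_kernels feature maps of that shape).
    """
    shape = list(input_shape)
    for kind, k in stages:
        op = conv_out if kind == "conv" else pool_out
        shape = [op(s, k) for s in shape]
    count = n_kernels
    for s in shape:
        count *= s
    return shape, count
```

Under these assumptions the 1D-CNN (n = 13, a = 3, b = 2, N = 20) passes 20 vectors of length 5 to the fully connected layer, the 2D-CNN ends with M = 10 maps of size 1 × 1, and the 3D-CNN (13 × 5 × 5 input) yields 20 maps of size 5 × 1 × 1.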
Fig. 7. 3D-CNN architecture (c = 13, n = 5, N = 20, a = 3, b = 2, k = 45 and m = 2).
3.3.4. Hyperparameters of CNN
Hyperparameter setting is a key step in building a CNN architecture (LeCun et al., 2015). This section describes the hyperparameters used in this study. Specifically, the sizes of the convolutional kernel and the pooling kernel determine the scale of the convolution and pooling operations, respectively (Choi et al., 2017; Golik et al., 2015; Simard et al., 2003). The activation function converts a linear relationship into a nonlinear one so that the neural network can approximate any nonlinear function (Audebert et al., 2019; Chen et al., 2016; Dahl et al., 2013; Mou et al., 2017). The loss function measures the degree of inconsistency between the predicted value and the true value (Chen et al., 2014; James and Stein, 1992; Nourani et al., 2015; Ranjbar et al., 2018; Singh, 1997). In addition, the optimizer iteratively updates the parameters at different learning rates through a gradient descent algorithm (Hinton et al., 2012). AdaGrad is a widely used optimizer because it automatically adjusts the learning rate based on the gradient value of the independent variable in each dimension, thereby avoiding the difficulty of fitting a single learning rate to all dimensions (Duchi et al., 2011; Paoletti et al., 2018; Schmidhuber, 2015). The learning rate is critical to the training process because it controls the learning speed of the CNN architecture (Marmanis et al., 2016; Salamon and Bello, 2017).
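The per-dimension learning rate adjustment of AdaGrad can be illustrated with a minimal sketch of its update rule (Duchi et al., 2011): squared gradients are accumulated for each parameter, and each step is scaled by the inverse square root of that accumulator. The base learning rate, the `eps` constant and the function name are assumptions for illustration; the paper does not report its optimizer settings here.

```python
import math


def adagrad_step(params, grads, accum, lr=0.1, eps=1e-8):
    """Perform one in-place AdaGrad update and return (params, accum).

    accum[i] holds the running sum of squared gradients for parameter i, so
    parameters with large past gradients receive smaller effective steps.
    """
    for i, g in enumerate(grads):
        accum[i] += g * g
        params[i] -= lr * g / (math.sqrt(accum[i]) + eps)
    return params, accum
```

Applied repeatedly to a toy quadratic loss such as (w0 - 3)² + 10 (w1 + 1)², the two parameters converge to 3 and -1 even though their gradients differ in scale by an order of magnitude, which is the behavior the paragraph above describes.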
3.4. FSM using CNNs

CNNs are characterized by local connections and shared weights, and their main advantage is that they effectively extract spatial information and automatically optimize network parameters (Krizhevsky et al., 2012; LeCun et al., 1998). To explore the predictive capability of the CNN framework, two different perspectives, classification and feature extraction, are proposed for flood susceptibility analysis. In the first perspective, CNNs are directly used as classifiers for FSM. In the second perspective, a hybrid framework is developed by integrating CNNs with the SVM classifier: the CNNs extract more representative features from the original data, which enables the classical ML classifier to achieve more effective prediction.
3.4.1. CNN classifiers

As mentioned in Sections 3.2 and 3.3, CNNs have achieved fairly good results in classification-related tasks (Hu et al., 2015). In the field of flood susceptibility assessment, the main objective is to predict where a flood event will occur, which can be treated as a binary classification process that labels a given region as “flood” or “non-flood”. In this study, the CNNs are directly used as classifiers for regional flood spatial prediction. The main steps of applying CNNs for classification are summarized as follows. First, the original data are reformed according to the data representation algorithms described in Section 3.2. Then, a CNN structure is built based on the dimensionality of the reformed data. After a series of convolutional and pooling operations, high-level feature maps are extracted from the input data and then reorganized by a fully connected layer. Finally, the reorganized feature vectors are converted into classification results using a nonlinear activation function. For simplicity, the generalized classification process using a CNN classifier with the 2D-CNN architecture is shown in Fig. 8.
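The final step, converting the two output units into a classification result, can be illustrated with a softmax function. The text only states that a nonlinear activation function is used, so the choice of softmax here is an assumption, as are the function name and the convention that the second output unit corresponds to the “flood” class.

```python
import math


def softmax(logits):
    """Numerically stable softmax: returns probabilities that sum to 1."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

With m = 2 output units, `softmax([z_non_flood, z_flood])[1]` can be read as the flood probability of a pixel, and mapping this value over all pixels gives a susceptibility surface.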
Fig. 8. Generalized CNN classifier with the 2D-CNN architecture.
3.4.2. Integration of CNNs and SVM
CNN is an effective ML technique that can discover new representations and relationships in the original data, and these features are not easily uncovered through traditional methods (Bergen et al., 2019). Furthermore, CNNs can automatically extract image features, including color, shape, texture and topological structures, by using different combinations of neurons and learning rules, which demonstrates their great potential in feature extraction as well (Niu and Suen, 2012). In other fields, the integration of CNNs and SVM has achieved promising results. For example, Niu and Suen (2012) used a 2D-CNN with five layers to extract features from the Modified National Institute of Standards and Technology (MNIST) data and applied SVM for final classification. The hybrid method of CNN and SVM has also been applied to microvascular morphological type recognition and hyperspectral image classification (Leng et al., 2016; Xue et al., 2016). In this study, different CNN structures are used to learn more suitable features from the original data. Based on this idea, the main steps of integrating CNN and SVM are summarized as follows. First, the original data are fed to the CNN structure in different dimensional forms. Then, after several layers of convolution and pooling operations, the hidden information of the input data is reorganized in the fully connected layer, where the neural units are regarded as the extracted features. Next, the SVM classifier is trained using the extracted high-level features. Finally, this integrated framework is used to perform FSM. Specifically, three CNN feature extractors of different dimensions are proposed to learn high-level features from the original data, with the same structures as described in Section 3.3. As an example, Fig. 9 illustrates the classification process in which the 2D-CNN architecture is used to extract features.
Fig. 9.
Integrated classifier using an ML classifier coupled with CNNs for feature extraction.
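To make the feature-extraction step more concrete, the following is a minimal pure-Python sketch of one convolution, ReLU and max-pooling stage applied to a one-dimensional factor vector. It is not the authors' Keras implementation: the fully connected layer and the SVM are omitted, the input values and kernel weights are hypothetical, and the kernel size of 3 and pooling size of 2 are assumed from the 1D-CNN settings in Table 3.

```python
def conv1d_valid(signal, kernel, bias=0.0):
    """Slide `kernel` over `signal` without padding (cross-correlation)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Rectified linear unit activation."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size=2):
    """Non-overlapping max pooling; an incomplete trailing window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def extract_features(factor_vector, kernels):
    """Concatenate the pooled responses of every kernel into one feature vector."""
    features = []
    for kernel in kernels:
        response = relu(conv1d_valid(factor_vector, kernel))
        features.extend(max_pool1d(response, size=2))
    return features


# Hypothetical input: 13 normalized factor values for one grid cell.
sample = [0.1, 0.4, 0.3, 0.9, 0.2, 0.5, 0.7, 0.6, 0.0, 0.8, 0.3, 0.1, 0.5]
# Hypothetical kernel weights; a trained CNN would learn these from data.
kernels = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
features = extract_features(sample, kernels)
print(len(features))  # 2 kernels x 5 pooled values = 10
```

In the integrated framework, a vector of this kind (taken from the fully connected layer of the trained network) would replace the original factor vector as the input of the SVM classifier.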
3.5. Evaluation methods
3.5.1. Feature evaluation
To evaluate the effectiveness of the features extracted by CNNs, a similarity measure (the mean value of the correlation coefficients) among all the samples is used. A greater similarity among the extracted features of the same class indicates that the input data are easier to discriminate. For clarity, the correlation coefficient is defined as follows (Goh et al., 2007):
$$\rho(x_1, x_2) = \frac{C(x_1, x_2)}{\sqrt{D(x_1)\,D(x_2)}} \qquad (1)$$
where $x_1$ and $x_2$ represent two feature vectors, $C(x_1, x_2)$ is their covariance, and $D(x_1)$ and $D(x_2)$ are the variances of $x_1$ and $x_2$, respectively.
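As an illustration of this similarity measure, the sketch below computes the correlation coefficient of Eq. (1) for two feature vectors and averages it over all sample pairs of one class. Averaging over unordered pairs within a class is an assumption about how the mean is taken, and the example vectors are hypothetical.

```python
from itertools import combinations
from math import sqrt


def correlation(x1, x2):
    """Correlation coefficient: covariance divided by the root of the variance product."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(x1, x2)) / n
    var1 = sum((a - m1) ** 2 for a in x1) / n
    var2 = sum((b - m2) ** 2 for b in x2) / n
    # A constant vector has zero variance and would raise ZeroDivisionError here.
    return cov / sqrt(var1 * var2)


def mean_similarity(samples):
    """Mean correlation over all unordered pairs of samples in one class."""
    pairs = list(combinations(samples, 2))
    return sum(correlation(a, b) for a, b in pairs) / len(pairs)


# Hypothetical feature vectors of three samples from the same class.
flood_class = [
    [0.9, 0.1, 0.8, 0.2],
    [0.8, 0.2, 0.9, 0.1],
    [0.7, 0.3, 0.9, 0.2],
]
print(mean_similarity(flood_class))
```

A value close to 1 means that the samples of a class have nearly proportional feature profiles, which is the sense in which a higher similarity suggests an easier classification problem.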
3.5.2. Model evaluation
The ROC curve has recently been used to assess the performance of FSM models (Dano et al., 2019; Kazakis et al., 2015; Popovic et al., 2018). Specifically, this curve is constructed by plotting “sensitivity” on the y-axis against “1 − specificity” on the x-axis (Bradley, 1997). The AUC value indicates the predictive capability as well (Choubin et al., 2019; Pamučar et al., 2018b; Xia et al., 2017). In particular, an AUC value of 1 indicates perfect spatial agreement between the observed and simulated data. An AUC value of 0.5 is the agreement that would be expected due to chance, i.e., a non-informative result, whereas a value of 0 indicates that the predicted ranking is completely inverted (Choubin et al., 2019; Fawcett, 2006; Tehrany et al., 2014b). In addition, OA is another widely used criterion to assess the predictive performance of FSM models (Chapi et al., 2017; Hong et al., 2018b; Rahmati et al., 2016). It is the ratio of the number of correctly classified pixels to the total number of pixels:
$$\mathrm{OA} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (2)$$
where TP (true positive) and TN (true negative) denote the numbers of flood and non-flood grid cells that are correctly classified, and FP (false positive) and FN (false negative) are the numbers of grid cells incorrectly classified as flood and non-flood, respectively. In addition, κ is another powerful statistical criterion that measures the agreement or reliability between two raters (Pontius Jr and Millones, 2011). For flood susceptibility assessment, this criterion can evaluate the prediction accuracy of FSM models, and a higher value implies better prediction (Khosravi et al., 2018; Mahmoud and Gan, 2018). This criterion is formulated as follows:
$$\kappa = \frac{P_C - P_{exp}}{1 - P_{exp}} \qquad (3)$$
$$P_{exp} = \frac{(TP + FN)(TP + FP) + (FP + TN)(FN + TN)}{(TP + TN + FN + FP)^2} \qquad (4)$$
where $P_C$ is the proportion of samples that have been correctly classified and $P_{exp}$ is the expected probability of chance agreement.
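The two criteria can be checked with a short calculation. The sketch below implements OA and κ from the four confusion-matrix counts, using the standard Cohen's κ in which the chance-agreement term is divided by the squared number of samples. The counts in the example are hypothetical: the confusion matrices are not reported, and these values were chosen only because they give OA = 84.38% and κ = 0.6875, as listed for 1D-CNN in Table 6, under the assumption of 64 balanced validation samples.

```python
def overall_accuracy(tp, tn, fp, fn):
    """Share of correctly classified cells among all cells."""
    return (tp + tn) / (tp + tn + fp + fn)


def kappa(tp, tn, fp, fn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + tn + fp + fn
    p_c = (tp + tn) / n
    p_exp = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n ** 2
    return (p_c - p_exp) / (1 - p_exp)


# Hypothetical counts for 32 flood and 32 non-flood validation cells.
tp, tn, fp, fn = 27, 27, 5, 5
print(overall_accuracy(tp, tn, fp, fn))  # 0.84375
print(kappa(tp, tn, fp, fn))             # 0.6875
```

With balanced classes and balanced predictions the chance agreement is 0.5, so κ reduces to 2·OA − 1, which is why OA and κ move together in such a setting.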
Additionally, the non-parametric Wilcoxon signed-rank test was implemented to assess the statistical significance of the differences among the models (Wilcoxon et al., 1970). The null hypothesis is that there is no significant difference between the FSM models at a significance level of α = 5%. Generally, if the p value is smaller than 0.05 and the Z value lies beyond the threshold of ±1.96, the null hypothesis is rejected and the difference between the two models is considered significant.
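For readers who wish to reproduce this test, the following sketch computes the Z statistic of the Wilcoxon signed-rank test with the usual normal approximation. It assumes that the paired observations are the susceptibility indices predicted by two models for the same cells, which is not stated explicitly here; it discards zero differences, assigns average ranks to ties, and applies neither a tie correction nor a continuity correction. The example data are hypothetical.

```python
from math import sqrt


def wilcoxon_z(a, b):
    """Z statistic (normal approximation) of the Wilcoxon signed-rank test."""
    diffs = [x - y for x, y in zip(a, b) if x != y]  # zero differences are discarded
    n = len(diffs)  # n == 0 would raise ZeroDivisionError below
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    std = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean) / std


# Hypothetical paired indices: model_b is always larger than model_a.
model_a = [0.0] * 10
model_b = [float(v) for v in range(1, 11)]
z = wilcoxon_z(model_a, model_b)
print(abs(z) > 1.96)  # True: the difference would be judged significant
```

The sign of Z only reflects which model is listed first; the decision rule uses its absolute value.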
4. Results
4.1. Building flood models and producing flood susceptibility maps
In this subsection, three different CNN architectures were constructed for classification and feature extraction in flood susceptibility analysis. To this end, training and validation sets in three different dimensional forms were first prepared for subsequent CNN training and validation, as described in Section 3.3. Then, three CNN structures of different dimensions were constructed, and each training set was fed to the corresponding network for a certain number of iterations until the training process converged. Finally, the trained CNN models were used for classification and feature extraction. It should be noted that the window size is a key parameter for constructing the three-dimensional data form. Therefore, an analysis of the window size was performed, as shown in Fig. 10. It can be observed that the 3D-CNN model with a window size of 5 achieves the highest AUC value. In addition, during the construction of the CNN structures, the related hyperparameters were determined through a fine-tuning step using the validation set, with reference to previous studies (Audebert et al., 2019; Hu et al., 2014; Maggiori et al., 2017; Shin et al., 2016). Table 3 reports the hyperparameter settings for the different CNN architectures. The aforementioned CNN structures were built in Python using the Keras framework (footnote 3).
Fig. 10. Analysis of window size for three-dimensional data.
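The role of the window size can be illustrated with a small sketch that cuts a square window around one grid cell from a stack of factor layers, yielding a three-dimensional sample (rows × columns × factors). This is only one plausible construction and not necessarily the procedure of Section 3.3: the function name, the nested-list data layout and the edge handling (clamping to the nearest valid cell) are all assumptions, and the window size is assumed to be odd.

```python
def extract_window(layers, row, col, size=5):
    """Return a size x size x n_factors patch centered on (row, col).

    `layers` is a list of 2D grids, one per factor. Cells outside the grid
    are filled by clamping to the nearest edge cell (an assumption).
    """
    half = size // 2
    n_rows, n_cols = len(layers[0]), len(layers[0][0])
    patch = []
    for dr in range(-half, half + 1):
        r = min(max(row + dr, 0), n_rows - 1)
        patch_row = []
        for dc in range(-half, half + 1):
            c = min(max(col + dc, 0), n_cols - 1)
            patch_row.append([layer[r][c] for layer in layers])
        patch.append(patch_row)
    return patch


# Two hypothetical 4 x 4 factor layers: a cell index grid and a constant grid.
grid_a = [[r * 4 + c for c in range(4)] for r in range(4)]
grid_b = [[1] * 4 for _ in range(4)]
patch = extract_window([grid_a, grid_b], row=0, col=0, size=3)
print(len(patch), len(patch[0]), len(patch[0][0]))  # 3 3 2
```

A larger window brings in more neighbouring terrain but also more cells unrelated to the centre cell, which is one reason why an intermediate window size can perform best.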
Footnote 3: https://keras.io
Table 3 Hyperparameter settings of the CNN architectures.

| CNNs | Convolutional kernel size | Max pooling kernel size | Number of epochs |
| --- | --- | --- | --- |
| 1D-CNN | 3 × 1 | 2 × 1 | 600 |
| 2D-CNN | 3 × 3 | 2 × 2 | 400 |
| 3D-CNN | 3 × 3 × 3 | 2 × 2 × 2 | 150 |

| Activation function | Loss function | Optimizer | Learning rate |
| --- | --- | --- | --- |
| ReLU | Categorical cross-entropy | AdaGrad | 0.0015 |
After the proposed CNN architectures were constructed and trained, the three CNN structures were applied to classification and feature extraction. For classification, the 1D-CNN, 2D-CNN and 3D-CNN models were used to calculate the flood susceptibility index of each pixel. For feature extraction, the validity of the extracted features was evaluated using the similarity index, as shown in Table 4. It can be observed that the features extracted by the 1D-CNN, 2D-CNN and 3D-CNN architectures all have higher similarity than the original features, which indicates that the extracted features are more effective and easier to classify. The learned features were then regarded as new representation vectors and sent to the SVM model for classification, and the susceptibility indices were calculated by the hybrid methods of 1D-CNN-SVM, 2D-CNN-SVM and 3D-CNN-SVM. After the susceptibility index of each grid cell in the study area was obtained by the FSM models, these indices were automatically divided into five intervals (very low, low, moderate, high and very high) using the natural breaks (Jenks) method (Jenks, 1967), which has been widely used in the FSM process (Chapi et al., 2017; Dano et al., 2019; Shafizadeh-Moghadam et al., 2018). The SVM classifier with an RBF kernel is a robust method that has achieved good performance in FSM (Choubin et al., 2019; Tehrany et al., 2015b; Zhao et al., 2019). For comparison, an SVM classifier with an RBF kernel was therefore used for FSM to demonstrate the effectiveness of the CNN-based feature extractor.
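The natural breaks classification can be sketched as an optimal one-dimensional partition that minimizes the within-class sum of squared deviations (the Fisher-Jenks formulation). The dynamic-programming code below is an illustrative implementation, not the software actually used for the maps; the function names and the example indices are hypothetical, and the O(k·n²) cost makes it suitable only for small samples rather than full rasters.

```python
LABELS = ("very low", "low", "moderate", "high", "very high")


def jenks_breaks(values, n_classes=5):
    """Return the upper bound of each class of the optimal partition."""
    data = sorted(values)
    n = len(data)
    if n < n_classes:
        raise ValueError("need at least as many values as classes")
    # Prefix sums give the squared deviation of any slice in O(1).
    s1, s2 = [0.0] * (n + 1), [0.0] * (n + 1)
    for i, v in enumerate(data):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def ssd(i, j):
        """Sum of squared deviations from the mean of data[i:j]."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: minimal cost of splitting data[:j] into k classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j], split[k][j] = c, i
    # Walk back through the stored split points to recover class bounds.
    bounds, j = [], n
    for k in range(n_classes, 0, -1):
        bounds.append(data[j - 1])
        j = split[k][j]
    return bounds[::-1]


def classify(index, upper_bounds, labels=LABELS):
    """Map a susceptibility index to the first class whose bound covers it."""
    for bound, label in zip(upper_bounds, labels):
        if index <= bound:
            return label
    return labels[-1]


# Hypothetical susceptibility indices forming five well-separated groups.
indices = [0.01, 0.02, 0.03, 0.2, 0.21, 0.5, 0.52, 0.7, 0.72, 0.95, 0.97]
breaks = jenks_breaks(indices)
print(breaks)
print(classify(0.6, breaks))
```

Because the breaks depend on the distribution of the indices, each model receives its own class boundaries, which should be kept in mind when the class frequencies of different models are compared.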
Table 4 Similarity of original features and extracted features.

| Features | Similarity (flood class) | Similarity (non-flood class) |
| --- | --- | --- |
| Original 1D features | 0.723 | 0.508 |
| Extracted features | 0.934 | 0.894 |
| Original 2D features | 0.487 | 0.275 |
| Extracted features | 0.933 | 0.839 |
| Original 3D features | 0.705 | 0.494 |
| Extracted features | 0.911 | 0.772 |
The flood susceptibility maps produced by the different CNN-based methods and SVM are shown in Fig. 11. It can be seen that the very high susceptibility zones have similar distributions and that most floods are located in very high susceptibility areas. To analyze the susceptibility maps quantitatively, the frequency of flood occurrence in each susceptibility zone was calculated, and the results are shown in Table 5. Most floods fall in the high and very high susceptibility areas and very few floods occur in the very low susceptibility areas, which indicates a relatively high consistency between historical flood events and susceptibility zones for all the methods. Furthermore, more than 80% of the historical flood events are located in the very high susceptibility area, which supports the rationality of the flood susceptibility maps. Meanwhile, all the CNN-based methods produced more reliable susceptibility maps, because their combined frequency of floods in the high and very high areas was higher than that of SVM.
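The frequency analysis amounts to counting how many historical flood events fall in each susceptibility class. The sketch below shows the calculation; the per-class counts are not reported and are inferred here from the 1D-CNN percentages in Table 5 under the assumption that all 108 flood events were used, so they should be read as illustrative.

```python
from collections import Counter

CLASSES = ("very low", "low", "moderate", "high", "very high")


def flood_frequency(flood_classes):
    """Percentage of flood events located in each susceptibility class."""
    counts = Counter(flood_classes)
    total = len(flood_classes)
    return {name: 100.0 * counts[name] / total for name in CLASSES}


# Inferred (not reported) class of each of the 108 flood events for 1D-CNN.
events = ["moderate"] * 3 + ["high"] * 17 + ["very high"] * 88
freq = flood_frequency(events)
for name in CLASSES:
    print(f"{name}: {freq[name]:.2f}%")
print(f"high + very high: {freq['high'] + freq['very high']:.2f}%")
```

The combined share of the two highest classes is the quantity used to compare the reliability of the maps across methods.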
Fig. 11. Flood susceptibility maps of different classifiers. (a) 1D-CNN; (b) 2D-CNN; (c) 3D-CNN; (d) 1D-CNN-SVM; (e) 2D-CNN-SVM; (f) 3D-CNN-SVM and (g) SVM.
Table 5 Frequency analysis of floods on the susceptibility maps.

| Method | Very low | Low | Moderate | High | Very high |
| --- | --- | --- | --- | --- | --- |
| 1D-CNN | 0% | 0% | 2.78% | 15.74% | 81.48% |
| 2D-CNN | 0.93% | 0% | 2.78% | 12.96% | 83.33% |
| 3D-CNN | 0% | 0.93% | 1.85% | 12.04% | 85.19% |
| 1D-CNN-SVM | 0% | 0.93% | 4.63% | 11.11% | 83.33% |
| 2D-CNN-SVM | 0% | 0.93% | 6.48% | 8.33% | 84.26% |
| 3D-CNN-SVM | 0.93% | 1.85% | 1.85% | 11.11% | 84.26% |
| SVM | 0% | 4.63% | 5.56% | 6.48% | 83.33% |

4.2. Model validation and comparison
To evaluate the performance of all the methods, the statistical criteria of κ and OA were used, as shown in Table 6. All the CNN-based methods achieved better performance than SVM in terms of both OA and κ. For example, the 1D-CNN-SVM, 2D-CNN-SVM and 3D-CNN-SVM methods were 1.57–4.69 percentage points and 0.0312–0.0937 higher than SVM in terms of OA and κ, respectively. Moreover, the 2D-CNN-SVM method achieved the highest OA and κ values among all the methods. Therefore, the performance of SVM can be further improved by using the features extracted by CNN.
Table 6 Prediction accuracies of different CNN-based methods and SVM.

| Method | OA | κ |
| --- | --- | --- |
| 1D-CNN | 84.38% | 0.6875 |
| 2D-CNN | 85.94% | 0.7188 |
| 3D-CNN | 84.38% | 0.6875 |
| 1D-CNN-SVM | 85.94% | 0.7188 |
| 2D-CNN-SVM | 87.50% | 0.7500 |
| 3D-CNN-SVM | 84.38% | 0.6875 |
| SVM | 82.81% | 0.6563 |
Fig. 12 shows the ROC curves of all the methods using the validation set. The results show that all the CNN-based methods had better prediction performance than SVM in terms of AUC. Specifically, the highest AUC value of 0.937 was obtained by the 2D-CNN method, followed by 2D-CNN-SVM (0.934), 3D-CNN (0.928), 3D-CNN-SVM (0.922), 1D-CNN (0.905), 1D-CNN-SVM (0.904) and SVM (0.883). Meanwhile, when CNN was used for feature extraction in the classification process, the AUC value of SVM increased by 0.021–0.051. Furthermore, the Wilcoxon signed-rank test was used to verify the statistical difference between the proposed CNN-based methods and the SVM classifier. When the p value is smaller than 0.05 and the Z value lies beyond ±1.96, the difference between flood models is significant. The Z values and significance levels of the different flood models are shown in Table 7. It can be seen that all the compared models differ significantly from SVM, because their Z values and p values satisfy the significance conditions mentioned in Section 3.5.2.
Fig. 12. ROC curves for all the methods using the validation set.
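The AUC values reported for the ROC curves can be interpreted as the probability that a randomly chosen flood cell receives a higher susceptibility index than a randomly chosen non-flood cell. The sketch below computes the AUC in this rank-based form, which is equivalent to the area under the empirical ROC curve; it is an illustration with hypothetical labels and indices, not the evaluation code behind Fig. 12.

```python
def auc_score(labels, scores):
    """AUC as the share of (flood, non-flood) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # flood cell ranked above non-flood cell
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical validation labels (1 = flood) and susceptibility indices.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.3, 0.1]
print(auc_score(labels, scores))  # 0.75
```

Unlike OA and κ, this measure uses the susceptibility indices themselves rather than thresholded labels, so two models with identical confusion matrices can still differ in AUC.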
Table 7 Wilcoxon signed-rank test between the proposed methods and SVM.

| Comparative pairs | Z value | p value | Significant |
| --- | --- | --- | --- |
| SVM vs. 1D-CNN | -8.575 | < 0.05 | Yes |
| SVM vs. 2D-CNN | -2.925 | < 0.05 | Yes |
| SVM vs. 3D-CNN | -11.145 | < 0.05 | Yes |
| SVM vs. 1D-CNN-SVM | -14.657 | < 0.05 | Yes |
| SVM vs. 2D-CNN-SVM | -14.469 | < 0.05 | Yes |
| SVM vs. 3D-CNN-SVM | -11.145 | < 0.05 | Yes |
5. Discussion
Applying an appropriate FSM technique is a key prerequisite for preventing and reducing flood damage (Hong et al., 2018a; Wang et al., 2019b). Therefore, it is necessary to explore the possibility of applying newly developed techniques in FSM. In this paper, CNN, a representative deep learning technique, is applied to regional flood susceptibility analysis from a different perspective in Shangyou County, China.
As a popular and powerful technique, CNN has shown great potential in classification and prediction. CNN architectures of different dimensions have been designed to solve various tasks with different data modalities (LeCun et al., 2015). For example, 1D-CNN has been widely used for signal processing and sentence classification, 2D-CNN has been mainly used for image processing (Golik et al., 2015; Hu et al., 2014) and 3D-CNN has been developed for video-related problems (Ji et al., 2012; Tran et al., 2015). With regard to flood susceptibility analysis, each sample in the study area is usually composed of a vector of factors in the form of one-dimensional data (Chapi et al., 2017; Hong et al., 2018a; Khosravi et al., 2019; Zhao et al., 2019), and all the samples are input into the ML method for classification. To fully explore the prediction capability of CNN in flood susceptibility assessment, data forms of different dimensions were proposed to fit the corresponding CNN structures. The one-dimensional data form contains all the information about the flood triggering factors, and the 1D-CNN architecture can take advantage of the local relationships between these factors. Various flood triggering factors related to the morphological, geological and hydrological conditions play a crucial role in constructing flood models (Mahmoud and Gan, 2018b; Rijal et al., 2018). Because CNN has demonstrated promising results in image processing (Anthimopoulos et al., 2016; Krizhevsky et al., 2012; LeCun et al., 1998), a two-dimensional data form is proposed to construct the CNN and analyze flood susceptibility from an “image” perspective. This data form can be regarded as a two-dimensional extension of the feature vector that considers the spatial information of the triggering factors. The three-dimensional data form contains both the attribute information of the factors and the local terrain spatial information between pixels in a specific window.
Producing reliable flood susceptibility maps is important for decision-makers to assess and prevent flood hazards (Termeh et al., 2018; Wang et al., 2019c; Xia et al., 2017). Frequency analysis of flood occurrences is a useful way to assess flood susceptibility maps. In Table 5, most floods occurred in the high and very high susceptibility areas and few floods occurred in the very low susceptibility areas, which is consistent with previous studies (Bui et al., 2019b; Chapi et al., 2017; Khosravi et al., 2018). Moreover, in the high and very high susceptibility areas combined, the CNN-based methods achieved a higher flood frequency (92.59–97.22%) than SVM (89.81%), which indicates that the CNN-based methods can produce more accurate and reliable susceptibility maps. Furthermore, the experimental results in Table 6 and Fig. 12 demonstrated that all the proposed CNNs obtained better results than SVM in terms of OA, κ and AUC. SVM learns the decision plane directly from the raw data, while CNN can transform the raw data into useful representations that are important for model discrimination. After the convolution and pooling operations, important parts of the input data are enhanced and irrelevant information is reduced. In addition, the Wilcoxon signed-rank test in Table 7 demonstrated that the statistical difference between the CNN-based methods and SVM is significant. The improvement in prediction accuracy brought by the CNN-based methods is therefore instructive for decision-makers who consider adopting a novel CNN technique to obtain more accurate flood susceptibility maps. In particular, the reasons why the 2D-CNN architecture obtained the highest accuracy, whether used directly as a classifier or as a feature extractor in the FSM process, can be summarized as follows. First, as described in Section 3, compared to the one-dimensional data form, the three-dimensional data form contains not only attribute information but also local terrain spatial information. However, the three-dimensional data form may contain redundant information that hinders the classifier from distinguishing the true classification labels. Second, the two-dimensional data form not only makes the construction of the CNN network easier, but also better displays the attribute information in the original data.
It should be noted in Table 6 that some methods had the same OA and κ values. For example, 1D-CNN, 3D-CNN and 3D-CNN-SVM obtained the same OA and κ values of 84.38% and 0.6875, and 2D-CNN and 1D-CNN-SVM obtained the same OA and κ values of 85.94% and 0.7188, respectively, owing to the very limited number of validation samples. In addition, the criteria of OA and κ, which mainly consider the final classification label, may not accurately reflect the differences between the proposed FSM models, whereas the ROC curve is sensitive to the flood susceptibility indices themselves. Therefore, the ROC curve and AUC value can characterize the assessment results of the FSM methods more precisely, which has been confirmed by other studies (Liu et al., 2018; Mukhametzyanov and Pamucar, 2018; Pamučar et al., 2018a; Zhao et al., 2019).
6. Conclusions
This paper studies the application of the CNN technique in FSM from two different perspectives, using Shangyou County, China, as a case study. First, a spatial database containing 13 flood triggering factors and 108 historical flood events was prepared to construct the proposed methods. The developed CNN architectures were then used to construct flood susceptibility models in two different ways, namely classification and feature extraction. Next, flood susceptibility maps were obtained using the CNN-based methods and compared with that of SVM. Finally, several objective criteria (OA, κ, ROC and AUC) were used to compare and verify all the FSM methods. The main conclusions of this paper are summarized as follows:
(1) The proposed 1D-CNN, 2D-CNN and 3D-CNN classifiers achieved better prediction performance than SVM in terms of AUC, which demonstrates the superiority of the CNN frameworks.
(2) By using CNN for feature extraction, the prediction capability of SVM was effectively improved. In particular, the 2D-CNN-SVM method achieved the highest OA and κ values among all the methods and the highest AUC among the hybrid methods.
(3) The proposed CNN-based methods are expected to be used for flood disaster assessment and management. Meanwhile, the application prospects of CNN can inspire more effective FSM techniques.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (61271408, 41602362), the International Partnership Program of Chinese Academy of Sciences (115242KYSB20170022) and the China Scholarship Council (201906860029). The authors would also like to thank the handling editor and the two anonymous reviewers for their valuable comments and suggestions, which significantly improved the quality of this paper.
References
Anthimopoulos, M., Christodoulidis, S., Ebner, L., Christe, A., Mougiakakou, S., 2016. Lung pattern classification for interstitial lung diseases using a deep convolutional neural network. IEEE transactions on medical imaging, 35, 1207-1216.
Arabameri, A., Rezaei, K., Cerdà, A., Conoscenti, C., Kalantari, Z., 2019. A comparison of statistical methods and multi-criteria decision making to map flood hazard susceptibility in Northern Iran. Science of The Total Environment, 660, 443-458.
Arnell, N.W., Gosling, S.N., 2016. The impacts of climate change on river flow regimes at the global scale. Climatic Change, 134, 387-401.
Audebert, N., Saux, B., Lefèvre, S., 2019. Deep Learning for Classification of Hyperspectral Data: A Comparative Review. arXiv preprint arXiv:1904.10674.
Bathrellos, G., Skilodimou, H., Soukis, K., Koskeridou, E., 2018. Temporal and Spatial Analysis of Flood Occurrences in the Drainage Basin of Pinios River (Thessaly, Central Greece). Land, 7, 106.
Bathrellos, G.D., Skilodimou, H.D., Chousianitis, K., Youssef, A.M., Pradhan, B., 2017. Suitability estimation for urban development using multi-hazard assessment map. Science of the Total Environment, 575, 119-134.
Bergen, K.J., Johnson, P.A., Maarten, V., Beroza, G.C., 2019. Machine learning for data-driven discovery in solid Earth geoscience. Science, 363, eaau0323.
Beven, K.J., Kirkby, M.J., 1979. A physically based, variable contributing area model of basin hydrology/Un modèle à base physique de zone d'appel variable de l'hydrologie du bassin versant. Hydrological Sciences Journal, 24, 43-69.
Bout, B., Jetten, V., 2018. The validity of flow approximations when simulating catchment-integrated flash floods. Journal of Hydrology, 556, 674-688.
Bradley, A.P., 1997. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern recognition, 30, 1145-1159.
Bui, D.T. et al., 2019a. A novel deep learning neural network approach for predicting flash flood susceptibility: A case study at a high frequency tropical storm area. Science of The Total Environment, 134413.
Bui, D.T., Tsangaratos, P., Ngo, P.-T.T., Pham, T.D., Pham, B.T., 2019b. Flash flood susceptibility modeling using an optimized fuzzy rule based feature selection technique and tree based ensemble methods. Science of The Total Environment.
Canziani, A., Paszke, A., Culurciello, E., 2016. An analysis of deep neural network models for practical applications. arXiv preprint arXiv:1605.07678.
Chaabani, C., Chini, M., Abdelfattah, R., Hostache, R., Chokmani, K., 2018. Flood Mapping in a Complex Environment Using Bistatic TanDEM-X/TerraSAR-X InSAR Coherence. Remote Sensing, 10, 1873.
Chapi, K. et al., 2017. A novel hybrid artificial intelligence approach for flood susceptibility assessment. Environmental Modelling & Software, 95, 229-245.
Chen, W. et al., 2019. Flood susceptibility modelling using novel hybrid approach of Reduced-error pruning trees with Bagging and Random subspace ensembles. Journal of Hydrology.
Chen, Y., Jiang, H., Li, C., Jia, X., Ghamisi, P., 2016. Deep feature extraction and classification of hyperspectral images based on convolutional neural networks. IEEE Transactions on Geoscience and Remote Sensing, 54, 6232-6251.
Chen, Y., Lin, Z., Zhao, X., Wang, G., Gu, Y., 2014. Deep learning-based classification of hyperspectral data. IEEE Journal of Selected topics in applied earth observations and remote sensing, 7, 2094-2107.
Choi, K., Fazekas, G., Sandler, M., Cho, K., 2017. Convolutional recurrent neural networks for music classification, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, pp. 2392-2396.
Choubin, B. et al., 2019. An Ensemble prediction of flood susceptibility using multivariate discriminant analysis, classification and regression trees, and support vector machines. Science of the Total Environment, 651, 2087-2096.
Cortes, C., Vapnik, V., 1995. Support-vector networks. Machine learning, 20, 273-297.
Dahl, G.E., Sainath, T.N., Hinton, G.E., 2013. Improving deep neural networks for LVCSR using rectified linear units and dropout, 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, pp. 8609-8613.
Dano, U.L. et al., 2019. Flood Susceptibility Mapping Using GIS-Based Analytic Network Process: A Case Study of Perlis, Malaysia. Water, 11, 615.
Das, S., 2019a. Geospatial mapping of flood susceptibility and hydro-geomorphic response to the floods in Ulhas basin, India. Remote Sensing Applications: Society and Environment, 14, 60-74.
Das, S., 2019b. Geospatial mapping of flood susceptibility and hydro-geomorphic response to the floods in Ulhas basin, India. Remote Sensing Applications: Society and Environment.
Duchi, J., Hazan, E., Singer, Y., 2011. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12, 2121-2159.
Fawcett, T., 2006. An introduction to ROC analysis. Pattern recognition letters, 27, 861-874.
Gebrehiwot, A., Hashemi-Beni, L., Thompson, G., Kordjamshidi, P., Langan, T.E., 2019. Deep Convolutional Neural Network for Flood Extent Mapping Using Unmanned Aerial Vehicles Data. Sensors, 19, 1486.
Ghorbanzadeh, O. et al., 2019. Evaluation of different machine learning methods and deep-learning convolutional neural networks for landslide detection. Remote Sensing, 11, 196.
Gigović, L., Pamučar, D., Bajić, Z., Drobnjak, S., 2017. Application of GIS-Interval Rough AHP Methodology for Flood Hazard Mapping in Urban Areas. Water, 9, 1-26.
Gigović, L., Pamučar, D., Božanić, D., Ljubojević, S., 2017. Application of the GIS-DANP-MABAC multi-criteria model for selecting the location of wind farms: A case study of Vojvodina, Serbia. Renewable Energy, 103, 501-521.
Goh, K.-I. et al., 2007. The human disease network. Proceedings of the National Academy of Sciences, 104, 8685-8690.
Golik, P., Tüske, Z., Schlüter, R., Ney, H., 2015. Convolutional neural networks for acoustic modeling of raw time signal in LVCSR, Sixteenth annual conference of the international speech communication association.
González-Arqueros, M.L., Mendoza, M.E., Bocco, G., Castillo, B.S., 2018. Flood susceptibility in rural settlements in remote zones: The case of a mountainous basin in the Sierra-Costa region of Michoacán, Mexico. Journal of environmental management, 223, 685-693.
Goodfellow, I., Bengio, Y., Courville, A., 2016. Deep learning. MIT press.
Graves, A., Mohamed, A.-r., Hinton, G., 2013. Speech recognition with deep recurrent neural networks, 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, pp. 6645-6649.
Hinton, G. et al., 2012. Deep neural networks for acoustic modeling in speech recognition. IEEE Signal processing magazine, 29.
Hong, H. et al., 2018a. Flood susceptibility assessment in Hengfeng area coupling adaptive neuro-fuzzy inference system with genetic algorithm and differential evolution. Science of The Total Environment, 621, 1124-1141.
Hong, H. et al., 2018b. Application of fuzzy weight of evidence and data mining techniques in construction of flood susceptibility map of Poyang County, China. Science of the Total Environment, 625, 575-588.
Hu, B., Lu, Z., Li, H., Chen, Q., 2014. Convolutional neural network architectures for matching natural language sentences, Advances in neural information processing systems, pp. 2042-2050.
Hu, W., Huang, Y., Wei, L., Zhang, F., Li, H., 2015. Deep convolutional neural networks for hyperspectral image classification. Journal of Sensors, 2015.
James, W., Stein, C., 1992. Estimation with quadratic loss, Breakthroughs in statistics. Springer, pp. 443-460.
Jenks, G.F., 1967. The data model concept in statistical mapping. International yearbook of cartography, 7, 186-190.
Ji, S., Xu, W., Yang, M., Yu, K., 2012. 3D convolutional neural networks for human action recognition. IEEE transactions on pattern analysis and machine intelligence, 35, 221-231.
Kazakis, N., Kougias, I., Patsialis, T., 2015. Assessment of flood hazard areas at a regional scale using an index-based approach and Analytical Hierarchy Process: Application in Rhodope-Evros region, Greece. Science of the Total Environment, 538, 555-563.
Khosravi, K. et al., 2018. A comparative assessment of decision trees algorithms for flash flood susceptibility modeling at Haraz watershed, northern Iran. Science of the Total Environment, 627, 744-755.
Khosravi, K. et al., 2019. A Comparative Assessment of Flood Susceptibility Modeling Using Multi-Criteria Decision-Making Analysis and Machine Learning Methods. Journal of Hydrology.
Kia, M.B., Pirasteh, S., Pradhan, B., Wan, N.A.S., Moradi, A., 2012. An artificial neural network model for flood simulation using GIS: Johor River Basin, Malaysia. Environmental Earth Sciences, 67, 251-264.
Krizhevsky, A., Sutskever, I., Hinton, G.E., 2012. Imagenet classification with deep convolutional neural networks, Advances in neural information processing systems, pp. 1097-1105.
LeCun, Y., Bengio, Y., Hinton, G., 2015. Deep learning. Nature, 521, 436.
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P., 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86, 2278-2324.
Leng, J., Li, T., Bai, G., Dong, Q., Dong, H., 2016. Cube-CNN-SVM: a novel hyperspectral image classification method, 2016 IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI). IEEE, pp. 1027-1034.
Li, S. et al., 2019. Deep Learning for Hyperspectral Image Classification: An Overview. IEEE Transactions on Geoscience and Remote Sensing.
Liang, D., Xu, Z., Liu, D., 2017. Three-way decisions with intuitionistic fuzzy decision-theoretic rough sets based on point operators. Information Sciences, 375, 183-201.
Liu, F., Aiwu, G., Lukovac, V., Vukic, M., 2018. A multicriteria model for the selection of the transport service provider: A single valued neutrosophic DEMATEL multicriteria model. Decision Making: Applications in Management and Engineering, 1, 121-130.
Maggiori, E., Tarabalka, Y., Charpiat, G., Alliez, P., 2017. Convolutional neural networks for large-scale remote-sensing image classification. IEEE Transactions on Geoscience and Remote Sensing, 55, 645-657.
Mahmoud, S.H., Gan, T.Y., 2018. Multi-criteria approach to develop flood susceptibility maps in arid regions of Middle East. Journal of Cleaner Production, 196, 216-229.
Mallat, S., 2016. Understanding deep convolutional networks. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374, 20150203.
Marmanis, D. et al., 2016. Semantic segmentation of aerial images with an ensemble of CNNs. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 3, 473.
Mojaddadi, H., Pradhan, B., Nampak, H., Ahmad, N., Ghazali, A.H.b., 2017. Ensemble machine-learning-based geospatial approach for flood risk assessment using multi-sensor remote-sensing data and GIS. Geomatics, Natural Hazards and Risk, 8, 1080-1102.
Moore, I.D., Gessler, P., Nielsen, G., Peterson, G., 1993. Soil attribute prediction using terrain analysis. Soil Science Society of America Journal, 57, 443-452.
Moore, I.D., Wilson, J.P., 1992. Length-slope factors for the Revised Universal Soil Loss Equation: Simplified method of estimation. Journal of soil and water conservation, 47, 423-428.
Mou, L., Ghamisi, P., Zhu, X.X., 2017. Deep recurrent neural networks for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing, 55, 3639-3655.
Mukhametzyanov, I., Pamucar, D., 2018. A sensitivity analysis in MCDM problems: A statistical approach. Decis. Mak. Appl. Manag. Eng, 1, 1-20.
Niu, X.-X., Suen, C.Y., 2012. A novel hybrid CNN–SVM classifier for recognizing handwritten digits. Pattern Recognition, 45, 1318-1325.
Nourani, V., Alami, M.T., Vousoughi, F.D., 2015. Wavelet-entropy data pre-processing approach for ANN-based groundwater level modeling. Journal of Hydrology, 524, 255-269.
Pamučar, D., Mihajlović, M., Obradović, R., Atanasković, P., 2017. Novel approach to group multi-criteria decision making based on interval rough numbers: Hybrid DEMATEL-ANP-MAIRCA model. Expert Systems with Applications, 88, 58-80.
Pamučar, D., Petrović, I., Ćirović, G., 2018a. Modification of the Best–Worst and MABAC methods: A novel approach based on interval-valued fuzzy-rough numbers. Expert Systems with Applications, 91, 89-106.
Pamučar, D., Stević, Ž., Sremac, S., 2018b. A New Model for Determining Weight Coefficients of Criteria in MCDM Models: Full Consistency Method (FUCOM). Symmetry, 10, 393.
Paoletti, M., Haut, J., Plaza, J., Plaza, A., 2018. A new deep convolutional neural network for fast hyperspectral image classification. ISPRS journal of photogrammetry and remote sensing, 145, 120-147.
Pontius Jr, R.G., Millones, M., 2011. Death to Kappa: birth of quantity disagreement and allocation disagreement for accuracy assessment. International Journal of Remote Sensing, 32, 4407-4429.
Popovic, M., Kuzmanović, M., Savić, G., 2018. A comparative empirical study of Analytic Hierarchy Process and Conjoint analysis: Literature review. Decision Making: Applications in Management and Engineering, 1, 153-163.
Qazi, K.I., Lam, H.K., Xiao, B., Ouyang, G., Yin, X., 2016. Classification of epilepsy using computational intelligence techniques. CAAI Transactions on Intelligence Technology, 1, 137-149.
Rahmati, O., Pourghasemi, H.R., Zeinivand, H., 2016. Flood susceptibility mapping using frequency ratio and weights-of-evidence models in the Golastan Province, Iran. Geocarto International, 31, 42-70.
Rahmati, O., Zeinivand, H., Besharat, M., 2015. Flood hazard zoning in Yasooj region, Iran, using GIS and multi-criteria decision analysis. Geomatics Natural Hazards & Risk.
Ranjbar, S., Hooshyar, M., Singh, A., Wang, D., 2018. Quantifying climatic controls on river network branching structure across scales. Water Resources Research, 54, 7347-7360.
Rizeei, H.M., Pradhan, B., Saharkhiz, M.A., 2018. An integrated fluvial and flash pluvial model using
2D high-resolution sub-grid and particle swarm optimization-based random forest approaches in GIS. Complex & Intelligent Systems, 1-20. Salamon, J., Bello, J.P., 2017. Deep convolutional neural networks and data augmentation for environmental sound classification. IEEE Signal Processing Letters, 24, 279-283. Santos, P.P., Reis, E., Pereira, S., Santos, M., 2019a. A flood susceptibility model at the national scale based on multicriteria analysis. Science of The Total Environment, 667, 325-337. Santos, P.P., Reis, E., Pereira, S., Santos, M., 2019b. A national scale flood susceptibility model based on multicriteria analysis. Science of The Total Environment. Schmidhuber, J., 2015. Deep learning in neural networks: An overview. Neural networks, 61, 85-117. Shafizadeh-Moghadam, H., Valavi, R., Shahabi, H., Chapi, K., Shirzadi, A., 2018. Novel forecasting approaches using combination of machine learning and statistical models for flood susceptibility mapping. Journal of environmental management, 217, 1-11. Shin, H.-C. et al., 2016. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE transactions on medical imaging, 35, 1285-1298. Simard, P.Y., Steinkraus, D., Platt, J.C., 2003. Best practices for convolutional neural networks applied to visual document analysis, null. IEEE, pp. 958. Singh, V., 1997. The use of entropy in hydrology and water resources. Hydrological processes, 11, 587-626. Sowmya, K., John, C., Shrivasthava, N., 2015. Urban flood vulnerability zoning of Cochin City, southwest coast of India, using remote sensing and GIS. Natural Hazards, 75, 1271-1286. Sutskever, I., Vinyals, O., Le, Q.V., 2014. Sequence to sequence learning with neural networks, Advances in neural information processing systems, pp. 3104-3112. Tehrany, M.S., Jones, S., Shabani, F., 2019. Identifying the essential flood conditioning factors for flood prone area mapping using machine learning techniques. 
CATENA, 175, 174-192. Tehrany, M.S., Lee, M.J., Pradhan, B., Jebur, M.N., Lee, S., 2014a. Flood susceptibility mapping using integrated bivariate and multivariate statistical models. Environmental Earth Sciences, 72, 4001-4015. Tehrany, M.S., Pradhan, B., Jebur, M.N., 2013. Spatial prediction of flood susceptible areas using rule based decision tree (DT) and a novel ensemble bivariate and multivariate statistical models in GIS. Journal of Hydrology, 504, 69-79. Tehrany, M.S., Pradhan, B., Jebur, M.N., 2014b. Flood susceptibility mapping using a novel ensemble weights-of-evidence and support vector machine models in GIS. Journal of Hydrology, 512, 332-343. Tehrany, M.S., Pradhan, B., Jebur, M.N., 2015a. Flood susceptibility analysis and its verification using a novel ensemble support vector machine and frequency ratio method. Stochastic Environmental Research and Risk Assessment, 29, 1149-1165. Tehrany, M.S., Pradhan, B., Mansor, S., Ahmad, N., 2015b. Flood susceptibility assessment using GIS-based support vector machine model with different kernel types. Catena, 125, 91-101. Termeh, S.V.R., Kornejady, A., Pourghasemi, H.R., Keesstra, S., 2018. Flood susceptibility mapping using novel ensembles of adaptive neuro fuzzy inference system and metaheuristic algorithms. Science of the Total Environment, 615, 438-451. Tran, D., Bourdev, L., Fergus, R., Torresani, L., Paluri, M., 2015. Learning spatiotemporal features with 3d convolutional networks, Proceedings of the IEEE international conference on 43
851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891
computer vision, pp. 4489-4497. Vapnik, V.N., 1999. An overview of statistical learning theory. IEEE transactions on neural networks, 10, 988-999. Wang, H., Yang, B., Chen, W., 2016. Unknown Constrained Mechanisms Operation based on Dynamic Interactive Control. Caai Transactions on Intelligence Technology, 1. Wang, Y., Fang, Z., Hong, H., 2019a. Comparison of convolutional neural networks for landslide susceptibility mapping in Yanshan County, China. Science of The Total Environment. Wang, Y. et al., 2019b. A Hybrid GIS Multi-Criteria Decision-Making Method for Flood Susceptibility Mapping at Shangyou, China. Remote Sensing, 11, 62. Wang, Y. et al., 2019c. Flood susceptibility mapping in Dingnan County (China) using adaptive neuro-fuzzy inference system with biogeography based optimization and imperialistic competitive algorithm. Journal of environmental management, 247, 712-729. Wilcoxon, F., Katti, S., Wilcox, R.A., 1970. Critical values and probability levels for the Wilcoxon rank sum test and the Wilcoxon signed rank test. Selected tables in mathematical statistics, 1, 171-259. Xia, X., Liang, Q., Ming, X., Hou, J., 2017. An efficient and stable hydrodynamic model with novel source term discretization schemes for overland flow and flood simulations. Water Resources Research, 53, 3730-3759. Xu, X., Law, R., Chen, W., Tang, L., 2016. Forecasting tourism demand by extracting fuzzy Takagi– Sugeno rules from trained SVMs. Caai Transactions on Intelligence Technology, 1, 30-42. Xue, D.-X., Zhang, R., Feng, H., Wang, Y.-L., 2016. CNN-SVM for microvascular morphological type recognition with data augmentation. Journal of medical and biological engineering, 36, 755-764. Youssef, A.M., Pradhan, B., Jebur, M.N., El-Harbi, H.M., 2015a. Landslide susceptibility mapping using ensemble bivariate and multivariate statistical models in Fayfa area, Saudi Arabia. Environmental Earth Sciences, 73, 3745-3761. Youssef, A.M., Pradhan, B., Pourghasemi, H.R., Abdullahi, S., 2015b. 
Landslide susceptibility assessment at Wadi Jawrah Basin, Jizan region, Saudi Arabia using two bivariate models in GIS. Geosciences Journal, 19, 449-469. Yu, S., Jia, S., Xu, C., 2017. Convolutional neural networks for hyperspectral image classification. Neurocomputing, 219, 88-98. Zazo, S. et al., 2018. Flood Hazard Assessment Supported by Reduced Cost Aerial Precision Photogrammetry. Remote Sensing, 10, 1566. Zevenbergen, L.W., Thorne, C.R., 1987. Quantitative analysis of land surface topography. Earth surface processes and landforms, 12, 47-56. Zhang, C. et al., 2018. An object-based convolutional neural network (OCNN) for urban land use classification. Remote sensing of environment, 216, 57-70. Zhao, G., Pang, B., Xu, Z., Peng, D., Xu, L., 2019. Assessment of urban flood susceptibility using semi-supervised machine learning model. Science of The Total Environment, 659, 940-949. Zhu, G.N., Hu, J., Qi, J., Gu, C.C., Peng, Y.H., 2015. An integrated AHP and VIKOR for design concept evaluation based on rough number. Advanced Engineering Informatics, 29, 408-418.
Abstract
Floods are among the most destructive natural disasters in the world and seriously threaten the safety of human life and property. In this paper, the popular convolutional neural network (CNN) is introduced to assess flood susceptibility in Shangyou County, China. The main contributions of this study are as follows. First, the CNN technique is used for flood susceptibility mapping (FSM) through two different frameworks: CNN classification and CNN feature extraction. Second, three data presentation methods are designed in the CNN architecture to fit the two proposed frameworks. To construct the proposed CNN-based methods, 13 flood triggering factors related to historical flood events in the study area were prepared. The performance of these CNN-based methods was evaluated using several objective criteria in comparison to the classic support vector machine (SVM) classifier. Experimental results demonstrate that all the CNN-based methods can help produce more reliable and practical flood susceptibility maps. For example, the proposed CNN-based classifiers achieved values of the area under the curve (AUC) that were 0.022–0.054 higher than that of SVM. In addition, in the classification process, CNN-based feature extraction improved the prediction capability of SVM by 0.021–0.051 in terms of AUC. Therefore, the proposed CNN frameworks can help mitigate and manage floods.
Keywords: Flood susceptibility mapping; convolutional neural network; classification; feature extraction; China.
Credit Author Statement
Yi Wang: Conceptualization; Formal analysis; Funding acquisition; Supervision; Roles/Writing - original draft
Zhice Fang: Data curation; Methodology; Validation; Visualization
Haoyuan Hong: Investigation; Resources; Writing - review & editing; Funding acquisition
Ling Peng: Funding acquisition; Writing - review & editing
Declaration of interests
☒ The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
☐ The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:
Highlights
CNNs are considered for dealing with the flood susceptibility mapping task.
Two different CNN frameworks of classification and feature extraction are presented.
Three data presentation forms are designed for the proposed CNN frameworks.
Reliable flood susceptibility maps can be obtained by using the proposed CNNs.
Prediction performance of SVM can be improved using the CNNs for feature extraction.