Flood susceptibility mapping using convolutional neural network frameworks


Journal Pre-proofs

Research papers

Flood susceptibility mapping using convolutional neural network frameworks

Yi Wang, Zhice Fang, Haoyuan Hong, Ling Peng

PII: S0022-1694(19)31217-X
DOI: https://doi.org/10.1016/j.jhydrol.2019.124482
Reference: HYDROL 124482

To appear in: Journal of Hydrology

Received Date: 24 September 2019
Revised Date: 2 December 2019
Accepted Date: 16 December 2019

Please cite this article as: Wang, Y., Fang, Z., Hong, H., Peng, L., Flood susceptibility mapping using convolutional neural network frameworks, Journal of Hydrology (2019), doi: https://doi.org/10.1016/j.jhydrol.2019.124482

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2019 Published by Elsevier B.V.

Yi Wang1,*, Zhice Fang1, Haoyuan Hong2,3,4,5,*, Ling Peng6

1 Institute of Geophysics and Geomatics, China University of Geosciences, Wuhan 430074, China

2 Department of Geography and Regional Research, University of Vienna, Universitätsstraße 7, 1010 Vienna, Austria

3 Key Laboratory of Virtual Geographic Environment (Nanjing Normal University), Ministry of Education, Nanjing, 210023, China

4 State Key Laboratory Cultivation Base of Geographical Environment Evolution (Jiangsu Province), Nanjing, 210023, China

5 Jiangsu Center for Collaborative Innovation in Geographic Information Resource Development and Application, Nanjing, Jiangsu 210023, China

6 China Institute of Geo-Environment Monitoring, Beijing 100081, China

*Correspondence Author: Yi Wang ([email protected]); Haoyuan Hong ([email protected])

1. Introduction

Floods are among the most common and catastrophic geo-hazards, yet they remain poorly understood (Tehrany et al., 2014a). The severe impact of floods on natural ecosystems and human activities has greatly affected economic and social sustainability (Rahmati et al., 2015). Therefore, it is essential to identify flood-prone zones to prevent or mitigate the adverse effects of flooding (Bathrellos et al., 2017; Zhu et al., 2015).

In the past few years, various statistical and machine learning (ML) methods have been successfully applied in flood susceptibility mapping (FSM) (Bui et al., 2019b; Mojaddadi et al., 2017). Statistical methods are mainly based on the assumption that historical flood events are closely related to flood predisposing factors. They include frequency ratio (Rahmati et al., 2016; Tehrany et al., 2015a), logistic regression (Youssef et al., 2015a), weights of evidence (Tehrany et al., 2014b; Youssef et al., 2015b), analytic hierarchy process (Kazakis et al., 2015) and multiple criteria decision methods (Liu et al., 2018; Santos et al., 2019a; Santos et al., 2019b; Wang et al., 2019b). Recently, ML techniques have been applied to FSM, such as artificial neural networks (Bui et al., 2019a; Kia et al., 2012), decision trees (Choubin et al., 2019; Khosravi et al., 2018; Tehrany et al., 2013), random forest (Rizeei et al., 2018) and support vector machine (SVM) (Choubin et al., 2019; Tehrany et al., 2014b; Tehrany et al., 2015a; Tehrany et al., 2015b).

In general, when applying an ML framework to a particular problem, feature engineering is a critical step that derives an appropriate feature representation from the raw data (e.g., pixel values of an image) prior to data modeling. Furthermore, the performance of ML algorithms depends to a large extent on the representation of the raw data (Goodfellow et al., 2016). In this sense, conventional ML methods cannot directly uncover instructive representations from raw data, nor can they obtain new insights from these representations to further improve predictive capability (LeCun et al., 2015). More recently, deep learning, as one of the most popular ML techniques, has received widespread attention and has obtained reliable results comparable or superior to those of conventional ML methods (Schmidhuber, 2015). This technique includes a broad class of methods for computer vision, signal processing and natural language processing with multiple network architectures (Graves et al., 2013; Hinton et al., 2012; Sutskever et al., 2014). Among the various deep learning methods, a convolutional neural network (CNN), whose architecture is inspired by biological visual perception, can identify representations with extreme variability through convolutional and pooling layers (Wang et al., 2019a). Gebrehiwot et al. (2019) successfully applied a CNN to identify flood extents using unmanned aerial vehicle data. Moreover, several feature engineering methods, such as correlation-based selection, information gain ratio and multi-collinearity analysis, have been used in flood susceptibility analysis (Bui et al., 2019b; Khosravi et al., 2019; Zhao et al., 2019). Therefore, since CNNs can dispense with tedious feature engineering steps and extract useful information directly from the original data, they can play an important role in FSM. Although CNNs are more accurate than state-of-the-art ML methods in many fields (Ghorbanzadeh et al., 2019; Salamon and Bello, 2017; Simard et al., 2003; Yu et al., 2017), few studies have applied this technique to FSM.

Choosing mapping units is a key step in flood susceptibility modelling. When ML methods are applied to FSM, grid cells (pixels) are commonly used as mapping units because they are easy to process (Bui et al., 2019a; Das, 2019a; Termeh et al., 2018). Essentially, FSM is a binary classification task that determines whether a pixel is a flood point and predicts the probability of flood occurrence. In the FSM process, each grid cell is treated as a feature vector that contains information about different triggering factors and can be processed by ML models. The entire study area can therefore be regarded as a multi-channel “image”, where each channel represents a flood triggering factor layer, which makes it possible to apply the CNN technique to flood susceptibility analysis. In this paper, the CNN framework is introduced to assess flood susceptibility in Shangyou County, China. CNNs can use different convolutional operations to extract various information from different data modalities. The three main contributions of this paper are as follows. First, to the best of our knowledge, the application of CNNs to FSM is still very rare, and this study explores its feasibility. Second, two CNN frameworks for FSM are presented. In the first framework, the CNN model is used directly as a classifier to assess flood susceptibility. In the second framework, the CNN is integrated with the SVM classifier in a hybrid manner: (1) the CNN model is used to extract powerful and useful representations of the flood triggering factors, and (2) the SVM classifier performs classification on the extracted features. Third, three data representation methods are designed for the CNN architectures to suit classification and feature extraction in the FSM process. To validate and compare the predictive capability of the proposed CNN-based methods against SVM, several objective criteria were used: overall accuracy (OA), the kappa coefficient (κ), the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC).

2. Study area and available data

2.1. Study area

Shangyou County is located in the southern part of Jiangxi Province, China, in the northern hilly district of Nanling. The study site covers an area of approximately 1543 km² between 25°42′N and 26°01′N and between 114°00′E and 114°40′E. The Shangyou district has a total population of approximately 3.22 million people, and the elevation ranges between 110 and 1901 m above sea level (Fig. 1). Shangyou County lies in the southern subtropical zone and belongs to the humid monsoon climate zone of the subtropical hilly region. In the study area, the climate is mild, rainfall is abundant, sunshine is sufficient, the four seasons are distinct, and the frost-free period is long. Specifically, the average annual rainfall during 1959–2014 ranged between 933.7 and 2147.6 mm, according to the Shangyou Meteorological Bureau. Rainfall differs markedly between spring and summer, and the annual rainy season in the Shangyou district lasts from April to August. The average annual sunshine time is 1708.3 h, which is an intermediate level for Jiangxi Province, and the average annual temperature is 18.6 ℃. January and July are the coldest and hottest months, with average temperatures of -2.7 ℃ and 38 ℃, respectively. Vegetation covers more than 95% of the study area and consists mainly of farmland, forests and grasslands. The study area is also rich in water resources: the annual average runoff of surface water in Shangyou County is 3.52 billion cubic meters, and the groundwater is mainly loose pore water, bedrock fissure water and underground hot water. In short, from a climate perspective, the study area experiences abundant precipitation and extreme weather events. As a result, the area is vulnerable to the climatic factors that cause flooding.

Fig. 1. Location of the study area.

2.2. Flood inventory mapping

A flood inventory map records the locations of inundated areas and provides detailed information about the characteristics of historical flood events (Santos et al., 2019a; Zazo et al., 2018). In this study, a flood inventory map with 108 historical flood events was provided by the Jiangxi Meteorological Bureau (http://www.weather.org.cn) and the Department of Civil Affairs of Jiangxi Province (http://www.jxmzw.gov.cn), as shown in Fig. 1. The locations of flood events were obtained through historical records, extensive field surveys and visual interpretation of Google Earth images. All the sampling points in Fig. 1 represent locations where floods occurred and were used to produce the training and validation sets: 76 locations (70%) were randomly selected for training, whereas the remaining 32 locations (30%) were used for validation. Moreover, the same numbers of non-flood locations (76 and 32) were randomly selected from the flood-free area to complete the training and validation sets.
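The sampling scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch: the location identifiers, the helper name `split_locations` and the fixed random seeds are assumptions, not details given in the paper.

```python
import random


def split_locations(locations, train_fraction=0.7, seed=0):
    """Randomly split a list of locations into training and validation subsets."""
    rng = random.Random(seed)  # fixed seed is an assumption, used for repeatability
    shuffled = list(locations)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers standing in for the 108 flood and 108 non-flood locations
flood_ids = [f"flood_{i}" for i in range(108)]
non_flood_ids = [f"non_flood_{i}" for i in range(108)]

flood_train, flood_val = split_locations(flood_ids, seed=1)
non_flood_train, non_flood_val = split_locations(non_flood_ids, seed=2)

# Label 1 marks a flood location and label 0 a non-flood location
training_set = [(p, 1) for p in flood_train] + [(p, 0) for p in non_flood_train]
validation_set = [(p, 1) for p in flood_val] + [(p, 0) for p in non_flood_val]
```

With 108 locations per class, a 70% fraction rounds to 76 training and 32 validation locations per class, which matches the counts reported in the text.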

2.3. Flood triggering factors

According to a literature review (Arabameri et al., 2019; Khosravi et al., 2019; Shafizadeh-Moghadam et al., 2018) and the characteristics of the study area, 13 flood triggering factors were selected for FSM: altitude, aspect, curvature, slope, stream power index (SPI), sediment transport index (STI), topographic wetness index (TWI), lithology, land use, normalized difference vegetation index (NDVI), soil, distance to rivers and rainfall.

Altitude is a frequently used flood triggering factor, and lower altitudes are usually accompanied by higher river discharge (Bout and Jetten, 2018; Sowmya et al., 2015; Wang et al., 2016). Aspect is the orientation of the maximum slope of the terrain surface and influences hydrologic conditions because different orientations receive different amounts of precipitation and solar radiation (Bathrellos et al., 2018; Xu et al., 2016). Curvature describes the degree of distortion of the slope surface and reflects the morphology of the topography in a given area (Mojaddadi et al., 2017). The curvature layer can be computed from digital elevation model (DEM) data using a fourth-order surface model (Zevenbergen and Thorne, 1987). Slope directly affects surface runoff and vertical percolation since water flows from higher to lower altitudes (Chaabani et al., 2018). SPI represents the performance of river sediment transport and fluvial channel erosion (Pamučar et al., 2017) and is calculated as follows (Moore and Wilson, 1992):

$$\mathrm{SPI} = A_s \tan\beta \qquad (1)$$

where $A_s$ is the area of the basin and $\beta$ is the slope gradient (in degrees). STI qualitatively explains the process of erosion and deposition (Tehrany et al., 2015a) and is calculated as follows (Moore et al., 1993):

$$\mathrm{STI} = \left(\frac{A_s}{22.13}\right)^{0.6} \left(\frac{\sin\beta}{0.0896}\right)^{1.3} \qquad (2)$$

where the parameters in Eq. (2) are defined as for SPI. TWI is a physical attribute that reflects the geotechnical wetness condition (Chapi et al., 2017), and it is computed as follows (Beven and Kirkby, 1979):

$$\mathrm{TWI} = \ln\left(\frac{\alpha}{\tan\beta}\right) \qquad (3)$$

where $\alpha$ and $\beta$ represent the upslope area per unit contour length and the slope angle, respectively. Lithology determines the geological engineering characteristics of the study area (Arabameri et al., 2019; Hong et al., 2018b), and the lithological classes of the study area are described in Table 1. Land use contributes to flood occurrence because it may alter flood runoff and sediment transport (Das, 2019b; Rizeei et al., 2018). NDVI can accurately display surface vegetation coverage (Liang et al., 2017) and is calculated as follows:

$$\mathrm{NDVI} = \frac{R_{NIR} - R_{R}}{R_{NIR} + R_{R}} \qquad (4)$$

where $R_{NIR}$ and $R_{R}$ represent the spectral reflectance acquired in the near-infrared region and in the red region, respectively. Different soil surface characteristics have different water storage potential, which significantly affects the water balance (Hong et al., 2018a). The distance to rivers indicates the distance from the river networks, which are the main pathways for flood discharge and expansion (Gigović et al., 2017; González-Arqueros et al., 2018; Qazi et al., 2016). Continuous rainfall events recharge basins and aquifers, and thus the average annual rainfall has generally been considered a flood triggering factor (Arnell and Gosling, 2016; Tehrany et al., 2015b).
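As a worked illustration of Eqs. (1) to (4), the four index formulas can be written as small Python functions. This is a minimal per-cell sketch, not the GIS workflow used in the study: the function names are hypothetical, slopes are assumed to be given in degrees, and a strictly positive slope is assumed for TWI because the formula is undefined on perfectly flat cells.

```python
import math


def spi(a_s, beta_deg):
    """Stream power index, Eq. (1): A_s * tan(beta)."""
    return a_s * math.tan(math.radians(beta_deg))


def sti(a_s, beta_deg):
    """Sediment transport index, Eq. (2)."""
    sin_beta = math.sin(math.radians(beta_deg))
    return (a_s / 22.13) ** 0.6 * (sin_beta / 0.0896) ** 1.3


def twi(alpha, beta_deg):
    """Topographic wetness index, Eq. (3): ln(alpha / tan(beta))."""
    if beta_deg <= 0:
        # Assumption: flat cells must be handled separately (e.g. by a minimum slope)
        raise ValueError("TWI requires a strictly positive slope angle")
    return math.log(alpha / math.tan(math.radians(beta_deg)))


def ndvi(r_nir, r_red):
    """Normalized difference vegetation index, Eq. (4)."""
    return (r_nir - r_red) / (r_nir + r_red)
```

For example, `ndvi(0.5, 0.1)` evaluates the ratio 0.4 / 0.6, and `sti` returns 1 when the basin area equals 22.13 and the sine of the slope equals 0.0896, the two normalizing constants in Eq. (2).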

The related information on all the triggering factors is listed in Table 2. It should be noted that the land use factor was obtained by classifying a Landsat 7 ETM+ satellite image (Scene ID: LE71220422001324SGS00), acquired on November 20, 2001, using the maximum likelihood algorithm with an OA above 85%. The river networks were extracted from the topographic map, and the distance to rivers factor was calculated using the Euclidean distance tool. All the factor layers were converted into a raster format with a grid size of 30 × 30 m, which corresponds to the spatial resolution of the DEM data. All the continuous factors were reclassified into categorical classes using ArcGIS software, based on previous studies (Chen et al., 2019; Gigović et al., 2017; González-Arqueros et al., 2018; Mojaddadi et al., 2017), the characteristics of the flood spatial distribution and the numerical range of these factors. The reclassification maps of all the flood triggering factors are shown in Fig. 2.

Fig. 2. Thematic maps of the study area. (a) Altitude, (b) aspect, (c) curvature, (d) distance to rivers, (e) land use, (f) lithology, (g) NDVI, (h) rainfall, (i) slope, (j) soil, (k) SPI, (l) STI and (m) TWI.

Table 1 Lithological types of the study area.

| Group name | Unit name | Lithology |
| A | Yang Jiayuan group; Zishan group | Dolomite, calcareous siltstone, carbonaceous siltstone |
| B | Zhongpeng group; Mashan group | Sandstone, silty shale, siltstone |
| C | Huangxie unit; Changshan unit; Haihui unit; Xia | Monzonitic granite, potassium granite |
| D | Tangbian group; Hekou group | Conglomerate, mudstone, pebbly sandstone |
| E | Xia Huangkeng group; Chang kengshui group; Dui Ershi group | Sericite slate, siliceous slate |
| F | Nan Pingshan unit; Tanghu super unit; Fufang super unit | Tonalite, phyric granodiorite, porphyry monzogranite |
| G | Yingqian super unit; Huang Tiankeng unit | Granodiorite |
| H | Bali group; Lao Hutang group | Phyllite, sandy slate, feldspathic quartz sandstone |
| I | Shuishi group; Gaotan group | Silty slate, carbonaceous slate |

Table 2 Data sources and the associated factors in the study area.

| Flood triggering factors | Source of data | Scale/Resolution |
| Altitude, aspect, curvature, slope, SPI, STI, TWI | ASTER GDEM Version 2 | 30 m |
| Lithology | China Geology Organization | 1:2,000,000 |
| Land use, NDVI | Landsat 7 ETM+ images | 30 m |
| Soil | Institute of Soil Science, Chinese Academy of Sciences | 1:1,000,000 |
| Distance to rivers | ASTER GDEM Version 2 | 30 m |
| Rainfall | Jiangxi Meteorological Bureau | 1:50,000 |

3. Methodology

The flowchart of the proposed methodology to identify flood susceptibility zones is illustrated in Fig. 3. Specifically, a flood inventory map and flood triggering factors are first prepared to construct training and validation sets. Then, three data representation forms are used to train CNN architectures of different dimensions. Next, the trained CNN architectures are applied for classification and feature extraction in the FSM process. Finally, the prediction results obtained by the different CNN-based methods are quantitatively evaluated using several objective criteria.

Fig. 3. Flowchart of the proposed methodology.

3.1. SVM

SVM maps the original data into a high-dimensional feature space by using kernel functions (Cortes and Vapnik, 1995) and attempts to find an optimal hyperplane that separates the two classes (Vapnik, 1999). Recently, it has been widely used in flood susceptibility assessment (Choubin et al., 2019; Tehrany et al., 2019; Tehrany et al., 2014b; Tehrany et al., 2015b).

Assuming that a training set $x = \{x_1, x_2, \ldots, x_N\} \in \mathbb{R}^{B \times N}$ consists of two classes denoted as $L \in \{1, -1\}$, the aim of SVM is to ensure hyperplane margin maximization and error minimization. The error minimization is expressed as follows:

$$\min_{w,\,\xi_i,\,b} \left\{ \frac{1}{2}\|w\|^2 + C \sum_i \xi_i \right\} \qquad (1)$$

$$\text{s.t.}\quad y_i\left(w \cdot \phi(x_i) + b\right) \ge 1 - \xi_i,\quad \xi_i \ge 0,\quad \forall i = 1, 2, \ldots, l$$

where $w$ is the weight vector, $b$ denotes the bias, $\xi_i$ are the slack variables for non-separable data and $C$ denotes the penalty parameter. The optimization (1) is solved as follows:

$$\max_{\alpha} \left\{ \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1,\,j=1}^{l} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \right\} \qquad (2)$$

$$\text{s.t.}\quad 0 \le \alpha_i \le C,\quad \sum_{i=1}^{l} \alpha_i y_i = 0$$

where $\alpha_i$ are the Lagrange multipliers and $K(\cdot, \cdot)$ is the kernel function. The decision function is written as follows:

$$f(x) = \sum_{i=1}^{l} \alpha_i y_i K(x, x_i) + b \qquad (3)$$

The SVM with the radial basis function (RBF) kernel can achieve excellent performance in classification/prediction tasks for the following reasons. First, the RBF kernel function can map a sample to a higher-dimensional space, and the linear kernel function is essentially a special case of RBF. Meanwhile, the RBF and sigmoid kernels perform similarly for certain parameters. Second, compared with the polynomial kernel function, the RBF kernel has only a few parameters to tune, and the number of kernel function parameters directly influences the complexity of the function. Finally, the RBF kernel has fewer numerical difficulties.
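To make the decision function in Eq. (3) concrete, the sketch below evaluates it with an RBF kernel in plain Python. The kernel form exp(-gamma * ||x - z||^2) and the parameter name `gamma` are the usual convention and are assumed here, since the text does not write the kernel out; the support vectors, multipliers and bias are toy values chosen for illustration, not fitted results.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2) (assumed standard form)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def decision_function(x, support_vectors, alphas, labels, bias, gamma):
    """Eq. (3): f(x) = sum_i alpha_i * y_i * K(x, x_i) + b."""
    total = sum(
        alpha * y * rbf_kernel(x, sv, gamma)
        for alpha, y, sv in zip(alphas, labels, support_vectors)
    )
    return total + bias


# Toy values (hypothetical): one support vector per class
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
labels = [-1, 1]          # -1 = non-flood, +1 = flood
alphas = [1.0, 1.0]
bias = 0.0

score = decision_function([0.9, 0.8], support_vectors, alphas, labels, bias, gamma=1.0)
predicted_label = 1 if score > 0 else -1
```

A sample close to the positive support vector receives a positive score, and a sample equidistant from both support vectors receives a score of zero in this symmetric toy setup.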

3.2. Data representations for CNNs

As a powerful technique, CNN structures of different dimensions can be constructed to adapt to input data of different dimensions, which can greatly affect the final classification/prediction results (Kussul et al., 2017). In addition, the representation of raw data is crucial because CNNs are based on the principle of locality (Li et al., 2019). In this section, three data representations are introduced to fit the novel CNN architectures, as illustrated in Fig. 4. Initially, the 13 flood triggering factor layers are stacked together to build a multi-channel “image” with a spatial resolution of 30 m, which is regarded as the input data for further data transformation. The three data representations are briefly described as follows:

(1) For the one-dimensional data form in Fig. 4 (a), each pixel vector is considered as an image in which several flood triggering attributes are hidden. Specifically, the input data consist of a set of column vectors whose length is determined by the number of flood triggering factors. Each element of a vector represents the attribute value of the corresponding flood triggering factor.

(2) As shown on the right side of Fig. 4 (a), the two-dimensional data form is derived from the one-dimensional data. The most critical issue is how to construct a bridge from a one-dimensional vector to a two-dimensional matrix. In this study, each pixel vector in the multi-channel “image” is converted into an n × n matrix, where n is the larger of the number of flood triggering factors and the maximum number of categories into which any factor is reclassified. For example, there are 13 flood triggering factors in the study area, which is larger than the maximum number of categories (9, for the aspect and lithology factors). Thus, n is set to 13. Then, each value in the pixel vector (one-dimensional data) is expanded into a row vector of length 13. For instance, the fourth element of the pixel vector, “6” (the attribute value of the distance to rivers factor), is converted into the row vector [0 0 0 0 0 1 0 0 0 0 0 0 0], in which the sixth element is assigned a value of 1 and all other elements are assigned 0. Finally, all row vectors form a 13 × 13 matrix (two-dimensional data).

(3) In the case of three-dimensional data, each pixel and its adjacent pixels form a three-dimensional window patch, as shown in Fig. 4 (b). The patch has the same label as the current pixel. Assuming the window size is 5, each data sample is 13 × 5 × 5.
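The two-dimensional and three-dimensional representations described above can be sketched as follows. This is a minimal illustration under stated assumptions: class values are assumed to be 1-based integers, the multi-channel image is assumed to be stored as nested lists indexed as `image[channel][row][col]`, only interior pixels are handled (the paper does not say how border pixels are treated), and the function names and example pixel vector are hypothetical.

```python
def to_2d(pixel_vector, n):
    """Expand each 1-based class value into a one-hot row of length n."""
    matrix = []
    for value in pixel_vector:
        if not 1 <= value <= n:
            raise ValueError("class values must lie between 1 and n")
        row = [0] * n
        row[value - 1] = 1
        matrix.append(row)
    return matrix


def to_3d_patch(image, row, col, window):
    """Cut a channels x window x window patch centred on an interior pixel."""
    half = window // 2
    return [
        [line[col - half: col + half + 1] for line in channel[row - half: row + half + 1]]
        for channel in image
    ]


# Hypothetical pixel vector with 13 reclassified factor values; the fourth value is 6
pixel_vector = [3, 1, 2, 6, 4, 9, 2, 5, 1, 3, 2, 4, 1]
matrix_2d = to_2d(pixel_vector, n=13)

# Hypothetical 13-channel image of 7 x 7 pixels filled with class value 1
image = [[[1] * 7 for _ in range(7)] for _ in range(13)]
patch_3d = to_3d_patch(image, row=3, col=3, window=5)
```

In this example the fourth row of `matrix_2d` has a 1 in its sixth position, mirroring the example in item (2), and `patch_3d` has the 13 × 5 × 5 shape mentioned in item (3).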

Fig. 4. Different data forms. (a) One-dimensional and two-dimensional data forms, (b) three-dimensional data form.

3.3. CNN architectures

CNNs, which exhibit robust performance in the computer vision and image processing fields, are essentially multilayer feed-forward neural networks that can automatically extract valuable features from raw data (Zhang et al., 2018). The CNN architecture is mainly composed of an input layer, multiple hidden layers and an output layer, and the hidden layers consist of one or more convolutional and pooling layers (Ghorbanzadeh et al., 2019; LeCun et al., 2015). The convolutional layer, which consists of several convolution kernels, iteratively extracts sophisticated and effective features from the original data (Canziani et al., 2016; Mallat, 2016). The pooling (sub-sampling) layer usually follows a convolutional layer and reduces the dimensionality of the feature maps through a down-sampling algorithm, which helps to avoid overfitting and reduces computational cost (Chen et al., 2016). Moreover, CNNs have strong adaptive capability and can achieve excellent performance on data of different dimensions (Shin et al., 2016). In this section, three CNN architectures of different dimensions are developed for regional flood susceptibility analysis.

3.3.1. 1D-CNN

The 1D-CNN structure comprises five layers: an input layer, a convolutional layer, a max pooling layer, a fully connected layer and an output layer. It was developed and trained using one-dimensional data. Assuming that there are n flood triggering factors in the input data and that N convolutional kernels of size a × 1 filter the input data, the resulting layer has N feature vectors. Each element of a feature vector is connected to an a × 1 neighborhood in the input vector. A b × 1 max pooling layer is applied immediately after the convolutional layer to reduce the length of the feature vectors. A fully connected layer with k neural units follows the max pooling layer to reorganize the extracted representations. Finally, m neural units are viewed as the output of the network. Fig. 5 shows the architecture of the 1D-CNN.

Fig. 5. 1D-CNN architecture (n = 13, N = 20, a = 3, b = 2, k = 20 and m = 2).

3.3.2. 2D-CNN

The two-dimensional CNN structure consists of two convolutional layers, two max pooling layers and one fully connected layer. Assume that the input data is an n × n image and that the first convolutional layer has N kernels of size a × a, which produce N feature maps. After a b × b max pooling operation, the size of the feature maps is reduced by half while the depth remains unchanged. The resulting feature maps are sent to the second convolutional layer, which has M kernels of size a × a. Then, the resulting M feature maps are sent to the second b × b max pooling layer, and a fully connected layer with k neural units is used to recognize these extracted features. Finally, m neural units are regarded as the output of the 2D-CNN architecture. Fig. 6 shows the structure of the 2D-CNN.

Fig. 6. 2D-CNN architecture (n = 13, a = 3, N = 20, M = 10, b = 2, k = 50 and m = 2).

3.3.3. 3D-CNN

The 3D-CNN structure contains one convolutional layer, one max pooling layer and two fully connected layers. Assuming that each input sample can be represented as a c × n × n matrix and that N convolutional kernels of size a × a × a filter the input data, the resulting layer has N feature maps. The b × b × b max pooling layer is used to reduce the size of the feature maps by half. Then, two fully connected layers with k neural units are used to reorganize the extracted information. Finally, the output layer contains m neural units, which represent the final result of the whole network. Fig. 7 shows the structure of the 3D-CNN.

Fig. 7. 3D-CNN architecture (c = 13, n = 5, N = 20, a = 3, b = 2, k = 45 and m = 2).
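The layer sizes implied by the parameters in Figs. 5 to 7 can be traced with simple shape arithmetic. The sketch below is only indicative: it assumes stride-1 convolutions without padding and non-overlapping pooling that drops any remainder, neither of which is stated in the text, so the actual feature-map sizes in the paper may differ. All function names are hypothetical.

```python
def conv_out(size, kernel):
    """Output length of a stride-1 convolution without padding (assumption)."""
    return size - kernel + 1


def pool_out(size, window):
    """Output length of non-overlapping max pooling, remainder dropped (assumption)."""
    return size // window


def trace(input_shape, layers):
    """Return the feature-map shape after each ("conv" or "pool", size) layer."""
    shape = tuple(input_shape)
    history = [shape]
    for kind, size in layers:
        step = conv_out if kind == "conv" else pool_out
        shape = tuple(step(d, size) for d in shape)
        history.append(shape)
    return history


def flattened(shape, n_kernels):
    """Number of values passed to the first fully connected layer."""
    total = n_kernels
    for d in shape:
        total *= d
    return total


# Parameters taken from the captions of Figs. 5-7 (a = 3, b = 2)
shapes_1d = trace((13,), [("conv", 3), ("pool", 2)])
shapes_2d = trace((13, 13), [("conv", 3), ("pool", 2), ("conv", 3), ("pool", 2)])
shapes_3d = trace((13, 5, 5), [("conv", 3), ("pool", 2)])

features_1d = flattened(shapes_1d[-1], 20)  # N = 20 kernels
features_2d = flattened(shapes_2d[-1], 10)  # M = 10 kernels in the second layer
features_3d = flattened(shapes_3d[-1], 20)  # N = 20 kernels
```

Under these assumptions the 1D input of length 13 shrinks to 11 after convolution and 5 after pooling, which is consistent with the statement that pooling roughly halves the feature size.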

3.3.4. Hyperparameters of CNN

Setting the hyperparameters is a key step in building a CNN architecture (LeCun et al., 2015). This section describes the hyperparameters used in this study. Specifically, the sizes of the convolutional kernel and the pooling kernel determine the scale of the convolution and pooling operations, respectively (Choi et al., 2017; Golik et al., 2015; Simard et al., 2003). The activation function converts a linear relationship into a nonlinear one so that the neural network can approximate any nonlinear function (Audebert et al., 2019; Chen et al., 2016; Dahl et al., 2013; Mou et al., 2017). The loss function measures the degree of inconsistency between the predicted value and the true value (Chen et al., 2014; James and Stein, 1992; Nourani et al., 2015; Ranjbar et al., 2018; Singh, 1997). In addition, the optimizer iteratively updates the parameters at different learning rates through a gradient descent algorithm (Hinton et al., 2012). AdaGrad is a widely used optimizer because it automatically adjusts the learning rate of each parameter dimension based on the gradients observed in that dimension, which avoids the difficulty of finding a single learning rate that suits all dimensions (Duchi et al., 2011; Paoletti et al., 2018; Schmidhuber, 2015). The learning rate is critical to the training process because it controls the learning speed of the CNN architecture (Marmanis et al., 2016; Salamon and Bello, 2017).
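The per-dimension learning rate adaptation attributed to AdaGrad above can be illustrated with a short sketch of the standard update rule (Duchi et al., 2011). The toy loss function, the base learning rate of 0.1 and the small constant `eps` are assumptions made for illustration and are not the settings used in the study.

```python
import math


def adagrad_step(params, grads, accum, lr=0.1, eps=1e-8):
    """One AdaGrad update: each dimension is scaled by its accumulated squared gradients."""
    for i, g in enumerate(grads):
        accum[i] += g * g
        params[i] -= lr * g / (math.sqrt(accum[i]) + eps)
    return params, accum


def loss(w):
    """Toy loss with very different curvature in the two dimensions."""
    return w[0] ** 2 + 100.0 * w[1] ** 2


def gradient(w):
    return [2.0 * w[0], 200.0 * w[1]]


params = [1.0, 1.0]
accum = [0.0, 0.0]
initial_loss = loss(params)

# The first step moves both dimensions by about lr, despite gradients of 2 and 200
params, accum = adagrad_step(params, gradient(params), accum)
after_first_step = list(params)

for _ in range(49):
    params, accum = adagrad_step(params, gradient(params), accum)
final_loss = loss(params)
```

Although the two gradient components differ by a factor of 100, the normalization by accumulated squared gradients gives both dimensions a comparable first step, which is the behavior the paragraph describes.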

3.4. FSM using CNNs

CNNs are characterized by local connections and shared weights, and their main advantages are that they effectively extract spatial information and automatically optimize the network parameters (Krizhevsky et al., 2012; LeCun et al., 1998). To explore the predictive capability of the CNN framework, two different perspectives, classification and feature extraction, are proposed for flood susceptibility analysis. In the first perspective, CNNs are used directly as classifiers for FSM. In the second perspective, a hybrid framework is developed by integrating CNNs with the SVM classifier: the CNNs extract more representative features from the original data, and these features allow the classical ML classifier to achieve more effective prediction.

3.4.1. CNN classifiers

As mentioned in Sections 3.2 and 3.3, CNNs have achieved fairly good results in classification-related tasks (Hu et al., 2015). In the field of flood susceptibility assessment, the main objective is to predict where a flood event will occur, which can be treated as a binary classification process that labels a given region as “flood” or “non-flood”. In this study, the CNNs are used directly as classifiers for regional flood spatial prediction. The main steps of applying CNNs for classification are summarized as follows. First, the original data are reformed according to the data representation algorithms described in Section 3.2. Then, a CNN structure is built based on the dimensionality of the reformed data. After a series of convolutional and pooling operations, high-level feature maps are extracted from the input data and then reorganized by a fully connected layer. Finally, the reorganized feature vectors are converted into classification results using a nonlinear activation function. For simplicity, the generalized classification process using a CNN classifier with the 2D-CNN architecture is shown in Fig. 8.
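The final step above, in which the output units are turned into a class decision, can be sketched with a softmax function. The text only states that a nonlinear activation function is used, so the choice of softmax, the ordering of the two output units and the function names below are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Convert raw output scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def flood_susceptibility(scores, flood_index=1):
    """Probability assigned to the "flood" unit (index 1 is an assumed convention)."""
    return softmax(scores)[flood_index]


# Hypothetical raw scores of the m = 2 output units for one pixel
raw_scores = [0.0, math.log(3.0)]
susceptibility = flood_susceptibility(raw_scores)
label = "flood" if susceptibility >= 0.5 else "non-flood"
```

Reading the probability of the flood unit for every pixel yields a continuous susceptibility value, while thresholding it gives the binary “flood”/“non-flood” label.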

Fig. 8. Generalized CNN classifier with the 2D-CNN architecture.

3.4.2. Integration of CNNs and SVM

CNN is an effective ML technique that can discover new representations and relationships in the original data that are not easily uncovered by traditional methods (Bergen et al., 2019). Furthermore, CNNs can automatically extract image features, including color, shape, texture and topological structures, by using different combinations of neurons and learning rules, which demonstrates their great potential for feature extraction as well (Niu and Suen, 2012). In other fields, the integration of CNNs and SVM has achieved promising results. For example, Niu and Suen (2012) used a 2D-CNN with five layers to extract features from the Modified National Institute of Standards and Technology (MNIST) data and applied SVM for the final classification. The hybrid method of CNN and SVM has also been applied to microvascular morphological type recognition and hyperspectral image classification (Leng et al., 2016; Xue et al., 2016). In this study, different CNN structures are used to learn more suitable features from the original data. Based on this idea, the main steps of integrating CNN and SVM are summarized as follows. First, the original data are fed to the CNN structure in the different dimensional forms. Then, after several layers of convolution and pooling operations, the hidden information of the input data is reorganized in the fully connected layer, whose neural units are regarded as the extracted features. Next, the SVM classifier is trained using the extracted high-level features. Finally, this integrated framework is used to perform FSM. Specifically, three CNN feature extractors of different dimensions are proposed to learn high-level features from the original data, using the same structures described in Section 3.3. As an example, Fig. 9 illustrates the classification process using the 2D-CNN architecture for feature extraction.

387 388 389

Fig. 9.

Integrated classifier using a ML classifier, couple with CNNs for feature extraction.

3.5. Evaluation methods

3.5.1. Feature evaluation

To evaluate the effectiveness of the features extracted by CNNs, a similarity measure (the mean value of the correlation coefficients) among all the samples is used. A greater similarity among the extracted features of the same class indicates that the input data are easier to discriminate. For clarification, the correlation coefficient is defined as follows (Goh et al., 2007):

$$\rho(x_1, x_2) = \frac{C(x_1, x_2)}{\sqrt{D(x_1)\, D(x_2)}} \tag{1}$$

where $x_1$ and $x_2$ represent two feature vectors, $C(x_1, x_2)$ is the covariance of $x_1$ and $x_2$, and $D(x_1)$ and $D(x_2)$ are the variances of $x_1$ and $x_2$, respectively.
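As an illustration of this similarity measure, the sketch below computes the correlation coefficient of Eq. (1) and averages it over all sample pairs of one class. The feature vectors are invented, the function names are hypothetical, and averaging over unordered pairs is an assumption, since the text does not spell out how the mean is taken. Vectors with zero variance are not handled.

```python
from itertools import combinations
from math import sqrt


def correlation(x1, x2):
    """Correlation coefficient: covariance divided by the root of both variances."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(x1, x2)) / n
    var1 = sum((a - m1) ** 2 for a in x1) / n
    var2 = sum((b - m2) ** 2 for b in x2) / n
    return cov / sqrt(var1 * var2)


def mean_similarity(samples):
    """Mean correlation over all unordered pairs of feature vectors in one class."""
    pairs = list(combinations(samples, 2))
    return sum(correlation(a, b) for a, b in pairs) / len(pairs)


# Hypothetical feature vectors of three samples from the same class.
flood_features = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.0, 3.0, 5.0, 7.0],
]
similarity = mean_similarity(flood_features)
print(round(similarity, 3))
```

Because the three toy vectors are linearly related, their mean similarity is 1; less coherent classes give lower values.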

3.5.2. Model evaluation

The ROC curve has recently been used to assess the performance of FSM models (Dano et al., 2019; Kazakis et al., 2015; Popovic et al., 2018). Specifically, this curve is constructed by plotting sensitivity on the y-axis against 1 - specificity on the x-axis (Bradley, 1997). The AUC value indicates the predictive capability as well (Choubin et al., 2019; Pamučar et al., 2018b; Xia et al., 2017). In particular, an AUC value of 1 indicates perfect spatial agreement between the observed and simulated data. An AUC value of 0.5 is the agreement that would be expected by chance, whereas a value of 0 is regarded as a non-informative result (Choubin et al., 2019; Fawcett, 2006; Tehrany et al., 2014b). In addition, OA is another widely used criterion for assessing the predictive performance of FSM models (Chapi et al., 2017; Hong et al., 2018b; Rahmati et al., 2016); it is the proportion of correctly classified pixels to the total number of pixels:

$$OA = \frac{TP + TN}{TP + TN + FP + FN} \tag{2}$$

where TP (true positive) and TN (true negative) denote the numbers of flood and non-flood grid cells that are correctly classified, and FP (false positive) and FN (false negative) denote the numbers of grid cells that are incorrectly classified as flood and non-flood, respectively. Besides, κ is another powerful statistical criterion that measures the agreement or reliability between two raters (Pontius Jr and Millones, 2011). For flood susceptibility assessment, this criterion can evaluate the prediction accuracy of FSM models, and a higher value implies better prediction (Khosravi et al., 2018; Mahmoud and Gan, 2018). This criterion is formulated as follows:

$$\kappa = \frac{P_C - P_{exp}}{1 - P_{exp}} \tag{3}$$

$$P_{exp} = \frac{(TP + FN)(TP + FP) + (FP + TN)(FN + TN)}{(TP + TN + FN + FP)^2} \tag{4}$$

where $P_C$ is the proportion of samples that have been correctly classified and $P_{exp}$ is the expected probability of chance agreement.
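A short worked example may help to relate OA and κ. The sketch below implements both criteria from the confusion-matrix counts; the denominator of the chance-agreement term is taken as the squared sample total, which is the standard form of Cohen's kappa. The counts are hypothetical (a balanced set of 64 samples) and are not taken from the paper.

```python
def overall_accuracy(tp, tn, fp, fn):
    """Share of correctly classified cells."""
    return (tp + tn) / (tp + tn + fp + fn)


def cohen_kappa(tp, tn, fp, fn):
    """Agreement beyond chance between predicted and observed labels."""
    n = tp + tn + fp + fn
    p_c = (tp + tn) / n
    # Chance agreement: product of row and column marginals for each class.
    p_exp = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n ** 2
    return (p_c - p_exp) / (1 - p_exp)


# Hypothetical confusion matrix: 32 flood and 32 non-flood validation cells.
tp, fn, fp, tn = 27, 5, 5, 27
oa = overall_accuracy(tp, tn, fp, fn)
kappa = cohen_kappa(tp, tn, fp, fn)
print(f"OA = {oa:.4f}, kappa = {kappa:.4f}")  # OA = 0.8438, kappa = 0.6875
```

With balanced classes the chance agreement is 0.5, so κ reduces to 2·OA - 1. This is why, in such a setting, two models with the same OA necessarily share the same κ.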

Additionally, the non-parametric Wilcoxon signed-rank test was implemented to assess the statistical significance of the differences among models (Wilcoxon et al., 1970). The null hypothesis is that the FSM models do not differ significantly at a significance level of α = 5%. Generally, if the p value is smaller than 0.05 and the absolute Z value exceeds the threshold of 1.96, the null hypothesis is rejected and the difference between the two models is significant.
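The decision rule of the Wilcoxon signed-rank test can be illustrated with the normal approximation of the test statistic. The sketch below is a simplified version under stated assumptions: zero differences are discarded, tied magnitudes receive average ranks, no tie or continuity correction is applied to the variance, and the paired susceptibility indices are invented. Statistical packages may therefore return slightly different values.

```python
from math import erfc, sqrt


def wilcoxon_signed_rank(x, y):
    """Return (z, two-sided p) for paired samples using the normal approximation."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # discard zero differences
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    p = erfc(abs(z) / sqrt(2))  # two-sided p value of a standard normal
    return z, p


# Hypothetical susceptibility indices of two models at the same ten locations.
svm_index = [0.59, 0.63, 0.67, 0.68, 0.75, 0.75, 0.78, 0.80, 0.81, 0.85]
cnn_index = [0.60, 0.65, 0.70, 0.72, 0.80, 0.81, 0.85, 0.88, 0.90, 0.95]
z, p = wilcoxon_signed_rank(svm_index, cnn_index)
significant = p < 0.05 and abs(z) > 1.96
print(round(z, 3), round(p, 4), significant)
```

Because the smaller of the two rank sums is used, the Z value is negative or zero, so the 1.96 threshold is applied to its absolute value.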

4. Results

4.1. Building flood models and producing flood susceptibility maps

In this subsection, three different CNN architectures were constructed for classification and feature extraction in flood susceptibility analysis. To this end, training and validation sets of three different dimensions were first prepared for subsequent CNN training and validation, as described in Section 3.3. Then, three CNN structures of different dimensions were constructed, and the training set of each dimension was fed to the corresponding neural network for a certain number of iterations until the training process converged. Finally, the trained CNN models were used for classification and feature extraction. It should be noted that the window size is a key parameter for constructing the three-dimensional data form. Therefore, an analysis of window size was performed, as shown in Fig. 10. It can be observed that the 3D-CNN model with a window size of 5 achieves the highest AUC value. In addition, during the construction of the CNN structures, the related hyperparameters were determined through a fine-tuning step using the validation set, with reference to previous studies (Audebert et al., 2019; Hu et al., 2014; Maggiori et al., 2017; Shin et al., 2016). Table 3 reports the hyperparameter settings of the different CNN architectures. The aforementioned CNN structures were built in Python using the Keras framework (https://keras.io).

Fig. 10. Analysis of window size for three-dimensional data.
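To illustrate what the window size controls, the sketch below cuts a w × w neighbourhood around one grid cell out of a stack of factor layers, giving one three-dimensional sample. This is only one plausible reading of the three-dimensional data form: the authors' construction is defined in Section 3.2, the grids and the function name here are hypothetical, and clamping indices at the raster edge is an assumption.

```python
def extract_window(factor_stack, row, col, window=5):
    """Return a [factor][window][window] patch centred on (row, col).

    factor_stack is a list of equally sized 2D grids, one per triggering factor.
    Cells outside the raster are filled by repeating the nearest edge cell.
    """
    half = window // 2
    n_rows, n_cols = len(factor_stack[0]), len(factor_stack[0][0])
    patch = []
    for grid in factor_stack:
        patch.append([
            [grid[min(max(r, 0), n_rows - 1)][min(max(c, 0), n_cols - 1)]
             for c in range(col - half, col + half + 1)]
            for r in range(row - half, row + half + 1)
        ])
    return patch


# Two hypothetical 6 x 6 factor layers (for example elevation and slope).
elevation = [[r * 10 + c for c in range(6)] for r in range(6)]
slope = [[r + c for c in range(6)] for r in range(6)]

patch = extract_window([elevation, slope], row=0, col=0, window=5)
print(len(patch), len(patch[0]), len(patch[0][0]))  # 2 5 5
```

A larger window brings in more terrain context per sample but also more cells that may be irrelevant to the centre cell, which is one reason the window size needs tuning.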

Table 3
Hyperparameter settings of the CNN architectures.

CNNs      Convolutional kernel size   Max pooling kernel size   Number of epochs
1D-CNN    3 × 1                       2 × 1                     600
2D-CNN    3 × 3                       2 × 2                     400
3D-CNN    3 × 3 × 3                   2 × 2 × 2                 150

Activation function   Loss function               Optimizer   Learning rate
ReLU                  Categorical cross-entropy   AdaGrad     0.0015

Once the proposed CNN architectures had been constructed and trained, the three CNN structures were applied to classification and feature extraction. For classification, the 1D-CNN, 2D-CNN and 3D-CNN models were used to calculate the flood susceptibility index of each pixel. For feature extraction, the validity of the extracted features was evaluated using the similarity index, as shown in Table 4. It can be observed that all the features extracted by the 1D-CNN, 2D-CNN and 3D-CNN architectures have higher similarity than the original features, which demonstrates that the extracted features are more effective and easier to classify. Then, the learned features were regarded as new representation vectors and sent to the SVM model for classification, and the susceptibility indices were calculated by the hybrid methods of 1D-CNN-SVM, 2D-CNN-SVM and 3D-CNN-SVM. After the susceptibility index of each grid cell in the study area was obtained by the FSM models, these indices were automatically divided into five intervals (very low, low, moderate, high and very high) using the natural breaks (Jenks) method (Jenks, 1967), which has been widely used in the FSM process (Chapi et al., 2017; Dano et al., 2019; Shafizadeh-Moghadam et al., 2018). The SVM classifier with an RBF kernel is a robust method that has achieved good performance in FSM (Choubin et al., 2019; Tehrany et al., 2015b; Zhao et al., 2019). For comparison, an SVM classifier with an RBF kernel was therefore also used for FSM to demonstrate the effectiveness of a CNN-based feature extractor.
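To make the classification of susceptibility indices concrete, the sketch below computes natural breaks with a small dynamic program that minimizes the within-class sum of squared deviations, the criterion usually associated with the Jenks method. It is an illustration only: the indices are invented, the function names are hypothetical, and the O(k·n²) loop is meant for small samples rather than a full raster. The GIS routine actually used by the authors is not specified in the text.

```python
from bisect import bisect_left

LEVELS = ["very low", "low", "moderate", "high", "very high"]


def natural_breaks(values, n_classes=5):
    """Return the upper bounds of the first n_classes - 1 classes."""
    data = sorted(values)
    n = len(data)
    if n_classes > n:
        raise ValueError("need at least as many values as classes")
    # Prefix sums allow the squared deviation of any run in constant time.
    s1, s2 = [0.0], [0.0]
    for v in data:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def ssd(i, j):
        """Sum of squared deviations of data[i..j] (inclusive) from its mean."""
        total = s1[j + 1] - s1[i]
        return (s2[j + 1] - s2[i]) - total * total / (j - i + 1)

    inf = float("inf")
    cost = [[inf] * n for _ in range(n_classes)]
    start = [[0] * n for _ in range(n_classes)]
    for j in range(n):
        cost[0][j] = ssd(0, j)
    for c in range(1, n_classes):
        for j in range(c, n):
            for i in range(c, j + 1):  # the last class covers data[i..j]
                candidate = cost[c - 1][i - 1] + ssd(i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    start[c][j] = i
    # Walk back through the stored class starts to recover the breaks.
    bounds, j = [], n - 1
    for c in range(n_classes - 1, 0, -1):
        i = start[c][j]
        bounds.append(data[i - 1])
        j = i - 1
    return bounds[::-1]


def classify(index, breaks):
    """Map a susceptibility index to one of the five levels."""
    return LEVELS[bisect_left(breaks, index)]


# Hypothetical susceptibility indices forming five loose groups.
indices = [0.02, 0.03, 0.05, 0.21, 0.22, 0.24, 0.48, 0.50, 0.52,
           0.71, 0.73, 0.75, 0.93, 0.95, 0.97]
breaks = natural_breaks(indices, n_classes=5)
print(breaks, classify(0.97, breaks))
```

Unlike equal-interval or quantile classification, the class limits here follow gaps in the data, so the five levels adapt to the distribution of indices that each model produces.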

Table 4
Similarity of original features and extracted features.

Methods                Similarity (flood class)   Similarity (non-flood class)
Original 1D features   0.723                      0.508
Extracted features     0.934                      0.894
Original 2D features   0.487                      0.275
Extracted features     0.933                      0.839
Original 3D features   0.705                      0.494
Extracted features     0.911                      0.772

The flood susceptibility maps produced by the different CNN-based methods and SVM are shown in Fig. 11. It can be seen that the very high susceptibility zones have similar distributions and that most floods are located in these zones. To analyze the susceptibility maps quantitatively, the frequency of flood occurrence in each susceptibility zone was calculated, and the results are shown in Table 5. Most floods fall in the high and very high susceptibility areas and very few floods occur in the very low susceptibility areas, which indicates relatively high consistency between historical flood events and susceptibility zones for all the methods. Furthermore, more than 80% of the historical flood events are located in the very high susceptibility area, which confirms the rationality of the flood susceptibility maps. Meanwhile, all the CNN-based methods achieved more reliable susceptibility maps, because their flood frequency in the high and very high areas combined was higher than that of SVM.
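The frequency analysis amounts to counting how many historical flood points fall into each susceptibility level. The sketch below shows the calculation; the function name is hypothetical, and the per-level counts are back-calculated from the 1D-CNN percentages in Table 5 under the assumption that all 108 flood events were used, which the text does not state explicitly.

```python
from collections import Counter

LEVELS = ["very low", "low", "moderate", "high", "very high"]


def flood_frequency(levels_at_floods):
    """Percentage of flood points per susceptibility level, rounded to 2 decimals."""
    counts = Counter(levels_at_floods)
    total = len(levels_at_floods)
    return {level: round(100 * counts[level] / total, 2) for level in LEVELS}


# Assumed counts: susceptibility level sampled at each of 108 flood locations.
levels_at_floods = ["moderate"] * 3 + ["high"] * 17 + ["very high"] * 88
frequency = flood_frequency(levels_at_floods)
high_or_above = round(100 * (17 + 88) / 108, 2)
print(frequency, high_or_above)
```

Summing the two highest levels gives a single figure per model, which is convenient when comparing maps.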

Fig. 11. Flood susceptibility maps of different classifiers. (a) 1D-CNN; (b) 2D-CNN; (c) 3D-CNN; (d) 1D-CNN-SVM; (e) 2D-CNN-SVM; (f) 3D-CNN-SVM and (g) SVM.

Table 5
Frequency analysis of floods on the susceptibility maps.

Susceptible area   1D-CNN   2D-CNN   3D-CNN   1D-CNN-SVM   2D-CNN-SVM   3D-CNN-SVM   SVM
Very low           0%       0.93%    0%       0%           0%           0.93%        0%
Low                0%       0%       0.93%    0.93%        0.93%        1.85%        4.63%
Moderate           2.78%    2.78%    1.85%    4.63%        6.48%        1.85%        5.56%
High               15.74%   12.96%   12.04%   11.11%       8.33%        11.11%       6.48%
Very high          81.48%   83.33%   85.19%   83.33%       84.26%       84.26%       83.33%

4.2. Model validation and comparison

To evaluate the performance of all the methods, the statistical criteria of κ and OA were used, as shown in Table 6. It can be seen that all the CNN-based methods achieved better performance than SVM in terms of both OA and κ. For example, the 1D-CNN-SVM, 2D-CNN-SVM and 3D-CNN-SVM methods were 1.57–4.69% and 0.0312–0.0937 higher than SVM in terms of OA and κ, respectively. Moreover, the 2D-CNN-SVM method achieved the highest OA and κ values among all the methods. Therefore, the performance of SVM can be further improved by using the features extracted by CNN.

Table 6
Prediction accuracies of different CNN-based methods and SVM.

Method        OA value   κ
1D-CNN        84.38%     0.6875
2D-CNN        85.94%     0.7188
3D-CNN        84.38%     0.6875
1D-CNN-SVM    85.94%     0.7188
2D-CNN-SVM    87.50%     0.7500
3D-CNN-SVM    84.38%     0.6875
SVM           82.81%     0.6563

Fig. 12 shows the ROC curves of all the methods using the validation set. The experimental results demonstrate that all the CNN-based methods had better prediction performance than SVM in terms of AUC. Specifically, the highest AUC value, 0.937, was obtained by the 2D-CNN method, followed by 2D-CNN-SVM (0.934), 3D-CNN (0.928), 3D-CNN-SVM (0.922), 1D-CNN (0.905), 1D-CNN-SVM (0.904) and SVM (0.883). Meanwhile, when CNN was used for feature extraction in the classification process, the AUC value of SVM increased by 0.021–0.051. Furthermore, the Wilcoxon signed-rank test was used to verify the statistical difference between the proposed CNN-based methods and the SVM classifier. When the p value is smaller than 0.05 and the absolute Z value exceeds 1.96, the difference between flood models is significant. The Z values and significance levels of the different flood models are shown in Table 7. It can be seen that all the compared pairs differ significantly, because their Z values and p values satisfy the significance conditions described in Section 3.5.2.
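For readers who wish to reproduce this kind of comparison, the AUC can be computed directly from susceptibility indices without drawing the curve, because it equals the probability that a randomly chosen flood cell receives a higher index than a randomly chosen non-flood cell (ties counted as one half). The sketch below uses this rank-based definition with invented labels and scores; it is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the share of (flood, non-flood) pairs that are ranked correctly."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in positives for q in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical validation cells: 1 = flood, 0 = non-flood.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))  # 0.889
```

Since this measure depends on the full ranking of indices rather than on thresholded labels, two models with the same OA can still differ in AUC.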

Fig. 12. ROC curves for all the methods using the validation set.

Table 7
Wilcoxon signed-rank test between the proposed methods and SVM.

Comparative pairs     Z value    p value   Significant
SVM vs. 1D-CNN        -8.575     < 0.05    Yes
SVM vs. 2D-CNN        -2.925     < 0.05    Yes
SVM vs. 3D-CNN        -11.145    < 0.05    Yes
SVM vs. 1D-CNN-SVM    -14.657    < 0.05    Yes
SVM vs. 2D-CNN-SVM    -14.469    < 0.05    Yes
SVM vs. 3D-CNN-SVM    -11.145    < 0.05    Yes

5. Discussion

Applying an appropriate FSM technique is a key prerequisite for preventing and reducing flood damage (Hong et al., 2018a; Wang et al., 2019b). Therefore, it is necessary to explore the possibility of applying newly developed techniques in FSM. In this paper, CNN, as a representative deep learning technique, is considered for regional flood susceptibility analysis from different perspectives in Shangyou County, China.

As a popular and powerful technique, CNN has shown great potential in classification and prediction. CNN architectures of different dimensions have been designed to solve various tasks with different data modalities (LeCun et al., 2015). For example, 1D-CNN has been widely used for signal processing and sentence classification, 2D-CNN has been mainly used for image processing (Golik et al., 2015; Hu et al., 2014) and 3D-CNN has been developed for video-related problems (Ji et al., 2012; Tran et al., 2015). In flood susceptibility analysis, each sample in the study area is usually composed of a vector of factors in the form of one-dimensional data (Chapi et al., 2017; Hong et al., 2018a; Khosravi et al., 2019; Zhao et al., 2019), and all the samples are input into the ML method for classification. To fully explore the prediction capability of CNN in flood susceptibility assessment, data forms of different dimensions were proposed to fit the corresponding CNN structures. The one-dimensional data form contains all the information about the flood triggering factors, and the 1D-CNN architecture can take advantage of the local relationships between these factors. Various flood triggering factors related to the morphological, geological and hydrological conditions play a crucial role in constructing flood models (Mahmoud and Gan, 2018b; Rijal et al., 2018). Because CNN has demonstrated promising results in image processing (Anthimopoulos et al., 2016; Krizhevsky et al., 2012; LeCun et al., 1998), a two-dimensional data form is proposed so that the CNN can analyze flood susceptibility from an “image” perspective. This data form can be regarded as a two-dimensional extension of the feature vector that considers the spatial information of the triggering factors. The three-dimensional data form contains both the attribute information of the factors and the local terrain spatial information between pixels in a specific window.

Reliable flood susceptibility maps are important for decision-makers who need to assess and prevent flood hazards (Termeh et al., 2018; Wang et al., 2019c; Xia et al., 2017). Frequency analysis of flood occurrences is a useful way to assess such maps. In Table 5, most floods occurred in the high and very high susceptibility areas and few floods occurred in the very low susceptibility areas, which is consistent with previous studies (Bui et al., 2019b; Chapi et al., 2017; Khosravi et al., 2018). Moreover, in the high and very high susceptibility areas combined, the CNN-based methods achieved a higher flood frequency (92.59–97.22%) than SVM (89.81%), which indicates that the CNN-based methods can produce more accurate and reliable susceptibility maps. Furthermore, the experimental results in Table 6 and Fig. 12 demonstrate that all the proposed CNNs obtained better results than SVM in terms of OA, κ and AUC. SVM learns the decision plane directly from the raw data, while CNN can transform the raw data into useful representations that are important for model discrimination. After the convolution and pooling operations, important parts of the input data are enhanced and irrelevant information is reduced. The Wilcoxon signed-rank test in Table 7 also demonstrates that the statistical difference between the CNN-based methods and SVM is significant. The improvement in prediction accuracy brought by the CNN-based methods is therefore instructive for decision-makers who are considering adopting a CNN technique to obtain more accurate flood susceptibility maps. In particular, the 2D-CNN architecture obtained the highest accuracy when used directly as a classifier or as a feature extractor in the FSM process, and the reasons can be summarized as follows. First, as described in Section 3, compared to the one-dimensional data form, the three-dimensional data form contains not only attribute information but also local terrain spatial information. However, the three-dimensional data form may contain redundant information that interferes with the ability of the classifier to distinguish the true classification labels. Second, the two-dimensional data form not only makes the construction of the CNN network easier, but also better displays the attribute information in the original data.

It should be noted that some methods in Table 6 had the same OA and κ values. For example, 1D-CNN, 3D-CNN and 3D-CNN-SVM obtained the same OA and κ values of 84.38% and 0.6875, and 2D-CNN and 1D-CNN-SVM obtained the same OA and κ values of 85.94% and 0.7188, respectively, owing to the very limited number of validation samples. In addition, the evaluation criteria of OA and κ, which mainly consider the final classification label, may not accurately reflect the differences between the proposed FSM models, whereas the ROC curve technique is very sensitive to the flood susceptibility indices. Therefore, the ROC curve and AUC value can portray the assessment results of the FSM methods more precisely, which has been confirmed by other studies (Liu et al., 2018; Mukhametzyanov and Pamucar, 2018; Pamučar et al., 2018a; Zhao et al., 2019).

6. Conclusions

This paper studies the application of the CNN technique in FSM from two different perspectives in the case of Shangyou County, China. First, a spatial database containing 13 flood triggering factors and 108 historical flood events was prepared to construct the proposed methods. The developed CNN architectures were then used to construct flood susceptibility models in two different ways, namely classification and feature extraction. Next, flood susceptibility maps were obtained using the CNN-based methods and compared with those of SVM. Finally, several objective criteria (OA, κ, ROC and AUC) were used to compare and verify all the FSM methods. The main conclusions of this paper are summarized as follows:

(1) The proposed 1D-CNN, 2D-CNN and 3D-CNN classifiers achieved better prediction performance than SVM in terms of AUC, which demonstrates the superiority of the CNN frameworks.

(2) By using CNN for feature extraction, the prediction capability of SVM was effectively improved. In particular, the 2D-CNN-SVM method achieved the highest OA and κ of all the methods and the highest AUC (0.934) among the hybrid methods.

(3) The proposed CNN-based methods are expected to be used for flood disaster assessment and management. Meanwhile, the application prospects of CNN can inspire more effective FSM techniques.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (61271408, 41602362), the International Partnership Program of Chinese Academy of Sciences (115242KYSB20170022) and the China Scholarship Council (201906860029). The authors would also like to thank the handling editor and the two anonymous reviewers for their valuable comments and suggestions, which significantly improved the quality of this paper.

References

Anthimopoulos, M., Christodoulidis, S., Ebner, L., Christe, A., Mougiakakou, S., 2016. Lung pattern classification for interstitial lung diseases using a deep convolutional neural network. IEEE transactions on medical imaging, 35, 1207-1216.
Arabameri, A., Rezaei, K., Cerdà, A., Conoscenti, C., Kalantari, Z., 2019. A comparison of statistical methods and multi-criteria decision making to map flood hazard susceptibility in Northern Iran. Science of The Total Environment, 660, 443-458.
Arnell, N.W., Gosling, S.N., 2016. The impacts of climate change on river flow regimes at the global scale. Climatic Change, 134, 387-401.
Audebert, N., Saux, B., Lefèvre, S., 2019. Deep Learning for Classification of Hyperspectral Data: A Comparative Review. arXiv preprint arXiv:1904.10674.
Bathrellos, G., Skilodimou, H., Soukis, K., Koskeridou, E., 2018. Temporal and Spatial Analysis of Flood Occurrences in the Drainage Basin of Pinios River (Thessaly, Central Greece). Land, 7, 106.
Bathrellos, G.D., Skilodimou, H.D., Chousianitis, K., Youssef, A.M., Pradhan, B., 2017. Suitability estimation for urban development using multi-hazard assessment map. Science of the Total Environment, 575, 119-134.
Bergen, K.J., Johnson, P.A., Maarten, V., Beroza, G.C., 2019. Machine learning for data-driven discovery in solid Earth geoscience. Science, 363, eaau0323.
Beven, K.J., Kirkby, M.J., 1979. A physically based, variable contributing area model of basin hydrology/Un modèle à base physique de zone d'appel variable de l'hydrologie du bassin versant. Hydrological Sciences Journal, 24, 43-69.
Bout, B., Jetten, V., 2018. The validity of flow approximations when simulating catchment-integrated flash floods. Journal of Hydrology, 556, 674-688.
Bradley, A.P., 1997. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern recognition, 30, 1145-1159.
Bui, D.T. et al., 2019a. A novel deep learning neural network approach for predicting flash flood susceptibility: A case study at a high frequency tropical storm area. Science of The Total Environment, 134413.
Bui, D.T., Tsangaratos, P., Ngo, P.-T.T., Pham, T.D., Pham, B.T., 2019b. Flash flood susceptibility modeling using an optimized fuzzy rule based feature selection technique and tree based ensemble methods. Science of The Total Environment.
Canziani, A., Paszke, A., Culurciello, E., 2016. An analysis of deep neural network models for practical applications. arXiv preprint arXiv:1605.07678.
Chaabani, C., Chini, M., Abdelfattah, R., Hostache, R., Chokmani, K., 2018. Flood Mapping in a Complex Environment Using Bistatic TanDEM-X/TerraSAR-X InSAR Coherence. Remote Sensing, 10, 1873.
Chapi, K. et al., 2017. A novel hybrid artificial intelligence approach for flood susceptibility assessment. Environmental Modelling & Software, 95, 229-245.
Chen, W. et al., 2019. Flood susceptibility modelling using novel hybrid approach of Reduced-error pruning trees with Bagging and Random subspace ensembles. Journal of Hydrology.
Chen, Y., Jiang, H., Li, C., Jia, X., Ghamisi, P., 2016. Deep feature extraction and classification of hyperspectral images based on convolutional neural networks. IEEE Transactions on Geoscience and Remote Sensing, 54, 6232-6251.
Chen, Y., Lin, Z., Zhao, X., Wang, G., Gu, Y., 2014. Deep learning-based classification of hyperspectral data. IEEE Journal of Selected topics in applied earth observations and remote sensing, 7, 2094-2107.
Choi, K., Fazekas, G., Sandler, M., Cho, K., 2017. Convolutional recurrent neural networks for music classification, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, pp. 2392-2396.
Choubin, B. et al., 2019. An Ensemble prediction of flood susceptibility using multivariate discriminant analysis, classification and regression trees, and support vector machines. Science of the Total Environment, 651, 2087-2096.
Cortes, C., Vapnik, V., 1995. Support-vector networks. Machine learning, 20, 273-297.
Dahl, G.E., Sainath, T.N., Hinton, G.E., 2013. Improving deep neural networks for LVCSR using rectified linear units and dropout, 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, pp. 8609-8613.
Dano, U.L. et al., 2019. Flood Susceptibility Mapping Using GIS-Based Analytic Network Process: A Case Study of Perlis, Malaysia. Water, 11, 615.
Das, S., 2019a. Geospatial mapping of flood susceptibility and hydro-geomorphic response to the floods in Ulhas basin, India. Remote Sensing Applications: Society and Environment, 14, 60-74.
Das, S., 2019b. Geospatial mapping of flood susceptibility and hydro-geomorphic response to the floods in Ulhas basin, India. Remote Sensing Applications: Society and Environment.
Duchi, J., Hazan, E., Singer, Y., 2011. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12, 2121-2159.
Fawcett, T., 2006. An introduction to ROC analysis. Pattern recognition letters, 27, 861-874.
Gebrehiwot, A., Hashemi-Beni, L., Thompson, G., Kordjamshidi, P., Langan, T.E., 2019. Deep Convolutional Neural Network for Flood Extent Mapping Using Unmanned Aerial Vehicles Data. Sensors, 19, 1486.
Ghorbanzadeh, O. et al., 2019. Evaluation of different machine learning methods and deep-learning convolutional neural networks for landslide detection. Remote Sensing, 11, 196.
Gigović, L., Pamučar, D., Bajić, Z., Drobnjak, S., 2017. Application of GIS-Interval Rough AHP Methodology for Flood Hazard Mapping in Urban Areas. Water, 9, 1-26.
Gigović, L., Pamučar, D., Božanić, D., Ljubojević, S., 2017. Application of the GIS-DANP-MABAC multi-criteria model for selecting the location of wind farms: A case study of Vojvodina, Serbia. Renewable Energy, 103, 501-521.
Goh, K.-I. et al., 2007. The human disease network. Proceedings of the National Academy of Sciences, 104, 8685-8690.
Golik, P., Tüske, Z., Schlüter, R., Ney, H., 2015. Convolutional neural networks for acoustic modeling of raw time signal in LVCSR, Sixteenth annual conference of the international speech communication association.
González-Arqueros, M.L., Mendoza, M.E., Bocco, G., Castillo, B.S., 2018. Flood susceptibility in rural settlements in remote zones: The case of a mountainous basin in the Sierra-Costa region of Michoacán, Mexico. Journal of environmental management, 223, 685-693.
Goodfellow, I., Bengio, Y., Courville, A., 2016. Deep learning. MIT press.
Graves, A., Mohamed, A.-r., Hinton, G., 2013. Speech recognition with deep recurrent neural networks, 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, pp. 6645-6649.
Hinton, G. et al., 2012. Deep neural networks for acoustic modeling in speech recognition. IEEE Signal processing magazine, 29.
Hong, H. et al., 2018a. Flood susceptibility assessment in Hengfeng area coupling adaptive neuro-fuzzy inference system with genetic algorithm and differential evolution. Science of The Total Environment, 621, 1124-1141.
Hong, H. et al., 2018b. Application of fuzzy weight of evidence and data mining techniques in construction of flood susceptibility map of Poyang County, China. Science of the Total Environment, 625, 575-588.
Hu, B., Lu, Z., Li, H., Chen, Q., 2014. Convolutional neural network architectures for matching natural language sentences, Advances in neural information processing systems, pp. 2042-2050.
Hu, W., Huang, Y., Wei, L., Zhang, F., Li, H., 2015. Deep convolutional neural networks for hyperspectral image classification. Journal of Sensors, 2015.
James, W., Stein, C., 1992. Estimation with quadratic loss, Breakthroughs in statistics. Springer, pp. 443-460.
Jenks, G.F., 1967. The data model concept in statistical mapping. International yearbook of cartography, 7, 186-190.
Ji, S., Xu, W., Yang, M., Yu, K., 2012. 3D convolutional neural networks for human action recognition. IEEE transactions on pattern analysis and machine intelligence, 35, 221-231.
Kazakis, N., Kougias, I., Patsialis, T., 2015. Assessment of flood hazard areas at a regional scale using an index-based approach and Analytical Hierarchy Process: Application in Rhodope-Evros region, Greece. Science of the Total Environment, 538, 555-563.
Khosravi, K. et al., 2018. A comparative assessment of decision trees algorithms for flash flood susceptibility modeling at Haraz watershed, northern Iran. Science of the Total Environment, 627, 744-755.
Khosravi, K. et al., 2019. A Comparative Assessment of Flood Susceptibility Modeling Using Multi-Criteria Decision-Making Analysis and Machine Learning Methods. Journal of Hydrology.
Kia, M.B., Pirasteh, S., Pradhan, B., Wan, N.A.S., Moradi, A., 2012. An artificial neural network model for flood simulation using GIS: Johor River Basin, Malaysia. Environmental Earth Sciences, 67, 251-264.
Krizhevsky, A., Sutskever, I., Hinton, G.E., 2012. Imagenet classification with deep convolutional neural networks, Advances in neural information processing systems, pp. 1097-1105.
LeCun, Y., Bengio, Y., Hinton, G., 2015. Deep learning. Nature, 521, 436.
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P., 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86, 2278-2324.
Leng, J., Li, T., Bai, G., Dong, Q., Dong, H., 2016. Cube-CNN-SVM: a novel hyperspectral image classification method, 2016 IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI). IEEE, pp. 1027-1034.
Li, S. et al., 2019. Deep Learning for Hyperspectral Image Classification: An Overview. IEEE Transactions on Geoscience and Remote Sensing.
Liang, D., Xu, Z., Liu, D., 2017. Three-way decisions with intuitionistic fuzzy decision-theoretic rough sets based on point operators. Information Sciences, 375, 183-201.
Liu, F., Aiwu, G., Lukovac, V., Vukic, M., 2018. A multicriteria model for the selection of the transport service provider: A single valued neutrosophic DEMATEL multicriteria model. Decision Making: Applications in Management and Engineering, 1, 121-130.
Maggiori, E., Tarabalka, Y., Charpiat, G., Alliez, P., 2017. Convolutional neural networks for large-scale remote-sensing image classification. IEEE Transactions on Geoscience and Remote Sensing, 55, 645-657.
Mahmoud, S.H., Gan, T.Y., 2018. Multi-criteria approach to develop flood susceptibility maps in arid regions of Middle East. Journal of Cleaner Production, 196, 216-229.
Mallat, S., 2016. Understanding deep convolutional networks. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374, 20150203.
Marmanis, D. et al., 2016. Semantic segmentation of aerial images with an ensemble of CNNs. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 3, 473.
Mojaddadi, H., Pradhan, B., Nampak, H., Ahmad, N., Ghazali, A.H.b., 2017. Ensemble machine-learning-based geospatial approach for flood risk assessment using multi-sensor remote-sensing data and GIS. Geomatics, Natural Hazards and Risk, 8, 1080-1102.
Moore, I.D., Gessler, P., Nielsen, G., Peterson, G., 1993. Soil attribute prediction using terrain analysis. Soil Science Society of America Journal, 57, 443-452.
Moore, I.D., Wilson, J.P., 1992. Length-slope factors for the Revised Universal Soil Loss Equation: Simplified method of estimation. Journal of soil and water conservation, 47, 423-428.
Mou, L., Ghamisi, P., Zhu, X.X., 2017. Deep recurrent neural networks for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing, 55, 3639-3655.
Mukhametzyanov, I., Pamucar, D., 2018. A sensitivity analysis in MCDM problems: A statistical approach. Decis. Mak. Appl. Manag. Eng, 1, 1-20.
Niu, X.-X., Suen, C.Y., 2012. A novel hybrid CNN–SVM classifier for recognizing handwritten digits. Pattern Recognition, 45, 1318-1325.
Nourani, V., Alami, M.T., Vousoughi, F.D., 2015. Wavelet-entropy data pre-processing approach for ANN-based groundwater level modeling. Journal of Hydrology, 524, 255-269.
Pamučar, D., Mihajlović, M., Obradović, R., Atanasković, P., 2017. Novel approach to group multi-criteria decision making based on interval rough numbers: Hybrid DEMATEL-ANP-MAIRCA model. Expert Systems with Applications, 88, 58-80.
Pamučar, D., Petrović, I., Ćirović, G., 2018a. Modification of the Best–Worst and MABAC methods: A novel approach based on interval-valued fuzzy-rough numbers. Expert Systems with Applications, 91, 89-106.
Pamučar, D., Stević, Ž., Sremac, S., 2018b. A New Model for Determining Weight Coefficients of Criteria in MCDM Models: Full Consistency Method (FUCOM). Symmetry, 10, 393.
Paoletti, M., Haut, J., Plaza, J., Plaza, A., 2018. A new deep convolutional neural network for fast hyperspectral image classification. ISPRS journal of photogrammetry and remote sensing, 145, 120-147.
Pontius Jr, R.G., Millones, M., 2011. Death to Kappa: birth of quantity disagreement and allocation disagreement for accuracy assessment. International Journal of Remote Sensing, 32, 4407-4429.
Popovic, M., Kuzmanović, M., Savić, G., 2018. A comparative empirical study of Analytic Hierarchy Process and Conjoint analysis: Literature review. Decision Making: Applications in Management and Engineering, 1, 153-163.
Qazi, K.I., Lam, H.K., Xiao, B., Ouyang, G., Yin, X., 2016. Classification of epilepsy using computational intelligence techniques. CAAI Transactions on Intelligence Technology, 1, 137-149.
Rahmati, O., Pourghasemi, H.R., Zeinivand, H., 2016. Flood susceptibility mapping using frequency ratio and weights-of-evidence models in the Golastan Province, Iran. Geocarto International, 31, 42-70.
Rahmati, O., Zeinivand, H., Besharat, M., 2015. Flood hazard zoning in Yasooj region, Iran, using GIS and multi-criteria decision analysis. Geomatics Natural Hazards & Risk.
Ranjbar, S., Hooshyar, M., Singh, A., Wang, D., 2018. Quantifying climatic controls on river network branching structure across scales. Water Resources Research, 54, 7347-7360.
Rizeei, H.M., Pradhan, B., Saharkhiz, M.A., 2018. An integrated fluvial and flash pluvial model using

807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850

2D high-resolution sub-grid and particle swarm optimization-based random forest approaches in GIS. Complex & Intelligent Systems, 1-20. Salamon, J., Bello, J.P., 2017. Deep convolutional neural networks and data augmentation for environmental sound classification. IEEE Signal Processing Letters, 24, 279-283. Santos, P.P., Reis, E., Pereira, S., Santos, M., 2019a. A flood susceptibility model at the national scale based on multicriteria analysis. Science of The Total Environment, 667, 325-337. Santos, P.P., Reis, E., Pereira, S., Santos, M., 2019b. A national scale flood susceptibility model based on multicriteria analysis. Science of The Total Environment. Schmidhuber, J., 2015. Deep learning in neural networks: An overview. Neural networks, 61, 85-117. Shafizadeh-Moghadam, H., Valavi, R., Shahabi, H., Chapi, K., Shirzadi, A., 2018. Novel forecasting approaches using combination of machine learning and statistical models for flood susceptibility mapping. Journal of environmental management, 217, 1-11. Shin, H.-C. et al., 2016. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE transactions on medical imaging, 35, 1285-1298. Simard, P.Y., Steinkraus, D., Platt, J.C., 2003. Best practices for convolutional neural networks applied to visual document analysis, null. IEEE, pp. 958. Singh, V., 1997. The use of entropy in hydrology and water resources. Hydrological processes, 11, 587-626. Sowmya, K., John, C., Shrivasthava, N., 2015. Urban flood vulnerability zoning of Cochin City, southwest coast of India, using remote sensing and GIS. Natural Hazards, 75, 1271-1286. Sutskever, I., Vinyals, O., Le, Q.V., 2014. Sequence to sequence learning with neural networks, Advances in neural information processing systems, pp. 3104-3112. Tehrany, M.S., Jones, S., Shabani, F., 2019. Identifying the essential flood conditioning factors for flood prone area mapping using machine learning techniques. 
CATENA, 175, 174-192. Tehrany, M.S., Lee, M.J., Pradhan, B., Jebur, M.N., Lee, S., 2014a. Flood susceptibility mapping using integrated bivariate and multivariate statistical models. Environmental Earth Sciences, 72, 4001-4015. Tehrany, M.S., Pradhan, B., Jebur, M.N., 2013. Spatial prediction of flood susceptible areas using rule based decision tree (DT) and a novel ensemble bivariate and multivariate statistical models in GIS. Journal of Hydrology, 504, 69-79. Tehrany, M.S., Pradhan, B., Jebur, M.N., 2014b. Flood susceptibility mapping using a novel ensemble weights-of-evidence and support vector machine models in GIS. Journal of Hydrology, 512, 332-343. Tehrany, M.S., Pradhan, B., Jebur, M.N., 2015a. Flood susceptibility analysis and its verification using a novel ensemble support vector machine and frequency ratio method. Stochastic Environmental Research and Risk Assessment, 29, 1149-1165. Tehrany, M.S., Pradhan, B., Mansor, S., Ahmad, N., 2015b. Flood susceptibility assessment using GIS-based support vector machine model with different kernel types. Catena, 125, 91-101. Termeh, S.V.R., Kornejady, A., Pourghasemi, H.R., Keesstra, S., 2018. Flood susceptibility mapping using novel ensembles of adaptive neuro fuzzy inference system and metaheuristic algorithms. Science of the Total Environment, 615, 438-451. Tran, D., Bourdev, L., Fergus, R., Torresani, L., Paluri, M., 2015. Learning spatiotemporal features with 3d convolutional networks, Proceedings of the IEEE international conference on 43

851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891

computer vision, pp. 4489-4497. Vapnik, V.N., 1999. An overview of statistical learning theory. IEEE transactions on neural networks, 10, 988-999. Wang, H., Yang, B., Chen, W., 2016. Unknown Constrained Mechanisms Operation based on Dynamic Interactive Control. Caai Transactions on Intelligence Technology, 1. Wang, Y., Fang, Z., Hong, H., 2019a. Comparison of convolutional neural networks for landslide susceptibility mapping in Yanshan County, China. Science of The Total Environment. Wang, Y. et al., 2019b. A Hybrid GIS Multi-Criteria Decision-Making Method for Flood Susceptibility Mapping at Shangyou, China. Remote Sensing, 11, 62. Wang, Y. et al., 2019c. Flood susceptibility mapping in Dingnan County (China) using adaptive neuro-fuzzy inference system with biogeography based optimization and imperialistic competitive algorithm. Journal of environmental management, 247, 712-729. Wilcoxon, F., Katti, S., Wilcox, R.A., 1970. Critical values and probability levels for the Wilcoxon rank sum test and the Wilcoxon signed rank test. Selected tables in mathematical statistics, 1, 171-259. Xia, X., Liang, Q., Ming, X., Hou, J., 2017. An efficient and stable hydrodynamic model with novel source term discretization schemes for overland flow and flood simulations. Water Resources Research, 53, 3730-3759. Xu, X., Law, R., Chen, W., Tang, L., 2016. Forecasting tourism demand by extracting fuzzy Takagi– Sugeno rules from trained SVMs. Caai Transactions on Intelligence Technology, 1, 30-42. Xue, D.-X., Zhang, R., Feng, H., Wang, Y.-L., 2016. CNN-SVM for microvascular morphological type recognition with data augmentation. Journal of medical and biological engineering, 36, 755-764. Youssef, A.M., Pradhan, B., Jebur, M.N., El-Harbi, H.M., 2015a. Landslide susceptibility mapping using ensemble bivariate and multivariate statistical models in Fayfa area, Saudi Arabia. Environmental Earth Sciences, 73, 3745-3761. Youssef, A.M., Pradhan, B., Pourghasemi, H.R., Abdullahi, S., 2015b. 
Landslide susceptibility assessment at Wadi Jawrah Basin, Jizan region, Saudi Arabia using two bivariate models in GIS. Geosciences Journal, 19, 449-469. Yu, S., Jia, S., Xu, C., 2017. Convolutional neural networks for hyperspectral image classification. Neurocomputing, 219, 88-98. Zazo, S. et al., 2018. Flood Hazard Assessment Supported by Reduced Cost Aerial Precision Photogrammetry. Remote Sensing, 10, 1566. Zevenbergen, L.W., Thorne, C.R., 1987. Quantitative analysis of land surface topography. Earth surface processes and landforms, 12, 47-56. Zhang, C. et al., 2018. An object-based convolutional neural network (OCNN) for urban land use classification. Remote sensing of environment, 216, 57-70. Zhao, G., Pang, B., Xu, Z., Peng, D., Xu, L., 2019. Assessment of urban flood susceptibility using semi-supervised machine learning model. Science of The Total Environment, 659, 940-949. Zhu, G.N., Hu, J., Qi, J., Gu, C.C., Peng, Y.H., 2015. An integrated AHP and VIKOR for design concept evaluation based on rough number. Advanced Engineering Informatics, 29, 408-418.


Abstract

Floods are among the most destructive natural disasters in the world and seriously threaten human life and property. In this paper, the widely used convolutional neural network (CNN) is introduced to assess flood susceptibility in Shangyou County, China. The main contributions of this study are summarized as follows. First, the CNN technique is used for flood susceptibility mapping (FSM) through two different frameworks: CNN classification and CNN feature extraction. Second, three data presentation methods are designed in the CNN architecture to fit the two proposed frameworks. To construct the proposed CNN-based methods, 13 flood triggering factors related to historical flood events in the study area were prepared. The performance of these CNN-based methods was evaluated using several objective criteria in comparison to the classic support vector machine (SVM) classifier. Experimental results demonstrate that all the CNN-based methods can help produce more reliable and practical flood susceptibility maps. For example, in terms of area under the curve (AUC), the proposed CNN-based classifiers scored 0.022–0.054 higher than SVM. In addition, in the classification process, CNN-based feature extraction can effectively improve the prediction capability of SVM by 0.021–0.051 in terms of AUC. Therefore, the proposed CNN frameworks can help mitigate and manage floods.

Keywords: Flood susceptibility mapping; convolutional neural network; classification; feature extraction; China.

Credit Author Statement

Yi Wang: Conceptualization; Formal analysis; Funding acquisition; Supervision; Writing - original draft

Zhice Fang: Data curation; Methodology; Validation; Visualization

Haoyuan Hong: Investigation; Resources; Writing - review & editing; Funding acquisition

Ling Peng: Funding acquisition; Writing - review & editing

Declaration of interests

☒ The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

☐ The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:

Highlights

CNNs are considered for dealing with the flood susceptibility mapping task.

Two different CNN frameworks of classification and feature extraction are presented.

Three data presentation forms are designed for the proposed CNN frameworks.

Reliable flood susceptibility maps can be obtained by using the proposed CNNs.

Prediction performance of SVM can be improved using the CNNs for feature extraction.