Comparing the prediction performance of a Deep Learning Neural Network model with conventional machine learning models in landslide susceptibility assessment


Catena 188 (2020) 104426






Dieu Tien Bui a, Paraskevas Tsangaratos b,c, Viet-Tien Nguyen d,e, Ngo Van Liem f, Phan Trong Trinh d

a Institute of Research and Development, Duy Tan University, Da Nang 550000, Viet Nam
b Geographic Information Science Research Group, Ton Duc Thang University, Ho Chi Minh City, Viet Nam
c Faculty of Environment and Labour Safety, Ton Duc Thang University, Ho Chi Minh City, Viet Nam
d Institute of Geological Sciences, Vietnam Academy of Science and Technology, 84 Chua Lang, Dong da, Hanoi, Viet Nam
e Graduate University of Science and Technology, Vietnam Academy of Science and Technology, 18 Hoang Quoc Viet, Cau Giay, Ha Noi, Viet Nam
f Faculty of Geography, VNU University of Science, 334 Nguyen Trai, Thanh Xuan, Hanoi, Viet Nam

ARTICLE INFO

Keywords: Deep Learning Neural Networks; Landslide susceptibility; GIS; Viet Nam

ABSTRACT

The main objective of the current study was to introduce a Deep Learning Neural Network (DLNN) model in landslide susceptibility assessments and compare its predictive performance with state-of-the-art machine learning models. The efficiency of the DLNN model was estimated for the Kon Tum Province, Viet Nam, an area characterized by the presence of landslide phenomena. Nine landslide-related variables (elevation, slope angle, aspect, land use, normalized difference vegetation index, soil type, distance to faults, distance to geology boundaries and lithology cover) and 1,657 landslide locations were used to produce the training and validation datasets for the landslide susceptibility assessment. The Frequency Ratio method was used to estimate the relation between the landslide-related variables and the presence of landslides, assigning a weight value to each variable class. The predictive ability of the landslide-related variables was evaluated using the Information Gain Ratio method; since all variables appeared highly predictive, all were retained for further processing. The learning ability of the DLNN model was evaluated and compared with a Multi-Layer Perceptron Neural Network, a Support Vector Machine, a C4.5 Decision Tree model and a Random Forest model using the training dataset, whereas the predictive performance of each model was evaluated and compared using the validation dataset. To evaluate the learning and predictive capacity of each model, the classification accuracy, the sensitivity, the specificity and the area under the success and predictive rate curves (AUC) were calculated. Results showed that the proposed DLNN model had a higher performance than the four benchmark models. Although DLNN has seldom been used in landslide susceptibility assessments, the study highlights that the deep learning approach could be considered a satisfactory alternative for landslide susceptibility mapping.

1. Introduction

Landslides are among the most disastrous natural hazards and have severe adverse effects on the physical and human environment. During the last two decades, one of the most studied topics in landslide assessment has been the identification of landslide-prone areas through an understanding of the mechanisms responsible for the evolution of landslides (Tien Bui et al., 2017; Pourghasemi et al., 2018). Such efforts are essential for assessing landslide risk, since they provide highly valued knowledge extracted from the analysis of the geomorphological, geological, tectonic, climate, vegetation and anthropogenic settings and, in general, assist decision-making and land management practices (Gorsevski et al., 2006; Fell et al., 2008; Hervás and Bobrowsky, 2009). As pointed out by many researchers, the complexity of the mainly natural, but also man-made, processes considered responsible for the evolution of landslides makes it difficult to predict the spatial and temporal occurrence of landslides (Guzzetti et al., 1999; Pourghasemi et al., 2018; Segoni et al., 2018). In such

Corresponding author at: Geographic Information Science Research Group, Ton Duc Thang University, Ho Chi Minh City, Viet Nam. E-mail addresses: [email protected] (D.T. Bui), [email protected] (P. Tsangaratos).

https://doi.org/10.1016/j.catena.2019.104426 Received 22 July 2019; Received in revised form 9 December 2019; Accepted 17 December 2019 0341-8162/ © 2019 Published by Elsevier B.V.


layers of non-linear information in order to model complex relations among data, most commonly using a multi-layer neural network (MLPNN) (Deng and Yu, 2014). Deep Learning Neural Networks (DLNN) have a different topology from single-hidden-layer neural networks (shl-NN), as they contain more than one hidden layer. DLNN are used for supervised or unsupervised feature extraction, transformation, pattern recognition and classification tasks (Schmidhuber, 2015). A review of the relevant literature shows that DLNN have seldom been utilized in landslide susceptibility assessments. Liu and Wu (2016) developed a deep learning based landslide recognition method, which involved a deep autoencoder network with multiple hidden layers. The network learned the high-level features and representations of a database of optical remote sensing images, which contained 150 landslide and 150 non-landslide images, and classified the images as either landslide or non-landslide areas. The experimental results indicated that the proposed method outperformed SVM and ANN in efficiency and accuracy. Ding et al. (2016) evaluated a Convolutional Neural Network (CNN) for landslide detection from GF-1 images with four spectral bands and 8 m spatial resolution for Shenzhen, China. Their automated landslide detection method achieved a landslide detection rate of 72.5%, a false positive rate of 10.2%, and an overall accuracy of 67%. Similarly, Yu et al. (2017) developed a CNN model along with an improved region growing algorithm (RSG_R) for landslide detection. They trained their CNN model on a set of landslide images and extracted the area and the boundary of the landslides with the RSG_R algorithm, achieving high accuracy in identifying landslide characteristics. Chen et al. (2018) introduced an automated landslide detection approach based on pre- and post-event remote sensing images.
The authors used a Deep Convolutional Neural Network (DCNN) to identify zones in which significant alterations had taken place, whereas the Spatial Temporal Context Learning (STCL) method was applied to locate landslides. The authors reported satisfactory feasibility and accuracy for the DCNN-STCL approach. Ghorbanzadeh et al. (2019) applied a CNN method for landslide detection using optical satellite imagery derived from the RapidEye sensor. The authors compared the results obtained from the CNN method with those from three state-of-the-art machine learning methods (ANN, SVM and RF). By using the spectral information of the RapidEye images separately and together with topographic factors, they estimated the performance of each method and the impact of the spectral and topographic factors on the landslide detection process. The areas identified as having landslide occurrences were then validated using common remote sensing and GIS validation metrics and the mean intersection-over-union (mIOU) validation method from computer vision. Apart from the previously mentioned studies, Wang et al. (2019) presented a CNN for landslide susceptibility mapping. Sixteen influencing factors associated with landslide occurrence, covering an area located in Yanshan County, China, were converted to a series of sub-images to fit the CNN architecture. As reported by the authors, the CNN model outperformed an optimized Support Vector Machine (SVM) model in terms of prediction performance. Our approach differs from the above-mentioned studies in two ways. First, it uses DLNN models for mapping landslide susceptibility rather than for detecting landslide areas. Second, the input data do not need to be converted to a series of sub-images to fit the DLNN architecture, a process necessary for the implementation of other deep learning architectures such as CNN models.
In our case, landslide assessment is treated as a classification problem, and the main aim of the current study was to develop a landslide susceptibility model based on a DLNN. The novel DLNN model was evaluated in the Kon Tum province, which is located in the north of the Western Highland of Viet Nam. The outcomes of the study were compared with the results obtained by four different methods, namely Multi-Layer Perceptron Neural Network (MLP-NN), SVM, C4.5 Decision Tree and Random

context, landslide susceptibility assessments are utilized as the main investigation tool to identify potential relations between landslides and landslide-related variables, and also to zone landslide-prone areas more efficiently (Fell et al., 2008; Tien Bui et al., 2016; Chen et al., 2019). Taking advantage of the evolution of computer technology, both in software and hardware, and of data availability, numerous methods have been developed for the production of landslide susceptibility maps (Pourghasemi et al., 2013; Goetz et al., 2015; Hong et al., 2017; Chen et al., 2018; Dou et al., 2019; Pham et al., 2018a; Tsangaratos et al., 2018). Most of the methods operate within a geographic information system (GIS) framework and are based either on expert knowledge (knowledge-driven methods) or on the analysis of available spatial and temporal data (data-driven methods) (Gokceoglu and Sezer, 2009; Pourghasemi et al., 2018; Reichenbach et al., 2018). In knowledge-driven methods, experts provide the weighting scheme that expresses the influence of landslide-related variables on the occurrence of landslides, while data-driven methods analyze, within a mathematical framework, the relation between past landslide events and a set of landslide-related variables. According to Pradhan (2013), data-driven approaches are considered more efficient than knowledge-based methods for predicting landslide occurrence. Pourghasemi et al. (2018) conducted a statistical review of articles on landslide susceptibility around the world published during 2005–2016 and reported that the four most popular methods for landslide susceptibility mapping are data-driven methods.
Specifically, Logistic Regression (LR) ranked first, followed by the Frequency Ratio (FR), the Weight of Evidence (WofE) and the Artificial Neural Network (ANN), while the Analytic Hierarchy Process (AHP), which is a knowledge-driven method, ranked fifth. In a similar report, Reichenbach et al. (2018) critically reviewed the data-driven landslide susceptibility assessment literature from 1983 to 2016 and also found that the most common methods included LR, ANN, data-overlay, index-based and WofE. They also pointed out that in recent years there has been an increasing preference for machine learning methods. However, the choice of method depends mainly on the accuracy and quantity of the data, the scale of analysis and the spatial product to be formed as a result of the analysis (Hong et al., 2017; Tien Bui et al., 2019). Increasing applications of data mining and machine learning algorithms have been reported in landslide susceptibility assessments, involving fuzzy logic algorithms, artificial neural networks (ANN) and evolutionary population-based algorithms (Aghdam et al., 2017; Ballabio and Sterlacchini, 2012; Chen et al., 2019; Pham et al., 2018; Pourghasemi et al., 2012; Tsangaratos and Ilia, 2016a, 2016b). As reported in previous studies, machine learning models in most cases outperform conventional methods, as they are capable of handling non-linear data with different scales and from different types of sources (Pourghasemi and Kerle, 2016; Hong et al., 2018b; Pham et al., 2018a; Tien Bui et al., 2018a, b; Jaafari et al., 2019). Also, it has been well established that the integration of conventional statistical methods and machine learning methods in most cases performs better than individual machine learning techniques in susceptibility assessments (Ilia et al., 2018; Pham et al., 2018c; Chen et al., 2019b; Pham et al., 2019).
In particular, ANNs have been accepted as an effective and powerful method for landslide modeling, due to their ability to handle data without depending on the measurement scale or on the way the data are arranged (Yesilnacar and Topal, 2005; Pradhan et al., 2010; Yilmaz, 2010; Tsangaratos and Benardos, 2014; Tien Bui et al., 2016). They are heuristic algorithms that can learn from experience using known samples so as to classify new unseen data (Melchiorre et al., 2008; Pradhan and Lee, 2010). Also, the parallel distribution of information that can be achieved by implementing an ANN model allows the assessment of complex and non-linear problems and interrelated processes (Zare et al., 2013). Deep learning algorithms are defined as a special case of machine learning algorithms, which exploit multiple


Fig. 1. Location of the study area.

valleys and the wide flat terrain in the south, whereas the third zone covers the highlands located in the northeastern part of the area, with elevations between 1,100 and 1,300 m. The Kon Tum province is located within the Central Highland, where the total annual rainfall ranges from 1,700 mm to above 3,000 mm. In total, 90.9% of the annual precipitation falls during the rainy season, which lasts from April to November. Two subzones can be identified concerning the spatial distribution of the precipitation. The first covers the northern and northeastern part of the area, with the rainy season lasting from May to November and the annual rainfall ranging from 2,000 mm to above 3,000 mm. The second subzone is located in the center and the south of the province, with the rainy season lasting from April to October and the annual rainfall ranging between 1,700 and 2,200 mm. The morphology of the Kon Tum area is shaped by the presence of three large river systems that cross the study area: the Dak Bla river, the KrongPoKo river and the Sa Thay river. Two distinctive seasons can be identified in the flow patterns of the three river systems: a long-lasting flood season from May to January, with maximum levels recorded from July to October, and a short dry season. Concerning the geological setting, the study area belongs to the Kon Tum block, one of the three uplift blocks (Kon Tum, Dak Lac and Lam Dong) located on the eastern side of the Indosinian orogeny. According to the Geological and Mineral Resource Map of Vietnam (scale of 1:200,000), the study area is covered by 31 geological formations and complexes dating from Archean to Cenozoic, with 17 of them covering 95% of the total area (Fig. 3i) (Table 1). The prevailing geological formations are metamorphic rocks, followed by intrusive igneous rocks, extrusive igneous rocks and sedimentary rocks, with varying chemical composition and physical and mechanical properties.
Almost 90% of the total number of landslides in the wider area are recorded on the following geological formations and complexes. The Song Re, Tac Po, Kham Duc and Dai Nga geological formations mainly consist of biotite gneiss, biotite-hornblende gneiss, quartz-biotite schist, two-mica quartz schist, tholeiite basalt and olivine basalt, whereas the Tu Mo Rong, Dien Binh, Ben Giang-Que Son, Hai Van and Ba Na complexes include granitogneiss, diorite, granodiorite, granite

Forest (RF), as benchmark machine learning methods.

2. The study area and data used

2.1. Description of the study area

The Kon Tum province is located in the western part of the Truong Son mountain range, between longitudes 107° 20′ E and 108° 32′ E and latitudes 13° 55′ N and 15° 26′ N. The area covers approximately 6,850.63 km² and is subdivided into nine administrative district units (Fig. 1). The Kon Tum province extends to the north of the Western Highland. The region has been largely affected by natural hazards such as floods, droughts and landslides partly induced by climate change and human activities (Hens et al., 2018; Minh et al., 2018), Quaternary volcanism (Anh et al., 2017; Dung and Minh, 2017; Phuc et al., 2018) and also earthquakes related to the presence of large hydropower reservoirs (Duan et al., 2015; Phuong et al., 2016; Phuong and Nam, 2018). The Kon Tum massif consists of high-temperature metamorphic rocks, known as the Kan Nack and Ngoc Linh formations. The geological complex of Kan Nack is identified as Archean on the basis of correlation with typical Archean granulite formations. Surrounding the Archean nucleus are magmatic metamorphic complexes of Paleozoic age such as granitoid Ta Vi, gran Lai granitogneiss, granite granitogneiss of Dai Ky, gabbro Phu My, including late magmatic forms of Paleozoic river enderbite-charnockite, Van Canh granite-granosyenite and Triassic dike complexes, etc. (Hung et al., 2019). Recent studies report that the metamorphic activity in the Kon Tum Terrane occurred during different tectonic periods, forming overlapping metamorphic complexes. Intensive magnesium metamorphism occurred in the Permian-Late Triassic period and the Early Ordovician removing the older basement following the collision processes and the later array, leading to convergence of the Indochina and South China blocks at the end of the Late Permian - Early Triassic period (Hung et al., 2019). The study area is characterized by three main relief zones.
The first zone, which covers about 40% of the area, includes the high mountains, which extend around 200 km from NNW to SSE, and the low mountains extending from N to S. The second zone covers the narrow


Fig. 2. Photos of some landslides in the study area (these photos were taken by Viet-Tien Nguyen and Ngo Van Liem in May 2017).

been conducted within the wider area, nine landslide-related variables were introduced in the current study as relevant to and influencing landslide phenomena (Pham et al., 2018d). Specifically, the nine landslide-related variables were elevation, slope, aspect, land use, normalized difference vegetation index (NDVI), soil type, distance to faults, distance to geological boundaries and lithology. According to several researchers, the predictive performance of a model is influenced by the classification process used for classifying and weighting the landslide-related variables (Tsangaratos and Ilia, 2016a; Chen et al., 2018). In the current study, the variables were classified following guidelines and suggestions that are commonly used in landslide susceptibility assessments (Guzzetti et al., 1999; Pham et al., 2018d). Based on national topographic maps at a scale of 1:50,000 (Ministry of Natural Resources and Environment of Vietnam, MONRE), a digital elevation model (DEM) was constructed with a resolution of 30 × 30 m (Tien Bui et al., 2016). Elevation, slope and aspect layers were produced from the DEM by applying spatial functions executed within ArcMap (ESRI, 2012). Fig. 3a–c illustrates the elevation, slope and aspect of the research area. Nine, eight and nine classes, respectively, were used for these layers, defined according to previous studies (Pham et al., 2018d). Fig. 3d illustrates the land use map of the area (source of land use map: No.02/2012/HD-HTSP funded by Ministry of Education and Training of Vietnam), which has been classified into nine classes, whereas the soil type map (Fig. 3f) has been classified into thirteen types based on criteria of homogeneity (source of soil map: Department of Agriculture and Rural Development). NDVI values were obtained from a Landsat 8 OLI file, acquired on 14 April 2015 (available online at https://earthexplorer.usgs.gov),

biotite, gabbrodiorite, biotite-hornblende granodiorite, granite, granosyenite and two-mica granite (Table 1).

2.2. Data used

2.2.1. Historical landslide records

The landslide database included historical records with information concerning the location, type and size of landslides. Landslides were identified by the interpretation of aerial photos, the use of satellite imagery and field observations. A total of 1,657 landslide locations were mapped, covering events that occurred from 2004 to 2015. They are characterized as shallow soil mixed boulder slides (Cruden and Varnes, 1996; Hungr et al., 2018), and four example landslide photos from the study area are shown in Fig. 2. Our fieldwork and analysis of these landslides indicate that torrential rainfall, especially during tropical storms, is the triggering factor. Thus, the activation of these failures was due to soil saturation that reduces the cohesion of the particles. It should be noted that rockslides, rock topples and rockfalls were very few, and they were therefore excluded because of their different sliding mechanisms. The largest landslide covers 361,485 m² and the smallest 118.4 m², with large landslides (> 10,000 m²) accounting for 12% of the total number of landslides. Approximately 73% of landslides are characterized as medium-sized (1,000–10,000 m²), whereas 15% are characterized as small (less than 1,000 m²).

2.2.2. Landslide influencing factors

Taking into account the geo-environmental settings of the research area, the availability of data and previous studies that have


Table 1 Characteristics of the geologic formations and complexes of the Kon Tum area. BG-QS: Ben Giang-Que Son.

No. | Formation or complex | Symbol | Area (%) | Landslide location (%) | Main lithology
1 | Tac Po formation | PR1tp | 30.96 | 55.27 | Biotite gneiss, biotite plagiogneiss, and silica shade
2 | BG-QS complex | γξPZ3bg-qs | 8.17 | 16.50 | Diorite, gabbrodiorite, quartz diorite, pegmatit, granosyenit, and tonalite
3 | Song Re formation | PR1sr | 2.28 | 11.20 | Hornblende-biotite-gneiss and biotite-gneiss
4 | Hai Van complex | γaT3hv | 18.90 | 9.25 | Biotite granite and two-mica granite
5 | Dai Nga formation | βN2dn | 7.11 | 5.41 | Tholeiite basalt and olivine basalt
6 | Mang Yang formation | T2my | 0.75 | 1.48 | Cobblestone, sandstone, silica shade, rhyolite, felsite, and stuff
7 | Kham Duc formation | PR2-3kd | 10.12 | 0.77 | Quartz schist, mica-quartz schists, biotite, sillimanite, and graphite
8 | Quaternary | Q | 2.55 | 0.11 | Cobbles, pebbles, grit, sand, powder, clay
9 | Song Bung formation | T1-2sb | 0.08 | 0.00 | Cobblestone and its stuff, sandstone, siltstone, and gritstone
10 | Nui Ngoc complex | νPZ1nng | 0.27 | 0.00 | Gabro, gabbro pyroxenite, and gabbro anorthosite
11 | Xa Lam Co formation | ARxlc | 1.27 | 0.00 | Plagioclase–biotite–hypersthene schist and biotite-silimanite-granate-cordierite quartz schist
12 | Dak Lo formation | ARdl | 0.75 | 0.00 | Gneis biotit-silimanit-cordierit-granat, marble, calciphyr, and quarzit
13 | Dien Binh complex | γδSdb | 2.95 | 0.00 | Hornblende-biotite granodiorite, granodiorite, tonalite, aplite granite and pegmatite
14 | Hiep Duc complex | δPZ1hd | 0.09 | 0.00 | Dunite, peridotite, and pyroxene
15 | Ba Na formation | γK-ρbn | 1.85 | 0.00 | Biotite granite and two-mica granite
16 | Chu Lai complex | γPR3cl | 0.30 | 0.00 | Granitogneis, granit migmatit, and pegmatit
17 | Phan Rang complex | νξπρpr | 0.06 | 0.00 | Aplite, granite, and porphyry granite
18 | Cheo Reo complex | δPR1cr | 0.02 | 0.00 | Horblendit and pyroxenit
19 | Kon Tum formation | N2kt | 3.22 | 0.00 | Cobblestone, sandstone, and dentonite clay
20 | Ta Vi complex | νPR3tv | 0.08 | 0.00 | Gabbro amphibolite
21 | Mang Xim complex | νξπρmx | 0.04 | 0.00 | Quartz syenite, granosyenit porphyr, porphyry, and syenite
22 | Cha Val complex | νaT3cv | 0.04 | 0.00 | Gabrodiorit, gabronorit, and gabropyroxenit
23 | Nam Nin complex | γδPR3nn | 0.16 | 0.00 | Plagiogranite gneis and tonalite gneiss
24 | Phu My complex | νPR2pm | 0.04 | 0.00 | Gabro amphibolit
25 | Dak Long formation | Є-Sdlg | 5.01 | 0.00 | Quartzite, quartz sericite, shale, and marble
26 | Cu Mong complex | νρcm | 0.02 | 0.00 | Gabbrodiabase and diabase dykes
27 | Deo Ca formation | γδKdc | 1.28 | 0.00 | Granite and hornblende-biotite granite
28 | Van Canh complex | γξT2vc | 0.74 | 0.00 | Granite, biotite granosyenite, aplite granite, and porphyry granite
29 | Tu Mo Rong complex | γPR1tmr | 0.53 | 0.00 | Granite-gneiss, granite migmatite, and pegmatite
30 | Plei Weik complex | δPR3pw | 0.08 | 0.00 | Pyroxenite, peridotite, pyroxenite, and dunite
31 | Dak Rium formation | K2dr | 0.28 | 0.00 | Conglomerate, gritstone, sandstone, and siltstone

reclassified into six classes, taking into account the weathering degree, the clay composition, the mineral constituents and the strength parameters (Fig. 3j) (Table 2) (Tien Bui et al., 2012). Elevation, slope and aspect had the same resolution as the DEM file, whereas the lithology, land use and soil type maps were converted into raster format with the same resolution (30 m × 30 m). The NDVI estimation produced a raster file with a 30 m × 30 m resolution, and the distance to faults and distance to geological boundaries layers were also converted into raster format with a 30 m × 30 m resolution.
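The distance-to-fault classes used in this study (0–200, 200–400, 400–600, 600–800, 800–1,000, 1,000–1,200 and > 1,200 m) amount to a simple binning of each raster cell's distance value. The following is a minimal sketch of that binning, not the ArcMap workflow used by the authors. The function name `distance_class` and the choice to assign a value lying exactly on a break (for example 200 m) to the lower class are assumptions, since the text does not state how boundary values were handled.

```python
from bisect import bisect_left

# Upper bounds (in metres) of the closed distance-to-fault classes listed in the text.
# Distances above the last bound fall into the open-ended "> 1,200 m" class.
FAULT_BREAKS_M = [200, 400, 600, 800, 1000, 1200]


def distance_class(distance_m, breaks=FAULT_BREAKS_M):
    """Return the 1-based class index of a distance value.

    Assumption: a value exactly on a break belongs to the lower class,
    i.e. the classes are [0, 200], (200, 400], ..., (1200, inf).
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    return bisect_left(breaks, distance_m) + 1


# Example: classify a few cell distances (metres).
classes = [distance_class(d) for d in (0, 200, 250, 1200, 1500)]
```

With seven classes in total, class 7 corresponds to cells lying more than 1,200 m from the nearest fault.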

whereas nine classes were determined for the NDVI map (Fig. 3e) based on a frequency analysis of landslides within the research area (Pham et al., 2018d). The soil types that cover an area are considered a significant influencing factor in landslide assessments (Lee and Dan, 2005). The soil type map used in the present study was extracted at a scale of 1:100,000 from a map edited by the Agricultural Planning and Projection of Vietnam and produced in 2005 (Dung et al., 2005). The 26 original soil types identified in the research area were unified into 12 soil categories based on their chemical and physical properties, as presented in Fig. 3f (Jouquet et al., 2008). Fault zones are considered an influencing factor in landslide evolution since they are characterized by the presence of fractures and formations that may be subject to extensive weathering, which directly affects slope instability (Varnes, 1984; Tien Bui et al., 2012). Different classification schemes have been used in similar landslide susceptibility studies in Viet Nam, which in most cases used a 200 m initial buffer zone (Tien Bui et al., 2012, 2016; Duc, 2013). In the present study, fault features were used to construct a fault buffer map with the following categories: 0–200, 200–400, 400–600, 600–800, 800–1,000, 1,000–1,200, and > 1,200 m (Fig. 3g). Geological boundaries are areas containing inhomogeneous geological formations, which are strongly affected by weathering and fracturing processes. Hence, the distance to geological boundaries is considered a factor influencing slope instability (Dou et al., 2015). The geological boundaries were extracted from the geological map at a scale of 1:200,000 (Tran et al., 1997) (Fig. 3h). As many researchers have reported, lithology influences the occurrence of landslides (Yem et al., 2006; Hung et al., 2016).
The lithological sequence of an area reflects the physical and mechanical properties of the geological formations covering the surface. The lithological map was constructed based on the Geological and Mineral Resource Map of Vietnam at the scale of 1:200,000 (Tran et al., 1997) (Fig. 3i) and was

3. Proposed methodology for landslide susceptibility modeling using DLNN

The proposed methodology included the following phases: (a) construct the geospatial database, which involved selecting and classifying the landslide-related variables, identifying landslide and non-landslide locations and weighting the variables; (b) perform a predictive ability analysis, normalize the variables and create the training and validation datasets; (c) implement the DLNN, MLP-NN, SVM, C4.5 and RF models after performing a tuning process; and (d) compare the DLNN with the MLP-NN, SVM, C4.5 and RF models and validate the produced landslide susceptibility map. Fig. 4 shows the processes that were utilized, and further information on each process is presented in the following paragraphs.

3.1. Geospatial database, coding

The first phase involved the identification of the landslide and non-landslide locations. As already mentioned, a total of 1,657 locations were identified and cataloged, covering events of the period 2004–2015. Non-landslide locations were identified by random selection from the landslide-free area that also had gentle topography without height


Fig. 3. Landslide conditioning factors: (a) Elevation; (b) Slope; (c) Aspect; (d) Land use; (e) NDVI; (f) Soil type; (g) Distance to fault; (h) Distance to geological boundary; (i) Geological formations and complexes of the study area; and (j) Lithology.

The next step was to apply the Frequency Ratio (FR) method to assign to each class of each variable a weight coefficient that expresses the probabilistic relation between the variable and the occurrence of landslides (Cao et al., 2016) (Eq. (1)):

differences and sharp changes (Tsangaratos et al., 2017; Pham et al., 2019). As mentioned, the continuous variables were classified into categorical classes, with the class intervals defined following guidelines used in landslide assessments (Tien Bui et al., 2012; Pham et al., 2018b,c).


Table 2 Characteristics of the lithological groups of the study area.

| Group | Description | Area (%) | Landslide locations (%) |
| --- | --- | --- | --- |
| 1 | Quaternary sediments | 3.62 | 0.11 |
| 2 | Sedimentary clastic rocks | 11.16 | 0.00 |
| 3 | Extrusive mafic - ultramafic igneous rocks | 6.49 | 5.39 |
| 4 | Intrusive mafic - ultramafic igneous rocks | 0.60 | 0.00 |
| 5 | Intrusive acid - neutral igneous rocks | 36.01 | 27.23 |
| 6 | Metamorphic rock with rich quartz components | 42.02 | 67.27 |

$$FR_{vi} = \frac{L_{vci} / L_{s}}{A_{ci} / A_{s}} \qquad (1)$$

where $L_{vci}$ is the number of grid cells with a landslide in each class of the i-th variable; $L_{s}$ is the total number of grid cells with landslides; $A_{ci}$ is the number of grid cells in each class of the i-th landslide related variable; and $A_{s}$ is the total number of grid cells covering the whole area. To estimate the FR values, data processing was carried out using spatial functions found within the ArcGIS 10.2 platform (ESRI, 2012). Specifically, after classifying each variable, the percentage of the area occupied by each class was estimated. The percentage of the landslides located in each class was also estimated by utilizing the Statistics and Extraction toolboxes found in the Analysis and Spatial Analyst tools, respectively (ESRI, 2012).

predictive ability, which is based on the estimation of the decrease of Entropy as a measure of importance. A detailed description of IGR for evaluating landslide-related variables can be found in Tien Bui et al. (2016). WEKA ver. 3.8 was used for the calculation of the IGR values (Frank et al., 2016). Afterwards, all classes were normalized in the range 0.01 to 0.99 by applying the min-max method (Hong et al., 2018a) (Eq. (2)):

$$value_{new} = \frac{value - \min(v)}{\max(v) - \min(v)} \times (upper - lower) + lower \qquad (2)$$

where $value_{new}$ is the normalized value, $value$ is the original value, and $upper$ (0.99) and $lower$ (0.01) are the upper and lower boundaries. During this phase, the landslide locations were randomly divided into training and validation datasets. The first dataset was used to train the models and the second to validate them (Chung and Fabbri, 2003). To avoid creating imbalanced datasets, an equal number of non-landslide locations were randomly sampled, as explained in the previous paragraph (Tien Bui et al., 2012). In our case, 70% of the landslide inventory (3,643 pixels) was used as the training dataset, whereas the remaining 30% (1,627 pixels) was used to construct the validation dataset. Similarly, 3,643 non-landslide pixels were included in the training dataset and 1,627 non-landslide pixels were included in the validation dataset.
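As a worked illustration of Eqs. (1) and (2), the sketch below computes the frequency ratio of each lithological group from the percentages listed in Table 2 and then rescales the results to the range 0.01 to 0.99. It assumes that the normalization is applied to the FR weights of a single variable; the function and variable names are illustrative and do not come from the study.

```python
# Area share and landslide share (%) per lithological group, copied from Table 2.
area_pct = [3.62, 11.16, 6.49, 0.60, 36.01, 42.02]
landslide_pct = [0.11, 0.00, 5.39, 0.00, 27.23, 67.27]


def frequency_ratio(landslide_share, area_share):
    """Eq. (1): (L_vci / L_s) / (A_ci / A_s), with both ratios given in percent."""
    return landslide_share / area_share


def min_max(value, v_min, v_max, lower=0.01, upper=0.99):
    """Eq. (2): rescale value from [v_min, v_max] to [lower, upper]."""
    return (value - v_min) / (v_max - v_min) * (upper - lower) + lower


# Frequency ratio of every group, then min-max scaling over the six groups.
fr = [frequency_ratio(ls, ar) for ls, ar in zip(landslide_pct, area_pct)]
fr_norm = [min_max(v, min(fr), max(fr)) for v in fr]

for group, (raw, scaled) in enumerate(zip(fr, fr_norm), start=1):
    print(f"group {group}: FR = {raw:.3f}, normalized = {scaled:.3f}")
```

A frequency ratio above 1 (group 6 here) means that the class hosts a larger share of landslides than its share of the area, while groups without landslides receive the lower bound of 0.01 after scaling.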

3.3. Configuration of the DLNN model

For the development of a DLNN model, the main features that should be set are the number of layers and nodes, which define the depth of the architecture (Singaravel et al., 2018), and the activation and transfer functions.

3.2. Predictive ability analysis, normalizing, training Dataset, and validation Dataset

3.3.1. Layers and processing elements

There is no universal rule concerning the number of hidden layers and nodes a model should use. In most cases, two to five hidden layers are sufficient. The number of processing elements in the hidden layers depends on the characteristics of the dataset, the number of features and the number of training samples. The objective in all cases is to

The first action of the second phase was to evaluate the predictive ability of the selected landslide related variables, since variables that have low predictive ability may add uncertainty and provide less accurate results (Tien Bui et al., 2016). In the current study, Information Gain Ratio (IGR) (Witten et al., 2011) was used to evaluate the

Fig. 4. Flowchart of the developed methodology.


find the simplest model with the highest predictive performance. The topology of the output layer depends on the type of problem. In regression problems the output layer has one node, whereas in classification problems it may contain either one node (which predicts the probability of success) or a number of nodes equal to the number of categories evaluated. Every layer is trained using features that are set by the outcomes of the previous layer. In the current study, the optimal numbers of hidden layers and nodes were estimated by applying a grid search technique.

(Breiman, 2001). Specifically, the learning process involves selecting, at every iteration, the predictive variable and resampling the data with replacement (Youssef et al., 2016; Chen et al., 2017). Through this process, an RF model is better able to avoid overfitting and generally shows better generalization performance. RF’s performance is influenced by two structural parameters, mtry and ntree. The first refers to the number of factors used in each Random Tree, whereas the second refers to the number of Trees that the RF produces (Lagomarsino et al., 2017; Hong et al., 2018a). In our case, the optimal values for mtry and ntree were estimated using a grid search technique (Kavzoglu and Colkesen, 2009). The mtry values ranged between 1 and 9, and the ntree values between 100 and 1,000.
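The resampling with replacement and the aggregation of tree outputs described above can be sketched as follows. This is a minimal illustration and not the WEKA implementation used in the study: the ten stand-in pixels, the five votes and the majority-vote rule are assumptions made for the example.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, as done before growing each tree."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions):
    """Aggregate the class labels predicted by the individual trees."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(42)   # fixed seed, only to make the sketch reproducible
rows = list(range(10))    # stand-in identifiers for ten training pixels
sample = bootstrap_sample(rows, rng)

# Hypothetical votes of five trees for one pixel (1 = landslide, 0 = non-landslide).
votes = [1, 0, 1, 1, 0]
label = majority_vote(votes)
```

Because each tree sees a different bootstrap sample, some pixels are repeated and others are left out, which is what decorrelates the trees before their outputs are combined.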

3.3.2. Activation and transfer function

The second most important feature of a DLNN model is the activation and transfer function. Each processing element is connected to every processing element of the previous layer, and each connection receives a weight. Each processing element aggregates its incoming inputs multiplied by their connection weights, and when the result exceeds a specific threshold the processing element is activated (Goodfellow et al., 2016). The activation functions assist in mapping the non-linear relation between inputs and outputs in the hidden layers (Singaravel et al., 2018). The Rectified Linear Unit (ReLU) is the most commonly used activation function; it was introduced by Hahnloser et al. (2000). According to Krizhevsky et al. (2012), ReLU is widely accepted as one of the several reasons for the remarkable performance of the DLNN models. The ReLU function is defined as $f(x_i) = \max(0, x_i)$, where $x_i$ is the input and $f(x_i)$ is the output. The ReLU function returns zero (0) when the input is negative, while for any positive value it returns that value back. The main advantage of the ReLU function is that it does not activate all the neurons at the same time, making the network more efficient and easier to compute. The transfer function serves the same purpose as the activation function; however, it is implemented to map the non-linearity between the last hidden layer and the output layer. In the case of classification problems, the sigmoid activation function was used.
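The two functions named above can be written in a few lines. The sketch below defines ReLU and the sigmoid and applies them to a single weighted sum, as a processing element would; the inputs, weights and biases are arbitrary illustrative numbers and not parameters of the study's model.

```python
import math


def relu(x):
    """Rectified Linear Unit: 0 for negative inputs, the input itself otherwise."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic sigmoid, mapping any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def processing_element(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# A hidden element with ReLU, followed by an output element with a sigmoid.
hidden = processing_element([0.2, 0.7], [0.5, -1.0], 0.1, relu)
output = processing_element([hidden, 0.3], [1.5, 0.8], -0.2, sigmoid)
```

In this example the hidden element receives a negative weighted sum and therefore stays inactive (output 0), which illustrates the sparse activation that the text credits for ReLU's efficiency.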

3.4.2. Support Vector Machine (SVM)

Numerous landslide susceptibility assessments have employed SVM, reporting its predictive superiority over other statistical and most machine learning methods (Ballabio and Sterlacchini, 2012; Tien Bui et al., 2012). SVMs are machine learning methods based on statistical learning theory and are used for classification and regression analysis (Vapnik, 1998). The main characteristic of SVM models is that, during the learning process, they transform the initial input space into a higher-dimensional feature space so as to estimate an ideal separating hyperplane and assign new, unknown examples to known classes (Kavzoglu and Colkesen, 2009; Abe, 2010). This can be applied to both linearly and non-linearly separable problems, with the latter using various kernel functions to achieve the separation, including linear, polynomial, sigmoid and radial basis functions (Smola and Schölkopf, 2004). In our case, the radial basis kernel function was implemented, the performance of which is controlled by the value of the kernel width (γ) (Tien Bui et al., 2016). The accuracy of the SVM model is also controlled by the regularization parameter (C); both γ and C were tuned using a grid search technique (Kavzoglu and Colkesen, 2009). The C values were searched within the range 1 to 10 and the γ values within the range 0.01 to 1.0. The pair that achieved the highest accuracy on the validation data was taken as the optimal values with which the SVM model was built.
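The grid search over C and γ can be sketched as an exhaustive loop over all candidate pairs that keeps the pair with the highest validation score. The search ranges follow the text, but the grid spacing is an assumption, and `toy_validation_accuracy` is a hypothetical placeholder with an arbitrary peak; in practice it would train an SVM with the given pair and return its accuracy on the validation data.

```python
from itertools import product


def grid_search(score_fn, c_values, gamma_values):
    """Evaluate every (C, gamma) pair and return the best pair and its score."""
    best_params, best_score = None, float("-inf")
    for c, gamma in product(c_values, gamma_values):
        score = score_fn(c, gamma)
        if score > best_score:
            best_params, best_score = (c, gamma), score
    return best_params, best_score


def toy_validation_accuracy(c, gamma):
    """Hypothetical stand-in for 'train an SVM, score it on the validation set'."""
    return 1.0 - 0.01 * abs(c - 4) - 0.1 * abs(gamma - 0.25)


c_grid = range(1, 11)                             # C from 1 to 10
gamma_grid = [0.01, 0.1, 0.25, 0.5, 0.75, 1.0]    # assumed spacing within 0.01 to 1.0
best_params, best_score = grid_search(toy_validation_accuracy, c_grid, gamma_grid)
```

The same loop applies to the other tuned models by swapping the parameter grids, for example mtry and ntree for RF or the number of hidden neurons for the MLP-NN.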

3.3.3. Back-propagation, objective function and optimizer

During the first training pass, the DLNN model randomly assigns a weight to each node connection, is trained on a batch of samples and predicts a final outcome. In the next phase, the DLNN model automatically adjusts the weight of each connection by considering the estimated error of the previous prediction, so as to improve its predictive accuracy in the next training pass. This last process is referred to as backpropagation and is controlled mainly by two aspects: the objective function, which is used to measure the predictive performance, and the optimization algorithm, which assists in finding the optimal structural parameters of the DLNN model. The objective function used in the present study is the mean square error function (loss/cost function). Concerning the optimization algorithm used to adjust the learning rate, the RMSprop technique was used. RMSprop is a gradient-based optimization technique first introduced by Geoffrey Hinton (Hinton et al., 2012). The basic concept of RMSprop is that it estimates the moving average of the squared gradients so as to normalize the gradient itself.
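To make the update rule concrete, the sketch below applies RMSprop to a one-parameter model fitted with a mean square error loss. The hyperparameters (learning rate 0.01, decay 0.9, epsilon 1e-8), the number of steps and the toy data are assumptions chosen for illustration; the values used in the study are not stated in the text.

```python
import math


def mse(w, xs, ys):
    """Mean square error of the linear model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_grad(w, xs, ys):
    """Derivative of the mean square error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def rmsprop(w, xs, ys, lr=0.01, rho=0.9, eps=1e-8, steps=2000):
    """Divide each gradient by the root of a moving average of squared gradients."""
    avg_sq = 0.0
    for _ in range(steps):
        g = mse_grad(w, xs, ys)
        avg_sq = rho * avg_sq + (1 - rho) * g * g
        w -= lr * g / (math.sqrt(avg_sq) + eps)
    return w


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]   # toy data generated by y = 2x
w_fit = rmsprop(0.0, xs, ys)
```

Because the gradient is divided by its own recent magnitude, the step size stays close to the learning rate regardless of how steep the loss surface is, which is the normalization the text refers to.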

3.4.3. C4.5

The C4.5 model is a Decision Tree algorithm widely used in classification problems (Quinlan, 1993; Wu et al., 2008; Yeon et al., 2010). The learning process of C4.5 involves two phases, the growth phase of the Tree and the pruning phase. During the growth phase, the tree is split on the attribute that best classifies a set of samples according to an attribute selection measure. As for the structural parameters, the C4.5 model requires tuning of the confidence threshold and the minimum number of instances per leaf. In our case, both parameters were tuned by a grid search technique.

3.4.4. MLP-NN

The Multi-layer Perceptron Neural Network (MLP-NN) and the radial basis function Neural Network (RBF Neural Nets) are two of the most common Neural Networks used in landslide susceptibility assessments (Pradhan et al., 2010; Zare et al., 2013; Tien Bui et al., 2016; Pham et al., 2017). Basically, the topology of MLP-NN models involves three layers, an input layer, one hidden layer and an output layer (Kavzoglu and Mather, 2003), whereas their performance is controlled by their structure, the activation functions and the way the connection weights among the processing elements are updated (Haykin, 1998). In the current study, the MLP-NN was implemented with the “RSNNS” R package (Bergmeir and Benitez, 2012), whereas the number of hidden processing elements was tuned by utilizing a grid search technique.

3.4. Comparison with benchmark models

The fourth phase involved the implementation of the RF, SVM, C4.5 and MLP-NN models using the same datasets, so as to compare their results with the outcomes of the DLNN model. The following paragraphs provide a brief description of the four benchmark models. It should be noted that these models were developed using the Python WEKA Wrapper (Beckham et al., 2016).

3.4.1. Random Forest (RF)

RF is an ensemble learning method that classifies unknown samples based on the aggregated results of a number of weak classification Trees produced through bootstrapping techniques

3.5. Performance Assessment

The Positive Predictive Value (PPV), the Negative Predictive Value


landslide susceptibility, less than 0.04. Thus, it was considered most appropriate to introduce an extra class that characterizes areas with no susceptibility. Approximately 3% of the entire research area has been characterized by very high susceptibility values, whereas 64.47% of the total number of landslides recorded in the area fall in this zone. The percentage of landslides that fall in the high and very high susceptibility zones reaches nearly 84%, a strong indication of the learning and predictive ability of the DLNN model (Table 4). The south-southwest section of the area is characterized by practically non-susceptible values, indicating the absence of landslides. This section is characterized by low elevation values and gentle slopes. These two morphometric attributes may form a much more stable environment, with rainfall patterns and vegetation less favourable to sliding, whereas the gentle slopes are characterized by higher shear forces that act (Tien Bui et al., 2011; Catani et al., 2013; Nolasco-Javier et al., 2015). The great influence of elevation and slope is confirmed by the results of the Information Gain Ratio index, in which elevation had the highest predictive ability value, followed by slope. Aspect also plays a significant role, as it is related to the sunlight exposure and the presence of winds that control the soil moisture and that appear to be less favourable to sliding (Magliulo et al., 2008). Four benchmark models, MLP-NN, SVM, C4.5 and RF, were applied to the same training and validation datasets in order to evaluate and compare their performance. The MLP-NN, SVM and RF models were tuned with a grid search method in order to estimate the optimal structural parameters.
For MLP-NN, the tuned parameter was the number of neurons in the hidden layer, which was estimated to be 8; for the SVM model, the cost and gamma parameters were tuned to cost = 8 and gamma = 0.645; and for the RF model, the tuned parameter was the number of Trees, which reached a value of 500. The goodness-of-fit and the prediction performance of the five landslide models, expressed by statistical metrics, are shown in Tables 5 and 6, respectively. Concerning the goodness-of-fit, in other words the ability of the model to learn, the RF model showed the highest performance for the classification of landslides, with a sensitivity of 95.98%, followed by the MLP-NN model (93.14%), the C4.5 model (91.73%), the SVM model (90.36%) and the DLNN model (83.25%). For the classification of non-landslides, the highest performance was achieved by the DLNN model, with a specificity of 95.15%, followed by the RF model (91.54%), the C4.5 model (87.20%), the SVM model (86.60%) and the MLP-NN model (82.86%). The RF model had the highest classification accuracy, with a value of 93.65%, followed by the C4.5 model (89.34%), the SVM model (88.39%), the DLNN model (88.29%) and the MLP-NN model (87.30%). A quite different pattern of performance can be observed when evaluating the models on the validation datasets. The DLNN model achieved the highest performance regarding the classification of landslides (95.02%), followed by the MLP-NN model (82.00%), the SVM model (81.01%), the C4.5 model (80.51%) and the RF model (75.12%). On the other hand, the RF model showed the highest performance in correctly classifying the non-landslides (98.65%), followed by the SVM model (95.67%), the MLP-NN model (94.56%), the C4.5 model (89.80%) and the DLNN model (86.86%). However, the DLNN model had the highest classification accuracy, with a value of 90.53%, followed by the MLP-NN model (87.25%), the SVM model (86.94%), the C4.5 model (84.54%) and the RF model (83.13%).
It is obvious that all models achieved a high level of learning and predictive performance, with minor variations that may be related to the different learning approach and constraints of each method. Several researchers emphasize the necessity of investigating landslide susceptibility by employing various techniques and methods and comparing and evaluating their results, since even a small increase in prediction accuracy could affect the resulting landslide susceptibility zones (Pourghasemi et al., 2012a; Tien Bui et al., 2014). This is also the case in our study, where the implementation of a DLNN model may seem a more complex task

Table 3 Predictive ability of conditioning factors.

| No. | Conditioning factor | Predictive ability using Information Gain Ratio |
| --- | --- | --- |
| 1 | Elevation | 0.239 |
| 2 | Slope | 0.160 |
| 3 | Soil | 0.094 |
| 4 | Lithology | 0.074 |
| 5 | Aspect | 0.070 |
| 6 | Landuse | 0.054 |
| 7 | Distance to fault | 0.043 |
| 8 | NDVI | 0.027 |
| 9 | Distance to geological boundary | 0.009 |
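For readers unfamiliar with the measure reported in Table 3, the sketch below computes an Information Gain Ratio in its usual form: the reduction in class entropy obtained by knowing the factor, divided by the entropy of the factor itself. The labels and factor classes are invented toy values and not data from the study, and the code is an illustration rather than the WEKA routine the authors ran.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of categorical labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(feature, target):
    """Information gain of `feature` about `target`, divided by the feature entropy."""
    n = len(target)
    conditional = 0.0
    for value in set(feature):
        subset = [t for f, t in zip(feature, target) if f == value]
        conditional += len(subset) / n * entropy(subset)
    info_gain = entropy(target) - conditional
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0


# Toy example: 1 = landslide pixel, 0 = non-landslide pixel.
landslide = [1, 1, 1, 1, 0, 0, 0, 0]
slope_class = ["steep"] * 4 + ["gentle"] * 4           # separates the classes perfectly
aspect_class = ["N", "S", "N", "S", "N", "S", "N", "S"]  # carries no information

igr_slope = gain_ratio(slope_class, landslide)
igr_aspect = gain_ratio(aspect_class, landslide)
```

A factor that separates landslide from non-landslide pixels perfectly scores 1, and a factor that is unrelated to the labels scores 0, which is why a value of 0.000 is read as null predictive ability.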

(NPV), the Specificity, the Sensitivity and the classification Accuracy were the main statistical metrics estimated to evaluate the learning and predictive ability of the models used in the current study (Shahabi and Hashim, 2015; Tien Bui et al., 2016). In addition, the Receiver Operating Characteristic (ROC) curves and the Area Under the Curve (AUC) values, which are based on sensitivity (true positive rate) and specificity (true negative rate), as well as the success and prediction rate curves, were also estimated (Peterson et al., 2008; Tien Bui et al., 2016).

4. Results and discussion

As proposed by the followed methodology, the predictive ability of the landslide related variables was evaluated in order to exclude variables with no predictive power and thereby improve the performance of the prediction models (Tien Bui et al., 2016). Based on the results shown in Table 3 and the evaluation of the Information Gain Ratio index, elevation had the highest predictive ability value (0.239), followed by slope (0.160), soil type (0.094), lithology (0.074), aspect (0.070), landuse (0.054), distance to fault (0.043), NDVI (0.027) and distance to geological boundary (0.009). Thus, all factors were further processed, as their IGR values were higher than 0.000; a value of 0.000 would indicate null predictive ability and introduce “noise”, decreasing the predictive performance (Tien Bui et al., 2016; Shirzadi et al., 2018; Abedini et al., 2019). The outcomes of the present study concerning the estimation of the predictive ability are in accordance with the findings of similar studies of rainfall-induced landslides in Vietnam (Tien Bui et al., 2014, 2017; Nguyen et al., 2017). Elevation has been reported as a highly significant conditioning factor, followed by slope. Tien Bui et al. (2016) reported that slope appears to have the highest predictive power among the landslide related factors in similar research areas.
The results seem reasonable enough based on the widely accepted notion that morphometric attributes are of significant importance in landslide evolution (Van Den Eeckhaut et al., 2006; Costanzo et al., 2012). The following process involved the implementation of the DLNN model, the structure of which appears in Fig. 5. Based on several trial and error runs, it was decided to create a model using three hidden layers with 64 neurons and two output neurons, one for landslide and one for non-landslide incidence. According to the ROC curve analysis, the model shows high performance on both the training and validation datasets, with AUC values of 0.953 and 0.973, respectively (Fig. 6a, b), whereas Fig. 7 illustrates the success and prediction rate curves. The landslide susceptibility map produced by the DLNN model was classified into six categories (Fig. 8). The most common classification scheme in landslide susceptibility assessments uses a five-level scale: very low, low, moderate, high and very high susceptibility (Fell et al., 2008). However, in the present study a very large portion of the research area has an extremely low percentage of


Fig. 5. Structure of the Deep Learning Neural Network model for landslide susceptibility modeling in this study.

Fig. 7. Success rate curve and prediction rate curve of the Deep Learning Neural Network model.

SVM and NN (DLNN and MLP-NN) models in terms of overall classification accuracy. On the other hand, during the prediction phase, the NN-based models were more accurate than the SVM and the Tree-based models. Comparing the two classification indexes (referring to the learning and predictive phases), the Tree-based models showed the highest variation: their prediction performance was reduced by 7.66% on average compared to their learning performance. Although Tree-based models, and especially RF models, are reported to be less sensitive to overfitting problems and outliers (Merghadi et al., 2018), in our case the NPV value of the RF model (67.18%) indicates a poor predictive performance concerning the identification of an area as a non-landslide area, which also has a significant effect on the overall classification accuracy of the model. As already mentioned, non-landslide areas are randomly selected from the landslide-free space through a subjective evaluation process, which may introduce uncertainties into the data and consequently lead to less accurate models (Kornejady et al., 2017; Marjanović et al., 2019). Therefore, the Tree-based models and the SVM model, which need a well-structured database in order to provide more accurate models, appear to have lower predictive performance than the NN-based models. Furthermore, the NN-based models had the lowest variation between their learning and predictive classification accuracy, showing an increase in prediction performance of 1.10% on average compared to their learning classification accuracy. The most stable model was the MLP-NN, since its learning and predictive classification accuracies were almost identical, whereas the DLNN had the highest increase (2.24%). According to Merghadi et al. (2018), NN-based models have enhanced predictive performance when dealing with non-linear complex phenomena described by multiple related variables, especially

Fig. 6. ROC curve and AUC of the Deep Learning Neural Network (DLNN) model using the training dataset (a) and validation dataset (b).

compared to the usage of more conventional and easy to handle models; however, its slightly higher predictive performance is of significant importance. Based on the results of the current analysis, during the learning phase, the Tree-based models (RF and C4.5) outperformed the


Fig. 8. Landslide susceptibility map for the study area derived from the DLNN model.

Table 4 Description of the six landslide susceptibility classes.

| No. | Susceptibility index range | Landslide susceptibility (%) | Landslide locations (%) | Areas (km2) | Verbal expression |
| --- | --- | --- | --- | --- | --- |
| 1 | 1.00–0.891 | 100–90 | 64.47 | 685.063 | Very high |
| 2 | 0.891–0.715 | 90–80 | 19.07 | 685.063 | High |
| 3 | 0.715–0.475 | 80–70 | 7.54 | 685.063 | Medium |
| 4 | 0.475–0.171 | 70–55 | 4.60 | 1027.595 | Low |
| 5 | 0.171–0.023 | 55–35 | 4.31 | 1370.126 | Very low |
| 6 | 0.023–0.000 | 35–0 | 0.01 | 2397.721 | No |

Table 5 Goodness-of-fit of the DLNN model compared with the other machine learning benchmarks using the training dataset (MLP-NN: 8 hidden neurons; SVM: C = 8, gamma = 0.645; RF: 500 trees).

| No | Evaluation metrics | DLNN | MLP-NN | SVM | C4.5 | RF |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | True positive | 3493 | 2934 | 3131 | 3150 | 3319 |
| 2 | True negative | 2940 | 3427 | 3309 | 3359 | 3504 |
| 3 | False positive | 150 | 709 | 512 | 493 | 324 |
| 4 | False negative | 703 | 216 | 334 | 284 | 139 |
| 5 | PPV (%) | 95.88 | 80.54 | 85.95 | 86.47 | 91.11 |
| 6 | NPV (%) | 80.70 | 94.07 | 90.83 | 92.20 | 96.18 |
| 7 | Sensitivity (%) | 83.25 | 93.14 | 90.36 | 91.73 | 95.98 |
| 8 | Specificity (%) | 95.15 | 82.86 | 86.60 | 87.20 | 91.54 |
| 9 | Classification Accuracy (%) | 88.29 | 87.30 | 88.39 | 89.34 | 93.65 |

when large numbers of observations are available. Furthermore, it has been reported that NNs are capable of modeling complex and non-linear relations, as they easily perform successive non-linear transformations across each layer. The above is in accordance with the findings of the current work. Ghorbanzadeh et al. (2019) reported that deep-learning convolutional neural networks used for detecting landslides do not automatically outperform ANN, SVM and RF, as they strongly depend on the depth of the layers, the input window sizes and the training strategies. Concerning the comparison of the estimated PPV and NPV values, the NN-based models showed a more balanced performance (|PPV – NPV| on average 13.18%), followed by the SVM (19.12%) and

the Tree-based models (22.56%). Similarly, the NN models had the lowest difference between Sensitivity and Specificity. Overall, the DLNN model had the lowest variation in both comparisons, indicating a more balanced performance. This could be justified by the fact that DLNN models are more stable when the number of observations and feature inputs is high. In our case, the number of observations was sufficient for the DLNN model to capture the complex nature of landslide


Table 6 Prediction performance of the DLNN model compared with the other machine learning benchmarks using the validation dataset.

| No | Evaluation metrics | DLNN | MLP-NN | SVM | C4.5 | RF |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | True positive | 1392 | 1553 | 1570 | 1483 | 1612 |
| 2 | True negative | 1554 | 1286 | 1259 | 1268 | 1093 |
| 3 | False positive | 235 | 74 | 57 | 144 | 15 |
| 4 | False negative | 73 | 341 | 368 | 359 | 534 |
| 5 | PPV (%) | 85.56 | 95.45 | 96.50 | 91.15 | 99.08 |
| 6 | NPV (%) | 95.51 | 79.04 | 77.38 | 77.93 | 67.18 |
| 7 | Sensitivity (%) | 95.02 | 82.00 | 81.01 | 80.51 | 75.12 |
| 8 | Specificity (%) | 86.86 | 94.56 | 95.67 | 89.80 | 98.65 |
| 9 | Classification Accuracy (%) | 90.53 | 87.25 | 86.94 | 84.54 | 83.13 |
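The derived rows of Table 6 follow from the four counts in its first rows. As a check, the sketch below applies the standard definitions of the five metrics to the DLNN validation counts; the function name and dictionary keys are illustrative choices.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return PPV, NPV, sensitivity, specificity and accuracy in percent."""
    return {
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "accuracy": 100 * (tp + tn) / (tp + tn + fp + fn),
    }


# DLNN counts on the validation dataset, copied from Table 6.
dlnn_validation = classification_metrics(tp=1392, tn=1554, fp=235, fn=73)

for name, value in dlnn_validation.items():
    print(f"{name}: {value:.2f}%")
```

Rounded to two decimals, the results agree with the DLNN column of Table 6, and the same function can be applied to the counts of the other four models.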

Aghdam, I.N., Pradhan, B., Panahi, M., 2017. Landslide susceptibility assessment using a novel hybrid model of statistical bivariate methods (FR and WOE) and adaptive neuro-fuzzy inference system (ANFIS) at southern Zagros Mountains in Iran. Environ. Earth Sci. 76 (6), 237. Anh, L.D., Hoang, N., Shakirov, R.B., Huong, T.T., 2017. Geochemistry of late miocenepleistocene basalts in the Phu Quy island area (East Vietnam Sea): Implication for mantle source feature and melt generation. Vietnam J. Earth Sci. 39, 270–288. Ballabio, C., Sterlacchini, S., 2012. Support vector machines for landslide susceptibility mapping: the staffora River Basin case study. Italy. Math. Geosci. 44 (1), 47–70. Beckham, C.J., Hall, M.A., Frank, E., 2016. WekaPyScript: classification, regression, and filter schemes for WEKA implemented in Python. J. Open Res. Softw 4, e33. https:// doi.org/10.5334/jors.108. Bergmeir, C., Benitez, J.M., 2012. Neural networks in R using the stuttgart neural network simulator: RSNNS. J. Stat. Softw. 46 (7). https://doi.org/10.18637/jss.v046.i07. Breiman, L., 2001. Random forests. Mach. Learn. 45, 5–32. Cao, C., Wang, Q., Chen, J., Ruan, Y., Zheng, L., Song, S., Niu, C., 2016. Landslide susceptibility mapping in vertical distribution law of precipitation area: Case of the Xulong Hydropower station Reservoir. Southwestern China. Water 8 (7), 270. Catani, F., Lagomarsino, D., Segoni, S., Tofani, V., 2013. Landslide susceptibility estimation by random forests technique: sensitivity and scaling issues. Nat. Hazards Earth Syst. Sci. 13, 2815–2831. https://doi.org/10.5194/nhess-13-2815-2013. Chen, W., Panahi, M., Tsangaratos, P., Shahabi, H., Ilia, I., Panahi, S., Li, S., Jaafari, A., Ahmad, B.B., 2019a. Applying population-based evolutionary algorithms and a neuro-fuzzy system for modeling landslide susceptibility. Catena 172, 212–231. 
Chen, W., Shahabi, H., Zhang, S., Khosravi, K., Shirzadi, A., Chapi, K., Pham, B.T., Zhang, T., Zhang, L., Chai, H., Ma, J., Chen, Y., Wang, X., Li, R., Ahmad, B.B., 2018a. Landslide susceptibility modeling based on gis and novel bagging-based kernel logistic regression. Appl. Sci. 8 (12), 2540. Chen, W., Shirzadi, A., Shahabi, H., Ahmad, B.B., Zhang, S., Hong, H., Zhang, N., 2017. A novel hybrid artificial intelligence approach based on the rotation forest ensemble and naïve Bayes tree classifiers for a landslide susceptibility assessment in Langao County, China. Geomat. Nat. Haz. Risk 8 (2), 1955–1977. Chen, W., Sun, Z., Han, J., 2019b. Landslide susceptibility modeling using integrated ensemble weights of evidence with logistic regression and random forest models. Appl. Sci. 9 (1), 171. Chen, Z., Zhang, Y., Ouyang, C., Zhang, F., Ma, J., 2018b. Automated landslides detection for mountain cities using multi-temporal remote sensing imagery. Sensors 13, 821. https://doi.org/10.3390/s18030821. Chung, C.-J.F., Fabbri, A.G., 2003. Validation of spatial prediction models for landslide hazard mapping. Nat. Haz. 30, 451–472. Cruden, D. M., Varnes, D. J., 1996. Landslides: investigation and mitigation. Chapter 3Landslide types and processes. Transportation research board special report, (247). Costanzo, D., Rotigliano, E., Irigaray, C., Jimenez-Per Alavarez, J.D., Chacon, J., 2012. Factors selection in landslide susceptibility modelling on large scale following the GIS matrix method: application to the river Beiro basin (Spain). Nat. Haz. Earth Syst. Sci. 12, 327–340. Deng, L., Yu, D., 2014. Deep learning: methods and applications found. Trends Signal Process 7 (3–4), 197–387. Ding, A., Zhang, Q., Zhou, X., Dai, B., 2016. Automatic recognition of landslide based on CNN and texture change detection. In: In Proceedings of the Chinese Association of Automation (YAC), Youth Academic Annual Conference, Wuhan, China, 11–13 November 2016. IEEE, pp. 444–448. Duc, D.M., 2013. 
Rainfall-triggered large landslides on 15 December 2005 in Van Canh District, Binh Dinh Province, Vietnam. Landslides 10 (2), 219–230. Dou, J., Bui, D.T., Yunus, A.P., Jia, K., Song, X., Revhaug, I., Xia, H., Zhu, Z., 2015. Optimization of causative factors for landslide susceptibility evaluation using remote sensing and GIS data in parts of Niigata. Japan. PLoS ONE 10, e0133262. Dou, J., Yunus, A.P., Tien Bui, D., Merghadi, A., Sahana, M., Zhu, Z., Chen, C.-W., Khosravi, K., Yang, Y., Pham, B.T., 2019. Assessment of advanced random forest and decision tree algorithms for modeling rainfall-induced landslide susceptibility in the Izu-Oshima Volcanic Island. Japan. Sci. Total Environ. 662, 332–346. Duan, B.V., Giang, H.T., Duong, N.A., Nguyen, P.D., 2015. About factors related to the occurrence of earthquakes in the Song Tranh 2 hydropower area in period 2011–2014. Vietnam J. Earth Sci. 37, 228–240. Dung, T.T., Minh, N.Q., 2017. Eruptive-volcanic-basalt structures in the Truong SaSpratly Islands and adjacent areas from interpreting gravity and magnetic data. Vietnam J. Earth Sci. 39 (1), 1–13. Dung, V.N., Chut, L.Q., Tuyen, T.D., 2005. Supplemental investigating, updating, and constructing soil type maps at the scales 1: 50,000–1: 100,000 for the provinces at the Central Highland (Vietnam). In: The Agricultural Planning and Projection of Vietnam, Ministry of Natural Resources and Environment, Vietnam, Hanoi, pp. 68. Environmental Systems Research Institute (ESRI), 2012. ArcGIS Release 10.1. Redlands, CA. Fell, R., Corominas, J., Bonnard, C., Cascini, L., Leroi, E., Savage, W.Z., 2008. Guidelines for landslide susceptibility, hazard and risk zoning for land-use planning. Eng. Geol. 102, 99–111. Frank, E., Hall, M.A., Witten, I.H., 2016. The WEKA Workbench. Online Appendix for “Data Mining: Practical Machine Learning Tools and Techniques”. Morgan KaufmannFourth Edition. Ghorbanzadeh, O., Blaschke, T., Gholamnia, K., Meena, S.R., Tiede, D., Aryal, J., 2019. 
Evaluation of Different Machine Learning Methods and Deep-Learning Convolutional Neural Networks for Landslide Detection. Remote Sens. 11, 196. Goetz, H.N., Brenning, A., Petschko, H., Leopold, R., 2015. Evaluating machine learning and statistical prediction techniques for landslide susceptibility modeling. Comput. Geosci. 81, 1–11. Gokceoglu, C., Sezer, E.A., 2009. A statistical assessment on international landslide

phenomena. Although the DLNN model could be considered an alternative approach with high predictive accuracy, there are several limitations that should be reported. The most important one, which may influence the decision to use DLNN models in landslide assessments, is the process of tuning the structural parameters. Finding the optimal number of hidden layers and processing elements is a computationally demanding task. The common practice of trial and error also requires experience on the part of the user in order to choose the initial settings. Future work concerning DLNN models could involve the implementation of a methodology for tuning the structural parameters of the DLNN model, as well as the application of the methodology to an area with different geo-environmental settings so as to evaluate the efficiency of the developed DLNN model.

5. Conclusion

The current study introduced a DLNN model as a promising alternative investigation tool for landslide susceptibility assessments. A geospatial database for the Kon Tum Province in Viet Nam was constructed, which involved historical records of landslides and nine landslide related variables, namely: elevation, slope, aspect, landuse, NDVI, soil type, distance to faults, distance to geological boundaries and lithology. Based on the conducted analysis, elevation appeared to have the highest predictive ability, followed by slope and soil type. The DLNN model outperformed the MLP-NN, SVM, C4.5 and RF models, which were used as benchmark methods. Generally, the Neural Network based models had a more stable classification performance compared to the Tree-based models. Despite its excellent performance, the DLNN model is largely influenced by the process of tuning the structural parameters, in terms of both computational cost and operational time.
In conclusion, the landslide susceptibility map generated by the proposed methodology could assist local and government authorities in establishing appropriate landslide mitigation strategies and land-use plans.

Acknowledgement

This research was partially supported by the project TN18/T13, which belongs to the Central Highland Program No.3, Vietnam Academy of Science and Technology, Vietnam. TPT acknowledges the support of the national project “Pliocene-present tectonics in Vietnam islands and continental shelf for assessing geological hazards”, KC.09.22/16-20, and the senior research assistant program (VAST).

References

Abe, S., 2010. Support vector machines for pattern classification, 2nd ed. Springer-Verlag, London Limited, London, UK.
Abedini, M., Ghasemian, B., Shirzadi, A., Tien Bui, D., 2019. A comparative study of support vector machine and logistic model tree classifiers for shallow landslide susceptibility modeling. Environ. Earth Sci. 78 (18). https://doi.org/10.1007/s12665-019-8562-z.




