Landslide susceptibility mapping based on rough set theory and support vector machines: A case of the Three Gorges area, China

Geomorphology 204 (2014) 287–301

Geomorphology journal homepage: www.elsevier.com/locate/geomorph

Ling Peng a,b, Ruiqing Niu a,⁎, Bo Huang c, Xueling Wu a, Yannan Zhao a, Runqing Ye d

a Institute of Geophysics and Geomatics, China University of Geosciences, Wuhan 430074, PR China
b China Institute of Geo-Environment Monitoring, Beijing 100081, PR China
c Department of Geography and Resource Management, The Chinese University of Hong Kong, Hong Kong, China
d Headquarters of Prevention and Control of Geo-Hazards in Area of TGR, Yichang 443000, PR China

Article info

Article history: Received 3 September 2012; received in revised form 28 July 2013; accepted 17 August 2013; available online 24 August 2013

Keywords: Landslide susceptibility; Rough set theory; Support vector machine; GIS; Three Gorges area

Abstract

This paper aims to develop a novel hybrid model for assessing landslide susceptibility at the regional scale using multisource data to produce a landslide susceptibility map of the Zigui–Badong area near the Three Gorges Reservoir, China. This area is subject to anthropogenic influences because the reservoir's water level fluctuates cyclically between 145 and 175 m; in addition, the area suffers from extreme rainfall events due to the local climate. The area has experienced significant and widespread landslide events in recent years. In our study, a novel hybrid model is proposed to produce landslide susceptibility maps using geographical information systems (GIS) and remote sensing. The hybrid model is based on rough set (RS) theory and a support vector machine (SVM). RS theory is employed as an attribute reduction tool to identify the significant environmental parameters of a landslide, and an SVM is used to predict landslide susceptibility. Four data domains were considered in this research: geological, geomorphological, hydrological, and land cover. The original group of 20 environmental parameters and 202 landslides was used as the input to produce a landslide susceptibility map. According to the map, 19.7% of the study area was identified as medium- and high-susceptibility zones encompassing 89.5% of the historical landslides. The results indicate high levels of landslide hazard in and around the main inhabited areas, such as Badong County and other towns, as well as in rural residential areas and transportation areas along the Yangtze River and its tributaries. The predicted map indicates a good correlation between the classified high hazard areas and slope failures confirmed in the field. Furthermore, the quality of the proposed model was comprehensively evaluated, including the degree of model fit, the robustness of the model, the uncertainty associated with the probabilistic estimate, and the model prediction skill.
The proposed model was also compared with the general SVM, which demonstrated that the hybrid model has superior prediction skill and higher reliability and confirmed the usefulness of the proposed model for landslide susceptibility mapping at a regional scale. © 2013 Elsevier B.V. All rights reserved.

⁎ Corresponding author. Tel.: +86 27 67883425; fax: +86 27 67883251. E-mail address: [email protected] (R. Niu).
0169-555X/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.geomorph.2013.08.013

1. Introduction

The area near the Three Gorges Reservoir along the Yangtze River in China is characterized by many small and large active landslides, which pose a serious threat to the socioeconomic stability of the region. More than 3800 landslide locations have been reported in the region (Liu et al., 2009). With significant increases and periodic fluctuations in the water level of the reservoir, the stability of bank slopes is a serious and unavoidable problem. Therefore, landslide prediction is critical for landslide prevention and mitigation in the area. Several landslide susceptibility evaluation studies have been conducted recently for the Three Gorges area. Liu et al. (2004) and Fourniadis et al. (2007) successfully applied Terra ASTER imagery to estimate a number of environmental parameters of landslides in the Three Gorges area. However, the limited data of these studies did not permit a
detailed statistical assessment of landslides; instead, these studies used a semi-quantitative approach involving logical elimination and characterization to produce landslide hazard maps. The objectives of these studies were therefore to develop a methodology to assess landslide hazard using remotely sensed data. Bai et al. (2010) applied a logistic regression model to produce a landslide susceptibility map for the Zhongxian–Shizhu segment in the Three Gorges area. In contrast to conventional statistical models, this model employs only continuous variables, and the landslide density is used to transform nominal variables to numeric variables. Although this methodology is suitable for landslide susceptibility mapping over large areas and the produced results are relatively easy to explain, this model is inherently linear and thus is not appropriate for complex relationships among a large number of environmental factors in complicated landslide systems. Liu et al. (2011) used rough set (RS) theory to clarify the relationship between landslide and environmental factors in the Qinggan River of the Three Gorges area. A total of 86 rules were identified by the attribute reduction and rule extraction processing, but only three rules were significant for landslide
prediction because the available environmental factors were insufficient for predicting landslides. Consequently, most of the rules were approximate and had large roughness values. Landslide susceptibility evaluation is a complex task (Brabb, 1991). Several approaches for landslide susceptibility mapping have been proposed, which can be grouped into two broad categories, qualitative and quantitative (Guzzetti et al., 1999; Fell et al., 2008; Kanungo et al., 2009). Qualitative methods, such as heuristic approaches and direct geomorphologic mapping, were widely used during the late 1970s by engineering geologists and geomorphologists (Castellanos Abella and Van Westen, 2008), whereas quantitative methods, which are based on numerical expressions of the relationship between environmental factors and landslides, have become popular in the last few decades due to the development of computer and geographic information system (GIS) technology (e.g., Carrara et al., 1991; Lee and Min, 2001; Remondo et al., 2003; Ayalew and Yamagishi, 2005; Melchiorre et al., 2008; Bai et al., 2010; Atkinson and Massari, 2011; Choi et al., 2012; Van Den Eeckhaut et al., 2012). Qualitative methods are somewhat subjective and depend on the judgment of experts, whereas quantitative methods are relatively objective. However, landslides are typically complex systems, and predicting landslide susceptibility requires geomorphological, geological, hydrological, and land cover data, as well as other data related to environmental factors ranging in number from several to dozens. There are no universal guidelines for selecting factors that influence landslides in susceptibility mapping (Yalcin, 2008). 
In different study areas and with various predictive models, the extraction of appropriate evaluation factors is complex, particularly when a large number of factors are available, but this key issue has not been considered in previous landslide susceptibility assessments (Van Westen et al., 2006; Wang et al., 2008). Previous studies tended to provide little or no information on the quality of the proposed models (Guzzetti et al., 2006; Frattini et al., 2010), which is unsatisfactory for the end users and a limitation in many studies (Chung and Fabbri, 2003). Therefore, better techniques for identifying the significant environmental factors of landslides are urgently needed.

RS theory is an effective tool for dealing with vague and uncertain information (Pawlak, 1982). Attribute reduction, one of its most important concepts, removes irrelevant and redundant attributes from the decision table without any information loss. RS theory identifies structural relationships in data and is useful for identifying potentially significant facts or data patterns in multidimensional attribute collections (Gorsevski and Jankowski, 2008). RS theory has been used in a number of disciplines of science (Thangavel and Pethalakshmi, 2009), such as remote sensing (Pan et al., 2010), geographic information science (Leung et al., 2007), medicine (Thangavel et al., 2005), and artificial intelligence (Tay and Shen, 2003), in addition to landslide susceptibility mapping (Aldridge, 1999; Gorsevski and Jankowski, 2008).

The support vector machine (SVM), which was proposed by Vapnik (1995), can incorporate learning techniques. SVMs are based on Vapnik–Chervonenkis (VC) dimension theory and statistical learning theory, and classification is one of their most important applications. SVMs have attracted increasing attention in recent years because of their good classification performance and capabilities of fault-tolerance and generalization. Therefore, SVMs have been widely applied to machine learning, data mining, and knowledge discovery (Barakat and Bradley, 2010; Mountrakis et al., 2011), as well as landslide susceptibility assessment (Yao et al., 2008; Marjanović et al., 2011; Ballabio and Sterlacchini, 2012; Xu et al., 2012). Ballabio and Sterlacchini (2012) concluded that the SVM model was feasible and able to outperform other techniques in terms of accuracy and generalization capacity. However, the choice of optimal input environmental factors and optimal model parameters plays a crucial role in constructing a landslide susceptibility model with high accuracy and stability because the choice of factors influences the appropriate kernel parameters and vice versa (Frohlich et al., 2003). The selection of environmental factors is an important issue in building a prediction model, which seeks to identify
the significant factors and eliminates irrelevant factors to construct a good prediction model. Thus, a technique that can reduce cognitive complexity without any prior knowledge using only the information contained within the dataset while preserving the meaning of the original information is strongly desirable. As mentioned above, RS theory can be utilized as such a tool to identify the most important environmental parameters that influence landslide occurrence. To fully utilize such advantages of RS theory and the generalization capacity of the SVM model, a hybrid model of RS and SVM (RS–SVM) is proposed in this study. RS theory is employed as an attribute reduction tool to extract the optimal environmental factors, which are then used as the inputs in an SVM model for predicting landslide susceptibility.

In this paper, a landslide-prone Zigui–Badong segment of the Three Gorges area was selected as a study area. The proposed novel approach was used to map landslide susceptibility; the results are explainable and may also have practical applications. Furthermore, we provided a comprehensive validation of the landslide susceptibility model. The set of tests included the degree of model fit, the robustness of the model, the uncertainty associated with the probabilistic estimate, and the model prediction skill. The performance of the proposed model was compared with that of the general SVM. The results demonstrate that the hybrid model has superior predictive performance.

2. Rough set theory and support vector machines

2.1. Rough set (RS) theory

RS theory was introduced by Pawlak (1982) as a mathematical framework for approximate reasoning that considers uncertainty and vagueness in decision-making processes.
An information system may be expressed as a 4-tuple $S = \langle U, A, V, f \rangle$, where $U$ is a finite set of objects, called the universe, $A$ is a finite set of attributes, $V = \bigcup_{a \in A} V_a$ with $V_a$ being the domain of attribute $a$, and $f : U \times A \to V$ is an information function such that $f(x, a) \in V_a$ for all $a \in A$ and all $x \in U$. Now consider subsets of the attributes. Every $R \subseteq A$ generates an indiscernibility relation on $U$ defined as:

$$\mathrm{IND}(R) = \{(x, y) \in U \times U : \forall a \in R,\ a(x) = a(y)\} \tag{1}$$

where $a(x)$ denotes the value of attribute $a$ of object $x$. If $(x, y) \in \mathrm{IND}(R)$, $x$ and $y$ are said to be indiscernible with respect to $R$; that is, $x$ and $y$ cannot be distinguished by the attributes in $R$. The equivalence classes of the indiscernibility relation $\mathrm{IND}(R)$ are denoted by $[x]_R$. Fig. 1 illustrates the lower and upper approximations: any concept $X \subseteq U$ can be approximated, for an attribute set $R \subseteq A$, by its lower and upper approximations. The lower approximation of $X$ is the set of objects of $U$ that are surely in $X$, defined as:

$$\underline{R}(X) = \{x \in U : [x]_R \subseteq X\}. \tag{2}$$

The upper approximation of $X$ is the set of objects of $U$ that are possibly in $X$, defined as:

$$\overline{R}(X) = \{x \in U : [x]_R \cap X \neq \emptyset\}. \tag{3}$$

The boundary region of $X$ is defined as:

$$\mathrm{Bnd}(X) = \overline{R}(X) - \underline{R}(X). \tag{4}$$

A set is said to be rough if its boundary region is non-empty; otherwise the set is crisp. Reduct and core attribute sets are two fundamental concepts of RS theory. A reduct attribute set is a minimal set of attributes from $A$ that provides the same object classification as the full set of attributes. Given $D$ and $E \subseteq A$, a reduct is a minimal set of attributes such that $\mathrm{IND}(D) = \mathrm{IND}(E)$. Let $\mathrm{RED}(A)$ denote all reducts of $A$. The intersection of all reducts of $A$ is referred to as the core of $A$, i.e., $\mathrm{CORE}(A) = \bigcap \mathrm{RED}(A)$, and the core is common to all reducts.

Fig. 1. Lower and upper approximations of a set in RS theory.
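The definitions in Eqs. (1)–(4), together with the reduct and core, can be illustrated with a minimal Python sketch. The toy table, its class codes, and all function names below are hypothetical and chosen only for illustration; the brute-force reduct search follows the indiscernibility-based definition given above and is not the RSES2 implementation used later in the paper.

```python
from itertools import combinations

def partition(table, attrs):
    """Equivalence classes of IND(attrs): objects with equal values on attrs."""
    classes = {}
    for obj, row in table.items():
        classes.setdefault(tuple(row[a] for a in attrs), set()).add(obj)
    return list(classes.values())

def approximations(table, attrs, concept):
    """Lower approximation, upper approximation and boundary region (Eqs. 2-4)."""
    lower, upper = set(), set()
    for cls in partition(table, attrs):
        if cls <= concept:      # class lies entirely inside the concept
            lower |= cls
        if cls & concept:       # class overlaps the concept
            upper |= cls
    return lower, upper, upper - lower

def reducts(table, attrs):
    """Brute-force search for minimal subsets that keep the partition of attrs."""
    def canonical(subset):
        return sorted(sorted(cls) for cls in partition(table, subset))
    target, found = canonical(attrs), []
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            if any(set(r) <= set(subset) for r in found):
                continue        # contains a smaller reduct, so not minimal
            if canonical(subset) == target:
                found.append(subset)
    return found

# Hypothetical toy table: five objects described by three class-coded attributes.
ATTRS = ["SANG", "LITH", "LANU"]
table = {
    1: {"SANG": 1, "LITH": 1, "LANU": 1},
    2: {"SANG": 1, "LITH": 1, "LANU": 1},
    3: {"SANG": 2, "LITH": 1, "LANU": 2},
    4: {"SANG": 2, "LITH": 2, "LANU": 2},
    5: {"SANG": 3, "LITH": 2, "LANU": 1},
}
landslides = {1, 3}             # hypothetical concept X

lower, upper, boundary = approximations(table, ATTRS, landslides)
all_reducts = reducts(table, ATTRS)
core = set.intersection(*(set(r) for r in all_reducts))
```

In this toy case objects 1 and 2 are indiscernible but only object 1 belongs to the concept, so the boundary region is non-empty and the concept is rough. Two reducts exist, and the only attribute they share forms the core.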



2.2. Support vector machines (SVMs)

SVMs provide a method for creating functions from a set of labeled training data. For classification, SVMs operate by attempting to find a hypersurface in the space of possible inputs that splits the positive examples from the negative ones. The split is chosen to have the largest distance from the hypersurface to the nearest of the positive and negative examples. Intuitively, this approach makes the classification correct for testing data that are near but not identical to the training data.

A set of linearly separable training vectors $x_i$ ($i = 1, 2, \ldots, n$) consists of two classes that are denoted as $y_i = \pm 1$ (landslide or non-landslide). The goal of an SVM is to search for an n-dimensional hyperplane that differentiates the two classes by the maximum gap (Fig. 2A). Mathematically, this hyperplane is found by minimizing

$$\frac{1}{2}\|w\|^2 \tag{5}$$

subject to the following constraints

$$y_i((w \cdot x_i) + b) \geq 1 \tag{6}$$

where $\|w\|$ is the norm of the normal of the hyperplane, $b$ is a scalar base, and "$\cdot$" denotes the scalar product operation. Introducing the Lagrangian multipliers ($\lambda_i$), the cost function can be defined as

$$L = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{n} \lambda_i \left( y_i((w \cdot x_i) + b) - 1 \right). \tag{7}$$

The solution can be achieved by minimizing Eq. (7) with respect to $w$ and $b$ in its dual form through the standard procedures; a detailed discussion on this procedure is provided in Vapnik (1995) and Tax and Duin (1999). For the non-separable case (Fig. 2B), one can modify the constraints by introducing slack variables $\xi_i$:

$$y_i((w \cdot x_i) + b) \geq 1 - \xi_i. \tag{8}$$

Then Eq. (5) can be modified as

$$\frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i \tag{9}$$

where $C$ is the penalty parameter of the error term, which controls the trade-off between maximizing the margin and minimizing the training error; larger values of $C$ correspond to the assignment of a higher penalty to errors (Burges, 1998). In addition, a kernel function $K(x_i, x_j)$ is introduced by Vapnik (1995) to account for the nonlinear decision boundary.

Fig. 2. Illustration of the SVM principle. (A) n-dimensional hyperplane differentiating the two classes by the maximum gap. (B) Non-separable case and the slack variables ξ.

3. Study area

3.1. General characteristics

The Three Gorges lie along the middle reaches of the Yangtze River in the mountains separating the Sichuan and Jianghan Basins. The gorges are thought to have formed by river incision into massive limestone mountains of the Early Palaeozoic to Mesozoic (J1 Jialinjiang Group) in response to episodic tectonic uplift during the Quaternary (Chen et al., 1995; Li et al., 2001). The elevation ranges from 800 to 2000 m, and the terrain consists of a succession of limestone ridges and gorges, with inter-gorge valleys comprised primarily of interbedded mudstones,
shales, and thinly bedded limestones. Landslides tend to occur in failure-prone lithological formations that are concentrated in the inter-gorge valleys (Fourniadis et al., 2007). The study area is located in Hubei Province, which includes Zigui and Badong counties, to the west of Xiling Gorge (Fig. 3). The site lies between latitudes of 30.02°N and 30.93°N and longitudes of 110.30°E and 110.87°E, and covers an area of 396 km². The average annual precipitation is 1100 mm. The rainfall is generally concentrated in spring and summer, and the summer average can be as high as 200–300 mm per month (He et al., 2008).

3.2. Geological setting

The basement of the study area consists of crystalline, pre-Sinian rocks, with a Sinian–Jurassic sedimentary cover (Wu et al., 2001).

Fig. 3. Location map of the study area. (A) Site map of the Three Gorges area of the Yangtze River, China. (B) Image of a three-dimensional view of the study area from the south (a digital image derived from Google Earth). (C) Digital elevation model (DEM) overlaid with landslides; the red hatched area represents landslides for training samples, and the blue hatched area represents landslides for verification.

The Huangling anticline to the northeast of Zigui forms a structure of approximately 73 km in length, oriented mainly NNE–SSW in the area, and its core is composed of pre-Sinian metamorphic and magmatic rocks (Fig. 4). The strength and stability of this anticlinal structure are the principal reasons for the siting of the Three Gorges Dam in its location near the town of Sandouping. There are three main faults and fracture systems in the area. The first is the Xiannushan fault, which is oriented NNW–SSE to the southwest of the Huangling anticline and consists of three parallel shear zones (Chen, 1986). The second is the Jiuwanxi fault, which is oriented NNE–SSW and approaches the Xiannushan fault near Tianjiawan and disappears immediately south of it. The third is the Niukou–Xiangluping fault zone, which crosses the Zigui basin and has an orientation similar to that of the Jiuwanxi fault. In addition, the area to the south of Zigui and Badong is characterized by a system of secondary faults that follow the orientation of the fold system in the area, i.e., ENE–WSW (Wu et al., 1997). These secondary faults and fracture systems tend to form 'weak' zones that are conducive to slope instability (Liu et al., 2004).

3.3. Slope failures

The landslides that have occurred in this region are widely distributed and represent a serious threat to property. Slides are the main type of mass movement in the study region. One of the most significant geomorphological characteristics influencing slope instability in this area is the relationship between strata dip, slope angle, and slope aspect. Areas in which the strata dip toward the slope face tend to have translational slope failures, such as the landslide at Huangtupo in Badong County (Deng et al., 2000). Most of the study area forms a typical middle mountain and gorge valley morphology and has steep slopes and large topographical relief due to the stratum lithology and structure. The distribution of complex landslides is mainly controlled by lithology combinations, structural planes, and climate conditions.

The Xintan landslide, a huge landslide with a volume of about 20 million m³ that occurred on June 12, 1985, is a typical multi-period landslide in the area (Fig. 5A). Two large-scale landslides are described in the historical record; they took place in the years 1029 and 1542 and interrupted traffic through the Yangtze River waterway. The bedrock outcropping at Xintan, successively from east to west, is Silurian shale and sandstone, Devonian quartz-sandstone, and Carboniferous limestone with coal seams. The main causes of the overall reactivation of the Xintan landslide were that the rock on the west and north sides of the cliffs collapsed year by year, the collapse deposits of the active slip regime continued to accumulate, and the downslope force of the debris on the slopes, together with the static and dynamic pressure of the groundwater, gradually met and exceeded the stabilizing force of the slide bed.

After impoundment, the reservoir's water level fluctuates cyclically between 145 and 175 m, and the bank slopes of the Three Gorges Reservoir are subjected to long periods of saturation in water. The hydrogeological conditions, particularly the physical and mechanical parameters of the sliding zone of the landslides under water, vary due to the dynamic action of water, which inevitably affects the stability of bank slopes and also causes translational landslides. The Qianjiangping landslide, which represents a typical landslide in the Three Gorges area, occurred after the first impoundment of the Three Gorges Reservoir in July 2003 (Fig. 5B). The landslide had a tongue-shaped plan, with a total volume of approximately 2.4 × 10⁷ m³. The average slope angle of the slide plane was greater than 20°, and the elevation of the main scarp was approximately 450 m. The landslide moved approximately 250 m in the main sliding direction with a sliding velocity of 16 m s⁻¹ (Xiao et al., 2010). The bedrock in the landslide was feldspathic quartz sandstone, fine sandstone with carbonaceous siltstone, siltstone with mudstone, and silty mudstone of the Early Jurassic Niejiashan formation. This landslide was a reactivated landslide with a dip structure, which was located in the Zigui syncline, and thus, the sliding surface was along a pre-existing structural plane of weakness. The high water level caused by impoundment of the reservoir was the trigger for the landslide occurrence. In addition, intense rainfall and excavation put the slope in a critical state (Wang et al., 2004). In recent years, with the development of the economy and massive urbanization, human activities including the construction of immigration towns, excavation, and destruction of forests have accelerated landsliding and thus negatively impacted the environment and may cause damage to the local community.

Fig. 4. Regional geological and tectonic framework map of the study area.

Fig. 5. Slope failures in the study area. (A) Xintan landslide. (B) Qianjiangping landslide. Photos were provided by the Headquarters of Prevention and Control of Geo-Hazards in Area of TGR.

Fig. 6. Landslides identified on the aerial photographs. A: Fanjiaping landslide; B: Baishuihe landslide; C: Qianjiangping landslide; D: Kaziwan landslide.

Fig. 7. Classification of the bedding structure. α: slope aspect; β: bed dip direction; γ: bed dip angle; δ: slope angle.

Table 1. Values of landslide environmental parameters.

Environmental parameters: Value

Geology
  Bedding structure: 1) over-dip slope; 2) under-dip slope; 3) dip-oblique slope; 4) transverse slope; 5) anaclinal-oblique slope; 6) anaclinal slope
  Lithology: 1) mudstone, shale and Quaternary deposits; 2) sandstones and thinly bedded limestones; 3) limestones and massive sandstones
  Distance from lineaments (km): 1) 0–1.407; 2) 1.407–3.055; 3) 3.055–5.115; 4) 5.115–8.754
Geomorphology
  Elevation (m): 1) 80–330; 2) 330–620; 3) 620–1000; 4) 1000–2000
  Slope angle (°): 1) 0–10; 2) 10–25; 3) 25–36; 4) 36–78
  Slope aspect: 1) N; 2) NE; 3) E; 4) SE; 5) S; 6) SW; 7) W; 8) NW
  Profile curvature (°/100 m): 1) −2.944 – −1.123; 2) −1.123–0.161; 3) 0.161–1.702; 4) 1.702–3.199
  Plane curvature (°/100 m): 1) −2.691 – −1.497; 2) −1.497 – −0.360; 3) −0.360–0.437; 4) 0.437–3.658
  Terrain roughness (m): 1) 0–20; 2) 20–43; 3) 43–72; 4) 72–310
Hydrology
  Distance from drainage (m): 1) 0–1230; 2) 1230–1860; 3) 1860–2740; 4) 2740–4700
  Catchment area (km²): 1) 0.081–7.393; 2) 7.393–27.106; 3) 27.106–54.622; 4) 54.622–104.808
  Catchment slope (°): 1) 0–0.322; 2) 0.322–0.466; 3) 0.466–0.615; 4) 0.615–1.467
  Catchment height (m): 1) 0–93; 2) 93–250; 3) 250–475; 4) 475–1082
  TWI: 1) 4.442–10.202; 2) 10.202–14.328; 3) 14.328–19.658; 4) 19.658–26.363
  NDMI: 1) −0.427 – −0.159; 2) −0.159 – −0.079; 3) −0.079–0.002; 4) 0.002–0.462
  MNDWI: 1) −0.356–0.176; 2) 0.176–0.286; 3) 0.286–0.541; 4) 0.541–0.765
Land cover
  Land use: 1) forest; 2) agriculture; 3) residential; 4) others
  NDVI: 1) 0–0.2; 2) 0.2–0.431; 3) 0.431–0.647; 4) 0.647–1.0
  TVI: 1) 0–0.603; 2) 0.603–0.797; 3) 0.797–0.985; 4) 0.985–1.225
  GTC: 1) 0–0.435; 2) 0.435–0.529; 3) 0.529–0.737; 4) 0.737–1.0

Table 2. Environmental parameters in reduct construction.

Attribute(a)   Count(b)   Percent   Core(c)
BEDS           7          6.1       Y
LITH           7          6.1       Y
DISL           1          0.9       N
ELEV           7          6.1       Y
SANG           3          2.6       N
SASP           7          6.1       Y
PROC           5          4.3       N
PLAC           7          6.1       Y
TERR           3          2.6       N
DISD           4          3.5       N
CATA           7          6.1       Y
CATS           7          6.1       Y
CATH           7          6.1       Y
TWI            4          3.5       N
NDMI           7          6.1       Y
MNDWI          7          6.1       Y
LANU           7          6.1       Y
NDVI           4          3.5       N
TVI            7          6.1       Y
GTC            7          6.1       Y

(a) Attribute: BEDS = bedding structure; LITH = lithology; DISL = distance from lineaments; ELEV = elevation; SANG = slope angle; SASP = slope aspect; PROC = profile curvature; PLAC = plane curvature; TERR = terrain roughness; DISD = distance from drainage; CATA = catchment area; CATS = catchment slope; CATH = catchment height; LANU = land use.
(b) Count: frequency of attributes in reduct construction.
(c) Core: role of attributes in reduct construction. Y indicates a core attribute and N a non-core attribute.

4. Data

4.1. Description of landslides

Landslides were identified based on the interpretation of 1:10,000-scale color aerial photographs (Fig. 6). A series of field surveys were conducted to confirm the sizes and shapes of the landslides, define the types of movements and the materials involved, and review historical and bibliographical data, including geological, geomorphological, and landslide maps. In total, 202 landslides were mapped and subsequently digitized and rasterized in ESRI's ArcGIS software with a grid cell size of 28.5 × 28.5 m. The grid size reflects the resolution of the DEM and the remote sensing data used (30 and 28.5 m, respectively). Other vector data layers, such as bedding structure, lithology, and distance from lineaments, were also rasterized with this grid size. The study area was divided into 548,534 mapping units (grid cells), including 28,713 cells for landslides. The mapped landslides cover an area of 23.7 km², representing 5.98% of the study area. The smallest landslide that we could identify from the aerial photographs and subsequently recognize in the field had an area of 2069 m², whereas the largest landslide, the Fanjiaping landslide located on the southern side of the Yangtze River, was approximately 1.5 km².
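As a small worked check of the figures above, the following Python sketch converts the landslide cell count into an area and recomputes the reported landslide share. The variable names are illustrative; the rasterized area is only expected to approximate the mapped area of 23.7 km², since the two are measured differently.

```python
# Values quoted in the text above.
CELL_SIZE_M = 28.5            # grid cell edge length (m)
LANDSLIDE_CELLS = 28_713      # cells flagged as landslide
MAPPED_LANDSLIDE_KM2 = 23.7   # mapped landslide area (km^2)
STUDY_AREA_KM2 = 396.0        # study area (km^2)

cell_area_m2 = CELL_SIZE_M ** 2                                   # area of one cell
rasterized_landslide_km2 = LANDSLIDE_CELLS * cell_area_m2 / 1e6   # m^2 -> km^2
landslide_share_pct = MAPPED_LANDSLIDE_KM2 / STUDY_AREA_KM2 * 100

print(f"cell area: {cell_area_m2} m^2")
print(f"rasterized landslide area: {rasterized_landslide_km2:.2f} km^2")
print(f"landslide share of study area: {landslide_share_pct:.2f}%")
```

The rasterized landslide area comes out at roughly 23.3 km², close to the mapped 23.7 km², and the share reproduces the reported 5.98%.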

4.2. Influencing parameters of landslides

For the natural landslides in the Three Gorges area, several researchers have examined the correlations between landslide occurrence and various parameters such as elevation, slope angle, slope aspect, curvature, land cover, drainage network, lithology, bedding structure, and lineaments, based on remote sensing data, GIS, and correlative or statistical analyses (Liu et al., 2004; Fourniadis et al., 2007; Bai et al., 2010). In addition, researchers have attempted to relate topographic variability to soil properties and hillslope hydrology. Studies have also shown that soil moisture content is an important factor and can be estimated by the normalized difference moisture index (NDMI), topographic wetness index (TWI), and modified normalized difference water index (MNDWI) (Pelletier et al., 1997). A high NDMI value indicates that the zone tends to saturate first and that the slope is more likely to fail.

Based on previous research by Liu et al. (2004), Fourniadis et al. (2007) and Bai et al. (2010) as well as our field reconnaissance, this study uses 20 environmental parameters to predict the potential distribution of landslides. The parameters comprise geological (bedding structure, lithology, and distance from lineaments), geomorphological (elevation, slope angle, slope aspect, profile curvature, plane curvature, and terrain roughness), hydrological (distance from drainage, catchment area, catchment slope, catchment height, TWI, NDMI, and MNDWI), and land cover parameters (land use, normalized difference vegetation index (NDVI), transformed vegetation index (TVI), and greenness component of the tasseled cap transformation (GTC)). The values of the parameters are listed in Table 1. The continuous variables were converted into four classes using the natural breaks method. Note that the bedding structure is a continuous raster layer representing the angular relationship between topography and strata attitude; this relationship is a product of the bed dip direction and angle, slope angle, and aspect. The classification for the bedding structure is shown in Fig. 7.
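The conversion of a continuous parameter into four classes can be sketched in a few lines of Python. The break values are the class limits listed in Table 1 for slope angle and elevation; the function name and the convention that a value equal to a break falls into the upper class are assumptions, since the text does not say how boundary values were assigned. The sketch only applies given breaks and does not compute natural breaks itself.

```python
from bisect import bisect_right

# Interior class limits taken from Table 1.
SLOPE_ANGLE_BREAKS = [10, 25, 36]      # degrees
ELEVATION_BREAKS = [330, 620, 1000]    # metres

def classify(value, breaks):
    """Return the 1-based class index of `value` for sorted interior breaks.

    Assumption: a value exactly on a break is assigned to the upper class.
    """
    return bisect_right(breaks, value) + 1

slope_classes = [classify(v, SLOPE_ANGLE_BREAKS) for v in (5, 10, 30, 78)]
elevation_class = classify(450, ELEVATION_BREAKS)
```

With these breaks, a slope of 30° falls in class 3 (25–36°) and an elevation of 450 m in class 2 (330–620 m).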

Table 3. Information on the extracted rules with strength of five or more.(a)

Rules for presence of landslides(b)
(BEDS = 1) & (LITH = 2) & (ELEV = 2) & (SASP = 2) & (PLAC = 3) & (TERR = 2) & (DISD = 1) & (CATA = 1) & (CATS = 3) & (CATH = 1) & (TWI = 2) & (NDMI = 2) & (MNDWI = 3) & (LANU = 3) & (TVI = 1) & (GTC = 4) ⇒ (landslide = {1})
(BEDS = 3) & (LITH = 1) & (ELEV = 1) & (SANG = 2) & (SASP = 5) & (PLAC = 2) & (TERR = 2) & (DISD = 2) & (CATA = 1) & (CATS = 2) & (CATH = 2) & (NDMI = 3) & (MNDWI = 2) & (LANU = 3) & (TVI = 1) & (GTC = 3) ⇒ (landslide = {1})
(BEDS = 4) & (LITH = 2) & (ELEV = 1) & (SANG = 3) & (SASP = 8) & (PROC = 2) & (PLAC = 3) & (CATA = 1) & (CATS = 3) & (CATH = 2) & (NDMI = 2) & (MNDWI = 3) & (LANU = 2) & (NDVI = 2) & (TVI = 2) & (GTC = 3) ⇒ (landslide = {1})

Rules for absence of landslides
(BEDS = 5) & (LITH = 3) & (ELEV = 3) & (SANG = 4) & (SASP = 6) & (PROC = 4) & (PLAC = 1) & (CATA = 3) & (CATS = 1) & (CATH = 4) & (NDMI = 4) & (MNDWI = 1) & (LANU = 2) & (NDVI = 4) & (TVI = 3) & (GTC = 2) ⇒ (landslide = {0})
(BEDS = 6) & (LITH = 3) & (ELEV = 4) & (SASP = 7) & (PLAC = 4) & (TERR = 4) & (DISD = 3) & (CATA = 4) & (CATS = 4) & (CATH = 3) & (TWI = 4) & (NDMI = 4) & (MNDWI = 4) & (LANU = 1) & (TVI = 4) & (GTC = 1) ⇒ (landslide = {0})
(BEDS = 2) & (LITH = 1) & (ELEV = 1) & (SANG = 1) & (SASP = 7) & (PROC = 4) & (PLAC = 3) & (CATA = 3) & (CATS = 4) & (CATH = 1) & (NDMI = 3) & (MNDWI = 4) & (LANU = 1) & (NDVI = 4) & (TVI = 3) & (GTC = 1) ⇒ (landslide = {0})

(a) Strength of five means that a single rule from the table correctly predicted five landslides.
(b) Abbreviations of parameters are the same as in Table 2. Numbers in the brackets denote the category value of landslide environmental parameters; see Table 1.
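Each rule in Table 3 is a conjunction of attribute-class tests that implies a decision. A minimal Python sketch of how such a rule could be evaluated against one grid cell is given below; the first presence rule is copied from the table, whereas the function names and the example cell are hypothetical and serve only to illustrate the matching logic.

```python
# First "presence of landslides" rule from Table 3: (conditions, decision).
PRESENCE_RULE_1 = (
    {"BEDS": 1, "LITH": 2, "ELEV": 2, "SASP": 2, "PLAC": 3, "TERR": 2,
     "DISD": 1, "CATA": 1, "CATS": 3, "CATH": 1, "TWI": 2, "NDMI": 2,
     "MNDWI": 3, "LANU": 3, "TVI": 1, "GTC": 4},
    1,
)

def matches(conditions, cell):
    """True when the cell has the required class for every attribute in the rule."""
    return all(cell.get(attr) == cls for attr, cls in conditions.items())

def apply_rules(rules, cell):
    """Return the decision of the first matching rule, or None if none matches."""
    for conditions, decision in rules:
        if matches(conditions, cell):
            return decision
    return None

# Hypothetical cell: satisfies the rule and carries extra attributes the rule ignores.
cell = dict(PRESENCE_RULE_1[0], SANG=2, PROC=1, NDVI=3, DISL=4)
other_cell = dict(cell, LITH=3)   # same cell with a different lithology class

decision = apply_rules([PRESENCE_RULE_1], cell)
no_decision = apply_rules([PRESENCE_RULE_1], other_cell)
```

Attributes that do not appear in a rule, such as SANG in this case, have no influence on whether the rule fires.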

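Table 4
Optimal parameters of the two models used for the five subsets and entire dataset.

| Model  | Parameter | 10%   | 20% | 40% | 60% | 80% | 100% |
|--------|-----------|-------|-----|-----|-----|-----|------|
| RS-SVM | γ         | 0.125 | 1.0 | 2.0 | 2.0 | 2.0 | 2.0  |
| RS-SVM | C         | 64.0  | 8.0 | 4.0 | 2.0 | 4.0 | 4.0  |
| SVM    | γ         | 0.25  | 1.0 | 1.0 | 1.0 | 1.0 | 1.0  |
| SVM    | C         | 16.0  | 2.0 | 8.0 | 8.0 | 8.0 | 8.0  |

Column headings give the size of each dataset as a percentage of the entire dataset.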
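To give a feel for the γ values in Table 4, the Gaussian RBF kernel used by both models, K(x_i, x_j) = exp(−γ‖x_i − x_j‖²), can be evaluated directly. The sketch below is a plain-Python illustration rather than LibSVM code; the function name and the two sample vectors are hypothetical, and the attributes are assumed to be already scaled to [0, 1].

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian RBF kernel: exp(-gamma * squared Euclidean distance)."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Two hypothetical samples whose attributes are already scaled to [0, 1].
x_i = [0.2, 0.8, 0.5]
x_j = [0.7, 0.3, 0.5]

# Smallest and largest RS-SVM gamma values listed in Table 4.
for gamma in (0.125, 2.0):
    print(gamma, round(rbf_kernel(x_i, x_j, gamma), 4))
```

For a fixed pair of samples, the larger γ yields the smaller kernel value: similarity decays faster with distance, so the resulting decision surface responds more strongly to local structure in the training data.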

In the study area, the main triggering factor for landslides is the high amount of precipitation. However, this study does not include precipitation because rainfall is relatively uniform throughout the study area, as is seismicity. The geomorphological and hydrological parameters were derived from the DEM, which was constructed from a 1:50,000-scale digital topographic map. Land cover parameters were derived from a Landsat ETM+ satellite image (acquired May 14, 2000; path/row 125/039). Geological parameters were obtained from 1:50,000-scale geological maps (Hubei Province Geological Survey, 1997) and supplemented by field surveys and a review of historical and bibliographical data.

5. Methods

RS theory was employed as an attribute reduction tool to identify the significant environmental parameters of landslide occurrence, and an SVM was used to predict landslide susceptibility. The process was as follows:

(1) The information table was constructed from the initial environmental parameters. This table consists of a total of 1760 objects representing the absence of landslides (non-landslides) and a total of 1782 objects representing the presence of landslides (landslides). Landslide and non-landslide objects account for 10% of the study area. All objects in the information table are described by 20 attributes and one decision class, which takes the value one for the presence of landslides and zero for their absence.


(2) RS theory was used to eliminate attributes in the information table that were redundant or noisy. The attribute reduction analysis was performed using RSES2 software (Bazan and Szczuka, 2005). The approximations of the decision classes and the quality of classification were calculated for the information table. This step was followed by the generation of reducts using the exhaustive algorithm (Bazan et al., 2000). The result was seven reducts associated with this information table, leaving 13 attributes as the core (the intersection of all reducts). The core attributes include the bedding structure, lithology, elevation, slope aspect, profile curvature, catchment area, catchment slope, catchment height, NDMI, MNDWI, land use, TVI, and GTC. These 13 environmental parameters are likely important factors for landslide susceptibility prediction in this area. Table 2 presents information on the frequency and role of environmental parameters in reduct construction. A set of decision rules was then generated based on the reduct set. Exemplary rules with a high relative strength were extracted from the decision rules and are shown in Table 3. The identified rules suggest that landslide susceptibility is associated with various environmental parameters and that the relationship between factors and landslides is not straightforward. Thus, the SVM method was employed for landslide prediction in subsequent analyses.

(3) The dataset was randomly partitioned into two subsets: one was used in the training phase of the models, and the other was used to validate the models and confirm their accuracy. The core factors and landslides constitute a new dataset, and the range of each continuous attribute was normalized to [0,1]. Specifically, a randomly extracted 80% of the landslides and the same number of non-landslides formed the training dataset, and the remaining 20% of the landslides served as verification data. “Stable” sites must be generated for the model.
These sites were generated randomly beyond a buffer zone of 100 m from a landslide, and the minimum distance between any two random sites was 50 m.

(4) The SVM model was trained with the training data. In this study, LibSVM 3.0 (Chang and Lin, 2011), a widely used SVM software library, was employed. The Gaussian radial basis function (RBF) was adopted for the kernel:

$$K(x_i, x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^{2}\right), \quad \gamma > 0 \tag{10}$$

where γ is the parameter controlling the width of the Gaussian kernel. This function is not sensitive to outliers (Tax and Duin, 1999), and it can handle the case in which the relationship between class labels and attributes is nonlinear. In addition, only one kernel parameter, γ, must be determined for a chosen C. As landslide susceptibility mapping is a linearly non-separable problem, a cross-validation approach was adopted for the optimal parameter search.

(5) The results of the RS–SVM model were compared with the results of the general SVM method, and a comprehensive validation of the reliability and predictive ability of the landslide susceptibility model is presented.

(6) As the prediction accuracy of the RS–SVM model depends on sample size, a series of tests was carried out by varying the sample size. Five subsets were randomly generated from the entire dataset: four were sampled at 20%, 40%, 60%, and 80% of the dataset, and the fifth at 10%.

Fig. 8. Landslide susceptibility maps obtained from (A) the RS–SVM model and (B) the general SVM model.

Table 5
The classification of the landslide susceptibility map generated using the RS–SVM model with the distribution of the study area and the inventoried landslides over the different classes. (The numbers in parentheses denote the number of mapping units.)

| Probability | Susceptibility | % of study area^a | % of inventoried landslides^b |
|-------------|----------------|-------------------|-------------------------------|
| 0.0–0.173   | Very low       | 64.5 (353,609)    | 1.2 (345)                     |
| 0.173–0.439 | Low            | 15.8 (86,694)     | 9.3 (2673)                    |
| 0.439–0.741 | Medium         | 11.6 (63,500)     | 32.5 (9329)                   |
| 0.741–1.0   | High           | 8.1 (44,731)      | 57.0 (16,366)                 |

^a Total number of mapping units for the study area = 548,534.
^b Total number of mapping units for the inventoried landslides = 28,713.

6. Results and discussion

6.1. Landslide susceptibility mapping

Following the procedures described above, landslide susceptibility maps were produced for six individual datasets. The optimal values of γ and C for the RS–SVM model ranged from 0.125 to 2.0 and from 2.0 to 64.0, respectively. For the general SVM model, the value of γ ranged from 0.25 to 1.0 and C varied from 2.0 to 16.0 (Table 4). Fig. 8 presents typical landslide susceptibility maps of probability based on the entire dataset. These maps were obtained by choosing optimal γ and C values of 2.0 and 4.0 for the RS–SVM model and 1.0 and 8.0 for the general SVM model, respectively. The probability values of the map range from zero to one, corresponding to low through high landslide susceptibility. The likelihood of potential landslides increases as these values approach one. To improve the readability of the map, the natural breaks method in ArcGIS was used to divide the probability map into four susceptibility zones: very low, low, medium, and high (Table 5). Fig. 9A presents the map generated using the RS–SVM model based on this classification, showing that 64.5% of the study area was identified as very low susceptibility. Low- and medium-susceptibility zones comprise 15.8% and 11.6% of the study area, respectively. The zones with high susceptibility constitute 8.1% of the study area, consistent with the historical landslide data. To compare the performance of the two models, the probability map generated using the general SVM model was also divided into four susceptibility zones using the natural breaks method (Fig. 9B).

Fig. 9. Landslide susceptibility mapping. (A) Susceptibility zone map obtained from the RS–SVM model. (B) Susceptibility zone map obtained from the general SVM model. (C) Map illustrating the estimated model uncertainty for the landslide susceptibility model shown in Fig. 8A.

6.2. Degree of model fitting

Table 5 provides a lumped estimate of model fit, but it does not describe in detail how the model performs in the different susceptibility classes. To assess this performance, one can compare the share of the training landslide area that falls in each susceptibility class with the share of the study area occupied by that class. Fig. 10 presents the percentage of the total area ranked from most to least susceptible (x-axis) against the cumulative percentage of the training landslide area in each susceptibility class (y-axis) for the six training datasets, revealing that the RS–SVM and SVM models present similar trends: the model fit improves as the sample size increases. This figure also indicates that the RS–SVM model is more efficient than the general SVM model.
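A curve of the kind plotted in Fig. 10 can be computed by ranking mapping units from most to least susceptible and accumulating the landslide units along that ranking. The following is a minimal sketch, assuming that every mapping unit has the same area; the function name and the toy probabilities and labels are hypothetical and do not come from the study data.

```python
def fit_curve(probabilities, is_landslide):
    """Return (area_pct, landslide_pct) pairs after ranking units by probability.

    Units are sorted from most to least susceptible; each pair gives the
    cumulative percentage of area and of landslide units up to that rank.
    Assumes equal-area mapping units and at least one landslide unit.
    """
    ranked = sorted(zip(probabilities, is_landslide), key=lambda t: t[0], reverse=True)
    total_units = len(ranked)
    total_slides = sum(flag for _, flag in ranked)
    curve, found = [], 0
    for rank, (_, flag) in enumerate(ranked, start=1):
        found += flag
        curve.append((100.0 * rank / total_units, 100.0 * found / total_slides))
    return curve


# Hypothetical toy data: 10 mapping units, 4 of which are landslide units.
probs = [0.95, 0.10, 0.80, 0.30, 0.60, 0.05, 0.70, 0.20, 0.15, 0.40]
labels = [1, 0, 1, 0, 0, 0, 1, 0, 1, 0]

curve = fit_curve(probs, labels)
for area_pct, slide_pct in curve:
    print(f"{area_pct:5.1f}% of area -> {slide_pct:5.1f}% of landslide units")
```

A model that fits well produces a curve that rises steeply at small area percentages, which is the property the comparisons in this section rely on.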
Furthermore, for the RS–SVM and general SVM models trained on the entire dataset, the 10.0% of the study area that is most susceptible to landslides covers 41.8% and 28.8% of the training landslide area, respectively. For the RS–SVM model on the entire dataset, the 56.1% of the study area that is most susceptible to landslides covers 90.5% of the training landslides (probability > 0.439), suggesting that 90.5% of the training landslides occur in medium- and high-susceptibility zones. The figure provides a quantitative measure of the ability of the susceptibility model to match the known distribution of landslides in the study area.

Fig. 10. Fitting performance of the RS–SVM and general SVM models for the six datasets.

6.3. Model sensitivity

To investigate the sensitivity of the two models, we studied the variation in total accuracy across the six datasets for given percentages of the entire area in the various susceptibility classes. Fig. 11 is a box plot that presents the cumulative percentage of landslide area in each susceptibility class (y-axis) for the six datasets (in a box) versus the percentage of the total area ranked from most to least susceptible (x-axis). The cross inside each box marks the 50th percentile (median) of that distribution. The lower and upper box boundaries mark the 25th and 75th percentiles of each distribution, respectively. The 0th and 100th percentiles are also shown. The box plots indicate that an increase in the percentage of susceptible area results in a decrease in the scatter around the median (50th percentile) and in the variability (0th to 100th percentile range) of the total accuracies. The scatter around the median of the RS–SVM model is smaller than that of the general SVM model, particularly for percentages of susceptible area exceeding 60%. This observation implies that the RS–SVM model is less sensitive than the general SVM model, possibly because RS theory has removed redundant or noisy attributes and is better at capturing the dominant factors in a complex attribute space. The SVM seeks the hyperplane that separates the classes, and this hyperplane relies on only a few data points, which are called the “support vectors”. Thus, the RS–SVM model does not change significantly if the input data vary within a reasonable range, indicating greater robustness.

Fig. 11. Box plots of the percentage of landslide area in each susceptibility class (y-axis) for the six datasets versus the percentage of total area ranked from most to least susceptible (x-axis).

6.4. Uncertainty in the susceptibility estimate of individual mapping units

The adopted approach to ascertain landslide susceptibility provides a single value for the probability of spatial occurrence of landslides for each mapping unit; however, a limitation of this approach is that it does not provide a measure of the uncertainty associated with the probability estimate. To assess the uncertainty, we trained RS–SVM models on five datasets of random samples to yield five different estimates of the probability of the spatial occurrence of landslides for all mapping units. Fig. 12A compares the mean value of the five probability estimates from the individual training datasets (x-axis) for each mapping unit with the single probability estimate obtained from the “best” RS–SVM model (y-axis) shown in Fig. 8A. The correlation between the two estimates is high (r² = 0.9094), indicating that the two classifications are nearly identical. Based on this result, Fig. 12B presents the mean values of the five probability estimates of all mapping units from low to high (x-axis) and two standard deviations (2σ) of the same probability estimate (y-axis). The measure of 2σ is relatively low (< 0.35) for the mapping units classified as highly susceptible (probability > 0.75) or as largely stable (probability < 0.15). The scatter in the model estimate is larger for intermediate values of the probability (between 0.40 and 0.60). This observation indicates that, for the latter mapping units, the model is unable to satisfactorily classify the terrain as stable or unstable, and the obtained estimate is also highly variable and thus not very reliable. The variation in the model estimate shown in Fig. 12B can be modeled by the following equation (black line):

$$y = -1.3795x^{2} + 1.3843x, \quad 0 \le x \le 1, \quad r^{2} = 0.8306 \tag{11}$$

where x is the estimated value of the probability pertaining to an unstable mapping unit and y is the 2σ value of the obtained probability estimate, which is a proxy for the susceptibility model uncertainty (Guzzetti et al., 2006). Eq. (11) can be used to quantitatively estimate the model uncertainty for each mapping unit based on the computed probability estimate. Fig. 9C presents the uncertainty for the RS–SVM model estimated using Eq. (11). To further investigate the relationship between the predicted probability of spatial landslide occurrence and its variation, we randomly selected 30,000 mapping units based on the mean value of the computed probability estimates obtained from the individual training datasets. Fig. 13 presents the rank of the mapping units (x-axis) against statistics of the probability estimates (y-axis). In this figure, the dashed line denotes the average value of the landslide susceptibility estimates, whereas the thin lines denote ±2σ of the estimate. The value of 2σ varies with the predicted probability of the spatial occurrence of landslides. The variation is small for mapping units predicted as highly stable, increases to a maximum value toward intermediate values of the probability of spatial occurrence (between 0.40 and 0.60), where unclassified mapping units are shown, and decreases again to small values for mapping units predicted as highly unstable.

6.5. Analysis of model prediction skill

The main purpose of a landslide susceptibility map is to predict the occurrence of new or reactivated landslides. According to our field survey, many landslides are distributed in high-susceptibility zones. Fig. 14 presents the Huanglashi and Zhangjiawan landslides in Badong County and Zigui County, respectively, which are distributed in the high-susceptibility zone. The frontal elevation of the Huanglashi landslide is 139 m, and its trailing edge elevation is 500 m. The landslide is 680 m in length, 0.14 km² in area and 1.14 × 10⁷ m³ in volume. The landslide has a tongue-shaped plan and a concave-shaped profile, with the main sliding direction nearly north–south and slope angles ranging from 25° to 40°. The Zhangjiawan landslides are composed of three sublandslides, with a total area and volume of up to 0.34 km² and 1.13 × 10⁸ m³, respectively. Recently, the three sublandslides have been active, and the slope shows a tendency toward instability. These sublandslides endanger 700 residents and 2 km of provincial roads.

For the RS–SVM model on the entire dataset, 86.0% of the validated landslides are in medium- and high-susceptibility zones and 2.5% of the validated landslides are in very-low-susceptibility zones. By comparison, for the general SVM model on the entire dataset, 71.5% of the validated landslides are in medium- and high-susceptibility zones and 6.7% of the validated landslides are in very-low-susceptibility zones. The percentage of each susceptibility zone is shown in Fig. 15, which indicates that the predictive ability of the RS–SVM model is greater than that of the general SVM model.

Furthermore, to quantitatively evaluate the ability of the susceptibility model to predict future landslides, the proportion of the landslide area in each susceptibility class was calculated. Fig. 16 presents the percentage of the most susceptible area (x-axis) against the cumulative percentage of the landslide area in each susceptibility class (y-axis), with the solid lines representing the performance of the RS–SVM model. The figure reveals that the 10.0% of the study area that is most susceptible to landslides contains 41.4% of all landslides (solid line with squares), 41.8% of the training landslides (solid line with stars), and 39.9% of the verification landslides (solid line with diamonds). The accuracy of the model for all landslides and for the training landslides is very similar, but the model performs less well when applied to the verification landslides, a limitation of all statistical classifications (Michie et al., 1994; Chung and Fabbri, 2003). The dashed lines represent the performance of the general SVM model. The 10.0% of the study area that is most susceptible to landslides contains 26.1% of all landslides (dashed line with circles), 28.8% of the training landslides (dashed line with triangles), and 16.2% of the verification landslides in the study area (dashed line with diamonds). The curves follow a pattern similar to that of the RS–SVM model but show greater variability, likely because the general SVM model is less stable and the RS–SVM model is more robust.

As shown in Fig. 16, the most susceptible 50.0% of the total area predicted by the RS–SVM and general SVM models contains 80.6% and 70.1% of all landslides, 81.6% and 73.3% of the training landslides, and 77.2% and 58.2% of the verification landslides, respectively. The forecasting performance of the RS–SVM model is similar for the training landslides and all landslides and slightly worse for the verification landslides, but its predictive ability is greater than the fitting performance of the general SVM model when the most susceptible area is less than 70% (probability > 0.3). The performance of the general SVM model is basically acceptable for the training landslides but is very weak for the verification landslides. Essentially, the RS–SVM model reduces overfitting of the training dataset. The model does not learn unimportant details: valuable information can be extracted by RS theory, whereas invalid, noisy, and incomplete information can be omitted as much as possible. Therefore, the overall goodness of fit of the RS–SVM model is higher than that of the general SVM model, which increases the accuracy of the model for landslide susceptibility mapping.

Fig. 12. Uncertainty associated with the probabilistic estimate. (A) Mean value of the five probability estimates from the individual training datasets (x-axis) with the single probability estimate obtained from the “best” RS–SVM model (y-axis). r² = 0.9094. (B) Mean value of five probability estimates (x-axis) against two standard deviations of the probability estimate (y-axis). The black line denotes the estimated model uncertainty obtained by a linear regression fit (least-squares method).

Fig. 13. Variation in the estimate of landslide susceptibility obtained from the RS–SVM model. For a random selection of 30,000 mapping units, ranked from low (left) to high (right) susceptibility values (x-axis), the graph presents the probability of the spatial occurrence of landslides (y-axis).

Fig. 14. Typical landslides in the study area validated by fieldwork. (A) Huanglashi landslides in Badong County and (B) Zhangjiawan landslides in Zigui County. Original photos were from the Headquarters of Prevention and Control of Geo-Hazards in Area of TGR.

Fig. 15. Distribution of the validated landslide area in each susceptibility zone.
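The uncertainty relation fitted in Section 6.4, y = −1.3795x² + 1.3843x for 0 ≤ x ≤ 1 (Eq. (11)), is straightforward to evaluate for any mapping unit once its probability estimate is known. The sketch below does this in plain Python; the function name is hypothetical and the sample probabilities are arbitrary values chosen for illustration.

```python
def two_sigma_uncertainty(probability):
    """Evaluate Eq. (11): estimated 2-sigma of the susceptibility estimate."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return -1.3795 * probability ** 2 + 1.3843 * probability


# Arbitrary example probabilities spanning stable to unstable mapping units.
for p in (0.10, 0.50, 0.90):
    print(f"p = {p:.2f}  ->  2-sigma ~ {two_sigma_uncertainty(p):.3f}")
```

The fitted parabola peaks near x ≈ 0.50 with a value of about 0.35, which agrees with the observation that the scatter is largest for intermediate probabilities and small for units predicted as clearly stable or clearly unstable.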

7. Conclusions

This paper applies a hybrid model within the framework of GIS to landslide susceptibility mapping in the Three Gorges Reservoir area. The proposed model combines the strong generalization performance of SVMs with the effective elimination of redundant information by RS theory. The landslide susceptibility map obtained by the hybrid model was compared with that obtained by the general SVM, demonstrating the superior predictive performance of the proposed model. Furthermore, the landslide susceptibility map indicates four relative classes of landslide susceptibility zones. Medium- and high-susceptibility zones comprise 19.7% of the study area, encompassing 89.5% of historical landslides, and are mainly distributed throughout Badong County and other towns, as well as the rural residential and transportation areas along the Yangtze River and its tributaries. The proportion of high-susceptibility zones is far smaller than those of the low- and very-low-susceptibility zones. The landslide susceptibility map produced in the present study can be used for landslide hazard assessment at regional levels and to support hazard mitigation actions by local authorities. Furthermore, the landslide susceptibility model was comprehensively evaluated using a set of tests, including the degree of model fit, the robustness of the model, the uncertainty associated with the probability estimate, and the model prediction skill. The results demonstrated that the hybrid model has superior prediction skill and higher reliability. In addition, 13 environmental parameters were identified as the most important factors for predicting landslide susceptibility by the RS-based reduction algorithm; these factors should be carefully considered for landslide prediction in this area.

Acknowledgments

This research is supported by the National Program on Key Basic Research Project (Grant 2011CB710601), the National High-tech R&D Program of China (Grant 2012AA121303), and the Ministry of Land major research projects on the prevention of geological disasters in the Three Gorges Reservoir area (Grant SXKY3-6-2). We are grateful to the Headquarters of Prevention and Control of Geo-Hazards in Area of TGR for providing data and material. We also thank the editor and anonymous referees for their comments.

Fig. 16. Prediction skill of landslide susceptibility models for different datasets. The solid line with squares represents the performance of the RS–SVM model prediction for all landslides. The solid line with stars denotes the performance of the RS–SVM model prediction for the training landslides. The solid line with diamonds represents the performance of the RS–SVM model prediction for the verification landslides. The dashed line with circles represents the performance of the general SVM model prediction for all landslides. The dashed line with triangles denotes the performance of the general SVM model prediction for the training landslides. The dashed line with diamonds denotes the performance of the general SVM model prediction for the verification landslides.

Appendix A. Supplementary data

Supplementary data associated with this article can be found in the online version, at http://dx.doi.org/10.1016/j.geomorph.2013.08.013. These data include Google maps of the most important areas described in this article.


References

Aldridge, C.H., 1999. Discerning Landslide Hazard Using a Rough Set Based Geographic Knowledge Discovery Methodology. In: Whigham, P.A. (Ed.), Proceedings of Eleventh Annual Colloquium of the Spatial Information Research Centre. Citeseer, University of Otago, New Zealand, pp. 251–266.
Atkinson, P.M., Massari, R., 2011. Autologistic modelling of susceptibility to landsliding in the Central Apennines, Italy. Geomorphology 130, 55–64.
Ayalew, L., Yamagishi, H., 2005. The application of GIS-based logistic regression for landslide susceptibility mapping in the Kakuda-Yahiko Mountains, Central Japan. Geomorphology 65, 15–31.
Bai, S.B., Wang, J., Lu, G.N., Zhou, P.G., Hou, S.S., Xu, S.N., 2010. GIS-based logistic regression for landslide susceptibility mapping of the Zhongxian segment in the Three Gorges area, China. Geomorphology 115, 23–31.
Ballabio, C., Sterlacchini, S., 2012. Support vector machines for landslide susceptibility mapping: the Staffora river basin case study, Italy. Math. Geosci. 44, 47–70.
Barakat, N., Bradley, A.P., 2010. Rule extraction from support vector machines: a review. Neurocomputing 74, 178–190.
Bazan, J., Szczuka, M., 2005. The Rough Set Exploration System. In: Peters, J., Skowron, A. (Eds.), Transactions on Rough Sets III. Lecture Notes in Computer Science. Springer, Berlin Heidelberg, pp. 37–56.
Bazan, J.G., Nguyen, H.S., Nguyen, S.H., Synak, P., Wroblewski, J., 2000. Rough Set Algorithms in Classification Problem. In: Polkowski, L., Tsumoto, S., Lin, T.Y. (Eds.), Rough Set Methods and Applications. Physica-Verlag, Heidelberg, New York, pp. 49–88.
Brabb, E.E., 1991. The world landslide problem. Episodes 14, 52–61.
Burges, C.J.C., 1998. A tutorial on support vector machines for pattern recognition. Data Min. Knowl. Disc. 2, 121–167.
Carrara, A., Cardinali, M., Detti, R., Guzzetti, F., Pasqui, V., Reichenbach, P., 1991. GIS techniques and statistical models in evaluating landslide hazard. Earth Surf. Process. Landforms 16, 427–445.
Castellanos Abella, E.A., Van Westen, C.J., 2008. Qualitative landslide susceptibility assessment by multicriteria analysis: a case study from San Antonio del sur, Guantanamo, Cuba. Geomorphology 94, 453–466.
Chang, C.C., Lin, C.J., 2011. LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, 1–27 (Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm).
Chen, S., 1986. Atlas of Geo-Science Analyses of Landsat Imagery in China. National Remote Sensing Centre, Chinese Academy of Science, Science Press, Beijing (in Chinese).
Chen, Q.X., Hu, H.T., Sun, Y., Tan, C.X., 1995. Assessment of regional crustal stability and its application to engineering geology in China. Episodes 18, 69–72.
Choi, J., Oh, H.J., Lee, H.J., Lee, C., Lee, S., 2012. Combining landslide susceptibility maps obtained from frequency ratio, logistic regression, and artificial neural network models using ASTER images and GIS. Eng. Geol. 124, 12–23.
Chung, C., Fabbri, A., 2003. Validation of spatial prediction models for landslide hazard mapping. Nat. Hazard. 30, 451–472.
Deng, Q.L., Zhu, Z.Y., Cui, Z.Q., Wang, X.P., 2000. Mass rock creep and landsliding on the Huangtupo slope in the reservoir area of the Three Gorges Project, Yangtze River, China. Eng. Geol. 58, 67–83.
Fell, R., Corominas, J., Bonnard, C., Cascini, L., Leroi, E., Savage, W., 2008. Guidelines for landslide susceptibility, hazard and risk zoning for land-use planning. Eng. Geol. 102, 85–98.
Fourniadis, I.G., Liu, J.G., Mason, P.J., 2007. Landslide hazard assessment in the Three Gorges area, China, using ASTER imagery: Wushan–Badong. Geomorphology 84, 126–144.
Frattini, P., Crosta, G., Carrara, A., 2010. Techniques for evaluating the performance of landslide susceptibility models. Eng. Geol. 111, 62–72.
Frohlich, H., Chapelle, O., Scholkopf, B., 2003. Feature Selection for Support Vector Machines by Means of Genetic Algorithm. Proceedings of the 15th IEEE International Conference on Tools with Artificial Intelligence. IEEE Computer Society, Sacramento, CA, USA, pp. 142–148.
Gorsevski, P.V., Jankowski, P., 2008. Discerning landslide susceptibility using rough sets. Comput. Environ. Urban. Syst. 32, 53–65.
Guzzetti, F., Carrara, A., Cardinali, M., Reichenbach, P., 1999. Landslide hazard evaluation: a review of current techniques and their application in a multi-scale study, Central Italy. Geomorphology 31, 181–216.
Guzzetti, F., Reichenbach, P., Ardizzone, F., Cardinali, M., Galli, M., 2006. Estimating the quality of landslide susceptibility models. Geomorphology 81, 166–184.
He, K.Q., Li, X.R., Yan, X.Q., Guo, D., 2008. The landslides in the Three Gorges Reservoir Region, China and the effects of water storage and rain on their stability. Environ. Geol. 55, 55–63.
Hubei Province Geological Survey, 1997. Geological Map of Zigui and Badong County (1:50,000). Hubei Province Geological Survey Press, Wuhan (in Chinese).
Kanungo, D., Arora, M., Sarkar, S., Gupta, R., 2009. Landslide susceptibility zonation (LSZ) mapping: a review. J. South Asia Disaster Stud. 2, 81–105.
Lee, S., Min, K., 2001. Statistical analysis of landslide susceptibility at Yongin, Korea. Environ. Geol. 40, 1095–1113.
Leung, Y., Fung, T., Mi, J., Wu, W., 2007. A rough set approach to the discovery of classification rules in spatial data. Int. J. Geogr. Inf. Sci. 21, 1033–1058.
Li, J.J., Xie, S.Y., Kuang, M.S., 2001. Geomorphic evolution of the Yangtze Gorges and the time of their formation. Geomorphology 41, 125–135.
Liu, J.G., Mason, P.J., Clerici, N., Chen, S., Davis, A., Miao, F., Deng, H., Liang, L., 2004. Landslide hazard assessment in the Three Gorges area of the Yangtze river using ASTER imagery: Zigui-Badong. Geomorphology 61, 171–187.
Liu, C.Z., Liu, Y.H., Wen, M.S., Li, T.F., Lian, J.F., Qin, S.W., 2009. Geo-Hazard Initiation and Assessment in the Three Gorges Reservoir. In: Wang, F.W., Li, T.L. (Eds.), Landslide Disaster Mitigation in Three Gorges Reservoir, China. Springer Verlag, Berlin Heidelberg, pp. 3–40.
Liu, J.P., Zeng, Z.P., Liu, H.Q., Wang, H.B., 2011. A rough set approach to analyze factors affecting landslide incidence. Comput. Geosci. 37, 1311–1317.
Marjanović, M., Kovačević, M., Bajat, B., Voženílek, V., 2011. Landslide susceptibility assessment using SVM machine learning algorithm. Eng. Geol. 123, 225–234.
Melchiorre, C., Matteucci, M., Azzoni, A., Zanchi, A., 2008. Artificial neural networks and cluster analysis in landslide susceptibility zonation. Geomorphology 94, 379–400.
Michie, D., Spiegelhalter, D.J., Taylor, C.C., 1994. Machine Learning, Neural and Statistical Classification. WWW page http://www.amsta.leeds.ac.uk/charles/statlog/.
Mountrakis, G., Im, J., Ogole, C., 2011. Support vector machines in remote sensing: a review. ISPRS J. Photogramm. Remote Sens. 66, 247–259.
Pan, X., Zhang, S.Q., Zhang, H.Q., Na, X.D., Li, X.F., 2010. A variable precision rough set approach to the remote sensing land use/cover classification. Comput. Geosci. 36, 1466–1473.
Pawlak, Z., 1982. Rough sets. Int. J. Parallel Prog. 11, 341–356.
Pelletier, J., Malamud, B., Blodgett, T., Turcotte, D., 1997. Scale-invariance of soil moisture variability and its implications for the frequency–size distribution of landslides. Eng. Geol. 48, 255–268.
Remondo, J., González-Díez, A., De Terán, J.R.D., Cendrero, A., 2003. Landslide susceptibility models utilising spatial data analysis techniques. A case study from the Lower Deba Valley, Guipuzcoa (Spain). Nat. Hazard. 30, 267–279.
Tax, D.M.J., Duin, R.P.W., 1999. Support vector domain description. Pattern Recognit. Lett. 20, 1191–1199.
Tay, F., Shen, L., 2003. Fault diagnosis based on rough set theory. Eng. Appl. Artif. Intell. 16, 39–43.
Thangavel, K., Pethalakshmi, A., 2009. Dimensionality reduction based on rough set theory: a review. Appl. Soft Comput. 9, 1–12.
Thangavel, K., Jaganathan, P., Pethalakshmi, A., Karnan, M., 2005. Effective classification with improved quick reduct for medical database using rough system. Bioinforma. Med. Eng. 5, 7–14.
Van Den Eeckhaut, M., Hervás, J., Jaedicke, C., Malet, J.P., Montanarella, L., Nadim, F., 2012. Statistical modelling of Europe-wide landslide susceptibility using limited landslide inventory data. Landslides 9, 357–369.
Van Westen, C., Van Asch, T., Soeters, R., 2006. Landslide hazard and risk zonation: why is it still so difficult? Bull. Eng. Geol. Environ. 65, 167–184.
Vapnik, V., 1995. The Nature of Statistical Learning Theory. Springer-Verlag, Inc., New York.
Wang, F.W., Zhang, Y.M., Huo, Z.T., Matsumoto, T., Huang, B.L., 2004. The July 14, 2003 Qianjiangping landslide, Three Gorges Reservoir, China. Landslides 1, 157–162.
Wang, H.B., Wu, S.R., Wang, W.B., 2008. A framework for intelligent prediction of landslide hazards. Geol. Sci. Technol. Inf. 27, 17–20 (in Chinese).
Wu, S.R., Hu, D., Chen, Q., Xu, R., Mei, Y., 1997. Assessment of the Crustal Stability in the Qingjiang River Basin of the Western Hubei Province and its Peripheral Area, China. Proceedings of the Thirtieth International Geological Congress. Beijing, China. VSP International Science Publishers, Zeist, The Netherlands, pp. 375–385.
Wu, S.R., Shi, L., Wang, R.J., Tan, C.X., Hu, D.G., Mei, Y.T., Xu, R.C., 2001. Zonation of the landslide hazards in the forereservoir region of the Three Gorges Project on the Yangtze River. Eng. Geol. 59, 51–58.
Xiao, S.R., Liu, D.F., Hu, Z.Y., 2010. Study of high speed slide mechanism of Qianjiangping landslide in Three Gorges Reservoir area. Rock Soil Mech. 31, 3531–3536 (in Chinese).
Xu, C., Dai, F.C., Xu, X.W., Lee, Y.H., 2012. GIS-based support vector machine modeling of earthquake-triggered landslide susceptibility in the Jianjiang River watershed, China. Geomorphology 145–146, 70–80.
Yalcin, A., 2008. GIS-based landslide susceptibility mapping using analytical hierarchy process and bivariate statistics in Ardesen (Turkey): comparisons of results and confirmations. Catena 72, 1–12.
Yao, X., Tham, L., Dai, F.C., 2008. Landslide susceptibility mapping based on Support Vector Machine: a case study on natural slopes of Hong Kong, China. Geomorphology 101, 572–582.