AI-based identification of low-frequency debris flow catchments in the Bailong River basin, China


Journal Pre-proof AI-based identification of low-frequency debris flow catchments in the Bailong River basin, China

Yan Zhao, Xingmin Meng, Tianjun Qi, Feng Qing, Muqi Xiong, Yajun Li, Peng Guo, Guan Chen

PII: S0169-555X(20)30097-0

DOI: https://doi.org/10.1016/j.geomorph.2020.107125

Reference: GEOMOR 107125

To appear in: Geomorphology

Received date: 25 October 2019

Revised date: 27 January 2020

Accepted date: 26 February 2020

Please cite this article as: Y. Zhao, X. Meng, T. Qi, et al., AI-based identification of low-frequency debris flow catchments in the Bailong River basin, China, Geomorphology (2020), https://doi.org/10.1016/j.geomorph.2020.107125

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2020 Published by Elsevier.

Yan Zhao a, Xingmin Meng a, b, c, *, Tianjun Qi a, Feng Qing a, d, Muqi Xiong a, Yajun Li c, Peng Guo c, Guan Chen c

a College of Earth and Environmental Sciences, Key Laboratory of Western China’s Environmental System with the Ministry of Education, Lanzhou University, Lanzhou 730000, China.

b Gansu Tech Innovation Centre for Environmental Geology and Geohazard Prevention, Lanzhou 730000, China.

c School of Earth Sciences, Lanzhou University, Lanzhou 730000, China.

d Department of Emergency Management of Gansu Province, Lanzhou 730070, China.

Abstract: Debris flows are a major geohazard in mountainous regions and pose a significant threat to life and property. The damage caused by debris flows has increased with the expansion of human settlements and activity into the mountainous regions of China. With regard to debris flow risk, previously unrecognized low-frequency debris flow catchments constitute an especially significant threat. According to our investigation, only about 500 of the more than 2000 catchments in the Bailong River basin have debris flow records. The main purpose of this paper is to introduce a new methodology using Artificial Intelligence (AI) that takes parameters related to both geomorphological conditions and material conditions as input to better distinguish low-frequency debris flow catchments (LFDs) from medium-high frequency debris flow catchments (MHFDs). A total of 449 prototypical debris flow catchments, 15 parameters, and 9 commonly used learning machines were used to build identification models. The debris flow catchments are divided into 4 cases (LO1-LO4) based on different sample ratios of LFDs to MHFDs, and each case is input into each classifier in turn. Based on model evaluation, the CHAID model in case LO2 performs best; it uses only five parameters (formation lithology index, land use index, vegetation coverage index, drainage density and landslide density index) to predict LFDs. The results indicate that, compared with MHFDs, LFDs are mainly distributed in areas with fewer landslides and better vegetation coverage. However, the distribution of LFDs is concentrated at FLI (formation lithology index) = 4, which corresponds to weak lithology. The tree classifier seems to be better at classifying fluvial processes. The models developed in this paper might help to quickly find LFDs in similar areas and to assess the risk of debris flows.

Keywords: low-frequency debris flow; Artificial Intelligence; Classification machine; Bailong River basin, China.

1. Introduction

Debris flows, one of the major geohazards in mountainous regions, pose a significant threat to life and property. They are rapid, surging flows of water heavily charged with rock sediments in a steep channel (Jomelli et al., 2015). The damage caused by debris flows has increased with the expansion of human settlements and activity into the mountainous areas of China (Zhou et al., 2016). With regard to debris flow risk, previously unrecognized low-frequency debris flow catchments constitute an especially significant threat. Many of them produce a debris flow only once in more than 50 years. Because these catchments have not had debris flows for a long time, many people choose to live at the catchment outlets. Once a rare rainstorm occurs and triggers a debris flow, it may cause a catastrophic disaster. Especially in the past 50 years, with the global increase of extreme precipitation events (Beniston and Stephenson, 2004; Fuhrer et al., 2006), the frequency of such sudden debris flow disasters has increased significantly. For these reasons, the identification of low-frequency debris flow catchments is of great significance.

Possible occurrence times and locations are two crucial topics in debris flow research. In terms of occurrence times, Caine (1980) proposed a global rainfall intensity-duration (I-D) threshold for the occurrence of shallow landslides and debris flows, which has been widely used (Guzzetti et al., 2008; Cannon et al., 2011; Elsebaie, 2012; Mirhosseini et al., 2013; Staley et al., 2013; Ma et al., 2017). In terms of occurrence locations, an effective method is to identify the traces left by past debris flows, such as associated landforms (alluvial fans, lobes, etc.) and deposits (poorly sorted cobble to boulder-sized clasts) (Costa and Jarrett, 1981; Jackson et al., 1987), which allows for the identification of debris flow processes in catchments where no historical information is available.

Most studies use thresholds for one or more geomorphologically related parameters to distinguish between debris flow and non-debris flow processes. Jackson et al. (1987) used the Melton Index, reflecting the ruggedness of a catchment (Melton, 1965), to discriminate between debris flow catchments and fluvial catchments in the southern Canadian Rocky Mountains; Smith (1988) used topographic maps and aerial photographs to identify debris flow catchments in granite and sandstone areas of San Francisco, and suggested that the shale regions in California with slopes greater than 20° are prone to debris flows; Shieh and Chen (1993) used effective watershed areas (the mean slope of the catchment is >15°) to identify potential debris flow catchments in eastern Taiwan, and found that a catchment with an effective watershed area greater than 0.06 km2 is prone to debris flows; Bovis and Jakob (1999) pointed out that a debris flow catchment in southwestern British Columbia, Canada has a Melton Index of >0.53; Pareschi et al. (2002) studied the potential danger zone assessment of volcanic debris flows in the Campania region of Italy and used the average slope of the catchment to divide the regional watershed into three types of potential danger zones: high (>25°), medium (15°-25°), and low (<15°); Wilford et al. (2004) developed a model using the catchment length combined with the Melton Index to differentiate between debris flow-prone and debris flood-prone catchments; Bertrand et al. (2013) assessed the thresholds for discriminating the types of flow response based on a large database of 620 fluvial and debris flow catchments and fans using the Melton ratio and the fan slope; Zhou et al. (2016) constructed threshold empirical models for the identification of potential debris flows using catchment area and catchment relief in the Wenchuan earthquake area. However, a limitation of these studies is that the morphometric approach ignores the conditions of sediment transfer in the catchment, which could be interrupted by sediment traps (e.g., glacial lakes) or by a reduction of the gravitational energy along the streams from upstream to downstream (Jackson et al., 1987; Marchi et al., 1993; Bertrand et al., 2013).

Heiser et al. (2015) used classification models to distinguish between three types of processes: debris flow processes, pure water processes, and fluvial sediment transport processes. They divided the parameters into two categories. The first category includes parameters related to the relief gradients, such as the average channel slope, the Melton Ratio (Melton, 1957), the ruggedness number (Strahler, 1952), the relief ratio (Schumm, 1954), the elevation relief ratio (Wood and Snell, 1960), and the roughness index (Cavalli et al., 2008). The second category includes parameters related to the catchment shape, including the weighted bifurcation ratio (Strahler, 1953), the sediment connectivity index (Borselli et al., 2008), the circularity ratio (Miller, 1953), the elongation ratio (Schumm, 1956), and the form factor (Horton, 1932).

Recently, Artificial Intelligence (AI) methods have become an important means of extracting information and knowledge from the increasingly available geographic data. However, many classification machines are currently available for machine learning, and their effectiveness and applicability differ. Comparative studies are needed to obtain the optimal model.

This paper aims to propose a new methodology using Artificial Intelligence (AI) that can better distinguish low-frequency debris flow catchments (LFDs) from medium-high frequency debris flow catchments (MHFDs). In this study, based on prototypical LFDs in the Bailong River basin, the optimal AI identification method is determined by comparing various classifying machines. Low-frequency debris flow catchments in the Bailong River basin that have not yet been recognized are then identified by the optimal AI method.

2. Study Area

2.1 General conditions

The tectonically active Bailong River basin lies at the transition zone from the Qinghai-Tibet Plateau (the first step) to the Yunnan-Guizhou Plateau and the Loess Plateau (the second step) (Fig. 1a). The topography is dominated by deep alpine valleys (Wei et al., 2008; Zhang et al., 2018), resulting in large altitude differences, with elevations ranging from 406 m to 4457 m (Fig. 1b). The basin is strongly influenced by the Asian monsoon, with annual precipitation ranging from 300 mm in the northwest to 900 mm in the southeast. Seventy-five percent of the precipitation occurs between June and September (Xiong et al., 2016; Li et al., 2018). The average minimum and maximum temperatures are −14 °C to 3 °C in January, and 11 °C to 27 °C in July (Johnson et al., 2006).


Fig. 1. Location of the Bailong River basin and a longitudinal profile across the Bailong River basin. The dividing lines divide China into three steps from west to east.

The Bailong River basin is structurally located on the eastern boundary of the Cenozoic Indian-Asian plate collision zone, and is crossed by a well-developed fault zone. The lithology of the strata in the area is quite complicated. In addition to Jurassic and Cretaceous strata, strata from the lower Paleozoic Silurian to the Quaternary are widely exposed. The low strength of various rocks provides sufficient conditions for the development of loose solid matter and the formation of landslides and debris flows (Fig. 2).

The lithology data are divided into very hard, hard, medium, soft, and very soft according to the hardness of the rock: Quaternary loose material (pebble, gravel, silty clay), very soft; Neogene stratified clastic rocks (conglomerate, shale, sandstone), very soft; Paleogene stratified clastic rocks (conglomerate), very soft; Cretaceous stratified clastic rocks (conglomerate, sandstone, mudstone), soft; Jurassic stratified clastic rocks (sandstone, mudstone, conglomerate, shale), medium; Triassic and Permian layered carbonate (limestone, sandstone, shale), hard; Triassic and Permian intrusive rocks (granite, diorite, granite gneiss, basalt, diabase, andesite, quartz sandstone, gneiss, quartzite, siliceous conglomerate, siliceous limestone, calcareous conglomerate), very hard; Permian layered metamorphic rocks (sandstone, sandy slate, tuff, phyllite), medium; Carboniferous layered carbonate rock (limestone), hard; Devonian layered carbonate rock (slate, phyllite, limestone), medium; Devonian layered carbonate rocks (limestone, shale, slate, sandstone), hard; Silurian layered metamorphic rocks (sandstone, limestone, phyllite, slate), medium. The data were obtained by vectorizing the 1:200,000 scale geological map.

Fig. 2. Geological map of the Bailong River basin.

2.2 Debris flows and their distribution

As a result of large terrain differences, uneven distribution of precipitation, and abundant unconsolidated debris from weak rock formations, landslides and gully-type debris flows are well developed (Wang et al., 2017; Xiong et al., 2016). According to the geohazards census by the Gansu Institute of Geological Environment Monitoring in 2015, there are 811 known debris flow catchments (Fig. 2), making the basin one of the areas in China most seriously affected by debris flow disasters. Among these debris flow events, the most serious is the Zhouqu debris flow of 7 August 2010, which killed 1765 people.

The distribution characteristics of debris flows help us to select the appropriate parameters for machine learning. Combined with the debris flow distribution in Fig. 2, we summarize the following points:

• Debris flows are mainly distributed in the middle and lower reaches of the mainstream of the Bailong River. In the middle reaches, debris flows are developed mainly along the large faults.

• Debris flows are mainly distributed in densely populated areas, of which Wudu is the most seriously affected, followed by Wenxian, Zhouqu and Tanchang; Diebu has fewer debris flows.

• Debris flows are mostly distributed near the faults and folds. The most typical ones are the Guanggaishan-Dieshan fault in the north of the Bailong River anticline and the south of the Diebu-Bailongjiang fault.

• Debris flows are mostly distributed in weak lithology areas, such as the Silurian phyllite, shale, slate, and Tertiary mudstone.

• Debris flows are mostly distributed in places with poor vegetation coverage, where human activities are intense.

3. Methods

As mentioned above, a variety of AI classification models have been developed, but determining which method can better identify LFDs requires a comparative study. In this study, a total of 449 prototypical debris flow catchments, 15 parameters, and 9 commonly used classification machines were used to build identification models. The modelling procedure includes several steps: prototypical debris flow catchment collection and classification, parameter selection and preprocessing, fitting different classification models, and model evaluation (Fig. 3). Finally, based on model evaluation, the performance of the models was comprehensively compared to select the optimal model.

Fig. 3. Flow chart of the modelling procedure.

3.1 Prototypical debris flow catchments collection and classification

The debris flow occurrence frequency data for the Bailong River basin were obtained from the Geohazards Census Form of Gansu Province produced by the Gansu Provincial Geological Environment Monitoring Institute, which details the location, occurrence time, occurrence frequency and casualties of each debris flow catchment. After field surveys and verification against a large body of literature, including Gansu Debris Flow, Study on Major Natural Disasters in Longnan, and Record of Landslide and Debris Flow Disaster in Longnan, 449 prototypical debris flow catchments were finally selected for modeling (Fig. 4).

Fig. 4. Frequency distribution of debris flows in Bailong River basin.

There is currently no settled definition of an LFD. According to previous experience, an LFD is generally considered to produce a debris flow once in ≥20-50 years. Here, we define LFDs as catchments where debris flows occur once in ≥50 years, because these catchments often cause catastrophic disasters; accordingly, MHFDs are catchments where debris flows occur more often than once in 50 years. In order to have a suitable sample ratio of LFDs to MHFDs (generally >2:8), and to increase contrast, four thresholds are used for the MHFDs: ≥5 times per year, ≥6 times per year, ≥7 times per year, and ≥8 times per year. Thus, we divided the sample catchments into 4 cases (Table 1), and selected the catchments with debris flow frequencies of 1-4 times per year as the test samples (a total of 271 catchments).
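The case definitions above can be sketched as a small labelling routine. This is only an illustration under stated assumptions: the function and constant names are hypothetical, frequencies are expressed in events per year, and the label convention (1 for LFDs, 0 for MHFDs) follows the note to Fig. 9.

```python
from typing import Optional

# MHFD (negative class) thresholds in events per year for the four cases.
CASE_THRESHOLDS = {"LO1": 5, "LO2": 6, "LO3": 7, "LO4": 8}

# LFD (positive class): at most one event in 50 years.
LFD_MAX_FREQ = 1 / 50


def label_catchment(freq_per_year: float, case: str) -> Optional[int]:
    """Return 1 for an LFD, 0 for an MHFD, or None if the catchment
    is not a training sample in the given case."""
    if freq_per_year <= LFD_MAX_FREQ:
        return 1
    if freq_per_year >= CASE_THRESHOLDS[case]:
        return 0
    return None


def is_test_sample(freq_per_year: float) -> bool:
    """Catchments with 1-4 events per year are held out as test samples."""
    return 1 <= freq_per_year <= 4
```

For example, a catchment with 5 events per year is a negative training sample in case LO1 but is left out of training in case LO2, which is how raising the threshold increases the contrast between the two classes.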

Table 1 The 4 cases of samples.

| Case | Positive (times per year) | Negative (times per year) | Sample ratio |
|---|---|---|---|
| LO1 | ≤1/50 | ≥5 | 24:140 |
| LO2 | ≤1/50 | ≥6 | 24:83 |
| LO3 | ≤1/50 | ≥7 | 24:60 |
| LO4 | ≤1/50 | ≥8 | 24:48 |

Note: “Positive” indicates the frequency threshold (times per year) of the LFDs, and “Negative” indicates the frequency threshold (times per year) of the MHFDs.

3.2 Parameter selection

The formation of a debris flow requires three main conditions: geomorphic conditions (e.g., catchment area, relative relief, channel gradient, and slope), material conditions (e.g., geology, lithology, surface cover, vegetation, and landslide), and triggering conditions (e.g., precipitation and ice and snow melt) (Iverson, 1997; Wei et al., 2008; Wei et al., 2015; Zhao et al., 2017). Special geological and geomorphological conditions result in high rainfall thresholds for LFDs, which can only be triggered by rare heavy rainfall. Therefore, parameters are selected in two groups: those related to geomorphological conditions and those related to material conditions. The parameters should be easy to obtain. With reference to existing research, appropriate parameters are selected based on the geomorphological and geological characteristics of the debris flow distribution described in section 2.2.

Parameters related to geomorphological conditions

The topography and geomorphological characteristics of the catchment determine its potential energy conditions and affect the water flow process (Wei et al., 2015). The relevant parameters selected are related to the relief gradients, form-roughness and geomorphological evolution of a debris flow catchment (Heiser et al., 2015).

Catchment area (CA), channel length (CL) (Wilford et al., 2004) and catchment perimeter (CP) reflect the basic morphometric information of a catchment.

Average slope (AS) was obtained using the Spatial Analyst function in ArcGIS. Catchment relief (CR) is the difference in elevation between the top and the outlet of the catchment (Zhou et al., 2016). Relief ratio (RR) indicates the overall steepness of a catchment and is found by dividing the CR by the longest horizontal distance of the catchment measured parallel to the major stream (Johnson et al., 1991), i.e., the CL. AS, CR and RR are important factors because they control the energy available for debris flow initiation and transportation (Zhou et al., 2016).

Drainage density (DD) was defined as the ratio of the CL to the CA; cut density (CD) was defined as the ratio of the RR to the CP; circularity ratio (CR2) was defined as the ratio of the CP to the CL (Miller, 1953).

The Hypsometric Integral (HI) indicates how the slope is distributed within the watershed (Johnson et al., 1991). The HI can be calculated using the relief ratio method proposed by Pike and Wilson (1971), which is easily applied to DEM data:

$$\mathrm{HI} = \frac{H_{\mathrm{mean}} - H_{\mathrm{min}}}{H_{\mathrm{max}} - H_{\mathrm{min}}} \qquad (1)$$

where $H_{\mathrm{mean}}$, $H_{\mathrm{max}}$, and $H_{\mathrm{min}}$ are the average, maximum, and minimum elevations in the catchment, respectively. When calculating HI, the same outlet is used as for the catchment.

Melton Ratio (MR) was defined as the ratio of the CR to the square root of the CA, which reflects the basin’s dynamics and its susceptibility to debris flows and is widely used in the study of debris flow (Melton, 1957; Jackson et al., 1987; Dikau et al., 1996; Bovis and Jakob, 1999; Wilford et al., 2004).
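The ratio-type parameters defined above can be collected in one routine. The following is a minimal sketch, not the authors' workflow: the `Catchment` class and field names are hypothetical, CR is approximated as maximum minus minimum elevation (assuming the outlet is the lowest point), and relief is converted to kilometres so that RR and MR come out dimensionless, which is an assumption about units.

```python
import math
from dataclasses import dataclass


@dataclass
class Catchment:
    area_km2: float       # CA
    perimeter_km: float   # CP
    channel_km: float     # CL
    h_min_m: float        # minimum elevation (assumed to be the outlet)
    h_max_m: float        # maximum elevation
    h_mean_m: float       # mean elevation


def morphometrics(c: Catchment) -> dict:
    """Derive the ratio parameters from basic catchment measurements."""
    relief_m = c.h_max_m - c.h_min_m          # CR in metres
    relief_km = relief_m / 1000.0             # unit conversion is an assumption
    rr = relief_km / c.channel_km             # RR = CR / CL
    return {
        "CR": relief_m,
        "RR": rr,
        "DD": c.channel_km / c.area_km2,      # DD = CL / CA, in km^-1
        "CD": rr / c.perimeter_km,            # CD = RR / CP, as defined in the text
        "CR2": c.perimeter_km / c.channel_km, # CR2 = CP / CL, as defined in the text
        "HI": (c.h_mean_m - c.h_min_m) / relief_m,   # Eq. (1)
        "MR": relief_km / math.sqrt(c.area_km2),     # MR = CR / sqrt(CA)
    }


example = morphometrics(
    Catchment(area_km2=4.0, perimeter_km=10.0, channel_km=2.5,
              h_min_m=1000.0, h_max_m=2000.0, h_mean_m=1400.0)
)
```

With these made-up inputs the catchment has 1000 m of relief over a 2.5 km channel, giving RR = 0.4, MR = 0.5 and HI = 0.4.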

In the study of Heiser et al. (2015), HI (equivalent to the elevation relief ratio) and MR were chosen as key parameters to distinguish different torrential processes (pure water processes, fluvial sediment transport processes, and debris flow processes), which indicates that the two geomorphic parameters are very promising for the identification of torrential processes.

It should be noted that the watershed boundaries and channels were obtained from hydrological analysis (ArcSWAT) using DEM data. The outlet of each catchment is placed at the apex of the fan (Bertrand et al., 2013; Heiser et al., 2015). The relevant geomorphic parameters selected in this paper can be obtained from the DEM and its derived data. The DEM data are ASTER GDEM data with a resolution of 30 m, obtained from METI (Japan) and NASA in 2009, and projected in geographic-WGS84 coordinates.

Parameters related to material conditions

The amount of material in a catchment that can be converted into debris flows, and how easily it can be mobilized, affect the rainfall threshold and formation process of the debris flow. Lithology, landslide, land use and vegetation conditions are important factors influencing the occurrence of debris flows (Lin et al., 2002; Avanzi et al., 2004; Lu et al., 2007; Tiranti et al., 2008; Zhang et al., 2013; Zhou et al., 2016). The relevant parameters selected in this study include the formation lithology index (FLI), land use index (LUI), landslide density index (LDI), and vegetation coverage index (VCI). The data related to these material conditions cannot be directly quantified per catchment unit, which limits the application of these parameters in machine-learning analyses of debris flows. Our solution to this problem is as follows.

The formation lithology data cannot be directly assigned to catchment units (Fig. 5a). In this study, the formation lithology data are divided into very hard, hard, medium, soft, and very soft according to the hardness of the rock, and are assigned the numbers 1-5, respectively. The assigned vector data are then converted to 30 m × 30 m raster data (Fig. 5b) so that each grid cell has its own lithology value. The average value of all grid cells in each catchment, obtained through regional statistical analysis (an ArcGIS tool that averages all the grid cells in a given area), is taken as the FLI of the catchment (Fig. 5c). The higher the FLI, the softer the overall lithology within the catchment and the richer the loose material that can be supplied.

The land use data were obtained from the interpretation of remote sensing images (Fig. 5d). Similarly, according to the degree of influence of human activities, unused land, forest land, grassland, cultivated land, and residential and industrial land were assigned the numbers 1-5, respectively (Fig. 5e). The average value of all grid cells in each catchment was obtained through regional statistical analysis and was defined as the LUI of the catchment (Fig. 5f). The higher the LUI, the greater the influence of human activities.

The landslide point density data were obtained by applying kernel density spatial smoothing to the mapped landslide points (Fig. 5g). The areas with landslide density ranges of <0.02, 0.02-0.06, 0.06-0.12, 0.12-0.20, and >0.20 were assigned the numbers 1-5, respectively (Fig. 5h). The average value of all grid cells in each catchment was obtained through regional statistical analysis and was defined as the LDI of the catchment (Fig. 5i). The higher the LDI, the more abundant the loose material that landslides can provide.

The NDVI was calculated from images obtained by China’s Gaofen-1 Satellite, with a resolution of 8 m. The NDVI data also cannot be directly assigned to catchment units (Fig. 5j). We used regional statistical analysis to obtain the average NDVI of each catchment and used it as the VCI of the catchment (Fig. 5k). The higher the VCI, the better the vegetation coverage.
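The reclassify-then-average procedure used for FLI, LUI, LDI and VCI can be illustrated without GIS software. The sketch below is a plain-Python stand-in for the ArcGIS regional statistics step, not the tool itself: rasters are represented as flat lists of cells, all names are hypothetical, and the handling of values that fall exactly on a landslide density class boundary is an assumption, since the text does not specify it.

```python
from bisect import bisect_right
from collections import defaultdict

# Hardness classes mapped to scores 1-5 (softer rock gets a higher score).
HARDNESS_SCORE = {"very hard": 1, "hard": 2, "medium": 3, "soft": 4, "very soft": 5}

# Upper bounds of landslide density classes 1-4; anything above is class 5.
LDI_BREAKS = [0.02, 0.06, 0.12, 0.20]


def landslide_density_class(density: float) -> int:
    """Map a kernel density value to a class 1-5.
    Boundary values go to the higher class (an assumption)."""
    return bisect_right(LDI_BREAKS, density) + 1


def zonal_mean(zone_ids, values) -> dict:
    """Average `values` per zone. Both arguments are equal-length sequences
    with one entry per grid cell; cells with a None zone or value are skipped."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for zone, value in zip(zone_ids, values):
        if zone is None or value is None:
            continue
        sums[zone] += value
        counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Four grid cells in two hypothetical catchments, "A" and "B".
cell_catchment = ["A", "A", "B", "B"]
cell_lithology = ["soft", "very soft", "hard", "hard"]
fli = zonal_mean(cell_catchment, [HARDNESS_SCORE[h] for h in cell_lithology])
```

Here catchment A receives FLI = 4.5 and catchment B receives FLI = 2.0. The same `zonal_mean` call applies to the land use scores, the landslide density classes and the raw NDVI values.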

Fig. 5. Process for assigning formation lithology data, land use data, landslide point density data and NDVI data to each catchment.

3.3 Parameter preprocessing

In summary, a total of 15 parameters were selected, of which 11 are related to geomorphological conditions and 4 are related to material conditions (Table 2).

Table 2 15 parameters and their units and ranges.

| No | Name of parameter | Abbreviation | Unit | Range |
|---|---|---|---|---|
| 1 | Catchment area | CA | km2 | 0.08-163.86 |
| 2 | Catchment perimeter | CP | km | 1.29-67.51 |
| 3 | Channel length | CL | km | 0.42-31.01 |
| 4 | Catchment relief | CR | m | 210-2819 |
| 5 | Average slope | AS | ° | 12.49-46.41 |
| 6 | Relief ratio | RR | | |
| 7 | Drainage density | DD | km-1 | |
| 8 | Cut density | CD | / | |
| 9 | Circularity ratio | CR2 | / | |
| 10 | Hypsometric Integral value | HI | / | |
| 11 | Melton Ratio | MR | / | 0.10-1.75 |
| 12 | Vegetation coverage index | VCI | / | 0.23-0.70 |
| 13 | Formation lithology index | FLI | / | 1.00-5.00 |
| 14 | Land use index | LUI | / | 1.00-5.00 |
| 15 | Landslide density index | LDI | / | 1.00-5.00 |

Note: the ranges of parameters 6-10 are 0.01-0.32, 0.01-18.22, 0.35-39.58, 0.30-0.78 and 41.03-2836.46; their assignment to individual rows is not preserved in this copy.

An important step in preprocessing the parameters was the detection of predictive variables with a high degree of correlation, i.e., multicollinearity (Heiser et al., 2015). To overcome the problem of multicollinearity, we removed parameters with a correlation coefficient >0.7, as proposed by Dormann et al. (2013). According to the correlation matrix of the parameters (Fig. 6), CA, CP, RR and MR were removed, and the remaining 11 parameters were selected for modeling.
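A correlation filter of this kind can be sketched as follows. The text gives the 0.7 threshold but not the rule for deciding which member of a correlated pair to drop, so the greedy keep-the-first-seen rule below is an assumption, as are the function names and the toy data.

```python
import math


def pearson(x, y) -> float:
    """Pearson correlation coefficient of two equal-length sequences.
    Raises ZeroDivisionError if either sequence is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def drop_correlated(columns: dict, threshold: float = 0.7) -> list:
    """Walk the parameters in insertion order and keep one only if its
    absolute correlation with every parameter kept so far is <= threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy data for five hypothetical catchments.
toy = {
    "CL": [1.0, 2.0, 3.0, 4.0, 5.0],
    "CP": [2.0, 4.0, 6.0, 8.0, 10.5],    # almost proportional to CL
    "AS": [30.0, 12.0, 25.0, 40.0, 18.0],
}
selected = drop_correlated(toy)
```

In this toy set CP is dropped because it is almost perfectly correlated with CL, while AS is retained. Because the outcome depends on the order in which parameters are visited, a different ordering could retain a different subset.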

Fig. 6. Correlation matrix of parameters.

3.4 Fitting different classification models

The classifiers used in this article run on SPSS Modeler, which is a set of data mining tools that enable us to quickly develop predictive models using business expertise (Wendler and Gröttrup, 2016). The software includes many learning machines that can perform binary classification, and comparative research is needed to determine the optimal model. Nine classifiers that are commonly used and relatively stable were selected in this study (Table 3).

Table 3 Selected classifiers and their characteristics (according to the Help System in SPSS Modeler).

| Classifier | Characteristic |
|---|---|
| C5.0 | It works by splitting the sample based on the field that provides the maximum information gain at each level. |
| Logistic Regression (LR) | It is a statistical technique for classifying records based on values of input fields. It is analogous to linear regression but takes a categorical target field instead of a numeric range. |
| Bayesian Network (BN) | It enables us to build a probability model by combining observed and recorded evidence with real-world knowledge to establish the likelihood of occurrences. The node focuses on Tree Augmented Naïve Bayes (TAN) and Markov Blanket networks that are primarily used for classification. |
| Support Vector Machine (SVM) | It enables you to classify data into one of two groups without overfitting. SVM works well with wide data sets, such as those with a very large number of input fields. |
| Chi-Square Automatic Interaction Detector (CHAID) | It generates decision trees using chi-square statistics to identify optimal splits. Unlike the C&R Tree and QUEST nodes, CHAID can generate nonbinary trees, meaning that some splits have more than two branches. |
| Quick Unbiased Efficient Statistical Tree (QUEST) | It provides a binary classification method for building decision trees, designed to reduce the processing time required for large C&R Tree analyses while also reducing the tendency found in classification tree methods to favor inputs that allow more splits. |
| Classification and Regression Tree (C&RT) | It generates a decision tree that allows us to predict or classify future observations. The method uses recursive partitioning to split the training records into segments by minimizing the impurity at each step. |
| Artificial Neural Network (ANN) | It uses a simplified model of the way the human brain processes information. It works by simulating a large number of interconnected simple processing units that resemble abstract versions of neurons. |
| Discriminant Analysis (DA) | It makes more stringent assumptions than logistic regression but can be a valuable alternative or supplement to a logistic regression analysis when those assumptions are met. |

The training samples of the 4 cases were input into each classifier one by one for exploratory modeling using the automatic classifier, and the optimal models were selected according to the AUC (the area under the receiver operating characteristic (ROC) curve; the greater its value, the better the classifier). The results are shown in Table 4, from which we can see that although many classification machines were tested, the optimal model in all 4 cases is CHAID.

Table 4 Modeling results.

| LO1 | AUC | LO2 | AUC | LO3 | AUC | LO4 | AUC |
|---|---|---|---|---|---|---|---|
| CHAID | 0.986 | CHAID | 0.988 | CHAID | 1.000 | CHAID | 0.966 |
| C5.0 | 0.940 | C5.0 | 0.964 | C5.0 | 0.979 | ANN | 0.963 |
| ANN | 0.935 | LR | 0.948 | LR | 0.965 | C5.0 | 0.962 |
| LR | 0.904 | DA | 0.924 | DA | 0.946 | LR | 0.954 |
| SVM | 0.903 | C&RT | 0.903 | ANN | 0.942 | DA | 0.949 |
| DA | 0.877 | ANN | 0.899 | SVM | 0.927 | SVM | 0.914 |
| BN | 0.701 | SVM | 0.898 | C&RT | 0.707 | BN | 0.749 |
| Quest | 0.500 | BN | 0.715 | Quest | 0.688 | Quest | 0.646 |
| C&RT | 0.500 | Quest | 0.500 | BN | 0.685 | C&RT | 0.500 |
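For readers unfamiliar with the AUC used to rank the classifiers, one common way to compute it is as the fraction of positive-negative pairs in which the positive sample receives the higher score, counting ties as half. The sketch below illustrates that definition only; it is not how SPSS Modeler computes the value internally, and the function name and data are hypothetical.

```python
def auc(labels, scores) -> float:
    """Area under the ROC curve via pairwise comparison.
    `labels` holds 1 (positive) or 0 (negative); `scores` holds the
    model's score for the positive class, one per sample."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Two positives and two negatives with made-up scores.
toy_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Three of the four positive-negative pairs are ordered correctly here, so the AUC is 0.75. A model that gives every sample the same score yields 0.5 under this definition.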

4. Results

4.1 Model evaluation and optimization

Five indicators, namely accuracy (ACC), sensitivity (TPR), specificity (TNR), AUC and test sample accuracy (TSA), were selected to evaluate the established models. The ACC ((TP+TN)/(TP+FN+FP+TN)) indicates the correct rate of all samples participating in the modeling (Table 5). The TPR (TP/(TP+FN)) indicates the model’s ability to predict Positives. The TNR (TN/(TN+FP)) indicates the model’s ability to predict Negatives. The AUC indicates the tradeoff between sensitivity and specificity (Kern, 2017). The TSA indicates the correct rate of samples that are not involved in model training.

Table 5 Confusion Matrix.

| True label | Predicted label: Positive | Predicted label: Negative |
|---|---|---|
| Positive | True Positive (TP) | False Negative (FN) |
| Negative | False Positive (FP) | True Negative (TN) |
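The three count-based indicators follow directly from the confusion matrix. The sketch below is a worked illustration with a hypothetical function name. The example counts are not reported in the paper: they are inferred from the LO2 sample ratio of 24:83 in Table 1 together with the LO2 sensitivity and specificity in Table 6, and should be read as an assumption.

```python
def evaluate(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "ACC": (tp + tn) / total,   # share of all samples classified correctly
        "TPR": tp / (tp + fn),      # share of true LFDs recognised as LFDs
        "TNR": tn / (tn + fp),      # share of true MHFDs recognised as MHFDs
    }


# Inferred LO2 counts: 24 LFDs (20 found, 4 missed) and 83 MHFDs (all correct).
lo2 = evaluate(tp=20, fn=4, fp=0, tn=83)
lo2_percent = {name: round(value * 100, 2) for name, value in lo2.items()}
```

These counts give ACC = 96.26%, TPR = 83.33% and TNR = 100%, matching the LO2 row of Table 6.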

For the CHAID model, the algorithm automatically examines the cross-tabulations between each input field and the outcome, tests for significance using a chi-square independence test, and selects important variables to build a decision tree (Kass, 1980). The maximum depth of the tree was set to 5 to prevent overfitting. The Bonferroni method was used to adjust significance values to better control the false-positive error rate (Benjamini and Yekutieli, 2001). The Alpha value (the significance level for splitting; the smaller the value, the fewer nodes the resulting tree will have) was used to fine-tune the model, and the results show that the model is relatively stable when the Alpha value is between 0.04 and 0.1 (Fig. 7). To avoid an overly complicated model, we chose 0.05 as the optimal value.

Predictor importance, which is determined by computing the reduction in variance of the target attributable to each predictor (Saltelli et al., 2004), was calculated to check whether the selected variables have predictive capability. Fig. 8 shows that only CR in LO4 is of low importance, so it was removed. The predictor importance of the remodeled LO4 is shown in Fig. 8 (LO4-remodel), from which we can see that the remaining four variables have some predictive capability.
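The split test described above can be illustrated with a small sketch: a Pearson chi-square independence test on a 2x2 cross-tabulation, followed by a Bonferroni adjustment and a comparison with Alpha. This is not the SPSS Modeler implementation. The restriction to 2x2 tables, the simple multiplier `n_tests`, and all function names are assumptions made for illustration, and the marginal totals are assumed to be non-zero.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (1 degree of freedom)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 d.o.f.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


def should_split(table, alpha=0.05, n_tests=1):
    """Split only if the adjusted p-value does not exceed Alpha."""
    _, p_value = chi_square_2x2(table)
    return bonferroni(p_value, n_tests) <= alpha


# Hypothetical cross-tabulation of a binary predictor against the outcome.
print(should_split([[20, 5], [5, 20]], alpha=0.05, n_tests=3))
```

A smaller Alpha makes `should_split` stricter, which is why lower values yield trees with fewer nodes.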

Fig. 7. Alpha parameter optimization results.


Fig. 8. Predictor importance of the models.


The evaluation results of the four final models are shown in Table 6, from which we can see that all models except LO3 work well. LO3 has the highest ACC but the lowest TSA, indicating that it may be overfitted. Taking TSA as the reference in order to guard against overfitting, LO2 is slightly better than LO1 and LO4, so the LO2-CHAID model (Fig. 9) is selected as the optimal model.

Table 6 Evaluation results of the 4 models

| Model | ACC | TPR | TNR | AUC | TSA |
| LO1 | 96.95% | 87.50% | 97.87% | 0.986 | 88.56% |
| LO2 | 96.26% | 83.33% | 100% | 0.988 | 89.30% |
| LO3 | 98.81% | 95.83% | 100% | 1 | 63.84% |
| LO4 | 93.06% | 87.50% | 95.83% | 0.977 | 85.23% |
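The selection logic can be restated as a short sketch using the ACC and TSA values reported in Table 6. The gap between ACC and TSA is used here as an informal overfitting indicator; that gap measure and the variable names are illustrative assumptions, not a procedure stated by the authors.

```python
# ACC and TSA (in percent) as reported in Table 6.
table6 = {
    "LO1": {"ACC": 96.95, "TSA": 88.56},
    "LO2": {"ACC": 96.26, "TSA": 89.30},
    "LO3": {"ACC": 98.81, "TSA": 63.84},
    "LO4": {"ACC": 93.06, "TSA": 85.23},
}

# Gap between modeling accuracy and held-out accuracy for each model.
gaps = {name: round(s["ACC"] - s["TSA"], 2) for name, s in table6.items()}

# Model with the highest held-out accuracy, and model with the largest gap.
best_by_tsa = max(table6, key=lambda name: table6[name]["TSA"])
largest_gap = max(gaps, key=gaps.get)

print(gaps)
print("selected:", best_by_tsa, "| largest ACC-TSA gap:", largest_gap)
```

Ranking by TSA rather than ACC favors the model that generalizes best to samples it has not seen.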

Fig. 9. CHAID model established by LO2. (Note: “1” indicates LFDs, and “0” indicates MHFDs)

4.2 The final model

One purpose of this paper is to identify potential LFDs, so the LFD branches in the model (“1” in Fig. 9) are extracted and simplified into the final model (Fig. 10). The final model can quickly identify LFDs using only five parameters (FLI, LUI, VCI, DD and LDI), and is easy to understand. As can be seen from Fig. 10, there are two paths to identify potential LFDs:

If a catchment without debris flow record has 3.976 < FLI ≤ 4 and LUI ≤ 2.951, it is a potential LFD;

If a catchment without debris flow record has 3.976 < FLI ≤ 4, LUI > 2.951, LDI ≤ 3.6, DD > 0.068, and VCI > 0.37, it is a potential LFD.
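The two identification paths can be written as a small rule function. This sketch reads the second path as the complementary branch of the first (3.976 < FLI ≤ 4 with LUI > 2.951); the function name and the example values are hypothetical.

```python
def is_potential_lfd(fli, lui, ldi, dd, vci):
    """Apply the two identification paths of the final model to a catchment
    that has no debris flow record."""
    # Both paths require the same FLI interval.
    if not (3.976 < fli <= 4):
        return False
    # Path 1: low land use index.
    if lui <= 2.951:
        return True
    # Path 2: higher LUI, few landslides, dense drainage, good vegetation cover.
    return ldi <= 3.6 and dd > 0.068 and vci > 0.37


# Hypothetical catchments, for illustration only.
print(is_potential_lfd(fli=4.0, lui=2.5, ldi=5.0, dd=0.01, vci=0.10))  # path 1
print(is_potential_lfd(fli=4.0, lui=3.0, ldi=3.0, dd=0.10, vci=0.50))  # path 2
```

Because the rules use only threshold comparisons, they can be applied to a catchment attribute table without any modeling software.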

Fig. 10. The final model.

4.3 Identification of LFDs in Bailong River basin

The final model was used to identify potential LFDs in 1395 unknown catchments of the Bailong River basin. The results are shown in Fig. 11. A total of 262 catchments were identified as potential low-frequency debris flow catchments, which will provide an important reference for debris flow risk management in this region. Of the known LFDs, four were not identified.

Fig. 11. Identification results of the potential LFDs.

5. Discussion

5.1 Material condition characteristics of LFDs

The importance of the parameters was ranked as FLI > LUI > VCI > DD > LDI in LO2-CHAID (Fig. 8 LO2). The four parameters related to material conditions (FLI, LUI, VCI and LDI) are all used, and DD is the only geomorphic condition parameter. This indicates that the comprehensive material conditions seem to be the main influencing factors of LFDs. The triggering rainfall thresholds of post-seismic debris flows are significantly lower than those of pre-earthquake debris flows because of the increased material supply (Lin et al., 2004; Chang et al., 2011; Chen et al., 2009; Chen, 2011), which supports this view. From another perspective, for a catchment with the potential energy conditions for debris flow, many studies generally assume that the current material conditions are sufficient and can be considered infinite for a period of time (Zimmermann and Haeberli, 1992; Haeberli, 1996; Zimmermann et al., 1997; Bovis and Jakob, 1999), so that the frequency of debris flow is determined entirely by the frequency of rainfall events that reach the rainfall threshold for this catchment. If the material conditions of a catchment are not sufficient, it will take some time after a debris flow for the material supply to recover to the level needed (Glade, 2005). Therefore, the frequency of debris flow in a catchment is determined by the coupling of material and rainfall (Glade, 2005; Jakob et al., 2005). The low-frequency debris flow is generally triggered by rare rainstorms, so the material supply conditions in the catchment play a decisive role.

From the final model in Fig. 10, we can see the distribution characteristics of LFDs in each parameter. Firstly, the LFDs are mainly distributed at 3.976 < FLI ≤ 4, that is, they are mainly distributed in the soft lithology area, and 42% of the LFDs can be identified in combination with LUI >2.591. Secondly, LFDs are mainly distributed in areas with fewer landslides (LDI ≤ 3.600), areas with better vegetation coverage (VCI > 0.37) and areas with higher drainage density (DD > 0.068). These characteristics indicate that, except for FLI, the LFDs are less rich in material supply than MHFDs. Good vegetation cover can not only protect the surface soil, but also slow the rate of precipitation infiltration and convergence. These characteristics reduce the susceptibility to debris flow to a certain extent. Moreover, the distribution of LFDs is concentrated at FLI = 4 (Fig. 12), which is notable. According to Jakob et al. (2005), debris recharge rates have a significant effect on the frequency of debris flows, so it is possible that FLI = 4 combined with other material conditions keeps the debris recharge rate low, allowing debris flows to occur at a lower frequency.

Coincidentally, the characteristics of these parameters are very similar to the survey results for the Loushe debris flow catchment, in which a debris flow occurred after heavy rainfall (109.8 mm) on August 7, 2017 near Baima Village in Wen County. The disaster killed one person and destroyed a large number of houses. However, there were no historical records of previous debris flows. The survey results are as follows:

• The vegetation coverage in the catchment is very high. The middle and upper reaches are shrub-based, and the coverage rate is over 90%. The downstream vegetation coverage is slightly poorer. Most of the slopes there are cultivated land, and the vegetation coverage is only about 30%.

• There are few large-scale loose sources such as landslides and collapses. However, the rock and soil are broken, and the metamorphic rock structure dominated by phyllite and slate is fragmented in the upper part of the channel, where the thickness of the residual slope deposits formed by physical weathering is between 0.5 and 2 m. The downstream residual slope deposits are thicker, at 1.2-3 m, and the exploration well shows that the thickness of the channel deposit reaches 4-10 m. Under heavy rainfall, the loose material of these slopes is activated, and along the way the two sides of the channel and the bed deposits are eroded to form a large-scale debris flow.

• The intensity of the storm exceeded the maximum in the local meteorological record, with heavy precipitation in two consecutive hours of 46.1 mm and 37.1 mm, respectively (Fig. 13).

• There had been no catastrophic debris flows in the catchment for 80 years, and local residents have built a large number of houses along the sides of the channel.


Fig. 12. Distribution characteristics of LFDs in formation lithology index and land use index.

Fig. 13. Rainfall characteristics of Loushe debris flow.

5.2 Geomorphological characteristics of LFDs

In addition, although MR and HI are not used in the final model, they are important reference parameters in many related debris flow classification studies. Reported MR thresholds for potential debris flow occurrence range from 0.25 to 0.95 (Jackson et al., 1987; Dikau et al., 1996; Bovis and Jakob, 1999; Wilford et al., 2004). Coincidentally, the MR values of LFDs are mainly concentrated in this contested threshold range (Fig. 14). It is possible that many LFDs in this range have not yet been discovered.
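As a hedged illustration of the MR range discussed above, the sketch below assumes the standard Melton ratio definition (catchment relief divided by the square root of catchment area, in consistent length units), which is not restated in this section. The function names and example values are hypothetical.

```python
import math


def melton_ratio(relief_km, area_km2):
    """Melton ratio: catchment relief divided by the square root of area.

    Relief in km and area in km^2 give a dimensionless value.
    """
    return relief_km / math.sqrt(area_km2)


def in_reported_threshold_range(mr, low=0.25, high=0.95):
    """True if MR falls inside the 0.25-0.95 range of published thresholds."""
    return low <= mr <= high


# Hypothetical catchment: 1.2 km of relief over 4 km^2.
mr = melton_ratio(relief_km=1.2, area_km2=4.0)
print(mr, in_reported_threshold_range(mr))
```

A catchment inside this range cannot be classified by MR alone, which is consistent with MR not being selected for the final model.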

Fig. 14. Distribution of MR parameters in LFDs and MHFDs.

Heiser et al. (2015) compared the hypsometric curves of different fluvial process types and found that debris flow process types tend to reflect a young stage, indicated by a convex shape of the hypsometric curves. In our study area, hypsometric curves for LFDs seem to converge more toward the 1:1 line (where the ratio between relative elevation and relative area is 1) than those of MHFDs (Fig. 15), and seem to be in a mature equilibrium state.
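The position of a hypsometric curve relative to the 1:1 line can be summarized by the hypsometric integral (HI). The sketch below uses the elevation-relief ratio approximation of Pike and Wilson (1971); whether the authors computed HI this way is an assumption, and the elevation samples are hypothetical.

```python
def hypsometric_integral(elevations):
    """Approximate HI as (mean - min) / (max - min) of sampled elevations,
    following the elevation-relief ratio of Pike and Wilson (1971)."""
    lowest, highest = min(elevations), max(elevations)
    if highest == lowest:
        raise ValueError("a flat surface has no relief, so HI is undefined")
    mean = sum(elevations) / len(elevations)
    return (mean - lowest) / (highest - lowest)


# Hypothetical DEM cell elevations (m) for two catchments.
print(hypsometric_integral([1000, 1500, 2000]))        # evenly distributed
print(hypsometric_integral([1000, 1000, 1000, 2000]))  # mostly low ground
```

A curve that follows the 1:1 diagonal corresponds to an HI near 0.5, while a convex curve gives a higher value.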

Fig. 15. Hypsometric curves for LFDs and MHFDs.

5.3 Applicability of the final model

Among the AI models, tree-type models (such as CHAID and C5.0) seem to be better at classifying fluvial processes, consistent with Heiser et al. (2015). Among the parameters selected in this study, the geomorphological parameters can be easily calculated from the DEM data. The material-related parameters are also commonly used, and they can be easily calculated using density or graded assignment. Therefore, the model established in this study is very practical.

The debris flows in the study area are gully-type debris flows, so the geomorphology evolves slowly, and the parameters related to geomorphological conditions will not change much within a certain period of time. The parameters related to material conditions, by contrast, are highly variable. For example, after an earthquake, collapses and landslides in the area will increase significantly; on the other hand, increased farming and other human activities will also change the land use and vegetation cover. Therefore, the identification of LFDs needs to be re-evaluated as these parameters change.


6. Conclusions

In this study, prototypical debris flow catchments in the Bailong River basin were selected for modeling to identify LFDs. Through model evaluation and optimization, the CHAID model in LO2 was selected as the final model. The results of the test samples show that the final model has good reliability and accuracy for the identification of LFDs. The parameters used in the models and their relative importance indicate that FLI, LUI, VCI, DD and LDI are stronger predictors, and the comprehensive material conditions seem to be the main influencing factors of LFDs. It can be seen from the final model that LFDs generally have weak lithology, few landslides and high vegetation coverage, which seems to be a common feature of LFDs in the study area.

Compared with previous studies, we simultaneously selected parameters related to material conditions and geomorphological conditions, and tried various classifiers to obtain a better model. The parameters used in this study are easy to obtain and the final model is simple and easy to understand. The final model can help us quickly find LFDs in similar areas and help assess the risk of debris flows.

Acknowledgments

This work was supported by the National Key Research and Development Program of China (Grant Nos. 2017YFC1501005, 2018YFC1504704); Major scientific and technological projects of Gansu Province (No. 19ZD2FA002); the National Natural Science Foundation of China (No. 41661144046); and the Fundamental Research Funds for the Central Universities (lzujbky-2018-k14, lzujbky-2017-it92). The DEM data were provided by the International Scientific and Technical Data Mirror Site, Computer Network Information Center, Chinese Academy of Sciences. We thank Runqiang Zeng, Susie Goodall, Liang Qiao, Siyuan Wang, Zhijie Cui, Yi Zhang, Xi Chen, Lintong Liu, and Zhongkang Yang for their assistance during the study.


References

Avanzi, G.D., Giannecchini, R., Puccinelli, A., 2004. The influence of the geological and geomorphological settings on shallow landslides. An example in a temperate climate environment: The June 19, 1996 event in northwestern Tuscany (Italy). Eng. Geol. 73, 215–228. https://doi.org/10.1016/j.enggeo.2004.01.005

Benjamini, Y., Yekutieli, D., 2001. The control of the false discovery rate in multiple testing under dependency. Ann. Stat. 29, 1165–1188. https://doi.org/10.1214/aos/1013699998

Beniston, M., Stephenson, D.B., 2004. Extreme climatic events and their evolution under changing climatic conditions. Glob. Planet. Change 44, 1–9. https://doi.org/10.1016/j.gloplacha.2004.06.001

Bertrand, M., Liébault, F., Piégay, H., 2013. Debris-flow susceptibility of upland catchments. Nat. Hazards 67, 497–511. https://doi.org/10.1007/s11069-013-0575-4

Borselli, L., Cassi, P., Torri, D., 2008. Prolegomena to sediment and flow connectivity in the landscape: A GIS and field numerical assessment. Catena 75, 268–277. https://doi.org/10.1016/j.catena.2008.07.006

Bovis, M.J., Jakob, M., 1999. The role of debris supply conditions in predicting debris flow activity. Earth Surf. Process. Landforms 24, 1039–1054. https://doi.org/10.1002/(SICI)1096-9837(199910)24:11<1039::AID-ESP29>3.0.CO;2-U

Caine, N., 1980. The rainfall intensity: duration control of shallow landslides and debris flows. Geogr. Ann. Ser. A, Phys. Geogr. 62, 23–27. https://doi.org/10.2307/520449

Cannon, S.H., Boldt, E.M., Laber, J.L., Kean, J.W., Staley, D.M., 2011. Rainfall intensity-duration thresholds for postfire debris-flow emergency-response planning. Nat. Hazards 59, 209–236. https://doi.org/10.1007/s11069-011-9747-2

Cavalli, M., Tarolli, P., Marchi, L., Dalla Fontana, G., 2008. The effectiveness of airborne LiDAR data in the recognition of channel-bed morphology. Catena 73, 249–260. https://doi.org/10.1016/j.catena.2007.11.001

Chang, C.W., Lin, P.S., Tsai, C.L., 2011. Estimation of sediment volume of debris flow caused by extreme rainfall in Taiwan. Eng. Geol. 123, 83–90. https://doi.org/10.1016/j.enggeo.2011.07.004

Chen, J.C., 2011. Variability of impact of earthquake on debris-flow triggering conditions: Case study of Chen-Yu-Lan watershed, Taiwan. Environ. Earth Sci. 64, 1787–1794. https://doi.org/10.1007/s12665-011-0981-4

Chen, N., Yang, C., Zhou, W., Hu, G., Li, H., Hand, D., 2009. The critical rainfall characteristics for torrents and debris flows in the Wenchuan earthquake stricken area. J. Mt. Sci. 6, 362–372. https://doi.org/10.1007/s11629-009-1064-9

Costa, J.E., Jarrett, R.D., 1981. Debris Flows in Small Mountain Stream Channels of Colorado and Their Hydrologic Implications. Environ. Eng. Geosci. xviii, 309–322. https://doi.org/10.2113/gseegeosci.xviii.3.309

Dikau, R., Brunsden, D., Schrott, L., Ibsen, M.L., 1996. Landslide Recognition. Identification, Movement, and Causes. John Wiley and Sons, Chichester. http://dx.doi.org/10.1016/S0169-555X(97)00047-0

Dormann, C.F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carré, G., Marquéz, J.R.G., Gruber, B., Lafourcade, B., Leitão, P.J., Münkemüller, T., Mcclean, C., Osborne, P.E., Reineking, B., Schröder, B., Skidmore, A.K., Zurell, D., Lautenbach, S., 2013. Collinearity: a review of methods to deal with it and a simulation study evaluating their performance. Ecography (Cop.). 36, 27–46. https://doi.org/10.1111/j.1600-0587.2012.07348.x

Elsebaie, I.H., 2012. Developing rainfall intensity–duration–frequency relationship for two regions in Saudi Arabia. J. King Saud Univ. - Eng. Sci. 24, 131–140. https://doi.org/10.1016/j.jksues.2011.06.001

Fuhrer, J., Beniston, M., Fischlin, A., Frei, C., Goyette, S., Jasper, K., Pfister, C., 2006. Climate risks and their impact on agriculture and forests in Switzerland. Clim. Change 79, 79–102. https://doi.org/10.1007/s10584-006-9106-6

Glade, T., 2005. Linking debris-flow hazard assessments with geomorphology. Geomorphology 66, 189–213. https://doi.org/10.1016/j.geomorph.2004.09.023

Guzzetti, F., Peruccacci, S., Rossi, M., Stark, C.P., 2008. The rainfall intensity-duration control of shallow landslides and debris flows: an update. Landslides 5, 3–17. https://doi.org/10.1007/s10346-007-0112-1

Haeberli, W., 1996. Gletscherschwund, Permafrostdegradation und periglaziale Murgänge im hochalpinen Bereich. In: Oddsson, B.H. (Ed.), Instabile Hänge Und Andere Risikorelevante Natürliche Prozesse. Verlag Birkhäuser, Basel, pp. 163–182. https://doi.org/10.1007/978-3-0348-9042-7_14

Heiser, M., Scheidl, C., Eisl, J., Spangl, B., Hübl, J., 2015. Process type identification in torrential catchments in the eastern Alps. Geomorphology 232, 239–247. https://doi.org/10.1016/j.geomorph.2015.01.007

Horton, R.E., 1932. Drainage‐basin characteristics. Eos, Trans. Am. Geophys. Union 13, 350–361. https://doi.org/10.1029/TR013i001p00350

Iverson, R.M., 1997. The physics of debris flows. Rev. Geophys. 35, 245–296. https://doi.org/10.1029/97RG00426

Jackson, L.E., Kostaschuk, R.A., MacDonald, G.M., 1987. Identification of debris flow hazard on alluvial fans in the Canadian Rocky Mountains. GSA Rev. Eng. Geol. 7, 115–124. https://doi.org/10.1130/REG7-p115

Jakob, M., Bovis, M., Oden, M., 2005. The significance of channel recharge rates for estimating debris-flow magnitude and frequency. Earth Surf. Process. Landforms 30, 755–766. https://doi.org/10.1002/esp.1188

Johnson, K.R., Ingram, B.L., Sharp, W.D., Zhang, P., 2006. East Asian summer monsoon variability during Marine Isotope Stage 5 based on speleothem δ18O records from Wanxiang Cave, central China. Palaeogeogr. Palaeoclimatol. Palaeoecol. 236, 5–19. https://doi.org/10.1016/j.palaeo.2005.11.041

Johnson, P.A., McCuen, R.H., Hromadka, T. V., 1991. Magnitude and frequency of debris flows. J. Hydrol. 123, 69–82. https://doi.org/10.1016/0022-1694(91)90069-T

Jomelli, V., Pavlova, I., Eckert, N., Grancher, D., Brunstein, D., 2015. A new hierarchical Bayesian approach to analyse environmental and climatic influences on debris flow occurrence. Geomorphology 250, 407–421. https://doi.org/10.1016/j.geomorph.2015.05.022

Kass, G. V., 1980. An Exploratory Technique for Investigating Large Quantities of Categorical Data. Appl. Stat. 29, 119. https://doi.org/10.2307/2986296

Kern, A.N., Addison, P., Oommen, T., Salazar, S.E., Coffman, R.A., 2017. Machine Learning Based Predictive Modeling of Debris Flow Probability Following Wildfire in the Intermountain Western United States. Math. Geosci. 49, 717–735. https://doi.org/10.1007/s11004-017-9681-2

Li, Y., Armitage, S.J., Stevens, T., Meng, X., 2018. Alluvial fan aggradation/incision history of the eastern Tibetan plateau margin and implications for debris flow/debris-charged flood hazard. Geomorphology 318, 203–216. https://doi.org/10.1016/j.geomorph.2018.06.016

Lin, C.W., Shieh, C.L., Yuan, B.D., Shieh, Y.C., Liu, S.H., Lee, S.Y., 2004. Impact of Chi-Chi earthquake on the occurrence of landslides and debris flows: Example from the Chenyulan River watershed, Nantou, Taiwan. Eng. Geol. 71, 49–61. https://doi.org/10.1016/S0013-7952(03)00125-X

Lin, P.S., Lin, J.Y., Hung, J.C., Yang, M. Der, 2002. Assessing debris-flow hazard in a watershed in Taiwan. Eng. Geol. 66, 295–313. https://doi.org/10.1016/S0013-7952(02)00105-9

Lu, G.Y., Chiu, L.S., Wong, D.W., 2007. Vulnerability assessment of rainfall-induced debris flows in Taiwan. Nat. Hazards 43, 223–244. https://doi.org/10.1007/s11069-006-9105-y

Ma, C., Wang, Y., Hu, K., Du, C., Yang, W., 2017. Rainfall intensity–duration threshold and erosion competence of debris flows in four areas affected by the 2008 Wenchuan earthquake. Geomorphology 282, 85–95. https://doi.org/10.1016/j.geomorph.2017.01.012

Marchi, L., Pasuto, A., Tecca, P.R., 1993. Flow processes on alluvial fans in the eastern Italian Alps. Zeitschrift fur Geomorphol. 37, 447–458.

Melton, M.A., 1957. An Analysis of the Relations among Elements of Climate, Surface Properties, and Geomorphology. Project NR 389-042, Technical Report 11. Office of Naval Research, Geography Branch.

Melton, M.A., 1965. The geomorphic and paleoclimatic significance of alluvial deposits in Southern Arizona. J. Geol. 73, 1–38. https://doi.org/10.1086/627044

Miller, V.C., 1953. A Quantitative Geomorphic Study of Drainage Basin Characteristics in the Clinch Mountain Area, Virginia and Tennessee. Dep. Geol. Columbia Univ. New York.

Mirhosseini, G., Srivastava, P., Stefanova, L., 2013. The impact of climate change on rainfall Intensity-Duration-Frequency (IDF) curves in Alabama. Reg. Environ. Chang. 13, 25–33. https://doi.org/10.1007/s10113-012-0375-5

Pareschi, M.T., Santacroce, R., Sulpizio, R., Zanchetta, G., 2002. Volcaniclastic debris flows in the Clanio Valley (Campania, Italy): Insights for the assessment of hazard potential. Geomorphology 43, 219–231. https://doi.org/10.1016/S0169-555X(01)00134-9

Pike, R.J., Wilson, S.E., 1971. Elevation-relief ratio, hypsometric integral, and geomorphic area-altitude analysis. Bull. Geol. Soc. Am. 82, 1079–1084. https://doi.org/10.1130/0016-7606(1971)82[1079:ERHIAG]2.0.CO;2

Saltelli, A., Tarantola, S., Campolongo, F., Ratto, M., 2004. Sensitivity Analysis in Practice: A Guide to Assessing Scientific Models. John Wiley, Chichester.

Schumm, S.A., 1954. The relation of drainage basin relief to sediment loss. Int. Assoc. Hydrol. Sci. 36, 216–219.

Schumm, S.A., 1956. Evolution of drainage systems and slopes in badlands at Perth Amboy, New Jersey. Bull. Geol. Soc. Am. 67, 597–646. https://doi.org/10.1130/0016-7606(1956)67[597:EODSAS]2.0.CO;2

Shieh, C.L., Chen, L.J., 1993. A study on the danger ranks of potential debris flows. J. Chi. Soil Water Cons. 24, 13–19. (In Chinese)

Smith, T.C., 1988. A method for mapping relative susceptibility to debris flows, with an example from San Mateo County. In: Ellen, S.D., Wieczorek, G.F. (Eds.), Landslides, Floods, and Marine Effects of the Storm of January 3-5, 1982, in the San Francisco Bay Region, CA. U.S. Geol. Survey Prof. Paper 1434, pp. 185–194.

Staley, D.M., Kean, J.W., Cannon, S.H., Schmidt, K.M., Laber, J.L., 2013. Objective definition of rainfall intensity-duration thresholds for the initiation of post-fire debris flows in southern California. Landslides 10, 547–562. https://doi.org/10.1007/s10346-012-0341-9

Strahler, A.N., 1952. Hypsometric (area-altitude) analysis of erosional topography. Bull. Geol. Soc. Am. 63, 1117–1142. https://doi.org/10.1130/0016-7606(1952)63[1117:HAAOET]2.0.CO;2

Strahler, A.N., 1953. Revision of Horton's quantitative factors in erosional terrain. EOS Trans. Am. Geophys. Union 34, 345.

Tiranti, D., Bonetto, S., Mandrone, G., 2008. Quantitative basin characterisation to refine debris-flow triggering criteria and processes: An example from the Italian Western Alps. Landslides 5, 45–57. https://doi.org/10.1007/s10346-007-0101-4

Wang, S., Meng, X., Chen, G., Guo, P., Xiong, M., Zeng, R., 2017. Effects of vegetation on debris flow mitigation: A case study from Gansu province, China. Geomorphology 282, 64–73. https://doi.org/10.1016/j.geomorph.2016.12.024

Wei, F., Gao, K., Hu, K., Li, Y., Gardner, J.S., 2008. Relationships between debris flows and earth surface factors in Southwest China. Environ. Geol. 55, 619–627. https://doi.org/10.1007/s00254-007-1012-3

Wei, F.Q., Gao, K.C., Jiang, Y.H., Zhang, S.J., 2015. Principles and methods of debris flow forecasting. Science Press, Beijing. (In Chinese) http://ir.imde.ac.cn/handle/131551/17812

Wendler, T., Gröttrup, S., 2016. Data Mining with SPSS Modeler: Theory, Exercises and Solutions. Springer, Cham, Switzerland. https://doi.org/10.1007/978-3-319-28709-6

Wilford, D.J., Sakals, M.E., Innes, J.L., Sidle, R.C., Bergerud, W.A., 2004. Recognition of debris flow, debris flood and flood hazard through watershed morphometrics. Landslides 1, 61–66. https://doi.org/10.1007/s10346-003-0002-0

Wood, W.F., Snell, J.B., 1960. A quantitative system for classifying landforms. Technical Report EP-124, U.S. Army Quartermaster Research and Engineering Center, pp. 20. http://geomorphometry.org/content/quantitative-system-classifying-landforms

Xiong, M., Meng, X., Wang, S., Guo, P., Li, Y., Chen, G., Qing, F., Cui, Z., Zhao, Y., 2016. Effectiveness of debris flow mitigation strategies in mountainous regions. Prog. Phys. Geogr. 40, 768–793. https://doi.org/10.1177/0309133316655304

Zhang, W., Chen, J. ping, Wang, Q., An, Y., Qian, X., Xiang, L., He, L., 2013. Susceptibility analysis of large-scale debris flows based on combination weighting and extension methods. Nat. Hazards 66, 1073–1100. https://doi.org/10.1007/s11069-012-0539-0

Zhang, Y., Meng, X., Jordan, C., Novellino, A., Dijkstra, T., Chen, G., 2018. Investigating slow-moving landslides in the Zhouqu region of China using InSAR time series. Landslides 15, 1299–1315. https://doi.org/10.1007/s10346-018-0954-8

Zhao, Y., Meng, X.M., Zheng, J.Y., Qing, F., 2017. Application of geomorphological theory in study of debris flow and exploration of its applied theory. Journal of Catastrophology 32, 43–49. (In Chinese) https://doi.org/10.3969/j.issn.1000-811X.2017.01.009

Zhou, W., Tang, C., Van Asch, T.W.J., Chang, M., 2016. A rapid method to identify the potential of debris flow development induced by rainfall in the catchments of the Wenchuan earthquake area. Landslides 13, 1243–1259. https://doi.org/10.1007/s10346-015-0631-0

Zimmermann, M., Haeberli, W., 1992. Climatic change and debris flow activity in high-mountain areas - a case study in the Swiss Alps. Catena Suppl. 22, 59–72.

Zimmermann, M., Mani, P., Romang, H., 1997. Magnitude-frequency aspects of alpine debris flows. Eclogae Geol. Helv. 90, 415–420. https://doi.org/10.5169/seals-168173


Declaration of competing interests

☒ The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

☐ The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:

Highlights

• A simple decision tree model was built to identify low-frequency debris flows

• Geomorphological and material parameters are used simultaneously for modeling

• Material conditions seem to be the main influencing factor of low-frequency debris flows

• The tree classifier seems to be better at classifying fluvial processes