Computers, Environment and Urban Systems 32 (2008) 53–65 www.elsevier.com/locate/compenvurbsys
Discerning landslide susceptibility using rough sets
Pece V. Gorsevski a,*, Piotr Jankowski b
a School of Earth, Environment & Society, Bowling Green State University, Bowling Green, OH 43403, USA
b Department of Geography, San Diego State University, San Diego, CA 92182, USA
Received 11 September 2006; received in revised form 27 April 2007; accepted 27 April 2007
Abstract
Rough set theory has been primarily known as a mathematical approach for analysis of a vague description of objects. This paper explores the use of rough set theory to manage the complexity of geographic characteristics of landslide susceptibility and extract rules describing the relationships between landslide conditioning factors and landslide events. The proposed modeling approach is illustrated using a case study of the Clearwater National Forest in central Idaho, which experienced significant and widespread landslide events in recent years. In this approach the landslide susceptibility is derived from decision rules of variable strengths computed in rough set analysis and presented on maps for roaded and roadless areas. The rough set approach to modeling landslide susceptibility offers advantages over other modeling methods in accounting for data vagueness and uncertainty and in potentially reducing data collection needs. From an application perspective the rough set-based approach is promising as a decision support tool in forest planning involving the maintenance, obliteration or development of new forest roads in steep mountainous terrain.
© 2007 Elsevier Ltd. All rights reserved.
Keywords: Landslides; Rough sets; Land-use management; Forest roads; Rule-based predictive models
* Corresponding author. E-mail addresses: [email protected] (P.V. Gorsevski), piotr@geography.sdsu.edu (P. Jankowski).
0198-9715/$ - see front matter © 2007 Elsevier Ltd. All rights reserved. doi:10.1016/j.compenvurbsys.2007.04.001
1. Introduction
Landslides initiated in steep mountainous terrain are a major concern to land-use managers worldwide. Human activities, such as road building and deforestation, accelerate landsliding, resulting in adverse impacts on the environment (Burton & Bathurst, 1998; Chung, Fabbri, & van Westen, 1995). In the US alone, an estimated average annual cost of $1.5 billion due to landslides has been reported (Glade, 1998). During wetter-than-average seasons, such as the winter of 2005/06 along most of the U.S. Pacific Coast, this number might have been substantially higher. In many developing countries landslides are a serious hazard resulting in loss of life and in losses of at least 0.5% of the gross national product (Chung et al., 1995). Poorly designed land-use practices such as road construction and forest harvesting are widely recognized to increase the risk of landsliding in forested and mountainous terrain (Dyrness, 1967; McClelland et al., 1997; Sidle, Pearce, & O’Loughlin, 1985). For instance, roads are often constructed on steep terrain and weak geologic material, which, when combined with heavy rainfall, constitutes a high-risk situation. Furthermore, when roads are placed on steep slopes, the geometry of the slope is changed because cut slopes are steeper than natural hill slopes. As a result, roads intercept water flowing downhill, altering the natural drainage flow of both surface and subsurface water (Elliot, Foltz, Luce, & Koler, 1996). Changes in forest cover, especially from clear-cuts, have consequences similar to those caused by roads, including a tendency to decrease slope stability and increase the risk of landsliding. Landslide events related to forest roads and harvesting are considered a major cause of deteriorated water quality, loss of fish spawning habitat, and debris jams that may break during peak flows, thereby scouring channels and destroying riparian vegetation. Therefore, reliable methods of mapping areas susceptible to landslides are essential for land-use management
P.V. Gorsevski, P. Jankowski / Computers, Environment and Urban Systems 32 (2008) 53–65
and are rapidly becoming the standard tool for sound land-use planning. Consequently, there is a need for methods that guide managers in choosing the best management strategies while minimizing impacts from land-use activities, such as road construction and forest harvesting, in vulnerable slope areas. Many methods and techniques have been proposed to evaluate where or when landslides are most likely to occur, some using Geographic Information Systems (GIS) (Carrara, 1983; Carrara, Cardinali, Guzzetti, & Reichenbach, 1995; Duan & Grant, 2000; Gorsevski, 2002; Gorsevski, Gessler, & Jankowski, 2003; Gorsevski, Gessler, & Jankowski, 2004; Gorsevski, Jankowski, & Gessler, 2005; Gorsevski, Gessler, Foltz, & Elliot, 2006a; Gorsevski, Gessler, Boll, Elliot, & Foltz, 2006b; Gorsevski, Jankowski, & Gessler, 2006c; Hammond, Hall, Miller, & Swetik, 1992; Montgomery & Dietrich, 1994; Okimura & Ichikawa, 1985; Wu & Sidle, 1995). Statistical models that link environmental attributes using spatial correlation are the most widely used methods for mapping landslide susceptibility (Carrara, 1983; Carrara et al., 1991; Chung et al., 1995; Chung & Fabbri, 1999; Dhakal, Amada, & Aniya, 2000; Gorsevski, 2002; Gorsevski et al., 2003, 2004, 2005, 2006a). However, the applicability of statistical methods is sometimes limited by rigid a priori data assumptions and by the lack of techniques to analyze and characterize the structural relationships existing in the data. This paper presents an alternative approach to the analysis of landslide susceptibility using the rough set (RS) theory (Pawlak, 1982).
The RS theory has been used in a number of discipline-specific applications such as remote sensing (Pal & Mitra, 2002), geographic information science (Ahlqvist, Keukelaar, & Oukbir, 2003), economics (McKee, 2003; Slowinski & Zopounidis, 1995), multi-attribute decision analysis (Pawlak & Slowinski, 1993, 1994), medicine (Komorowski & Øhrn, 1999), civil engineering (Arciszewski & Ziarko, 1990), and artificial intelligence (Predki, Slowinski, Stefanowski, Susmaga, & Wilk, 1998; Predki & Wilk, 1999). The RS theory deals with identifying structural relationships in the data and is useful in discovering potentially significant facts or data patterns in multidimensional attribute collections. Because of uncertainty and imprecision in classifying information, the RS theory considers information about classification decisions to be vague and approximates information classes by providing their "rough" description (Pawlak & Slowinski, 1993). The approximation in the RS theory reflects different levels of granularity in information, while the amount of information affording unambiguous classification of objects determines the degree of roughness. For example, observing landslides from aerial photos with different spatial or spectral resolution will yield different amounts of information for classifying landscape features into those associated with landslides and those not associated with landslides.
Compared to other mathematical approaches that deal with vagueness and uncertainty, the RS theory bears some resemblance to the Dempster–Shafer (D–S) theory (Dempster, 1967; Shafer, 1976). However, the difference between the two theories is that the RS theory uses sets of lower and upper approximations to represent knowledge in data collections, while the D–S theory uses belief functions represented by lower and upper probability functions. The approximations for a given data set derived with the RS theory are based solely on data, while the approximations derived by the D–S theory involve calculations of belief values using both subjective judgments and data (Dempster, 1967). The D–S methodology, coupled with the fuzzy k-means, was used by Gorsevski et al. (2005) to predict road related (RR) landslide susceptibility (spatial locations within roaded areas) and non-road related (NRR) susceptibility (spatial locations within non-roaded areas) for a study site used by McClelland et al. (1997). This paper focuses on the application of the RS methodology to the same study area using the same datasets and comparatively evaluates the fitness of both methodologies to map landslide susceptibility.
The proposed approach is demonstrated using a case study of the Clearwater National Forest (CNF) in central Idaho. The purpose of the study is to address the following two research questions: (1) Can RS models for RR and NRR landslide susceptibility be developed with the same predictor variables? (2) Is spatial prediction of landslide susceptibility using the RS approach better than spatial prediction using the D–S approach? Answering the first question will help establish whether different combinations of predictor variables are necessary in the development of predictive RS models and whether the development of two independent models for RR and NRR landslide susceptibility is necessary. Answering the second question will help determine whether the RS methodology yields better prediction of landslide susceptibility than the D–S approach.
2. Rough set theory
The rough set (RS) theory was introduced by Pawlak (1982, 1991) as a mathematical framework for approximate reasoning dealing with uncertainty and vagueness in decision making processes. The theory belongs to a branch of computer science called soft computing and has been used in data mining, knowledge discovery, pattern recognition, machine learning, and other areas of artificial intelligence. The RS theory is based on the assumption that each object in the universe is associated with knowledge which can be used to classify it. The knowledge is represented in an information table or an information system (data table) where rows represent objects (for example, landslide locations represented by polygons or points) and the columns represent attributes (for example, elevation, slope, solar radiation, and wetness index). A special form of information table is called a decision table, where one column represents
a decision (classification) and the rest of the columns represent conditions (object characteristics). More formally, an information system is a pair $\mathcal{A} = (U, A)$, where $U$ is a non-empty finite set of objects called the universe, and $A = \{a_1, \ldots, a_n\}$ is a non-empty finite set of attributes, i.e., $a_i : U \to V_{a_i}$ for $a_i \in A$, where $V_{a_i}$ is called the value set of the attribute $a_i$. The decision table is then a pair $\mathcal{A} = (U, A \cup \{d\})$, where $d$ represents a distinguished attribute called a decision. In the decision table the attributes that belong to $A$ are called conditional attributes or conditions and are assumed to be finite. The $i$th decision class is a set of objects $C_i = \{o \in U : d(o) = d_i\}$, where $d_i$ is the $i$th decision value taken from the decision value set $V_d = \{d_1, \ldots, d_{|V_d|}\}$. The indiscernibility relation occurs when objects with the same attribute values are present in the information system. For any subset of attributes $B \subseteq A$ the indiscernibility relation $IND(B)$ for $x, y \in U$ is defined as follows:
$$x \; IND(B) \; y \iff \forall a \in B : \; a(x) = a(y).$$
Indiscernible objects are not distinguishable and cannot be further classified. The indiscernibility relation thus partitions an information system into collections of indiscernible objects called elementary sets, which can be used to build knowledge about a real or abstract world (Slowinski, Stefanowski, Greco, & Matarazzo, 2000). This fact leads to the concept of a rough set in terms of approximation of any set $X$, where $X \subseteq U$, and the classification of elementary sets comprising the set $X$ into disjoint categories. Therefore, a rough set is a pair of a lower and an upper approximation built from indiscernible objects, where $\underline{B}X = \{x \in U : [x]_B \subseteq X\}$ is the $B$-lower approximation of $X$, $\overline{B}X = \{x \in U : [x]_B \cap X \neq \emptyset\}$ is the $B$-upper approximation of $X$, and $[x]_B$ denotes the elementary set of $IND(B)$ that contains $x$.
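As an illustration of these definitions, the following Python sketch computes elementary sets and the lower and upper approximations for a small, made-up decision table. The object names (c1 to c6), the two quartile-coded attributes, and the helper functions `elementary_sets` and `approximations` are hypothetical; they are not taken from the study data or from ROSE2.

```python
from collections import defaultdict

# Toy decision table (illustrative values only, not the study data):
# each cell has two quartile-coded conditions and a decision (1 = landslide).
TABLE = {
    "c1": {"slope": 4, "cti": 3, "dec": 1},
    "c2": {"slope": 4, "cti": 3, "dec": 0},  # indiscernible from c1
    "c3": {"slope": 4, "cti": 4, "dec": 1},
    "c4": {"slope": 1, "cti": 1, "dec": 0},
    "c5": {"slope": 2, "cti": 1, "dec": 0},
    "c6": {"slope": 3, "cti": 4, "dec": 1},
}


def elementary_sets(table, attrs):
    """Partition objects by IND(B): equal values on every attribute in attrs."""
    groups = defaultdict(set)
    for obj, row in table.items():
        groups[tuple(row[a] for a in attrs)].add(obj)
    return list(groups.values())


def approximations(table, attrs, target):
    """Return the (lower, upper) approximations of the object set `target`."""
    lower, upper = set(), set()
    for block in elementary_sets(table, attrs):
        if block <= target:   # elementary set lies entirely inside X
            lower |= block
        if block & target:    # elementary set overlaps X
            upper |= block
    return lower, upper


presence = {obj for obj, row in TABLE.items() if row["dec"] == 1}
lower, upper = approximations(TABLE, ["slope", "cti"], presence)
boundary = upper - lower
print(sorted(lower), sorted(upper), sorted(boundary))
```

In this toy table c1 and c2 share the same attribute values but carry different decisions, so they can only possibly belong to the presence class and end up in the upper approximation but not in the lower one.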
The lower approximation consists of all objects which certainly belong to the set and are certainly classified as elements of that set, while the upper approximation consists of all objects which possibly belong to the set and are possibly classified as elements of that set. The boundary or the doubtful region is defined as $BN_B(X) = \overline{B}X - \underline{B}X$, which is the difference between the upper and the lower approximation and is a set of elements which cannot be certainly classified as belonging to the set $X$ using the set's attributes. The boundary is a non-definable set of the universe and contains objects that cannot be classified with certainty into a set. A set $X$ is an ordinary set, called an exact set, if $BN_B(X) = \emptyset$, which results in $\underline{B}X = \overline{B}X$. Otherwise, if $BN_B(X) \neq \emptyset$, the set $X$ is a rough set that can be approximated with some accuracy. The accuracy of approximation is influenced by the existence of a doubtful region, where a greater doubtful region of a set yields a lower accuracy of the set. The accuracy of approximation is defined as follows:
$$\alpha_B(X) = \frac{|\underline{B}X|}{|\overline{B}X|}, \quad \text{where } X \neq \emptyset,$$
where the quotient is formed from the cardinalities of the lower and the upper approximation of $X$. When classifying objects, this ratio represents the percentage of possible correct decisions, while the measure of roughness that quantifies our knowledge is obtained by the following equation: $\rho_B(X) = 1 - \alpha_B(X)$, where $0 \le \alpha_B(X) \le 1$ and $0 \le \rho_B(X) \le 1$ for any $B \subseteq A$ and $X \subseteq U$. Additionally, the quality of classification can be defined as the quotient of the cardinalities of all lower approximations of the classes into which the object set is classified and the cardinality of the object set. Discovering dependences between attributes in an information system is a fundamental task in the RS theory that enables reduction of unnecessary attributes. Any minimal $B \subseteq A$ such that $IND(A) = IND(B)$ is a reduct in the information system $\mathcal{A}$, and $RED(\mathcal{A})$ is the set of all reducts for $\mathcal{A}$. Therefore, a reduct is a minimal subset of attributes that provides the same quality of classification as the set of all attributes. When an information table has more than one reduct, the intersection of all reducts, $CORE(\mathcal{A}) = \bigcap RED(\mathcal{A})$, comprises the core. The core represents a collection of the most important attributes in the information table. In the case of an information system that contains conditions and decisions, a reduced information table can be used to generate rules such as "if conditions then decisions" (Slowinski et al., 2000; Ziarko & Shan, 1994). Deterministic rules are generated when objects match one or more rules indicating a unique decision; non-deterministic rules are generated when objects match one or more rules indicating multiple decisions. For example, when conditions from the information system yield unique decisions of only presence of landslides or only absence of landslides, the rules are deterministic. The rules are non-deterministic when conditions from the information system yield non-unique decisions such as concurrent decisions of presence and absence of landslides.
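The notions of reduct and core can be made concrete with a brute-force sketch in Python. It is only an illustration under stated assumptions: the four-object table and the attribute names a, b and c are invented, the helper names `partition` and `find_reducts` are hypothetical, and the exhaustive search over attribute subsets is a simple stand-in that says nothing about how ROSE2 computes reducts.

```python
from itertools import combinations

# Invented information table: attribute c duplicates attribute b.
TABLE = {
    "o1": {"a": 1, "b": 1, "c": 1},
    "o2": {"a": 1, "b": 2, "c": 2},
    "o3": {"a": 2, "b": 1, "c": 1},
    "o4": {"a": 2, "b": 2, "c": 2},
}


def partition(table, attrs):
    """Elementary sets induced by IND(attrs), as a set of frozensets."""
    groups = {}
    for obj, row in table.items():
        groups.setdefault(tuple(row[a] for a in attrs), set()).add(obj)
    return {frozenset(g) for g in groups.values()}


def find_reducts(table, attrs):
    """Minimal attribute subsets B with IND(B) = IND(A), by exhaustive search."""
    full = partition(table, attrs)
    found = []
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            preserves = partition(table, subset) == full
            minimal = not any(set(r) <= set(subset) for r in found)
            if preserves and minimal:
                found.append(subset)
    return found


reducts = find_reducts(TABLE, ["a", "b", "c"])
core = set.intersection(*(set(r) for r in reducts))
print(reducts, core)
```

Because b and c carry the same information, either can be dropped, which gives two reducts; attribute a appears in both and therefore forms the core. The search is exponential in the number of attributes, so it is only practical for small tables such as this one.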
The number of objects satisfying the conditions of the rule determines the strength of the rule. The strength of the rule can be used as a measure of uncertainty of assigning objects to a decision class. Deterministic and non-deterministic rules can be generated to assign new objects to decision classes by matching rule premises with the objects' characteristics.
3. Materials and methods
3.1. Study area
The study area is within the Clearwater National Forest (CNF), located on the western slopes of the Rocky Mountains in north central Idaho. The CNF is located west of the Montana border and is bounded on three sides by four other National Forests: the Lolo in Montana; the Bitterroot in Montana and Idaho; the Nez Perce in Idaho; and the Panhandle in Idaho. The CNF map is shown in Fig. 1. Nearly 5200 km², including wilderness areas, are designated as roadless areas, while 2235 km² have been developed with roads. The topography is highly dissected with
Fig. 1. Distribution of NRR and RR predictor and test landslides over the Clearwater National Forest. The dark grey area represents the watershed used for the prediction of landslides while the light area represents the watershed used for the validation of landslides.
elevations ranging from 485 m to 2700 m and slopes varying between 0% and 100%. The climate is characterized by dry and warm summers, and cool wet winters (McClelland et al., 1997). Annual precipitation ranges from 600 mm at low elevations to more than 2000 mm at high elevations. Much of the annual precipitation falls as snow during winter and spring, while peak stream discharge occurs in late spring and early summer. The soils are highly variable but typically well drained and primarily derived from parent materials such as granitics, metamorphic rocks, quartzites, and basalts or surface colluvium. The land cover is predominantly forest with coniferous species such as grand fir (Abies grandis), Douglas fir (Pseudotsuga menziesii), subalpine fir (Abies lasiocarpa), western red cedar (Thuja plicata), and western white pine (Pinus monticola), together with various shrubs and grasses that have short growing seasons, particularly at the higher elevations.
3.2. Landslides
Landslides were assessed through aerial reconnaissance flights and field inventory in July 1996. Aerial photography was acquired at a scale of 1:15,840, followed by photo interpretation between October 1996 and February 1997 (McClelland et al., 1997). The landslides interpreted from aerial photos were classified into road related (RR) and non-road
related (NRR). A total of 865 landslides were recorded, with 55% RR and 45% NRR landslides. The presence or absence of a landslide was represented on a 30 m resolution grid with values of 1 for presence and 0 for absence. The initiation area of each landslide (i.e. the area where the main scarp of the landslide occurred) was interpreted as the point representing the presence of a landslide. The RR landslides, which are associated with forest roads, were coded separately from the NRR landslides. This enabled the development of independent quantitative models for each dataset (RR, NRR).
3.3. Compilation and development of data sets
The information tables for the RR and NRR landslides were developed from a total of 15 environmental attributes derived from 30 m digital elevation models (DEMs) using the System for Automated Geographical Analysis (SAGA) software (Conrad, 2004). The information tables consisted of the following primary and secondary attributes (Fig. 2): elevation, slope, aspect, profile curvature, plane curvature, tangent curvature, compound topographic index (CTI), stream power, sediment transport capacity index, catchment area, catchment slope, catchment height, convergence index, solar radiation and duration of insolation (Wilson & Gallant, 2000). The primary attributes are computed
Fig. 2. Primary and secondary attributes with their corresponding quartile classes that were used for the development of the RR and NRR information systems.
directly from a DEM and include the first and the second derivatives of the elevation surface. Secondary or compound attributes are computed from combinations of two or more primary attributes (Moore, Gessler, & Nielsen, 1993; Wilson & Gallant, 2000). Such computed attributes can be used to quantitatively describe water movement, hydrological processes, morphometry, catchment position, and soil-landscape processes (Beven & Kirkby, 1979; Burrough, Wilson, van Gaans, & Hansen, 2001; Moore, Grayson, & Ladson, 1991, 1993; O’Loughlin, 1986; Wilson & Gallant, 2000). These attributes comprise common topographic and hydrological characteristics of landslide susceptibility (Duan & Grant, 2000; Fernandes et al., 2004; Gorsevski et al., 2003; Montgomery & Dietrich, 1994). For instance, slope affects both the hydrological conditions and landslide susceptibility. Contributing area defines the location of the convergent segments in a landscape that are directly associated with the concentration of surface and subsurface flows, contributing to soil saturation and landslide susceptibility. The CTI is used to represent the spatial distribution of water flow across a given area and to predict zones of saturation. The other attributes characterize the spatial variability of specific processes occurring in the landscape that can be tied to landslide susceptibility.
In addition, a binary decision variable was included in both information tables to represent the presence and absence of landslides. Two sub-watersheds were used to separate the data into predictor and validation data (Fig. 1). The presence of landslides in each information table for the predictor data was represented by the landslides from one sub-watershed, while the absence of landslides was represented by a random sample of 1% of all non-landslide cells.
The random sampling was implemented to ensure that each non-landslide cell (absence) had the same chance of being present in the information table and to reduce the data size due to the non-landslide
cells representing the overwhelming majority of the landscape. Also, using the entire non-landslide dataset could have caused the far fewer landslide cells (presence) to be assimilated, leading to a false conclusion that there is no difference between the presence and absence cells used for the prediction. This is because a large proportion of the landscape did not experience any landsliding during the events described in the paper, but the absence of landslides does not imply an absence of landslide hazard, which can be associated with different events and time periods. The sampling was carried out within a 100 m wide road buffer (road right-of-way) for RR landslides and outside of the buffer for NRR landslides. The sampled percentage was an arbitrary choice for the analysis, but the other percentages attempted (i.e., 0.5%, 1.5% and 2%) had an insignificant effect on the spatial patterns of the final results. However, the different percentages that were tested contributed to small differences in the classification quality, and there were some trivial differences in the overall rule strengths.
3.4. Methods
Two information tables for the NRR and RR landslides were constructed by categorizing each primary and secondary attribute of continuous data into quartile classes (Fig. 2). The NRR information table consisted of a total of 999 objects representing the absence of landslides (non-landslides) and a total of 163 objects representing the presence of landslides (landslides). The RR information table consisted of a total of 337 objects representing the absence of landslides (non-landslides) and a total of 184 objects representing the presence of landslides (landslides). All objects in the information tables were described by fifteen attributes and one decision class represented by 1 for presence and 0 for absence of landslides.
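The quartile categorization described above could be sketched as follows. This is a minimal Python illustration under stated assumptions: the slope values are made up, the function name `quartile_classes` is hypothetical, the classes are coded 1 to 4, and the cut points come from the default method of `statistics.quantiles`. The source does not state how cut points or ties were handled in the actual analysis.

```python
from bisect import bisect_right
from statistics import quantiles


def quartile_classes(values):
    """Map each continuous value to a quartile class coded 1..4."""
    cuts = quantiles(values, n=4)  # three cut points separating four classes
    # A value equal to a cut point is placed in the higher class (assumption).
    return [bisect_right(cuts, v) + 1 for v in values]


# Made-up slope values in percent, one per grid cell.
slope_percent = [10, 20, 30, 40, 50, 60, 70, 80]
classes = quartile_classes(slope_percent)
print(classes)
```

Applying the same function to each of the fifteen continuous attributes would give one integer-coded column per attribute, which is the form an information table of this kind needs before approximations and rules are computed.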
The rough set analysis was carried out with ROSE2 software (Predki et al., 1998; Predki & Wilk, 1999). First, the approximations of the decision classes and the quality of classification were calculated for the information tables. This was followed by the generation of reducts, which were used for the derivation of decision rules. The obtained decision rules were implemented in ArcGIS to generate RR and NRR landslide susceptibility maps for the study area. The spatial predictions obtained by the RS approach were validated using independent test data and compared with spatial predictions obtained by the Fuzzy/D–S approach (Gorsevski et al., 2005).
4. Results
The RS approximation for the NRR landslide susceptibility showed a high quality of classification of 0.9948, which means that 99.48% of cells were correctly classified by means of primary and secondary attributes. The lower approximation for the absence class contained 996 cells while the upper approximation contained 1002 cells. The accuracy of approximation equaled 0.9940, which means that the ambiguity (roughness) associated with the classification of cells by means of primary and secondary attributes was very small. The difference between the total number of 999 cells representing the absence of landslides and the number of cells in the lower and upper approximations results from the indiscernibility relation. This means that there are a few cells indiscernible by attribute (their attribute values are the same), which have been assigned to different classes instead of being assigned to the same class. Such cells cannot belong with certainty to either of the classes (presence, absence) but only possibly, and hence constitute a boundary region of a rough set. A converse situation is also possible, where a few cells assigned to the same class, and hence indiscernible by class, have different attribute values.
In this case 6 cells belong to the boundary region and cannot be assigned with certainty to the non-landslides set, which means that either they were classified as "absent" but their attribute values were not unique to the "absent" class, or they had the same attribute values and some of them were classified as "absent" whereas others were classified as "present". The lower approximation for the presence class contained 160 cells while the upper approximation contained 166 cells. The accuracy of approximation equaled 0.9639. Such results indicate that the presence and absence classes are highly distinguishable. This is because the quality of the classification is close to one while very few objects fall in the boundary region, which is the region of uncertainty. The objects in the boundary region are indiscernible by means of the available knowledge from attributes, which prevents their precise classification. The lower approximation contains all objects that doubtlessly belong to the presence/absence set while the upper approximation contains all objects that possibly belong to the presence/absence set. Also, the high accuracy
of approximation values suggests very small doubtful regions for the sets in this analysis. The RS approximation for the RR landslide susceptibility also showed a high quality of classification of 0.9962. The lower approximation for the absence class contained 336 cells while the upper approximation contained 338 cells. The accuracy equaled 0.9941. The lower approximation for the presence class contained 183 cells while the upper approximation contained 185 cells. The accuracy equaled 0.9892. The results for both sets of landslides, RR and NRR, indicate that the fifteen attributes used in the RS analysis achieve high discrimination of cell-based locations between the presence and absence landslide classes, while the RS-based reduction analysis suggested that only some of the attributes are needed for the approximation of the two decision classes. In the next sections we present the results addressing the two research questions from the introduction section.
4.1. Comparison of predictor datasets and rule-based predictive models
The decision rules for predicting landslide susceptibility were obtained from different predictor datasets of reduced information tables. The reduction analyses (identification of reducts) were run in ROSE2 to find the most meaningful attributes in the information tables, called the core. The core consisted of 13 attributes for the NRR landslides and 10 attributes for the RR landslides. For the NRR information table all attributes except the profile curvature and the slope comprised the core. The quality of the classification for the core attributes was 0.9931, which is similar to the quality of classification of the whole set. There is only one reduct associated with this information table and it contains the same attributes as the core.
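The approximation figures reported above for the full attribute sets can be re-derived from the stated cell counts. The short Python sketch below does this arithmetic using the accuracy and quality-of-classification definitions from Section 2; the dictionary layout and the function names `accuracy` and `quality` are illustrative choices, while the counts themselves are those given in the text.

```python
# Counts reported in the Results text for each decision class:
# (cells in lower approximation, cells in upper approximation, objects in class)
REPORTED = {
    "NRR": {"absence": (996, 1002, 999), "presence": (160, 166, 163)},
    "RR": {"absence": (336, 338, 337), "presence": (183, 185, 184)},
}


def accuracy(lower, upper):
    """Accuracy of approximation: |lower approximation| / |upper approximation|."""
    return lower / upper


def quality(classes):
    """Quality of classification: summed lower approximations / all objects."""
    lower_total = sum(lower for lower, _, _ in classes.values())
    object_total = sum(n for _, _, n in classes.values())
    return lower_total / object_total


for table, classes in REPORTED.items():
    for label, (lower, upper, _) in classes.items():
        acc = accuracy(lower, upper)
        print(table, label, "accuracy:", round(acc, 4),
              "roughness:", round(1 - acc, 4), "boundary cells:", upper - lower)
    print(table, "quality of classification:", round(quality(classes), 4))
```

The boundary region of the NRR absence class, for example, is the 1002 - 996 = 6 cells discussed above, and the rounded ratios agree with the reported accuracies and qualities of classification.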
The RR information table contained the following 10 core attributes: elevation, slope, aspect, tangent curvature, stream power, sediment transport capacity index, catchment area, catchment slope, catchment height and duration of insolation. The attributes not included were: profile curvature, plane curvature, compound topographic index (CTI), convergence index, and solar radiation. The quality of classification with the core attributes was 0.9731, and again one reduct was extracted for further analysis.
A total of 182 decision rules were extracted for the NRR landslides, of which 178 were deterministic rules and 4 were non-deterministic or approximate rules (Table 1). The decision rules represent classification patterns discovered in the NRR landslides data set. The deterministic rules were used to describe the unique decisions (the landslide susceptibility predictions or the presence) while the non-deterministic rules were used to describe the non-unique decisions (i.e., the doubtful region or uncertainty). Out of 178 deterministic rules there were 98 rules associated with the absence of landslides and 80 rules associated with the presence of landslides. Table 1 shows 8 exemplary deterministic rules associated with the presence of landslides and all non-deterministic rules for the NRR landslides. The deterministic rules in the table have a high relative strength: the conditions listed in each rule premise are satisfied by at least 5 cell-based locations. A strength of 5 cells means that a single rule from the table correctly predicted at least 5 NRR landslides. The rules discovered from the landslides dataset differed not only in terms of the number of correctly predicted landslide locations but also in terms of the attributes constituting rule premises. In the case of the deterministic rules for NRR landslides the following attributes occurred in four or more rules: elevation, aspect, catchment slope and plane curvature, while a combination of elevation, catchment slope and aspect occurred in three rules (Table 2). In the non-deterministic rules a combination of aspect, elevation and tangent curvature occurred in three rules (Table 3).
Table 1
A subset of 8 deterministic rules with a strength of 5 cells or more, selected from the total of 178 rules, and 4 non-deterministic rules for the NRR landslide susceptibility
Deterministic rules
(A0 in [3, 4)) & (A3 = 4) & (A6 in [1, 3)) & (A7 = 4) & (A9 in [2, 3)) & (A12 in [3, 4)) => (Dec = 1)
(A2 in [2, 3)) & (A5 in [1, 2)) & (A6 in [1, 2)) & (A9 in [3, 4)) & (A12 in [3, 4)) => (Dec = 1)
(A0 in [3, 4)) & (A1 in [1, 2)) & (A7 in [3, 4)) & (A8 in [3, 4)) & (A9 in [2, 4)) => (Dec = 1)
(A2 in [3, 4)) & (A4 in [2, 3)) & (A6 in [1, 2)) & (A13 = 4) => (Dec = 1)
(A0 in [3, 4)) & (A1 in [3, 4)) & (A6 in [1, 2)) & (A7 in [1, 4)) & (A8 in [3, 4)) => (Dec = 1)
(A2 = 4) & (A3 = 4) & (A8 in [2, 3)) & (A9 in [2, 3)) & (A12 in [3, 4)) & (A14 in [2, 3)) => (Dec = 1)
(A0 in [2, 3)) & (A3 = 4) & (A4 in [1, 2)) & (A6 in [2, 3)) => (Dec = 1)
(A1 in [1, 3)) & (A3 = 4) & (A4 in [2, 3)) & (A6 in [2, 3)) => (Dec = 1)
Non-deterministic rules
(A0 in [1, 2)) & (A5 in [3, 4)) & (A6 in [1, 2)) & (A13 = 4) & (A14 in [1, 2)) => (Dec = 0) OR (Dec = 1)
(A0 in [2, 3)) & (A3 in [2, 3)) & (A6 in [3, 4)) & (A7 in [1, 2)) & (A14 = 4) => (Dec = 0) OR (Dec = 1)
(A0 in [2, 3)) & (A3 = 4) & (A5 = 4) & (A8 in [2, 3)) & (A12 = 4) & (A14 in [1, 2)) => (Dec = 0) OR (Dec = 1)
(A0 in [1, 2)) & (A6 in [2, 3)) & (A7 in [1, 2)) & (A8 in [3, 4)) & (A9 = 4) & (A12 in [2, 3)) => (Dec = 0) OR (Dec = 1)
where A0 = aspect; A1 = catchment area; A2 = catchment height; A3 = catchment slope; A4 = convergence index; A5 = cti; A6 = elevation; A7 = sediment transport capacity index; A8 = duration of insolation; A9 = plane curvature; A10 = profile curvature; A11 = slope; A12 = solar radiation; A13 = stream power; A14 = tangent curvature. Numbers in the brackets denote quartiles resulting from categorizing the data.
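To show how a rule of this form can be matched against cells and how its strength can be counted, the Python sketch below encodes the first deterministic rule of Table 1. Several things here are assumptions made for illustration: the reading of each premise as a half-open interval over integer quartile classes (so that "A3 = 4" becomes [4, 5)), the function names `matches` and `strength`, and the four example cells, which are invented and not part of the study data.

```python
# First deterministic rule of Table 1, with each condition stored as a
# half-open interval [low, high) over quartile classes (assumed reading).
RULE_1 = {
    "A0": (3, 4),    # aspect in [3, 4)
    "A3": (4, 5),    # catchment slope = 4
    "A6": (1, 3),    # elevation in [1, 3)
    "A7": (4, 5),    # sediment transport capacity index = 4
    "A9": (2, 3),    # plane curvature in [2, 3)
    "A12": (3, 4),   # solar radiation in [3, 4)
}


def matches(rule, cell):
    """True if the cell satisfies every condition in the rule premise."""
    return all(low <= cell[attr] < high for attr, (low, high) in rule.items())


def strength(rule, cells, decision):
    """Number of cells that satisfy the premise and carry the given decision."""
    return sum(1 for c in cells if matches(rule, c) and c["Dec"] == decision)


# Invented cells for illustration only.
CELLS = [
    {"A0": 3, "A3": 4, "A6": 1, "A7": 4, "A9": 2, "A12": 3, "Dec": 1},
    {"A0": 3, "A3": 4, "A6": 2, "A7": 4, "A9": 2, "A12": 3, "Dec": 1},
    {"A0": 3, "A3": 4, "A6": 3, "A7": 4, "A9": 2, "A12": 3, "Dec": 1},  # elevation outside [1, 3)
    {"A0": 2, "A3": 4, "A6": 1, "A7": 4, "A9": 2, "A12": 3, "Dec": 0},  # aspect outside [3, 4)
]
print(strength(RULE_1, CELLS, decision=1))
```

With real data, the same counting over the cells of the predictor sub-watershed would give the rule strengths used to select the rules listed in Table 1.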
Taking a closer look at the meaning of the discovered rules, for instance, the first deterministic rule (Table 1) implies that landslide susceptibility is associated with a combination of north-western aspect, mid catchment heights, high catchment slopes, lower and medium elevations, high erosivity areas, medium plane curvature and medium solar radiation. This implies that the north-western aspect and medium solar radiation may influence landsliding through high moisture and lower evapotranspiration rates due to little exposure to sunlight. The importance of the concentration of surface and subsurface flow can be interpreted from the catchment slope that influences the time of concentration, the sediment transport capacity index that influences flow acceleration, and the plane curvature that influences converging and diverging flow and soil water content. The significance of the elevation and
Table 2
Frequency of occurrences of environmental attributes in the deterministic decision rules for the NRR landslides

            ASP  CATCH CATCHH CATCHS   CONV    CTI   ELEV    SED DURINS  PLANC  SOLAR STRPOW  TANGC
ASP           4
CATCH         2      3
CATCHH        0      0      3
CATCHS        2      1      1      4
CONV          1      1      3      2      3
CTI           0      0      1      0      0      1
ELEV          3      2      2      3      3      1      6
SED           3      2      0      1      0      0      2      3
DURINS        2      2      1      1      0      0      1      2      3
PLANC         2      1      2      2      0      1      2      2      2      4
SOLAR         1      0      2      2      0      1      2      1      1      3      3
STRPOW        0      0      1      0      1      0      1      0      0      0      0      1
TANGC         0      0      1      1      0      0      0      0      1      1      1      0      1

where ASP = aspect; CATCH = catchment area; CATCHH = catchment height; CATCHS = catchment slope; CONV = convergence index; CTI = cti; ELEV = elevation; SED = sediment transport capacity index; DURINS = duration of insolation; PLANC = plane curvature; SOLAR = solar radiation; STRPOW = stream power; TANGC = tangent curvature. The frequency of occurrences is associated with the exemplary deterministic rules from Table 1.
Table 3
Frequency of occurrences of environmental attributes in the non-deterministic decision rules for the NRR landslides

            ASP CATCHS    CTI   ELEV    SED DURINS  PLANC  SOLAR STRPOW  TANGC
ASP           4
CATCHS        2      2
CTI           2      1      2
ELEV          3      1      1      3
SED           2      1      0      2      2
DURINS        2      1      1      1      1      2
PLANC         1      0      0      1      1      1      1
SOLAR         2      1      1      1      1      2      1      2
STRPOW        1      0      1      1      0      0      0      0      1
TANGC         2      1      1      2      1      0      0      0      1      3

The frequency of occurrences is associated with all of the non-deterministic rules from Table 1.
catchment height attributes suggests a narrow range of elevations where water redistribution is critical to landslide occurrence. For instance, preliminary data analysis showed that fewer landslides occurred at higher elevations, which might have been the result of snowpack and cooler temperatures preventing rain-on-snow events. Similar logic can be used to interpret each individual rule and specific implications associated with those rules. A total of 150 decision rules were extracted for the RR landslides where 144 were deterministic rules and 6 were non-deterministic (Table 4). Out of 144 deterministic rules there were 72 deterministic rules associated with both the absence and presence of landslides. Table 4 shows a total of 4 exemplary deterministic rules for the RR landslides with the strength equal to or exceeding 7 cells and all non-deterministic rules. The most frequent attributes of the deterministic rules are aspect, catchment height, catchment slope and elevation appearing in 3 decision rules. Catchment area, duration of insolation and slope are present in 2 decision rules, and sediment transport capacity index and tangent curvature in 1 decision rule. There is one rule where catchment height, catchment slope and elevation appear together (Table 5). For the non-deterministic
rules aspect and catchment area are present in all decision rules, elevation in 4 decision rules, catchment slope and tangent curvature in 3 decision rules, stream power in 2 decision rules, and sediment transport capacity index, duration of insolation and slope in 1 decision rule each (Table 6).

The rules discovered in the RS analysis can be evaluated in light of the research questions posed in the introductory section of the paper. Addressing the first research question about the feasibility of developing RS models with the same variables for predicting RR and NRR landslide susceptibility, we conclude that there are differences between the rule-based models of NRR and RR and, hence, two separate models are necessary for the prediction of landslide susceptibility. Figs. 3 and 4 illustrate the spatial implementation of the rules described in Tables 1 and 4. They also show the differences between the predictive models derived from different rule strengths as well as the difference between the NRR and RR predictive models. The legends in Figs. 3 and 4 show the percentages of susceptible and non-susceptible areas from the total area. The NRR model provides a better overall fit, which is illustrated in Table 7 suggesting that both the accuracy and the precision
Table 4
A subset of 4 deterministic rules with the strength of 7 cells and higher, selected from the total of 144 rules, and 6 non-deterministic rules for the RR landslide susceptibility

Deterministic rules
(A1 in [2, 4)) & (A2 in [2, 3)) & (A3 = 4) & (A6 in [2, 3)) & (A7 in [3, 4)) & (A11 = 4) => (Dec = 1)
(A0 in [2, 4)) & (A3 = 4) & (A6 in [1, 2)) & (A8 in [2, 3)) & (A14 in [3, 4)) => (Dec = 1)
(A0 in [3, 4)) & (A2 in [2, 3)) & (A3 = 4) & (A8 in [3, 4]) & (A11 = 4) => (Dec = 1)
(A0 in [3, 4)) & (A1 in [2, 3)) & (A2 in [2, 4)) & (A6 in [2, 3)) => (Dec = 1)

Non-deterministic rules
(A0 in [1, 2)) & (A1 in [1, 2)) & (A6 in [2, 3)) & (A11 in [1, 2)) & (A14 = 4) => (Dec = 0) OR (Dec = 1)
(A0 = 4) & (A1 in [1, 2)) & (A3 in [1, 2)) & (A6 in [2, 3)) & (A14 = 4) => (Dec = 0) OR (Dec = 1)
(A0 in [1, 2)) & (A1 in [1, 2)) & (A3 in [2, 3)) & (A6 in [2, 3)) & (A14 = 4) => (Dec = 0) OR (Dec = 1)
(A0 in [3, 4)) & (A1 in [2, 3)) & (A8 = 4) & (A13 in [1, 2)) => (Dec = 0) OR (Dec = 1)
(A0 = 4) & (A1 in [2, 3)) & (A6 in [2, 3)) & (A13 in [3, 4)) => (Dec = 0) OR (Dec = 1)
(A0 in [2, 3)) & (A1 in [1, 2)) & (A3 in [2, 3)) & (A7 in [1, 2)) => (Dec = 0) OR (Dec = 1)

where A0 = aspect; A1 = catchment area; A2 = catchment height; A3 = catchment slope; A4 = convergence index; A5 = cti; A6 = elevation; A7 = sediment transport capacity index; A8 = duration of insolation; A9 = plane curvature; A10 = profile curvature; A11 = slope; A12 = solar radiation; A13 = stream power; A14 = tangent curvature. Numbers in the brackets denote quartiles resulting from categorizing the data.
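As an illustration of how a rule premise from Table 4 can be checked against a single grid cell, the following is a minimal sketch in Python. It assumes that every terrain attribute has already been coded into quartile classes 1 to 4 and that a condition such as (A3 = 4) can be written as the half-open interval [4, 5). The names Rule, matches and cell, as well as the cell values, are hypothetical and are not taken from the software used in the study.

```python
from typing import Dict, Tuple

# A condition is a half-open interval [lo, hi) over quartile classes 1..4.
# "A3 = 4" is written as (4, 5) so that every condition has the same shape.
Rule = Dict[str, Tuple[int, int]]

# First two deterministic RR rules as printed in Table 4.
RULE_1: Rule = {"A1": (2, 4), "A2": (2, 3), "A3": (4, 5),
                "A6": (2, 3), "A7": (3, 4), "A11": (4, 5)}
RULE_2: Rule = {"A0": (2, 4), "A3": (4, 5), "A6": (1, 2),
                "A8": (2, 3), "A14": (3, 4)}


def matches(rule: Rule, cell: Dict[str, int]) -> bool:
    """Return True when the cell satisfies every condition of the premise."""
    return all(attr in cell and lo <= cell[attr] < hi
               for attr, (lo, hi) in rule.items())


# Hypothetical grid cell: quartile class of each terrain attribute.
cell = {"A0": 3, "A1": 2, "A2": 2, "A3": 4, "A6": 2,
        "A7": 3, "A8": 2, "A11": 4, "A14": 3}

print(matches(RULE_1, cell))  # True: all six conditions hold
print(matches(RULE_2, cell))  # False: A6 = 2 lies outside [1, 2)
```

A cell matched by at least one deterministic rule with decision (Dec = 1) would be mapped as susceptible under this reading of the rules.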
Table 5
Frequency of occurrences of environmental attributes in the deterministic decision rules for the RR landslides

              ASP    CATCH  CATCH_H  CATCH_S     ELEV      SED  DUR_INS    SLOPE    TANGC
ASP             3
CATCH           1        2
CATCH_H         2        2        3
CATCH_S         2        1        2        3
ELEV            2        2        2        2        3
SED             0        1        1        1        1        1
DUR_INS         2        0        1        2        1        0        2
SLOPE           1        1        2        2        1        1        1        2
TANGC           1        0        0        1        1        0        1        0        1

The frequency of occurrences is associated with the exemplary deterministic rules from Table 4.
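The frequency table above can be reproduced from the four deterministic RR rules of Table 4 by counting, for every pair of attributes, the rules in which both appear. The sketch below is only an illustration of that counting step; the function name cooccurrence and the set-based rule encoding are assumptions, and only the attribute names of each premise are used (the quartile intervals are ignored).

```python
# Attributes appearing in the four deterministic RR rules of Table 4,
# using the abbreviations of Table 5 (A0 = ASP, A1 = CATCH, A2 = CATCH_H,
# A3 = CATCH_S, A6 = ELEV, A7 = SED, A8 = DUR_INS, A11 = SLOPE, A14 = TANGC).
RULES = [
    {"CATCH", "CATCH_H", "CATCH_S", "ELEV", "SED", "SLOPE"},
    {"ASP", "CATCH_S", "ELEV", "DUR_INS", "TANGC"},
    {"ASP", "CATCH_H", "CATCH_S", "DUR_INS", "SLOPE"},
    {"ASP", "CATCH", "CATCH_H", "ELEV"},
]
ATTRS = ["ASP", "CATCH", "CATCH_H", "CATCH_S", "ELEV",
         "SED", "DUR_INS", "SLOPE", "TANGC"]


def cooccurrence(rules, attrs):
    """Lower-triangular matrix: entry [i][j] (j <= i) counts the rules that
    contain both attrs[i] and attrs[j]; the diagonal counts single attributes."""
    return [[sum(1 for r in rules if a in r and b in r) for b in attrs[: i + 1]]
            for i, a in enumerate(attrs)]


table = cooccurrence(RULES, ATTRS)
for name, row in zip(ATTRS, table):
    print(f"{name:<8}", *row)
```

The diagonal gives the number of rules in which an attribute occurs, and the off-diagonal entries give the pairwise co-occurrences discussed in the text.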
Table 6
Frequency of occurrences of environmental attributes in the non-deterministic decision rules for the RR landslides

              ASP    CATCH  CATCH_S     ELEV      SED  DUR_INS    SLOPE  STR_POW    TANGC
ASP             6
CATCH           6        6
CATCH_S         3        3        3
ELEV            4        4        2        4
SED             1        1        1        0        1
DUR_INS         1        1        0        0        0        1
SLOPE           1        1        0        1        0        0        1
STR_POW         2        2        0        1        0        1        0        2
TANGC           3        3        2        3        0        0        0        0        3

The frequency of occurrences is associated with all of the non-deterministic rules from Table 4.
Fig. 3. NRR landslide susceptibility maps based on different strength of the rules: (a) strength > 2 (b) strength > 3 (c) strength > 4 and (d) based on nondeterministic rules.
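The effect of the strength cut-offs used in the map panels can be sketched as follows. This is a toy example under stated assumptions, not the procedure used to produce the figures: the three rule premises, their strengths and the four grid cells are all hypothetical, and a cell is counted as susceptible when at least one retained rule matches it.

```python
from typing import Dict, List, Tuple

Rule = Dict[str, Tuple[int, int]]  # attribute -> half-open quartile interval [lo, hi)

# HYPOTHETICAL (premise, strength) pairs; individual rule strengths are not
# listed in the paper, so these values are for illustration only.
RULES: List[Tuple[Rule, int]] = [
    ({"A6": (1, 2), "A3": (4, 5)}, 5),
    ({"A0": (3, 4), "A6": (2, 3)}, 4),
    ({"A9": (2, 4)}, 3),
]

# HYPOTHETICAL grid cells coded in quartile classes 1..4.
CELLS = [
    {"A0": 3, "A3": 4, "A6": 1, "A9": 1},
    {"A0": 3, "A3": 2, "A6": 2, "A9": 1},
    {"A0": 1, "A3": 1, "A6": 4, "A9": 3},
    {"A0": 2, "A3": 3, "A6": 3, "A9": 1},
]


def matches(rule: Rule, cell: Dict[str, int]) -> bool:
    """True when the cell satisfies every condition of the rule premise."""
    return all(lo <= cell.get(attr, 0) < hi for attr, (lo, hi) in rule.items())


def susceptible_share(cells, rules, min_strength: int) -> float:
    """Percentage of cells matched by any rule with strength > min_strength."""
    kept = [rule for rule, strength in rules if strength > min_strength]
    hits = sum(1 for cell in cells if any(matches(rule, cell) for rule in kept))
    return 100.0 * hits / len(cells)


for cutoff in (2, 3, 4):
    print(f"strength > {cutoff}: {susceptible_share(CELLS, RULES, cutoff):.1f}% susceptible")
```

In this toy grid the susceptible share falls from 75% to 50% to 25% as the cut-off rises, which mirrors the direction of the trend reported for the susceptible area, although the numbers themselves carry no meaning.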
are slightly higher for the NRR predictive models than for the RR predictive models. Also, comparing Figs. 3d and 4d one can see that in the NRR model only 7.2% of land is classified as susceptible while in the RR model 21.5% of land is classified as susceptible. Figs. 3d and 4d are derived from non-deterministic rules, which are ambiguous in designating an area as either susceptible or non-susceptible to landslides. Therefore, the RR model prediction is associated with higher uncertainty than the NRR model.

Fig. 4. RR landslide susceptibility maps based on different strength of the rules: (a) strength > 5 (b) cut-strength > 6 (c) strength > 7 and (d) based on nondeterministic rules.

Table 7
Proportions of correctly predicted landslides and susceptible area for NRR and RR landslides derived from different deterministic and non-deterministic rule strengths

           Non-road related                       Road related
Strength   Landslides   Area    Ratio   Strength   Landslides   Area    Ratio
Deterministic rules
>2         73.8%        50.5%   1.46    >5         65.9%        51.3%   1.28
>3         53.4%        29.4%   1.82    >6         60.1%        39.6%   1.52
>4         49.2%        23.3%   2.11    >7         24.2%        21.5%   1.13
Union of deterministic and non-deterministic rules
>2         82.7%        54.0%   1.53    >5         69.5%        59.9%   1.16
>3         63.4%        34.8%   1.82    >6         64.6%        50.5%   1.28
>4         59.2%        29.0%   2.04    >7         35.9%        36.8%   0.97

4.2. Comparison of RS and D–S predictive models

Table 7 shows the results for the NRR and RR landslide susceptibility derived from different sets of rule strength, which are cross-tabulated against the test data. The ratio of correctly identified landslides to area susceptible to landslides was used for comparison of the model’s precision. A higher ratio suggests better precision of prediction of landslide susceptibility. For instance, applying the NRR landslide deterministic rules with the strength of 5 and higher (>4) from Table 1 results in 23.3% of susceptible area from the total area and the correct prediction of 49.2% of all landslides. The susceptible area (23.3%) derived from the deterministic rules with the strength of 5 and higher is shown in Fig. 3c. The rules can also be represented in terms of union and intersection of the deterministic and the non-deterministic rules. Table 7 shows the results that represent the union
(logical OR) between the deterministic and the non-deterministic rules. A comparison between the ratio numbers associated with the highest rule strengths (>4 for NRR and >7 for RR) suggests that in both cases of NRR and RR landslides the union between the deterministic and the non-deterministic rules improves the respective accuracies (from 49.2% to 59.2% for NRR and from 24.2% to 35.9% for RR) but the precision is higher when only the deterministic rules are applied (2.11 for NRR and 1.13 for RR). Accuracy denotes the percentage of correctly identified landslides. Precision denotes the ratio of the percentage of correctly identified landslides to the percentage of the study area classified as susceptible.

Table 8
Proportions of correctly predicted landslides and susceptible area for NRR and RR landslides associated with probabilities derived from the Fuzzy/D–S approach

                     Non-road related             Road related
Probability cutoff   Landslides   Area    Ratio   Landslides   Area    Ratio
Belief
>.4                  95.8%        67.8%   1.41    97.3%        73.7%   1.32
>.6                  82.6%        40.0%   2.07    86.5%        40.6%   2.13
>.8                  52.1%        13.5%   3.86    46.9%        14.7%   3.19
Plausibility
>.4                  97.8%        71.9%   1.36    100.0%       77.2%   1.30
>.6                  87.3%        47.4%   1.84    91.0%        46.9%   1.94
>.8                  64.7%        18.6%   3.48    57.2%        17.0%   3.36

Addressing the second research question about the performance differences between the RS approach and the Fuzzy/D–S approach (Gorsevski et al., 2005), Table 8 shows a cross-tabulation of goodness-of-fit between the Fuzzy/D–S models and the independent test data for presence and absence associated with the probabilities for both RR and NRR landslides. In the table the belief function denotes the lower bound for the probability function, whereas the plausibility function denotes the upper bound for the probability function. Ratio values suggest that there is a high precision especially with high probability cut-offs. For example, the belief function for the NRR landslides at a cut-off value of >.8 is associated with 52.1% correctly predicted landslides and 13.5% of the area classified as susceptible. The ratio value is 3.86, which is higher than any of the ratio values generated with the RS approach. Both approaches yield similar ratios for NRR locations using the cut-off value of >.6 for the Fuzzy/D–S approach and the rule strength of 5 and higher for the RS approach. Also, at the respective cut-off and rule strength it is interesting to observe that the accuracy of the Fuzzy/D–S approach is higher but the precision is lower than for the RS approach. For example, the ratio value for the NRR landslides from the deterministic rules is 2.11 (Table 7), which is higher than 2.07 (Table 8) from the Fuzzy/D–S approach. However, since the modeling approaches are considerably different the comparison here is intended to set a point of reference for the capabilities of the modeling approach. The uncertainty of prediction obtained with the RS approach is shown in Figs. 3d and 4d, which are based
on the non-deterministic rules from Tables 1 and 4. In the RS approach the uncertainty results from indiscernibility relations affecting object (cell location) classification while in the D–S approach the uncertainty results from the difference between the probability intervals of lower and upper bounds. The lower bound interval represents the belief function, which measures the amount of belief in the hypotheses on the basis of observed evidence. The upper bound interval is the plausibility that represents the maximum level of belief possible or the degree to which a hypothesis cannot be disbelieved. The D–S inference is statistically based and is used when the predictor database contains incomplete information while the RS analysis deals with structural relationships in the data and requires no external parameters for knowledge discovery. Further, calculating the belief and plausibility functions in the D–S approach may require subjective judgment in the absence of sufficient empirical data whereas the computation of lower and upper approximations of a rough set is based solely on attribute and decision class values in the information table.

5. Conclusions

The research reported in this paper shows that RS theory could provide a useful approach to derive RR and NRR landslide susceptibility decision models. We constructed two different information tables for the RR and NRR landslides using terrain derivatives to model presence and absence decision classes. Using the indiscernibility relation of RS theory the attributes of information tables were reduced to subsets of the same quality of decision class approximation as the full set of attributes and core attributes were used in the analysis. Rule-based models of RR and NRR landslide susceptibility were derived for different levels of empirical rule support and their mapping in GIS enabled a visual analysis of susceptible areas.
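To make the indiscernibility relation and the lower and upper approximations concrete, the following is a minimal sketch of the standard rough set definitions, not the software used in the study. The six-object information table is hypothetical, as are the function names. Objects with identical condition values are indiscernible; the lower approximation of a decision class collects the indiscernibility classes that fall entirely inside it, the upper approximation collects those that overlap it, and the quality of approximation is the share of objects that belong to some lower approximation.

```python
from collections import defaultdict

# HYPOTHETICAL information table: (quartile-coded condition values, decision).
TABLE = [
    ((3, 4, 1), 1),
    ((3, 4, 1), 1),
    ((1, 2, 2), 0),
    ((2, 3, 1), 1),
    ((2, 3, 1), 0),  # same description as the previous object, other decision
    ((4, 1, 3), 0),
]


def indiscernibility_classes(table):
    """Group object indices that share identical condition values."""
    classes = defaultdict(set)
    for idx, (conditions, _decision) in enumerate(table):
        classes[conditions].add(idx)
    return list(classes.values())


def approximations(table, target):
    """Return (lower, upper) approximations of the decision class `target`."""
    members = {i for i, (_c, d) in enumerate(table) if d == target}
    lower, upper = set(), set()
    for block in indiscernibility_classes(table):
        if block <= members:   # block lies entirely inside the class
            lower |= block
        if block & members:    # block overlaps the class
            upper |= block
    return lower, upper


def quality(table):
    """Share of objects classified with certainty into some decision class."""
    decisions = {d for _c, d in table}
    certain = sum(len(approximations(table, d)[0]) for d in decisions)
    return certain / len(table)


lower, upper = approximations(TABLE, target=1)
print(sorted(lower))          # [0, 1]
print(sorted(upper - lower))  # [3, 4] is the boundary region
print(round(quality(TABLE), 3))
```

Under this reading, objects in the lower approximation support deterministic rules, while objects in the boundary region give rise to non-deterministic rules of the form (Dec = 0) OR (Dec = 1).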
The first research question pertained to the difference between predictor datasets and the analysis results showed that the core attributes for the RR and NRR landslides were different. In the case of the RR model a total of 10 attributes were used while in the case of the NRR model a total of 13 attributes were used. Accuracy and precision
of predicting landslide susceptible areas were higher for the NRR model than for the RR model. This may be because the NRR landslides are more closely associated than the RR landslides with the geomorphological characteristics and natural processes represented by the 15 primary and secondary variables used in the analysis. It may also be due to the difference in the spatial sampling of non-landslide cells associated with the NRR and RR landslide areas (i.e. within road buffer and outside of the buffer). Since there is an inverse relationship between accuracy/precision and uncertainty, the uncertainty is higher for the RR model than for the NRR model.

The second research question compared the RS approach with the Fuzzy/D–S (Dempster–Shafer) approach (Gorsevski et al., 2005) in terms of the accuracy and precision of predicting landslide susceptible areas. The comparison revealed that although the accuracy of the Fuzzy/D–S approach was higher than that of the RS approach, the precision of prediction achieved with the RS approach was comparable with the Fuzzy/D–S approach. This result makes the RS approach a potentially attractive spatial decision support tool for classifying locations into non-overlapping decision classes (e.g. presence and absence) since the computation of predictive rules is based entirely on data and does not utilize subjective judgments, as may be the case with the Fuzzy/D–S approach.

Further research is necessary to fully evaluate the utility of the RS approach to modeling landslide susceptibility and other spatial phenomena. In the study reported herein we used a limited data sampling of landslide events and their characteristics. Nonetheless, the advantage of the RS modeling approach is that the decisions generated by the model are explicit and the modeling process is not limited to restrictive assumptions.
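The accuracy and precision comparison summarized above rests on a simple ratio that can be recomputed from the published percentages. The sketch below uses the deterministic-rule rows of Table 7; the function name precision_ratio and the tuple layout are assumptions made for illustration.

```python
# (strength cut-off, correctly predicted landslides %, susceptible area %)
# taken from the deterministic-rule rows of Table 7.
NRR = [(">2", 73.8, 50.5), (">3", 53.4, 29.4), (">4", 49.2, 23.3)]
RR = [(">5", 65.9, 51.3), (">6", 60.1, 39.6), (">7", 24.2, 21.5)]


def precision_ratio(landslides_pct: float, area_pct: float) -> float:
    """Ratio of correctly predicted landslides (%) to susceptible area (%)."""
    return landslides_pct / area_pct


for name, rows in (("NRR", NRR), ("RR", RR)):
    for cutoff, landslides_pct, area_pct in rows:
        ratio = precision_ratio(landslides_pct, area_pct)
        print(f"{name} strength {cutoff}: accuracy {landslides_pct}%, ratio {ratio:.2f}")
```

A ratio above 1 means that the rules capture a larger share of the landslides than the share of land they flag as susceptible.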
Additional advantages of the RS approach include a method of reducing the cognitive complexity of the attribute space by finding reducts (subsets of attributes) and the flexibility to decrease (prune) decision rules based on their strength, thus providing analysts and land managers with an additional valuable tool in finding essential attributes.

References

Ahlqvist, O., Keukelaar, J., & Oukbir, K. (2003). Rough and fuzzy geographical data integration. International Journal of Geographical Information Science, 17(3), 223–234.
Arciszewski, T., & Ziarko, W. (1990). Inductive learning in civil engineering: A rough sets approach. Microcomputers in Civil Engineering, 5(1), 19–28.
Beven, K., & Kirkby, M. J. (1979). A physically based, variable contributing area model of basin hydrology. Hydrological Sciences Bulletin, 24, 43–69.
Burton, A., & Bathurst, J. C. (1998). Physically based modelling of shallow landslide sediment yield at a catchment scale. Environmental Geology, 35, 89–99.
Burrough, P. A., Wilson, J. P., van Gaans, P. F. M., & Hansen, A. J. (2001). Fuzzy k-means classification of topo-climatic data as an aid to forest mapping in the Greater Yellowstone Area, USA. Landscape Ecology, 16, 523–546.
Carrara, A. (1983). Multivariate models for landslide hazard evaluation. Mathematical Geology, 15, 402–426.
Carrara, A., Cardinali, M., Detti, R., Guzzetti, F., Pasqui, V., & Reichenbach, P. (1991). GIS techniques and statistical models in evaluating landslide hazard. Earth Surface Processes and Landforms, 16, 427–445.
Carrara, A., Cardinali, M., Guzzetti, F., & Reichenbach, P. (1995). GIS technology in mapping landslide hazard. In A. Carrara & F. Guzzetti (Eds.), Geographical information systems in assessing natural hazards (pp. 135–175). Dordrecht, The Netherlands: Kluwer Academic Publishers.
Chung, C. F., Fabbri, A. G., & van Westen, C. J. (1995). Multivariate regression analysis for landslide hazard zonation. In A. Carrara & F. Guzzetti (Eds.), Geographical information systems in assessing natural hazards (pp. 107–133). Dordrecht, The Netherlands: Kluwer Academic Publishers.
Chung, C. F., & Fabbri, A. G. (1999). Probabilistic prediction model for landslide hazard mapping. Photogrammetric Engineering and Remote Sensing, 65, 1389–1399.
Conrad, O. (2004). System for Automated Geographical Analysis (SAGA) software. Version 1.1. Department for Physical Geography, University of Göttingen (Germany), Accessed 12 June 2004, http://www.sagagis.org.
Dempster, A. P. (1967). Upper and lower probabilities induced by a multivalued mapping. Annals of Mathematical Statistics, 38, 325–339.
Dhakal, A. S., Amada, T., & Aniya, M. (2000). Landslide hazard mapping and its evaluation using GIS: An investigation of sampling schemes for a grid-cell based quantitative method. Photogrammetric Engineering and Remote Sensing, 66, 981–989.
Duan, J., & Grant, G. E. (2000). Shallow landslide delineation for steep forest watersheds based on topographic attributes and probability analysis. In J. P. Wilson & J. C. Gallant (Eds.), Terrain analysis principles and applications (pp. 311–329). New York: Wiley.
Dyrness, C. T. (1967). Mass soil movements in the H.J. Andrews Experimental Forest. USDA Forest Service, Pacific Northwest Forest and Range Experiment Station, Research Paper PNW-42 (p. 12).
Elliot, W. J., Foltz, R. B., Luce, C. H., & Koler, T. E. (1996). Computer-aided risk analysis in road decommissioning. Watershed Restoration Management: Physical, chemical, and biological consideration. In Proceedings AWRA annual symposium, Syracuse, New York, July 14–17.
Fernandes, N. F., Guimarães, R. F., Gomes, R. A. T., Vieira, B. C., Montgomery, D. R., & Greenberg, H. (2004). Topographic controls of landslides in Rio de Janeiro: field evidence and modeling. Catena, 55, 163–181.
Glade, T. (1998). Establishing the frequency and magnitude of landslide-triggering rainstorm events in New Zealand. Environmental Geology, 35, 160–174.
Gorsevski, P. V. (2002). Landslide hazard modeling using GIS. Ph.D. Dissertation, University of Idaho, Moscow, USA.
Gorsevski, P. V., Gessler, P. E., & Jankowski, P. (2003). Integrating a fuzzy k-means classification and a Bayesian approach for spatial prediction of landslide hazard. Journal of Geographical Systems, 5, 223–251.
Gorsevski, P. V., Gessler, P. E., & Jankowski, P. (2004). Spatial prediction of landslide hazard using fuzzy k-means and Bayes theorem. In W. Widacki, A. Bytnerowicz, & A. Riebau (Eds.), A message from the Tatra: Geographical Information Systems and remote sensing in Mountain Environmental Research (pp. 159–172). Krakow, Poland: Jagiellonian University Press.
Gorsevski, P. V., Jankowski, P., & Gessler, P. E. (2005). Spatial prediction of landslide hazard using fuzzy k-means and Dempster–Shafer theory. Transactions in GIS, 9, 455–474.
Gorsevski, P. V., Gessler, P. E., Foltz, R. B., & Elliot, W. J. (2006a). Spatial prediction of landslide hazard using logistic regression and ROC analysis. Transactions in GIS, 10(3), 395–415.
Gorsevski, P. V., Gessler, P. E., Boll, J., Elliot, W. J., & Foltz, R. B. (2006b). Spatially and temporally distributed modeling of landslide susceptibility. Geomorphology, 80(3–4), 178–198.
Gorsevski, P. V., Jankowski, P., & Gessler, P. E. (2006c). An heuristic approach for mapping landslide hazard by integrating fuzzy logic with analytic hierarchy process. Control and Cybernetics, 35(1), 121–146.
Hammond, C., Hall, D., Miller, S., & Swetik, P. (1992). Level I Stability Analyses (LISA) documentation for version 2.0. Gen. Technical Report INT-285, U.S. Department of Agriculture, Forest Service, Intermountain Research Station, Ogden, UT (p. 190).
Komorowski, J., & Øhrn, A. (1999). Modelling prognostic power of cardiac tests using rough sets. Artificial Intelligence in Medicine, 15(2), 167–191.
McClelland, D. E., Foltz, R. B., Wilson, W. D., Cundy, T. W., Heinemann, R., Saurbier, J. A., et al. (1997). Assessment of the 1995 & 1996 floods and landslides on the Clearwater National Forest, Part I: Landslide Assessment. A Report to the regional Forester Northern Region U.S. Forest Service, December.
McKee, T. (2003). Rough sets bankruptcy prediction models versus auditor signalling rates. Journal of Forecasting, 22, 569–586.
Montgomery, D. R., & Dietrich, W. E. (1994). A physically based model for the topographic control on shallow landsliding. Water Resources Research, 30, 1153–1171.
Moore, I. D., Grayson, R. B., & Ladson, A. R. (1991). Digital terrain modeling: A review of hydrological, geomorphological, and biological applications. Hydrological Processes, 5, 3–30.
Moore, I. D., Gessler, P. E., & Nielsen, G. A. (1993). Soil attribute prediction using terrain analyses. Soil Science Society of America Journal, 57(2), 443–452.
Okimura, T., & Ichikawa, R. (1985). A prediction method for surface failures by movements of infiltrated water in a surface soil layer. Natural Disaster Science, 7, 41–51.
O’Loughlin, E. M. (1986). Prediction of surface saturation zones in natural catchments by topographic analysis. Water Resources Research, 22(5), 794–804.
Pal, S. K., & Mitra, P. (2002). Multispectral image segmentation using the rough-set-initialized EM Algorithm. IEEE Transactions on Geoscience and Remote Sensing, 40(11), 2495–2501.
Pawlak, Z. (1982). Rough sets. International Journal of Computer Information Sciences, 11(5), 341–356.
Pawlak, Z. (1991). Rough sets – Theoretical aspects of reasoning about data. Dordrecht, The Netherlands: Kluwer Academic Publishers.
Pawlak, Z., & Slowinski, R. (1993). Rough set approach to multi-attribute decision analysis. ICS Research Report 36, Warsaw University of Technology.
Pawlak, Z., & Slowinski, R. (1994). Rough set approach to multi-attribute decision analysis. European Journal of Operational Research, 72, 443–459.
Predki, B., Slowinski, R., Stefanowski, J., Susmaga, R., & Wilk, Sz. (1998). ROSE – Software implementation of the Rough Set Theory. In L. Polkowski & A. Skowron (Eds.), Rough sets and current trends in computing. Lecture notes in artificial intelligence (Vol. 1424, pp. 605–608). Berlin: Springer.
Predki, B., & Wilk, Sz. (1999). Rough set based data exploration using ROSE system. In Z. W. Ras & A. Skowron (Eds.), Foundations of intelligent systems. Lecture notes in artificial intelligence (Vol. 1609, pp. 172–180). Berlin: Springer.
Shafer, G. (1976). A mathematical theory of evidence. Princeton: Princeton University Press.
Sidle, R. C., Pearce, A. J., & O’Loughlin, C. L. (1985). Hillslope stability and landuse. American Geophysical Union, Water Resource Monograph No. 11, Washington, DC.
Slowinski, R., & Zopounidis, C. (1995). Application of the rough set approach to evaluation of bankruptcy risk. International Journal of Intelligent Systems in Accounting, Finance and Management, 4(1), 27–41.
Slowinski, R., Stefanowski, J., Greco, S., & Matarazzo, B. (2000). Rough sets based processing of inconsistent information in decision analysis. Control and Cybernetics, 29, 379–404.
Wilson, J. P., & Gallant, J. C. (2000). Digital terrain analysis. In J. P. Wilson & J. C. Gallant (Eds.), Terrain analysis principles and applications (pp. 1–29). New York: Wiley.
Wu, W., & Sidle, R. C. (1995). A distributed slope stability model for steep forested basins. Water Resources Research, 31, 2097–2110.
Ziarko, W., & Shan, N. (1994). An incremental learning algorithm for constructing decision rules. In W. P. Ziarko (Ed.), Rough sets, fuzzy sets and knowledge discovery (pp. 326–334). Berlin: Springer.