Spatial and temporal dimensions of land use change in cross border region of Luxembourg. Development of a hybrid approach integrating GIS, cellular automata and decision learning tree models



Applied Geography 67 (2016) 94–108

Contents lists available at ScienceDirect

Applied Geography journal homepage: www.elsevier.com/locate/apgeog

Reine Maria Basse a,*, Omar Charif b, Katalin Bodis c

a Department of Urban Development and Mobility, Luxembourg Institute of Socio-Economic Research (LISER), 11 Porte des Sciences, L-4366 Esch-sur-Alzette, Luxembourg
b Department of Geomatics Engineering, University of Calgary, 2500 University Dr NW, Calgary, AB T2N 1N4, Canada
c European Commission, Joint Research Centre, Institute for Energy and Transport, Via Enrico Fermi 2749, TP450, I-21027, Ispra, Italy

Article info

Article history: Received 1 July 2015; received in revised form 29 November 2015; accepted 3 December 2015; available online xxx

Keywords: Cellular automata; Decision learning tree; GIS; Land use; Boundaries; Discontinuity; Accessibility

Abstract

This paper presents a geographical and computational modelling approach to explore the nonlinear relationship between land use types and geospatial driving factors. It focuses on the dynamism of land use characteristics in a cross-border region. The developed model fully integrates Cellular Automata (CA), a Geographic Information System (GIS) and a Decision Learning Tree (DLT) model, the latter being used to define the CA transition rules. Existing literature considers CA to be one of the most relevant tools for modelling spatial changes over time, particularly when complex systems such as land use are involved. The literature also highlights that combining CA with other tools gives a better spatial outlook on land use dynamics. Our results reveal how land use is structured around both the transportation system and the border, and how measuring accessibility from different angles using a GIS platform permits analysis of the temporal and spatial discontinuity of land use itself, thereby identifying the discontinuity components of land use patterns determined by land use boundaries. © 2015 Elsevier Ltd. All rights reserved.

* Corresponding author. E-mail addresses: [email protected] (R.M. Basse), [email protected] (O. Charif), [email protected] (K. Bodis).
http://dx.doi.org/10.1016/j.apgeog.2015.12.001
0143-6228/© 2015 Elsevier Ltd. All rights reserved.

1. Introduction

In geography and in environmental science in general, modelling has become essential for studying the spatial and temporal dynamics of land use, as models reveal future changes, development potential, and planning issues and challenges (Geist & Lambin, 2001). Land use is a synthetic indicator that reflects not only human dispersal but also the distribution of natural elements and urban activities. Moreover, land use is a very complex structure, and this complexity results from multiple interactions between its categories and environmental constraints (Lambin, Geist, & Lepers, 2003). To explore these interactions and their implications for the structure and evolution of land use patterns, researchers have devoted much effort to modelling land use change. Methods from various research fields have been used to develop land use dynamic models, including methods from artificial intelligence, GIS, geography and statistics. In the toolset of artificial intelligence, CA is the most frequently used model for studying land use change. It has been shown that when CA are applied to a cross-border region, determining targets and analysing land use dynamics become more complex because of the particularity of the cross-border system (Basse, Omrani, Charif, Gerber, & Bodis, 2014).

To take this complexity into account, the methodological approach proposed in this paper combines different methods and tools: CA, GIS and DLT. These tools (GIS) and methods (CA and DLT) are “consistent and mature technologies” capable of modelling land use dynamics. The proposed model appears well suited to addressing the following challenging questions: How can land use dynamics in a cross-border system be formulated using spatial models? What are the main drivers of land use changes? What can we learn from the combined modelling exercises? The findings will help to show why these combined models are also an asset for (1) a better understanding of the complexity of land use dynamics in a cross-border system and (2) building a comprehensive CA-based land use model. Coupling CA, GIS, and DLT can provide the essential background information to explore the structure of land use and its


potential evolution.

“Simple rules for modelling and simulating complex phenomena” is doubtless the simplest and most “naive” definition we can give of CA while remaining faithful to its precursors: Alan Turing (1950), John Von Neumann (1951) and Stanislaw Ulam (1952). CA has sparked interest in many domains, such as computer science, physics, biology, chemistry, ecology, economics, mathematics and geography. The use of CA in geography is relatively recent and has essentially been applied in urban and land use modelling (Batty & Xie, 1994; Benenson & Torrens, 2004; Couclelis, 1985; Phipps, 1989; Tobler, 1970, 1979; White & Engelen, 1997, and others). The development of GIS contributed to popularising the use of CA in geography. Indeed, when associated with GIS, CA is one of the most relevant computational methods used to model and simulate the intrinsic complexity of, for instance, the land use system (Batty, Xie, & Sun, 1999; Clarke & Gaydos, 1998; White & Engelen, 2000).

Two types of CA can be distinguished. (A) The standard or simple CA is based on the original universe of Conway's Game of Life, which was developed to simulate the emergence and self-organisation/self-replication of living organisms (Gardner, 1970). For this type of CA, the characteristics are as follows:
- The environment is a two-dimensional orthogonal grid (lattice) of square cells.
- Cell states have two possibilities (alive or dead).
- Time is incremental (iteration).
- Simple conditional rules define the cells' state transitions.
- The neighbourhood consists of the eight adjacent cells.
- Cells are autonomous.
The standard example is Conway's Game of Life.

(B) The advanced CA is used to model and simulate more complex phenomena involving humans, other living organisms and the environment. This kind of CA is also known in geography and environmental sciences as a CA-based model.
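As an illustration of the standard CA of type (A), the following minimal sketch advances Conway's Game of Life by one iteration. The birth and survival rules used here (birth with exactly three live neighbours, survival with two or three) are the commonly stated rules of the game and are not given in this paper; the function name `life_step` and the choice of an unbounded grid stored as a set of live cells are assumptions made for brevity.

```python
from collections import Counter

def life_step(alive):
    """Advance Conway's Game of Life by one iteration.

    `alive` is a set of (row, col) tuples marking the live cells.
    """
    # Count, for every cell, how many of its eight Moore neighbours are alive.
    counts = Counter(
        (r + dr, c + dc)
        for (r, c) in alive
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth with exactly 3 live neighbours; survival with 2 or 3.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in alive)}

blinker = {(1, 0), (1, 1), (1, 2)}   # horizontal bar of three live cells
after_one = life_step(blinker)       # the bar turns vertical
after_two = life_step(after_one)     # and returns to its starting position
```

Each cell reads only its eight neighbours and its own state, which is the sense in which the rules are local and the cells autonomous.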
For the advanced type of CA, the characteristics are as follows:
- The environment/space is a two-dimensional orthogonal grid (lattice) of square cells.
- Cells are pixels representing real-world geographical units.
- States are multiple (e.g. in a CA-based land use model, the CA states represent the different land use categories).
- Time is given in discrete steps representing a real-life time unit such as years.
- Rules determine local interactions between geographical units and dictate their evolution (change, stationarity and/or growth).
- Different models have different rule sets and neighbourhoods.

The CA-based model developed in this paper consists of (1) a regular discrete lattice of cells in raster GIS format representing the study area; (2) discrete time steps controlling the evolution of cells; (3) a finite set of states (land use classes) to characterise cells; (4) a Moore neighbourhood identifying the cells that affect the evolution of the central cell; and (5) a set of transition rules determining cells' future states from the states of the cell and its neighbours. In this paper, the transition rules reflect land use dynamics over the 1990 to 2000 period. The CA model was therefore calibrated on the historical period between 1990 and 2000. The CA transition rules were implemented using the DLT. More precisely, we used the CART (Classification and Regression Tree) (Breiman, 1993) implementation of decision trees with a GINI impurity index (Raileanu & Stoffel, 2004) for selecting the best split criterion. The model was


validated by comparing unseen parts (i.e. areas that were not used for the calibration of the model) of the actual (i.e. observed) Corine Land Cover dataset of the year 2000 with the simulated ones. Using the calibrated and validated CA model, the land use map up to 2006 (e.g. the Business As Usual scenario) was generated, analysed and discussed. The land use maps for 2000 and 2006 reflect how the CA model re-interpreted and reproduced local land use change processes in 2000 and 2006.

2. The study area

Cross-border regions are complex spatial systems strongly characterised by neighbouring nations. These zones are “crossroads” between heterogeneous human societies and spaces where various socio-economic, socio-cultural and socio-demographic characteristics interact with and influence each other. The next section presents a specific cross-border region, the Luxembourg cross-border region, which has already been investigated (Charif, 2013; Decoville, Durand, Sohn, & Walther, 2013; Schiebel, Omrani, & Gerber, 2015; Sohn, Reitel, & Walther, 2009).

2.1. The morphological characteristics of the study area

The Grand Duchy of Luxembourg is a landlocked country surrounded by France, Belgium and Germany (Fig. 1). Although Luxembourg is among the smallest countries in Europe, it is one of the most dynamic and attractive regions in terms of economy and quality of life. Indeed, Luxembourg, which has emerged in Europe as an attractive metropolitan area, is experiencing an increase in residential and daily mobility (Gerber, 2012). Today, the number of commuters from neighbouring countries shows that Luxembourg is a significant economic force in the region. It plays a leading role in European politics: it is a founding member of the European Union, and its capital hosts the headquarters of several EU institutions.
Researchers from different fields will certainly agree that carrying out cross-border land use change research (in particular on a national scale) is a challenging and complex exercise. On the one hand, this is because of the particularity of cross-border areas, which are generally governed by several interactions and relationships between fundamental drivers such as spatial, policy, social and economic characteristics specific to each country in the study area (Brunet & Jailly, 2005; Paasi, 2005). On the other hand, cross-border research is complex because it is difficult to collect relevant data, set up harmonised land use datasets, and identify land use drivers that are common to all studied countries.

Based on the digital elevation model, we can describe the physical morphology of the study area. The west–east profiles in Fig. 2 represent the fragmentation of the terrain and show the relief of the study area. The municipalities located in the northern section of the study area are physically more constrained than the municipalities in the southern section. In the real world, especially with regard to potentially suitable areas for construction, terrain conditions (e.g., slope gradient in Fig. 3) rarely hamper construction plans; only prices can rise indefinitely. This factor could thus be a new aspect to investigate in future studies. Groundwater also deserves further analysis. Indeed, groundwater levels and conditions are crucial for agricultural and industrial activities, as well as for building residential areas or industrial units.

2.2. Measuring the spatial and temporal analysis of urban and industrial land use within variable accessibilities

This section presents different types of accessibility maps and their calculation. The objective here is to highlight the existing correlation between land use distribution and location and the


Fig. 1. Geographical location and administrative division of the study area. The shaded relief background derived from a digital elevation model indicates the terrain conditions.

structure of accessibilities by using different variables (e.g. distance to the state border in kilometres, time in minutes needed to access specific networks).

Fig. 4 shows the travel time to reach urban and industrial zones from the highway. The closest urban areas or industrial units (the cells representing them) are those that can be reached within the shortest time from the transport network. However, cell density (the number of adjacent cells) varies depending on the level of accessibility. For example, the urban land use class is highly present within two minutes from highway exits. This phenomenon is illustrated in Fig. 6c, where a peak of more than 10,000 cells is reached. It is also clear that the further away from the highway, the higher the intensity of urban cell expansion. As regards the industrial class, the peak of 3500 cells is reached within two minutes from the highway (Fig. 6a). This can be explained by the supposed priority of industrial and business activities, which aim to attract the maximum number of clients and to minimise transport costs. Indeed, at approximately 30 min from the highway, the industrial class vanishes from the urban landscape (Fig. 4). By contrast, the urban class, which exhibits a more complex behaviour, continues to be located in areas further than 30 min from the highway. This is because people may choose not to live next to busy roads because of air and noise pollution, and therefore seek locations at a comfortable distance that provide quiet and healthy living conditions and good connectivity. This push/pull optimisation effect partly explains the observed scattering and the difficulty of calibrating these types of land occupation in cellular automata-based models (White & Engelen, 2000).

Figs. 5 and 6 show the discontinuity in land use distribution. These discontinuities are more consistent in Fig. 5, which shows the physical distance from the state border as a measure of the accessibility of urban and industrial zones. Indeed, as Fig. 6b and d show, land use movements are jerky. From 2 km from the border, urban and industrial cells appear to compete to be located within the same areas; they are therefore often adjacent and interlocked (Fig. 5). This spatial competition for the best location explains why cells belonging to the urban and industrial classes show similar jerky movements up to 15 km (see Fig. 6b and d).
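Distributions of the kind plotted in Fig. 6 can be obtained by counting cells of each class per accessibility band. The sketch below is only illustrative: the function name `cells_per_band`, the 2 km band width and the sample cells are hypothetical and do not come from the datasets used in this paper.

```python
from collections import Counter

def cells_per_band(cells, band_width):
    """Count cells of each land use class per accessibility band.

    `cells` is an iterable of (land_use_class, accessibility_value) pairs.
    Band k covers values in [k * band_width, (k + 1) * band_width).
    """
    counts = Counter()
    for land_use, value in cells:
        band = int(value // band_width)
        counts[(land_use, band)] += 1
    return counts

# Hypothetical cells: (class, distance to the state border in km).
sample = [("urban", 0.4), ("urban", 1.7), ("industrial", 1.2),
          ("urban", 2.5), ("industrial", 2.9), ("urban", 7.1)]
profile = cells_per_band(sample, band_width=2.0)
```

An empty band between two populated bands, such as band 2 for the urban class in this toy sample, is the kind of gap that the text describes as a spatial discontinuity.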


Fig. 2. West–East relief profiles of the study area.

Undertaking comparative analysis with variable accessibilities in relation to land use type, location and distribution allows a better understanding of the behaviour of land use patterns in the analysed cross-border system. The identified spatial discontinuities lead to the emergence of what we refer to here as land use boundaries or cellular barriers. In relation to the real world, these boundaries can be caused by the presence of a transport network (e.g., highways, highway access points, railway stations), the presence of a hydraulic network, geographic exposure (south-east, south-west, etc.), a steep slope gradient, other anthropogenic activities (such as the establishment of recreational green spaces around an urban area) or simply the vicinity of an international border (see Fig. 6b and d).

2.3. Analysing the land use dynamics of the study area between 1990 and 2006

In addition to transport networks, we selected five land use classes (urban areas, industrial units, agricultural areas, forest and water bodies) based on the Corine Land Cover datasets for 1990, 2000 and 2006 (EEA, 2000). Fig. 7 shows how the land use system is organised around transport networks. Table 1 summarises the evolution of land use over different periods: between 1990 and 2000, between 2000 and 2006 and between 1990 and 2006. Land use evolution is reported for these separate periods in order to show the different degrees of land use dynamics over 10 years, 6 years and 16 years. Table 1 also reveals the growth tendency of urban areas, with a growth of 3.59% between 1990 and 2000 and 3.10% between 2000 and 2006. A growth of 6.80% is registered for the full period from 1990 to 2006. The largest growth is recorded in industrial units, with 12.30% between 1990 and 2000, 7.09% between 2000 and 2006, and 20.26% for the entire period of 16 years. At first sight, we are confronted with a

territory that is primarily dominated by two classes: the agricultural class and the forest class, which represent respectively 52.74% and 39.44% of the total area of interest in 2006 (Table 1 and Fig. 7). If we look at their evolution, we notice a loss of momentum, particularly for the agricultural class, with −0.68% between 1990 and 2000, −0.62% between 2000 and 2006 and −1.29% between 1990 and 2006. The decrease of the agricultural class undoubtedly benefits the artificial classes, that is, urban areas and industrial units. In contrast to the agricultural category, forest areas appear less affected by the growth of artificial classes: this class decreased slightly between 1990 and 2000 (−0.01%) but increased between 2000 and 2006 (+0.17%), so that over the 16 years studied the forest area grew by +0.16%. The increase of the water bodies class between 1990 and 2000 can be explained by the construction of an artificial lake between the two dates or by errors in land cover image classification. Table 1 shows that this study deals with a highly stable area.

3. Model development

3.1. The conceptual modelling framework

The conceptual modelling framework (Fig. 8) describes how the model was built, in particular the process, its components, and the interaction between building and thinking. It details the developed methodological approach integrating CA, GIS, and DLT. Basically, Fig. 8 answers the questions of how we built the model and what the main ingredients used to develop it were. The steps of data collection, data harmonisation, pre-processing, and preparation of the model input dataset were all completed within a GIS environment (ArcGIS), whereas model calibration, validation and performance assessment were developed using


Fig. 3. Terrain conditions represented by slope gradient of the study area.

MATLAB. The conceptual modelling framework also indicates the driving factors, e.g., distance to the Luxembourg borderline and travel time to main cities such as Luxembourg city, Esch-sur-Alzette and Differdange.

3.2. Model rule-based DLT: how did we formalise the decision learning tool?

3.2.1. What is DLT in concrete terms?

Suppose we want to predict the output $Y = \{Y_1, \ldots, Y_m\}$ (dependent variable) using explanatory variables $\{X_1, X_2, \ldots, X_n\}$. A classification tree (Breiman, 1993) is a simple data mining method used to define the relationship between $Y$ and the explanatory variables $X_i$, $i = 1, \ldots, n$. This method is based on the simple idea of asking a series of well-crafted questions about the explanatory variables $X_i$, $i = 1, \ldots, n$, to finally predict the value of $Y$. Classification trees are particularly useful for data with many features where the interaction between the dependent and the explanatory variables is complicated and non-linear. They are usually used when linear or

polynomial regressions are inapplicable. Using the values of the explanatory variables, this method consists of sub-dividing the decision space into areas $A_1, \ldots, A_n$ in which prediction is straightforward. Each area is associated with one of the prediction values. A decision tree is a hierarchical structure composed of three types of nodes:
- Root node: represents the highest level of the hierarchy. It receives the data to predict and asks the first orientation question, which defines the path of the data through the decision tree.
- Internal node: receives the data with further orientation questions and thus reduces its decision/search space.
- Leaf node: is the lowest level of the decision tree structure, at which the prediction is made.

3.2.2. Example of a decision tree

The following section shows an example of a decision tree for classifying non-overlapping sets of coordinates drawn from three different 2D Gaussian functions (350 coordinate pairs). The full dataset


Fig. 4. Accessibility expressed by travel time in minutes between highway exits and industrial zones or urban districts.

was split into two parts: 245 records for training and 105 for testing. A sample of this data is presented in Table 2. Fig. 9 shows the decision tree that resulted from processing this data and how the decision areas $A_1, \ldots, A_6$ were constructed using the root and internal node questions. In this paper, we used the CART implementation of decision trees with a GINI impurity index to select the best split criterion. The CART implementation uses a series of binary splits, each over one variable, to define the decision tree. It evaluates all places where splits are possible on all variables and selects the one that minimises the GINI index given by the following equation:

$$\mathrm{GINI}(n_t) = 1 - \sum_{i=1,\ldots,m} \left[ p(i \mid n_t) \right]^2 \qquad (1)$$

where $p(i \mid n_t)$ is the fraction of records belonging to class $i$ at node $n_t$. The GINI indexes of the children are then combined in a weighted sum to obtain the index of the split at the parent node (root and internal nodes) using the following equation:

$$\mathrm{GINI}(n_t) = \sum_{i=1}^{2} \frac{n_i}{n} \, \mathrm{GINI}(i) \qquad (2)$$

where $n_i$ and $n$ are the total number of records belonging to the $i$th child and to the parent node $n_t$, respectively.

3.2.3. Why combine CA and decision learning tree models in land use change modelling?

CA is often combined with other methods capable of predicting land use evolution. Two types of predictive models have been used: (1) data models such as logistic regression and (2) machine learning based models such as DLT (Pal & Mather, 2003; Ballestores & Qiu, 2012). Indeed, DLTs are among the first machine learning methods that became known for their ability (a) to classify spatial data; (b) to extract patterns from and mine data; (c) to predict; and consequently (d) to support reasonable and comprehensive decisions (Breiman, Friedman, Olshen, & Stone, 1984; Goodman & Smyth,


Fig. 5. Overview of urban and industrial locations and their geometric distances from the country border line.

1988; Moore et al., 1991; Wu, Silvan-Cardenas, & Wang, 2007; Speybroeck et al., 2004; Li & Claramunt, 2006). DLT, like other machine learning methods (e.g. artificial neural networks, support vector machines), builds a non-linear relationship between land use categories and the land use change driving factors identified in the land use modelling exercise (Razi & Athappilly, 2005). Compared with other machine learning methods, the decision tree has one major advantage: it is not a “black-box system”. On the contrary, it can be described as a “white-box system” or “transparent-box system”, in the sense that it allows comprehensive modelling of the evolution processes of the studied system. Indeed, the transition processes generated by the DLT during the modelling phase can be easily read and interpreted by modellers and decision makers (Breiman et al., 1984; Li & Yeh, 2004). In land use change prediction, the possibility of interpreting the processes that influence land use change is an important aspect and remains one of the main challenges (Briassoulis, 1999; Lambin, Rounsevell, & Geist, 2000; Guisan & Zimmermann, 2000; Houet, Verburg, & Loveland, 2010; Verburg, van Berkel, van Doorn, van Eupen, & van den Heiligenberg, 2010).
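To make the “white-box” property concrete, the following minimal sketch computes the GINI impurity of Eq. (1) and the weighted child impurity of Eq. (2) from Section 3.2.2, and searches a single variable for the threshold that minimises it. It is an illustration under stated assumptions, not the MATLAB implementation used for this paper: the function names, the use of observed values (rather than midpoints) as candidate thresholds, and the six sample cells are all hypothetical.

```python
from collections import Counter

def gini(labels):
    """Eq. (1): 1 minus the sum of squared class fractions in a node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_gini(left, right):
    """Eq. (2): child impurities weighted by their share of the records."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def best_split(values, labels):
    """Return (threshold, impurity) of the best binary split on one variable."""
    best = (None, float("inf"))
    # Candidate thresholds: every observed value except the largest.
    for threshold in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        score = split_gini(left, right)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical cells: distance to the border (km) and observed next state.
distance = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
state = ["urban", "urban", "urban", "forest", "forest", "forest"]
threshold, impurity = best_split(distance, state)
```

The selected threshold reads directly as a rule of the form "if distance is at most the threshold, predict one class, otherwise the other", which is what allows a modeller to inspect the transition logic.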

In the developed model, the decision tree is used to define the CA transition rules, so it is the kernel of the model. The DLT model deduces the transition rules directly from data (the characteristics of cells). It extracts patterns of land use and defines a set of “if else” conditions over the driving factors that lead to predicting land use dynamics. Deducing the transition rules directly from data instead of manually calibrating the model has many advantages, including the following:
1) It simplifies the development of the model and reduces the time and effort it requires.
2) It calibrates the model automatically and thus allows the modeller to focus on identifying adequate driving factors and on conceptualising the land use change model.
3) It extracts the CA transition rules through a learning process that trains on one dataset and evaluates on unseen data. Thus, when these two datasets are carefully sampled, the model is capable of generalising and predicting land use changes well. In other words, compared with manual calibration, DLT may be


Fig. 6. (a, b, c, d). Spatial discontinuities of land use distribution in 2000.

Fig. 7. Land use maps (left 1990; right 2006).


Table 1 Land use evolution between 1990, 2000 and 2006. Cells: total cells [hectare]. Share: % of the land use of the total area of interest. Evolution: change in %.

Land use category (LUC)   Cells 1990   Share 1990   Cells 2000   Share 2000   Cells 2006   Share 2006   Evolution 1990–2000   Evolution 2000–2006   Evolution 1990–2006
Urban areas                   70,152         5.3        72,672         5.5        74,924         5.67                 +3.59                 +3.10                 +6.80
Industrial units              16,081         1.22       18,059         1.37       19,339         1.46                +12.30                 +7.09                +20.26
Agricultural areas           706,640        53.43      701,867        53.07      697,510        52.74                 −0.68                 −0.62                 −1.29
Forest                       520,731        39.38      520,677        39.37      521,543        39.44                 −0.01                 +0.17                 +0.16
Water bodies                   8,838         0.67        9,167         0.69        9,126         0.69                 +3.72                 −0.45                 +3.26
Total cells                1,322,442       100       1,322,442       100       1,322,442       100                     0.00                  0.00                  0.00
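The evolution columns of Table 1 follow from the cell counts as relative changes. A minimal sketch of that arithmetic, with the counts copied from Table 1 and a hypothetical helper name `percent_change`:

```python
def percent_change(before, after):
    """Relative change between two cell counts, in percent."""
    return (after - before) / before * 100.0

# Cell counts copied from Table 1 (1990, 2000, 2006).
counts = {
    "Urban areas": (70152, 72672, 74924),
    "Industrial units": (16081, 18059, 19339),
    "Agricultural areas": (706640, 701867, 697510),
    "Forest": (520731, 520677, 521543),
    "Water bodies": (8838, 9167, 9126),
}

# For each class: change over 1990-2000, 2000-2006 and 1990-2006.
evolution = {
    name: (round(percent_change(c1990, c2000), 2),
           round(percent_change(c2000, c2006), 2),
           round(percent_change(c1990, c2006), 2))
    for name, (c1990, c2000, c2006) in counts.items()
}
```

Because the total number of cells is constant, any gain in one class is matched by a loss elsewhere, so the signs of these changes show which classes give way to the artificial ones.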

Fig. 8. The conceptual modelling framework.

Table 2 Sample of datasets.

X           Y           Gaussian
10.78922    6.067396    1
8.628982    3.657541    2
6.238848    1.774954    2
0.467834    7.455956    3
8.472096    7.425656    1
6.649388    3.249413    2
3.829801    5.957257    3
7.088098    1.424874    2
9.424672    7.728822    1
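A dataset with the same shape as the one sampled in Table 2 (350 labelled coordinate pairs from three 2D Gaussians, split into 245 training and 105 testing records) could be generated as sketched below. The centres, the spread, the number of points per class and the random seed are hypothetical, since the paper does not report them.

```python
import random

def make_dataset(centres, n_per_class, spread, seed=0):
    """Draw labelled 2D points from isotropic Gaussians, one per class."""
    rng = random.Random(seed)
    records = []
    for label, (cx, cy) in enumerate(centres, start=1):
        for _ in range(n_per_class[label - 1]):
            records.append((rng.gauss(cx, spread), rng.gauss(cy, spread), label))
    rng.shuffle(records)   # mix the classes before splitting
    return records

# Hypothetical centres and spread; the paper does not report them.
records = make_dataset(centres=[(9.0, 7.0), (7.5, 2.5), (2.5, 6.5)],
                       n_per_class=[117, 117, 116], spread=1.0)
train, test = records[:245], records[245:]   # 245 training, 105 testing
```

Shuffling before the split keeps all three classes represented in both parts.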


presented as a very capable solution to the problem of overfitting.

Fig. 9. The decision learning tree structure.

3.3. Description of input data

Spatial information related to land use was based on the European Corine Land Cover/Land Use dataset for the years 1990, 2000 and 2006, version 16, released in April 2012. Corine data classifies land cover types into 44 classes using the methodology described in the “Corine land cover technical guide – Addendum 2000” (EEA, 2000). Surface/terrain characteristics (e.g., elevation, gradient, profiles) were derived from the digital elevation model (DEM) obtained by the Shuttle Radar Topography Mission (SRTM). The raw elevation data were further processed by Vogt et al. (2007). The spatial resolution of the cell-based input data was 100 m. The vector-based input data (e.g., administrative areas and borders, transport network and facilities) were available from different sources: LISER (Luxembourg Institute of Socio-Economic Research) and OSM (Open Street Maps). Because the data sources used different original reference systems, a harmonised reference system was chosen for the analysis (Lambert Azimuthal Equal Area projection with the ETRS 1989 datum). This system is also in line with European standards (Annoni, Luzet, Gubler, & Ihde, 2001). In addition to the previously described settings of CA and DLT, the model inputs are as follows (Table 3):


(1) Land use in 1990 and 2006 with five classes (urban, industrial, agriculture, forest and water) (Corine Land Cover, EEA, 2000).
(2) The neighbourhood used is an adapted version of the Moore neighbourhood, characterised by a 10*10 window where cells are symmetrically arranged (composed of 99 cells which surround the studied central cell in the two-dimensional square lattice).
(3) The transport networks and their elements (e.g., bus stations, railway stations, highway exits, secondary roads, derived distance from objects).
(4) State borders of Luxembourg and the calculated geometric distance on both sides of the border line.
(5) The Euclidean distance to the centre of three main cities (Luxembourg, Esch-sur-Alzette and Differdange).
(6) The gradient map in percentage derived from the DEM [SRTM-Digital Elevation Model] (Farr et al., 2007).
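The neighbourhood inputs of item (2) amount to counting, for each cell, how many surrounding cells belong to each land use class. The sketch below shows one way to do this on a small raster stored as a list of lists. The function name, the `radius` parameter and the clipping of the window at the grid edge are assumptions for illustration; the exact window geometry of the adapted Moore neighbourhood used in the paper is not reproduced here.

```python
from collections import Counter

def neighbour_counts(grid, row, col, radius):
    """Count land use classes in the square window around (row, col).

    The window spans `radius` cells in every direction, is clipped at the
    grid edge, and excludes the central cell itself.
    """
    counts = Counter()
    for r in range(max(0, row - radius), min(len(grid), row + radius + 1)):
        for c in range(max(0, col - radius), min(len(grid[0]), col + radius + 1)):
            if (r, c) != (row, col):
                counts[grid[r][c]] += 1
    return counts

# Toy raster using the class codes of Table 3 (1 urban, 3 agriculture, 4 forest).
grid = [
    [3, 3, 4, 4],
    [3, 1, 1, 4],
    [3, 1, 3, 4],
    [3, 3, 3, 4],
]
features = neighbour_counts(grid, row=1, col=1, radius=1)
```

The resulting counts per class are the kind of explanatory variables that a decision tree can split on when deriving transition rules.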

3.4. Model calibration and validation

This section presents the calibration and validation process and addresses the extent to which the model was able to replicate real, observed changes in the land use of the study area. To calibrate the model, we divided the database into a learning part and a testing part at two levels. The database was first divided at the level of cells which changed their state between 1990 and 2000 (= 7802), of which 70% (= 5462) were used for learning and 30% (= 2340) for testing/validation. Second, it was divided at the level of cells which remained unchanged between 1990 and 2000 (= 454,015), of which 30% (= 136,205) were used for learning and the remaining 70% for testing/validation. This approach is important because it made it possible to identify land use change patterns even when the “system” appeared to be “stable” in time, and it allowed the model to deal with the issue of unbalanced data (i.e. the number of observed unchanged cells is much larger than the number of changed ones) (Charif, 2013; Charif & Basse, 2015). Table 4 shows the model performance at two levels; first, when predicting the

cells which change their state (changed-set-results), with a prediction score of 82.44%. It is important to concentrate first on these types of cells in order to predict realistically the number of cells which actually change their state within the entire system and to be able to detect and isolate the land use type that is most conducive/sensitive to change (Pontius, Huffaker, & Denman, 2004a; 2004b; White, 2006). By focussing on these changes, it becomes possible to quantify the change potential in the overall land use system. Model validation then proceeds at the level of cells which do not change their state, the “unchanged-set-results”. These cells were successfully predicted at a rate of 99.35%. The overall success rate considering both changed and unchanged cells is therefore 99.23%. The high success rate shows that the model was capable of learning the land use change patterns despite the complexity of the cross-border spatial system.

Table 5 reinforces our previous remarks. In this table, the diagonal elements of the matrix represent the number of correctly predicted pixels for each of the four land use classes, and the off-diagonal elements represent the number of wrongly predicted cells, i.e. prediction errors. Based on the confusion matrix (Table 5), the model can be considered well constructed and able to use diverse empirical datasets, and it is now ready to predict land use change until 2006. To evaluate model performance in 2006, we predicted the land use map of 2006. The confusion matrix shows that the model managed to predict the land use of 2006 accurately (Table 6). Regarding the overall model performance (Tables 4, 5 and 6), we believe that the decision learning tree algorithm enhances the performance of the CA model when predicting the land use maps of 2000 and 2006 by (a) highlighting the influence of the neighbourhood on land use class interactions, (b) revealing the spatial distribution of

Table 4
Three steps for the validation of results.

                         In percentage from the full dataset (%)
Changed-Set-Results      82.44
Unchanged-Set-Results    99.35
Full-Set-Results         99.23
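The split sizes given in Section 3.4 and the scores in Table 4 can be cross-checked with a short calculation. The sketch below is only an illustration: it assumes that the full-set score is the average of the changed-set and unchanged-set scores weighted by the number of test cells in each set, which the text does not state explicitly. All variable names are hypothetical.

```python
# Cell counts reported in Section 3.4 (changed / unchanged between 1990 and 2000)
changed_total, changed_learning = 7802, 5462
unchanged_total, unchanged_learning = 454_015, 136_205

# Cells left for testing/validation in each set
changed_test = changed_total - changed_learning
unchanged_test = unchanged_total - unchanged_learning

# Scores reported in Table 4, as fractions
score_changed, score_unchanged = 0.8244, 0.9935

# Assumption: the full-set score weights each set by its number of test cells
full_set_score = (
    changed_test * score_changed + unchanged_test * score_unchanged
) / (changed_test + unchanged_test)

print(changed_test, unchanged_test)
print(round(100 * full_set_score, 2))
```

Under this assumption the weighted score rounds to the full-set value reported in Table 4, and the large share of unchanged test cells explains why the full-set score sits so close to the unchanged-set score.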

Table 3
Model inputs.

Variables                      Description of the variables
Land use states                1 Urban, 2 Industry, 3 Agriculture, 4 Forest, 5 Water, 6 Transport
Adapted Moore-neighbourhood    - Urban-neighbours in the 10 × 10 Moore neighbourhood
                               - Industrial-neighbours in the 10 × 10 Moore neighbourhood
                               - Agriculture-neighbours in the 10 × 10 Moore neighbourhood
                               - Forest-neighbours in the 10 × 10 Moore neighbourhood
                               - Water-neighbours in the 10 × 10 Moore neighbourhood
Border/Frontier                Distance to border/frontier in kilometres
Slope gradient                 Slope value of cell (%)
Distance / Travel time         Distance to main cities
                               - Distance to Luxembourg city in min
                               - Distance to Differdange city in min
                               - Distance to Esch-sur-Alzette city in min
Transport-neighbours           Amount of transport cells in the neighbourhood of the studied cell
                               - Distance to the closest bus station (metres)
                               - Distance to the closest train station (metres)
                               - Distance from cell to the nearest highway access point (km)
                               - Number of bus stations located 2 km away from cell
                               - Number of train stations located 2 km away from cell

R.M. Basse et al. / Applied Geography 67 (2016) 94–108

Table 5
The confusion matrix table for the year 2000.

Observed situation 2000    Simulated situation 2000
                           Urban      Industry   Agriculture   Forest
Urban                      69,142     21         141           10
Industry                   27         15,348     298           103
Agriculture                5507       1629       684,229       113
Forest                     68         823        904           513,226
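As a worked illustration of how a single success rate is read off a confusion matrix such as Table 5, the sketch below divides the sum of the diagonal (correctly predicted cells) by the total number of cells. The function name is hypothetical, and the assignment of rows and columns to observed and simulated classes is an assumption; the overall rate does not depend on that choice because transposing the matrix leaves the diagonal and the total unchanged.

```python
def overall_accuracy(matrix):
    """Share of cells on the diagonal of a square confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Values from Table 5 (classes: Urban, Industry, Agriculture, Forest).
# Row/column orientation is assumed; it does not affect the result.
table_5 = [
    [69142, 21, 141, 10],
    [27, 15348, 298, 103],
    [5507, 1629, 684229, 113],
    [68, 823, 904, 513226],
]

misclassified = sum(sum(row) for row in table_5) - sum(
    table_5[i][i] for i in range(4)
)
print(misclassified)
print(round(100 * overall_accuracy(table_5), 2))
```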

the land use pattern, (c) showing key drivers that influence land use transition within the hierarchical structure of the DLT. With the use of DLT as the rule-based model, knowledge about transition from one land use class to another is formalised to a certain degree.

3.5. Neighbourhood sensitivity: a different way to explore model performances

In order to model the sensitivity of the CA to the Moore neighbourhood, we trained the decision tree with a neighbourhood radius varying from 1 to 10. Its performance in predicting land use change on the test dataset was then recorded. To avoid randomness bias, we repeated this procedure 100 times for each neighbourhood setting. Fig. 10 shows that performance increased and reached its maximum for a neighbourhood radius equal to 10. Neighbourhood sensitivity was tested in a highly restrictive area which corresponded to the Luxembourg national territory. Indeed, reducing the size of the study area made it possible to better measure the probability of cell changes within the varying neighbourhood. Doing this in a restrictive area (smaller than the initial study area) had the major advantage of enabling us to disregard the “bordure” (edge) of the initial study area. The “bordure” can also be addressed by increasing the size of the study area. We chose to reduce it in order to control the quality of the harmonised data and to maintain an accurate match between model calibration and model validation.

4. Discussion

The exercise presented involved modelling the land use of a cross-border region by taking into consideration variables such as the transport network, state borders, physical determinants (slope gradient in percentage) and neighbourhood effects. Results show that the complexity of land use within a border context cannot be captured using simple statistical tools alone; the applied tools must be supplemented with GIS and other more comprehensive methods such as the model presented in this paper.
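The neighbourhood effects mentioned above enter the model as counts of land use states around each cell (Table 3), and Section 3.5 varies the radius of that neighbourhood. A minimal sketch of such a count is given below. It is not the authors' implementation: the function name, the toy grid and the convention that a radius r means a (2r + 1) × (2r + 1) window around the cell are assumptions, and the state codes follow Table 3. Windows are simply truncated at the grid edge, which is one simple way to see why cells near the “bordure” have fewer neighbours to learn from.

```python
def moore_counts(grid, row, col, radius):
    """Count land use states in the Moore neighbourhood of one cell.

    The window is (2 * radius + 1) cells wide, excludes the cell itself
    and is truncated where it would extend beyond the grid.
    """
    counts = {}
    n_rows, n_cols = len(grid), len(grid[0])
    for r in range(max(0, row - radius), min(n_rows, row + radius + 1)):
        for c in range(max(0, col - radius), min(n_cols, col + radius + 1)):
            if (r, c) == (row, col):
                continue  # the central cell is not its own neighbour
            state = grid[r][c]
            counts[state] = counts.get(state, 0) + 1
    return counts


# Toy grid, codes as in Table 3: 1 Urban, 2 Industry, 3 Agriculture,
# 4 Forest, 5 Water, 6 Transport
toy_grid = [
    [1, 1, 3, 3, 4],
    [1, 2, 3, 3, 4],
    [3, 3, 3, 4, 4],
    [3, 3, 6, 4, 4],
    [5, 3, 6, 4, 4],
]

for radius in (1, 2):
    print(radius, moore_counts(toy_grid, 2, 2, radius))

# A corner cell only has three neighbours at radius 1
print(moore_counts(toy_grid, 0, 0, 1))
```

In a sensitivity test of the kind described in Section 3.5, counts like these would be recomputed for each radius and passed to the decision tree as input variables.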
Integrating DLT and CA made it possible to include all structural variables of cross-border land use systems and to provide a realistic, robust and complex land use model. Indeed, analysis of the model's results together with the variable accessibility maps suggests that land use behaves in multiple ways in relation to transport facilities, particularly for residential zones (urban class), which are either attracted or repelled by transport infrastructure. Spatial competition between urban and industrial features occupying the same location was

Fig. 10. Model performance: sensitivity to neighbourhood radius in cell.

obvious. The observed rivalry explains why cells belonging to the urban and industrial classes are interlocked and naturally adjacent in some regions. In cases where there was repulsion, the urban class distanced itself from the transport infrastructure, leading to sprawl and dispersion at urban cell level (urban sprawl phenomenon). In contrast, the industrial land use categories often sought to be located in immediate proximity to transport facilities in order to obtain the best economic performance (e.g., price of properties, rental fees, transport costs, attracting clients, vicinity of human resources). Focussing on the state borders, we observed a physical barrier effect. This is logical if we consider the variables used to construct the model. We were therefore able to highlight the fact that the majority of urban and industrial zones were located close to the border. Moreover, the further one moved away from the border, the less concentrated the urban and industrial cells were. The only exception was the “sillon lorrain” (extreme south of the study area), which forms a dense urban belt with significant industrial zones. The cells which make up the industrial category were primarily concentrated within and just outside Luxembourg's southern borders. This phenomenon can be attributed to the economic history of the region, which has specialised in heavy industry (specifically steel) over the decades. Indeed, cities such as Florange and

Table 6
The confusion matrix table for 2006.

Observed situation 2006    Simulated situation 2006
                           Urban      Industrial   Agriculture   Forest
Urban                      68,783     159          2397          204
Industrial                 291        15,180       865           546
Agriculture                2701       1211         681,384       2025
Forest                     111        674          3128          512,045
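Beyond a single success rate, a confusion matrix such as Table 6 can be read class by class. The sketch below is an illustration only. It assumes that rows hold the observed classes and columns the simulated classes, and it uses the terms "producer's accuracy" (diagonal over row total) and "user's accuracy" (diagonal over column total) from the remote sensing literature rather than from the paper. All names are hypothetical.

```python
classes = ["Urban", "Industrial", "Agriculture", "Forest"]

# Values from Table 6; rows = observed 2006, columns = simulated 2006 (assumed)
table_6 = [
    [68783, 159, 2397, 204],
    [291, 15180, 865, 546],
    [2701, 1211, 681384, 2025],
    [111, 674, 3128, 512045],
]

row_totals = [sum(row) for row in table_6]
col_totals = [sum(row[j] for row in table_6) for j in range(len(classes))]

# Share of observed cells of a class that the model reproduced
producers = {
    name: table_6[i][i] / row_totals[i] for i, name in enumerate(classes)
}
# Share of simulated cells of a class that match the observed map
users = {
    name: table_6[i][i] / col_totals[i] for i, name in enumerate(classes)
}

for name in classes:
    print(name, round(producers[name], 3), round(users[name], 3))
```

Read this way, the small industrial class shows the lowest per-class agreement of the four, even though the overall rate stays high because agriculture and forest dominate the cell count.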


Fig. 11. Top left (observed/mapped land use based on CLC2000) top right (predicted/modelled land use for the year 2000). Bottom left (observed/mapped land use based on the CLC2006) bottom right (predicted/modelled land use 2006).

Differdange still bear traces of this industrial history. However, we must remain prudent at this level of the study. Indeed, only socio-economic and demographic data can support the existence of a border effect by showing, for example, how the concentration of industrial activities in Luxembourg's southern border area influenced employment organisation in the southern part of the study area. To a certain extent, this influence led to the specialisation of the Lorrain corridor's workforce. Finally, the land use maps (simulated situation in 2000 and simulated situation in 2006 (Fig. 11)) and the difference maps between observed and simulated land use for 2000 and 2006 (Fig. 12)

indicate that there is no significant land use change in the study area. The unchanged-set-results (99.35%) and the full-set-results (99.23%) obtained during the validation process also confirm this observation. The global modelling results reveal that the area is still dynamic but that changes are slow and do not affect the structure of the land use system. Regarding model performance, we have no doubt that the presented methodology can be replicated and that the model can be adapted to other areas of interest in an efficient and successful manner.


Fig. 12. Areas where differences occur between observed and predicted features for the years 2000 and 2006.

5. Conclusion

The results of the study demonstrate that land use dynamics in a cross-border region can be modelled adequately by combining GIS, CA and DLT methods, particularly when spatial and explicit variables are mobilised. Indeed, these variables play a crucial role in land use development and spatial interconnections. The results also indicate that the model is able to predict correctly the changes in land use characteristics, as shown by the changed-set-results of the validation process. These results strengthen the hypothesis that a CART implementation of a decision tree with the Gini index is able to test all probabilities in cell state transition. It is also capable of learning the transition between land use categories during the simulation period by proposing the “ideal” choice for selecting the best split criterion in a satisfactory way. The focus of this paper on neighbourhood sensitivity is also clearly beneficial to the model validation process and contributes to good model performance. The global modelling results show that the area analysed is still dynamic but the changes have slowed

down in the last decade. Future research will use the same method but will go further by integrating the different socio-economic conditions and planning issues across the cross-border region of Luxembourg.

Acknowledgements

This paper would like to acknowledge the SMART-BOUNDARY project (Automates cellulaires pour la simulation de la croissance urbaine et des mobilités transfrontalières), co-funded by the CNRS (Centre National de la Recherche Scientifique), France and the FNR (Fonds National de la Recherche), Luxembourg. This material is also based upon work partially supported by an AFR (Aide à la Formation Recherche) Grant No. PHD-09-077, funded by the FNR. The paper also acknowledges the following contribution: «Modelling land use dynamics in Luxembourg cross border region: The use of cellular automata and decision tree learning model», presented by Omar Charif and Reine Maria Basse at the GeoComputation 2015 conference, May 20-23, 2015, Dallas, USA.


References

Annoni, A., Luzet, C., Gubler, E., & Ihde, J. (Eds.). (2001). Map projections for Europe, European Commission, Directorate-General Joint Research Centre (p. 131). Ispra, Italy: Institute for Environment and Sustainability. EUR 20120 EN.
Ballestores, F., Jr., & Qiu, Z. (2012). An integrated parcel-based land use change model using cellular automata and decision tree. Proceedings of the International Academy of Ecology and Environmental Sciences, 2(2), 53–69.
Basse, R. M., Omrani, H., Charif, O., Gerber, P., & Bodis, K. (2014). Land use changes modelling using advanced methods: cellular automata and artificial neural networks. The spatial and explicit representation of land cover dynamics at the cross-border region scale. Applied Geography, 53, 160–171.
Batty, M., & Xie, Y. (1994). From cells to cities. Environment and Planning B: Planning and Design, 21(7), 31–48.
Batty, M., Xie, Y., & Sun, Z. L. (1999). Modeling urban dynamics through GIS-based cellular automata. Computers, Environment and Urban Systems, 23(3), 205–233.
Benenson, I., & Torrens, P. M. (2004). Geosimulation: object-based modeling of urban phenomena. Computers, Environment and Urban Systems, 28(1–2), 1–8.
Breiman, L. (1993). Classification and regression trees. CRC Press.
Breiman, L., Friedman, J. H., Olshen, R., & Stone, C. J. (1984). Classification and regression trees. Wadsworth statistics/probability series. Belmont, CA, USA: Wadsworth Advanced Books and Software.
Briassoulis, H. (1999). Analysis of land use change: Theoretical and modeling approaches. Morgantown, West Virginia, USA: Regional Research Institute, West Virginia University.
Brunet-Jailly, E. (2005). Theorizing borders: an interdisciplinary perspective. Geopolitics, 10, 633–649.
Charif, O. (2013). Modelling and simulating individual's mobility: Case study of Luxembourg and its greater region. PhD diss. Université de Technologie de Compiègne.
Charif, O., & Basse, R. M. (2015). Modelling land use dynamics in Luxembourg cross border region: the use of cellular automata and decision tree learning model. In Geocomputing conference, Dallas, Texas, USA, 20–23 May.
Clarke, K. C., & Gaydos, L. J. (1998). Loose-coupling a cellular automata model and GIS: long-term urban growth prediction for San Francisco and Washington/Baltimore. International Journal of Geographic Information Science, 12, 699–714.
Couclelis, H. (1985). Cellular worlds: a framework for modeling micro-macro dynamics. Environment and Planning A, 20, 99–109.
Decoville, A., Durand, F., Sohn, C., & Walther, O. (2013). Comparing cross-border metropolitan integration in Europe: towards a functional typology. Journal of Borderlands Studies, 28(2), 221–237.
EEA. (2000). CORINE land cover technical guide – Addendum 2000. Technical report No 40 (p. 105). Copenhagen: European Environment Agency. URL http://reports.eea.europa.eu/tech40add/en.
Farr, T. G., Rosen, P. A., Caro, E., Crippen, R., Duren, R., Hensley, S., et al. (2007). The shuttle radar topography mission. Reviews of Geophysics, 45, RG2004. http://dx.doi.org/10.1029/2005RG000183. Available at: the website of Consultative Group for International Agriculture Research (CGIAR), CGIAR Consortium for Spatial Information (CGIAR-CSI) http://srtm.csi.cgiar.org/.
Gardner, M. (1970). The fantastic combination of John Conway's new solitaire game life. Scientific American, 223, 120–123.
Geist, H. J., & Lambin, E. F. (2001). What drives tropical deforestation? A meta-analysis of proximate and underlying causes of deforestation based on sub-national case study evidence. Louvain-la-Neuve, France: LUCC International Project Office, University of Louvain-la-Neuve.
Gerber, P. (2012). Advancement in conceptualizing cross-border daily mobility: the Benelux context in the European Union. European Journal of Transport and Infrastructure Research, 12(2), 178–197.
Goodman, R. M., & Smyth, P. (1988). Decision tree design from a communication theory standpoint. IEEE Transactions on Information Theory, 34(5), 979–994.
Guisan, A., & Zimmermann, N. E. (2000). Predictive habitat distribution models in ecology. Ecological Modelling, 135(2–3), 147–186.
Houet, T., Verburg, P. H., & Loveland, T. (2010). Monitoring and modelling landscape dynamics. Landscape Ecology, 25(2), 163–167.
Lambin, E. F., Geist, H. J., & Lepers, E. (2003). Dynamics of land use and land-cover change in tropical regions. Annual Review of Environmental Resources, 28, 205–241.
Lambin, E. F., Rounsevell, M. D. A., & Geist, H. J. (2000). Are agricultural land-use models able to predict changes in land-use intensity? Agriculture Ecosystems and the Environment, 82, 1–3.
Li, X., & Claramunt, C. A. (2006). Spatial entropy-based decision tree for classification of geographical information. Transactions in GIS, 10(3), 451–467.
Li, X., & Yeh, A. G. (2004). Data mining of cellular automata's transition rules. International Journal of Geographical Information Science, 18(8), 723–744.
Moore, D. M., Lees, B. G., & Davey, S. M. (1991). A new method for predicting vegetation distributions using decision tree analysis in a geographic information system. Environmental Management, 15(1), 59–71.
Paasi, A. (2005). Generations and the ‘development’ of border studies. Geopolitics, 10, 663–671.
Pal, M., & Mather, P. M. (2003). An assessment of the effectiveness of decision tree methods for land cover classification. Remote Sensing of Environment, 86(4), 554–565.
Phipps, M. (1989). Dynamical behavior of cellular automata under the constraint of neighborhood coherence. Geographical Analysis, 21, 197–204.
Pontius, R. G., Jr., Huffaker, D., & Denman, K. (2004a). Useful techniques of validation for spatially explicit land-change models. Ecological Modelling, 179(4), 445–461.
Pontius, R. G., Jr., Shusas, E., & McEachern, M. (2004b). Detecting important categorical land changes while accounting for persistence. Agriculture, Ecosystems & Environment, 101(2–3), 251–268.
Raileanu, L. E., & Stoffel, K. (2004). Theoretical comparison between the gini index and information gain criteria. Annals of Mathematics and Artificial Intelligence, 41(1), 77–93.
Razi, M. A., & Athappilly, K. (2005). A comparative predictive analysis of neural networks (NNs), nonlinear regression and classification and regression tree (CART) models. Expert Systems with Applications, 29(1), 65–74.
Schiebel, J., Omrani, H., & Gerber, P. (2015). Border effects on the travel mode choice of resident and cross-border workers in Luxembourg. European Journal of Transport and Infrastructure Research, 15(4), 570–596.
Sohn, C., Reitel, B., & Walther, O. (2009). Cross-border metropolitan integration in Europe. The case of Luxembourg, Basel and Geneva. Environment & Planning C, 27(5), 922–939.
Speybroeck, N., Berkvens, D., Mfoukou-Ntsakala, A., Aerts, M., Hens, N., Van Huylenbroeck, G., et al. (2004). Classification trees versus multinomial models in the analysis of urban farming systems in central Africa. Agricultural Systems, 80(2), 133–149.
Tobler, W. R. (1970). A computer movie simulating urban growth in the Detroit region. Economic Geography, 46(2), 234–240.
Tobler, W. R. (1979). Cellular geography. In S. Gale, & G. Ollson (Eds.), Philosophy in geography (pp. 279–386). Dordrecht: Reidel.
Turing, A. (1950). Computing machinery and intelligence. Mind, 59(236), 433–460.
Ulam, S. (1952). Random processes and transformations. In Proceedings of the International Congress of Mathematicians (Cambridge, Massachusetts, August 30–September 6, 1950) (vol. 2). Rhode Island: American Mathematical Society, 264–275.
Verburg, P. H., van Berkel, D. B., van Doorn, A., van Eupen, M., & van den Heiligenberg, H. (2010). Trajectories of land use change in Europe: a model-based exploration of rural futures. Landscape Ecology, 25(2), 217–232.
Vogt, J. V., Soille, P., de Jager, A., Rimaviciute, E., Mehl, W., Foisneau, S., et al. (2007a). A Pan-European river and Catchment database, European Commission, Directorate-General Joint Research Centre. JRC Reference Reports, EUR 22920 EN (p. 119). Ispra, Italy: Institute for Environment and Sustainability. URL http://desert.jrc.ec.europa.eu/action/php/index.php?action=view&id=23.
Von Neumann, J. (1951). The general and logical theory of automata. In L. A. Jeffress (Ed.), Cerebral Mechanism in Behavior-the Hixon Symposium, 1948 (pp. 1–41). Pasadena, CA, New York: Wiley.
White, R. (2006). Pattern based map comparisons. Journal of Geographical Systems, 8(2), 145–164.
White, R., & Engelen, G. (1997). The use of constrained cellular automata for high-resolution modelling of urban land use dynamics. Environment and Planning B, 24(3), 323–343.
White, R., & Engelen, G. (2000). High-resolution integrated modelling of the spatial dynamics of urban and regional systems. Computers, Environment and Urban Systems, 24, 383–400.
Wu, S., Silvan-Cardenas, J., & Wang, L. (2007). Per-field urban land use classification based on tax parcel boundaries. International Journal of Remote Sensing, 28(12), 2777–2801.
