Journal of Hydrology 395 (2010) 256–263
Application of GIS and remote sensing techniques in generation of land use scenarios for hydrological modeling
F. Oñate-Valdivieso a,⇑, Joaquín Bosque Sendra b
a Unidad de Ingeniería Civil, Geología y Minas, Universidad Técnica Particular de Loja, C/. Marcelino Champagnat S/N, 1101608 Loja, Ecuador
b Departamento de Geografía, Universidad de Alcalá, C/. Colegios 2, 28801 Alcalá de Henares, Madrid, Spain
Article info
Article history: Received 8 June 2010; Received in revised form 13 October 2010; Accepted 22 October 2010
This manuscript was handled by K. Georgakakos, Editor-in-Chief, with the assistance of V. Lakshmi, Associate Editor
Keywords: Land use change model; Hydrological scenario; LCM; Calibration of parameters; Sensitivity analysis
Summary
This research studies land use change in a binational hydrographic basin in South America and generates a future land use scenario according to the observed development trends. A multi-temporal analysis of land use change is carried out and the variables that can explain the observed transitions are selected. The relations between changes and explanatory variables are studied in order to stochastically model future land use maps. Persistence was found to be the predominant state. The most frequent transitions were observed in the boundary zones between categories. Biophysical variables had the most explanatory power, and the model based on logistic regression performed better than the one based on neural networks.
© 2010 Elsevier B.V. All rights reserved.
⇑ Corresponding author. Tel.: +593 72570275; fax: +593 72584893. E-mail addresses: [email protected] (F. Oñate-Valdivieso), Joaquin.bosque@uah.es (J. Bosque Sendra).
0022-1694/$ - see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.jhydrol.2010.10.033

1. Introduction
The transition processes among land covers are the result of complex interactions among physical, biological, economic and social factors. In most cases, these factors influence erosive processes, increases in surface runoff, changes in biodiversity, etc. (Mas, 1999). A land use scenario for a horizon year can be generated from a multi-temporal analysis of the land use changes that have occurred in the research area and their correlation with selected variables that can explain, to some extent, these changes.
The analysis and modeling of land use change identify the causes of change, its location and the moment when it happened. Their applications are varied; papers can be found on degradation and desertification (Huete et al., 2003; Geerken and Ilaiwi, 2004), droughts (Bayarjargal et al., 2006), floods (Liu et al., 2002), urban growth (Van Vliet et al., 2009), etc.
Changes in land use can be detected through two procedures. In the first, the satellite images from each date are classified independently and the resulting classifications are compared to identify changes. In the second, changes are detected through a simultaneous analysis of the images from different dates and the thematic classes are assigned afterwards (Van Oort, 2006; Coppin et al., 2004). Each of these procedures offers a different degree of reliability and accuracy depending on the application. The first approach is the most widely used in the literature (Pontius et al., 2004).
Changes are quantified through a cross tabulation of the land use at two different times (Pontius et al., 2004; Serra et al., 2008). In this way, the persistence, gain, loss and exchanges of each land use considered are determined.
Two approaches prevail for modeling the spatial patterns of land use change: (a) models based on regression and (b) models based on spatial transition. The first establish relations between a wide range of explanatory (predictor) variables and the observed changes in land use. These variables are later used to estimate the location of future changes in the landscape. The influence of these local factors on land cover change can be modeled as a distance-decay function, in which influence decreases as distance increases. Influence can also be represented by spatially distributed variables such as soil type, slope and elevation, whose effects are determined by their geographic location. Regression-based modeling has used different approaches, such as the linearization of an exponential relationship as in logistic regression, non-linear regression, and neural networks (Mas et al., 2004; Pijanowski et al., 2005). On the other hand, models based on spatial transitions
cover mainly stochastic techniques, which are based on Markov chains and cellular automata (CA) (Pontius and Malanson, 2005; Van Vliet et al., 2009). These models explicitly assume that neighboring areas influence the transition probability of the central cell.
For predictive models to become useful tools, they must adequately represent the magnitude of changes, the location of future changes and their spatial patterns (Brown et al., 2002).

1.1. Objective
The objective of this paper is to generate a land use scenario for the year 2012 that reflects the historic trends of change in the Catamayo-Chira Binational Basin. For this purpose, the changes in land use are analyzed, the effect of the possible explanatory variables is studied, and a land use scenario is developed through two different methodologies. The article first describes the study area and then explains in detail the procedures used to detect and quantify change. The explanatory variables considered and the methodologies used to model land use change towards the horizon year are then presented. Finally, the article focuses on the analysis of the relation between the explanatory variables and the transitions, and on the quality of the results obtained.

2. Methodology
2.1. Area of study
The Catamayo-Chira Binational Basin extends between 3°30′ and 5°8′ south latitude and between 79°10′ and 81°11′ west longitude. It lies in the southwestern border region between Ecuador and Peru and covers 17,199.19 km². It has 817,968 inhabitants. The basin originates in the mountains of the continental water divide (Ecuador) and ends at the Pacific Ocean (Peru), crossing the mountain ranges and the coast.
It has a tropical climate and a variety of ecosystems, land uses and administrative systems, which determine its very diverse natural and anthropic characteristics. The relief of the basin is abrupt, with altitudes ranging from 0 to 3700 m above sea level. Eleven life zones can be found in this setting, from tropical desert to mountain rainforest. Mean annual precipitation is about 800 mm, varying from 10 mm in the low zone to 1000 mm in the headwaters. In general terms, 14% of the surface of the basin is covered by tree vegetation, 41% by dry forest, 30% by pasture, 10% by crops and 5% by other uses.
In this work, the study area is the portion of the binational Catamayo Chira basin located upstream from the entrance of the Poechos reservoir, which occupies an area of 11,910.74 km². The location of the study zone can be observed in Fig. 1.

Fig. 1. Location of the binational Catamayo Chira Basin.

2.2. Land Change Modeler
The Land Change Modeler (LCM) from Idrisi Andes (Eastman, 2006) was used to carry out the research on land use change. This software has been developed to analyze and predict changes in land cover and to evaluate the implications of these changes for biodiversity. The LCM works sequentially and produces graphics and maps of land use change. It allows the transition potential of land use to be determined from one
category to a different one, taking into consideration the static or dynamic variables that explain the change. For predicting land use, planning elements that can stimulate or limit the changes are included. Additionally, it includes tools for land use change analysis and territorial planning.

2.3. Generation of historic maps of land use
In order to study the dynamics of land use, it is necessary to have maps that reflect the status of land use at different times. To create these maps, three images of the study zone were used: Landsat5 TM (02/11/1986), Landsat5 TM (24/07/1996) and Landsat7 ETM+ (02/10/2001). All correspond to the dry season in the study zone, in order to avoid possible classification errors due to seasonal changes in vegetation. Since Landsat images are used, the working scale of this study is 1:100,000 (Chuvieco, 2002, p. 163).
The geometric correction of the images was based on control points identified on the image. For the topographic correction the Civco method (1989, quoted by Chuvieco (2002, p. 275)) was used. The atmospheric correction was done using the Chavez method (1986, quoted by Chuvieco (2002, p. 272)). A cloud mask was produced with the Saunders and Kriebel algorithm (1988, quoted by Chuvieco (2002, p. 289)).
Using the minimum spectral angle method (Richards and Jia, 2006, p. 368; Chuvieco, 2002, p. 353), the images were classified in a supervised way into the seven predominant covers of the Catamayo Chira basin: natural forest, trees, dry forest, pasture, rice, cane and corn. The classification was based on the spectral signatures of selected training sites. A mode filter was applied to the classified images to eliminate the salt-and-pepper effect. The classification was validated with 100 field observations georeferenced by GPS.
The purpose is to determine the number of observations that match the classification and to express the result as a percentage. Finally, the images were reclassified. The categories were grouped as follows: natural forest and trees as tree vegetation; cane, corn and rice as crops. The dry forest and pasture categories were maintained. The high plateau cover and the clouds were grouped into one category called no data, which is considered to be non-dynamic over time. With this procedure, three land use maps for 1986, 1996 and 2001 were generated.
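The minimum spectral angle classification mentioned above can be sketched as follows. This is a minimal illustration, not the Idrisi implementation used in the study: the function name, the array layout and the reflectance values are assumptions made only for the example.

```python
import numpy as np

def spectral_angle_classify(pixels, signatures):
    """Assign each pixel to the class whose reference signature forms
    the smallest spectral angle with the pixel spectrum.

    pixels:     array (n_pixels, n_bands)
    signatures: array (n_classes, n_bands), e.g. mean spectra of training sites
    """
    pixels = np.asarray(pixels, dtype=float)
    signatures = np.asarray(signatures, dtype=float)
    # Cosine of the angle between every pixel and every class signature
    dots = pixels @ signatures.T
    norms = (np.linalg.norm(pixels, axis=1, keepdims=True)
             * np.linalg.norm(signatures, axis=1))
    angles = np.arccos(np.clip(dots / norms, -1.0, 1.0))
    return angles.argmin(axis=1), angles

# Hypothetical four-band signatures for two covers (illustrative values only)
signatures = np.array([[0.05, 0.08, 0.04, 0.45],
                       [0.10, 0.14, 0.16, 0.25]])
# Two pixels: a brighter copy of class 0 and a darker copy of class 1
pixels = np.array([[0.10, 0.16, 0.08, 0.90],
                   [0.05, 0.07, 0.08, 0.125]])
labels, angles = spectral_angle_classify(pixels, signatures)
```

Because the angle depends only on the direction of the spectrum, a pixel that is a scaled copy of a signature is still assigned to that class, which is why the method is relatively insensitive to illumination differences.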
2.4. Detection of changes and explanatory variables
Once the land use maps for 1986, 1996 and 2001 were obtained, the changes that occurred between 1986 and 1996 were analyzed in order to derive, from the relation between those changes and possible explanatory variables, a predictive model to be validated against the map obtained for 2001. The changes were studied by applying the methodology proposed by Pontius et al. (2004), which determines the persistence, gain, loss and exchanges among the thematic categories of each land use map from a cross tabulation of the maps.
The changes that occurred between the two analyzed years (1986, 1996) were identified, and each of them was treated as an independent transition submodel that relates the observed change to a certain number of variables that explain it.
Qualitative explanatory variables were introduced into the calculation through a transformation based on the empirical likelihood method (Kleinbaum and Klein, 2002, p. 101). In this case, the probability of a land use change was calculated as a function of the presence of a particular soil type (a qualitative variable). In this way, the categorical soil type map becomes a continuous map of the probability of occurrence of a particular land use transition, so that as many transformed soil type maps are obtained as there are transitions in the study area.
The explanatory variables of each transition submodel were selected according to their explanatory potential, evaluated through Cramer's V coefficient. This coefficient compares the explanatory variables, one at a time, with the thematic categories of the land use map of 1996; values close to or higher than 0.40 were accepted (Eastman, 2006).
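As an illustration of the screening step described above, Cramer's V between a categorized explanatory variable and a land use map can be computed from their contingency table. The sketch below is generic and is not the LCM routine; the function name is hypothetical, and it assumes that any continuous variable has already been grouped into classes.

```python
import numpy as np

def cramers_v(x_classes, y_classes):
    """Cramer's V between two categorical rasters (or flattened arrays).

    Assumes both inputs contain at least two distinct classes.
    """
    x = np.asarray(x_classes).ravel()
    y = np.asarray(y_classes).ravel()
    # Map class values to consecutive integer codes
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    # Contingency table of observed counts
    table = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(table, (xi, yi), 1)
    n = table.sum()
    # Expected counts under independence and the chi-square statistic
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    chi2 = ((table - expected) ** 2 / expected).sum()
    k = min(table.shape) - 1
    return float(np.sqrt(chi2 / (n * k)))

# Toy check: a variable that fully determines the land use gives V = 1,
# one that is independent of it gives V = 0
v_perfect = cramers_v([0, 0, 1, 1], ["forest", "forest", "crops", "crops"])
v_none = cramers_v([0, 0, 1, 1], [0, 1, 0, 1])
```

Under the acceptance rule quoted in the text, a variable would be retained for a submodel when the returned value is close to or above 0.40.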
In some cases, it was necessary to try several combinations of explanatory variables until the best fit between them and the transitions was obtained. Six biophysical explanatory variables that condition the occurrence of the different covers were considered: elevation (DEM), slope, total annual precipitation, distance to watercourses, distance to the initial location of each cover, and soil type. As anthropic explanatory variables, the distance to roads and the distance to cities were selected, considering that the presence of roads and the proximity of cities motivate and facilitate agricultural and timber exploitation and can directly affect changes in use or cover.

2.5. Modeling of the land use change
Using the selected explanatory variables, the probability of occurrence of each transition was calculated with two alternative modeling approaches: logistic regression (Kleinbaum and Klein, 2002, p. 4) between the observed changes and the variables that may explain them, and multilayer perceptron neural networks (Pijanowski et al., 2002; Chuvieco, 2002, p. 412), which model land use change by learning the relation between the changes and the explanatory variables.
The land use change to the horizon year (2001) was modeled with Markov chains, using the cover map of the final date (1996) and the transition probabilities already calculated. This modeling yielded the areas of gain and loss of each category in the horizon year.
The map of the future land use was defined through a procedure of multiobjective allocation of land uses (MOLA) (Bosque and García, 1999; Eastman, 2006); for this purpose, all the observed transitions were considered, creating a list of host classes (that will lose parts of land) and a list of demanding classes (that will gain
land). Using the selected explanatory variables, the most suitable locations for each change are determined, and land belonging to the host classes is then assigned to the demanding classes. The results of the land use reallocation were superimposed to produce the final result (Eastman, 2006).
With the described procedure, two maps that predict the land use for 2001 were generated, based on the modeled relation between the observed changes and the explanatory variables. These relations were modeled with logistic regression and with neural networks.

2.6. Validation of the land use models and generation of scenarios
For the validation, the map extracted from the 2001 image was used as the reference, and confusion matrices were used to study the correspondence between the reference map and the maps obtained through neural networks and logistic regression. This made it possible to determine the land use forecast errors of each model, as well as the omission and commission errors. The global reliability of the classification was calculated from the confusion matrix as the ratio between the number of correctly assigned pixels and the total number of pixels in the image (Chuvieco, 2002, p. 492), and the fit between the reference map and the generated maps was evaluated through the Kappa index (Pontius et al., 2001).
Once the fit was analyzed, a land use map for the year 2012 was generated. For this purpose, the land use maps of 1996 and 2001, the explanatory variables selected for each transition, and the model with the best performance were used. In a future study, that map will be used as a scenario representing the vegetation status in the horizon year if its evolution follows the observed historical trend. Its effect will be analyzed through a semi-distributed hydrological model of the study area.

3. Analysis of results
3.1. Change detection
Table 1 summarizes the cross tabulation between the maps of 1986 and 1996. Persistence predominates in all covers (diagonal values) and is equivalent to 80.5% of the total surface of the study area. There is an increase in the areas of shrub-type vegetation and dry forest, and a decrease in the grassland and crop surfaces.
The increase of shrub-type vegetation occurred most intensely in zones initially occupied by grassland. Crop areas expanded mainly at the expense of dry forest, but crops show a net loss because of conversions to dry forest, grassland and shrub-type vegetation. These decreases can possibly be attributed to the abandonment of agricultural zones caused by socio-economic factors such as the limited availability of water resources, high production costs and migration. The increase of dry forest is distributed almost equally among zones initially occupied by shrub-type vegetation, crops and grassland.
Most changes occurred in the contact zones between covers. Crop areas in the low zone of the basin are expanding and intruding on dry forest; something similar occurs in the middle and high parts of the basin, where natural vegetation is cleared to extend grassland and crops for agricultural purposes. Grasslands can become cultivated zones if the farmer so decides. When crop zones are abandoned, they become grassland and later, through natural regeneration, shrub-type vegetation returns.
Table 1
Cross tabulation of the studied coverage in 1986 (horizontal) and in 1996 (vertical).

                        Shrub-type vegetation   Dry forest   Grassland   Crops      Total
Shrub-type vegetation   111161.61               6678.70      51008.73    5369.52    174218.56
Dry forest              16134.60                204018.96    18600.04    14461.56   253215.16
Grassland               32327.18                15217.83     433629.25   8891.21    490065.48
Crops                   1177.26                 9707.99      6198.05     14243.09   31326.38
Total                   160800.65               235623.48    509436.08   42965.38   2367935.15
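The persistence, gain, loss and exchange figures discussed in the text can be derived directly from the cells of Table 1. The sketch below assumes that columns correspond to 1986 and rows to 1996, as the caption indicates, and takes the total area as the sum of the sixteen listed cells; variable names are illustrative, and the exchange term follows the "swap" definition of Pontius et al. (2004).

```python
import numpy as np

classes = ["shrub-type vegetation", "dry forest", "grassland", "crops"]
# Rows: cover in 1996, columns: cover in 1986 (values from Table 1)
cross = np.array([
    [111161.61,   6678.70,  51008.73,  5369.52],
    [ 16134.60, 204018.96,  18600.04, 14461.56],
    [ 32327.18,  15217.83, 433629.25,  8891.21],
    [  1177.26,   9707.99,   6198.05, 14243.09],
])

persistence = np.diag(cross)           # area that kept the same cover
area_1986 = cross.sum(axis=0)          # column totals
area_1996 = cross.sum(axis=1)          # row totals
gain = area_1996 - persistence         # area gained by each cover
loss = area_1986 - persistence         # area lost by each cover
net_change = gain - loss
swap = 2 * np.minimum(gain, loss)      # exchange between covers
persistence_share = 100 * persistence.sum() / cross.sum()

for name, g, l, n in zip(classes, gain, loss, net_change):
    print(f"{name:22s} gain={g:10.2f} loss={l:10.2f} net={n:10.2f}")
```

With these cells the persistence share comes out at about 80.4%, close to the 80.5% quoted in the text, and the net change is positive for shrub-type vegetation and dry forest and negative for grassland and crops.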
3.2. Explanatory variables
Table 2 shows the level of association between the continuous explanatory variables and the covers studied. The weakest explanatory variable is the distance to rivers, with values below 0.1. Its limited association with the analyzed covers may be caused by the notably irregular topography of the study zone. Higher values were expected in the case of crops, since the largest agricultural areas are found near the watercourses.
The slope, like the distance to roads, shows limited association with the existing land uses (Table 2). The total annual precipitation shows low levels of association with the covers considered, although higher than those observed for the slope. The distance to cities shows low association with grassland and crop zones, although it presents an acceptable value for dry forest.
The highest association is found between elevation (DEM) and the analyzed land uses (Table 2), which supports the initial assumption that elevation conditions the presence of particular plant species. High levels of association can also be observed between the land uses in 1996 and the distances to zones that had dry forest, shrub-type vegetation and grassland in 1986. This suggests that proximity between different land uses facilitates exchanges between them.
The presence of dry forest (Table 2) is predominantly related to elevation and to the distance to zones of shrub-type vegetation in 1986. This could explain the presence of this land use in low-elevation zones and the observed changes from shrub-type vegetation to dry forest in zones of relatively low elevation. Grasslands show the strongest association with the distance to grasslands and to dry forest in 1986 (Table 2). This can be an important factor in the expansion of grassland into zones mostly covered by dry forest.
Crops are most strongly related to elevation and to the distance to grassland in 1986 (Table 2). This is explained by the fact that crops are found in relatively low zones and that, in many cases, abandoned agricultural zones are naturally covered by grass because of their proximity to grassland, and vice versa.
None of the analyzed explanatory variables presented a significant level of association with the occurrence of shrub-type vegetation. This low explanatory power can possibly be attributed to the fact that shrub-type vegetation originates through natural regeneration, which is conditioned by biological factors such as the level of competition among seedlings, the distance to the nearest adult shrubs and the distance to other woody species (Pérez-Ramos, 2007). Factors of the physical environment also exert a significant influence on regeneration, notably soil water availability, light intensity in the understory, and other edaphological parameters related to fertility, acidity or the thickness of the litter layer (Pérez-Ramos, 2007). None of these factors was considered in this study, mainly because of the difficulty of mapping them and because they require a level of detail finer than the scale used in this work.
Soil type does not show a relevant level of association with the presence of shrub-type vegetation. For the remaining covers it presents values higher than 0.30, which, according to Eastman (2006), can be useful for modeling land cover change.

3.3. Submodels of transition
In several cases, the transition submodels included explanatory variables whose Cramer's V exceeded the acceptance threshold. In other cases, it was necessary to include variables that, although they presented low values of this parameter, improved the correlation between explanatory variables and transitions.
Table 3 shows the transition submodels, the variables in each submodel and the results of the logistic regression, including the coefficient of each explanatory variable in the logistic regression equation and the degree of correlation between variables and transitions (ROC). In all cases, the correlation between transitions and explanatory variables (ROC) exceeds 80%.
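For reference, a logistic regression submodel of this kind is usually written in the standard form below. Reading the "Independent" term of Table 3 as the intercept β0 and the remaining entries as the coefficients βi of the explanatory variables xi is an assumption based on the table headings; the text does not state the equation explicitly.

```latex
P(\text{transition} \mid x_1,\dots,x_n)
  = \frac{1}{1 + \exp\!\left[-\left(\beta_0 + \sum_{i=1}^{n} \beta_i x_i\right)\right]}
```

In this form a positive βi means that the transition probability increases with xi (direct relation) and a negative βi means that it decreases (inverse relation).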
Table 2
Cramer's V coefficient: association level of the quantitative explanatory variables and the studied land uses.

Variable                                    Shrub-type vegetation   Dry forest   Grassland   Crops
DEM                                         0.000                   0.497        0.522       0.683
Slope                                       0.000                   0.105        0.185       0.163
Edaphic soil type                           0.000                   0.341        0.434       0.373
Total annual precipitation                  0.000                   0.233        0.318       0.229
Distance to rivers                          0.000                   0.019        0.060       0.048
Distance to dry forest in 1986              0.000                   0.443        0.837       0.673
Distance to shrub-type vegetation in 1986   0.000                   0.654        0.431       0.416
Distance to crops in 1986                   0.000                   0.321        0.102       0.271
Distance to grassland in 1986               0.000                   0.267        0.642       0.747
Distance to roads                           0.000                   0.286        0.104       0.129
Distance to cities                          0.000                   0.359        0.141       0.184

Table 3
Results of logistic regression: modeled transitions (submodels of transition), correlation degree (ROC), explanatory variables and coefficients of each explanatory variable in the regression equation.

Term                                         Coefficient in the logistic regression equation

From shrub-type vegetation to dry forest (ROC 0.9619)
  Independent                                11.64781610
  Soil type                                  2.31351393
  Distance to cities                         0.73700496
  Distance to roads                          0.01894947
  Distance to dry forest in 1986             1.24138015
  Dem                                        1.73454320

From shrub-type vegetation to grasslands (ROC 0.8550)
  Independent                                29.30334746
  Soil type                                  3.93446790
  Distance to cities                         0.46821253
  Distance to grasslands in 1986             1.12111169
  Dem                                        2.70293381

From shrub-type vegetation to crops (ROC 0.9511)
  Independent                                23.61843169
  Soil type                                  10.16724732
  Distance to cities                         0.21055266
  Distance to roads                          0.35086532
  Distance to grasslands in 1986             0.82993113
  Dem                                        2.63246543

From dry forest to shrub-type vegetation (ROC 0.9262)
  Independent                                7.52194619
  Dem                                        2.01211903
  Soil type                                  0.02095232
  Distance to cities                         0.31118093
  Distance to shrub-type vegetation          1.07188171
  Soil type                                  2.07990532

From dry forest to grasslands (ROC 0.9023)
  Independent                                18.71824490
  Dem                                        3.08577471
  Soil type                                  1.90792026
  Distance to cities                         0.15072886
  Distance to grasslands in 1986             0.81043991

From dry forest to crops (ROC 0.8901)
  Independent                                7.66879065
  Soil type                                  4.21369442
  Distance to cities                         0.01049975
  Distance to roads                          0.15799291
  Dem                                        0.78416923
  Distance to crops in 1986                  0.81895196

From grassland to shrub-type vegetation (ROC 0.8740)
  Independent                                7.57213473
  Dem                                        1.68691095
  Soil type                                  0.93310520
  Distance to shrub-type vegetation in 1986  1.24074251

From grassland to dry forest (ROC 0.9794)
  Independent                                17.17056084
  Distance to cities                         0.66496870
  Distance to roads                          0.03044125
  Distance to dry forest in 1986             1.37560559
  Dem                                        2.14996078
  Soil type                                  8.64315221

From grassland to crops (ROC 0.9298)
  Independent                                16.12339406
  Soil type                                  3.11453739
  Distance to cities                         0.15612132
  Distance to cities                         0.47700557
  Dem                                        0.67087987
  Distance to crops in 1986                  1.68747988

From crop to shrub-type vegetation (ROC 0.9031)
  Independent                                4.23241181
  Dem                                        1.63607407
  Soil type                                  0.28243151
  Distance to shrub-type vegetation in 1986  1.31503270

From crop to dry forest (ROC 0.8137)
  Independent                                1.54771212
  Dem                                        1.18424277
  Distance to roads                          0.12008042
  Distance to dry forest in 1986             1.30541035
  Soil type                                  1.40525690

From crop to grassland (ROC 0.8082)
  Independent                                13.06317561
  Soil type                                  3.75104014
  Dem                                        2.36610567
  Distance to grasslands in 1986             0.90143128

The sign of the coefficients of the logistic regression equation indicates whether the relation between an explanatory variable and a transition is direct or inverse. Elevation (DEM) is inversely related to transitions whose final cover requires low elevations (crops, shrub-type vegetation) and directly related to transitions that require medium or high elevations (grasslands and shrub-type vegetation). Soil type is directly related to almost all of the transitions because soil characteristics determine the type of cover. The distance to cities and roads is directly related to the occurrence of forest-type covers and inversely related to agricultural covers. The distance to initial occupations in 1986 is inversely related to the
current occurrence of all of the occupations, thus confirming the initial assumptions.
The application of multilayer perceptron (MLP) neural networks to the relation between the analyzed transitions and the possible explanatory variables shows that the RMSE in the training phase, as well as the error calculated for the validation phase, reaches values lower than 0.0022. Accuracy in modeling transitions exceeds 70% in all cases. Because of the way neural networks are conceived, the relations between explanatory variables and modeled transitions depend on the weights applied to the activation functions in the network. This makes the existing relations difficult to interpret, since the procedure works as a black box.

Table 4
Likelihood of transition between occupations of soil.

                        Shrub-type vegetation   Dry forest   Grassland   Crops
Shrub-type vegetation   0.8134                  0.0595       0.1238      0.0033
Dry forest              0.0131                  0.9223       0.0314      0.0332
Grassland               0.0619                  0.0156       0.9133      0.0091
Crops                   0.0991                  0.2656       0.1486      0.4867

The likelihood of transition calculated for each type of cover is shown in Table 4. Persistence is the predominant likelihood; the transitions of crops to grassland and to dry forest, and of shrub-type vegetation to grassland, also stand out. This suggests that when crop zones are abandoned they are more likely to become dry forest than grassland, possibly because of the water stress conditions in the low zones of the surveyed area, where the largest crop areas are located. In the high zones, rainforests are being deforested and transformed into grasslands for livestock purposes. A transition to crop zones is less likely because of topographic conditions, and a transition to dry forest because of climate conditions.
Grasslands are more likely to persist than to change to other types of cover. Crops have an approximately 50% likelihood of persistence; this is possibly because some zones are dedicated exclusively, and for long periods of time, to growing sugar cane in the middle part of the basin, rice in the low middle part, and corn in the northwest. The other crop zones are near the existing rivers and are subject to variations in the types of crops, abandonment due to drought, or destruction caused by floods. For this reason, these zones are more susceptible to transition. If this fact is considered, it is reasonable to think that the likelihood of persistence of crops is low.

Fig. 2. Maps based on logistic regression and neural networks (MLP) compared with the map extracted from the 2001 image.
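The role of the transition likelihoods of Table 4 in a Markov chain projection can be illustrated with a short sketch. It assumes that rows of Table 4 are the origin cover and columns the destination cover (the rows sum to approximately 1, which supports this reading) and that the matrix applies to a single projection step. Pairing it with the 1996 totals of Table 1 is done only for illustration and does not reproduce the areas modeled in the study.

```python
import numpy as np

classes = ["shrub-type vegetation", "dry forest", "grassland", "crops"]
# Transition likelihoods from Table 4 (rows: from, columns: to)
P = np.array([
    [0.8134, 0.0595, 0.1238, 0.0033],
    [0.0131, 0.9223, 0.0314, 0.0332],
    [0.0619, 0.0156, 0.9133, 0.0091],
    [0.0991, 0.2656, 0.1486, 0.4867],
])
# Illustrative starting areas: 1996 row totals of Table 1
area_now = np.array([174218.56, 253215.16, 490065.48, 31326.38])

# One Markov step: the expected area of each cover is the sum of what
# every origin cover contributes to it
area_next = area_now @ P
expected_change = area_next - area_now

for name, a0, a1 in zip(classes, area_now, area_next):
    print(f"{name:22s} {a0:12.2f} -> {a1:12.2f}")
```

A Markov step of this kind only gives the expected quantity of each cover; where the changes are placed on the map is decided separately, in the study through the transition potential maps and the MOLA allocation.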
3.4. Models for change in land use and validation
Fig. 2 presents the land use maps obtained through logistic regression and neural networks. These maps project to 2001 the changes observed between 1986 and 1996, and they can be visually compared with the land use map of the 2001 image.
Comparing the map obtained through logistic regression with the map extracted from the 2001 image, great similarity can be observed in the areas with shrub-type vegetation, dry forest and grassland; the crop areas show the greatest differences, since they underestimate the areas observed in the reference image.
Comparing the map obtained through neural networks with the map extracted from the 2001 image, there is an overestimation of the areas covered by shrub-type vegetation, grassland and crops, and a certain underestimation of the areas of dry forest. Some pixels of dry forest even appear (in the south part of the map) in zones where the biophysical conditions would limit their presence.
Table 6 shows the confusion matrix between the map extracted from the 2001 image and the map created through logistic regression. A larger number of pixels agree between the thematic classes of both maps. On the other hand, a considerable number of pixels (37,593) which belong to shrub-type vegetation in the map of 2001 have been classified as grassland in the map created by logistic regression. A similar case is the grassland of the 2001 image that has been modeled through logistic regression as shrub-type vegetation (33,203 pixels) or dry forest (19,690 pixels). For crops, the commission error reaches a maximum of 42.5% and the omission error 52.3%.
Table 7 shows the global reliability calculated from the confusion matrices of Tables 5 and 6, the Kappa statistic, and the correlation coefficient between the reference map of 2001 and the land use maps generated with logistic regression and neural networks.

Table 5
Confusion matrix between the map extracted from 2001 image and the map created through neural networks (MLP) ((1) shrub-type vegetation, (2) dry forest, (3) grassland, (4) crops).

                     Map of 2001 (reference)
Map 2001 MLP         1         2         3         4         Total       Commission error (%)
1                    145,457   10,654    51,990    4,060     212,161     31.4
2                    13,828    256,148   29,052    10,245    309,273     17.2
3                    45,660    20,054    489,723   17,822    573,259     14.6
4                    568       13,998    4,742     14,845    34,153      56.5
Total                205,513   300,854   575,507   46,972    1,128,846
Omission error (%)   29.2      14.9      14.9      68.4
The map created through logistic regression has a global reliability of 86.00%, a Kappa index of 0.901 and a correlation coefficient of 0.941. These values are at acceptable levels and indicate a good fit between the modeled map and the reference map. They also clearly exceed the values reached by the map obtained through neural networks (80.27%, 0.862 and 0.911, respectively).
Table 6
Confusion matrix between the map extracted from the 2001 image (reference, columns) and the map created through logistic regression (LogReg, rows). Classes: (1) shrub-type vegetation, (2) dry forest, (3) grassland, (4) crops.

| Map 2001 (LogReg) \ Map of 2001 (reference) | 1 | 2 | 3 | 4 | Total | Commission error (%) |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 162,437 | 10,262 | 33,203 | 3111 | 209,013 | 22.3 |
| 2 | 3909 | 269,829 | 19,690 | 5905 | 299,333 | 9.9 |
| 3 | 37,593 | 12,348 | 516,059 | 15,528 | 581,528 | 11.3 |
| 4 | 1574 | 8415 | 6555 | 22,428 | 38,972 | 42.5 |
| Total | 205,513 | 300,854 | 575,507 | 46,972 | 1,128,846 | |
| Error of omission (%) | 21.0 | 10.3 | 10.3 | 52.3 | | |
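The error figures in Table 6 follow directly from the pixel counts, and a short script can make that arithmetic explicit. The sketch below is illustrative only: the function and variable names are hypothetical, and it assumes the usual definitions (commission error from row totals, omission error from column totals, and a plain Cohen's kappa from the row and column marginals). It is not the software used by the authors.

```python
# Pixel counts from Table 6.
# Rows: classes in the LogReg map; columns: classes in the 2001 reference map.
# Class order: shrub-type vegetation, dry forest, grassland, crops.
MATRIX = [
    [162437, 10262, 33203, 3111],
    [3909, 269829, 19690, 5905],
    [37593, 12348, 516059, 15528],
    [1574, 8415, 6555, 22428],
]


def accuracy_summary(matrix):
    """Return per-class errors (%), overall agreement (%) and Cohen's kappa."""
    n = len(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    total = sum(row_totals)
    diagonal = [matrix[k][k] for k in range(n)]

    # Commission: pixels mapped to class k that belong to another reference class.
    commission = [100.0 * (row_totals[k] - diagonal[k]) / row_totals[k] for k in range(n)]
    # Omission: reference pixels of class k that the model mapped to another class.
    omission = [100.0 * (col_totals[k] - diagonal[k]) / col_totals[k] for k in range(n)]

    observed = sum(diagonal) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    kappa = (observed - expected) / (1.0 - expected)

    return {
        "commission": commission,
        "omission": omission,
        "overall": 100.0 * observed,
        "kappa": kappa,
    }


summary = accuracy_summary(MATRIX)
print([round(x, 1) for x in summary["commission"]])
print([round(x, 1) for x in summary["omission"]])
print(round(summary["overall"], 2), round(summary["kappa"], 3))
```

Under these assumptions, the commission and omission errors and the overall agreement reproduce the values in Tables 6 and 7. The plain Cohen's kappa comes out near 0.78, below the Kappa index of 0.901 reported in Table 7. This section does not state which Kappa variant was used, so the script should not be read as a reproduction of that figure.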
Table 7
Validation parameters between the map extracted from the 2001 image and the maps created through logistic regression (LogReg) and neural networks (MLP).

| Parameter | 2001 MLP | 2001 LogReg |
| --- | --- | --- |
| Global reliability (%) | 80.27 | 86.00 |
| Kappa | 0.862 | 0.901 |
| R | 0.911 | 0.941 |
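The global reliability values in Table 7 can be cross-checked from the reference totals and the omission errors printed at the foot of Tables 5 and 6, since the share of correctly labeled pixels equals one minus the total-weighted omission error. The following is a minimal sketch with hypothetical names. Because the published omission errors are rounded to one decimal, the result can only be expected to agree to within rounding.

```python
# Reference (2001 image) pixel totals per class, shared by Tables 5 and 6.
# Class order: shrub-type vegetation, dry forest, grassland, crops.
REFERENCE_TOTALS = [205513, 300854, 575507, 46972]

# Omission errors (%) as printed in the last row of each table.
OMISSION_MLP = [29.2, 14.9, 14.9, 68.4]     # Table 5, neural networks
OMISSION_LOGREG = [21.0, 10.3, 10.3, 52.3]  # Table 6, logistic regression


def global_reliability(totals, omission_percent):
    """Percent of reference pixels labeled correctly, from per-class omission errors."""
    missed = sum(t * e / 100.0 for t, e in zip(totals, omission_percent))
    return 100.0 * (1.0 - missed / sum(totals))


mlp = global_reliability(REFERENCE_TOTALS, OMISSION_MLP)
logreg = global_reliability(REFERENCE_TOTALS, OMISSION_LOGREG)
print(f"MLP: {mlp:.2f}%  LogReg: {logreg:.2f}%")
```

Both values land within a few hundredths of a percent of the 80.27% and 86.00% listed in Table 7, which supports reading the global reliability as the overall proportion of agreeing pixels.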
Fig. 3. Land use scenario projected for 2012.
Comparing the results of Tables 5 and 6, we can observe that the land use projection obtained through logistic regression shows smaller errors of omission and commission. The errors in the estimated occurrence of each cover are also smaller. Table 7 shows that the validation parameters of the model obtained through logistic regression again exceed those obtained through neural networks. These results suggest that logistic regression should give better results when used to create the land use model for 2012.

Crops proved to be the most difficult land use to predict with either neural networks or logistic regression, because their presence depends not only on biophysical and anthropic factors, such as those considered in this study, but also on socio-economic factors such as supply and demand for certain agricultural products, migration, economic conditions of the region and political decisions, as well as on the occurrence of weather events such as droughts or floods.

The land use scenario projected for 2012 was generated by applying logistic regression; it is shown in Fig. 3. It is very similar to the map of 2001, which can be attributed to the low dynamism of the explanatory variables.

4. Conclusions

Persistence of the different land covers is the predominant state in the study area; changes occur mainly in the boundary areas between categories. The greatest explanatory power for the occurrence of the different covers was provided by biophysical variables, such as elevation (DEM, Digital Elevation Model) and the distance to the areas that showed dry forest, shrub-type vegetation and grassland at the start date of the study. The anthropogenic variables provided little explanatory power.
The best estimate of land use change was produced by logistic regression; the result obtained through neural networks showed an overestimation of the shrub-type vegetation areas. The most difficult land use to predict was crops, because of its dependence on socioeconomic and climatic factors that apparently outweigh those of a biophysical and anthropic nature. The scenario generated for 2012 showed a remarkable similarity to the land use map of 2001; this is due to the low dynamism of the explanatory variables. The difficulty of realistically representing socioeconomic variables that could explain the evolution of land use, especially the changes in crops, may also contribute to this result. The multi-temporal study of land use change and its explanatory variables, together with prediction based on logistic regression or neural networks, provides a useful tool for creating scenarios whose effect on the hydrological regime of a river basin can be studied with a hydrological model.
References

Bayarjargal, Y., Karnieli, A., Bayasgalan, M., Khudulmur, S., Gandush, C., Tucker, C.J., 2006. A comparative study of NOAA–AVHRR derived drought indices using change vector analysis. Remote Sensing of Environment 105, 9–22.
Bosque, J., García, R., 1999. Asignación óptima de usos del suelo mediante generación de parcelas por medio de SIG y técnicas de evaluación multicriterio. VII Conferencia Iberoamericana sobre SIG. Mérida, Venezuela.
Brown, A., Goovaerts, P., Burnicki, A., Li, M., 2002. Stochastic simulation of landcover change using geostatistics and generalized additive models. Photogrammetric Engineering and Remote Sensing 68 (10), 1051–1061.
Chuvieco, E., 2002. Teledetección ambiental: La observación de la Tierra desde el espacio. Editorial Ariel, Barcelona, España. 586 p.
Coppin, P., Jonckheere, I., Nackaerts, K., Muys, B., 2004. Digital change detection methods in ecosystem monitoring: a review. International Journal of Remote Sensing 25 (9), 1565–1596.
Eastman, J.R., 2006. IDRISI Andes. Tutorial. Clark-Labs, Clark University, Worcester, MA.
Geerken, R., Ilaiwi, M., 2004. Assessment of rangeland degradation and development of a strategy for rehabilitation. Remote Sensing of Environment 90, 490–504.
Huete, A.R., Miura, T., Gao, X., 2003. Land cover conversion and degradation analyses through coupled soil–plant biophysical parameters derived from Hyperspectral EO-1 Hyperion. IEEE Transactions on Geoscience and Remote Sensing 41 (6), 1268–1276.
Kleinbaum, D.G., Klein, M., 2002. Logistic Regression. A Self-Learning Text, second ed. Springer, N.Y. 513 p.
Liu, Z., Huang, F., Li, L., Wan, E., 2002. Dynamic monitoring and damage evaluation of flood in north-west Jilin with remote sensing. International Journal of Remote Sensing 23 (18), 3669–3679.
Mas, J.F., 1999. Monitoring land-cover changes: a comparison of change detection techniques. International Journal of Remote Sensing 20 (1), 139–152.
Mas, J.F., Puig, H., Palacio, J.L., Sosa-Lopez, A., 2004. Modelling deforestation using GIS and artificial neural networks. Environmental Modelling and Software 19, 461–471.
Pérez-Ramos, I.M., 2007. Factores que condicionan la regeneración natural de especies leñosas en un bosque mediterráneo del sur de la Península Ibérica. Ecosistemas 16 (2), 131–136.
Pijanowski, B.C., Brown, D.G., Shellito, B.A., Manik, G.A., 2002. Using neural networks and GIS to forecast land use changes: a Land Transformation Model. Computers, Environment and Urban Systems 26, 553–575.
Pijanowski, B., Pithadia, S., Shellito, B., Alexandridis, K., 2005. Calibrating a neural network-based urban change model for two metropolitan areas of the Upper Midwest of the United States. International Journal of Geographical Information Science 19 (2), 197–215.
Pontius, R., Malanson, J., 2005. Comparison of the structure and accuracy of two land change models. International Journal of Geographical Information Science 19 (2), 243–265.
Pontius, R., Cornell, J., Hall, C., 2001. Modeling the spatial pattern of land-use change with GEOMOD2: application and validation in Costa Rica. Agriculture, Ecosystems and Environment 85, 191–203.
Pontius, R., Shusas, E., McEachern, M., 2004. Detecting important categorical land changes while accounting for persistence. Agriculture, Ecosystems and Environment 101 (2/3), 251–269.
Richards, J.A., Jia, X., 2006. Remote Sensing Digital Image Analysis, fourth ed. Springer, Berlin, Germany. 439 p.
Serra, P., Ponsa, X., Saurí, D., 2008. Land-cover and land-use change in a Mediterranean landscape: a spatial analysis of driving forces integrating biophysical and human factors. Applied Geography 28, 189–209.
Van Oort, P.A.J., 2006. Interpreting the change detection error matrix. Remote Sensing of Environment 108 (1), 1–8. doi:10.1016/j.rse.2006.10.012.
Van Vliet, J., White, R., Dragicevic, S., 2009. Modeling urban growth using a variable grid cellular automaton. Computers, Environment and Urban Systems 33, 35–43.