An information management system for forecasting environmental change

Computers ind. Engng Vol. 31, No. 1/2, pp. 289-292, 1996. Copyright © 1996 Elsevier Science Ltd. Printed in Great Britain. All rights reserved.

Pergamon S0360-8352(96)00133-7

0360-8352/96 $15.00 + 0.00

19th International Conference on Computers and Industrial Engineering

Vicente Fernando Silveira¹, Suresh K. Khator² and Ricardo Miranda Barcia³

¹Fundacao do Meio Ambiente, 88010-970 Florianopolis, SC, Brasil; CNPq - Conselho Nacional de Desenvolvimento Cientifico e Tecnologico
²Department of Industrial & Management Systems Engineering, Univ. of South Florida, Tampa, FL 33620-5350, USA
³Department of Production Engineering, Federal University of Santa Catarina, C.P. 476, 88040-900, Florianopolis, SC, Brasil

ABSTRACT
Predicting environmental change gives policy makers indications for a better allocation of natural, economic and social resources. Intensive use of these resources may lead to unpredictable results because of the highly non-deterministic characteristics of environmental data. An integrated model using Remote Sensing, Geographic Information Systems and Artificial Neural Networks is proposed in order to assess possible causal relationships that lead to environmental modifications.

KEYWORDS
Information management system; forecasting; environmental change; landscape assessment; remote sensing; geographic information system; artificial neural network

AVAILABLE TOOLS
Environmental monitoring and assessment involve the manipulation of complex and voluminous data sets. The demand for efficient storage, analysis and display of this kind of data has led to intensive use of computers and to the development of sophisticated information systems.

Geographic Information Systems (GIS) play a very important role in transferring and analyzing data from the real world. A GIS consists of a powerful set of tools for collecting, storing, retrieving, transforming and displaying spatial data from the real world for a particular set of objectives. GIS has had a tremendous impact on every field that analyzes spatially distributed data. GIS techniques and their associated data sources (especially remote sensing) are becoming central to environmental monitoring and assessment. The ability of GIS to manage large amounts of data of different origins, formats and scales has made it possible to approach environmental studies in different ways. The main objectives of GIS are to process spatial data by: creating digital abstractions of the real world (encoding); handling these data efficiently (storage); finding spatial correlations among variables (analysis); and showing the results (display) (Berry, 1993). The effort to develop the full potential of GIS has probably relied more on intellectual achievement (creativity) than on significant technological advances. In other words, the system has no inherent answers, and it should therefore be used as an analytical tool, an extension of thought. Recently, interest in GIS has been evolving from simple inventory and management of data to include more sophisticated data analysis and modeling capabilities (Goodchild, 1993). Efforts toward the development of intelligent GIS (IGIS) to support spatial analysis decisions will certainly be a central point of environmental research in the next decade.

Remote Sensing (RS) is defined as a process that involves collecting data about an object, an area, or a phenomenon without physical contact with the source, and interpreting these data to extract information (Lillesand et al., 1994). The pictorial products of remote sensing are photographs and electronic images. More commonly, the term image is used to represent the data from any scene. RS has been used broadly for ecological research, regional planning and the management of landscapes. A claimed advantage is that measurements can be repeated on a regional scale. In developing countries, where census and survey data are scarce, RS has made a significant contribution by providing updated information for both government and private users.

Neurocomputing technology, or simply artificial neural networks (ANN), is an area of information science whose models are inspired by biological nervous systems. An ANN is composed of many simple elements working in parallel, imitating biological neurons. It can be trained to perform a particular function by adjusting the weights of the connections among these elements. The ANN may be considered a "black box" that maps a given input to an output. A computation is carried out by mapping input values to output values, and the mapping problem fixes the number of inputs as well as the number of outputs of the network. Given a data set, the net can be trained to find a relationship between the inputs and the outputs. Later, this ANN can be applied to unseen data to perform the same operation. Some authors recognize that the concept of ANN has many analogies in the statistical world. One possible application of the neural net is to consider it a "super-form of multiple regression" (Hewitson et al., 1994). The net can find an arbitrary non-linear function, developing a relation {y} = f{x} as in a regression, but without being tied to a linear relationship. There is some consensus about its improved efficiency in quantitative analysis, its ability to deal with non-linear problems and, finally, its possibilities for automation (Openshaw, 1994).
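The view of an ANN as a regression that is not tied to a linear relationship can be illustrated with a minimal sketch. It assumes NumPy, a hypothetical target y = x², and a small one-hidden-layer network trained by plain gradient descent; none of these choices come from the paper, and the function names are illustrative only.

```python
import numpy as np

def fit_linear(x, y):
    """Ordinary least squares with an intercept; returns fitted values."""
    A = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef

def fit_mlp(x, y, hidden=12, epochs=10000, lr=0.1, seed=0):
    """One-hidden-layer tanh network trained by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    X = x.reshape(-1, 1)
    T = y.reshape(-1, 1)
    W1 = rng.normal(0.0, 1.0, size=(1, hidden))
    b1 = rng.normal(0.0, 1.0, size=hidden)
    W2 = rng.normal(0.0, 0.1, size=(hidden, 1))
    b2 = np.zeros(1)
    n = len(X)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)          # hidden activations
        E = (H @ W2 + b2) - T             # error of the linear output unit
        # Gradients of half the mean squared error (backpropagation)
        gW2 = H.T @ E / n
        gb2 = E.mean(axis=0)
        dH = (E @ W2.T) * (1.0 - H**2)
        gW1 = X.T @ dH / n
        gb1 = dH.mean(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return (np.tanh(X @ W1 + b1) @ W2 + b2).ravel()

# Hypothetical non-linear relation between one input and one output
x = np.linspace(-1.0, 1.0, 200)
y = x**2
mse_linear = np.mean((fit_linear(x, y) - y) ** 2)
mse_mlp = np.mean((fit_mlp(x, y) - y) ** 2)
```

On this symmetric target the best straight line is flat, so the linear fit cannot capture the curvature, whereas the network is free to approximate it. Comparing `mse_linear` with `mse_mlp` makes the difference concrete.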
PROPOSED MODEL
The basic nature of human perception is that the mind organizes the variety of stimuli it receives and orders this vast amount of rather chaotic raw data into useful information. Broadly, the work of perception is divided between the sensory processes that obtain raw data and the faculty that converts these data into meaningful information, that is, experience of the real world. Even if one could arrive at an optimal representational structure for environmental data, two important questions would remain: the relevance of subsets within the vast amount of data has to be determined, and a useful frame-based representation needs to be developed (Chalmers et al., 1995).

The proposed model for forecasting environmental change draws an analogy with this perception-representation model. Several authors have pointed out that a few key variables may determine the structure, function and change of environmental conditions. If these variables, and their possible causal relationships, can be perceived and mapped in a system of representation, then forecasting of environmental changes could be carried out by means of pattern recognition in the data.

The model integrates remote sensing, geographic information systems and artificial neural networks. Integrated use of RS and GIS has been strongly developed in the last two decades. More recently, ANN has been used to classify data acquired from RS devices. No attempts have been made to integrate ANN as an analytical operating module of GIS.

Variables that are spatially distributed can be represented in a GIS environment as raster maps. A raster image of a study area is a fine mesh of grid cells, in each of which a condition or characteristic of the earth's surface at that point is recorded. Digital image data are composed of a two-dimensional array of discrete picture elements (pixels). An image is a numeric matrix and each pixel is an element of this matrix.
GIS has the capability to overlay images and obtain composite images, adding multidimensionality to the pixel. At this point a fundamental element of geographical information can be defined as a tuple $T = \langle x, y, z_1, z_2, \ldots, z_n \rangle$, which represents the values of $n$ spatial variables ($z$) at the location $(x, y)$ (Goodchild, 1992). These variables can be of any data type: binary, nominal, ordinal, interval, or ratio. General algebraic operations can now be performed with a set of image maps as variables instead of numbers. Therefore the equation (Lee et al., 1992)

$$\text{Landscape pattern} = f(\text{ecological processes},\ \text{market forces},\ \text{social factors}) \tag{1}$$

describes landscape pattern as a function of classes of causal key variables (independent), represented here as raster images. Raster images representing the variables that determine land use and result in landscape patterns are used as inputs. Temperature, precipitation, soil types, vegetation and slopes are examples of ecological images; land and commodity prices are examples of market forces; income distribution and population density are examples of social factors.

The process to be modeled by the ANN is similar to a multivariate regression. The dependent variable is the raster map of landcover change, and the independent variables are the ecological processes, the market forces and the social factors. The ANN is appropriate for dealing with such a variety of multisource data because no prior knowledge is needed about the statistical distribution of the classes in the data sources (Benediktsson et al., 1990). Multitype data have different distributions and do not fit a unique general model. The ANN can determine how much significance each data source has in the regression model. This multivariate regression is carried out by presenting the input vectors (raster images) and the corresponding output vectors to the ANN. The ANN is trained until it can approximate a function associating input and output vectors.

The final process is the mapping of a land use change scenario in which an ANN can "learn" to make forecasts based on real data. Simulations can be carried out to understand modifications of landscape structure by presenting unseen raw data to the ANN. In addition, such a model can identify the key variables that influence landscape change, depending on the scale of the study. Figure 1 shows the proposed model, which is GIS-based since it uses GIS capabilities to receive, generate and prepare the data used as input and output variables to train an ANN. These variables are represented by raster images of the area to be assessed.

Fig. 1. An integrated RS, GIS and ANN based model.
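The tuple representation and the regression set-up described above can be sketched as follows. This is a minimal illustration assuming NumPy and randomly generated, hypothetical raster layers; the layer names, raster size and the derived `steep_and_warm` layer are invented for the example and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
rows, cols = 4, 5  # hypothetical raster size

# Hypothetical co-registered raster layers (one value per grid cell)
temperature = rng.uniform(10.0, 30.0, size=(rows, cols))
slope = rng.uniform(0.0, 45.0, size=(rows, cols))
population = rng.uniform(0.0, 500.0, size=(rows, cols))

# Map algebra: a derived layer computed cell by cell from existing layers
steep_and_warm = ((slope > 20.0) & (temperature > 20.0)).astype(float)

# Overlay: stack the layers so that each pixel carries n values (z1, ..., zn)
stack = np.stack([temperature, slope, population, steep_and_warm], axis=-1)

def pixel_tuple(stack, x, y):
    """Return T = <x, y, z1, ..., zn> for column x and row y."""
    return (x, y, *stack[y, x])

# Design matrix for the regression: one row per pixel, one column per variable
X = stack.reshape(-1, stack.shape[-1])
```

Each row of `X` is the vector of independent variables for one pixel, which is the form in which input vectors would be presented to a network in this kind of set-up.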

Using algebraic operations on images in the GIS environment, independent variables can be created for a specific scene. The mathematical modeling capability of GIS is used to evaluate relationships between images or tabular data and to produce derivative images. The dependent variable, the output presented to the ANN, is developed with a technique called Change Vector Analysis (Malila, 1980). Working with several data sets for the same chosen area and date, a multidimensional space is created for the chosen scene. Each pixel in the image has a location in this space. Over time this position changes according to new environmental conditions, so each pixel takes a new position in the multidimensional space. The displacement can be expressed by the pixel's Euclidean distance in this space:

$$D = \sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2 + \cdots + (n_2 - n_1)^2} \tag{2}$$
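The distance in Eq. (2) can be computed for every pixel at once. The sketch below assumes NumPy and two co-registered raster stacks of shape (rows, cols, n), one per date; the function name `change_magnitude` and the sample values are hypothetical.

```python
import numpy as np

def change_magnitude(stack_t1, stack_t2):
    """Per-pixel Euclidean distance between two (rows, cols, n) raster stacks."""
    if stack_t1.shape != stack_t2.shape:
        raise ValueError("stacks must be co-registered and of equal shape")
    diff = stack_t2.astype(float) - stack_t1.astype(float)
    # Sum squared differences over the variable axis, then take the root
    return np.sqrt(np.sum(diff**2, axis=-1))

# Hypothetical 2x2 scene with two variables; only pixel (0, 0) changes
t1 = np.zeros((2, 2, 2))
t2 = np.zeros((2, 2, 2))
t2[0, 0] = [3.0, 4.0]
D = change_magnitude(t1, t2)  # D[0, 0] is 5.0, all other pixels are 0.0
```

The result `D` is itself a raster, so it can serve as the image of change. Variables measured on very different scales would probably need to be rescaled before the distance is taken; the source does not say how this was handled.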

In Eq. (2), $X_1$ is variable $X$ at time 1, $X_2$ is variable $X$ at time 2, and so on for the remaining variables. This distance image therefore represents the change that has occurred in the landscape, which is correlated with the input variables presented to the ANN.

The generalized delta rule, a supervised training approach, is used to map the data (Widrow et al., 1960). This rule generalizes the delta rule (Eq. 3) to embody one or more layers of hidden neurons. The delta rule is:

$$W(k) = W(k-1) + \eta\,[t(k) - W(k-1)\,x(k)]\,x^{T}(k) \tag{3}$$
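A single-layer version of the update in Eq. (3) can be sketched as below, with the input pattern written as `x`, the desired output as `t`, the weight matrix as `W` and the learning rate as `eta`. The sketch assumes NumPy and a hypothetical noise-free linear target; it illustrates the plain delta rule only, not the generalized rule with hidden layers.

```python
import numpy as np

def delta_rule_step(W, x, t, eta):
    """One update W(k) = W(k-1) + eta * [t(k) - W(k-1) x(k)] x(k)^T."""
    error = t - W @ x
    return W + eta * np.outer(error, x)

def train(X, T, eta=0.1, epochs=200):
    """Present every input/target pair once per epoch, updating W each time."""
    W = np.zeros((T.shape[1], X.shape[1]))
    for _ in range(epochs):
        for x, t in zip(X, T):
            W = delta_rule_step(W, x, t, eta)
    return W

# Hypothetical linear target with two inputs and one output
W_true = np.array([[2.0, -1.0]])
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]])
T = X @ W_true.T
W = train(X, T)  # approaches W_true for this noise-free, linear data
```

Because the update is linear in the weights, it can only recover linear mappings of this kind.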

In Eq. (3), $x(k)$ is the input pattern, $t(k)$ is the desired output, $W(k)$ is the state of the weight matrix describing the network after $k$ presentations, and $\eta$ is a learning rate. The delta rule is similar to the mathematical method of stochastic approximation for regression problems. However, this rule is not guaranteed to discriminate data that are not linearly separable (such as multisource data), so it is generalized to include one or more hidden layers. The generalization property of the network makes it possible to train an ANN on a representative set of input/output pairs without training on all possible input/output pairs.

APPLICATION
To generate the raster images, a dataset supplied with Change and Time Series Analysis, Explorations in Geographic Information Systems Technology (United Nations Institute for Training and Research) was used. The GIS used was IDRISI (a Clark University trademark). The ANN was trained with the MATLAB Neural Network Toolbox (The MathWorks Inc.). Original variables from this dataset, such as temperature, rainfall, slope and soil types, were used, and other training data were generated by randomly assigning uniformly distributed values and then scaling them. These independent variables were mapped to an index (NDVI, the normalized difference vegetation index) as the dependent variable. This index is a quantitative measure that is highly correlated with the quantity of living matter in a region.

Preliminary results show that a very simple ANN (six inputs, twelve hidden neurons, one output) is able to capture the structure of these data and provide a means of forecasting outcomes for other, unseen data sets extracted from the same originating process. With 8,000 net evaluations the final error sum of squares was 0.001. This fit can be improved by increasing the size of the training set.

Forecasting environmental changes will play a significant role in policy makers' planning agendas in the near future. Efficient allocation of natural, economic and social resources is of fundamental importance for the common welfare of society.
Because human use of resources has highly unpredictable results, efforts to develop tools for assessing environmental conditions are of the utmost necessity.

REFERENCES
Benediktsson, J.A., P.H. Swain, and O.K. Ersoy. (1990). Neural network approaches versus statistical methods in classification of multisource remote sensing data. IEEE Transactions on Geoscience and Remote Sensing, 28, 540-552.
Berry, J.K. (1993). Cartographic modeling: the analytical capabilities of GIS. In: Environmental Modeling with GIS, pp. 58-74. New York: Oxford University Press.
Chalmers, D., R. French, and D. Hofstadter. (1995). High-level perception, representation, and analogy: a critique of artificial-intelligence methodology. In: Fluid Concepts and Creative Analogies. New York: Harper Collins Publishers.
Goodchild, M.F. (1993). The state of GIS for environmental problem-solving. In: Environmental Modeling with GIS, pp. 8-15. New York: Oxford University Press.
Goodchild, M.F. (1992). Geographic data modeling. Computers & Geosciences, 18, 401-408.
Hewitson, B.C., and R.G. Crane. (1994). Looks and Uses. Neural Nets: Applications in Geography. Dordrecht: Kluwer Academic Publishers.
Lee, R.G., R. Flamm, and M.G. Turner. (1992). Integrating sustainable development and environmental vitality: a landscape ecology approach. In: Watershed Management: Balancing Sustainability and Environmental Change. New York: Springer-Verlag.
Lillesand, T.M., and R.W. Kiefer. (1994). Remote Sensing and Image Interpretation. New York: John Wiley & Sons.
Malila, W.A. (1980). Change vector analysis: an approach for detecting forest changes with Landsat. Proceedings 6th Annual Symposium on Machine Processing of Remotely Sensed Data, pp. 326-335. Purdue: Purdue University.
Openshaw, S. (1994). A concepts-rich approach to spatial analysis, theory generation, and scientific discovery in GIS using massive parallel computing. In: Innovations in GIS - selected papers from the First National Conference on GIS Research. UK: Taylor & Francis.
Widrow, B., and M.E. Hoff. (1960). Adaptive switching circuits. IRE WESCON Convention Record, New York, pp. 96-104.