Business logic for geoprocessing of distributed geodata


Computers & Geosciences 32 (2006) 1746–1757 www.elsevier.com/locate/cageo

Christian Kiehle

GIS Working Group, Institute of Geography, University of Bonn, Meckenheimer Allee 166, 53115 Bonn, Germany

Received 21 September 2005; received in revised form 4 April 2006; accepted 4 April 2006

Abstract

This paper describes the development of a business-logic component for the geoprocessing of distributed geodata. The business logic acts as a mediator between the data and the user, and therefore plays a central role in any spatial information system. The component is used in service-oriented architectures to foster the reuse of existing geodata inventories. Based on a geoscientific case study of groundwater vulnerability assessment and mapping, the demands for such architectures are identified with special regard to software engineering tasks. Methods are derived from the fields of applied geosciences (hydrogeology), geoinformatics, and software engineering. In addition to the development of a business logic component, a forthcoming Open Geospatial Consortium (OGC) specification is introduced: the OGC Web Processing Service (WPS) specification. A sample application is introduced to demonstrate the potential of WPS for future information systems. The sample application, Geoservice Groundwater Vulnerability, is described in detail to provide insight into the business logic component and to demonstrate how information can be generated out of distributed geodata. This has the potential to significantly accelerate the assessment and mapping of groundwater vulnerability. The presented concept is easily transferable to other geoscientific use cases dealing with distributed data inventories. Potential application fields include web-based geoinformation systems operating on distributed data (e.g. environmental planning systems, cadastral information systems, and others).

© 2006 Elsevier Ltd. All rights reserved.

Keywords: Distributed geodata; Web-based information systems; Web service; Geo web service; Web-based geoprocessing

Tel.: +49 228 73 2098; fax: +49 228 73 9658. E-mail address: [email protected]. 0098-3004/$ - see front matter © 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.cageo.2006.04.002

1. Introduction

Geoinformation plays an important role at all levels of public life. Since huge amounts of geodata are derived from earth observation techniques (e.g. satellite and aerial imagery), the amount of available geodata has risen exponentially. Geodata and geoinformation are used in environmental planning, utility companies, and planning offices, and to support spatial decision making. In order to effectively protect the environment, coordinate civil protection, and support decisions, it is essential that up-to-date geodata are available. The most important task in any kind of spatial information system is the process of information generation, which is preceded by the process of data integration. Data integration is perhaps the most time-consuming task of information generation, because in most geoscientific problems the data used are stored by several data providers spread throughout the country (or even around the world).


This paper presents a technical solution for using cutting-edge technologies to generate information from distributed geodata inventories by means of web services (Weerawarana et al., 2005). After introducing the basic concepts of web services, service-oriented architectures and web-based geoprocessing, a spatial intersection web service and a geoscientific case study are presented. These illustrate the utilization of such techniques to solve important, topical geoscientific problems.

2. Status quo: globally distributed geodata, locally generated geoinformation

Most geodata are produced and provided by public authorities (e.g. land registration offices) and large organizations (e.g. National Oceanic and Atmospheric Administration (NOAA), European Space Agency (ESA), etc.) at various levels of detail (local, regional, national, and global). While in the past the integration of geodata in large-scale geoinformation systems (GIS) was preferred, the principles of distributed computing have since become more popular; in particular, the internet serves as a distributed computing platform. With the use of internet technologies, the distribution of data and information becomes easier. Current developments in the fields of geoinformatics and software engineering have led to the technology of web services and geo web services, respectively. From a German point of view, spatial data are generated mainly by public data providers. In line with Germany’s federal system, data are published on different scale levels:

• national scale level (e.g. Umweltbundesamt, the national environmental protection agency),
• federal state scale level (e.g. Landesvermessungsamt, the state land surveying office),
• regional authority scale level (e.g. regional water suppliers),
• local authority scale level (e.g. municipalities).

Apart from these publicly available datasets (which are nevertheless not free of charge), there are also data available from small to large-scale enterprises, planning offices, and even from the general public. The advent of location-based technologies, as recently published by google.com (http://maps.google.com/), promises even more data availability in the near future. The distribution and availability of geodata and non-spatial data is best characterized by the term globally distributed data.

In order to generate information from data, one has to integrate the various sources of distributed data into one system. Large-scale geo information systems were the first choice in the past, because they allow hybrid modeling of data (raster data, vector data, multi-dimensional spatial data, time series, etc.) and provide algorithms for the generation of information. But these systems, often described as ‘monolithic’ (Abel et al., 1999), come with some significant drawbacks:

     

• the initial investment for computer hardware and software is high,
• geo information systems demand specialist operator skills,
• the process of data integration is intensive in time and knowledge,
• integration of updated data sources demands major efforts (i.e. the process of data integration has to be repeated),
• data and information are not easily portable, and
• exchange of data and information can be problematic due to vendor-specific data formats (missing interoperability).

These problems result in less portable, less interchangeable and less interoperable data. The process of information generation has to be done on a local system and has to be repeated each time new data become available. Particular problems resulting from this approach are the limited reusability of data and information and the high latency of update cycles.

2.1. Basic technology: spatial data infrastructure (SDI)

Information technology in general, and geoinformatics in particular, offer just-in-time access to distributed, heterogeneous and spatial data. In order to address the aforementioned problems, the concept of Spatial Data Infrastructures (SDI) is employed, based on the concept of geo web services (Anderson and Moreno-Sanchez, 2003). Geo web services are increasingly used to provide geodata and geoinformation through the World Wide Web. According to Nebert (2004), an SDI “[...] hosts geographic data and attributes, sufficient documentation (metadata), a means to discover, visualize, and evaluate the data (catalogues and web mapping), and some method to provide access to the geographic data”. In addition to these technical aspects of SDI, McLaughlin and Nichols (1992, cited in Chan et al., 2001) define the components of an SDI as “sources of spatial data, databases and metadata, data networks, technology (dealing with data collection, management and representation), institutional arrangements, policies and standards and end-users”.

The basis of any SDI is the data. Data are provided by web services, in the spatial domain mainly according to standards published by the Open Geospatial Consortium (Percival, 2003). The Open Geospatial Consortium (OGC), formerly known as the Open GIS Consortium, aims to provide open standards to share geoinformation. An open standard, in the sense of OGC, means that anyone can access and implement it and everyone is free to take part in the definition processes of those standards. An overview of the most commonly used OGC specifications can be found in Vretanos (2002) for the Web Feature Service (WFS) and in Evans (2003) for the Web Coverage Service (WCS); the Web Map Service (WMS) is described in detail in de la Beaujardiere (2004) and Kolodziej (2004). Kresse and Fadaie (2004) give an overview of specifications co-maintained by the International Organization for Standardization (ISO). All current OGC specifications are available via http://www.opengeospatial.org/specs/?page=specs.

These services add a level of abstraction to the data. Data abstraction is extremely important in distributed computing environments because a standard-compliant service (here: WFS or WCS) allows data access only through a layer of abstraction. The major advantage is that the underlying data can be changed (e.g. for updating purposes) without affecting any user of the service. A disadvantage of providing a publicly available web service is that other services or users may build applications on top of it.
Once the service’s interface changes or the underlying data get deleted, applications utilizing the web service may no longer function. Fig. 1 shows the principle of data abstraction from the web service-oriented point of view.

Spatial data are the building blocks of spatial information as required in environmental and engineering planning. While GIS were often used in the past to integrate all available information for a specific problem, the advent of web technologies now allows the distributed storage and access of data. This aspect has several advantages:

   

• the amount of available data rises significantly,
• the facilities to share data are achieved more easily,
• the reuse of data is fostered because, once acquired, data can be published and reused by anyone who has access to those data, and
• regularly updated data (e.g. data from public authorities) can be more readily accessed because potential users utilize the web service, and not the data directly.

While non-spatial data (such as textual information and images) can be provided by web services based on the mainstream IT protocol SOAP (W3C, 2003), spatially related data can be provided by web services based on Open Geospatial Consortium specifications. Another technology, often employed for publishing and describing service interfaces, is the Web Services Description Language (WSDL), which is defined in detail in W3C (2001) and Weerawarana et al. (2005). WSDL is used for publishing web services and allows software engineers to implement customized client applications. Owing to their high level of standardization, web services are usable in several independent contexts. Fig. 2 summarizes the concept of SDI from a high-level technical point of view. Distributed data inventories (distributed across an organizational network or worldwide across the internet) are published as web services (data tier) and used for information generation (business logic tier) for a variety of clients. These clients (presentation tier) could be users accessing a standard web information system, a portable device or a full-featured GIS.

Fig. 1. Data abstraction through web service technology.
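The principle shown in Fig. 1 can be illustrated with a small Python sketch. The class and function names (`FeatureService`, `get_features`, `count_features`) and both storage variants are hypothetical and serve the illustration only; they are not part of any OGC specification. The sketch is meant to show that a client written against the service interface keeps working when the provider replaces the underlying data store.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class FeatureService(ABC):
    """Hypothetical service interface: the only thing a client ever sees."""

    @abstractmethod
    def get_features(self, theme: str) -> list[dict]:
        """Return all features of a theme as plain dictionaries."""


class FileBackedService(FeatureService):
    """Provider variant 1: features kept in a mapping of theme -> features."""

    def __init__(self, files: dict[str, list[dict]]):
        self._files = files

    def get_features(self, theme: str) -> list[dict]:
        return list(self._files.get(theme, []))


class DatabaseBackedService(FeatureService):
    """Provider variant 2: features kept as (theme, feature) rows."""

    def __init__(self, rows: list[tuple[str, dict]]):
        self._rows = rows

    def get_features(self, theme: str) -> list[dict]:
        return [feature for row_theme, feature in self._rows if row_theme == theme]


def count_features(service: FeatureService, theme: str) -> int:
    """Client code: depends on the interface only, never on the storage."""
    return len(service.get_features(theme))
```

Swapping `FileBackedService` for `DatabaseBackedService` requires no change to `count_features`. A change to the signature of `get_features`, by contrast, would break every client built on it, which is the disadvantage of public service interfaces noted in the text.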


Fig. 2. High-level technical view of a spatial data infrastructure.

In summary, the concept of SDI based on web services provides ubiquitous access to spatial and non-spatial data. This allows the integration of distributed and heterogeneous data for the generation of information to serve a variety of clients.

2.2. Web-based geoprocessing of distributed data: development of a business logic component

The most commonly used architectural approach to building an SDI is a distributed multi-tier system. This typically consists of a data tier (also referred to as the backend), a business-logic tier (sometimes referred to as the integration tier or the middleware), and a presentation tier (also referred to as the frontend) (Sommerville, 2004). Fig. 2 illustrates the business-logic tier as the center of any SDI, in between the data tier and the presentation tier. It is capable of integrating several data sources of heterogeneous origin. The business logic consists of several components that encapsulate all required tasks for accessing data, turning data into information, and forwarding results to the presentation tier. Inside the business logic, all kinds of functions conventionally performed in geo information systems can be implemented. The tasks needed for geoprocessing, i.e. the processing of spatial data according to a strictly defined geospatial algorithm, comprise three elements:

1. connection of web services which serve as data providers,
2. application of the geospatial algorithm, for example: spatial intersection, calculation of the Normalised Difference Vegetation Index (NDVI) (see Lillesand et al., 2004), or calculation of groundwater vulnerability (see Vrba and Zaporozec, 1994),
3. preparation of information for output (e.g. on a handheld device or through a web-based information system).
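The three elements listed above can be sketched as a minimal pipeline. The sketch uses the NDVI, defined as (NIR - Red) / (NIR + Red), as the geospatial algorithm. The function names, the in-memory `fetch_band` stand-in for a web service call, the sample reflectance values, and the choice to return 0.0 where NIR + Red is zero are assumptions made for illustration; a real component would request the bands from remote data services.

```python
from typing import Callable, Dict, List

Grid = List[List[float]]

# Assumed sample data standing in for two remote coverage services.
SAMPLE_BANDS: Dict[str, Grid] = {
    "nir": [[0.5, 0.25], [0.75, 0.0]],
    "red": [[0.25, 0.25], [0.25, 0.0]],
}


def fetch_band(name: str) -> Grid:
    """Element 1: connect to a data provider (here: a local stand-in)."""
    return SAMPLE_BANDS[name]


def ndvi(nir: Grid, red: Grid) -> Grid:
    """Element 2: apply the algorithm cell by cell; 0.0 where NIR + Red is 0."""
    result = []
    for nir_row, red_row in zip(nir, red):
        row = []
        for n, r in zip(nir_row, red_row):
            total = n + r
            row.append((n - r) / total if total != 0 else 0.0)
        result.append(row)
    return result


def to_text(grid: Grid) -> str:
    """Element 3: prepare the result for output (here: plain text rows)."""
    return "\n".join(" ".join(f"{v:.2f}" for v in row) for row in grid)


def run_pipeline(fetch: Callable[[str], Grid]) -> str:
    """Chain the three elements: fetch data, compute, format."""
    return to_text(ndvi(fetch("nir"), fetch("red")))
```

Because `run_pipeline` receives the fetch function as a parameter, the data source can be exchanged without touching the algorithm or the output step, which mirrors the separation of tiers described above.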

In early 2005, a consortium headed by GeoConnections/Natural Resources Canada and PCI Geomatics started to work on a proposal for the definition of interfaces to integrate spatial (and non-spatial) algorithms and models in OGC-compliant SDI. Those efforts led to the initialization of an interoperability experiment (IE) carried out by nine organizations from the public, scientific, and private sectors. During this IE, an OGC specification was drafted that describes how to implement web services for geoprocessing purposes, namely the Web Processing Service (WPS) specification (Heier and Kiehle, 2005). The main aspect is the definition of three interfaces which have to be implemented by all WPS-compliant services (see Fig. 3). The first relevant interface is common to all OGC-compliant web services: get capabilities. This interface generates an XML document¹ containing service metadata (e.g. provider of web service, address of web service, etc.). Metadata (data about data) are important because they document the actual spatial datasets. Metadata are seen as one key element to foster data reusability and, in conjunction with catalog services, to build up SDI (Nogueras-Iso et al., 2004; Nogueras-Iso et al., 2005).

¹ See http://geotech.lih.rwth-aachen.de/wps/wps?Request=GetCapabilities&Service=WPS&Version=0.2.4 for an example capabilities document.
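A request to one of these interfaces is an ordinary HTTP GET with key-value parameters, as the example URLs in the footnotes show. The following sketch assembles such URLs with the Python standard library. The helper name `build_wps_url` is hypothetical; the endpoint and the parameter names are copied from the footnote examples for the WPS draft version 0.2.4, and no request is actually sent.

```python
from urllib.parse import urlencode

# Endpoint taken from the footnote examples; it may no longer be online.
ENDPOINT = "http://geotech.lih.rwth-aachen.de/wps/wps"


def build_wps_url(request: str, version: str = "0.2.4", **params: str) -> str:
    """Build a key-value-pair GET URL for a WPS request (hypothetical helper)."""
    query = {"Request": request, "Service": "WPS", "Version": version}
    query.update(params)
    return ENDPOINT + "?" + urlencode(query)


# Service metadata request, matching the footnote example.
capabilities_url = build_wps_url("GetCapabilities")

# Spatial intersection of two referenced polygon datasets.
execute_url = build_wps_url(
    "Execute",
    ProcessName="Intersection",
    Reference1="http://webserver.lih.rwth-aachen.de/wpsie/polygon1.xml",
    Reference2="http://webserver.lih.rwth-aachen.de/wpsie/polygon2.xml",
    Store="false",
)
```

Note that `urlencode` percent-encodes the nested reference URLs, so `execute_url` differs in spelling from the unescaped form printed in the footnote while carrying the same parameters.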


Fig. 3. Web processing service interface (Heier and Kiehle, 2005).

The describe process interface provides more detailed information² about a specific service (i.e. input data to provide, output data format to expect, etc.). The most important interface is the execute interface, whose implementation takes care of the geoprocessing task³ (e.g. a spatial intersection of two GML⁴ themes).

Using a service for geoprocessing makes computing ubiquitous. A service, once established, is easily transferable to any computing platform; it is thus independent of the platform and implementation language. The output of service metadata as standardized XML (Harold and Means, 2004) makes the service usable in highly complex service chains, in order to generate information out of data. Data sources as input for any web service can be accessed through any protocol used in networked environments (most commonly HTTP), and because the output format of the web service is XML, it can be used in a broad field of applications. The establishment of service chains is a necessary task to generate high-quality information as needed in complex spatial information systems (Bernard et al., 2003). A service chain consists of atomic services (e.g. spatial intersection, spatial buffer, point-in-polygon analysis, map classification and so on) which are executed in a certain order. The services are independent of data and context, thereby providing high reusability. In order to allow different services to work together in a service chain, the input and output parameters have to be well known. In an OGC context this is partly achieved by using the get capabilities interface. The WPS enhances this description of parameters by providing a describe process interface, defining input and output parameters in a very detailed way.

² See http://geotech.lih.rwth-aachen.de/wps/wps?Request=GetCapabilities&Service=WPS&Version=0.2.4&processname=Intersection for an example describe process response.

³ See http://geotech.lih.rwth-aachen.de/wps/wps?Request=Execute&Service=WPS&Version=0.2.4&ProcessName=Intersection&Reference1=http://webserver.lih.rwth-aachen.de/wpsie/polygon1.xml&Reference2=http://webserver.lih.rwth-aachen.de/wpsie/polygon2.xml&Store=false for an example of spatial intersection of two polygons.

⁴ GML stands for Geography Markup Language. The application of GML in geosciences has recently been demonstrated by Lake (2005).

3. Case studies

The following case studies were carried out in the scope of a 3-year research study at RWTH Aachen University (Aachen, Germany). Case study 1 was carried out in close collaboration with the GIS department of Wupperverband (Wuppertal, Germany). The application was developed during an Open Geospatial Consortium Interoperability Experiment in order to define interfaces, XML schemas, and a sample application for a web-based geoprocessing specification. This Web Processing Service (WPS) specification is currently under review by the Open Geospatial community. The second case study presents the development of a web-based environmental (groundwater protection) information system employing the previously introduced technologies. It has been developed as part of the author’s Ph.D. research studies at RWTH Aachen University (Kiehle, 2006). Collaborating partners were Research Centre Juelich (Juelich, Germany) and ahu AG (Aachen, Germany).

3.1. Case study 1: providing a spatial intersection service

In order to demonstrate the business logic component, an example application has been designed and implemented. It provides a simple spatial intersection service for two GML themes; the basic work was carried out for an OGC Web Processing Service Interoperability Experiment (Schut, 2005). The service cuts an input theme with the features from an overlay theme to produce an output theme with features that have attribute data from both themes. It is implemented entirely with free and open software (mainly Java 2 Enterprise Edition⁵ and Java Topology Suite⁶). The main component is the spatial intersection algorithm provided by Java Topology Suite, encapsulated in a software component according to Fig. 4. After the service receives an execute request with a reference to two GML datasets, it uses a feature converter to do the spatial intersection of OGC simple features (see OGC, 1999). In addition to the components used for logging and user request, the business logic provides the preparation of a GML output, corresponding to the client request. A simple web application is provided to access the spatial intersection service.⁷

Fig. 4. Spatial intersection service component diagram.

Although this service provides only one simple task found in any geo information system, the potential of such services is immense: deployed in grid computing environments, the service can be made accessible organization-wide as a central geo information service and, due to its well-defined XML output format, easily used as an input source for other processing services (service chaining). Spatial intersection services provide syntactical interoperability (the ability to transfer data between systems independent of platform and implementation technique) for the implemented intersection algorithm.

As illustrated in Fig. 4, components for performing a spatial intersection are encapsulated in two layers:

1. A base service module with common components for validation of user requests, preparation of output, error handling, etc. This layer is reusable for other processing services.
2. A spatial intersection module with service-specific components for feature conversion and algorithm implementation. This layer is customized to fulfill the spatial intersection itself and is therefore less reusable.

The user accesses the public interfaces (get capabilities, describe process, and execute) of the service, which hide most of the complexity of the underlying components. This application serves as a proof of concept for the provision of topological operators on the internet. Future challenges will be the establishment of semi- and fully automated service chains.

⁵ See http://java.sun.com/j2ee/index.jsp for more details.

⁶ See http://www.vividsolutions.com/jts/jtshome.htm for more details.

⁷ The web application is accessible through http://geotech.lih.rwth-aachen.de/wpsclient/.

3.2. Case study 2: providing a web-based information system for groundwater protection based on web service technology

As a part of a groundwater vulnerability research and development project, a web-based information system was developed to assess and map this potential hazard (Azzam et al., 2003). The main goals were the advancement of spatial web services and the allocation of methods for improved geodata reusability. In order to develop solutions to real-world problems, a case study was chosen that is common to several environmental protection agencies, i.e. the assessment and mapping of groundwater vulnerability. The term groundwater vulnerability describes “an intrinsic property of a groundwater system that depends on the sensitivity of that system to human and/or natural impacts” (Vrba and Zaporozec, 1994). The information system operates on about 20 different data sources (vector data as well as raster data) provided by a variety of public data providers. Table 1 gives an overview of the datasets used for information generation. One challenge was the tightly defined data update cycle (several times a year); the system should ideally operate with the latest data available. Another challenge was the distribution of these data: in addition to the data sources for the geoinformation calculation (here: groundwater vulnerability), a topographic map from an external data provider had to be integrated. Apart from the fact that the data required are of heterogeneous structure (grid data versus vector data), there are three different scale levels to consider.

Table 1
Base data in different scale levels

| | Microscale | Mesoscale | Macroscale |
|---|---|---|---|
| Scale | >1:10 000 | <1:10 000 to >1:50 000 | <1:50 000 |
| Climate | Interpolated grid (10 m × 10 m); German Weather Service | Interpolated grid (25 m × 25 m); German Weather Service | Interpolated grid (50 m × 50 m); German Weather Service |
| Soil | BK50 (1:50 000); Geological Survey NRW | BK50 (1:50 000); Geological Survey NRW | BK50 (1:50 000); Geological Survey NRW |
| Geology/hydrogeology | Bore hole data; diverse resources | HYK25 (1:25 000); Environmental State Agency NRW | HK100 (1:100 000); Geological Survey NRW |
| Landcover | DLM25 (1:25 000); Land Surveying Office NRW | DLM25 (1:25 000); Land Surveying Office NRW | CORINE Land Cover (1:100 000); Federal Statistical Office |
| Digital elevation model | DEM5 (10 m × 10 m); Land Surveying Office NRW | DEM50 (50 m × 50 m); Land Surveying Office NRW | DEM50 (50 m × 50 m); Land Surveying Office NRW |
| Depth to groundwater table | Interpolated from observation well data (1:5000); StUA Aachen, Erftverband | Interpolated from observation well data (1:25 000); Erftverband | Interpolated from observation well data (1:25 000); Erftverband |

BK, Bodenkarte (German for soil map); HYK, Hydrogeologische Karte (German for hydrogeological map); HK, Hydrologische Karte (German for hydrological map); NRW, Nordrhein-Westfalen (German for North Rhine-Westphalia); DLM, Digital Landscape Model; DEM, Digital Elevation Model; StUA, Staatliches Umweltamt (German for Federal Environmental Agency).

Following Hoelting et al. (1995), the main task here is to generate geoinformation, namely the groundwater vulnerability, from distributed data sources. This approach has been used for almost a decade in German geological surveys to evaluate the potential contamination of groundwater by hazardous materials (e.g. heavy metals, oil spills, etc.). Thus, it is a preventive method, often considered for implementing the Water Framework Directive of the European Union (The European Parliament and the Council of the European Union, 2000). In order to compute the groundwater vulnerability, five distinct factors have to be calculated from the data provided in Table 1. The factors represent the specific protectiveness of the soil, the groundwater overburden (including the depth to the first groundwater table), the leachate rate, and some general supplements for permanent artesian pressure and suspended groundwater layers. The groundwater vulnerability (a dimensionless point value) can be assessed by computing the five factors according to Formula 1. Lower values represent a greater risk of groundwater contamination and vice versa. The risk of groundwater contamination according to Diepolder (1995) and Hoelting et al. (1995) is determined by

$$P = \left( S + \sum_{1}^{n} D_n \, G_n \right) R + Q + A, \tag{1}$$

where P is the overall protective effectiveness, S the rating points according to the effective field capacity of the soil, D the depth of the unsaturated zone above the aquifer, G the rock type (G = O · F), with O a factor for rock type and F the degree of faulting, jointing and karst formation, R the factor reflecting the long-term groundwater recharge, Q the bonus points for perched aquifer systems (500 points), and A the bonus points for hydraulic (artesian) pressure conditions (1500 points).

In order to compute the overall protective effectiveness, the factors (as represented in Formula 1) have to be individually calculated. Complicating matters, the factors are provided by different data providers throughout North Rhine-Westphalia. To generate uniform information, three steps have to be undertaken:

1. Provide the factors as OGC-compliant services (either WFS or WCS), illustrated in Fig. 5 as Feature Service 1-N and Coverage Service 1-N, respectively.
2. Access the services in order to acquire the necessary data. After transforming all data to grid data, compute the geoinformation (groundwater vulnerability) using Formula 1 and map algebra (Tomlin, 1990). This is illustrated in Fig. 5 by the Map Algebra Service.⁸
3. Provide the result as a W3C-compliant SOAP web service for platform-independent use (Fig. 5: Integrative SOAP Webservice “Groundwater Vulnerability”).

Fig. 5. Integrative SOAP web service “groundwater vulnerability”.

The map algebra implementation is based on local operators (DeMers, 2002; Tomlin, 1990). Every dataset is represented as a two-dimensional matrix containing floating-point values (grid dataset). Each represented point value (S, D, G, R, Q, A) is transformed through Formula 1, resulting in a two-dimensional matrix of the resulting values (P). These values are classified in five categories and later transformed into a map representing the overall protective effectiveness of the groundwater overburden.

Fig. 5 gives an overview of the service interaction. Clients (here: a web application, but generally any kind of HTTP client) access an integrative SOAP⁹ web service “Groundwater Vulnerability” that encapsulates services for authentication and assessment of groundwater vulnerability. The base data are not coupled with the system. Instead, an OGC-compliant catalogue service acts as a data broker to distributed data services. The data are processed by a map algebra service according to Formula 1. The results of this process are afterwards incorporated into a map-like display by the “base service groundwater vulnerability mapping”. The groundwater vulnerability map classifier acts as a statistical calculation module, computing map parameters such as the mean distribution of values, the minimum/maximum values of the computed result, etc. All the complexity remains hidden from the service consumer, who simply accesses the integrative SOAP web service, which is well defined by a WSDL interface.¹⁰

For better understanding of the interface, the methods defined in the WSDL interface are summarized in the numbered list below (parameters have been omitted in the interests of legibility).

⁸ A technical detail: the request-response cycle is a communication through a Catalogue Service (Fig. 5) rather than a direct connection between the map algebra service and the feature/coverage service.

⁹ When the first development on the system started in early 2004, a specification candidate like WPS was not available. SOAP and WSDL were extensively used in OGC experiments (Sonnet, 2004), and future developments will show which of the counterparts, SOAP/WSDL or WPS, will succeed in the geospatial domain.

¹⁰ http://geotech.lih.rwth-aachen.de/gwv/services/Gwv?wsdl offers the service description.
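The local map-algebra evaluation of Formula 1 described above can be sketched cell by cell in Python. The sketch assumes a single overburden layer (n = 1), small hand-made grids, and purely illustrative class breaks; the function names and all numeric values are assumptions and are not taken from the implemented system or from Hoelting et al. (1995).

```python
from bisect import bisect_right
from typing import List, Sequence

Grid = List[List[float]]


def protective_effectiveness(s: Grid, d: Grid, g: Grid, r: Grid,
                             q: Grid, a: Grid) -> Grid:
    """Local operator: P = (S + D * G) * R + Q + A for every cell (n = 1)."""
    rows, cols = len(s), len(s[0])
    return [
        [(s[i][j] + d[i][j] * g[i][j]) * r[i][j] + q[i][j] + a[i][j]
         for j in range(cols)]
        for i in range(rows)
    ]


def classify(p: Grid, breaks: Sequence[float]) -> List[List[int]]:
    """Assign each cell a class index 0..len(breaks) using ascending breaks."""
    return [[bisect_right(breaks, value) for value in row] for row in p]


# Illustrative 1 x 2 grids (assumed values, dimensionless points).
S = [[100.0, 250.0]]
D = [[2.0, 10.0]]
G = [[50.0, 200.0]]
R = [[1.0, 1.5]]
Q = [[0.0, 500.0]]
A = [[0.0, 1500.0]]

P = protective_effectiveness(S, D, G, R, Q, A)
# Four assumed breaks yield five classes (0 = lowest protective effectiveness).
CLASSES = classify(P, breaks=[500.0, 1000.0, 2000.0, 4000.0])
```

Because every output cell depends only on the input cells at the same position, the operation is a local operator in the sense of map algebra. Extending the sketch to several overburden layers would replace the single product D * G by a sum over the layers.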

1. getAvailableData( ): retrieves a Data Transfer Object¹¹ containing a reference to all available data for a defined spatial extent.
2. getStandardMetaData( ): retrieves a Data Transfer Object containing a reference to all available metadata for a defined spatial extent.
3. getGWV( ) 1: starts the calculation of the groundwater vulnerability (GWV) assessment and retrieves an object containing all information needed for rendering a groundwater vulnerability map.
4. getGWV( ) 2: same as getGWV( ) 1, but allows the user to define references to the data to be integrated into the calculation (here: one reference for each factor according to Formula 1). According to the provided arguments, the program automatically chooses which getGWV( ) method is used (method overloading).
5. getRawData( ): retrieves a Data Transfer Object containing a reference to all factors according to Formula 1, including references to metadata.
6. getGeoDataDTO( ): retrieves a Data Transfer Object containing a reference to the geodata of one specific factor, excluding references to metadata.

In general, a WSDL interface is used by software agents rather than by human operators. A tool like WSDL2Java¹² is used to build a client for web service consumption, as in the previously described project. This leaves the implementation details up to the web service consumer, who decides whether the web service is better employed inside a full-featured GIS or in a lean, mobile environment. The web application utilizing this web service does not have to be installed on the same machine as the one providing the service.

¹¹ Data Transfer Object is a design pattern used in object-oriented technology and is explained in detail in Fowler (2002).

¹² Available through http://ws.apache.org/axis/java/user-guide.html#WSDL2JavaBuildingStubsSkeletonsAndDataTypesFromWSDL.

Fig. 6. Study area and screenshots of information system.

Fig. 6 shows four screens of the information system in front of a map of the study area. The main tasks performed by users can be described as:

1. Select the area of interest based on a topographic map provided by a public data provider. The client application is responsible for extracting all necessary parameters (here: bounding box, spatial reference system, resolution, etc.) to request the operation from the web service.
2. Inform by means of generating geoinformation from distributed data inventories. This task, as illustrated in Fig. 5, is fulfilled by a complex service chain which assures authorization of the requesting user, retrieves the required data and performs the necessary calculations (geoprocessing according to Formula 1, map classification, etc.).

3. Compare by generating several maps based on different data sources, and then have the system compare the maps through, for example, map algebra tasks like subtraction, addition, etc. This is important to allow users to get a detailed overview of the data incorporated into the result. 4. Decide on a specific topic, using the information generated by the system. This most important step, the spatial decision making, is still at the discretion of the user. The implemented system

ARTICLE IN PRESS 1756

C. Kiehle / Computers & Geosciences 32 (2006) 1746–1757

assists by providing just-in-time data integration, information processing, metadata, and methods to compare results. The developed system has been presented to potential users in two workshops with up to 30 participants. These users certified the applicability of the system and noted several advantages over conventional (GIS-based) information systems:





 

- The system supports data reusability: data provided as services can easily be reused, on scales from organizational to world-wide, to answer queries as yet unknown.
- The system supports just-in-time generation of geoinformation: assuming that the most current data are available to the system, the generated information is of higher accuracy than map-like counterparts.
- The process of groundwater vulnerability assessment and mapping is significantly accelerated.
- The system is cost effective because it is based entirely on Open Source Software (OSS).
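The map comparison named among the user tasks (task 3, Compare) can be illustrated with a small sketch of local, cell-by-cell map algebra. This is only an illustration under stated assumptions: the two maps are assumed to be already computed on the same grid and are held as plain nested lists, and the names `local_op`, `difference_map` and the sample values are hypothetical and not taken from the described system.

```python
from operator import add, sub
from typing import Callable, List

Grid = List[List[float]]


def local_op(a: Grid, b: Grid, op: Callable[[float, float], float]) -> Grid:
    """Apply a cell-by-cell (local) map algebra operation to two aligned grids."""
    same_rows = len(a) == len(b)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("grids must share the same extent and resolution")
    return [[op(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def difference_map(a: Grid, b: Grid) -> Grid:
    """Subtract grid b from grid a; non-zero cells mark where the maps differ."""
    return local_op(a, b, sub)


# Hypothetical class values of two maps derived from different data sources.
map_a = [[1, 2, 3], [2, 3, 4]]
map_b = [[1, 3, 3], [1, 3, 5]]

diff = difference_map(map_a, map_b)   # [[0, -1, 0], [1, 0, -1]]
total = local_op(map_a, map_b, add)   # [[2, 5, 6], [3, 6, 9]]
```

Cells with a value of zero in `diff` indicate agreement between the two maps; any other value points the user to locations where the choice of data source changes the result.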

The main disadvantage of the system is its inability to perform very complex, model-based tasks such as groundwater table mapping in a fully automated way. In such cases, an experienced user is required to manually adjust, for example, interpolation parameters. It should be noted that, due to the sensitive nature of the data used in this case study, the system is not available to the general public. Further, because of the spatial extent of the scientific problem, the system is currently available only in German. However, some supplementary screenshots to Fig. 6 are available via http://www.kiehle.org/cageo/screenshots_gwv.pdf.

4. Conclusions

This paper has described the main web technologies available to utilize distributed, heterogeneous geodata. Besides OGC-compliant services and the setting up of SDI, the use of SOAP and WSDL has been explored for the development of web-based information systems. Two case studies illustrate how spatial algorithms can be provided as web services compliant with a forthcoming OGC specification. Owing to the well-defined interfaces of the services provided, services complying with the OGC Web Processing Service specification can be chained.
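The chaining of services with well-defined interfaces can be pictured as function composition. The following sketch is hypothetical throughout: the step names (`authorize`, `retrieve`, `combine`, `classify`), the request fields and the sample values are assumptions made for illustration, the cell-wise sum is a placeholder and not Formula 1, and no OGC or SOAP interface is actually called.

```python
from functools import reduce
from typing import Any, Callable, Dict

Context = Dict[str, Any]
Step = Callable[[Context], Context]


def chain(*steps: Step) -> Step:
    """Compose steps so that the output of one becomes the input of the next."""
    def run(ctx: Context) -> Context:
        return reduce(lambda acc, step: step(acc), steps, ctx)
    return run


def authorize(ctx: Context) -> Context:
    # Reject the request before any data are touched.
    if ctx.get("user") not in ctx.get("allowed_users", ()):
        raise PermissionError("user is not authorized")
    return ctx


def retrieve(ctx: Context) -> Context:
    # Stand-in for requests to distributed data services.
    sources = ctx["sources"]
    return {**ctx, "factors": [sources[name] for name in sorted(sources)]}


def combine(ctx: Context) -> Context:
    # Placeholder calculation (cell-wise sum), NOT Formula 1 of the paper.
    return {**ctx, "score": [sum(cells) for cells in zip(*ctx["factors"])]}


def classify(ctx: Context) -> Context:
    # Assign each cell to one of two classes relative to a threshold.
    limit = ctx["threshold"]
    labels = ["above" if s >= limit else "below" for s in ctx["score"]]
    return {**ctx, "classes": labels}


run_chain = chain(authorize, retrieve, combine, classify)

request = {
    "user": "alice",
    "allowed_users": {"alice"},
    "sources": {"soil": [1, 2, 3], "recharge": [4, 0, 2]},
    "threshold": 5,
}
result = run_chain(request)
```

Because every step accepts and returns the same kind of object, steps can be reordered, replaced or extended without changing the others, which is the property that well-defined service interfaces are meant to provide.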

The second case study is taken from a real-world example, considering a distributed data environment as the basis for a hydrogeological problem. The utilization of OGC web services to facilitate distributed storage of and access to spatial data allows for the integration of data in SDI. The approach presented uses a SOAP web service described by a WSDL interface, making it usable by any web client. The process of information generation based on map algebra operates on distributed data sources. This approach has significant advantages over traditional data integration approaches. As the data remain distributed, this process allows just-in-time integration of data rather than permanent integration. If the provider updates the data source, the information system integrates the updated data the next time it is accessed. Another advantage lies in the potential for quick incorporation of new data into the information system. Recently published data sources can be smoothly integrated into the system because the services specified by OGC and W3C are defined by well-known interfaces, which adds the benefit of syntactic interoperability between data sources. A downside of these approaches is the overhead produced by the semi-structured data format XML: the trade-off is interoperability at the cost of performance.

The two case studies were specifically chosen for demonstrative purposes. Case Study 1 provides insight into a forthcoming OGC specification, which might revolutionize the geospatial web by adding a standard for data processing rather than just data provision. Unfortunately, the specification collides with mainstream information technology (IT) specifications as provided by the World Wide Web Consortium (W3C). A possible solution might be the integration of OGC-compliant web services in W3C-compliant web services, as demonstrated in Case Study 2.
This adds the necessary flexibility for the development of web-based information systems with spatial content; geospatial standards are utilized, but without excluding non-spatial IT infrastructures. The potential power of spatial data lies in the combination of spatial content with high-performing, non-spatial information systems.

Acknowledgments

The author thanks Rob Harrap and an anonymous referee for their comments, which helped in improving the manuscript. Chad Townsend kindly proofread the paper. The work presented in this paper has been supported by the German Ministry of Education and Science as part of the "Geotechnologien Programme" and can be referenced as publication No. GEOTECH-221.

References

Abel, D.J., Gaede, V.J., Taylor, K.L., Zhou, X., 1999. Towards spatial internet marketplaces. Geoinformatica 3 (2), 141–164.
Anderson, G., Moreno-Sanchez, R., 2003. Building web-based spatial information solutions around open specifications and open source software. Transactions in GIS 7 (4), 447–466.
Azzam, R., Bauer, C., Bogena, H., Kappler, W., Kiehle, C., Kunkel, R., Leppig, B., Meiners, H.G., Mueller, F., 2003. Geoservice groundwater vulnerability. Geotechnologien Science Report 2, 31–35.
Bernard, L., Einspanier, U., Lutz, M., Portele, C., 2003. Interoperability in GI service chains - the way forward. In: Proceedings of the Sixth AGILE Conference, Lyon, France, April 24–26, pp. 179–187.
Chan, T.O., Feeney, M., Rajabifard, A., Williamson, I.P., 2001. The dynamic nature of spatial data infrastructures: a method of descriptive classification. Geomatica Journal 55 (1), 65–72.
DeMers, M.N., 2002. GIS Modeling in Raster. Wiley, Indianapolis, IN (203pp.).
Diepolder, G.W., 1995. Schutzfunktion der grundwasserueberdeckung. Grundlagen - bewertung - darstellung in karten. GLA Fachberichte 13, 5–79.
Fowler, M., 2002. Patterns of Enterprise Application Architecture, first ed. Addison-Wesley, Boston, MA (560pp.).
Harold, E.R., Means, W.S., 2004. XML in a Nutshell, third ed. O'Reilly, Sebastopol, CA (600pp.).
Heier, C., Kiehle, C., 2005. Standardized web-based geodataprocessing - OGC web processing service. GIS - Journal for Spatial Information and Decision Making 15 (6), 39–43.
Hoelting, B., Haertle, T., Hohberger, K.-H., Nachtigall, K.-H., Villinger, E., Weinzierl, W., Wrobel, J.P., 1995. Konzept zur ermittlung der schutzfunktion der grundwasserueberdeckung. Geologische Jahrbuch 63, 5–24.
Kiehle, C., 2006. Entwicklung einer geodateninfrastruktur zur regelbasierten ableitung von geoinformationen aus distributiven datenbestaenden. Ph.D. Dissertation, RWTH Aachen University, Aachen, Germany (113pp.).
Kresse, W., Fadaie, K., 2004. ISO Standards for Geographic Information, first ed. Springer, Heidelberg (322pp.).
Lake, R., 2005. The application of geography markup language (GML) to the geological sciences. Computers & Geosciences 31 (9), 1081–1094.
Lillesand, T.M., Kiefer, R.W., Chipman, J.W., 2004. Remote Sensing and Image Interpretation, fifth ed. Wiley, Indianapolis, IN (784pp.).
Mclaughlin, J.D., Nichols, S.E., 1992. Building a national spatial data infrastructure. Computing Canada 18 (1), 24.
Nogueras-Iso, J., Zarazaga-Soria, F.J., Lacasta, J., Béjar, R., Muro-Medrano, P.R., 2004. Metadata standard interoperability: application in the geographic information domain. Computers, Environment and Urban Systems 28 (6), 611–634.
Nogueras-Iso, J., Zarazaga-Soria, F.J., Béjar, R., Álvarez, P.J., Muro-Medrano, P.R., 2005. OGC Catalog Services: a key element for the development of spatial data infrastructures. Computers & Geosciences 31 (2), 199–209.
Sommerville, I., 2004. Software Engineering, seventh ed. Addison-Wesley, Boston, MA (784pp.).
The European Parliament and the Council of the European Union, 2000. Directive 2000/60/EC of the European Parliament and of the Council of 23 October 2000 establishing a framework for community action in the field of water policy. Official Journal of the European Communities, vol. L 327/1.
Tomlin, C.D., 1990. Geographic Information Systems and Cartographic Modelling, first ed. Prentice-Hall, Englewood Cliffs, NJ (249pp.).
Vrba, J., Zaporozec, A., 1994. In: IAH International Contribution for Hydrogeology, vol. 16/94. Heise, Hannover (131pp.).
Weerawarana, S., Curbera, F., Leymann, F., Storey, T., Ferguson, D.F., 2005. Web Services Platform Architecture: SOAP, WSDL, WS-Policy, WS-Addressing, WS-BPEL, WS-Reliable Messaging, and More, first ed. Prentice-Hall PTR, Indianapolis, IN (456pp.).

Web references

de la Beaujardiere, J. (Ed.), 2004. Web Map Service (WMS 1.3). OGC 04-024. Online: https://portal.OGC.org/files/?artifact_id=5316
Evans, J.D. (Ed.), 2003. Web Coverage Service (WCS), version 1.0.0. OGC 03-065r6. Online: https://portal.opengeospatial.org/files/?artifact_id=3837
Kolodziej, K., 2004. OpenGIS Web Map Server Cookbook. Online: http://portal.opengeospatial.org/files/?artifact_id=7769
Nebert, D. (Ed.), 2004. Developing Spatial Data Infrastructures. The SDI Cookbook, version 2.0. Online: http://www.gsdi.org/docs2004/Cookbook/cookbookV2.0.pdf
OGC, 1999. OGC Simple Features Specification for SQL, revision 1.1. OGC 99-049. Online: http://portal.opengeospatial.org/files/?artifact_id=829
Percival, G. (Ed.), 2003. OGC Reference Model. OGC 03-040. Online: https://portal.opengeospatial.org/files/?artifact_id=3836
Schut, P. (Ed.), 2005. Web Processing Service (WPS) Specification. OGC 05-007r2. Online: http://portal.opengeospatial.org/files/?artifact_id=10634
Sonnet, J. (Ed.), 2004. OWS 2 Common Architecture: WSDL SOAP UDDI. OGC 04-060r1. Online: https://portal.opengeospatial.org/files/?artifact_id=8348
Vretanos, P. (Ed.), 2002. Web Feature Service (1.0). OGC 02-058. Online: https://portal.opengeospatial.org/files/?artifact_id=7176
W3C (Ed.), 2001. Web Services Description Language (WSDL) 1.1. Online: http://www.w3.org/TR/wsdl
W3C (Ed.), 2003. SOAP, version 1.2. Online: http://www.w3.org/TR/2003/REC-soap12-part1-20030624