Environmental Modelling & Software 22 (2007) 1217–1220 www.elsevier.com/locate/envsoft
Short communication
Using UML and OCL to maintain the consistency of spatial data in environmental information systems
François Pinet*, Magali Duboisset, Vincent Soulignac
Cemagref, Clermont Ferrand, 24 avenue des Landais, 63172 Aubière Cedex, France
Received 24 November 2005; received in revised form 11 October 2006; accepted 11 October 2006
Available online 13 November 2006
Abstract
The Object Constraint Language (OCL) is part of the well-known Unified Modeling Language (UML) standard; it allows specifying constraints over entities representing concepts from the application domain. The purpose of this paper is to describe a specific extension of OCL for modelling the spatial constraints of Environmental Information Systems (EIS). These new features are applied to the agricultural spreading of organic matter. In this context, it is important to model a set of spatial constraints that define precisely where spreading can take place. For example, organic matter must never be spread inside certain natural areas. At present, some tools can produce integrity checking mechanisms in different languages (Java, C#, SQL, etc.) from specifications of non-spatial constraints expressed in OCL. For instance, the SQL code generated by OCL2SQL can be used to check whether a database satisfies the constraints or to forbid inserting data that violate them. In order to check spatial constraints in EIS, we implemented the "Spatial OCL" proposed in this paper in an extension of OCL2SQL.
© 2006 Elsevier Ltd. All rights reserved.
Keywords: Software Engineering; OCL; UML; Agricultural spreading; Spatial data
* Corresponding author. E-mail addresses: [email protected] (F. Pinet), magali.[email protected] (M. Duboisset), [email protected] (V. Soulignac).
1364-8152/$ - see front matter © 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.envsoft.2006.10.003

1. Introduction
1.1. Spatial data in environmental information systems (EIS)
Because environmental events occur in time and space, the data used by EIS are often georeferenced. For example, an EIS developed to monitor and analyze the spreading of organic matter (Soulignac et al., 2005) makes heavy use of georeferenced data. Spreading on croplands is an excellent way of recycling organic matter (manure, sewage sludge, etc.), but the corresponding agricultural practices require a rigorous monitoring system. An excessive and ill-planned spreading practice could damage soils through pollution (Soulignac et al., 2005). It is therefore very important to model a set of spatial constraints that define precisely where organic matter can be spread; for example, organic matter must never be spread inside certain protected natural areas. The purpose of this paper is to describe the use of the Object Constraint Language (OCL) to model spatial constraints in EIS.
1.2. The Object Constraint Language (OCL): an overview
Various examples of the application of the Unified Modeling Language (UML) to the design of environmental systems can be found in Papajorgji and Shatar (2004), Papajorgji et al. (2004), Muzy et al. (2005), Jagadeesh Babu et al. (2006) and Papajorgji and Pardalos (2006). The Object Constraint
Language is a notational language, part of UML, that allows specifying constraints over entities representing concepts from the problem domain (Warmer and Kleppe, 1999; OMG, 2005). Its notation is close to natural language. OCL was first developed by a group of IBM scientists around 1995 during a business modeling project. It was influenced by Syntropy, an object-oriented modeling language that makes heavy use of mathematical concepts (Cook and Daniels, 1994). OCL is now part of the UML standard supported by the Object Management Group, and it plays an important role in the Model Driven Architecture approach (Kleppe and Warmer, 2003).
OCL provides a platform-independent and generic method to model constraints. It can be interpreted by code engines/compilers to generate code automatically. Indeed, some tools produce integrity checking mechanisms in different languages (Java, C#, SQL, etc.) from constraints specified in OCL (Klasse Objecten, 2005). For instance, OCL2SQL can generate SQL code from OCL constraints (Demuth and Hußmann, 1999; Demuth et al., 2001, 2004); Structured Query Language (SQL) is a language that provides an interface to database systems. The code produced by OCL2SQL can be used to check whether a database satisfies the constraints or to forbid inserting data that violate a constraint; see Demuth and Hußmann (1999) and Demuth et al. (2001) for details.
At present, OCL does not support operations to describe the spatial constraints that are often needed in EIS. This paper therefore proposes extending OCL with spatial functions and presents an application to an information system for the spreading of organic matter. In order to provide checking mechanisms inside EIS, the proposed functions have been implemented in an extension of OCL2SQL.

2. Presentation of the new spatial functions integrated into OCL
2.1. A case study in agriculture
This section illustrates the spatial extension of OCL with examples based on the spreading information system described in Soulignac et al. (2005). Fig. 1 presents a small part of the conceptual database schema of the EIS and Fig. 2 provides an example of an instance. In this paper, the conceptual diagram has been simplified and adapted in order to make the presented constraints easier to understand. In Fig. 1, the Allowed_Area class models the area on which regulation allows the spreading of organic matter. The Spread_Area class models the area on which spreading has already been carried out by the farmers. In the ideal case, the organic matter is organized into groups before being spread on the fields; each group has an ID in order to improve traceability. The spreading model presented in Fig. 1 includes only one of the potential organic matter providers (Purification_Station).

Allowed_Area
AA_id    spread_parcels_number    validity_date
AA1      2                        2005/12/31

Spread_Area (the spread_on association is implemented by a foreign key)
SP_id    date_of_record    spread_on_AA_id (foreign key)
SP1      2005/05/04        AA1
SP2      2005/05/17        AA1

Fig. 2. Example of a database instance (for Allowed_Area and Spread_Area): two areas (SP1 and SP2) have been spread on the allowed parcel AA1; SP1 and SP2 are linked with AA1 by the spread_on association (implemented by a foreign key).
[Figure: UML class diagram.
Allowed_Area: AA_id:String, spread_parcels_number:Integer, validity_date:Date, geometry:Region
Spread_Area: SP_id:String, date_of_record:Date, geometry:Region
Organic_Matter: OM_group_id:String, quantity:Integer, unit:String, organic_matter_type:String
Purification_Station: P_id:String, SIRET_id:String
Associations: spread_on (between Spread_Area and Allowed_Area), used_on, provide_organic_matter. Multiplicities shown on the association ends: 0..*, 0..*, 1..1, 0..*, 0..*, 1..*.]
Fig. 1. Agricultural spreading model (described in UML).

2.2. Integrating spatial functions in OCL
The eight Egenhofer binary relationships presented in Fig. 3 have been actively studied (Egenhofer and Franzosa, 1991; Egenhofer and Herring, 1992); they constitute the basis of Oracle Spatial SQL (Oracle Corp., 2005). In order to make the specification of spatial constraints possible on the model shown in Fig. 1, we present the integration of these relationships into OCL.

[Figure: eight diagrams, each showing one relationship between two regions A and B.]
Fig. 3. The eight Egenhofer relationships between two simple regions. A simple region is a closed connected point set in the 2-dimensional space R².

The general syntax of the proposed OCL spatial functions is:
    A.Egenhofer_topological_relation(B) : Boolean
where Egenhofer_topological_relation can be: disjoint, contains, inside, equal, meet, covers, coveredBy, overlap. A and B are the parameters of the operations, i.e. the two simple regions to compare; their type must be Region. These operations return true or false depending on whether the topological relation holds between A and B. The following OCL constraint illustrates the use of the proposed functions.
Example 1. A spread area should not overlap with its associated allowed area:
    context Spread_Area inv:
    not (self.geometry.overlap(self.spread_on.geometry))
Fig. 4 shows a spatial configuration that does not satisfy the above constraint. The constraint in Example 1 defines a condition that must be true for all instances of Spread_Area, i.e. all instances of the "context". In the constraint definition, self represents a spread area and self.geometry is the geometry of self. The expression self.spread_on.geometry returns the geometry of the allowed area linked with the spread area self by the spread_on association.
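To make the semantics of these Boolean functions concrete, the following Python sketch evaluates the eight relations and the invariant of Example 1 in memory. It is only an illustration under strong assumptions: simple regions are reduced to axis-aligned rectangles with positive width and height, and the names Rect, AllowedArea, SpreadArea, covered_by and violations are hypothetical (they are not part of the proposed OCL extension or of OCL2SQL).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Closed axis-aligned rectangle (assumes x1 < x2 and y1 < y2)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def _closures_intersect(self, o: "Rect") -> bool:
        return (self.x1 <= o.x2 and o.x1 <= self.x2
                and self.y1 <= o.y2 and o.y1 <= self.y2)

    def _interiors_intersect(self, o: "Rect") -> bool:
        return (self.x1 < o.x2 and o.x1 < self.x2
                and self.y1 < o.y2 and o.y1 < self.y2)

    def _within(self, o: "Rect") -> bool:
        # self is a subset of o; boundaries may touch
        return (o.x1 <= self.x1 and self.x2 <= o.x2
                and o.y1 <= self.y1 and self.y2 <= o.y2)

    def _strictly_within(self, o: "Rect") -> bool:
        # self is a subset of o and the boundaries do not touch
        return (o.x1 < self.x1 and self.x2 < o.x2
                and o.y1 < self.y1 and self.y2 < o.y2)

    # The eight relations; exactly one holds for any pair of rectangles.
    def disjoint(self, o): return not self._closures_intersect(o)
    def meet(self, o): return self._closures_intersect(o) and not self._interiors_intersect(o)
    def equal(self, o): return self == o
    def inside(self, o): return self._strictly_within(o)
    def contains(self, o): return o._strictly_within(self)
    def covered_by(self, o): return self._within(o) and self != o and not self._strictly_within(o)
    def covers(self, o): return o.covered_by(self)
    def overlap(self, o):
        return self._interiors_intersect(o) and not self._within(o) and not o._within(self)


@dataclass(frozen=True)
class AllowedArea:
    aa_id: str
    geometry: Rect


@dataclass(frozen=True)
class SpreadArea:
    sp_id: str
    geometry: Rect
    spread_on: AllowedArea  # the spread_on association


def violations(spread_areas):
    """IDs of spread areas for which the invariant of Example 1 is false."""
    return [s.sp_id for s in spread_areas
            if s.geometry.overlap(s.spread_on.geometry)]


aa1 = AllowedArea("AA1", Rect(0, 0, 10, 10))
areas = [
    SpreadArea("SP1", Rect(8, 8, 12, 12), aa1),  # straddles the border of AA1
    SpreadArea("SP2", Rect(2, 2, 4, 4), aa1),    # lies inside AA1
]
print(violations(areas))  # ['SP1']
```

In a real EIS the geometries would be arbitrary regions stored in a spatial database, so the predicates would be delegated to the database rather than computed on rectangles.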
[Figure: spread area SP1 drawn partly inside and partly outside allowed area AA1.]
Fig. 4. A special spatial configuration. A spread area instance (SP1) has been associated with an allowed area instance (AA1) in the database by the spread_on association (see Fig. 2), but the geometry of the spread parcel overlaps the allowed parcel.

2.3. Decomposing complex spatial objects
The eight Egenhofer relations presented in Section 2.2 are used to compare pairs of simple regions (Egenhofer and Franzosa, 1991; Egenhofer and Herring, 1992), but in EIS spatial data often have a complex structure. For instance, a built area can be viewed as a set of simple regions (a set of buildings), i.e. a composite region, as presented in Fig. 5. The proposed OCL extension also supports this type of complex geometry. A composite region is a set CR = {R_1, ..., R_i, ..., R_n} where each R_i is a simple region, also called a "part" of CR. In the proposed "Spatial OCL", standard set-based OCL operations can be applied to a composite region in order to "decompose" it into several simple regions. Thus, operations such as forAll, exists or select can be applied to each instance of type Set(Region). The general syntax is:
    geometry_attribute->set_based_operation(...)
For example, "for each part p of a composite region A" is written as:
    A->forAll(p | ...)
The type of A is Set(Region) and the type of p is Region; p denotes a part of A.
Example 2. Let Built_Area(id:Integer, geometry:Set(Region)) be a class declaration. The following OCL constraint states that all allowed areas must be spatially disjoint from built areas:
    context Allowed_Area inv:
    (1) Built_Area.allInstances->forAll(built_area_instance |
    (2)   built_area_instance.geometry->forAll(building |
    (3)     self.geometry.disjoint(building)))
In the above constraint, the set-based operation is needed because the geometry of a built area can be composed of several simple regions. A complete natural-language reading of this constraint is: "(1) for each built_area_instance in the Built_Area class and (2) for each building in the built_area_instance geometry, (3) the geometry of an Allowed_Area instance (denoted by self) must always be spatially disjoint from building".
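The nested forAll of Example 2 can be mirrored by a nested generator passed to Python's all(). The sketch below assumes, purely for illustration, that each simple region is an axis-aligned rectangle and that a composite region is a set of such rectangles; Rect, disjoint and allowed_area_is_valid are hypothetical names.

```python
from typing import Iterable, NamedTuple, Set


class Rect(NamedTuple):
    """Closed axis-aligned rectangle standing in for a simple region."""
    x1: float
    y1: float
    x2: float
    y2: float


def disjoint(a: Rect, b: Rect) -> bool:
    """True when the two closed rectangles share no point."""
    return a.x2 < b.x1 or b.x2 < a.x1 or a.y2 < b.y1 or b.y2 < a.y1


def allowed_area_is_valid(allowed: Rect, built_areas: Iterable[Set[Rect]]) -> bool:
    """Invariant of Example 2 for one Allowed_Area instance (self)."""
    return all(
        disjoint(allowed, building)       # (3) self.geometry.disjoint(building)
        for built_area in built_areas     # (1) Built_Area.allInstances->forAll
        for building in built_area        # (2) geometry->forAll
    )


# Two built areas; the first one is composed of two buildings.
built_areas = [
    {Rect(0, 0, 1, 1), Rect(2, 0, 3, 1)},
    {Rect(10, 10, 11, 11)},
]
print(allowed_area_is_valid(Rect(5, 5, 8, 8), built_areas))      # True
print(allowed_area_is_valid(Rect(2.5, 0.5, 6, 6), built_areas))  # False
```

As with OCL's forAll, all() is true for an empty collection, so an allowed area is trivially valid when no built area exists. In the same spirit, exists corresponds to any() and select to a filtering comprehension.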
2.4. Code generation
Generated code can be used to check whether a database satisfies the constraints or to prohibit inserting data that violate them (Demuth and Hußmann, 1999; Demuth et al., 2001). In order to provide checking mechanisms from spatial constraints, we implemented the functions presented in this paper in a first version of an OCL2SQL extension.
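The two uses of constraint-checking code mentioned above (auditing existing data, and rejecting offending inserts) can be sketched with Python's built-in sqlite3 module. Everything in this sketch is an assumption made for illustration: it uses SQLite rather than Oracle, a non-spatial invariant that is hypothetical (the date_of_record of a spread area must not be later than the validity_date of its allowed area), and hand-written SQL that is not the output of OCL2SQL. The table contents follow Fig. 2, with dates rewritten in ISO format so that text comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE allowed_area (
    aa_id         TEXT PRIMARY KEY,
    validity_date TEXT NOT NULL
);
CREATE TABLE spread_area (
    sp_id           TEXT PRIMARY KEY,
    date_of_record  TEXT NOT NULL,
    spread_on_aa_id TEXT NOT NULL REFERENCES allowed_area(aa_id)
);
INSERT INTO allowed_area VALUES ('AA1', '2005-12-31');
INSERT INTO spread_area VALUES ('SP1', '2005-05-04', 'AA1');
INSERT INTO spread_area VALUES ('SP2', '2005-05-17', 'AA1');
""")

# Use 1: a query that returns the rows violating the (hypothetical) invariant.
# An empty result means the database is consistent with the constraint.
VIOLATION_QUERY = """
SELECT s.sp_id
FROM spread_area AS s
JOIN allowed_area AS a ON a.aa_id = s.spread_on_aa_id
WHERE NOT (s.date_of_record <= a.validity_date)
"""


def violating_ids(connection):
    return [row[0] for row in connection.execute(VIOLATION_QUERY)]


consistent_before = violating_ids(conn) == []

# Use 2: a trigger that aborts any insert that would violate the invariant.
conn.executescript("""
CREATE TRIGGER spread_area_check
BEFORE INSERT ON spread_area
BEGIN
    SELECT RAISE(ABORT, 'record date is after the validity date')
    WHERE EXISTS (
        SELECT 1 FROM allowed_area AS a
        WHERE a.aa_id = NEW.spread_on_aa_id
          AND NEW.date_of_record > a.validity_date
    );
END;
""")

try:
    conn.execute("INSERT INTO spread_area VALUES ('SP3', '2006-01-15', 'AA1')")
    rejected = False
except sqlite3.DatabaseError:
    rejected = True

print(consistent_before, rejected, violating_ids(conn))
```

The first use only reports inconsistencies after the fact, whereas the trigger keeps them out of the database; which one is preferable depends on whether existing data may already violate the constraint.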
[Figure: a built area drawn as several separate buildings, labelled "The parts of a built area (i.e. several buildings)".]
Fig. 5. Composite geometry example. Each built area can be composed of several buildings.

[Figure: the constraint of Example 1 with each part labelled by the corresponding query subpart.]
    context Spread_Area inv:     (s0)
    not                          (s1)
    self.geometry                (s4)
    overlap                      (s2), (s5)
    self.spread_on.geometry      (s3)
Fig. 6. Links between the constraint of Example 1 and the subparts of the query of Example 3.
This version automatically produces Oracle Spatial SQL¹ queries from Spatial OCL constraints. The generated queries can be used to check whether a database satisfies the constraints, i.e. whether the data are consistent. For instance, a query can check whether allowed areas are spatially disjoint from built areas (see the constraint of Example 2) or whether spread parcels overlap the associated allowed areas (see Example 1).

¹ Several database systems allow using a spatial version of SQL to handle spatial data (Oracle, Postgres, MapInfo, etc.).

Example 3. SQL query generated automatically from the constraint of Example 1:
    (s0) select * from SPREAD_AREA SELF where not
    (s1) (NOT(
    (s2) (MDSYS.SDO_RELATE (
    (s3) (select GEOMETRY from ALLOWED_AREA where AA_ID in (select SPREAD_ON_AA_ID from SPREAD_AREA where SP_ID = SELF.SP_ID))
    (s4) , SELF.GEOMETRY
    (s5) , 'mask=OVERLAPBDYDISJOINT+OVERLAPBDYINTERSECT querytype=WINDOW') = 'TRUE')));
The above query returns all spread areas that do not satisfy the constraint (see s0); thus, if the query returns no data, the database complies with the constraint. The correspondence between the different subparts of the query and the constraint is presented in Fig. 6.

3. Conclusion and further developments
OCL is a standardized conceptual language (OMG, 2005). It is used in the Model Driven Architecture approach to describe formally the behaviour of an object and to generate the corresponding code automatically (Kleppe and Warmer, 2003). Furthermore, the OCL notation for expressing constraints is close to natural language. The "Spatial OCL" concepts introduced in this paper have been illustrated on a conceptual schema for the spreading management EIS. The issues related to code generation for Oracle Spatial SQL have been considered in the implementation of the proposed OCL extension in the OCL2SQL tool, which is an interesting and flexible tool for experimenting with new extensions of OCL.
The integration of another topological model, the Calculus-Based Method (CBM), into OCL has also been studied (Duboisset et al., 2005), but it has not yet been tested on EIS. A comparison between the Egenhofer-based extension and the CBM-based extension could be of great interest. Our main future work will focus on the definition of new OCL operations in order to refine the specification of spatial constraints. For instance, a distance function could be considered in order to express that the distance between an allowed area and a lake must be greater than 35 m.

References
Cook, S., Daniels, J., 1994. Designing Object Systems: Object-Oriented Modelling with Syntropy. Prentice-Hall, New York.
Demuth, B., Hußmann, H., 1999. Using UML/OCL constraints for relational database design. Lecture Notes in Computer Science 1723, 598–613.
Demuth, B., Hußmann, H., Loecher, S., 2001. OCL as a specification language for business rules in database applications. Lecture Notes in Computer Science 2185, 104–117.
Demuth, B., Hußmann, H., Loecher, S., Zschaler, S., 2004. Structure of the Dresden OCL toolkit. In: Second International Fujaba Days "MDA with UML and rule-based object manipulation". Darmstadt, Germany, September 15–17, 2004.
Duboisset, M., Pinet, F., Kang, M.A., Schneider, M., 2005. Integrating the calculus-based method into OCL: study of expressiveness and code generation. In: Proceedings of the 16th International Workshop on Database and Expert Systems Applications. Copenhagen, Denmark, August 22–26, pp. 502–506.
Egenhofer, M., Franzosa, R., 1991. Point-set topological spatial relations. International Journal of Geographical Information Systems 5 (2), 161–174.
Egenhofer, M., Herring, J., 1992. Categorizing Binary Topological Relationships Between Regions, Lines, and Points in Geographic Databases. Technical report. Department of Surveying Engineering, University of Maine, Orono, ME, p. 28.
Jagadeesh Babu, A., Thirumalaivasan, D., Venugopal, K., 2006. STAO: a component architecture for raster and time series modeling. Environmental Modelling & Software 21 (5), 653–664.
Klasse Objecten, March 2005. OCL Tools and Services Web Site.
Kleppe, A., Warmer, J., 2003. The Object Constraint Language: Getting Your Models Ready for MDA. Addison-Wesley.
Muzy, A., Innocenti, E., Aïello, A., Santucci, J.F., Santoni, P.A., Hill, D., 2005. Modelling and simulation of ecological propagation processes: application to fire spread. Environmental Modelling & Software 20 (7), 827–842.
OMG, 2005. OCL 2.0 Specification Version 2.0. OMG Specification, p. 185.
Oracle Corp., 2005. Oracle Spatial. User's Guide and Reference. Oracle documentation.
Papajorgji, P., Beck, H., Braga, J., 2004. An architecture for developing service-oriented and component-based environmental models. Ecological Modelling 179 (1), 61–76.
Papajorgji, P., Pardalos, P., 2006. Software Engineering Techniques Applied to Agricultural Systems: an Object-Oriented and UML Approach. Springer.
Papajorgji, P., Shatar, P., 2004. Using the Unified Modeling Language to develop soil water-balance and irrigation-scheduling models. Environmental Modelling & Software 19 (5), 451–459.
Soulignac, V., Gibold, F., Pinet, F., Vigier, F., 2005. Spreading matter management in France within Sigemo. In: Proceedings of the Fifth European Conference For Information Technologies in Agriculture (EFITA 2005). Vila Real, Portugal, July 25–28, p. 8.
Warmer, J., Kleppe, A., 1999. The Object Constraint Language: Precise Modeling with UML. Addison-Wesley.