Data & Knowledge Engineering 70 (2011) 683–684
Editorial
Pushing artificial intelligence in database and data warehouse systems

This special issue of Data & Knowledge Engineering on “Pushing Artificial Intelligence in Database and Data Warehouse Systems” presents a rigorous selection of the best papers from the Invited Session “Advanced Knowledge-based Systems” of the LNCS/LNAI 13th International Conference on Knowledge-based and Intelligent Information & Engineering Systems (KES 2009), held in Santiago, Chile, during September 28–30, 2009. Following the success of the Invited Session “Advanced Knowledge-based Systems” of LNCS/LNAI KES 2008, the 2009 event attracted a large number of submissions, and, after a rigorous review process, only 8 papers were selected for final publication. After the conference, the authors of these papers were invited to submit to the Data & Knowledge Engineering special issue on “Pushing Artificial Intelligence in Database and Data Warehouse Systems”. After two rigorous review rounds, only 4 papers were accepted for final publication in this special issue.

The main idea that inspired this special issue lies in the evidence that current Database and Data Warehouse Systems lack methods and techniques for extending the capabilities and the expressive power of the data representation, management, warehousing and mining phases, all of which play a critical role in such systems. This limits the ability of these systems to cope with modern information system applications, ranging from Intelligent Tools for Database Management to Intelligent Tools for Data Warehousing, from Complex Data Mining Tools to OLAP Interfaces, from Knowledge Discovery and Machine Learning Tools to Collaborative Filtering Plug-In Components and Sensor-and-Stream Data Analysis Tools, and so forth.
As a consequence, current Database and Data Warehouse Systems are forced to rely on “heavy” application layers that, based on some given high-level host programming language, implement even complex procedures devoted to effectively and efficiently managing, warehousing and mining the huge amounts of relational and multidimensional data populating their storage layers. Beyond offering low re-usability for new instances of application scenarios similar to those of the target implementations, this phenomenon introduces severe limitations with respect to a wide range of aspects, from expressive power to computational complexity of the management, warehousing and mining phases, from re-use and extensibility of knowledge and pattern discovery methods to integration between current Database and Data Warehouse Systems and traditional legacy systems, and so forth.

An attractive research direction for overcoming the above-described drawbacks of current Database and Data Warehouse Systems consists of extending the application layers of these systems by means of methods and techniques borrowed from artificial intelligence and other related disciplines such as fuzzy logic, neural networks, fractals, and statistical approaches. These methods and techniques very often take the form of intelligent algorithms that directly extend the application layers of Database and Data Warehouse Systems and are implemented, as plug-in components, in the native languages exposed by such systems.

Inspired by the need to extend current Database and Data Warehouse Systems by means of artificial intelligence models, techniques and tools, as demanded by modern information system applications, this special issue contains four papers that address both theoretical and practical challenges of this emerging scientific area.
Each of these papers provides a high-quality contribution that not only touches on conceptual, theoretical and methodological aspects, but also proposes effective implementations and practical experiments.

The first paper, titled “System Models for Goal-Driven Self-Management in Autonomic Databases”, by Marc Holze and Norbert Ritter, focuses on the self-management aspects of autonomic databases, which aim to reduce the total cost of ownership of a database by automatically adapting the database configuration to evolving workloads and environments. As the authors correctly state, existing techniques concentrate on automating one particular administration task, and therefore cause problems such as over-reaction and interference. To prevent these problems, the self-management logic requires knowledge about the system-wide effects of reconfiguration actions. Starting from these considerations, the authors describe an approach for creating a database system model, which serves as a knowledge base for database self-management solutions. They analyze which information is required in the system model to support the prediction of the overall database behavior under different configurations, workloads, and database states. As creating a complete quantitative description of an existing database in a system model is a difficult task, they also propose a modeling approach that supports the evolutionary refinement of models. Finally, the authors show how the system model can be evaluated to predict whether or not business goal definitions such as the response time are met.

0169-023X/$ – see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.datak.2011.03.005

The second paper, titled “Enhancing Accuracy and Expressive Power of Range Query Answers over Incomplete Spatial Databases via a Novel Reasoning Approach”, by Alfredo Cuzzocrea and Andrea Nucita, proposes an innovative reasoning approach for enhancing the accuracy and expressive power of range query answers over incomplete spatial databases, along with its experimental assessment and analysis. The authors start from the observation that modern spatial database applications built on top of distributed and heterogeneous spatial information sources, such as conventional spatial databases underlying Geographical Information Systems (GIS), spatial data files and spatial information acquired or inferred from the Web, suffer from data integration and topological consistency problems. This increasingly results in incomplete information, which makes answering range queries over incomplete spatial databases a leading challenge in spatial database systems research. Within the general setting of incomplete spatial databases, the authors devote particular attention to the significant application scenario in which the geometrical information on a subset of spatial database objects is incomplete while the spatial database still stores topological relations among these objects, and they propose a novel technique for efficiently answering range queries over incomplete spatial databases by integrating geometrical information and topological reasoning. The authors also propose I-SQE (Spatial Query Engine for Incomplete information), an innovative query engine that implements the main technique. Finally, a comprehensive set of experiments on both synthetic and real-life spatial data sets is provided.

The third paper, titled “A Semantic Approach to ETL Technologies”, by Sonia Bergamaschi, Francesco Guerra, Mirko Orsini, Claudio Sartori and Maurizio Vincini, addresses the problem of effectively and efficiently supporting Extraction, Transformation and Loading (ETL) processes within Data Warehouse architectures, for the creation of an updated, consistent and materialized view of a given set of data sources.
The solution is a novel ETL tool that introduces a semantic approach whose main benefits consist of (i) allowing the semi-automatic definition of inter-attribute semantic mappings, by identifying the parts of the data source schemas that are related to the Data Warehouse schema, and (ii) grouping the attribute values that are semantically correlated, thus defining a transformation function for populating the Data Warehouse with homogeneous values. The proposed tool relies on and extends the principles and functionalities of two systems previously devised by the same authors, namely the data integration system MOMIS and the data analysis system RELEVANT, with significant novel contributions. Finally, the proposed ETL tool is carefully validated through its extensive use in a real-life application scenario that concerns the creation of a Data Warehouse for a set of enterprises working in the beverage and food logistics area. The empirical results obtained show that the proposed tool effectively and efficiently supports ETL processes within modern Data Warehouse architectures.

The fourth paper, titled “Combining Objects with Rules to Represent Aggregation Knowledge in Data Warehouse and OLAP Systems”, by Nicolas Prat, Isabelle Comyn-Wattiau and Jacky Akoka, considers conceptual modeling aspects of multidimensional aggregations, and proposes a complete conceptual framework for combining objects with rules in order to represent so-called aggregation knowledge in Data Warehouse and OLAP Systems. In this paper, the authors argue that, since OLAP-like browsing of multidimensional data plays a critical role in Data Warehouse systems, with particular emphasis on analyzing data at different aggregation levels by means of roll-up and drill-down operators, aggregation knowledge should be adequately represented in conceptual multidimensional models, and mapped to subsequent logical and physical models.
Unfortunately, as the authors observe, current conceptual multidimensional models represent aggregation knowledge poorly, since this knowledge is characterized by complex structure and dynamics and is highly context-aware. Starting from these considerations, the authors propose a novel approach for representing aggregation knowledge via a combination of (i) appropriate objects modeled by means of UML Class Diagrams and (ii) ad-hoc rules modeled by means of the Production Rule Representation (PRR) language. In particular, in the authors' solution static aggregation knowledge is represented via class diagrams, while rules describe dynamic aggregation knowledge, i.e. how aggregations may be performed depending on the context. The authors present the principles and functionalities of the proposed conceptual framework for Data Warehouse and OLAP Systems, along with several interesting case studies drawn from a real-life Data Warehouse project that demonstrate the benefits of the research contribution of this paper.

The Editor would like to express his sincere gratitude to the Editor-in-Chief of Data & Knowledge Engineering, Prof. Peter Chen, for accepting his proposal of a special issue focused on pushing artificial intelligence in Database and Data Warehouse Systems, and for assisting him whenever required. The Editor would also like to thank all the reviewers, who worked within a tight schedule and whose detailed and constructive feedback to the authors contributed to a substantial improvement in the quality of the final papers.

The Editor
Alfredo Cuzzocrea*
ICAR-CNR and University of Calabria, Via P. Bucci 41C, 87036 Rende (CS), Italy

*Tel.: +39 0984 831730; fax: +39 0984 839054.
E-mail address: [email protected].
URL: http://si.deis.unical.it/~cuzzocrea.