DATA & KNOWLEDGE ENGINEERING, ELSEVIER
Data & Knowledge Engineering 26 (1998) 37-70
Updating intensional predicates in deductive databases*
D. Laurent (a,c), V. Phan Luong (a,b), N. Spyratos (a)
(a) LRI, U.R.A. 410 du CNRS, Bât. 490, Université de Paris-Sud, F-91405 Orsay Cedex, France
(b) LIM, U.R.A. 1787 du CNRS, Université de Provence, 39, rue Joliot Curie, F-13453 Marseilles Cedex 13, France
(c) LI/E3I, Université de Tours, IUP GEII-Info & Telecom, 3, place J. Jaurès, F-41000 Blois, France
Accepted 2 July 1997
Abstract
We present an approach to updating deductive databases in which every insertion or deletion of a fact (atomic formula without variables) can be performed in a deterministic way. The main features of our approach are the following: (i) the inserted and deleted facts may concern any predicate of the underlying alphabet, not just extensional predicates, and (ii) deleted facts are explicitly stored in the database. We show that logic programs in our approach can be associated with well-founded semantics. Moreover, as the explicit storage of deleted facts introduces a significant overhead, we also study the problem of storage optimization.
Keywords: Deductive database; Well-founded semantics; Update semantics
1. Introduction
A deductive database consists of a set of facts and a set of rules, the set of facts being the extensional database and the set of rules being the intensional database [32,16,6]. The problem of updating intensional predicates in deductive databases has attracted considerable attention in recent years [1,14,35,15,6,31,20,30]. This problem consists of translating an insertion or deletion of a fact over an intensional predicate into (possibly more than one) insertions or deletions of facts over the extensional predicates. The existence of more than one such translation, for a given insertion or deletion, is referred to in the literature as non-determinism of the translation. We explain this problem using the example of [6]. Consider the following extensional database:
* Work partially supported by the French National Project GDR-PRC BD3; part of a preliminary version of this paper appears in the proceedings of the 9th IEEE International Conference on Data Engineering ICDE'93. * Corresponding author. E-mail:
[email protected]
0169-023X/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII: S0169-023X(97)00028-1
FACTS:
teaches(Smith, CS101)
attends(John, CS101)
advises(Smith, Mike)
advises(Smith, John)

meaning that the following facts are explicitly stored in the database:
• Smith teaches course CS101,
• John attends course CS101,
• Smith advises student Mike, and
• Smith advises student John.
Moreover, consider the following intensional database:

RULES:
ps(P, S) ← teaches(P, C), attends(S, C)
ps(P, S) ← advises(P, S)
cps(C, P, S) ← teaches(P, C), attends(S, C)
These rules allow for fact derivation from the extensional database in the following way:
• ps(P, S) is inferred whenever professor P teaches course C and student S attends C,
• ps(P, S) is inferred whenever professor P advises student S,
• cps(C, P, S) is inferred whenever professor P teaches course C and student S attends C.
As a first example of update, suppose that a user wants to insert the fact ps(White, John), i.e. the fact that White is professor of John, without giving further information as to whether White is teacher or adviser of John. If we assume, as in the traditional approaches [6], that the facts to be stored must involve only extensional predicates, then we have two ways to translate the update:
either insert the facts teaches(White, C) and attends(John, C),
or insert the fact advises(White, John).
Clearly, in either case the resulting database implies the fact ps(White, John) that was to be inserted. However, we have non-determinism of the translation, possibly at more than one level. Indeed, first we have to choose the update to be performed. If we choose to perform the insertion of the fact advises(White, John), then no further choice is needed. If, however, we choose to perform the insertions of teaches(White, C) and attends(John, C), then we must also choose a course C, so that we can infer ps(White, John) from teaches(White, C) and attends(John, C). Following [6], these choices are left to the user (or to the system). If the user (for some reason) cannot decide, and if the choice proposed by the system does not satisfy the user, then the insertion of ps(White, John) cannot be performed. Yet it might be the case that a user wishes to insert the fact ps(White, John) without giving further information. Then, clearly, the only way to perform the insertion is to simply store the fact ps(White, John) in the extensional database. This is not allowed by most current approaches, unless ps is a "mixed" predicate allowing the storage of ps(White, John), as in [7]. As a second example of update, suppose that the user wants to delete the fact
cps(CS101, Smith, John). This fact is not present in the extensional database but can be inferred from it using the last rule. However, again, there are two ways to translate this deletion, namely,
either delete the fact teaches(Smith, CS101),
or delete the fact attends(John, CS101).
Clearly, in either case, the resulting database does not imply the fact cps(CS101, Smith, John) and thus the intended deletion is performed. Following [6], the choice as to which of the two deletions should be executed is also left to the user (or to the system). However, again, it might be the case that a user wishes to delete the fact cps(CS101, Smith, John) without giving further information. Then clearly, the only way to perform the deletion is to simply store the fact cps(CS101, Smith, John) as being a deleted fact. This is not allowed by current approaches, not only because the predicate cps is an intensional predicate but also because current approaches do not allow the storage of deleted facts. In our approach, we do allow the storage of any fact as being inserted or deleted, whether this fact concerns an extensional or an intensional predicate. We show that, as a consequence, every insertion or deletion over any predicate becomes deterministic. In the remainder of this section, we describe informally the basic concepts underlying our approach and its relationships to other approaches.
1.1. Summary of the approach
Assumptions. Our approach relies on the following two assumptions:
1. Users can insert or delete any fact that they wish, that is, users can insert or delete facts over any extensional or intensional predicate. However, users cannot change the set of rules. In other words, users are aware only of the extensional and the intensional predicates and regard the database just as a set of facts in which they may insert or delete any fact they wish over the intensional or the extensional predicates. Moreover, the system hides from users the way it reacts in response to retrieval and update requests.
2. The system stores both inserted and deleted facts. That is, deleted facts are not removed from the database, but are stored in much the same way as inserted facts are.
Actually, in the present approach, a database is seen as a triple Δ = (I_FACTS, D_FACTS, RULES) where
• I_FACTS denotes the set of inserted facts,
• D_FACTS denotes the set of deleted facts, and
• RULES denotes the set of rules (rules are clauses whose head and body are not empty). The body of a rule may contain negative literals.
Database semantics. Deleted facts that are stored in the database play an active role and have a direct influence on database consistency and fact derivation. Roughly speaking, we call a database Δ consistent if Δ contains no fact which is both inserted and deleted. Moreover, given a fact f, we say that Δ derives f if one of the following conditions holds:
• f is inserted, or
• f is not deleted and f can be derived from inserted facts and negations of deleted facts, using the rules.
Later on, we shall see that the above derivation method can be associated with well-founded
semantics [33], and we argue that our approach can be implemented in existing systems with only minor modifications. Update semantics. Given a fact f and a database A, our method of updating works in the following simple way:
Insertion of f in Δ: To insert f, add f to the set of inserted facts and remove f from the set of deleted facts, if necessary.
Deletion of f from Δ: To delete f, remove f from the set of inserted facts and add f to the set of deleted facts, if necessary.
We note that, if the original database Δ is consistent then it remains consistent after the insertion or deletion of f as defined above. Summarizing our approach, intensional updates are not translated into extensional updates, but, rather, are stored in the database. This implies on the one hand that every intensional predicate is mixed [7] and, on the other hand, that traditional semantics must be adapted to the presence of deleted facts in the database. Let us now illustrate our approach through an example that we shall use as our running example.
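The two update operations can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the class name `Database`, its field names and the encoding of facts as tuples are hypothetical and are not part of the approach itself.

```python
from dataclasses import dataclass, field


@dataclass
class Database:
    """Hypothetical container for the triple (I_FACTS, D_FACTS, RULES)."""
    i_facts: set = field(default_factory=set)   # inserted facts
    d_facts: set = field(default_factory=set)   # deleted facts
    rules: tuple = ()                           # never changed by updates

    def is_consistent(self) -> bool:
        # Consistent iff no fact is both inserted and deleted.
        return self.i_facts.isdisjoint(self.d_facts)

    def insert(self, fact: tuple) -> None:
        # Add to the inserted facts and drop it from the deleted facts.
        self.i_facts.add(fact)
        self.d_facts.discard(fact)

    def delete(self, fact: tuple) -> None:
        # Drop from the inserted facts and record it as deleted.
        self.i_facts.discard(fact)
        self.d_facts.add(fact)


db = Database()
db.insert(("ps", "White", "John"))            # intensional fact, stored as is
db.delete(("cps", "CS101", "Smith", "John"))  # deletion stored explicitly
db.delete(("ps", "White", "John"))            # moves the fact to D_FACTS
```

Each operation touches both sets, so a fact can never end up in I_FACTS and D_FACTS at the same time, and no choice between alternative translations is ever needed.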
1.2. Running example
Consider the database $\Delta_0$ = (I_FACTS$_0$, D_FACTS$_0$, RULES), where

I_FACTS$_0$: ∅ (i.e. there are no inserted facts)
D_FACTS$_0$: ∅ (i.e. there are no deleted facts)
RULES:
q(x, y, z) ← r(x, y), s(y, z)
p(x, z) ← q(x, y, z)
t(x, z) ← q(x, y, z), ¬p(x, z)

Here x, y and z are variables and p, q, r, s and t are first order predicates of arities 2, 3, 2, 2 and 2, respectively. We note that, as usual in the context of deductive databases [32,16,6,31], we consider two kinds of predicates, namely, extensional predicates and intensional predicates:
• The intensional predicates are those that occur in the head of at least one rule.
• The extensional predicates are the remaining ones in the underlying alphabet.
In our running example, the intensional predicates are p, q and t, and the extensional predicates are r and s. Moreover, the database $\Delta_0$ is clearly consistent and no fact can be inferred from it (since the sets of inserted and deleted facts are empty).
As a first example of update, let us insert r(a, b) in $\Delta_0$. Denoting by $\Delta_1$ = (I_FACTS$_1$, D_FACTS$_1$, RULES) the resulting database, we have:

I_FACTS$_1$: r(a, b) and D_FACTS$_1$: ∅.

(We do not mention RULES any more in the example because this set is not modified by the updates.) Thus, r(a, b) is now inferred from $\Delta_1$, and no other fact can be inferred.
As a second example of update, let us insert s(b, c) in $\Delta_1$. Denoting by $\Delta_2$ = (I_FACTS$_2$, D_FACTS$_2$, RULES) the resulting database, we have:

I_FACTS$_2$: r(a, b), s(b, c) and D_FACTS$_2$: ∅.

Notice that it is now possible to infer the following facts from $\Delta_2$: r(a, b), s(b, c), q(a, b, c) and p(a, c).
As a third example of update, let us delete p(a, c) from $\Delta_2$. Denoting by $\Delta_3$ = (I_FACTS$_3$, D_FACTS$_3$, RULES) the resulting database, we have:

I_FACTS$_3$: r(a, b), s(b, c) and D_FACTS$_3$: p(a, c).

It is important to note that $\Delta_3$ still derives the facts r(a, b), s(b, c) and q(a, b, c) but not the fact p(a, c) as this fact is a deleted fact. In other words, deleted facts act as exceptions to the rules. Moreover, negations of deleted facts can be used to infer other facts. For example, we can use ¬p(a, c) to infer t(a, c) from the last rule and the fact q(a, b, c).
As a fourth example of update, let us delete r(a, b) from $\Delta_3$. To do this, we remove r(a, b) from the inserted facts and we add it to the deleted facts. Denoting by $\Delta_4$ = (I_FACTS$_4$, D_FACTS$_4$, RULES) the resulting database, we have:

I_FACTS$_4$: s(b, c) and D_FACTS$_4$: p(a, c), r(a, b).

We note that $\Delta_4$ derives only one fact, namely s(b, c). In the rest of the paper, we shall use the database $\Delta_3$ above as our running example and we shall denote this database by Δ = (I_FACTS, D_FACTS, RULES).
It is important to note the differences between our approach and traditional approaches. In our approach:
• we record both inserted and deleted facts, and
• we accept insertions and deletions of facts over both intensional and extensional predicates.
By doing so, our approach makes use of all the information provided by the user. In contrast, traditional approaches
• record the inserted facts but not the deleted facts, and
• record facts only over extensional predicates.
As a result, traditional approaches use only part of the information provided by the user, and this is, in our opinion, the main cause of non-determinism during updating. In our approach, however, the storage of deleted facts introduces a significant overhead. This led us to study issues related to storage optimization. The important question in this context is the following: under what conditions can we avoid storing an inserted or a deleted fact and still obtain the same answers as if the fact were stored? This and related questions receive a detailed treatment in the paper.
1.3. Related work
The problem of updating intensional predicates in deductive databases has attracted considerable attention in recent years. The problem is how to translate updates over intensional predicates into updates over extensional predicates. We note that a similar problem occurs also in relational databases, known as the view-updating problem, and, as shown in [8], view-updating is not deterministic in many cases. We also recall that the same problem has been studied in the context of updating through universal scheme interfaces [19,28,27,5,25,23].
In the context of deductive databases, the problem has no global solution under standard Datalog semantics [6], but partial solutions have been obtained, for example by using functional dependencies [31]. One approach that leads to a deterministic translation process can be found in [20] where the authors consider sets of databases (that they call knowledge bases), instead of a single database. Roughly speaking, operators are given on these sets so that, given an update, all resulting states satisfying minimal change are computed. It should be noticed that, while such an approach is well adapted to Artificial Intelligence, it seems to be inefficient in the context of deductive databases. Indeed, the large number of facts usually contained in a database would generate an intractable set of databases which would make any update processing unrealistic. The main idea behind our approach is to store deleted facts and use them during querying and updating. In a way, this idea can be seen as storing the history of the information handled by the database. In this sense, our approach is related to that of [35], where updates on deductive databases are defined by means of history predicates. In [35], however, deleted facts are only used in order to process updates and are subsequently ignored in query answering, whereas in our approach deleted facts play an essential role in query answering. On the other hand, we note that in [35], history predicates are introduced to update deductive databases containing incomplete information, a problem that is not addressed in this paper. A radically different approach to deterministic updating consists in: (a) defining a transaction language such that every transaction in the language preserves database consistency "automatically" [2,3], and (b) allowing users to update the database using only transactions of that language. The design of such a language, however, is not a trivial task.
We note that in the approach of [22] such transactions can be obtained through abduction. In [15] a language for the specification of updates is proposed and allows the expression of dynamic constraints. We also note that a similar approach to the specification of updates is considered in [30], where dynamic constraints are seen as transaction preconditions. An important problem related to updates is that of maintaining the database integrity constraints. An efficient approach to this problem is provided in [14] using abduction. Following this approach only tests that are relevant to the current update are performed against the constraints and the current database state, so that feasibility of the update is checked in an efficient way. However, only particular cases of constraints are handled by this approach. A more general approach is proposed in [17], where this problem is considered in the general case by means of modal logic. Modal operators are used in order to model the forthcoming state of the database after the update. However, despite its generality, the approach cannot avoid non-determinism in the case of deletions. Finally, we would like to note that, by recording both inserted and deleted facts, our approach can be seen as a particular case of making use of default rules [29]. Indeed, each rule $p(x) \leftarrow p_1(x_1), p_2(x_2), \ldots, p_k(x_k)$ of the database, in our approach, can be seen as a closed default rule [9], where:
• "$p_1(x_1), p_2(x_2), \ldots, p_k(x_k)$ are true" is the prerequisite,
• "p(x) is not deleted" is the justification, and
• "p(x) is true" is the consequent.
(The term "true" above is used informally to mean "derived facts." A formal definition will be given shortly.) In this respect, it is interesting to note that the work of [18] uses exceptions in order to generalize
negation by failure. These exceptions contribute to the derivation of new facts, as in our approach. In [18], however, exceptions are computed through "metalevel constraints" and cannot be explicitly specified in the program, as we do in our approach. The paper is organized as follows: In Section 2, we introduce our formalism and we recall the main definitions and notations concerning deductive databases and well-founded semantics. In Section 3, we show how databases in our approach can be associated with well-founded semantics. In Section 4, we give formal definitions and properties of updates and their semantics. In Section 5, we address the important problem of storage optimization in our approach. Finally, Section 6 contains some concluding remarks and suggestions for further research.
2. Basic definitions and notation
In this section we first introduce our formalism and then we recall the definition of well-founded semantics of Datalog that we also use in our approach.
2.1. The database
A database is seen as a triple consisting of two function-free sets of facts (which are modified during updates) and of a set of function-free rules (which are not modified during updates). The underlying alphabet $\mathcal{A}$ consists of:
Constants. There is an infinite set of constants denoted by CONST. Constants are usually denoted by lower case characters such as a, b, etc.
Variables. There is an infinite set of variables denoted by VAR. Variables are usually denoted by lower case characters such as x, y, etc.
Predicates. There is a finite set of predicates denoted by PRED. Predicates are usually denoted by lower case characters such as p, q, etc. Each predicate is associated with a positive integer called its arity.
We assume that the sets CONST, VAR and PRED are mutually disjoint. We note that, since we consider function-free languages, the only terms are constants and variables. As a notational convenience, if p is an n-ary predicate of PRED, and if $t_1, t_2, \ldots, t_n$ are terms, then the formula $p(t_1, t_2, \ldots, t_n)$ is denoted by $p(\vec{t})$. We call literal any formula of the form $p(\vec{t})$ or $\neg p(\vec{t})$, referred to as positive literal and negative literal, respectively. A literal is said to be ground if it contains no variable. A positive ground literal is also called a fact. A rule is a formula of the form $p(\vec{t}) \leftarrow L_1, L_2, \ldots, L_k$ where, for every $i = 1, 2, \ldots, k$, $L_i$ is a literal. The positive literal $p(\vec{t})$ is called the head of the rule and the set of literals $L_1, L_2, \ldots, L_k$ is called the body of the rule. A rule is positive if its body contains only positive literals. Moreover, rules are assumed to be safe [32,16], that is, every variable occurring in the head of a rule also occurs in the body of the rule and
every variable occurring in a negative literal of the body also occurs in a positive literal of the body of the rule. We now define what we call a database in our approach and what we mean by database consistency.
Definition 1--Database. A database Δ over a given alphabet $\mathcal{A}$ is a triple Δ = (I_FACTS, D_FACTS, RULES) where I_FACTS and D_FACTS are finite sets of facts and RULES is a finite set of rules.
The following definition of consistency ensures that inserted and deleted facts stored in the database do not conflict.
Definition 2--Consistency. A database Δ = (I_FACTS, D_FACTS, RULES) is consistent if I_FACTS ∩ D_FACTS = ∅.
2.2. Well founded semantics
We briefly recall the basic notions of the so-called well founded semantics of Datalog [33]. The reader is referred to [10] for more details on this topic. Well founded semantics are based on the notion of partial interpretation, defined as follows. Let HB be the Herbrand base of a database alphabet $\mathcal{A}$, i.e. HB is the set of all facts that can be obtained from $\mathcal{A}$. For every subset S of HB, let $\neg.S$ denote the set $\{\neg f \mid f \in S\}$. In particular, $\neg.HB$ denotes the set $\{\neg f \mid f \in HB\}$. A subset C of $HB \cup \neg.HB$ is called consistent if C does not contain a fact f and its negation $\neg f$. That is, C is consistent if there are disjoint subsets $C^+$ and $C^-$ of HB such that $C = C^+ \cup \neg.C^-$. A partial interpretation of $\mathcal{A}$ is just a consistent subset I of $HB \cup \neg.HB$. A Datalog database is a set of facts and rules denoted as DB = (F, R), where F stands for the set of facts and R stands for the set of rules. We denote by inst_DB the set of all facts in F together with all ground clauses obtained by instantiating the rules in R using the constants appearing in the facts of F or in the rules of R. Given a partial interpretation I and a Datalog database DB, consider the following operators.
• Define the immediate consequence operator $T^{\in}_{DB}$ by:
$T^{\in}_{DB}(I) = \{head(r) \mid r \in inst\_DB \wedge \forall L \in body(r),\ L \in I\}$.
• Define a set of facts U to be unfounded with respect to I, if for all f in U and for all instantiated rules r in inst_DB the following holds:
$head(r) = f \Rightarrow \exists L \in body(r),\ [\neg L \in I \vee L \in U]$.
Define the unfounded-set operator $GUS_{DB}$ by: $GUS_{DB}(I)$ is the union of all unfounded sets with respect to I.
It has been shown [33] that the operator $T^{\in}_{DB} \cup \neg.GUS_{DB}$ has a least fixpoint which is a partial interpretation of $\mathcal{A}$. This fixpoint is referred to as the well-founded model of DB. Moreover, the set of unfounded facts can be computed through the set of potentially founded facts [12] as follows.
• Define $SPF_{DB}(I)$ to be the limit of the sequence $[SPF_i(I)]_{i \ge 0}$ defined by:
$SPF_0(I) = \{head(r) \mid r \in inst\_DB \wedge pos(body(r)) = \emptyset \wedge \forall L \in body(r),\ \neg L \notin I\}$,
$SPF_i(I) = \{head(r) \mid r \in inst\_DB \wedge pos(body(r)) \subseteq SPF_{i-1}(I) \wedge \forall L \in body(r),\ \neg L \notin I\}$.
Here, pos(body(r)) stands for the set of positive literals of body(r). It is shown in [12] that $GUS_{DB}(I) = HB_{DB} \setminus SPF_{DB}(I)$, where $HB_{DB}$ denotes the subset of HB whose facts involve only constants appearing in the facts or in the rules of DB. We note that the above sequence provides a constructive way to obtain the well founded model of DB.
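The operators recalled above can be sketched in Python on an already instantiated program. This is a minimal illustration under our own assumptions: ground atoms are plain strings, a ground rule is a hypothetical triple (head, positive body, negated atoms), and a partial interpretation is a pair of sets (true facts, false facts). The function names `t_op`, `spf` and `gus` are ours.

```python
def t_op(inst, true, false):
    """Immediate consequences: heads of rules whose whole body holds in I."""
    return {h for h, pos, neg in inst if pos <= true and neg <= false}


def spf(inst, true, false):
    """Potentially founded facts: limit of SPF_0, SPF_1, ..."""
    # A rule is usable if no body literal is contradicted by I.
    usable = [(h, pos) for h, pos, neg in inst
              if not (pos & false) and not (neg & true)]
    founded = set()
    while True:
        # First pass yields SPF_0 (empty positive body), then SPF_1, ...
        new = {h for h, pos in usable if pos <= founded}
        if new == founded:
            return founded
        founded = new


def gus(inst, hb, true, false):
    """Greatest unfounded set, obtained as the complement of SPF."""
    return hb - spf(inst, true, false)


# Toy ground program (hypothetical, not taken from the paper).
inst = [
    ("e", frozenset(), frozenset()),            # fact e
    ("p", frozenset({"e"}), frozenset({"q"})),  # p <- e, not q
    ("q", frozenset({"r"}), frozenset()),       # q <- r
    ("r", frozenset({"q"}), frozenset()),       # r <- q
]
hb = {"e", "p", "q", "r"}

unfounded = gus(inst, hb, set(), set())   # q and r only support each other
step = t_op(inst, {"e"}, unfounded)       # e, plus p via "not q"
```

Here q and r depend only on each other, so neither is ever potentially founded, and both land in the greatest unfounded set. Once they are known to be false, p becomes an immediate consequence.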
3. Database semantics We show now that the above way of computing the well founded model of a Datalog database can also be used in our approach with only minor modifications.
3.1. The model of a database
The operators $T^{\in}$ and GUS of the previous section can be slightly modified in order to generate a model for any consistent database Δ of our approach. The idea is, on the one hand, to restrict $T^{\in}$ so that it generates no deleted facts, and on the other hand, to incorporate the deleted facts into the unfounded facts computed by the operator GUS. Formally, given a consistent database Δ = (I_FACTS, D_FACTS, RULES) of our approach, let inst_Δ denote the set of all facts in I_FACTS together with all instantiated rules that can be obtained by instantiating the rules of RULES by constants appearing in Δ. Moreover, let $T^{\in}_{\Delta}$ and $GUS_{\Delta}$ denote the well founded operators associated to the Datalog$^{neg}$ database defined by the pair (I_FACTS, RULES). Now, given a partial interpretation I, define the operators $T^{\in *}_{\Delta}$ and $GUS^{*}_{\Delta}$ as follows:

$T^{\in *}_{\Delta}(I) = T^{\in}_{\Delta}(I) \setminus D\_FACTS$ and $GUS^{*}_{\Delta}(I) = GUS_{\Delta}(I) \cup D\_FACTS$.

We note that the well-founded operators are modified in our approach in such a way that deleted facts appear as unfounded facts. Now, consider a sequence $[M_i]_{i \ge 0}$, where $M_i = M_i^{+} \cup \neg.M_i^{-}$ and where $M_i^{+}$, $M_i^{-}$ are defined as follows:

$M_0^{+} = I\_FACTS$ and $M_0^{-} = D\_FACTS$,
$M_i^{+} = T^{\in *}_{\Delta}(M_{i-1})$ and $M_i^{-} = GUS^{*}_{\Delta}(M_{i-1})$, for $i > 0$.
In the following theorem, we show that the above sequence is a monotonic sequence of partial interpretations of $\mathcal{A}$. Thus, this sequence has a limit which is a partial interpretation of $\mathcal{A}$. We call this partial interpretation of $\mathcal{A}$ the model of Δ, and we denote it by M(Δ).
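The construction of the model can be sketched as follows. This is a minimal illustration and not an optimized implementation: facts are tuples, a rule is a hypothetical triple (head, positive body, negated atoms), the names in `VARS` are assumed to be the only variables, and the unfounded facts are obtained by complementing the potentially founded facts, as recalled in Section 2.2. The data at the end is the database of the running example.

```python
from itertools import product

VARS = {"x", "y", "z"}  # assumption: these names are variables, all else constants


def ground(rules, consts):
    """Instantiate every rule with every assignment of constants to its variables."""
    out = []
    for head, pos, neg in rules:
        vs = sorted({t for a in (head, *pos, *neg) for t in a[1:] if t in VARS})
        for vals in product(sorted(consts), repeat=len(vs)):
            env = dict(zip(vs, vals))
            sub = lambda a: (a[0],) + tuple(env.get(t, t) for t in a[1:])
            out.append((sub(head), frozenset(map(sub, pos)), frozenset(map(sub, neg))))
    return out


def t_op(inst, true, false):
    """Immediate consequences of the interpretation (true, false)."""
    return {h for h, pos, neg in inst if pos <= true and neg <= false}


def spf(inst, true, false):
    """Potentially founded facts (limit of the SPF sequence)."""
    usable = [(h, pos) for h, pos, neg in inst
              if not (pos & false) and not (neg & true)]
    founded = set()
    while True:
        new = {h for h, pos in usable if pos <= founded}
        if new == founded:
            return founded
        founded = new


def model(i_facts, d_facts, rules):
    """Return (true facts, false facts) of a consistent database."""
    i_facts, d_facts = set(i_facts), set(d_facts)
    assert not (i_facts & d_facts), "database must be consistent"
    atoms = i_facts | d_facts | {a for h, p, n in rules for a in (h, *p, *n)}
    consts = {t for a in atoms for t in a[1:] if t not in VARS}
    arities = {(a[0], len(a) - 1) for a in atoms}
    hb = {(p,) + args for p, n in arities
          for args in product(sorted(consts), repeat=n)}
    inst = [(f, frozenset(), frozenset()) for f in i_facts] + ground(rules, consts)
    true, false = i_facts, d_facts                            # M_0
    while True:
        new_true = t_op(inst, true, false) - d_facts          # no deleted fact derived
        new_false = (hb - spf(inst, true, false)) | d_facts   # deleted facts unfounded
        if (new_true, new_false) == (true, false):
            return true, false
        true, false = new_true, new_false


rules = [
    (("q", "x", "y", "z"), [("r", "x", "y"), ("s", "y", "z")], []),
    (("p", "x", "z"), [("q", "x", "y", "z")], []),
    (("t", "x", "z"), [("q", "x", "y", "z")], [("p", "x", "z")]),
]
true, false = model({("r", "a", "b"), ("s", "b", "c")}, {("p", "a", "c")}, rules)
```

On this input the true facts are r(a, b), s(b, c), q(a, b, c) and t(a, c), while p(a, c) is false because it is a deleted fact.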
Theorem 1. For every consistent database Δ = (I_FACTS, D_FACTS, RULES), the sequence $[M_i]_{i \ge 0}$ as defined above is an increasing sequence of partial interpretations of $\mathcal{A}$.
Proof. We first note that $T^{\in *}_{\Delta}$ and $GUS^{*}_{\Delta}$ are both monotonic operators, since so are $T^{\in}_{\Delta}$ and $GUS_{\Delta}$ (see [33]).
Let us show that, for every integer $i \ge 0$, $M_i$ is a partial interpretation. We first note that such is the case for $i = 0$, since I_FACTS ∩ D_FACTS = ∅. Given an integer i, we now assume that for every $k < i$, $M_k$ is a partial interpretation and that $M_i$ is not a partial interpretation. In this case, let f be a fact in $M_i^{+} \cap M_i^{-}$. Since $D\_FACTS \cap M_i^{+} = \emptyset$, f does not belong to D_FACTS. By definition of $T^{\in}_{\Delta}$, there exists a rule r in inst_Δ whose head is f and whose body is contained in $M_{i-1}$. On the other hand, as f belongs to $M_i^{-}$ and not to D_FACTS, f belongs to $GUS_{\Delta}(M_{i-1})$. Therefore, by definition of $GUS_{\Delta}$, one of the literals in body(r), say $L_{i_0}$, is such that either $L_{i_0} \in GUS_{\Delta}(M_{i-1})$, or $\neg L_{i_0} \in M_{i-1}$. As $M_{i-1}$ is a partial interpretation, we cannot have $L_{i_0}$ in both sets $M_{i-1}^{+}$ and $M_{i-1}^{-}$. Hence, $L_{i_0} \in GUS_{\Delta}(M_{i-1})$, entailing that $L_{i_0}$ is a positive literal that belongs to $M_{i-1}^{+} \cap GUS_{\Delta}(M_{i-1})$. If we write $L_{i_0}$ as $f_0$, we obtain that $f_0 \in M_{i-1}^{+} \cap M_i^{-}$. Since $M_{i-1}^{+} \subseteq M_i^{+}$, $f_0 \in M_i^{+}$ and, moreover, $f_0 \notin D\_FACTS$. Thus, the same reasoning as for f holds for $f_0$: there exists a fact $f_1$ not in D_FACTS that belongs to $M_{i-1}^{+} \cap M_i^{-}$. But we also have that $f_1 \in M_{i-2}^{+}$ (because $f_0 \in M_{i-1}^{+}$). As a consequence, $M_{i-2}^{+} \cap M_i^{-} \neq \emptyset$ and, similarly, for every j, $0 \le$ [...]
$SPF^{*}_0(I) = \{head(r) \mid r \in inst\_\Delta \wedge pos(body(r)) = \emptyset \wedge \forall L \in body(r),\ \neg L \notin I\} \setminus D\_FACTS$,
$SPF^{*}_i(I) = \{head(r) \mid r \in inst\_\Delta \wedge pos(body(r)) \subseteq SPF^{*}_{i-1}(I) \wedge \forall L \in body(r),\ \neg L \notin I\} \setminus D\_FACTS$.
We first note that this sequence is increasing, and thus it has a limit, which we denote by $SPF^{*}_{\Delta}(I)$. Moreover, it is easily seen that, for every partial interpretation I, we have $SPF^{*}_{\Delta}(I) \subseteq SPF_{\Delta}(I)$. The following proposition shows that, for every $i > 0$, $M_i^{-}$ is the complement of $SPF^{*}_{\Delta}(M_{i-1})$ with respect to $HB_{\Delta}$.
Proposition 1. For every $i > 0$, we have: $M_i^{-} = HB_{\Delta} \setminus SPF^{*}_{\Delta}(M_{i-1})$.
Proof. Let f be a fact in $M_i^{-}$. By definition, f belongs to $GUS^{*}_{\Delta}(M_{i-1}) = GUS_{\Delta}(M_{i-1}) \cup D\_FACTS$. If $f \in D\_FACTS$, then f belongs to none of the sets $SPF^{*}_j(M_{i-1})$. Thus, in this case f belongs to $HB_{\Delta} \setminus SPF^{*}_{\Delta}(M_{i-1})$. If, on the other hand, f belongs to $GUS^{*}_{\Delta}(M_{i-1})$ but not to D_FACTS then, clearly, f belongs to $GUS_{\Delta}(M_{i-1})$ (since $GUS^{*}_{\Delta}(M_{i-1}) = GUS_{\Delta}(M_{i-1}) \cup D\_FACTS$). In this case, following [12], f does not belong to $SPF_{\Delta}(M_{i-1})$. As a consequence, f does not belong to $SPF^{*}_{\Delta}(M_{i-1})$ (since $SPF^{*}_{\Delta}(M_{i-1}) \subseteq SPF_{\Delta}(M_{i-1})$). Therefore, we have shown that every fact f in $M_i^{-}$ belongs to $HB_{\Delta} \setminus SPF^{*}_{\Delta}(M_{i-1})$, for every $i > 0$.
Conversely, let f be a fact in $HB_{\Delta} \setminus SPF^{*}_{\Delta}(M_{i-1})$. As above, if f is in D_FACTS then, clearly, f belongs to $GUS^{*}_{\Delta}(M_{i-1})$. So, let us now assume that f is not in D_FACTS. Therefore, f is neither in D_FACTS nor
in $SPF^{*}_{\Delta}(M_{i-1})$. If moreover, we assume that f is not in $M_i^{-} = GUS^{*}_{\Delta}(M_{i-1})$, then f is not in $GUS_{\Delta}(M_{i-1})$ (since $GUS_{\Delta}(M_{i-1}) \subseteq GUS^{*}_{\Delta}(M_{i-1})$). This entails that f belongs to $SPF_{\Delta}(M_{i-1})$ [12]. As a consequence, there exists $j \ge 0$ such that $f \in SPF_j(M_{i-1})$. Let us first assume that there exists $j > 0$ such that $f \in SPF_j(M_{i-1}) \setminus SPF_{j-1}(M_{i-1})$, i.e. $f \notin SPF_0(M_{i-1})$. By definition of $SPF_j$, there exists an instantiated rule r in inst_Δ such that $head(r) = f$ and $pos(body(r)) \subseteq SPF_{j-1}(M_{i-1})$ and $\neg L \notin M_{i-1}$, for every L in body(r). Since f is assumed not to be in $SPF^{*}_{\Delta}(M_{i-1})$, either $pos(body(r)) \not\subseteq SPF^{*}_{\Delta}(M_{i-1})$, or there is some literal L in body(r) such that $\neg L \in M_{i-1}$. However, this last case is impossible, since it would imply that $f \notin SPF_{\Delta}(M_{i-1})$. Thus, there is a fact in pos(body(r)) which does not belong to $SPF^{*}_{\Delta}(M_{i-1})$. Let $f_1$ be such a fact. We note that $f_1$ cannot be in D_FACTS, because this would entail that $\neg f_1$ is in $M_{i-1}$ (since $D\_FACTS \subseteq M_{i-1}^{-}$). Therefore, $f_1$ satisfies: $f_1 \notin D\_FACTS$, $f_1 \in SPF_{j-1}(M_{i-1})$ and $f_1 \notin SPF^{*}_{\Delta}(M_{i-1})$. Thus, we obtain that $f_1$ satisfies the same conditions as f, but at step $j - 1$ of the computation of $SPF_{\Delta}(M_{i-1})$. Applying the same reasoning iteratively, we obtain a sequence $f_1, f_2, \ldots, f_j$ of facts having the same properties as f and, in particular, $f_j$ satisfies the following: $f_j \notin D\_FACTS$, $f_j \in SPF_0(M_{i-1})$ and $f_j \notin SPF^{*}_{\Delta}(M_{i-1})$. Therefore, by definition of $SPF_0$, there exists a rule $r_j$ in inst_Δ such that $head(r_j) = f_j$, $pos(body(r_j)) = \emptyset$, and $\neg L \notin M_{i-1}$, for every L in $body(r_j)$. This implies that $f_j \in SPF^{*}_0(M_{i-1})$, thus that $f_j \in SPF^{*}_{\Delta}(M_{i-1})$. This is a contradiction to our hypothesis that $f_j \notin SPF^{*}_{\Delta}(M_{i-1})$. Therefore, if $f \notin D\_FACTS$ and $f \in SPF_j(M_{i-1}) \setminus SPF_{j-1}(M_{i-1})$, for some $j > 0$, then $f \in GUS^{*}_{\Delta}(M_{i-1})$. We finally note that if $f \notin D\_FACTS$ and $f \in SPF_0(M_{i-1})$, then the above reasoning for $f_j$ shows that f belongs to $GUS^{*}_{\Delta}(M_{i-1})$. Therefore, we have shown that every fact f in $HB_{\Delta} \setminus SPF^{*}_{\Delta}(M_{i-1})$ belongs to $M_i^{-}$, and this completes the proof of the proposition. □
Let us illustrate the computation of M(Δ) using our running example.
Running example (continued). We recall that the database Δ = (I_FACTS, D_FACTS, RULES) of our running example is the following:

I_FACTS: r(a, b), s(b, c)
D_FACTS: p(a, c)
RULES:
q(x, y, z) ← r(x, y), s(y, z)
p(x, z) ← q(x, y, z)
t(x, z) ← q(x, y, z), ¬p(x, z)

The model M(Δ) is computed as follows. First, we have:

$M_0^{+} = \{r(a, b), s(b, c)\}$ and $M_0^{-} = \{p(a, c)\}$.

Now, let us compute $M_1$ from $M_0$. The fact p(a, c) does not belong to $M_1^{+}$ because p(a, c) is a deleted fact. The fact q(a, b, c) does belong to $M_1^{+}$: this fact is not deleted and the first rule applies since r(a, b) and s(b, c) both belong to $M_0$. Therefore, we get:

$M_1^{+} = \{r(a, b), s(b, c), q(a, b, c)\}$.
The computation of SPF*(Mo) is as follows. First, we have SPF*(Mo)= {r(a, b), s(b, c)}. Then, the facts in SPF*(Mo) are computed from M o and SPFc*(Mo) as follows: • First, we have SPF*(Mo)C SPF*(Mo). • Moreover, using the first rule, we obtain that q(a, b, c) has to be in SPF*(Mo) because r(a, b) and s(b, c) are in SPF*(Mo). • Using the second rule, we obtain no fact since p(a, c) is a deleted fact. • Finally, using the third rule, we obtain no fact either. Therefore we have: SPF* (Mo) = {r(a, b), s(b, c), q(a, b, c)}. Since q(a, b, c) is in SPF*(Mo), the computation of SPF*(Mo) from SPF*(Mo) generates only the fact t(a, c). Thus, we have SPF*(M o) = SPF*(Mo)tO {t(a, c)}, and finally, we obtain SPF*(Mo)= SPF*z (Mo). Hence, SPF*(Mo) = SPF*(Mo). Thus, we obtain M~ by complementation with respect to HBa, and we compute M 1 by: + M l = M 1 tO~.M~. + + In order to compute M 2 f r o m ml, we proceed as follows. First, we have M 2 = M 1 tO {t(a, c)}, since q(a, b, c) and ~p(a, c) are in M 1. On the other hand, the set SPF*(MI) is computed as SPF*(Mo) above, because t(a, c) is in SPF*(Mo). Therefore, M 2 = M 1 tO {t(a, c)} and it can be seen that no new fact appears during the computation of M 3 from M 2. Therefore, denoting by M + the set {r(a, b), s(b, c), q(a, b, c), t(a, c)}, the model of A can be written as follows:
$M(\Delta) = M^+ \cup \neg.(HB_\Delta \setminus M^+)$.

We note that, for every fact f in this example, either f or ¬f is in $M(\Delta)$. Such a model is called a total model [10]. Using the model $M(\Delta)$, we can define the notions of true, false and unknown facts in $\Delta$ as follows:
• f is true in $\Delta$ if $f \in M(\Delta)$,
• f is false in $\Delta$ if $\neg f \in M(\Delta)$,
• f is unknown in $\Delta$ if f is neither true nor false.

We denote by $M^+(\Delta)$ the set of true facts and by $M^-(\Delta)$ the set of false facts (with respect to $\Delta$). Moreover, we denote by $|M(\Delta)|$ the set of those facts f such that either f or ¬f is in $M(\Delta)$. We have:

$M(\Delta) = M^+(\Delta) \cup \neg.M^-(\Delta)$ and $|M(\Delta)| = M^+(\Delta) \cup M^-(\Delta)$.
We note that a fact f is unknown iff f does not belong to $|M(\Delta)|$. Therefore, we shall call a fact f known if f belongs to $|M(\Delta)|$, i.e. f is known iff f is either true or false. Moreover, if $|M(\Delta)|$ contains all the facts of $HB_\Delta$ (i.e. if all facts are known), then $M(\Delta)$ is called total.

Note. In our running example, we have seen that all facts are known, i.e. $M(\Delta)$ is total. We now give an example of a database having a non-total model. Consider the database $\Delta$ = (I_FACTS, D_FACTS, RULES) where:

I_FACTS: r(a)
D_FACTS: ∅
RULES: p(x) ← r(x), ¬p(x)

We have $M(\Delta) = \{r(a)\}$. Thus p(a) is an unknown fact. It follows that $M(\Delta)$ is not a total model.
The following proposition states an intuitively appealing property of the model $M(\Delta)$, namely, that every inserted fact is true in $\Delta$ and every deleted fact is false in $\Delta$.

Proposition 2. Let $\Delta$ be a consistent database and let f be a fact over the underlying alphabet. Then we have:
1. if f is in I_FACTS, then f is true in $\Delta$, and
2. if f is in D_FACTS, then f is false in $\Delta$.

Proof. The proof is straightforward: every inserted or deleted fact is in $M_0$ and thus in $M(\Delta)$, because the sequence for the computation of $M(\Delta)$ is increasing. □
3.2. Relationship with Datalog

In this subsection we show that a consistent database $\Delta$ over alphabet $\mathcal{A}$ (in the sense of our approach) can be associated with a Datalog database $\Pi(\Delta)$. First, we recall that $\mathcal{A}$ = CONST ∪ VAR ∪ PRED (see Section 2.1). Let us extend $\mathcal{A}$ to an alphabet $\mathcal{A}'$ as follows:
1. For every n-ary intensional predicate p in PRED, define a new n-ary predicate $\dot{p}$ not in PRED.
2. Define PRED' = PRED ∪ {$\dot{p}$ | p is an intensional predicate of PRED}.
3. Define $\mathcal{A}'$ to be the alphabet CONST ∪ VAR ∪ PRED'.

Now, given a consistent database $\Delta$ = (I_FACTS, D_FACTS, RULES) over the alphabet $\mathcal{A}$, we associate $\Delta$ with a Datalog database $\Pi(\Delta) = (F, R)$ over the alphabet $\mathcal{A}'$ as follows:
• If f is in I_FACTS, then f is in F.
• If $f = p(\bar{a})$ is in D_FACTS and if p is an intensional predicate, then $\dot{p}(\bar{a})$ is in F.
• If $p(\bar{t}) \leftarrow L_1, L_2, \ldots, L_k$ is in RULES, then the rule $p(\bar{t}) \leftarrow L_1, L_2, \ldots, L_k, \neg\dot{p}(\bar{t})$ is in R.

We denote by $M^{wf}(\Delta)$ the well-founded model of $\Pi(\Delta)$. Let us illustrate the relationship between $\Delta$ and $\Pi(\Delta)$ using our running example.
Running example (continued). We recall that the database $\Delta$ = (I_FACTS, D_FACTS, RULES) of our running example is the following:

I_FACTS: r(a, b), s(b, c)
D_FACTS: p(a, c)
RULES:
  q(x, y, z) ← r(x, y), s(y, z)
  p(x, z) ← q(x, y, z)
  t(x, z) ← q(x, y, z), ¬p(x, z)

The above database $\Delta$ is associated with the following Datalog database $\Pi(\Delta)$:

F: r(a, b), s(b, c), $\dot{p}$(a, c)
R:
  q(x, y, z) ← r(x, y), s(y, z), ¬$\dot{q}$(x, y, z)
  p(x, z) ← q(x, y, z), ¬$\dot{p}$(x, z)
  t(x, z) ← q(x, y, z), ¬p(x, z), ¬$\dot{t}$(x, z)
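The translation from a database to its associated Datalog database is purely syntactic, which the following minimal Python sketch tries to make concrete on the running example. The data types (`Atom`, `Literal`, `Rule`), the function names and the `_dot` suffix used for dotted predicates are illustrative choices, not notation from the paper. The sketch also assumes that a predicate is intensional exactly when it occurs in the head of some rule.

```python
from typing import NamedTuple

class Atom(NamedTuple):
    pred: str
    args: tuple

class Literal(NamedTuple):
    atom: Atom
    positive: bool = True

class Rule(NamedTuple):
    head: Atom
    body: tuple  # tuple of Literal

def dotted(pred):
    """Name of the fresh predicate associated with an intensional predicate."""
    return pred + "_dot"

def translate(i_facts, d_facts, rules):
    """Build (F, R) from (I_FACTS, D_FACTS, RULES).

    Assumption: intensional predicates are those appearing in rule heads.
    """
    intensional = {r.head.pred for r in rules}
    # Inserted facts are kept; deleted intensional facts become dotted facts.
    F = set(i_facts) | {
        Atom(dotted(f.pred), f.args) for f in d_facts if f.pred in intensional
    }
    # Every rule gets the extra negative literal on the dotted head predicate.
    R = [
        Rule(r.head, r.body + (Literal(Atom(dotted(r.head.pred), r.head.args), False),))
        for r in rules
    ]
    return F, R

X, Y, Z = "x", "y", "z"
rules = [
    Rule(Atom("q", (X, Y, Z)), (Literal(Atom("r", (X, Y))), Literal(Atom("s", (Y, Z))))),
    Rule(Atom("p", (X, Z)), (Literal(Atom("q", (X, Y, Z))),)),
    Rule(Atom("t", (X, Z)), (Literal(Atom("q", (X, Y, Z))), Literal(Atom("p", (X, Z)), False))),
]
i_facts = {Atom("r", ("a", "b")), Atom("s", ("b", "c"))}
d_facts = {Atom("p", ("a", "c"))}
F, R = translate(i_facts, d_facts, rules)
```

Under these assumptions, `F` contains r(a, b), s(b, c) and the dotted fact for p(a, c), and each rule in `R` ends with the negated dotted literal over its own head.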
If we compute the well-founded model of $\Pi(\Delta)$, we find the following:

$M^{wf}(\Delta) = M(\Delta) \cup \{\dot{p}(a, c)\} \cup \{\neg\dot{p}(\alpha, \gamma) \mid \alpha \neq a \vee \gamma \neq c\} \cup \{\neg\dot{q}(\alpha, \beta, \gamma) \mid \alpha \in \{a, b, c\} \wedge \beta \in \{a, b, c\} \wedge \gamma \in \{a, b, c\}\} \cup \{\neg\dot{t}(\alpha, \gamma) \mid \alpha \in \{a, b, c\} \wedge \gamma \in \{a, b, c\}\}$

where $M(\Delta)$ is the model of $\Delta$. In other words, the ground literals of $M^{wf}(\Delta)$ over the predicates p, q, r, s and t constitute exactly the model $M(\Delta)$ of $\Delta$. This is precisely what is stated in the following theorem.

Theorem 2. Let $\Delta$ be a consistent database and let $\Pi(\Delta)$ be the associated Datalog database. Then, for every predicate p in PRED:
• $p(\bar{a}) \in M(\Delta)$ iff $p(\bar{a}) \in M^{wf}(\Delta)$, and
• $\neg p(\bar{a}) \in M(\Delta)$ iff $\neg p(\bar{a}) \in M^{wf}(\Delta)$.
Proof. See Appendix A at the end of the paper.

Using the above theorem, we can show easily that, if $\Delta$ contains only positive rules, then $M(\Delta)$ can be computed by means of the operator $T^*_\Delta$ alone.

Proposition 3. Let $\Delta$ be a consistent database. If all rules of $\Delta$ are positive, then

$M(\Delta) = lfp(T^*_\Delta) \cup \neg.(HB_\Delta \setminus lfp(T^*_\Delta))$

(where lfp stands for "least fixpoint").

Proof. We first show that $M^+(\Delta) = lfp(T^*_\Delta)$. We recall that the operator $T^*_\Delta$ is monotonic. It follows that the least fixpoint $lfp(T^*_\Delta)$ is the limit of the sequence $(T_i)_{i \geq 0}$ defined by:

$T_0 = T^*_\Delta(\emptyset)$ and $T_i = T^*_\Delta(T_{i-1})$, for $i > 0$.

Since $\Delta$ is consistent, we have: $T_0 = M_0^+$ = I_FACTS. On the other hand, since all rules in $\Delta$ are assumed to be positive, for all instantiated rules r of $inst\_\Delta$, body(r) is a set of positive literals. Thus, we have: $T_\Delta(M_{i-1}) = T^*_\Delta(M_{i-1}^+)$, for $i > 0$. Now, given an integer k, and assuming that $M_i^+ = T_i$ for all i such that $0 \leq i < k$, we have $M_k^+ = T_k$, because $M_k^+ = T_\Delta(M_{k-1}) = T^*_\Delta(M_{k-1}^+) = T^*_\Delta(T_{k-1}) = T_k$. Therefore, we have shown that $M^+(\Delta) = lfp(T^*_\Delta)$.

On the other hand, we note that the additional predicates used in $\Pi(\Delta)$ are extensional. Since the rules of $\Delta$ are positive, $\Pi(\Delta)$ is a semi-positive Datalog program. As a consequence, $M^{wf}(\Delta)$ is a total model [9]. Let us assume that f is a fact not in $M(\Delta)$. By Theorem 2, f is not in $M^{wf}(\Delta)$, thus ¬f belongs to $M^{wf}(\Delta)$. Applying again Theorem 2, ¬f belongs to $M(\Delta)$. This shows that $M(\Delta)$ is total; therefore, the proposition is proved. □
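By Theorem 2, the model of a consistent database can be read off the well-founded model of its associated Datalog database. The following Python sketch illustrates this with the standard alternating-fixpoint characterization of well-founded semantics, which is a swapped-in technique and not the $M_i$ / $SPF^*$ construction used in this paper. The tuple encoding of atoms, the convention that upper-case terms are variables, and the `_dot` suffix for dotted predicates are assumptions made for the illustration.

```python
from itertools import product

def ground(rules, constants):
    """Instantiate each rule with every assignment of constants to its variables.

    Atoms are tuples (predicate, term, ...); upper-case terms are variables.
    A rule is a triple (head, positive body atoms, negated body atoms).
    """
    program = []
    for head, pos, neg in rules:
        atoms = [head, *pos, *neg]
        variables = sorted({t for atom in atoms for t in atom[1:] if t.isupper()})
        for values in product(sorted(constants), repeat=len(variables)):
            env = dict(zip(variables, values))
            inst = lambda atom, env=env: (atom[0],) + tuple(env.get(t, t) for t in atom[1:])
            program.append((inst(head), [inst(a) for a in pos], [inst(a) for a in neg]))
    return program

def gamma(program, assumed_true):
    """Least model once negative literals are evaluated against assumed_true."""
    derived, changed = set(), True
    while changed:
        changed = False
        for head, pos, neg in program:
            if (head not in derived
                    and all(a in derived for a in pos)
                    and all(a not in assumed_true for a in neg)):
                derived.add(head)
                changed = True
    return derived

def well_founded(program):
    """Return (true facts, facts that are true or unknown)."""
    true = set()
    while True:
        possible = gamma(program, true)      # overestimate of the true facts
        new_true = gamma(program, possible)  # underestimate of the true facts
        if new_true == true:
            return true, possible
        true = new_true

# Associated Datalog database of the running example ("p_dot" is the dotted p).
pi_running = [
    (("r", "a", "b"), [], []),
    (("s", "b", "c"), [], []),
    (("p_dot", "a", "c"), [], []),
    (("q", "X", "Y", "Z"), [("r", "X", "Y"), ("s", "Y", "Z")], [("q_dot", "X", "Y", "Z")]),
    (("p", "X", "Z"), [("q", "X", "Y", "Z")], [("p_dot", "X", "Z")]),
    (("t", "X", "Z"), [("q", "X", "Y", "Z")], [("p", "X", "Z"), ("t_dot", "X", "Z")]),
]
# Associated Datalog database of the non-total example p(x) <- r(x), not p(x).
pi_nontotal = [
    (("r", "a"), [], []),
    (("p", "X"), [("r", "X")], [("p", "X"), ("p_dot", "X")]),
]

true, possible = well_founded(ground(pi_running, {"a", "b", "c"}))
true_nt, possible_nt = well_founded(ground(pi_nontotal, {"a"}))
unknown_nt = possible_nt - true_nt
```

On the running example, the true facts over the original predicates are r(a, b), s(b, c), q(a, b, c) and t(a, c), and p(a, c) is false. On the non-total example, p(a) ends up in `unknown_nt`, in agreement with the models given in the text.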
As another consequence of Theorem 2, the following lemma states that, given a consistent database and a fact f, if the database derives f (respectively ¬f) then storing f as being inserted (respectively as being deleted) does not change the model of the database.

Lemma 1. Let $\Delta$ = (I_FACTS, D_FACTS, RULES) and $\Delta'$ = (I_FACTS', D_FACTS', RULES) be two consistent databases over the same alphabet and having the same rules. Then, for every fact f:
1. If I_FACTS = I_FACTS' ∪ {f}, D_FACTS = D_FACTS' and if f is in $M(\Delta')$, then $M(\Delta) = M(\Delta')$.
2. If I_FACTS = I_FACTS', D_FACTS = D_FACTS' ∪ {f} and if ¬f is in $M(\Delta')$, then $M(\Delta) = M(\Delta')$.

Proof. We first note that the lemma is obvious if $\Delta = \Delta'$. We distinguish two cases, as follows:

Case 1: f does not belong to I_FACTS'. Clearly, $\Pi(\Delta)$ and $\Pi(\Delta')$ differ only by the fact that f is stored in $\Pi(\Delta)$ whereas f is not stored in $\Pi(\Delta')$. Since we assume f to be in $M(\Delta')$, the presence of f in $\Pi(\Delta)$ is not necessary in order to get f in the well-founded model $M^{wf}(\Delta)$ of $\Pi(\Delta)$. In other words, $M^{wf}(\Delta)$ can be computed in the same way as the well-founded model $M^{wf}(\Delta')$ of $\Pi(\Delta')$ is computed. Therefore, we have $M^{wf}(\Delta) = M^{wf}(\Delta')$, which entails by Theorem 2 that $M(\Delta) = M(\Delta')$.

Case 2: f does not belong to D_FACTS'. We first note that if f is a fact over an extensional predicate, then $\Pi(\Delta) = \Pi(\Delta')$. Thus, because of Theorem 2, $M(\Delta)$ and $M(\Delta')$ are equal. If, on the other hand, $f = p(\bar{a})$ is over an intensional predicate, $\Pi(\Delta)$ and $\Pi(\Delta')$ differ only by the fact that $\dot{p}(\bar{a})$ is stored in $\Pi(\Delta)$ whereas $\dot{p}(\bar{a})$ is not stored in $\Pi(\Delta')$. Since $\neg p(\bar{a})$ belongs to $M(\Delta')$, the presence of $\dot{p}(\bar{a})$ in $\Pi(\Delta)$ is not necessary in order to get $\neg p(\bar{a})$ in the well-founded model $M^{wf}(\Delta)$ of $\Pi(\Delta)$. Moreover, the predicate $\dot{p}$ is an extensional predicate in $\Pi(\Delta)$ and in $\Pi(\Delta')$. Thus, we have $M^{wf}(\Delta) = (M^{wf}(\Delta') \setminus \{\neg\dot{p}(\bar{a})\}) \cup \{\dot{p}(\bar{a})\}$, which entails by Theorem 2 that $M(\Delta) = M(\Delta')$. Therefore the lemma is proved. □
4. Update semantics

In our approach, inserting a fact f in a consistent database $\Delta$ = (I_FACTS, D_FACTS, RULES) means two things: (1) rendering f true in $\Delta$, and (2) leaving $\Delta$ consistent. In order to achieve these goals, we proceed as follows:
1. To render f true in $\Delta$, we simply put f in I_FACTS, and
2. To leave $\Delta$ consistent, we remove f from D_FACTS (if f was already there from a previous deletion).

Similarly, deleting a fact f from a consistent database $\Delta$ = (I_FACTS, D_FACTS, RULES) means two things: (1) rendering f false in $\Delta$, and (2) leaving $\Delta$ consistent. In order to achieve these goals, we proceed as follows:
1. To render f false in $\Delta$, we simply put f in D_FACTS, and
2. To leave $\Delta$ consistent, we remove f from I_FACTS (if f was already there from a previous insertion).
In the remainder of this section, we first state formally the definitions of insertion and deletion and then we proceed to study their properties.
4.1. Definitions

Definition 4. Let $\Delta$ = (I_FACTS, D_FACTS, RULES) be a consistent database and let f be a fact.

Insertion. The insertion of f in $\Delta$, denoted by $ins(\Delta, f)$ = (I_FACTS', D_FACTS', RULES), is a database defined by:

I_FACTS' = I_FACTS ∪ {f}, and D_FACTS' = D_FACTS \ {f}.

Deletion. The deletion of f from $\Delta$, denoted by $del(\Delta, f)$ = (I_FACTS', D_FACTS', RULES), is a database defined by:

I_FACTS' = I_FACTS \ {f}, and D_FACTS' = D_FACTS ∪ {f}.
In other words, in order to insert f you just add f to the inserted facts and you remove f from the deleted facts; in order to delete f you just remove f from the inserted facts and you add f to the deleted facts. There are two important points to note here:
1. Every insertion and every deletion is possible. Indeed, a consistent database remains consistent after every insertion or deletion.
2. The insertion or deletion of a fact requires simply the addition and the removal of a single fact (at most). No inference of facts is required during updating.

In contrast, in traditional approaches (see for example [6]):
1. Not every insertion or deletion is possible, and
2. In order to test the possibility of a given insertion or deletion, deductions against the database are necessary.

We therefore claim that, as far as processing is concerned, our approach to updating is conceptually simpler and more efficient than other approaches. In order to illustrate our approach, we consider now the case of updates over a recursive predicate. Let $\Delta$ = (I_FACTS, D_FACTS, RULES) be a database containing the following set of rules:

RULES:
  anc(x, y) ← par(x, y)
  anc(x, y) ← anc(x, z), par(z, y)

together with the following sets of facts:

I_FACTS = {par(a, b), par(b, c), par(c, d)}, and D_FACTS = ∅.
Assume now the insertion of anc(a, e). According to Definition 4, the updated database $\Delta_1$ = (I_FACTS$_1$, D_FACTS$_1$, RULES) is defined by

I_FACTS$_1$ = {par(a, b), par(b, c), par(c, d), anc(a, e)} and D_FACTS$_1$ = ∅.

Note that we simply "store the insertion," since the insertion contains no information about the fact(s) over par responsible for this update. If we now delete the fact anc(a, d) from $\Delta_1$, then we obtain a database $\Delta_2$ = (I_FACTS$_2$, D_FACTS$_2$, RULES) defined by

I_FACTS$_2$ = {par(a, b), par(b, c), par(c, d), anc(a, e)}, and D_FACTS$_2$ = {anc(a, d)}.
We note that, although the facts par(a, b), par(b, c) and par(c, d) are true in the database $\Delta_2$, anc(a, d) has become false. Intuitively, this corresponds to the fact that, since the deletion contains no information about the fact(s) over par responsible for it, we simply "store the deletion" and take appropriate actions (through the notion of exception) in order to derive consistent information from the updated database.

Of course, it is important to recall that, by requiring the storage of deleted facts, our approach has increased space requirements, compared to traditional approaches. However, there are two factors to be taken into account in this respect.
1. First, in certain systems, users may want to query the database for deleted facts and, in that case, the increased space requirements are the price to pay for the increased service offered by the database. It is important to note that, although temporal database approaches allow such queries, they do not address the problem of non-deterministic updates.
2. Second, the space requirements of our approach are of the same order as in temporal database approaches. Moreover, as will be seen in the next section, particular storage optimizations are possible in our model.
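Definition 4 amounts to two set operations per update, which the following Python sketch replays on the ancestor example. The `Database` type, the string encoding of facts and rules, and the function names are illustrative only; the sketch also assumes that consistency amounts to I_FACTS and D_FACTS being disjoint, and it performs no inference at all.

```python
from typing import NamedTuple

class Database(NamedTuple):
    i_facts: frozenset  # inserted facts
    d_facts: frozenset  # deleted facts
    rules: tuple        # rules, never touched by updates

def ins(db, f):
    """Insertion of f: add f to the inserted facts, drop it from the deleted facts."""
    return db._replace(i_facts=db.i_facts | {f}, d_facts=db.d_facts - {f})

def delete(db, f):
    """Deletion of f: drop f from the inserted facts, add it to the deleted facts."""
    return db._replace(i_facts=db.i_facts - {f}, d_facts=db.d_facts | {f})

def is_consistent(db):
    # Assumption: a database is consistent when no fact is both inserted and deleted.
    return not (db.i_facts & db.d_facts)

rules = ("anc(x, y) <- par(x, y)", "anc(x, y) <- anc(x, z), par(z, y)")
delta = Database(frozenset({"par(a, b)", "par(b, c)", "par(c, d)"}), frozenset(), rules)

delta1 = ins(delta, "anc(a, e)")      # store the insertion
delta2 = delete(delta1, "anc(a, d)")  # store the deletion
```

Here `delta2.i_facts` holds the three par facts together with anc(a, e), and `delta2.d_facts` holds only anc(a, d), as in the example above. Each update touches at most one element of each set, whatever the rules are.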
4.2. Properties of updates

In order to study the properties of updates we need a means of comparing two databases having the same underlying alphabet and the same set of rules. We choose to compare such databases with respect to their sets of true and false facts. It is important to note that the comparison between two databases $\Delta$ and $\Delta'$ makes sense only if the constants used to instantiate the rules in $\Delta$ are the same as the constants used to instantiate the rules in $\Delta'$. In other words, comparing $\Delta$ and $\Delta'$ requires that the associated Herbrand bases $HB_\Delta$ and $HB_{\Delta'}$ be equal. However, this restriction can be avoided if we instantiate the rules in $\Delta$ and in $\Delta'$ with respect to $HB_\Delta \cup HB_{\Delta'}$, since we only consider finite sets of constants. In what follows, we implicitly make use of this 'extended' way of instantiating the rules, when necessary.
Definition 3. Let $\Delta$ and $\Delta'$ be two consistent databases. Define:

Query-preordering. $\Delta$ is smaller than $\Delta'$, denoted by $\Delta \sqsubseteq \Delta'$, if $M^+(\Delta) \subseteq M^+(\Delta')$ and $M^-(\Delta') \subseteq M^-(\Delta)$.

Query-equivalence. $\Delta$ is query-equivalent to $\Delta'$, denoted by $\Delta \equiv \Delta'$, if $M(\Delta) = M(\Delta')$.

Clearly, $\Delta \equiv \Delta'$ iff $\Delta \sqsubseteq \Delta'$ and $\Delta' \sqsubseteq \Delta$. As an example of query-equivalent databases, let us consider the database $\Delta$ = (I_FACTS, D_FACTS, RULES) of our running example, where we have

I_FACTS: r(a, b), s(b, c), and D_FACTS: p(a, c),

and the database $\Delta_1$ = (I_FACTS$_1$, D_FACTS$_1$, RULES) defined by:

I_FACTS$_1$: r(a, b), s(b, c), q(a, b, c), and D_FACTS$_1$: p(a, c), r(a', c').
Clearly, $\Delta$ and $\Delta_1$ are query-equivalent because, with respect to $HB_\Delta \cup HB_{\Delta_1} = HB_{\Delta_1}$, we have:
• $M^+(\Delta) = M^+(\Delta_1) = \{r(a, b), s(b, c), q(a, b, c), t(a, c)\}$, and
• $M^-(\Delta) = M^-(\Delta_1) = HB_{\Delta_1} \setminus \{r(a, b), s(b, c), q(a, b, c), t(a, c)\}$.

We note that, from the point of view of storage optimization, one would probably prefer $\Delta$ to $\Delta_1$ as $\Delta$ contains fewer inserted and deleted facts than $\Delta_1$.

The following proposition states that insertions and deletions of facts are idempotent and commutative. The proof follows immediately from the definitions.
Proposition 4. For every consistent database $\Delta$ and for all facts f and f', we have:
1. $ins(ins(\Delta, f), f) = ins(\Delta, f)$, and $del(del(\Delta, f), f) = del(\Delta, f)$,
2. $ins(ins(\Delta, f), f') = ins(ins(\Delta, f'), f)$, and $del(del(\Delta, f), f') = del(del(\Delta, f'), f)$,
3. if f and f' are distinct then $del(ins(\Delta, f), f') = ins(del(\Delta, f'), f)$.

The requirement that f and f' be distinct in Proposition 4.3 above is indispensable. Indeed, if f = f' then we do not have $del(ins(\Delta, f), f) \equiv ins(del(\Delta, f), f)$, simply because f is true in $ins(del(\Delta, f), f)$ whereas f is false in $del(ins(\Delta, f), f)$ (see Proposition 2). However, the following proposition gives cases where $ins(del(\Delta, f), f)$ and $del(ins(\Delta, f), f)$ can be compared to $\Delta$.
Proposition 5. For every consistent database $\Delta$ and for every fact f, we have:
1. $del(ins(\Delta, f), f) \equiv \Delta$ iff $f \in M^-(\Delta)$.
2. $ins(del(\Delta, f), f) \equiv \Delta$ iff $f \in M^+(\Delta)$.

Proof. The proposition is easily shown from Lemma 1. Indeed, to show 1 above, let us denote by $\Delta'$ = (I_FACTS', D_FACTS', RULES) the database $del(ins(\Delta, f), f)$. Then, I_FACTS' = I_FACTS and D_FACTS' = D_FACTS ∪ {f}. Since f is in $M^-(\Delta)$, Lemma 1.2 applies and we find that $M(\Delta)$ and $M(\Delta')$ are equal. Therefore $\Delta$ and $\Delta'$ are query-equivalent. Applying Lemma 1.1, the query-equivalence between $\Delta$ and $ins(del(\Delta, f), f)$ can be shown in the same way as above, therefore the proposition is proved. □

The first property in Proposition 5 above is referred to as reversibility of insertion: if a previously false fact f is inserted in the database then subsequent deletion of f takes us back to the original database (up to query-equivalence). The second property above is referred to as reversibility of deletion: if a previously true fact f is deleted from the database then subsequent insertion of f takes us back to the original database (up to query-equivalence).

There is another desirable property of updates that is often cited in the literature [19,27,5,6,25,23], namely the property of monotonicity. This property requires that every true fact must remain true after any insertion and, similarly, every fact that is true after a deletion should have been true before the deletion. The following proposition states that, in our approach, if all rules are positive then monotonicity holds.
Proposition 6. Let $\Delta$ = (I_FACTS, D_FACTS, RULES) be a consistent database such that all rules are positive. Then for every fact f we have:

$\Delta \sqsubseteq ins(\Delta, f)$ and $del(\Delta, f) \sqsubseteq \Delta$.
Proof. The proposition is a consequence of Proposition 3 and of the monotonicity of the operator $T^*$. Indeed, since all rules are positive, $M^+(\Delta)$ is the least fixpoint of $T^*$ applied to I_FACTS, D_FACTS and RULES (see the proof of Proposition 3). On the other hand, since $ins(\Delta, f)$ contains more inserted facts than $\Delta$ and at most as many deleted facts as $\Delta$, $T^*$ applied to the updated database produces more facts than $T^*$ applied to $\Delta$. This shows that $\Delta \sqsubseteq ins(\Delta, f)$. A similar argument holds for deletions. □
Note. In the presence of non-positive rules monotonicity fails, as the following example shows. Consider a database $\Delta$ = (I_FACTS, D_FACTS, RULES) defined by:

I_FACTS: q(a)
D_FACTS: ∅
RULES: p(x) ← q(x), ¬r(x).

Then we have: $M^+(\Delta) = \{q(a), p(a)\}$ (since $\neg r(a) \in M(\Delta)$). Inserting r(a) in $\Delta$ gives a database $\Delta_1$ such that $M^+(\Delta_1) = \{q(a), r(a)\}$ (we do not have p(a), since r(a) is now in $M(\Delta_1)$). Thus $\Delta$ and $\Delta_1 = ins(\Delta, r(a))$ are not comparable following the preorder $\sqsubseteq$. On the other hand, we note that deleting r(a) from $\Delta_1$ gives the database $\Delta_2$ = (I_FACTS$_2$, D_FACTS$_2$, RULES) defined by:

I_FACTS$_2$: q(a), and D_FACTS$_2$: r(a).

Thus, we have $M^+(\Delta_2) = M^+(\Delta)$, showing that $\Delta_1$ and $\Delta_2 = del(\Delta_1, r(a))$ are not comparable either.
5. Storage optimization

In this section, we study possible storage optimization, i.e. we study cases where the fact to be inserted or deleted does not need to be stored in the database. These cases are first characterized with respect to the relation of query-equivalence. Then, we consider storage optimization with respect to a new equivalence relation, called update-equivalence, which refines the relation of query-equivalence by taking updates into account. Formally, the problem of storage optimization with respect to an equivalence relation $\simeq$ (the relation $\simeq$ being either the relation of query-equivalence or of update-equivalence) can be stated as follows: given a consistent database $\Delta$ and a fact f,
• In the case of the insertion of f in $\Delta$, let us denote by $\Delta_f^{ins}$ the database (I_FACTS', D_FACTS', RULES) where I_FACTS' = I_FACTS and where D_FACTS' = D_FACTS \ {f}. Clearly, the database $\Delta_f^{ins}$ is obtained from $\Delta$ by keeping unchanged the set I_FACTS and by modifying the set D_FACTS as in Definition 4. Following these notations, the insertion of f in $\Delta$ does not require the storage of f if $ins(\Delta, f) \simeq \Delta_f^{ins}$.
• Similarly, in the case of the deletion of f from $\Delta$, let us denote by $\Delta_f^{del}$ the database (I_FACTS', D_FACTS', RULES) where I_FACTS' = I_FACTS \ {f} and where D_FACTS' = D_FACTS. Clearly, the database $\Delta_f^{del}$ is obtained from $\Delta$ by modifying the set I_FACTS as in Definition 4 and by keeping unchanged the set D_FACTS. Following these notations, the deletion of f from $\Delta$ does not require the storage of f if $del(\Delta, f) \simeq \Delta_f^{del}$.
5.1. Storage optimization with respect to query-equivalence

Storage optimization with respect to query-equivalence is based on the following proposition.

Proposition 7. For every consistent database $\Delta$ and for every fact f, we have:
1. $\Delta \equiv ins(\Delta, f)$ iff $f \in M^+(\Delta)$.
2. $\Delta \equiv del(\Delta, f)$ iff $f \in M^-(\Delta)$.

Proof. To show 1 above, we first note that, if we assume $\Delta \equiv ins(\Delta, f)$, then by Proposition 2, f belongs to $M^+(\Delta)$. Conversely, denoting by $\Delta'$ = (I_FACTS', D_FACTS', RULES) the database $ins(\Delta, f)$, we have I_FACTS' = I_FACTS ∪ {f} and D_FACTS' = D_FACTS (because f cannot belong to D_FACTS since we assume f to be in $M^+(\Delta)$). Therefore, because of Lemma 1.1, $M(\Delta) = M(\Delta')$ which entails that $\Delta \equiv \Delta'$. The case of a deletion can be treated in a similar manner, if we apply Lemma 1.2. Thus the proposition is proved. □

Following Proposition 7, storage optimization with respect to query-equivalence is characterized as follows:
1. We can avoid storing a fact f in I_FACTS, provided that f is derived from the database before the update, and
2. We can avoid storing a fact f in D_FACTS, provided that ¬f is derived from the database before the update.

It is important to note that, in the above optimization, we are only concerned with the preservation of answers to queries in the updated database. However, it may be the case that query-equivalence is not preserved by "further" updates. We explain this point using our running example.

Running example (continued). We recall that the database $\Delta$ = (I_FACTS, D_FACTS, RULES) of our running example is the following:

I_FACTS:
r(a, b), s(b, c)
D_FACTS: p(a, c)
RULES:
  q(x, y, z) ← r(x, y), s(y, z)
  p(x, z) ← q(x, y, z)
  t(x, z) ← q(x, y, z), ¬p(x, z)

The fact ¬p(a', c') belongs to $M(\Delta)$, thus, following Proposition 7, deleting p(a', c') from $\Delta$ may be achieved without changing the database (recall that $M(\Delta)$ is a total model). In other words, $\Delta$ and $\Delta' = del(\Delta, p(a', c'))$ are query-equivalent. Unfortunately, this query-equivalence is not preserved if we insert r(a', b) and s(b, c') in both $\Delta$ and $\Delta'$. Indeed, we have:
• p(a', c') is true in $ins(ins(\Delta, r(a', b)), s(b, c'))$, whereas
• p(a', c') is false in $ins(ins(\Delta', r(a', b)), s(b, c'))$.
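The loss of query-equivalence can be replayed with a small Python sketch. It is not a general evaluator: `true_facts` hard-codes the three (non-recursive, stratified) rules of the running example and treats deleted facts as exceptions that block derivations, which is how the associated Datalog database of Section 3.2 behaves on this example. The pair encoding of databases, the tuple encoding of facts and all function names are assumptions made for the illustration.

```python
def ins(db, f):
    i_facts, d_facts = db
    return (i_facts | {f}, d_facts - {f})

def delete(db, f):
    i_facts, d_facts = db
    return (i_facts - {f}, d_facts | {f})

def true_facts(db):
    """True facts of the running example for db = (I_FACTS, D_FACTS).

    Facts are pairs (predicate, argument tuple); the rules are hard-coded.
    """
    i_facts, d_facts = db

    def keep(pred, derived):
        stored = {args for p, args in i_facts if p == pred}
        blocked = {args for p, args in d_facts if p == pred}
        return (set(derived) | stored) - blocked

    r = keep("r", ())
    s = keep("s", ())
    # q(x, y, z) <- r(x, y), s(y, z)
    q = keep("q", {(x, y, z) for x, y in r for y2, z in s if y == y2})
    # p(x, z) <- q(x, y, z)
    p = keep("p", {(x, z) for x, _, z in q})
    # t(x, z) <- q(x, y, z), not p(x, z)
    t = keep("t", {(x, z) for x, _, z in q if (x, z) not in p})
    return {(name, args) for name, rel in zip("rsqpt", (r, s, q, p, t)) for args in rel}

delta = ({("r", ("a", "b")), ("s", ("b", "c"))}, {("p", ("a", "c"))})
delta_prime = delete(delta, ("p", ("a'", "c'")))
same_before = true_facts(delta) == true_facts(delta_prime)

updated, updated_prime = delta, delta_prime
for f in [("r", ("a'", "b")), ("s", ("b", "c'"))]:
    updated, updated_prime = ins(updated, f), ins(updated_prime, f)

target = ("p", ("a'", "c'"))
in_updated = target in true_facts(updated)
in_updated_prime = target in true_facts(updated_prime)
```

Before the two insertions both databases have the same true facts (`same_before` holds). Afterwards p(a', c') is derived in the first database but remains blocked in the second, where t(a', c') becomes true instead.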
The above example shows why query-equivalence must be refined in order to take updates into account. Roughly speaking, under this new equivalence relation, which we call update-equivalence, two databases are equivalent if they are query-equivalent and if they remain query-equivalent after any sequence of forthcoming updates.
5.2. Storage optimization with respect to update-equivalence

We first define the notion of update-equivalence, which is based on the following notion of update sequence. Given a consistent database $\Delta$, we call update sequence any sequence $U = [(u_1, f_1), (u_2, f_2), \ldots, (u_n, f_n)]$ where, for $i = 1, 2, \ldots, n$, $u_i$ stands for ins or del and where $f_i$ is a fact. We denote by $\Delta_i$ the database $u_i(\ldots(u_2(u_1(\Delta, f_1), f_2), \ldots), f_i)$, for $i = 1, 2, \ldots, n$, and we denote by $U(\Delta)$ the database $\Delta_n$.

Definition 5 (Update-equivalence). Two consistent databases $\Delta$ and $\Delta'$ are said to be update-equivalent, denoted by $\Delta \sim \Delta'$, if for every update sequence U, $U(\Delta)$ and $U(\Delta')$ are query-equivalent.
The following proposition states that the deletion of a fact involving an extensional predicate can be optimized with respect to update-equivalence.

Proposition 8. Let $\Delta$ = (I_FACTS, D_FACTS, RULES) be a consistent database and let $f = p(\bar{a})$ be a fact. If p is an extensional predicate and if f is not in I_FACTS, then $\Delta$ and $del(\Delta, f)$ are update-equivalent.

Proof. First note that instantiated rules must be constructed with respect to the constants appearing in $\Delta$ and in f. With a slight abuse of notation, we still denote by $inst\_\Delta$ this set of instantiated rules. Since p is an extensional predicate and f is not in I_FACTS, $inst\_\Delta$ contains no rule whose head is f. Thus f cannot belong to any $SPF^*$ in the computation of $M(\Delta)$. Therefore, ¬f is in $M(\Delta)$, which entails that Lemma 1.2 applies to $\Delta$ and $del(\Delta, f)$. Thus, $\Delta \equiv del(\Delta, f)$.

Now, given an update sequence U and denoting by $\Delta'$ = (I_FACTS', D_FACTS', RULES) the database $del(\Delta, f)$, we prove that either the above reasoning still holds for $U(\Delta)$ and $U(\Delta')$ or that the update sequence U produces equal databases $U(\Delta)$ and $U(\Delta')$. Indeed, let us assume that $\Delta_{i-1}$ and $\Delta'_{i-1}$ are not equal and that $inst\_\Delta_{i-1}$ contains no rule whose head is f. Considering the ith update in U:
- If this update is not the insertion of f, then $inst\_\Delta_i$ contains no rule whose head is f and the above reasoning with $\Delta$ and $\Delta'$ still holds for $\Delta_i$ and $\Delta'_i$.
- If this update is the insertion of f, then f appears in $inst\_\Delta_i$. Since $\Delta_{i-1}$ and $\Delta'_{i-1}$ differ at most by the fact that f is stored as being deleted in $\Delta'_{i-1}$ and not in $\Delta_{i-1}$, the databases $\Delta_i$ and $\Delta'_i$ become equal.

Thus in either case, the databases $U(\Delta)$ and $U(del(\Delta, f))$ are query-equivalent, and the proposition is proved. □

It is important to note that Proposition 8 shows that, in our approach, we can treat deletions of facts over extensional predicates in the same way as in usual approaches, i.e. the deletion of a fact over an extensional predicate can be performed by simply removing it from the set of inserted facts. However, the following example shows that there are cases of storage optimizations of deletions other than those considered in the above proposition.
Example 1. Let us consider a consistent database $\Delta$ = (I_FACTS, D_FACTS, RULES) with the following set of rules:

RULES:
  p(x) ← q(x, y), ¬q(x, y)
  p(x) ← p(x), s(z, x)

Although p is not an extensional predicate, we show that the deletion of any fact of the form p(a) does not require the storage of p(a). Indeed, because of the first rule, any fact of the form p(a) is in $M^-(\Delta)$, except when this fact is in I_FACTS. Thus, when p(a) is not in I_FACTS, this fact is false, no matter whether it is stored in D_FACTS or not. In other words, during the lifetime of the database, every deletion of a fact involving p does not change the set D_FACTS, i.e. the set D_FACTS may never contain facts over predicate p.

The above example suggests that if a fact f is always true (respectively false), unless it is deleted (respectively inserted), then the insertion (respectively the deletion) of f can be optimized with respect to update-equivalence. We show in the next subsection that, in the case of stratifiable databases, these are the only cases of storage optimization with respect to update-equivalence. It is thus important to note that, even in the particular case of stratifiable databases, there is no hope to obtain interesting cases of storage optimization with respect to update-equivalence, other than those obtained by the above Proposition 8.
5.3. The case of stratifiable databases

Stratifiable logic programs, introduced in [4], allow negative literals in the bodies of rules under the following restriction on the rules: there exists a positive integer n such that the rules can be partitioned in n "strata" in such a way that, for every $i = 1, 2, \ldots, n$, for every rule r in the ith stratum, for every negative literal $\neg p(\bar{t})$ in body(r), there is no rule r' in strata of rank greater than or equal to i such that head(r') is a literal involving the predicate p. We refer to [4,10] for formal definitions of this well-known class of logic programs and we note that stratifiable databases are considered in many practical cases, including the case of negation-free sets of rules.

We recall here that stratifiable programs enjoy the important property that their semantics is computed by iterated least fixpoint computations, stratum by stratum [4]. Moreover, it can be shown from [23] that the well-founded model of a stratifiable program is total, since stratifiable programs are particular cases of effectively stratifiable programs and since effectively stratifiable programs are shown in [33] to have a total model under well-founded semantics. The following proposition shows that, in the present approach, every stratifiable database has a total model (as in the traditional approach).

Proposition 9. If $\Delta$ is a consistent and stratifiable database, then $M(\Delta)$ is a total model.

Proof. If the rules of $\Delta$ are stratifiable then the associated Datalog program $\Pi(\Delta)$, defined in Section 3.2, is stratifiable (because the dotted predicates are extensional). Thus the well-founded model of $\Pi(\Delta)$ is a total model (see [10,12,33]) and, by Theorem 2, we obtain that $M(\Delta)$ is a total model. □
Before we give our results about storage optimizations with respect to update-equivalence in the case where the database to be updated is stratifiable, we state the following lemma.

Lemma 2. Let $\Delta$ = (I_FACTS, D_FACTS, RULES) be a consistent database and let f be a fact. Let $inst\_\Delta(f)$ be the set of those rules in $inst\_\Delta$ whose head is f. If $M(\Delta)$ is a total model and if for every rule r in $inst\_\Delta(f)$, f belongs to body(r), then $\neg f \in M(\Delta)$.

Proof. Since every rule in $inst\_\Delta(f)$ has a non-empty body, f is not in I_FACTS. Moreover, since $M(\Delta)$ is total, $M(\Delta)$ contains one of the literals f or ¬f. Let us now assume that $f \in M^+(\Delta)$. As $f \notin$ I_FACTS, there exists $i > 0$ such that $f \in M_i^+(\Delta)$ and $f \notin M_{i-1}^+(\Delta)$. Therefore, there exists a rule r in $inst\_\Delta(f)$ such that $body(r) \subseteq M_{i-1}(\Delta)$, entailing that f is in $M_{i-1}^+(\Delta)$. This is a contradiction which shows that ¬f is in $M(\Delta)$. □

In order to state formally our results about storage optimization with respect to update-equivalence, it is important to recall that update-equivalence takes into account any forthcoming insertion or deletion in the database. Therefore, testing update-equivalence cannot be performed by considering only those constants that appear currently in the database, since other constants may appear, due to some insertion or deletion. So we must provide characterizations which hold for any finite subset of CONST containing the constants appearing in the current state of the database. To this end, we introduce the following additional notations.

Given a consistent database $\Delta$ = (I_FACTS, D_FACTS, RULES) and a finite set of constants C, we denote by $inst\_\Delta(C)$ the set I_FACTS together with all instantiations of the rules such that variables are assigned constants from C. We note that, following this notation, $inst\_\Delta$ is the set $inst\_\Delta(C)$ where C is exactly the set of all constants appearing in $\Delta$. Now, given a fact f we denote by $inst\_\Delta(C, f)$ the set of those instantiated rules of $inst\_\Delta(C)$ whose head is f. Similarly, we denote by $inst^*\_\Delta(C, f)$ the set of those instantiated rules of $inst\_\Delta(C, f)$ whose body does not contain f itself. Moreover, if C is the set of all constants appearing in $\Delta$, the sets $inst\_\Delta(C, f)$ and $inst^*\_\Delta(C, f)$ are denoted by $inst\_\Delta(f)$ and $inst^*\_\Delta(f)$, respectively.

Another notation which we need here is the following: given a rule r of the form $f \leftarrow L_1, L_2, \ldots, L_k$, we denote by $body_{fo}(r)$ the conjunction of those literals $L_i$ ($i = 1, 2, \ldots, k$) which belong to the set body(r), i.e. $body_{fo}(r)$ is the first-order formula $L_1 \wedge L_2 \wedge \cdots \wedge L_k$.

We show now that under the assumption that the database to be updated is stratifiable, storage optimizations with respect to update-equivalence can be characterized and applied to tractable practical cases. Unfortunately, as our results show, even in the restrictive case of stratifiable databases, storage optimization with respect to update-equivalence is possible only in very few cases.
Storage optimization for insertions. Roughly speaking, the characterization of storage optimization in the case of an insertion says that, when inserting a fact f, if the disjunction of the bodies of the rules in $inst\_\Delta(C, f)$ is a tautology, for any finite set of constants C, then f does not need to be stored in the updated database.

We recall that if $\Delta$ = (I_FACTS, D_FACTS, RULES) is a consistent database, we denote by $\Delta_f^{ins}$ = (I_FACTS', D_FACTS', RULES) the database defined by I_FACTS' = I_FACTS and D_FACTS' = D_FACTS \ {f}. It follows that some constants appearing in f, and thus in $ins(\Delta, f)$, may not appear in $\Delta_f^{ins}$. However, in order to formally compare $\Delta_f^{ins}$ and $ins(\Delta, f)$, we shall consider instantiations that involve constants appearing in the fact f, even if some of these constants do not appear in the database under consideration.
Theorem 3. Let A = (I_FACTS, D_FACTS, RULES) be a consistent and stratifiable database and let f be a fact not in I_FACTS. Then $ins(A, f) \sim A_f^{ins}$ if and only if the formula $\bigvee_{r \in inst_A(C, f)} body_{fo}(r)$ is a non-empty tautology for any finite set of constants C containing the constants in A and in f.

Proof. See Appendix B at the end of the paper.

We note from the above theorem that there are few cases where optimizations for insertions are possible. In particular, it turns out that if all rules are positive, then no insertion can be optimized with respect to update-equivalence. The following example shows a case where Theorem 3 applies.

Example 2. Let us consider a consistent database A = (I_FACTS, D_FACTS, RULES) with the following set of stratifiable rules:

RULES:
$p(x) \leftarrow q(x, y)$
$p(x) \leftarrow \neg q(x, y)$

It follows from Theorem 3 that the insertion of any fact of the form p(a) does not require the storage of p(a). In other words, during the lifetime of the database, no insertion of a fact involving p changes the set I_FACTS, i.e. the set I_FACTS never contains facts over predicate p. Indeed, for every finite set of constants C containing the constants in A and a, the formula $\bigvee_{r \in inst_A(C, f)} body_{fo}(r)$ is here $\bigvee_{\alpha \in C} [q(a, \alpha) \vee \neg q(a, \alpha)]$. This is clearly a non-empty tautology, which shows that Theorem 3 applies here.

It is important to note that, from a computational point of view, testing the applicability of Theorem 3 is impossible in general. However, Example 2 suggests that, in some cases, we may be able to test the applicability of Theorem 3 in an efficient manner. To show this important point, let $f = p(\tilde{a})$ be the fact to be inserted, and perform the following steps on the rules of the database:

1. For every rule r whose head is $p(\tilde{t})$, instantiate only the variables in $\tilde{t}$, to obtain a rule whose head is $p(\tilde{a})$. Let $r_1, r_2, \ldots, r_k$ be the resulting partially instantiated rules.
2. For every n-ary predicate q in the body of at least one rule $r_i$ (i = 1, 2, ..., k), consider n distinct variables, one for each entry in the list of parameters of q. Let $v_1^q, v_2^q, \ldots, v_n^q$ denote these distinguished variables.
3. For every n-ary predicate q, for every occurrence $q(t_1, t_2, \ldots, t_n)$ in the body of a rule $r_i$ (i = 1, 2, ..., k), and for every j = 1, 2, ..., n do the following: if $t_j$ is a non-distinguished variable then replace each occurrence of $t_j$ in $body(r_i)$ by the distinguished variable $v_j^q$.

Let RULES(f) denote the resulting set of partially instantiated rules. Note that, in step 2 above, the rules are not changed and that the aim of step 3 is to replace all variables by distinguished variables, while preserving the constants as well as the bindings between variables. For example, steps 2 and 3 would transform an atom such as p(a, x, x, y) into an atom in which the constant a is kept, both occurrences of x are replaced by one and the same distinguished variable of p, and y is replaced by another distinguished variable of p.

For instance, in Example 2, with f = p(a), we have RULES(f) = $\{p(a) \leftarrow q(a, v_2^q),\ p(a) \leftarrow \neg q(a, v_2^q)\}$. Moreover, we note that $\bigvee_{r \in RULES(f)} body_{fo}(r) = q(a, v_2^q) \vee \neg q(a, v_2^q)$, which is a non-empty tautology. As suggested above, the following proposition shows that, in general, if $\bigvee_{r \in RULES(f)} body_{fo}(r)$ is a non-empty tautology, then Theorem 3 applies. It is important to note that this provides a sufficient condition for applying Theorem 3 in an efficient manner.
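Steps 1 to 3 above can be sketched as follows. This is an illustrative reading of the construction, not the authors' implementation: the rule encoding, the "?" prefix marking variables and the naming scheme "?v{j}_{q}" for the distinguished variable $v_j^q$ are all assumptions of the sketch, and step 3 is applied literally as stated (each non-distinguished variable is renamed according to its first occurrence in the body).

```python
# Hypothetical encoding (assumption): atom = (predicate, terms),
# literal = (is_positive, atom), rule = (head_atom, body_literals).
# Terms starting with "?" are variables; other terms are constants.

def is_var(term):
    return term.startswith("?")

def match_head(head_terms, constants):
    """Step 1 helper: bind the head variables to the constants of the fact, or fail."""
    theta = {}
    for term, const in zip(head_terms, constants):
        if is_var(term):
            if theta.setdefault(term, const) != const:
                return None  # same variable would need two different constants
        elif term != const:
            return None      # constant in the head differs from the fact
    return theta

def rules_for_fact(rules, fact):
    """Partially instantiated rules for `fact`, following steps 1-3."""
    pred, constants = fact
    result = []
    for (head_pred, head_terms), body in rules:
        if head_pred != pred or len(head_terms) != len(constants):
            continue
        theta = match_head(head_terms, constants)
        if theta is None:
            continue
        # Step 1: instantiate only the variables occurring in the head.
        body = [(sign, (q, tuple(theta.get(t, t) for t in terms)))
                for sign, (q, terms) in body]
        # Steps 2 and 3: rename every remaining variable to the distinguished
        # variable attached to the predicate and position of its first occurrence.
        renaming = {}
        for _, (q, terms) in body:
            for j, term in enumerate(terms, start=1):
                if is_var(term) and term not in renaming:
                    renaming[term] = f"?v{j}_{q}"
        body = [(sign, (q, tuple(renaming.get(t, t) for t in terms)))
                for sign, (q, terms) in body]
        result.append((fact, body))
    return result

# Rules of Example 2:  p(x) :- q(x, y)   and   p(x) :- not q(x, y)
RULES = [
    (("p", ("?x",)), [(True, ("q", ("?x", "?y")))]),
    (("p", ("?x",)), [(False, ("q", ("?x", "?y")))]),
]
for rule in rules_for_fact(RULES, ("p", ("a",))):
    print(rule)
```

On the rules of Example 2 with f = p(a), both bodies end up over the single atom q(a, ?v2_q), which corresponds to the set RULES(f) given in the text.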
Proposition 10. Let A = (I_FACTS, D_FACTS, RULES) be a consistent and stratifiable database and let f be a fact not in I_FACTS. Then $ins(A, f) \sim A_f^{ins}$ if the formula $\bigvee_{r \in RULES(f)} body_{fo}(r)$ is a non-empty tautology.
Proof. Following Theorem 3, we have to show that, for every set of constants C containing the constants in A and in f, if $\bigvee_{r \in RULES(f)} body_{fo}(r)$ is a non-empty tautology, then so is $\bigvee_{r \in inst_A(C, f)} body_{fo}(r)$. In order to prove this result, we first show that inst_A(C, f) is precisely the set of all instantiations with respect to C of the rules in RULES(f). Let inst(C, RULES(f)) be this set of instantiated rules. Since RULES(f) is a set of partially instantiated rules of RULES whose heads are f, we have $inst(C, RULES(f)) \subseteq inst_A(C, f)$. Conversely, let r be in inst_A(C, f). Then, there exist r' in the set RULES and an instantiation $\theta$ of the variables such that $r = r'\theta$ ($r'\theta$ denotes the instantiation of r' obtained by replacing each occurrence of every variable x in r' by the constant $\theta(x)$). Moreover, denoting by $\theta_1$ the partial instantiation which defines the set RULES(f), $r'\theta_1$ is in RULES(f). Therefore, there is an instantiation $\theta'$ of the distinguished variables in $r'\theta_1$ such that $(r'\theta_1)\theta' = r$, which shows that r belongs to inst(C, RULES(f)). Therefore, we have inst(C, RULES(f)) = inst_A(C, f). Moreover, if we assume that $\bigvee_{r \in RULES(f)} body_{fo}(r)$ is a non-empty tautology then so is $\bigvee_{r \in inst(C, RULES(f))} body_{fo}(r)$, for every set of constants C. Therefore, we obtain that if $\bigvee_{r \in RULES(f)} body_{fo}(r)$ is a non-empty tautology then so is $\bigvee_{r \in inst_A(C, f)} body_{fo}(r)$, which completes the proof of the proposition.
[]
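The sufficient condition of Proposition 10 amounts to a single tautology test on a disjunction of conjunctions. A minimal sketch of such a test is given below, under the assumption (made for this illustration only) that each distinct atom, including atoms containing distinguished variables, is treated as an independent propositional variable and that the check is done by exhaustive truth-table enumeration, which is only practical for small rule sets.

```python
from itertools import product

def is_nonempty_tautology(bodies):
    """True iff `bodies` is non-empty and the disjunction of its conjunctions
    holds under every truth assignment of the atoms involved.

    Each body is a list of (is_positive, atom) pairs; atoms are plain strings."""
    if not bodies:
        return False  # an empty disjunction is never a non-empty tautology
    atoms = sorted({atom for body in bodies for _, atom in body})
    for values in product([False, True], repeat=len(atoms)):
        valuation = dict(zip(atoms, values))
        # The disjunction holds if at least one body has all its literals satisfied.
        if not any(all(valuation[atom] == sign for sign, atom in body)
                   for body in bodies):
            return False
    return True

# RULES(f) of Example 2 for f = p(a): bodies q(a, v2) and not q(a, v2).
example_2 = [[(True, "q(a,v2)")], [(False, "q(a,v2)")]]
# A purely positive rule set: no insertion can be optimized.
positive_only = [[(True, "q(a,v2)")], [(True, "r(a)")]]

print(is_nonempty_tautology(example_2))      # prints: True
print(is_nonempty_tautology(positive_only))  # prints: False
```

The second call illustrates the remark following Theorem 3: when all rules are positive, the assignment making every atom false falsifies the disjunction.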
Storage optimization for deletions. In the case of deletions, the following theorem states that, when deleting a fact f, if the disjunction of the bodies of the rules in inst*_A(C, f) is always false for every finite set of constants C containing the constants in A and in f, then f does not need to be stored in the updated database. We recall that if A = (I_FACTS, D_FACTS, RULES) is a consistent database, we denote by $A_f^{del}$ = (I_FACTS', D_FACTS', RULES) the database defined by I_FACTS' = I_FACTS \ {f} and D_FACTS' = D_FACTS. Moreover, as in the case of insertions, in order to formally compare $A_f^{del}$ and del(A, f), we shall consider instantiations that involve constants appearing in the fact f, even if some of these constants do not appear in the database under consideration.
Theorem 4. Let A = (I_FACTS, D_FACTS, RULES) be a consistent and stratifiable database and let f be a fact not in D_FACTS. Then $del(A, f) \sim A_f^{del}$ if and only if the formula $\neg[\bigvee_{r \in inst^*_A(C, f)} body_{fo}(r)]$ is a (possibly empty) tautology for every finite set of constants C containing the constants in A and in f.

Proof. See Appendix C at the end of the paper.
Example 1 (continued): We recall that the database A = (I_FACTS, D_FACTS, RULES) of this example contains the following rules:

RULES:
$p(x) \leftarrow q(x, y), \neg q(x, y)$
$p(x) \leftarrow p(x), s(z, x)$

If we consider the deletion of the fact p(a), for every finite set of constants C containing the constants in A and a, the formula $\neg[\bigvee_{r \in inst^*_A(C, f)} body_{fo}(r)]$ is here $\neg[\bigvee_{\alpha \in C} (q(a, \alpha) \wedge \neg q(a, \alpha))]$, since no instantiation of the second rule can belong to inst*_A(C, f). This is clearly a tautology, which shows that Theorem 4 applies here.

As in the case of insertion, we show, as a consequence of Theorem 4 above, that there are sufficient (and non-trivial) conditions under which deletion optimization can be efficiently tested. For this, we consider the set RULES(f) from which we eliminate those rules whose bodies contain the fact f. Let RULES*(f) be the resulting set. For instance, in Example 1 above, with f = p(a), we have RULES(f) = $\{p(a) \leftarrow q(a, v_2^q), \neg q(a, v_2^q);\ p(a) \leftarrow p(a), s(v_1^s, a)\}$ and RULES*(f) = $\{p(a) \leftarrow q(a, v_2^q), \neg q(a, v_2^q)\}$. Moreover, as $\bigvee_{r \in RULES^*(f)} body_{fo}(r) = q(a, v_2^q) \wedge \neg q(a, v_2^q)$, the formula $\neg[\bigvee_{r \in RULES^*(f)} body_{fo}(r)]$ is a tautology. As suggested above, if $\neg[\bigvee_{r \in RULES^*(f)} body_{fo}(r)]$ is a non-empty tautology, then we have a sufficient condition for applying Theorem 4 in an efficient manner. This is shown in the following proposition.

Proposition 11. Let A = (I_FACTS, D_FACTS, RULES) be a consistent and stratifiable database and let f be a fact not in D_FACTS. Then $del(A, f) \sim A_f^{del}$ if the formula $\neg[\bigvee_{r \in RULES^*(f)} body_{fo}(r)]$ is a (possibly empty) tautology.

Proof. The proof of this proposition being similar to the proof of Proposition 10, we omit it.

We conclude our considerations on storage optimizations under update-equivalence with the following two important remarks:

1. Cases of storage optimization with respect to update-equivalence happen only in some rare situations where the rules are written in a careless and unusual way. We refer to Examples 1 and 2 for such cases. Actually, we can consider that the only case of interest is that of the deletion of a fact over an extensional predicate (see Proposition 8). Moreover, Theorems 3 and 4 show that, even in the restricted case of stratifiable databases, there is no hope of finding other cases of storage optimizations with respect to update-equivalence.
2. Our assumption that the database is stratifiable has been used in order to deal only with databases having a total model. The assumption of a total model is indispensable in our results, as shown in the following example.
Example 3. We consider two databases A = (I_FACTS, D_FACTS, RULES) and A' = (I_FACTS', D_FACTS', RULES), having the same set of (instantiated) rules, and defined by:

I_FACTS = $\emptyset$ and D_FACTS = $\emptyset$
I_FACTS' = {g} and D_FACTS' = $\emptyset$
RULES:
$f \leftarrow g$
$f \leftarrow \neg g$
$g \leftarrow \neg g$
It is important to note that neither A nor A' is stratifiable, because of the third rule. Moreover, we note that $\bigvee_{r \in inst_A(C, f)} body_{fo}(r)$ and $\bigvee_{r \in inst_{A'}(C, f)} body_{fo}(r)$ are non-empty tautologies. However, we show that the insertion of f in A cannot be optimized, whereas this insertion performed against A' can be optimized.

Indeed, we have $M(A_f^{ins}) = \emptyset$ (since $A_f^{ins} = A$) and M(ins(A, f)) = {f}. These databases are not query-equivalent, thus they are not update-equivalent either.

On the other hand, we have $M(A'^{ins}_f) = M(ins(A', f)) = \{f, g\}$. Thus $A'^{ins}_f$ and ins(A', f) are query-equivalent. We show now that they remain query-equivalent after every update sequence U. Indeed, assuming that before the ith update in U, the databases are query-equivalent, we have:
- If the ith update in U is the insertion or the deletion of f, then the databases become equal, thus query-equivalent.
- If the ith update in U is the insertion or the deletion of g, then f remains true in both databases, and thus, they remain query-equivalent.
- If the ith update in U is the insertion or the deletion of a fact other than f or g, then the databases remain query-equivalent, essentially for the same reasons as in the case of stratifiable databases. This is so because, in this case, we are dealing with databases having total models.

Therefore $A'^{ins}_f$ and ins(A', f) are update-equivalent. This example shows that in the case of non-stratifiable databases, the characterization of Theorem 3 does not apply, and it can be seen that similar examples also exist for deletions. Therefore, we also have that, in the case of non-stratifiable databases, the characterization of Theorem 4 does not apply.
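The three models quoted in Example 3 can be re-derived mechanically. The sketch below computes the standard well-founded model of a ground program by the alternating fixpoint construction; using it here relies on two assumptions: that D_FACTS is empty in both databases, and that, as stated by Theorem 2, the model M then agrees with the well-founded model. The rule encoding and function names are hypothetical.

```python
def least_model(rules, assumed_true):
    """Least model of the rules when 'not a' is read as: a is not in `assumed_true`.

    A rule is (head, positive_atoms, negated_atoms); a fact has two empty lists."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos, neg in rules:
            if (head not in model and model.issuperset(pos)
                    and not assumed_true.intersection(neg)):
                model.add(head)
                changed = True
    return model

def well_founded(rules, atoms):
    """Return (true_atoms, false_atoms) of the well-founded model."""
    true = set()
    while True:
        possible = least_model(rules, true)      # overestimate of the true atoms
        new_true = least_model(rules, possible)  # underestimate of the true atoms
        if new_true == true:
            return true, atoms - possible
        true = new_true

ATOMS = {"f", "g"}
# RULES of Example 3:  f :- g.   f :- not g.   g :- not g.
RULES = [("f", ["g"], []), ("f", [], ["g"]), ("g", [], ["g"])]

db_a = RULES                             # I_FACTS is empty
db_a_ins_f = RULES + [("f", [], [])]     # ins(A, f): f stored as a fact
db_a_prime = RULES + [("g", [], [])]     # I_FACTS' = {g}

print(well_founded(db_a, ATOMS))         # no true atom, no false atom
print(well_founded(db_a_ins_f, ATOMS))   # only f is true
print(well_founded(db_a_prime, ATOMS))   # f and g are true
```

In the first database both f and g stay undefined, which is why storing f changes the answers to queries, whereas in the second database f is already derivable from the stored fact g.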
6. Concluding remarks
We have seen a new method for updating deductive databases which allows the insertion or deletion of a fact over intensional predicates in a deterministic manner. The main contribution of our approach is that, contrary to most other approaches, inserting or deleting facts over intensional predicates can always be accomplished without having to make choices. There are two important differences between our approach and traditional approaches, namely, in our approach,
- we record both inserted and deleted facts, and
- we accept the storage of facts over both intensional and extensional predicates.

By doing so, our approach makes use of all the information provided by the user. In contrast, traditional approaches
- record the inserted facts but not the deleted facts, and
- accept the storage of facts only over extensional predicates.

As a result, traditional approaches use only part of the information provided by the user, and this is, in our opinion, the main cause of non-determinism during updating.

In our approach, however, the storage of deleted facts introduces a significant overhead. There are two factors to be taken into account in this respect. First, in certain systems (such as in AI systems), users may want to query the database for deleted or false facts; in that case, the increased space requirement is the price to pay for the service offered by the database. Second, as we have seen, some (limited) storage optimization is possible with respect to query-equivalence. In the case of update-equivalence, storage optimization is even more limited and requires the assumption of a total model. In this respect, it would be interesting to consider our results in the context of effectively stratifiable databases which are precisely those databases that have total models (see [12]). We are currently investigating this research direction, based on [13].

Another way to consider storage optimization is to change the set of rules in such a way that the semantics of the database is preserved while storing fewer facts. In [34], for example, we use Inductive Logic Programming techniques in order to "learn" new rules when the database contains too many deleted facts. The goal of the learning phase is to compute a new set of rules so that the semantics of the database is preserved, while storing as few facts (inserted or deleted) as possible.

Finally, we would like to mention some research directions that have been explored, based on the material of the present paper. In [26], we combine ideas from [25] and the present paper into a more general setting. According to [26], a database is seen as a quintuple $(INS, DEL, \xi, \pi, \nu)$ where:
- INS is the set of inserted information,
- DEL is the set of deleted information,
- $\xi$ is a monotonic operator computing the exceptions, based on the set DEL,
- $\pi$ is a monotonic operator computing true information, based on the sets INS and DEL,
- $\nu$ is a monotonic operator computing false information, based on the sets INS and DEL.

The case of Datalog$^{neg}$ databases, dealt with in the present paper, appears as a particular case of the general framework of [26], if we consider:
- INS as the set I_FACTS,
- DEL as the set D_FACTS,
- $\xi$ as the identity operator,
- $\pi$ as the operator T* and
- $\nu$ as the operator GUS.

Currently, we are working on a generalization of the notion of exceptions as presented in this paper. In [21], we consider true and false exceptions, obtained as a least fixpoint associated to special rules, called update rules. An update rule is of the form $L_0 \leftarrow L_1$, where $L_0$ and $L_1$ are positive or negative literals.
The intuitive meaning of such rules is to specify the side effects that are to be associated with a given update. An interesting aspect of this extension is its close relationship with integrity constraints and non-monotonic reasoning. Finally, the problem of storage optimization under query-equivalence in the presence of update rules is under investigation. Preliminary work on this topic can be found in [24].
Acknowledgements
The authors wish to thank Nicole Bidoit, Gérard Ferrand and the anonymous referees for their helpful comments and suggestions on a preliminary draft of this paper.
Appendix A

Proof of Theorem 2

Theorem 2. Let A be a consistent database and let $\Pi(A)$ be the associated database. Then, for every predicate p in PRED:
• $p(\tilde{a}) \in M(A)$ iff $p(\tilde{a}) \in M^{wf}(A)$, and
• $\neg p(\tilde{a}) \in M(A)$ iff $\neg p(\tilde{a}) \in M^{wf}(A)$.

Proof. Let us call M'(A) the set of those positive or negative facts in $M^{wf}(A)$ that involve a predicate in PRED. Therefore, we have to show that M(A) and M'(A) are equal. For this, recall that M(A) and $M^{wf}(A)$ are inductively computed through the sequences $[M_i]_{i \geq 0}$ and $[M_i^{wf}]_{i \geq 0}$, respectively. Finally, for every i, we denote by $M'_i$ the set of those positive or negative facts in $M_i^{wf}$ that involve a predicate in PRED. We prove successively the inclusions (i) $M'(A) \subseteq M(A)$, and (ii) $M(A) \subseteq M'(A)$.

(i) First, we have $M'_0 \subseteq M_0$ since $M'_0 = \emptyset$ and $M_0$ = I_FACTS $\cup\ \neg$.D_FACTS. If we assume that, for some integer k, we have $M'_i \subseteq M_i$ for all $0 \leq i < k$, then we can show that $M'_k \subseteq M_k$. Indeed, if f is in $M'_k \setminus M'_{k-1}$, then $inst\_\Pi(A)$ contains a rule r whose head is f and whose body is in $M^{wf}_{k-1}$. Thus, the literals in body(r) over predicates in PRED are in $M'_{k-1}$. Therefore, following our induction hypothesis and the construction of $\Pi(A)$, inst_A contains a rule r' whose head is f and whose body is a subset of $M_{k-1}$. Moreover, $f = p(\tilde{a})$ is not deleted. Indeed, if f is deleted then $\hat{p}(\tilde{a})$ is in F, which entails that $\neg\hat{p}(\tilde{a})$ is not in $M^{wf}_{k-1}$. Therefore, if f is deleted then $p(\tilde{a})$ is not in $M'_k$ (by construction of $\Pi(A)$). As a consequence, by definition of T*, f belongs to $M_k$.

Now, assume that $\neg f$ belongs to $M'_k \setminus M'_{k-1}$. We show that $\neg f$ belongs to $M_k$ by showing that $SPF^*(M_{k-1}) \subseteq SPF_{\Pi(A)}(M^{wf}_{k-1})$. Indeed, the above two sets are computed through the sequences $[SPF^*_i(M_{k-1})]_{i \geq 0}$ and $[SPF_i(M^{wf}_{k-1})]_{i \geq 0}$, respectively. The inclusion is proved by induction on i. First, we have $SPF^*_0(M_{k-1}) \subseteq SPF_0(M^{wf}_{k-1})$. Indeed, recall that
$SPF^*_0(M_{k-1}) = \{head(r) \mid r \in inst\_A \wedge pos(body(r)) = \emptyset \wedge \forall L \in body(r),\ \neg L \notin M_{k-1}\} \setminus$ D_FACTS.
Thus, for every $f = p(\tilde{a})$ in $SPF^*_0(M_{k-1})$, f is not deleted. Moreover, by construction of $\Pi(A)$, $inst\_\Pi(A)$ contains a rule r' such that $pos(body(r')) = \emptyset$ and such that, for all L in body(r') over a predicate in PRED, $\neg L \notin M'_{k-1}$ (since we assume $M'_{k-1} \subseteq M_{k-1}$). If, on the other hand, we consider L in $body(r') \setminus body(r)$, then L is the literal $\neg\hat{p}(\tilde{a})$. Since $f = p(\tilde{a})$ is not deleted, $\hat{p}(\tilde{a})$ is not in F, which entails that $\neg\hat{p}(\tilde{a})$ is in $M^{wf}_{k-1}$ (because $\hat{p}$ is an extensional predicate). Thus, there is a rule r' in $inst\_\Pi(A)$ such that head(r') = f, $pos(body(r')) = \emptyset$ and, for all L in body(r'), $\neg L \notin M^{wf}_{k-1}$. Therefore, f is in $SPF_0(M^{wf}_{k-1})$, showing that $SPF^*_0(M_{k-1}) \subseteq SPF_0(M^{wf}_{k-1})$.

Now, assuming that $SPF^*_j(M_{k-1}) \subseteq SPF_j(M^{wf}_{k-1})$ for $0 \leq j < l$, let us show that $SPF^*_l(M_{k-1}) \subseteq SPF_l(M^{wf}_{k-1})$. Indeed, recall that
$SPF^*_l(M_{k-1}) = \{head(r) \mid r \in inst\_A \wedge pos(body(r)) \subseteq SPF^*_{l-1}(M_{k-1}) \wedge \forall L \in body(r),\ \neg L \notin M_{k-1}\} \setminus$ D_FACTS.
So, for each fact f in $SPF^*_l(M_{k-1})$, $inst\_\Pi(A)$ contains a rule r' whose head is f and such that pos(body(r')) = pos(body(r)). Thus, $pos(body(r')) \subseteq SPF^*_{l-1}(M_{k-1}) \subseteq SPF_{l-1}(M^{wf}_{k-1})$. Now let L be in body(r'). If L is also in body(r), then $\neg L$ is not in $M_{k-1}$, showing that $\neg L$ is not in $M^{wf}_{k-1}$. If, on the other hand, L belongs to $body(r') \setminus body(r)$, then L is $\neg\hat{p}(\tilde{a})$ and, as $f = p(\tilde{a})$ is in $SPF^*_l(M_{k-1})$, $\hat{p}(\tilde{a})$ is not in F. Hence, $\hat{p}(\tilde{a})$ is not in $M^{wf}_{k-1}$, showing that f belongs to $SPF_l(M^{wf}_{k-1})$. Summarizing the above proof, we have $SPF^*_l(M_{k-1}) \subseteq SPF_l(M^{wf}_{k-1})$ for every integer l, thus we have $SPF^*(M_{k-1}) \subseteq SPF_{\Pi(A)}(M^{wf}_{k-1})$. As a consequence, we have $M'^{-}_k \subseteq M^{-}_k$, which entails that $M'_k \subseteq M_k$. Therefore, we have $M'(A) \subseteq M(A)$.
(ii) Now, we show the inverse inclusion, that is $M(A) \subseteq M'(A)$. More precisely, we show that for every i, there exists j such that $M_i \subseteq M'_{i+j}$. This holds for i = 0 because of Proposition 1 and because, when computing the well-founded model of $\Pi(A)$, it can be seen that every inserted fact belongs to $M_i^{wf}$ for $i \geq 1$ and that, for every deleted fact f, $\neg f$ belongs to $M_i^{wf}$ for i > 1. Indeed, every inserted fact is in $\Pi(A)$ and, for every deleted fact $p(\tilde{a})$, $\hat{p}(\tilde{a})$ is in $\Pi(A)$. Thus, for $i \geq 1$, $\hat{p}(\tilde{a})$ is in $M_i^{wf}(A)$. Moreover, because of the literals $\neg\hat{p}(\tilde{t})$ in the bodies of the rules whose head is $p(\tilde{t})$, the fact $p(\tilde{a})$ is unfounded, showing that $\neg p(\tilde{a})$ is in $M_i^{wf}(A)$, for i > 1.

By induction, let us assume that for all i < k, there exists j such that $M_i \subseteq M'_{i+j}$ and let us prove the existence of an integer j' such that $M_k \subseteq M'_{k+j'}$. First, let f be in $M_k$ such that $p(\tilde{a}) = f = head(r)$ for some rule r in inst_A. Then f is not in D_FACTS and for all L in body(r), L belongs to $M_{k-1}$. Therefore, there is some j' such that all L in body(r) are in $M'_{k-1+j'}$. Moreover, denoting by r' the associated rule in $inst\_\Pi(A)$, the literal $\neg\hat{p}(\tilde{a})$ is in $M^{wf}_{k-1+j'}$ (since $f = p(\tilde{a})$ is not in D_FACTS). Therefore f belongs to $M'_{k+j'}$.

Now, we show that every negative fact involving a predicate in PRED in $SPF_{\Pi(A)}$ [...]

$SPF_0(M^{wf}_{k+j'-1}) = \{head(r) \mid r \in inst\_\Pi(A) \wedge pos(body(r)) = \emptyset \wedge \forall L \in body(r),\ \neg L \notin M^{wf}_{k+j'-1}\}$.
Thus, given a fact $f = p(\tilde{a})$ in $SPF_0(M^{wf}_{k+j'-1})$ where p is a predicate in PRED, there is a rule r' in inst_A such that head(r') = f, $pos(body(r')) = \emptyset$ and, for all L in body(r'), $\neg L$ does not belong to $M_{k-1}$ (because $M_{k-1} \subseteq M'_{k-1+j'} \subseteq M^{wf}_{k-1+j'}$). Moreover, if f belongs to D_FACTS, then $\hat{p}(\tilde{a})$ is in $M^{wf}_{k+j'-1}$. As $\neg\hat{p}(\tilde{a})$ is in body(r), f does not belong to $SPF_0(M^{wf}_{k+j'-1})$, a contradiction which shows that f is not in D_FACTS. Thus f belongs to $SPF^*_0(M_{k-1})$. The induction step is as above, because for i > 0, the only difference with respect to the case i = 0 is that pos(body(r')) is in $SPF_{i-1}(M^{wf}_{k+j'-1})$. As our induction hypothesis is that every fact in $SPF_{i-1}(M^{wf}_{k+j'-1})$ involving a predicate in PRED is in $SPF^*_{i-1}(M_{k-1})$, the proof is complete. Therefore, we have shown the inclusion of M(A) in M'(A), which terminates the proof of the theorem. []
Appendix B

Proof of Theorem 3

Theorem 3. Let A = (I_FACTS, D_FACTS, RULES) be a consistent and stratifiable database and let f be a fact not in I_FACTS. Then $ins(A, f) \sim A_f^{ins}$ if and only if the formula $\bigvee_{r \in inst_A(C, f)} body_{fo}(r)$ is a non-empty tautology for any finite set of constants C containing the constants in A and in f.
Proof. We first show that if $\bigvee_{r \in inst_A(C, f)} body_{fo}(r)$ is a tautology for any finite set of constants C containing the constants in A and in f, then ins(A, f) and $A_f^{ins}$ are query-equivalent. Let $A_f^{ins}$ = (I_FACTS', D_FACTS', RULES) and let ins(A, f) = (I_FACTS'', D_FACTS'', RULES). Then I_FACTS'' = I_FACTS $\cup$ {f}, I_FACTS' = I_FACTS and D_FACTS'' = D_FACTS' = D_FACTS \ {f}. Since A is stratifiable, so is $A_f^{ins}$ and, by Proposition 8, $M(A_f^{ins})$ is a total model. Thus, every literal in the formula $\bigvee_{r \in inst_{A_f^{ins}}(f)} body_{fo}(r)$ is either true or false in this model. Since this formula is assumed to be a tautology, it is true in $M(A_f^{ins})$. Thus, there is at least one rule r in $inst_{A_f^{ins}}(f)$ such that $body_{fo}(r)$ is true in $M(A_f^{ins})$. In other words, there is at least one rule r in $inst_{A_f^{ins}}(f)$ such that $body(r) \subseteq M(A_f^{ins})$. As a consequence, f belongs to $M(A_f^{ins})$, because f is not deleted in $A_f^{ins}$. By Proposition 7.1, $A_f^{ins} \equiv ins(A_f^{ins}, f)$, and by Definition 4, it is easy to see that $ins(A_f^{ins}, f) = ins(A, f)$. So, we obtain that $A_f^{ins} \equiv ins(A, f)$.

Now, let $U = [(u_1, f_1), (u_2, f_2), \ldots, (u_n, f_n)]$ be an update sequence and let us denote by $A'_i$ = (I_FACTS$'_i$, D_FACTS$'_i$, RULES) the database $u_i(\ldots(u_2(u_1(A_f^{ins}, f_1), f_2)\ldots), f_i)$, and by $A''_i$ = (I_FACTS$''_i$, D_FACTS$''_i$, RULES) the database $u_i(\ldots(u_2(u_1(ins(A, f), f_1), f_2)\ldots), f_i)$, for i = 1, 2, ..., n. Denoting by $A'_0 = A_f^{ins}$ and $A''_0 = ins(A, f)$, we show by induction on i that $A'_i \equiv A''_i$. Indeed, as previously shown, the equivalence holds for i = 0. Assume that $A'_{i-1} \equiv A''_{i-1}$. If $A'_{i-1} = A''_{i-1}$ then $A'_i = A''_i$ and the result is trivial. If $A'_{i-1} \neq A''_{i-1}$, then either $A'_i = A''_i$ (when $f_i = f$) or, by Definition 4, I_FACTS$''_i$ = I_FACTS$'_i \cup \{f\}$ and D_FACTS$''_i$ = D_FACTS$'_i$. Therefore, $A'_i$ and $A''_i$ can be shown to be query-equivalent as done for $A_f^{ins}$ and ins(A, f), since $\bigvee_{r \in inst_{A'_i}(f)} body_{fo}(r)$ is assumed to be a tautology. Thus, we have shown that, if $\bigvee_{r \in inst_A(C, f)} body_{fo}(r)$ is a tautology for every finite set of constants C containing the constants in A and in f, then $A_f^{ins} \sim ins(A, f)$.

Conversely, we assume now that $A_f^{ins} \sim ins(A, f)$ and we show by contraposition that $\bigvee_{r \in inst_A(C, f)} body_{fo}(r)$ is a tautology for every finite set of constants C containing the constants in A and in f. Indeed, let C be a finite set of constants containing the constants in A and in f and such that $\bigvee_{r \in inst_A(C, f)} body_{fo}(r)$ is not a tautology. Writing the above formula in conjunctive normal form gives $\bigwedge_k \varphi_k$, where the formulas $\varphi_k$ are all the disjunctions that can be formed by picking exactly one literal in the body of each instantiated rule of inst_A(C, f). We call $\Gamma$ the conjunction obtained from $\bigwedge_k \varphi_k$ above after removing every conjunct $\varphi_k$ containing a fact g together with its negation $\neg g$. Clearly, $\Gamma$ is logically equivalent to the formula $\bigvee_{r \in inst_A(C, f)} body_{fo}(r)$ and is not reduced to the empty formula, since $\bigvee_{r \in inst_A(C, f)} body_{fo}(r)$ is assumed not to be a tautology. If f is one of the conjuncts of $\Gamma$, then for every r in inst_A(C, f), $f \in body(r)$. Since $inst_{A_f^{ins}}(f) \subseteq inst_A(C, f)$, for every r in $inst_{A_f^{ins}}(f)$, $f \in body(r)$. By Lemma 2, this entails that $\neg f \in M(A_f^{ins})$. This is a contradiction to the facts that: (i) $A_f^{ins}$ and ins(A, f) are update-equivalent (thus query-equivalent), and (ii) $f \in M(ins(A, f))$ (see Proposition 2.1). Thus, we are allowed to consider a conjunct of $\Gamma$ different from f. Let $\varphi = L_1 \vee L_2 \vee \ldots \vee L_n$ be such a conjunct, and consider the update sequence $U = [(u_1, f_1), (u_2, f_2), \ldots, (u_n, f_n)]$ such that for i = 1, 2, ..., n, if $L_i$ is different from f then:
- if $L_i$ is positive, i.e. $L_i = g$, then put (del, g) in U,
- if $L_i$ is negative, i.e. $L_i = \neg g$, then put (ins, g) in U.

We denote by $A_1$ and $A_2$ the databases $U(A_f^{ins})$ and U(ins(A, f)), respectively. On the other hand, we note that U does not contain (ins, f). Indeed, in this case, $inst_{A_f^{ins}}(f)$ contains a rule whose head is f and whose body contains $\neg f$. Thus in this case, A is not stratifiable. Moreover, there is no fact g such that (ins, g) and (del, g) are both in U.
Indeed, in this case, g and $\neg g$ are in $\varphi$, which is impossible by construction of $\Gamma$. Thus, for every i = 1, 2, ..., n, if $u_i$ = ins then $f_i$ belongs to I_FACTS$_1$ and to I_FACTS$_2$, and similarly, if $u_i$ = del then $f_i$ belongs to D_FACTS$_1$ and to D_FACTS$_2$. Moreover, since (del, f) is not in U, f belongs to I_FACTS$_2$ and thus f is true in $A_2$. We prove now that f is false in $A_1$. To this end, we show that $f \notin SPF^*(M_0(A_1))$, which entails that $\neg f \in M_1(A_1)$ and that, by monotonicity, $\neg f \in M(A_1)$. First, $f \notin SPF^*_0(M_0(A_1))$, because for every rule r in $inst_{A_1}(f)$ such that $pos(body(r)) = \emptyset$, there is a literal $\neg g$ in body(r) such that $g \in M_0(A_1)$. Assuming now that for some $i \geq 1$, $f \notin SPF^*_{i-1}(M_0(A_1))$, we show that $f \notin SPF^*_i(M_0(A_1))$. Indeed, if $f \in SPF^*_i(M_0(A_1))$, as $f \notin$ D_FACTS$_1$, then $inst_{A_1}(f)$ contains a rule r such that $pos(body(r)) \subseteq SPF^*_{i-1}(M_0(A_1))$. Therefore, $f \notin body(r)$, which entails, by construction of U, that body(r) contains a literal L such that $\neg L$ belongs to $M_0(A_1)$, a contradiction to the definition of the operator SPF*. Thus, $\neg f \in M_1(A_1)$, which completes the proof of the theorem.
[]
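The converse direction of the proof above is constructive: from a non-tautological disjunction of bodies it extracts a conjunct of $\Gamma$ and turns it into an update sequence. The following sketch mirrors that construction on ground bodies; the encoding of literals as (is_positive, atom) pairs and the function names are assumptions of this illustration.

```python
from itertools import product

def cnf_conjuncts(bodies):
    """All disjunctions formed by picking exactly one literal from each body."""
    return [frozenset(choice) for choice in product(*bodies)]

def gamma(bodies):
    """Conjuncts that do not contain an atom together with its negation."""
    kept = []
    for clause in cnf_conjuncts(bodies):
        complementary = any((not sign, atom) in clause for sign, atom in clause)
        if not complementary and clause not in kept:
            kept.append(clause)
    return kept

def update_sequence(clause, f):
    """Updates that falsify every literal of `clause` other than the fact f:
    a positive literal g yields (del, g), a negative literal yields (ins, g)."""
    return [("del" if sign else "ins", atom)
            for sign, atom in sorted(clause)
            if (sign, atom) != (True, f)]

# Ground bodies of two positive rules with head p(a):  q(a,b)  and  r(a).
bodies = [[(True, "q(a,b)")], [(True, "r(a)")]]
clause = gamma(bodies)[0]
print(update_sequence(clause, "p(a)"))  # [('del', 'q(a,b)'), ('del', 'r(a)')]

# Ground bodies of Example 2 with C = {a}: the only conjunct is removed.
print(gamma([[(True, "q(a,a)")], [(False, "q(a,a)")]]))  # []
```

An empty result of `gamma` corresponds to the tautology case of Theorem 3, while any remaining conjunct yields a sequence after which no rule body for f can be satisfied.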
Appendix C

Proof of Theorem 4

Theorem 4. Let A = (I_FACTS, D_FACTS, RULES) be a consistent and stratifiable database and let f be a fact not in D_FACTS. Then $del(A, f) \sim A_f^{del}$ if and only if the formula $\neg[\bigvee_{r \in inst^*_A(C, f)} body_{fo}(r)]$ is a (possibly empty) tautology for every finite set of constants C containing the constants in A and in f.

Proof. We first show that if $\neg[\bigvee_{r \in inst^*_A(C, f)} body_{fo}(r)]$ is a tautology for every finite set of constants C containing the constants in A and in f, then del(A, f) and $A_f^{del}$ are query-equivalent. Let $A_f^{del}$ = (I_FACTS', D_FACTS', RULES) and let del(A, f) = (I_FACTS'', D_FACTS'', RULES). Then I_FACTS'' = I_FACTS' = I_FACTS \ {f}, D_FACTS'' = D_FACTS $\cup$ {f} and D_FACTS' = D_FACTS. Since A is stratifiable, so is $A_f^{del}$. Therefore, by Proposition 8, $M(A_f^{del})$ is a total model and so, every literal in the formula $\bigvee_{r \in inst^*_{A_f^{del}}(f)} body_{fo}(r)$ is either true or false in this model. Since the disjunction is assumed to be always false, it is false in $M(A_f^{del})$. Thus, every disjunct is false in $M(A_f^{del})$. In other words, for every rule r in $inst^*_{A_f^{del}}(f)$, $body(r) \not\subseteq M(A_f^{del})$. As a consequence, no rule of $inst^*_{A_f^{del}}(f)$ allows f to be obtained in $M(A_f^{del})$. Moreover, f does not belong to I_FACTS' and, by Lemma 2, the rules in $inst_{A_f^{del}}(f) \setminus inst^*_{A_f^{del}}(f)$ do not allow f to be derived, since their bodies contain f. As a consequence, $\neg f$ belongs to $M(A_f^{del})$, and by Proposition 7.2, $A_f^{del} \equiv del(A_f^{del}, f)$. By Definition 4, it is easy to see that $del(A_f^{del}, f) = del(A, f)$, and so we obtain that $A_f^{del} \equiv del(A, f)$.

Now, let $U = [(u_1, f_1), (u_2, f_2), \ldots, (u_n, f_n)]$ be an update sequence and let us denote by $A'_i$ = (I_FACTS$'_i$, D_FACTS$'_i$, RULES) the database $u_i(\ldots(u_2(u_1(A_f^{del}, f_1), f_2)\ldots), f_i)$, and by $A''_i$ = (I_FACTS$''_i$, D_FACTS$''_i$, RULES) the database $u_i(\ldots(u_2(u_1(del(A, f), f_1), f_2)\ldots), f_i)$, for i = 1, 2, ..., n. Denoting by $A'_0 = A_f^{del}$ and $A''_0 = del(A, f)$, we show by induction on i that $A'_i \equiv A''_i$. Indeed, as previously shown, the equivalence holds for i = 0. Assume that $A'_{i-1} \equiv A''_{i-1}$. If $A'_{i-1} = A''_{i-1}$ then $A'_i = A''_i$ and the result is trivial. If $A'_{i-1} \neq A''_{i-1}$, then either $A'_i = A''_i$ (when $u_i$ = del and $f_i = f$) or, by Definition 4, I_FACTS$''_i$ = I_FACTS$'_i$ and D_FACTS$''_i$ = D_FACTS$'_i \cup \{f\}$. Therefore, $A'_i$ and $A''_i$ can be shown to be query-equivalent as done for $A_f^{del}$ and del(A, f), since the formula $\neg[\bigvee_{r \in inst^*_{A'_i}(f)} body_{fo}(r)]$ is assumed to be a tautology. Thus, we have shown that, if $\neg[\bigvee_{r \in inst^*_A(C, f)} body_{fo}(r)]$ is a tautology for every finite set of constants C containing the constants in A and in f, then $A_f^{del} \sim del(A, f)$.

Conversely, we assume now that $A_f^{del} \sim del(A, f)$ and we show by contraposition that $\neg[\bigvee_{r \in inst^*_A(C, f)} body_{fo}(r)]$ is a tautology for every finite set of constants C containing the constants in A and in f. Indeed, let C be a finite set of constants containing the constants in A and in f, such that $\neg[\bigvee_{r \in inst^*_A(C, f)} body_{fo}(r)]$ is not a tautology. Then there is a rule r in inst*_A(C, f) whose body does not contain a fact g together with its negation $\neg g$. Let $r_0$ be such a rule, which we write $f \leftarrow L_1, L_2, \ldots, L_k$. Consider the update sequence $U = [(u_1, f_1), (u_2, f_2), \ldots, (u_k, f_k)]$ such that for i = 1, 2, ..., k, $L_i$ is in $body(r_0)$ and
- if $L_i$ is positive, i.e. $L_i = g$, then put (ins, g) in U,
- if $L_i$ is negative, i.e. $L_i = \neg g$, then put (del, g) in U.

We denote by $A_1$ and $A_2$ the databases $U(A_f^{del})$ and U(del(A, f)), respectively. Since $A_f^{del}$ is assumed to be stratifiable, $body(r_0)$ does not contain the literal $\neg f$, thus U does not contain (del, f). Moreover, there is no fact g such that (ins, g) and (del, g) are both in U, since $r_0$ has been chosen so that $body(r_0)$ does not contain a fact and its negation. Thus, for every i = 1, 2, ..., k, if $u_i$ = ins then $f_i$ belongs to I_FACTS$_1$ and if $u_i$ = del then $f_i$ belongs to D_FACTS$_1$. Therefore, $body(r_0) \subseteq M_0(A_1)$ and $f \notin$ D_FACTS$_1$, which entails that $f \in M(A_1)$. On the other hand, by construction of U, f has not been inserted. Thus, f belongs to D_FACTS$_2$, entailing that $\neg f \in M(A_2)$. This shows that $A_1$ and $A_2$ are not query-equivalent, thus that $A_f^{del} \not\sim del(A, f)$, a contradiction which completes the proof of the theorem. []
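For a fixed set of constants, the argument above suggests a simple procedure: look for a body in inst*_A(C, f) that contains no fact together with its negation and, if one exists, build the update sequence that makes this body true. The sketch below follows that reading; the literal encoding and the function names are hypothetical.

```python
def is_contradictory(body):
    """True iff the body contains some atom together with its negation."""
    return any((not sign, atom) in body for sign, atom in body)

def witness_body(bodies_star):
    """First body without a complementary pair, or None when the negated
    disjunction of the bodies is a (possibly empty) tautology."""
    for body in bodies_star:
        if not is_contradictory(body):
            return body
    return None

def update_sequence(body):
    """Updates that make every literal of `body` true:
    a positive literal g yields (ins, g), a negative literal yields (del, g)."""
    return [("ins" if sign else "del", atom) for sign, atom in body]

# Example 1 with C = {a}: the only body of inst* is q(a,a), not q(a,a).
print(witness_body([[(True, "q(a,a)"), (False, "q(a,a)")]]))  # None

# A hypothetical rule p(a) :- q(a), not r(a) gives a witness.
body = witness_body([[(True, "q(a)"), (False, "r(a)")]])
print(update_sequence(body))  # [('ins', 'q(a)'), ('del', 'r(a)')]
```

When `witness_body` returns None, every body is unsatisfiable and the condition of Theorem 4 holds for the given set of constants; otherwise the returned sequence separates the two databases as in the proof.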
References
[1] S. Abiteboul, Updates, a new frontier, in Second International Conference on Data Base Theory, ICDT (LNCS 326, Springer-Verlag, 1988).
[2] S. Abiteboul, V. Vianu, Transactions and integrity constraints, ACM SIGACT-SIGMOD-SIGART Symp. on Principles of Database Systems (1985).
[3] S. Abiteboul, V. Vianu, A transaction-based approach to relational database specification, J. of the ACM 36 (1989).
[4] K.R. Apt, H. Blair, A. Walker, Towards a theory of declarative knowledge, in J. Minker (ed.), Foundations of deductive databases and logic programming (Morgan Kaufmann, 1988).
[5] P. Atzeni, R. Torlone, Updating databases in the weak instance model, ACM SIGACT-SIGMOD-SIGART Symp. on Principles of Database Systems (1989).
[6] P. Atzeni, R. Torlone, Solving ambiguities in updating deductive databases, MFDBS'91 Int. Conf., Rostock (LNCS 495, Springer-Verlag, 1991).
[7] F. Bancilhon, R. Ramakrishnan, An amateur's introduction to recursive query processing strategies, in Readings in Database Systems (Morgan Kaufmann, 1988).
[8] F. Bancilhon, N. Spyratos, Update semantics of relational views, ACM Transactions on Database Systems 6(4) (1981).
[9] N. Bidoit, Bases de données déductives: Négation et logique des défauts, Ph.D. Dissertation (Université de Paris-Sud, Orsay, 1989) in French.
[10] N. Bidoit, Negation in rule-based database languages: A survey, Theoretical Computer Science 78 (1991).
[11] N. Bidoit, Bases de Données Déductives, Présentation de Datalog (Armand Colin, 1992) in French.
[12] N. Bidoit, C. Froidevaux, Negation by default and unstratifiable logic programs, Theoretical Computer Science 78 (1991).
[13] N. Bidoit, P. Legay, Well: An evaluation procedure for all logic programs, ICDT Intl. Conf. on Database Theory (LNCS 470, Springer-Verlag, 1990).
[14] F. Bry, H. Decker, R. Manthey, A uniform approach to constraint satisfaction and constraint specification in deductive databases, EDBT Int. Conf. (LNCS 303, Springer-Verlag, 1988).
[15] F. Bry, Intensional updates: Abduction via deduction, Intl. Symposium on Logic Programming (1990).
[16] S. Ceri, G. Gottlob, L. Tanca, Logic programming and databases, Surveys in Computer Science (Springer-Verlag, 1990).
[17] L. Cholvy, Mises à jour dans les bases de connaissances, in 5e Journées Bases de Données Avancées, Genève (1989) in French.
[18] K. Eshghi, R.A. Kowalski, Abduction compared with negation by failure, Intl. Symposium on Logic Programming (1989).
[19] R. Fagin, J.D. Ullman, M. Vardi, On the semantics of updates in databases, in Proc. of ACM PODS, Atlanta (1983).
[20] G. Grahne, A.O. Mendelzon, P.Z. Revesz, Knowledgebase transformations, in Proc. of ACM PODS, San Diego (1992).
[21] M. Halfeld Ferrari Alves, D. Laurent, N. Spyratos, Update rules in datalog programs, LPNMR, Logic Programming and Nonmonotonic Reasoning, Lexington, KY (LNAI 928, Springer-Verlag, 1995).
[22] A.C. Kakas, P. Mancarella, Database updates through abduction, 16th Intl. Conf. on Very Large Databases VLDB, Brisbane, Australia (1990).
[23] D. Laurent, N. Spyratos, Updating in universal scheme interfaces, IEEE Transactions on Knowledge and Data Engineering 6(2) (1994).
[24] D. Laurent, Ch. Vrain, Learning query rules for optimizing databases with update rules, International Workshop on Logic in Databases, LID'96, San Miniato, Italy (LNCS 1154, Springer-Verlag, 1996).
[25] D. Laurent, V. Phan Luong, N. Spyratos, Deleted tuples are useful when updating through universal scheme interfaces, IEEE International Conference on Data Engineering, Phoenix, AZ (1992).
[26] D. Laurent, V. Phan Luong, N. Spyratos, Database updating revisited, DOOD Intl. Conf. on Deductive and Object-Oriented Databases (LNCS 760, Springer-Verlag, 1993).
[27] Ch. Lécluse, N. Spyratos, Implementing queries and updates on universal scheme interfaces, VLDB Int. Conf., Los Angeles (1988).
[28] D. Maier, J.D. Ullman, M.Y. Vardi, On the foundations of universal relation model, ACM Transactions on Database Systems 9(2) (1984).
[29] R. Reiter, A logic for default reasoning, Artificial Intelligence 13 (1980).
[30] R. Reiter, On formalizing database updates: preliminary report, EDBT Int. Conf. (LNCS 580, Springer-Verlag, 1992).
[31] R. Torlone, P. Atzeni, Updating deductive databases with functional dependencies, DOOD Intl. Conf. on Deductive and Object-Oriented Databases (LNCS 566, Springer-Verlag, 1991).
[32] J.D. Ullman, Principles of Databases and Knowledge Base Systems, Vol. I (Computer Science Press, 1989).
[33] A. Van Gelder, K.A. Ross, J.S. Schlipf, The well-founded semantics for general logic programs, J. of the ACM 38(3) (1991).
[34] C. Vrain, D. Laurent, Apprentissage de règles et bases de données déductives, 11e Journées Bases de Données Avancées (BDA), August 95, Nancy, France, in French.
[35] M. Winslett, A model-based approach to updating databases with incomplete information, ACM Transactions on Database Systems 13(2) (1988).

Dominique Laurent received his doctoral degree in 1987 and then his Habilitation degree in 1994 in Computer Science from the University of Orléans (France). He is currently Professor at the University of Tours (France). His research interests include incomplete information, negative information and updates in databases, together with inductive logic programming and data mining.

Viet Phan Luong received his Ph.D. in Computer Science in 1993 from the University of Paris-Sud (France). He is presently Assistant Professor at the University of Provence (France). His research interests are focused on data modelling and deductive databases.

Nicolas Spyratos received the Dipl. Ing. degree in Electrical Engineering from the National Technical University of Athens in 1966, the Ph.D. degree in Systems Engineering from Carleton University in 1975 and the Thèse d'État degree in Computer Science from the University of Paris-Sud (France) in 1981. He is currently Professor at the University of Paris-Sud. His research interests include data modelling, database theory and database programming languages.