Information Systems Vol. 14, No. 1, pp. 65-77, 1989. Printed in Great Britain. All rights reserved.
0306-4379/89 $3.00 + 0.00. Copyright © 1989 Pergamon Press plc

FORMAL SEMANTICS FOR DATABASE SCHEMAS

DAN A. SIMOVICI† and DAN C. STEFANESCU
Department of Mathematics and Computer Science, University of Massachusetts at Boston, Boston, MA 02125, U.S.A.

(Received 30 January 1988; in revised form 2 September 1988)
Abstract: We propose an abstract semantics for database schemas which considers the interaction between the disjointness and the "is-a" relationships between data types. Our approach is based on a system of axioms introduced and proven sound and complete by Atzeni and Parker. By introducing the concept of a mergeable family of schemas we propose a design methodology for database schemas whose goal is to generate schemas which are free of inconsistencies and redundancy. We also investigate the generalization and specialization methods of abstraction as a useful tool in the specification of database semantics, using as a starting point the conditions formulated by Abiteboul and Hull for the IFO model when the disjointness constraints are present.
1. INTRODUCTION

The purpose of this paper is to introduce an algebraic approach to specifying the semantics of schemas for knowledge bases; this topic is important both for artificial intelligence and for databases. Our starting point is the redefinition, in abstract terms, of the notion of schema as it was introduced by Atzeni and Parker [2]. They regard schemas as taxonomies of types; relationships between the types participating in a schema are expressed through an "is-a" hierarchy. Types are conceived as subsets of a universal collection of objects or as a monadic relation or a Boolean combination of such relations (see [9]). Using the concepts introduced by Borgida et al. [4] we shall treat types as templates. If x is-a y and y has some properties then x must have the same properties. This assumption is known as the strict inheritance hypothesis and we shall adopt it here.

In [2] the authors consider the effect of imposing a disjointness constraint on database schemas which are built around the "is-a" relationships among types. Here, we attempt to enrich and refine the Atzeni-Parker model by differentiating between generalization and specialization (see [5, 6]) and by incorporating other aspects which originate in the IFO model proposed and studied in [3, 7, 8].

We need some preliminaries concerning relations. Let $\alpha$ be a relation on a set $S$, $\alpha \subseteq S \times S$. The inverse of $\alpha$ is the relation $\alpha^{-1} = \{(v, u) \mid u, v \in S, (u, v) \in \alpha\}$. The product of the relations $\alpha, \beta \subseteq S \times S$ is the relation $\alpha\beta = \{(x, z) \mid x, z \in S, \exists y \in S, (x, y) \in \alpha, (y, z) \in \beta\}$. If $s \in S$ we shall consider the subset $\alpha(s) = \{y \mid y \in S, (s, y) \in \alpha\}$. Clearly, if $y \in \alpha(x)$ then $x \in \alpha^{-1}(y)$. If $I_S = \{(s, s) \mid s \in S\}$ is the diagonal relation on $S$ we shall define the powers of $\alpha$ as $\alpha^0 = I_S$ and $\alpha^{n+1} = \alpha^n \alpha$ for $n \ge 0$.

†To whom correspondence should be addressed.
If $I_S \subseteq \alpha$ we shall refer to $\alpha$ as a reflexive relation. A relation $\alpha$ is symmetric if $\alpha^{-1} = \alpha$. The least symmetric relation containing $\alpha$ is $\alpha^{sym} = \alpha \cup \alpha^{-1}$. A relation $\alpha$ is transitive if $\alpha^2 \subseteq \alpha$. The least transitive relation containing $\alpha$ is the relation
$$\alpha^+ = \bigcup_{n \ge 1} \alpha^n,$$
called the transitive closure of $\alpha$. The least transitive and reflexive relation containing $\alpha$, also known as the transitive and reflexive closure of $\alpha$, is $\alpha^* = \bigcup_{n \ge 0} \alpha^n$.

Definition 1. A database schema or, briefly, a schema, is a triple $H = (S, \sigma, \delta)$, where $S$ is a finite set of types and $\sigma$, $\delta$ are binary relations on $S$ satisfying the following conditions:
(APi) $\sigma$ is reflexive and transitive;
(APii) $\delta$ is a symmetric relation on $S$;
(APiii) if $(x, x) \in \delta$ then $(x, y) \in \delta$ for all $y \in S$;
(APiv) for $x, y, z \in S$, if $(x, y) \in \delta$ and $(z, x) \in \sigma$ then $(z, y) \in \delta$;
(APv) if $(x, x) \in \delta$ then $(x, y) \in \sigma$ for all $y \in S$.
It is transparent that $\sigma$ plays the role of the "is-a" relationship while $\delta$ plays the role of the "disjointness" relationship between types, regarded as constraints in [2]. In other words, if $(x, y) \in \sigma$ then every object of $x$ also belongs to the type $y$; furthermore, if $(x, y) \in \delta$ then there are no common objects in the types $x$ and $y$. Schemas will be denoted by $G, H, \ldots$; types will be designated by the last small letters of the alphabet: $s, t, u, \ldots$. A type $x$ will be referred to as consistent if $(x, x) \notin \delta$. We shall denote by $C(H)$ the set of all consistent elements of $S$. A schema $H = (S, \sigma, \delta)$ is consistent if $C(H) = S$. The set of inconsistent types is $I(H) = S - C(H) = \{x \mid x \in S, (x, x) \in \delta\}$.
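With the relational terminology the axioms (APi)-(APv) can be reformulated as follows:
(RAPi) $I_S \subseteq \sigma$ and $\sigma^2 \subseteq \sigma$;
(RAPii) $\delta = \delta^{-1}$;
(RAPiii) $I(H) \times S \subseteq \delta$;
(RAPiv) $\sigma\delta \subseteq \delta$;
(RAPv) $I(H) \times S \subseteq \sigma$.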
Schemas will be represented by graphs. In order to introduce this kind of representation let us consider the "direct is-a" relation $\sigma_0$, defined by $\sigma_0 = \sigma_1 - \sigma_1^2$, where $\sigma_1 = \sigma - I_S$. We shall have $(x, y) \in \sigma_0$ if $(x, y) \in \sigma_1$ and there is no "intermediate type" $z$ such that $(x, z) \in \sigma_1$ and $(z, y) \in \sigma_1$. The graph of the schema $H$ is the pair $\mathcal{G}_H = (S, E)$, where $S$ becomes, in this context, the set of vertices of $\mathcal{G}_H$ and $E \subseteq S \times S$ is the set of edges of $\mathcal{G}_H$, containing two types of edges:
(1) If $(x, y) \in \sigma_0$ then we shall have an oriented edge, originating in $x$ and ending in $y$; such an edge will be represented by a solid arrow.
(2) If $(x, y) \in \delta$ we shall have an unoriented edge in $\mathcal{G}_H$, represented by a dotted line.

Example 2. Consider the set of types $S$ = {person, student, teacher, teaching-assistant, professor}, representing various individuals involved in the teaching activity of a college. If $\sigma_0$ = {(student, person), (teacher, person), (teaching-assistant, student), (teaching-assistant, teacher), (professor, teacher)} and $\delta$ = {(professor, teaching-assistant), (teaching-assistant, professor)}, the graph of this schema is represented in Fig. 1.
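The axioms (APi)-(APv) are finite conditions on two relations, so they can be checked mechanically. The following is a minimal Python sketch, not part of the original paper, applied to the schema of Example 2; the function names `closure` and `ap_violations` and the variant `delta_bad` are illustrative assumptions. Relations are represented as sets of pairs, and $\sigma$ is obtained as the reflexive and transitive closure of the direct "is-a" pairs.

```python
def closure(types, direct):
    """Reflexive and transitive closure of the direct "is-a" pairs."""
    sigma = {(t, t) for t in types} | set(direct)
    while True:
        new = {(a, d) for (a, b) in sigma for (c, d) in sigma if b == c} - sigma
        if not new:
            return sigma
        sigma |= new


def ap_violations(types, sigma, delta):
    """Names of the AP axioms violated by the triple (types, sigma, delta)."""
    bad = set()
    inconsistent = {x for x in types if (x, x) in delta}
    reflexive = all((x, x) in sigma for x in types)
    transitive = all((x, z) in sigma
                     for (x, y) in sigma for (y2, z) in sigma if y == y2)
    if not (reflexive and transitive):
        bad.add("APi")
    if any((y, x) not in delta for (x, y) in delta):
        bad.add("APii")
    if any((x, y) not in delta for x in inconsistent for y in types):
        bad.add("APiii")
    if any((z, y) not in delta
           for (x, y) in delta for (z, x2) in sigma if x2 == x):
        bad.add("APiv")
    if any((x, y) not in sigma for x in inconsistent for y in types):
        bad.add("APv")
    return bad


# Example 2 of the text.
S = {"person", "student", "teacher", "teaching-assistant", "professor"}
sigma0 = {("student", "person"), ("teacher", "person"),
          ("teaching-assistant", "student"), ("teaching-assistant", "teacher"),
          ("professor", "teacher")}
delta = {("professor", "teaching-assistant"), ("teaching-assistant", "professor")}
sigma = closure(S, sigma0)

print(ap_violations(S, sigma, delta))      # prints set()

# Hypothetical variation (not in the paper): declare student and teacher
# disjoint without propagating the constraint to teaching-assistant.
delta_bad = delta | {("student", "teacher"), ("teacher", "student")}
print(ap_violations(S, sigma, delta_bad))  # prints {'APiv'}
```

In the variation, teaching-assistant is a subtype of both student and teacher, so (APiv) would force it to be disjoint from teacher; the checker reports exactly that missing consequence.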
2. AN ALGEBRAIC APPROACH TO THE SEMANTICS OF SCHEMAS
We need certain simple consequences of the AP axioms. Let $H = (S, \sigma, \delta)$ be a schema.

Lemma 3. If $(p, q) \in \sigma \cap \delta$ then $p \in I(H)$. Also, if $(r, s) \in \sigma$ and $s \in I(H)$ then $r \in I(H)$.

Proof. Since $(p, q) \in \sigma$ and also $(p, q) \in \delta$, taking $x = q$ and $y = z = p$ in (APiv) will give $(p, p) \in \delta$, in view of (APii). For the second part of the Lemma, by applying (APiv) with $x = y = s$ and $z = r$ we have $(r, s) \in \delta$. The first part of the Lemma gives $(r, r) \in \delta$. □

[Fig. 1. Graph of the schema of Example 2.]
Lemma 4. If there is a type $z \in C(H)$ and $(z, x), (z, y) \in \sigma$ then $(x, y) \notin \delta$.

Proof. Suppose that $(x, y) \in \delta$. Then $(z, x) \in \sigma$ implies $(z, y) \in \delta$, according to (APiv). This membership, together with $(z, y) \in \sigma$, will give $(z, z) \in \delta$, according to Lemma 3, which is contradictory. □

It is interesting to notice that the converse of the implication from Lemma 4 does not follow from the axioms (APi)-(APv), as is shown by the following example. Consider a finite, nonempty set $S$ of natural numbers, $S \subset \mathbb{N}$, and define the relation $\delta$ on $S$ by $(m, n) \in \delta$ if $\gcd\{m, n\} = 1$. Also, define $(m, n) \in \sigma$ if $m$ divides $n$. We remark that $n$ is inconsistent if and only if $n = 1$. It is easy to verify that $(S, \sigma, \delta)$ satisfies the AP axioms. Suppose now that we consider the set $S = \{4, 6\}$. The pair $(4, 6)$ does not belong to $\delta$; however, it is impossible to find $z \in S$ such that $(z, 4), (z, 6) \in \sigma$.

Definition 5. An AP-schema $H = (S, \sigma, \delta)$ is complete if, whenever $(x, y) \notin \delta$, there is $z \in C(H)$ such that $(z, x) \in \sigma$ and $(z, y) \in \sigma$.

There is a standard method for generating AP-schemas on any set.

Theorem 6. Let $(P, \{\le, 0\})$ be a poset having 0 as its first element. If $D: S \to P$ is a mapping then the triple $H_D = (S, \sigma_D, \delta_D)$ is an AP-schema on $S$, where
$$\sigma_D = \{(x, y) \mid D(x) \le D(y)\}$$
and
$$\delta_D = \{(x, y) \mid \inf\{D(x), D(y)\} = 0\}.$$
Proof. We have to verify that $H_D$ satisfies (APi)-(APv). It is obvious that the first two axioms are satisfied. Assume that $(x, x) \in \delta_D$. This means that $D(x) = 0$ and, of course, $\inf\{D(x), D(y)\} = 0$ for any $y \in S$. Therefore, $(x, y) \in \delta_D$. Let now $x, y, z$ be such that $(x, y) \in \delta_D$ and $(z, x) \in \sigma_D$, that is, $D(z) \le D(x)$. The set $\{D(z), D(y)\}$ has at least one lower bound (namely 0, the least element of the poset). There is no other lower bound, for if $c \le D(z)$ and $c \le D(y)$ for some $c \in P$ we would also have $c \le D(x)$ and this implies $c = 0$. Therefore, we have $\inf\{D(z), D(y)\} = 0$, hence $(z, y) \in \delta_D$. The last axiom is immediate since $D(x) = 0$ implies $D(x) \le D(y)$ for all $y \in S$. □

The mapping $D$ generates a certain meaning of the types. By imposing restrictions on this mapping it is possible to obtain schemas satisfying various conditions encountered in the literature. For instance, if the mapping $D$ is one-to-one we have no cycles of "is-a" relationships, as can easily be seen.

Theorem 7. For any complete AP-schema $H = (S, \sigma, \delta)$ there is a mapping $D: S \to P$, where $(P, \{\le, 0\})$ is a poset having the first element 0, such that
$$\sigma = \{(x, y) \mid D(x) \le D(y)\} \tag{1}$$
and
$$\delta = \{(x, y) \mid \inf\{D(x), D(y)\} = 0\}. \tag{2}$$
Proof. Consider the poset $(2^S, \subseteq, \emptyset)$ generated by the schema itself and define the mapping $D: S \to 2^S$ by
$$D(x) = \{t \mid (t, x) \in \sigma - \delta\}. \tag{3}$$
Clearly, if $t \in D(x)$ then $(t, t) \notin \delta$ for, otherwise, we would have $(t, x) \in \delta$ according to axiom (APiii). Let $(u, v) \in \sigma$. Consider $t \in D(u)$; we have $(t, u) \in \sigma$ and $(t, u) \notin \delta$. Due to the transitivity of $\sigma$ we also have $(t, v) \in \sigma$. If we had $(t, v) \in \delta$, taking into account that $(u, v) \in \sigma$ this would imply that $(t, u) \in \delta$ according to (APiv), and this cannot be the case. Therefore, $(t, v) \notin \delta$, hence $t \in D(v)$, and this proves that $D(u) \subseteq D(v)$.
Conversely, assume that $D(u) \subseteq D(v)$. We shall distinguish two cases: (i) $(u, u) \in \delta$, that is, $u$ is inconsistent, and (ii) $u$ is a consistent type. In the second case we have $u \in D(u)$ due to the reflexivity of $\sigma$ (APi) and this means that $u \in D(v)$, that is, $(u, v) \in \sigma$. In the first case we shall have $(u, v) \in \sigma$ according to (APv). Therefore, we obtain in any case $(u, v) \in \sigma$ and this proves (1).

To prove (2) we have to show, in this context, that $(u, v) \in \delta$ if and only if $D(u) \cap D(v) = \emptyset$. Let $(u, v) \in \delta$ and assume that $t \in D(u) \cap D(v)$. We shall have:
$$(t, u) \in \sigma, \quad (t, u) \notin \delta \tag{4}$$
and
$$(t, v) \in \sigma, \quad (t, v) \notin \delta. \tag{5}$$
Applying (APiv) to (4) we obtain $(t, v) \in \delta$, which conflicts with (5). Therefore, we must have $D(u) \cap D(v) = \emptyset$.
We remark that until now we have not used the completeness of the schema $H$ in the proof. Assume that $u, v \in S$ are such that $D(u) \cap D(v) = \emptyset$. If $(u, v) \notin \delta$ then, due to the completeness of the schema, there is $z \in S$ such that $(z, z) \notin \delta$ and $(z, u) \in \sigma$, $(z, v) \in \sigma$. We cannot have $(z, u) \in \delta$, for this would imply $(z, z) \in \delta$ according to Lemma 3. Therefore, $(z, u) \in \sigma - \delta$, that is, $z \in D(u)$. Similarly, we can prove that $z \in D(v)$, hence $z \in D(u) \cap D(v)$, which is a contradiction. This shows that for a complete schema $D(u) \cap D(v) = \emptyset$ implies $(u, v) \in \delta$. □

Corollary 8. For any AP-schema (complete or not) $H = (S, \sigma, \delta)$ there is a poset $(P, \{\le, 0\})$ and a mapping $D: S \to P$ such that $\sigma = \sigma_D$ and $\delta \subseteq \delta_D$.
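The construction used in the proof can be tried on the divisibility example given earlier. The sketch below is an illustration rather than part of the paper: it computes $D(x) = \{t \mid (t, x) \in \sigma - \delta\}$ in the powerset poset (ordered by inclusion, with least element $\emptyset$, so that the infimum of two sets is their intersection) and compares the induced relations $\sigma_D$, $\delta_D$ with the original ones. The helper names and the choice of the type set {1, 2, 3, 4, 6} are assumptions made for the example.

```python
from math import gcd


def divisibility_schema(types):
    """sigma: m divides n; delta: gcd(m, n) = 1 (the example after Lemma 4)."""
    sigma = {(m, n) for m in types for n in types if n % m == 0}
    delta = {(m, n) for m in types for n in types if gcd(m, n) == 1}
    return sigma, delta


def semantic_mapping(types, sigma, delta):
    """D(x) = {t | (t, x) in sigma - delta}, as in equation (3)."""
    return {x: frozenset(t for t in types
                         if (t, x) in sigma and (t, x) not in delta)
            for x in types}


def induced_schema(types, D):
    """sigma_D and delta_D for the powerset poset (inf = intersection, 0 = empty set)."""
    sigma_D = {(x, y) for x in types for y in types if D[x] <= D[y]}
    delta_D = {(x, y) for x in types for y in types if not (D[x] & D[y])}
    return sigma_D, delta_D


# A complete schema: sigma and delta are both recovered (Theorem 7).
S1 = {1, 2, 3, 4, 6}
sigma1, delta1 = divisibility_schema(S1)
sigma1_D, delta1_D = induced_schema(S1, semantic_mapping(S1, sigma1, delta1))
print(sigma1_D == sigma1, delta1_D == delta1)   # prints True True

# The incomplete schema on {4, 6}: only delta being a subset of delta_D holds (Corollary 8).
S2 = {4, 6}
sigma2, delta2 = divisibility_schema(S2)
sigma2_D, delta2_D = induced_schema(S2, semantic_mapping(S2, sigma2, delta2))
print(sigma2_D == sigma2, delta2 < delta2_D)    # prints True True
```

On {4, 6} the relation $\delta$ is empty, while $D(4) = \{4\}$ and $D(6) = \{6\}$ are disjoint, so $\delta_D$ contains the pair (4, 6); this is exactly the gap that completeness closes.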
Theorem 7 shows that complete AP-schemas have the property of being "self-described", at least as far as "is-a" and disjointness constraints among types are concerned, since the partially ordered set used in defining the semantic mapping $D$ is built in terms of the schema itself.

Also, under more general conditions we obtained for an arbitrary schema $H = (S, \sigma, \delta)$ the existence of a mapping $D: S \to P$ into a poset $(P, \le)$ having the least element 0 such that $(s, t) \in \sigma$ if and only if $D(s) \le D(t)$ and for which $(s, t) \in \delta$ implies $\inf\{D(s), D(t)\} = 0$.
It is sometimes important to avoid having several names for the types associated with the same extension. This is accomplished in [3, 7] by imposing the nonredundancy condition on the schema. Specifically, a schema is nonredundant if there is no cycle of $\sigma$, that is, there is no sequence $x_1, \ldots, x_n$ such that $(x_i, x_{i+1}) \in \sigma$ for $1 \ldots 1$. Using the previous theorem it is possible to prove the following:

Corollary 9. A schema $H = (S, \sigma, \delta)$ is complete and nonredundant if, and only if, there is a one-to-one mapping $D: S \to P$ such that $\sigma = \sigma_D$ and $\delta = \delta_D$.

We remark that if $H = (S, \sigma, \delta)$ is a nonredundant
schema then $(S, \sigma)$ is a poset, where $\sigma$ is a strict partial order.

In semantic models various authors impose extra restrictions on the constraints in order to provide a better reflection of the real world. For instance, in [3] the "is-a" relationship is required to satisfy the following supplementary Church-Rosser-like condition:

(CR) If $x$, $y$ are types of the schema $H = (S, \sigma, \delta)$ and $z$ is a consistent type of $H$ such that $(z, x) \in \sigma$ and $(z, y) \in \sigma$ then there is a type $w$ such that we have both $(x, w) \in \sigma$ and $(y, w) \in \sigma$.

Any schema $H$ satisfying (CR) will be referred to as a Church-Rosser schema. In their paper [3] Abiteboul and Hull consider Church-Rosser schemas satisfying several natural conditions. We prove that such schemas (which also have the completeness property) provide for each type $x$ a maximal supertype $Q(x)$ which plays an important role in the context of the interaction between the disjointness and the "is-a" relationships.

Lemma 10. If $H$ is a nonredundant CR schema then for every type $x$ there is a unique maximal type $Q(x)$ such that $(x, Q(x)) \in \sigma$.

Proof. Since $S$ is a finite set it is clear that any type $x$ is in the relation $\sigma$ with a maximal type $t$. Suppose now that we have two distinct maximal types $t$, $s$ such that $(x, t), (x, s) \in \sigma$. Applying the CR property we obtain the existence of a type $u$ such that $(t, u), (s, u) \in \sigma$. But this contradicts the maximality of $t$ and $s$, which proves our Lemma. The unique maximal type which is greater than $x$ will be denoted by $Q(x)$. □

The previous Lemma introduces a mapping $Q: S \to S$ for nonredundant CR schemas. For complete schemas this mapping is related to the disjointness relationship $\delta$.

Theorem 11. If $H = (S, \sigma, \delta)$ is a complete, nonredundant CR schema then $(x, y) \notin \delta$ implies $Q(x) = Q(y)$.
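As an illustration of Lemma 10, the following sketch (not from the paper) checks nonredundancy and the (CR) condition on the schema of Example 2 and then computes the maximal supertypes of every type. All function names are hypothetical, and the schema is rebuilt locally so that the block is self-contained.

```python
def closure(types, direct):
    """Reflexive and transitive closure of the direct "is-a" pairs."""
    sigma = {(t, t) for t in types} | set(direct)
    while True:
        new = {(a, d) for (a, b) in sigma for (c, d) in sigma if b == c} - sigma
        if not new:
            return sigma
        sigma |= new


def is_nonredundant(sigma):
    """No two distinct types are related by sigma in both directions."""
    return all(x == y for (x, y) in sigma if (y, x) in sigma)


def is_church_rosser(types, sigma, delta):
    """(CR): a consistent z below both x and y forces a common supertype w."""
    consistent = {z for z in types if (z, z) not in delta}
    return all(any((x, w) in sigma and (y, w) in sigma for w in types)
               for z in consistent for x in types for y in types
               if (z, x) in sigma and (z, y) in sigma)


def maximal_supertypes(types, sigma, x):
    """Supertypes of x that have no supertype other than themselves."""
    return {t for t in types if (x, t) in sigma
            and all(s == t for s in types if (t, s) in sigma)}


# Schema of Example 2.
S = {"person", "student", "teacher", "teaching-assistant", "professor"}
sigma0 = {("student", "person"), ("teacher", "person"),
          ("teaching-assistant", "student"), ("teaching-assistant", "teacher"),
          ("professor", "teacher")}
delta = {("professor", "teaching-assistant"), ("teaching-assistant", "professor")}
sigma = closure(S, sigma0)

print(is_nonredundant(sigma), is_church_rosser(S, sigma, delta))  # prints True True
Q = {x: maximal_supertypes(S, sigma, x) for x in S}
print(all(q == {"person"} for q in Q.values()))                   # prints True
```

Every type has the single maximal supertype person, as Lemma 10 predicts. This schema is not complete (student and professor are not disjoint yet have no common consistent subtype), so it illustrates only the Lemma, not Theorem 11.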
Proof. Suppose that $(x, y) \notin \delta$. Since $H$ is complete there is a consistent type $z \in S$ such that $(z, x), (z, y) \in \sigma$. Applying the CR property we obtain the existence of a type $w$ such that $(x, w), (y, w) \in \sigma$ and, also, $(w, Q(w)) \in \sigma$. Therefore, we have $(x, Q(w)), (y, Q(w)) \in \sigma$ and, using the uniqueness of the maximal type greater than a given type, we have $Q(x) = Q(y)$. □

Let us notice that in a nonredundant CR schema the mapping $Q$ is a closure operator in the poset $(S, \sigma)$, that is, $Q(Q(x)) = Q(x)$, $(x, y) \in \sigma$ implies $(Q(x), Q(y)) \in \sigma$, and $(x, Q(x)) \in \sigma$ for all $x, y \in S$.

Corollary 12. In a nonredundant complete CR schema, if $Q(x) \ne Q(y)$ then $(Q(x), Q(y)) \in \delta$.

Proof. Since $Q(Q(x)) = Q(x) \ne Q(y) = Q(Q(y))$, by applying the contrapositive of Theorem 11 we have $(Q(x), Q(y)) \in \delta$. □

3. SCHEMAS WITH REPOSITORIES
We have already developed the notion of an AP database schema in a general setting. In this section we investigate particular types of AP schemas. Our starting point is the IFO model developed by Abiteboul and Hull [3]. IFO is concerned with mechanisms for representing structured objects and functional and "is-a" relationships between them. In particular, [3] deals with specialization and generalization, two useful types of supertype/subtype inheritance relationships. This last aspect is the focus of our investigation.

Informally, specialization is used to define possible roles for members of a given type. Thus one can create subtypes which inherit (and depend upon) the information from the supertype, i.e. the subtypes are specializations of the supertype. Any given object can change its role over time, i.e. it can move from one specialized subtype to another in consecutive instances of the schema. In the generalization relation two or more subtypes are bundled to create a new supertype. In this case the information in the supertype depends strictly upon the information in all of the subtypes.

The key idea behind the use of the "is-a" relationship is that types, the building blocks of databases, share information. In fact, a fair amount of information is duplicated. Avoiding duplication seems to be an important factor in reducing the human cognitive load as well as the use of processing resources. In a traditional setting, if $(x, y) \in \sigma$ this means that the subtype $x$ contains information from the supertype $y$ (maybe all of it).
This fact can be interpreted in two ways:
• $x$ is a special case of $y$, i.e. $x$ views a part of the information from the repository $y$, or
• $y$ is a general case of $x$, i.e. $y$ views the entire information from the repository $x$.
The difference between the two cases is the origin of the information which is imported, i.e. either the
supertype or the subtype. Correspondingly, $x$ plays the role of a view (repository) and $y$ plays the role of repository (view). Under this interpretation, any change in the information associated with the repository may mean changes in the information associated with view nodes, but not vice versa.

In what follows we formalize the notion of repository and we examine the impact of this concept on the construction of database schemas. Suppose that both $x$ and $y$ are repositories for $z$ and $(x, z), (z, y) \in \sigma_0$. We shall eliminate such situations from the schemas we accept here because they generate an undesirable redundancy in the information stored in the database. Indeed, $x$ contains a part of the information accessible from $z$; since $(z, y) \in \sigma_0$, this information is also contained in $y$, which means that the information contained in $x$ is duplicated in $y$. This is reflected in the second axiom below. We shall also impose a condition which eliminates the circularity of the repositories. Let $\sigma_0 = \sigma_1 - \sigma_1^2$, with $\sigma_1 = \sigma - I_S$, be the direct "is-a" relation introduced in the first section.

Definition 13. A schema with repositories is a pair $R = (H, \rho)$, where $H = (S, \sigma, \delta)$ is a schema (as introduced in Definition 1) and $\rho$ is a binary relation on $S$ such that the following axioms are satisfied:
(REPOi) The relations $\rho$ and $\rho^{-1}$ form a partition of $\sigma_0 \cup \sigma_0^{-1}$, that is, $\sigma_0 \cup \sigma_0^{-1} = \rho \cup \rho^{-1}$ and $\rho \cap \rho^{-1} = \emptyset$.
(REPOii) $(\sigma_0 \cap \rho)(\sigma_0 \cap \rho^{-1}) = \emptyset$.
(REPOiii) The relation $\rho$ has no cycles, that is, $\rho^n \cap I_S = \emptyset$ for all $n \in \mathbb{N}$, $n \ge 1$.
The last axiom corresponds to the fact that the transitive closure $\rho^+ = \bigcup_{n \in \mathbb{N}} \rho^n$ of $\rho$ is a strict partial order on $S$. If $(u, v) \in \rho$ then $u$ is the repository and $v$ is the view of the pair $(u, v)$. A type $u$ is a repository type if $u$ is the repository of some pair from $S \times S$. The introduction of the concept of repository allows us to distinguish between two basic types of "is-a" relationships: the generalization and the specialization.

Definition 14. A specialization pair is a pair of types from $\sigma_0 \cap \rho^{-1}$; a generalization pair is a pair of types from $\sigma_0 \cap \rho$.
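The repository axioms and the classification of Definition 14 can also be checked mechanically. The sketch below is illustrative only: the function names are assumptions, the first data set is the traveler schema discussed in Example 15 (with $\sigma_0$ read off from its generalization and specialization edges), and the second, cyclic data set is a hypothetical four-type schema rather than a figure from the paper.

```python
def compose(a, b):
    """Relational product ab = {(x, z) | (x, y) in a and (y, z) in b for some y}."""
    return {(x, z) for (x, y) in a for (y2, z) in b if y == y2}


def inverse(a):
    return {(y, x) for (x, y) in a}


def repo_violations(types, sigma0, rho):
    """Names of the repository axioms violated by rho over the direct is-a sigma0."""
    bad = set()
    if rho | inverse(rho) != sigma0 | inverse(sigma0) or rho & inverse(rho):
        bad.add("REPOi")
    if compose(sigma0 & rho, sigma0 & inverse(rho)):
        bad.add("REPOii")
    power = set(rho)                      # rho^1, rho^2, ..., rho^|S|
    for _ in range(len(types)):
        if any(x == y for (x, y) in power):
            bad.add("REPOiii")
            break
        power = compose(power, rho)
    return bad


# Traveler schema (Example 15).
S = {"tourist", "business-traveler", "passenger",
     "crew-member", "traveler", "frequent-flyer"}
sigma0 = {("tourist", "passenger"), ("business-traveler", "passenger"),
          ("passenger", "traveler"), ("crew-member", "traveler"),
          ("frequent-flyer", "passenger")}
rho = {("tourist", "passenger"), ("business-traveler", "passenger"),
       ("passenger", "frequent-flyer"), ("passenger", "traveler"),
       ("crew-member", "traveler")}

print(repo_violations(S, sigma0, rho))       # prints set()
generalization = sigma0 & rho                # Definition 14
specialization = sigma0 & inverse(rho)
print(specialization)                        # prints {('frequent-flyer', 'passenger')}

# Hypothetical schema whose repository relation is circular.
S_bad = {"y", "z", "u", "w"}
sigma0_bad = {("y", "z"), ("u", "z"), ("u", "w"), ("y", "w")}
rho_bad = {("y", "z"), ("z", "u"), ("u", "w"), ("w", "y")}
print(repo_violations(S_bad, sigma0_bad, rho_bad))   # prints {'REPOiii'}
```

Since a cycle in a relation on $|S|$ types has length at most $|S|$, testing the powers $\rho^1, \ldots, \rho^{|S|}$ is enough for (REPOiii).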
A schema with repositories $R = (H, \rho)$ can be represented by marking the edges of the graph $\mathcal{G}_H$ of the schema $H$. Namely, we shall place the symbol "○" at the end of each arc $(x, y)$ which is the repository of the pair $(x, y)$. In this manner we shall obtain specialization edges (whose final vertices are marked) and generalization edges (whose initial vertices are marked).

Example 15. Consider the schema with repositories $R = (H, \rho)$, where $H = (S, \sigma, \delta)$, $S$ = {tourist, business-traveler, passenger, crew-member, traveler, frequent-flyer} and the repository relation $\rho$ is: {(tourist, passenger), (business-traveler, passenger), (passenger, frequent-flyer), (passenger, traveler), (crew-member, traveler)}. The graph of this schema is given in Fig. 2. In the graph of this schema we have four generalization edges ((tourist, passenger), (business-traveler, passenger), (passenger, traveler), (crew-member, traveler)) and one specialization edge: (frequent-flyer, passenger). A relational database structured along the schema
given in Fig. 2 can be defined in a SQL-like manner starting from the primary repositories tourist, business-traveler and crew-member:

    CREATE TABLE tourist
        (name char(35), address char(35), telephone char(10),
         flightno char(5), flightdate date, miles integer2)
    CREATE TABLE business-traveler
        (name char(35), address char(35), office-tel char(10),
         corp-acct char(20), flightno char(5), flightdate date, miles integer2)
    CREATE TABLE crew-member
        (name char(35), basedcity char(20), position char(20),
         flightno char(5), flightdate date, hours integer2)

The rest of the types are defined as views (in the sense of SQL) based on these types, according to the following statements:

    CREATE VIEW passenger AS
        (SELECT name, address, flightno, flightdate, miles FROM tourist)
        UNION
        (SELECT name, address, flightno, flightdate, miles FROM business-traveler)
    CREATE VIEW traveler AS
        (SELECT name, flightno, flightdate FROM passenger)
        UNION
        (SELECT name, flightno, flightdate FROM crew-member)
    CREATE VIEW frequent-flyer AS
        SELECT name, address, flightno, flightdate, miles
        FROM passenger
        WHERE miles > 20000
Axiom (REPOii) means that no generalization arc may be followed by a specialization arc. Certain schema designs may be unacceptable because of the implied circularity of the repository relation, as is shown by the following example.

Example 16. Consider the schema $H = (\{x, y, z, u, v, w\}, \sigma, \delta)$ and the schema with repositories $R = (H, \rho)$ whose graph is given in Fig. 3. Since $(y, z), (z, u), (u, w), (w, y) \in \rho$ we have $(y, y) \in \rho^4 \cap I_S$, which violates axiom (REPOiii).
[Fig. 2. The graph of the schema with repositories.]

[Fig. 3. Graph of an unacceptable design.]

We shall present a standard mechanism for inducing a repository relation of an AP-schema. Furthermore, we show that any schema with repositories can be generated in this manner. Let $\mathcal{G} = (S, E)$ be a directed acyclic graph.

Definition 17. A stamping function for the graph $\mathcal{G}$ is a one-to-one mapping $T: S \to \mathbb{N}$ such that for any path $x_1, \ldots, x_n$ in $\mathcal{G}$ we have $T(x_i) < \max\{T(x_1), T(x_n)\}$ for $1 < i < n$.

Clearly, any numbering of the vertices which corresponds to a topological sort of the graph $\mathcal{G}$ or of its dual is a stamping function.

Theorem 18. Let $H = (S, \sigma, \delta)$ be an AP schema and let $T$ be a stamping function for the graph generated by $\sigma_0$. If $\rho_T = \{(x, y) \mid x, y \in S, (x, y) \in \sigma_0^{sym} \text{ and } T(x) < T(y)\}$ then $(H, \rho_T)$ is a schema with repositories. Moreover, for any schema with repositories $R = (H, \rho)$, where $H = (S, \sigma, \delta)$, it is possible to find a stamping function $T$ such that $\rho = \rho_T$.

Proof. We have to verify the axioms (REPOi)-(REPOiii). Let us notice that $\rho_T^{-1} = \{(x, y) \mid x, y \in S, (x, y) \in \sigma_0^{sym} \text{ and } T(x) > T(y)\}$, which shows immediately that $\rho_T$ and $\rho_T^{-1}$ form a partition of $\sigma_0^{sym} = \sigma_0 \cup \sigma_0^{-1}$, which gives (REPOi). To prove that (REPOii) is satisfied let $(x, y) \in (\sigma_0 \cap \rho_T)(\sigma_0 \cap \rho_T^{-1})$. There is $z \in S$ such that $(x, z), (z, y) \in \sigma_0$, $T(x) < T(z)$ and $T(y) < T(z)$. However, this contradicts the definition of the stamping function, for $x, z, y$ is a path in the graph and $T(z) \ge \max\{T(x), T(y)\}$. This contradiction shows that $(\sigma_0 \cap \rho_T)(\sigma_0 \cap \rho_T^{-1}) = \emptyset$.

In order to justify (REPOiii) suppose that there is $t \in S$ such that $(t, t) \in \rho_T^n \cap I_S$. There is a sequence $t = t_0, t_1, \ldots, t_n = t$ of types such that $T(t) = T(t_0) < T(t_1) < \cdots < T(t_n) = T(t)$, which is clearly absurd. Therefore, (REPOiii) is satisfied.

Conversely, let $R = (H, \rho)$ be a schema with repositories, where $H = (S, \sigma, \delta)$. Due to axiom (REPOiii) the directed graph of the relation $\rho$ is acyclic. Let $T: S \to \mathbb{N}$ be a numbering of the vertices of this graph which is given by a topological sorting. We claim that $\rho = \rho_T$. The inclusion $\rho \subseteq \rho_T$ is obvious, due to axiom (REPOi) and to the definition of the topological order of a directed acyclic graph (see [1]). To prove the converse inclusion let $(x, y) \in \rho_T$. We have $T(x) < T(y)$ and either $(x, y) \in \sigma_0$ or $(x, y) \in \sigma_0^{-1}$. In view of (REPOi) we have in either case $(x, y) \in \rho \cup \rho^{-1}$. However, we may not have $(x, y) \in \rho^{-1}$ because $T(x) < T(y)$. The single alternative is $(x, y) \in \rho$, which gives the equality $\rho = \rho_T$. □

Example 19. A topological sort of the graph corresponding to the relation $\rho$ of the schema discussed in Example 15 will give the values specified in Fig. 4 for the stamping function $T$, and the six types of this schema can be created in the order given by the values of $T$.

[Fig. 4. Values of the stamping function.]

Assuming that the construction of the schema is done by adding a new type at each step, we may interpret the value of a stamping function $T(x)$ as the moment when the type $x$ was added to the schema. This is consistent with the following two basic assumptions:
(1) specialization is made starting from an existing type by identifying certain characteristic properties of some of its subtypes;
(2) generalization is made by constructing a new type consisting of the sum of its existing constituents.
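Both directions of Theorem 18 can be illustrated on the traveler schema of Example 15. The sketch below is an assumption-laden illustration, not the authors' procedure: `stamping_from_rho` numbers the types along one particular topological sort of $\rho$ (ties broken alphabetically, so the numbers need not coincide with those of Fig. 4), and `rho_from_stamping` rebuilds $\rho_T$ from the direct "is-a" relation and the numbering.

```python
def rho_from_stamping(sigma0, T):
    """rho_T = {(x, y) in sigma0 or its inverse | T(x) < T(y)} (Theorem 18)."""
    sym = sigma0 | {(y, x) for (x, y) in sigma0}
    return {(x, y) for (x, y) in sym if T[x] < T[y]}


def stamping_from_rho(types, rho):
    """Number the types 1, 2, ... along a topological sort of the graph of rho."""
    remaining = set(types)
    T = {}
    while remaining:
        # Types all of whose repositories have already been numbered.
        ready = sorted(x for x in remaining
                       if not any(v == x and u in remaining for (u, v) in rho))
        if not ready:
            raise ValueError("rho has a cycle, so (REPOiii) fails")
        x = ready[0]
        T[x] = len(T) + 1
        remaining.remove(x)
    return T


# Traveler schema (Example 15); sigma0 is read off from its edges.
S = {"tourist", "business-traveler", "passenger",
     "crew-member", "traveler", "frequent-flyer"}
sigma0 = {("tourist", "passenger"), ("business-traveler", "passenger"),
          ("passenger", "traveler"), ("crew-member", "traveler"),
          ("frequent-flyer", "passenger")}
rho = {("tourist", "passenger"), ("business-traveler", "passenger"),
       ("passenger", "frequent-flyer"), ("passenger", "traveler"),
       ("crew-member", "traveler")}

T = stamping_from_rho(S, rho)
print(rho_from_stamping(sigma0, T) == rho)   # prints True
```

Any other topological sort of $\rho$ would serve equally well; the only property used is that a repository always receives a smaller number than its views.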
Consider a schema with repositories $R = (H, \rho)$, where $H = (S, \sigma, \delta)$, and let $x$ be a type of this schema, $x \in S$.

Definition 20. The local subschema of the type $x$ is the subschema $H_x$ whose set of types is $S_x = \{x\} \cup \{y \mid y \in S, (x, y) \in \sigma_0^{sym}\}$.

In Fig. 5 we consider the two possible forms of local subschemas of a type $x$ of a schema with repositories. If in($x$) and out($x$) are the set of arcs ending in $x$ and the set of arcs originating in $x$ in $H_x$ then, due to axiom (REPOii), at most one of the sets in($x$) and out($x$) may contain arcs of a different nature (generalization and specialization). The classification of local schemas considered above allows us to introduce a classification of the types of schemas with repositories.

Definition 21. The type $x$ is obtained by generalization if its local schema is of the form given in Fig. 5(a); if its local schema is of the form given in Fig. 5(b) then $x$ is obtained by specialization.

This definition can be justified as follows. Let $T: S \to \mathbb{N}$ be a stamping function for the schema $H$. If the local schema is of the type considered in Fig. 5(a) then we have:
$$T(x) > T(y), \text{ for all } y \in \text{in}(x), (y, x) \in \rho,$$
$$T(x) < T(v), \text{ for all } v \in \text{out}(x), (x, v) \in \rho,$$
$$T(x) < T(z), \text{ for all } z \in \text{in}(x), (x, z) \in \rho.$$

[Fig. 5. Local subschemas for x.]
[Fig. 6. Adding the type x to the schema.]

Under the above interpretation of the stamping function the type $x$ can be added to the schema only after the types $y \in \text{in}(x)$, with $(y, x) \in \rho$, are already in the schema. Consequently, the type $x$ is obtained by generalization from types $y \in \text{in}(x)$ already in the schema [see Fig. 6(a)]. If the local schema is of the type considered in Fig. 5(b) then
$$T(x) > T(u), \text{ for all } u \in \text{out}(x), (u, x) \in \rho,$$
$$T(x) < T(v), \text{ for all } v \in \text{out}(x), (x, v) \in \rho,$$
$$T(x) < T(z), \text{ for all } z \in \text{in}(x), (x, z) \in \rho.$$
This implies that $x$ is obtained by specialization from types $u \in \text{out}(x)$ already in the schema [see Fig. 6(b)].

The gist of the above discussion is that for schemas with repositories we have a well-defined building methodology. If $R = (H, \rho)$ is a schema with repositories, where $H = (S, \sigma, \delta)$ and $S = \{x_1, \ldots, x_n\}$, then we have a sequence of schemas with repositories $\{(H_i, \rho_i) \mid 1 \le i \le n\}$ such that $H_i = (\{x_1, \ldots, x_i\}, \sigma_i, \delta_i)$, each $H_i$ is a subschema of $H_{i+1}$ for 1 [...] already existing types through the generalization or the specialization process. The order in which we add the types to a schema (given by some stamping function) has a clear semantic relevance in a schema with repositories since it shows how the new type collects its information from already existing types.

The above discussion also shows that a schema with repositories is put together starting from very simple pieces. In order to formalize this idea we shall need the following concept:

Definition 22. An elementary schema is a schema with repositories over the set of types $S = \{x_1, \ldots, x_n, y\}$ whose graph has one of the forms given in Fig. 7(a) or (b).

The elementary schema from Fig. 7(a) will be called the g-elementary schema of $y$, while the elementary schema from Fig. 7(b) will be referred to as the s-elementary schema of $y$. According to Corollary 8, for any schema $H = (S, \sigma, \delta)$ there is a mapping $D: S \to P$, where $(P, \{\le, 0\})$ is a lattice, such that $\sigma = \sigma_D$. If $\{x_1, \ldots, x_n, y\}$ is the set of types of the elementary schema of $y$ then we shall have:
(1) $D(y) \ge \sup\{D(x_1), \ldots, D(x_n)\}$ in the case of a g-elementary schema and,
(2) $D(y)$ [...] elementary schema of the type $y$. The underlying schema of $R(y)$ will be denoted by $H(y)$.

[Fig. 7. Elementary schemas.]

In the original organization of schemas there is no requirement on the types involved in the generalization. The present framework (i.e., the realm of schemas with repositories) offers the possibility of formalizing the generalization process and of recapturing in this abstract setting various aspects of generalization (see [3, 10]).

Definition 23. A schema with strong generalizations is a schema with repositories $(H, \rho)$ such that $H$ is consistent and nonredundant and any g-elementary schema of a type $y$ (having the set of types $\{x_1, \ldots, x_n, y\}$) enjoys the following two properties:
(GEN1) $D(y) = \sup\{D(x_1), \ldots, D(x_n)\}$ and
(GEN2) $(x_i, x_j) \in \delta$ for $i \ne j$ and $1 \le i, j \le n$.

Proposition 24. Let $(H, \rho)$ be a schema with strong generalizations. If the g-elementary schema of a type $y$ has the set of vertices $\{x_1, \ldots, x_n, y\}$ then $n \ge 2$. Furthermore, the partial graph of $\mathcal{G}_H$ determined by the generalization edges $\{(x, y) \mid (x, y) \in \sigma_0 \cap \rho\}$ is a union of trees.

Proof. Suppose that the g-elementary schema of $y$ would have the set of types $\{x_1, y\}$. By (GEN1) this implies $D(y) = D(x_1)$ and this contradicts the nonredundancy of $H$. In order to prove that the above-mentioned partial graph is a union of trees it will suffice to show that each pair of types $x$, $y$ is connected by at most one path. Suppose that we would have two distinct paths in the partial graph, $u_1, \ldots, u_p$ and $v_1, \ldots, v_q$, such that $x = u_1 = v_1$ and $u_p = v_q = y$. If $u_k$, $v_h$ is the last distinct pair of types of these two paths we shall have $u_{k+1} = v_{h+1}$, which implies $(u_k, v_h) \in \delta$, by (GEN2). Since $(x, u_k), (x, v_h) \in \sigma$ this gives $D(x) \le \inf\{D(u_k), D(v_h)\} = 0$, which, in turn, gives $D(x) = 0$, thus contradicting the consistency of the schema.
□

The axioms of the IFO model (see [3]) which are concerned with generalization and specialization amount to the assumption that the corresponding schema with repositories has strong generalizations. In addition, the schema corresponding to the partial graph determined by the specialization edges (called here the specialization partial graph) is assumed to have the CR property.

We proved in the first section of this paper that the semantics of a database schema $H = (S, \sigma, \delta)$ can be condensed in a mapping $D: S \to P$, where $(P, \{\le, 0\})$ is a partially ordered set (if we limit ourselves to facts related to the "is-a" relationship and to the disjointness restriction). If one is further interested in differentiating between specialization and generalization then we need to add the notions of repository/view. This, in turn, can be expressed by a stamping function $T: S \to \mathbb{N}$.

[Fig. 8. $H_1$, $H_2$ are subschemas of $H$.]

4. A DESIGN METHODOLOGY FOR DATABASE SCHEMAS

The design methodology we propose in this section consists in building large database schemas by starting from small, local components of the schema and joining these components into larger ones. In the joining process various facts and constraints imposed on the local schemas may interact in rather surprising ways, generating redundancy and inconsistency. The purpose of this part of the paper is to develop a technique for dealing with these undesirable parts of a schema.

Let $S$ be a set of types. We shall denote by $AP(S)$ the set of schemas $AP(S) = \{H_i \mid H_i = (S_i, \sigma_i, \delta_i), S_i \subseteq S, i \in I\}$. If $H_i, H_j \in AP(S)$ then we shall write $H_i \le H_j$ if $S_i \subseteq S_j$, $\sigma_i \subseteq \sigma_j$ and $\delta_i \subseteq \delta_j$.

Theorem 25. The partially ordered set $(AP(S), \le)$ is a complete lattice.

Proof. Let us remark that the triple $(S, S \times S, S \times S)$ belongs to $AP(S)$. Therefore, it will suffice to prove that any collection of schemas from $AP(S)$, $\{H_j \mid j \in J\}$, has a greatest lower bound. Consider the triple $K = (T, \sigma, \delta)$, where
$$T = \bigcap_{j \in J} S_j, \quad \sigma = \bigcap_{j \in J} \sigma_j \quad \text{and} \quad \delta = \bigcap_{j \in J} \delta_j.$$
It is straightforward to verify that $K$ satisfies the axioms (APi)-(APv) and that $K = \inf\{H_j \mid j \in J\}$. □

Let $H = (S, \sigma, \delta)$ be a schema.

Definition 26. A subschema of $H$ is a schema $G = (T, \sigma_T, \delta_T)$ such that $T \subseteq S$, $\sigma_T = \sigma \cap (T \times T)$ and $\delta_T = \delta \cap (T \times T)$.

We remark that $I(G) = I(H) \cap T$. The set of all subschemas of a schema $H$ will be denoted by Sub($H$). Let $G = (T, \sigma_T, \delta_T)$, $K = (U, \sigma_U, \delta_U)$ be two subschemas of $H$. A partial order can be introduced on Sub($H$) by $G \le K$ if $T \subseteq U$, $\sigma_T = \sigma_U \cap (T \times T)$ and $\delta_T = \delta_U \cap (T \times T)$. The partially ordered set (Sub($H$), $\le$) is also a complete lattice, as the reader will easily verify. If $H \in AP(S)$ we shall obviously have Sub($H$) $\subseteq AP(S)$; however, Sub($H$) is not a complete sublattice of $AP(S)$, as follows from the next example.

Example 27. Consider the set $S = \{x, y, u, v\}$ and the subschemas $H_1$ and $H_2$ (represented in Fig. 8(a)
and (b), respectively) of the schema $H$ represented in Fig. 8(c). The least upper bound of $H_1$ and $H_2$ in $(\mathrm{Sub}(H), \le)$ is $H$ itself. However, the least upper bound of $H_1$, $H_2$ in $(AP(S), \le)$ is the schema represented in Fig. 9.

[Fig. 9. $\sup\{H_1, H_2\}$ in $AP(S)$.]

Let $\mathcal{C} = \{H_j \mid j \in J\}$ be a set of schemas from $AP(S)$. Important properties of schemas (like consistency and nonredundancy) are inherited by the greatest lower bound of this set of schemas. However, this is not true, in general, for the least upper bound, as follows from the following example.

Example 28. Consider the schemas $H_1$, $H_2$ given in Fig. 10(a) and (b), respectively. Both schemas are consistent and nonredundant. However, their least upper bound in $AP(S)$ loses both properties. In $\sup\{H_1, H_2\}$ all three types $x$, $y$ and $z$ are inconsistent.

As suggested by the above example, the loss of properties is due to the fact that the component schemas do not concur when restricted to the types they have in common. This observation leads to the definition of a mergeable family of schemas. As we shall see, for such families the least upper bound preserves the properties of consistency and nonredundancy.

Let $\mathcal{C} = \{H_j \mid H_j = (S_j, \sigma_j, \delta_j),\ j \in J\}$ be a collection of schemas.

Definition 29. The interaction set of $\mathcal{C}$, denoted $T(\mathcal{C})$, is the set of all types shared by at least two schemas of $\mathcal{C}$, i.e.

$$T(\mathcal{C}) = \{t \mid \exists j_1, j_2 \in J,\ j_1 \ne j_2,\ t \in S_{j_1} \cap S_{j_2}\}. \qquad (6)$$

Definition 30. $\mathcal{C}$ is a set of mergeable schemas if for all $j_1, j_2 \in J$ we have:

$$\sigma_{j_1} \cap (T(\mathcal{C}) \times T(\mathcal{C})) = \sigma_{j_2} \cap (T(\mathcal{C}) \times T(\mathcal{C}))$$

and

$$\delta_{j_1} \cap (T(\mathcal{C}) \times T(\mathcal{C})) = \delta_{j_2} \cap (T(\mathcal{C}) \times T(\mathcal{C})).$$

In what follows, we assume that $\mathcal{C}$ is a set of mergeable schemas as described by the above definition. As an immediate consequence of the above definition we have the equality $I(H_{j_1}) \cap T(\mathcal{C}) = I(H_{j_2}) \cap T(\mathcal{C})$ for all $j_1, j_2 \in J$.

Example 31. Consider the schemas $H_1$, $H_2$, $H_3$ and $H_4$ represented in Fig. 11. For the collection of schemas $\mathcal{C}_1 = \{H_1, H_2, H_3\}$ the interaction set is $T(\mathcal{C}_1) = \{x, y, z\}$. This collection is not mergeable due to the fact that the restrictions of the disjointness relations to $T(\mathcal{C}_1) \times T(\mathcal{C}_1)$ do not coincide: for one of the schemas the restriction is nonempty, while for another $\delta_j \cap (T(\mathcal{C}_1) \times T(\mathcal{C}_1)) = \emptyset$.

On the other hand, for the collection $\mathcal{C}_2 = \{H_1, H_2, H_4\}$ the interaction set is $T(\mathcal{C}_2) = \{y\}$ and the collection $\mathcal{C}_2$ can easily be verified to be mergeable.
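Definitions 29 and 30 translate almost literally into set operations. The following is a minimal sketch, assuming a schema is encoded as a triple `(S, sigma, delta)` of Python sets, with the relations stored as sets of pairs; the function names and the three small schemas `h1`, `h2`, `h3` are illustrative assumptions and are not the schemas of Fig. 11.

```python
from itertools import combinations


def interaction_set(type_sets):
    """Types shared by at least two schemas with distinct indices (Definition 29)."""
    shared = set()
    for a, b in combinations(type_sets, 2):
        shared |= a & b
    return shared


def restrict(relation, types):
    """Restriction of a binary relation to types x types."""
    return {(x, y) for (x, y) in relation if x in types and y in types}


def is_mergeable(schemas):
    """Literal reading of Definition 30 for a list of (S, sigma, delta) triples."""
    t = interaction_set([s for s, _, _ in schemas])
    sigmas = [restrict(sigma, t) for _, sigma, _ in schemas]
    deltas = [restrict(delta, t) for _, _, delta in schemas]
    return all(r == sigmas[0] for r in sigmas) and all(r == deltas[0] for r in deltas)


# Hypothetical schemas: x is-a y in h1, z is-a y in h2, y is-a x in h3.
h1 = ({"x", "y"}, {("x", "x"), ("y", "y"), ("x", "y")}, set())
h2 = ({"y", "z"}, {("y", "y"), ("z", "z"), ("z", "y")}, set())
h3 = ({"x", "y"}, {("x", "x"), ("y", "y"), ("y", "x")}, set())

print(interaction_set([h1[0], h2[0]]))  # the single shared type y
print(is_mergeable([h1, h2]))           # h1 and h2 agree on y
print(is_mergeable([h1, h3]))           # h1 and h3 disagree on {x, y}
```

The check compares every restriction with the first one, which is equivalent to the pairwise condition of Definition 30 because equality is transitive.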
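[Fig. 10. Consistent and nonredundant schemas: (a) $H_1$, (b) $H_2$.]

[Fig. 11. Schemas $H_1$, $H_2$, $H_3$ and $H_4$.]

The relations $\sigma_j$ enjoy special properties since they concur on $T(\mathcal{C})$ (as described in Definition 30):

Lemma 32. For any $n \in \mathbb{N}$, $n \ge 2$, and $j_1, \ldots, j_n \in J$ we have:

$$\sigma_{j_1}\sigma_{j_2} \cdots \sigma_{j_n} \subseteq \sigma_{j_1}\sigma_{j_n}.$$

Proof. The argument is by induction on $n$. The base case, $n = 2$, is obvious. Assume that the inclusion holds for $n = k$, i.e.

$$\sigma_{j_1}\sigma_{j_2} \cdots \sigma_{j_k} \subseteq \sigma_{j_1}\sigma_{j_k}.$$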
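For $n = k + 1$, using the induction hypothesis, we have:

$$\sigma_{j_1} \cdots \sigma_{j_k}\sigma_{j_{k+1}} \subseteq \sigma_{j_1}\sigma_{j_k}\sigma_{j_{k+1}}.$$

If $j_1 = j_k$ then, by the transitivity of $\sigma_{j_1}$, we shall have $\sigma_{j_1}\sigma_{j_k} \subseteq \sigma_{j_1}$. Similarly, if $j_k = j_{k+1}$, we shall have $\sigma_{j_k}\sigma_{j_{k+1}} \subseteq \sigma_{j_{k+1}}$ due to the transitivity of $\sigma_{j_{k+1}}$. In either case we have

$$\sigma_{j_1}\sigma_{j_k}\sigma_{j_{k+1}} \subseteq \sigma_{j_1}\sigma_{j_{k+1}}.$$

The case left is when $j_1 \ne j_k$ and $j_k \ne j_{k+1}$. Let $x, y, z, u$ be four types such that $(x, y) \in \sigma_{j_1}$, $(y, z) \in \sigma_{j_k}$ and $(z, u) \in \sigma_{j_{k+1}}$. By Definition 30 we have $y, z \in T(\mathcal{C})$ and, therefore, $(y, z) \in \sigma_{j_1}$, which implies that $(x, z) \in \sigma_{j_1}$ by transitivity and, hence, $(x, u) \in \sigma_{j_1}\sigma_{j_{k+1}}$. □

Lemma 33. If $(x, y) \in \sigma_{j_1} \cdots \sigma_{j_n}$ and $y \in I(H_j)$ for some $j \in J$ then $x \in I(H_{j_1})$.

Proof. We have $y \in I(H_{j_n})$ since $y$ is inconsistent in $H_j$ and $y \in T(\mathcal{C})$ if $j_n \ne j$; otherwise (that is, if $j_n = j$) we shall have trivially $y \in I(H_{j_n})$. Then, by Lemma 32 and since $\sigma_{j_1}$ is transitive, we have

$$(x, y) \in \prod_{p=1}^{n} \sigma_{j_p} \subseteq \sigma_{j_1}\sigma_{j_n}$$

and two applications of Lemma 3 give us $x \in I(H_{j_1})$. □

For a collection of schemas $\mathcal{C} = \{H_j \mid H_j = (S_j, \sigma_j, \delta_j),\ j \in J\}$ let us define the relations:

$$\bar\sigma = \bigcup_{j \in J} \sigma_j \qquad \text{and} \qquad \bar\delta = \bigcup_{j \in J} \delta_j$$

on the set $S = \bigcup_{j \in J} S_j$. Also, we define the set

$$\bar I = \bigcup_{j \in J} I(H_j).$$

Lemma 34. We have the following:

(1) $\bar\sigma^* = \bar\sigma \cup \bar\sigma^2$;
(2) $\bar\sigma^2\bar\delta \subseteq \bar\sigma\bar\delta$;
(3) $\bar\sigma\bar\delta\bar\sigma^{-1} \subseteq (\bar\sigma\bar\delta)^{sym}$.

Proof. (1) Since $\bar\sigma^* \supseteq \bar\sigma \cup \bar\sigma^2$, it is sufficient to show that $\bar\sigma^* \subseteq \bar\sigma \cup \bar\sigma^2$. Let $J^*$ be the free monoid generated by the set $J$; the null word of $J^*$ will be denoted by $e$. The set of words of length $n$ is $J^n$. For $u \in J^*$ we shall define the relation $\bar\sigma(u)$ by $\bar\sigma(e) = \iota_S$ and $\bar\sigma(uj) = \bar\sigma(u)\sigma_j$, for $j \in J$. It is easy to verify that for any $u_1, u_2 \in J^*$ we have $\bar\sigma(u_1u_2) = \bar\sigma(u_1)\bar\sigma(u_2)$. The transitive and reflexive closure of $\bar\sigma$ is given by:

$$\bar\sigma^* = \bigcup \{\bar\sigma(u) \mid u \in J^*\}. \qquad (7)$$

If $u = j_1j_2 \cdots j_n$, where $n \ge 2$, we have $\bar\sigma(u) = \sigma_{j_1}\sigma_{j_2} \cdots \sigma_{j_n} \subseteq \sigma_{j_1}\sigma_{j_n}$, due to Lemma 32. In view of this fact the equality (7) can be written:

$$\bar\sigma^* = \iota_S \cup \bar\sigma \cup \bar\sigma^2.$$

Since each $\sigma_j$ is reflexive on $S_j$ and $S = \bigcup_{j \in J} S_j$, it follows that $\iota_S \subseteq \bigcup_{j \in J} \sigma_j = \bar\sigma$; therefore, we have $\bar\sigma^* \subseteq \bar\sigma \cup \bar\sigma^2$.

(2) We have

$$\bar\sigma^2\bar\delta = \bigcup \{\sigma_i\sigma_j\delta_m \mid i, j, m \in J\}.$$

Consider the relation $\sigma_i\sigma_j\delta_m$. If $i = j$, we have $\sigma_i\sigma_j\delta_m \subseteq \sigma_i\delta_m$ by the transitivity of $\sigma_i$. If $j = m$, since $H_m$ satisfies (RAPiv), we have $\sigma_i\sigma_j\delta_m \subseteq \sigma_i\delta_m$. If neither equality holds, let $(x, y) \in \sigma_i$, $(y, z) \in \sigma_j$ and $(z, w) \in \delta_m$. By the equality (6) we have $y, z \in T(\mathcal{C})$ and thus $(y, z) \in \sigma_i$ due to the mergeability of the schemas. Therefore, $(x, w) \in \sigma_i\sigma_i\delta_m \subseteq \sigma_i\delta_m$. The above three cases imply the desired inclusion.

(3) We have

$$\bar\sigma\bar\delta\bar\sigma^{-1} = \bigcup \{\sigma_i\delta_j\sigma_m^{-1} \mid i, j, m \in J\}.$$

Consider the relation $\sigma_i\delta_j\sigma_m^{-1}$. If $i = j$ we have $\sigma_i\delta_j\sigma_m^{-1} \subseteq \delta_j\sigma_m^{-1}$ in view of the fact that $H_j$ satisfies (RAPiv). If $j = m$ we obtain $\sigma_i\delta_j\sigma_m^{-1} = \sigma_i(\sigma_m\delta_j)^{-1} \subseteq \sigma_i\delta_j$ due to the symmetry of $\delta_j$ and to axiom (RAPiv). If none of the previous equalities hold, let $(x, y) \in \sigma_i$, $(y, z) \in \delta_j$, $(z, w) \in \sigma_m^{-1}$. By the equality (6) we have $y, z \in T(\mathcal{C})$ and, therefore, $(y, z) \in \delta_i$, which implies $(x, w) \in \delta_i\sigma_m^{-1}$. The above cases imply that:

$$\bar\sigma\bar\delta\bar\sigma^{-1} = \bigcup \{\sigma_i\delta_j\sigma_m^{-1} \mid i, j, m \in J\} \subseteq \bigcup \{\sigma_i\delta_j \mid i, j \in J\} \cup \bigcup \{\delta_j\sigma_i^{-1} \mid i, j \in J\} = (\bar\sigma\bar\delta)^{sym}. \ □$$

The previous lemmas allow us to describe effectively the least upper bound of a collection of mergeable schemas.

Theorem 35. Let $\mathcal{C} = \{H_j \mid H_j = (S_j, \sigma_j, \delta_j),\ j \in J\}$ be a set of mergeable schemas. Then $\sup \mathcal{C}$ in $(AP(S), \le)$ is the schema $H$ given by $H = (S, \sigma^*, \delta^{sym})$ where

$$S = \bigcup_{j \in J} S_j, \qquad \sigma = \bar\sigma \cup (\bar I \times S), \qquad \delta = \bar\delta \cup \bar\sigma\bar\delta \cup (\bar I \times S).$$

Proof. We shall prove initially that $H$ satisfies the axioms (APi)-(APv). The relations $\sigma^*$ and $\delta^{sym}$ clearly satisfy axioms (APi) and (APii). We have $\{x \mid (x, x) \in \delta^{sym}\} \subseteq \bar I$. Observe that $(x, x) \in \delta^{sym}$ if and only if $(x, x) \in \delta$. Therefore, in order to show the above inclusion we have to consider one of the following cases:

(1) $(x, x) \in \bar\delta$, which gives $x \in \bar I$.

(2) $(x, x) \in \sigma_{j_1}\delta_{j_2}$. There is $t$ such that $(x, t) \in \sigma_{j_1}$ and $(t, x) \in \delta_{j_2}$. In view of the symmetry of $\delta_{j_2}$, we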
also have $(x, t) \in \delta_{j_2}$ and, due to the mergeability of the collection $\mathcal{C}$, we shall also have $(x, t) \in \delta_{j_1}$. Applying Lemma 3 we obtain $(x, x) \in \delta_{j_1}$, hence $x \in \bar I$.

(3) If $(x, x) \in \bar I \times S$ or $(x, x) \in S \times \bar I$ then, obviously, $x \in \bar I$.

The reverse inclusion, $\bar I \subseteq \{x \mid (x, x) \in \delta^{sym}\}$, is straightforward. Since $\bar I = I(H)$, the triple $H$ satisfies axioms (RAPiii) and (RAPv).

In order to prove that $H$ satisfies the axiom (RAPiv) we need to show that $(\bar\sigma \cup (\bar I \times S))^* = \bar\sigma^* \cup (\bar I \times S)$. The inclusion $(\bar\sigma \cup (\bar I \times S))^* \supseteq \bar\sigma^* \cup (\bar I \times S)$ is obvious. We need to prove only that if $(x, y) \in (\bar\sigma \cup (\bar I \times S))^*$ then $(x, y) \in \bar\sigma^* \cup (\bar I \times S)$. Assume that $* \notin J$ and let $J_* = J \cup \{*\}$ and $\sigma_* = \bar I \times S$. As we did in the proof of Lemma 34, we shall consider the relation $\bar\sigma(u)$, where $u \in J_*^*$. The reflexive and transitive closure of $\bar\sigma \cup (\bar I \times S) = \bar\sigma \cup \sigma_*$ is

$$(\bar\sigma \cup (\bar I \times S))^* = \bigcup_{u \in J_*^*} \bar\sigma(u).$$

If $(x, y) \in \bar\sigma(u)$ and $u \in J^*$ then, clearly, $(x, y) \in \bar\sigma^*$. If $u$ contains at least one occurrence of $*$ we can write $u = u_0 * u_1 * \cdots * u_n$, where $u_0, \ldots, u_n \in J^*$. There is a sequence of types $t_0, z_0, \ldots, t_n, z_n$ such that $t_0 = x$, $z_n = y$, $(t_0, z_0) \in \bar\sigma(u_0)$, $(z_0, t_1) \in \sigma_*$, ..., $(z_{n-1}, t_n) \in \sigma_*$, $(t_n, z_n) \in \bar\sigma(u_n)$. The definition of $\sigma_*$ implies $z_0, \ldots, z_{n-1} \in \bar I$. Applying Lemma 33 we obtain $x \in \bar I$, hence $(x, y) \in \bar I \times S$. This allows us to write

$$\sigma^*\delta^{sym} = (\bar\sigma^* \cup (\bar I \times S))\,\delta^{sym} \subseteq \bar\sigma^*\bar\delta \cup \bar\sigma^*\bar\sigma\bar\delta \cup \bar\sigma^*\bar\delta\bar\sigma^{-1} \cup (\bar I \times S) \cup (S \times \bar I)$$
$$\subseteq \bar\sigma\bar\delta \cup \bar\sigma^2\bar\delta \cup \bar\sigma\bar\delta\bar\sigma^{-1} \cup \bar\sigma^2\bar\delta\bar\sigma^{-1} \cup (\bar I \times S) \cup (S \times \bar I).$$

From Lemma 34 and the definition of $\delta$ it follows that $\sigma^*\delta^{sym} \subseteq (\bar\sigma\bar\delta)^{sym} \cup (\bar I \times S)^{sym} \subseteq \delta^{sym}$, which shows that (RAPiv) is satisfied.

From the definition of $H$ it is obvious that $H$ is an upper bound for $\mathcal{C}$. Let $M = (S', \sigma', \delta')$ be another upper bound for $\mathcal{C}$. We have $\bar\sigma \subseteq \sigma'$, $\bar\delta \subseteq \delta'$, $S \subseteq S'$ and $\bar I \subseteq I(M)$. Therefore, $\bar I \times S \subseteq I(M) \times S' \subseteq \sigma'$ since $M$ satisfies (RAPiii). Thus $\sigma \subseteq \sigma'$. In a similar manner we can prove that $\delta \subseteq \delta'$. Thus $H$ is the least upper bound of $\mathcal{C}$. □

Corollary 36. In the above setting, each $H_j$, $j \in J$, is consistent iff $\sup\{H_j \mid j \in J\}$ is consistent.

The previous corollary shows that the least upper bound of mergeable schemas preserves the consistency property. The next theorem shows that nonredundancy is also preserved:

Theorem 37. For any collection of mergeable schemas $\mathcal{C} = \{H_j \mid j \in J\}$ the schema $H = \sup \mathcal{C}$ is nonredundant and consistent if and only if all schemas $H_j$ are nonredundant and consistent.

Proof. Suppose that the schema $H$ is nonredundant and consistent. We already proved that each of the schemas $H_j$ is consistent. Assume that $H_j$ is redundant. There are two types $u, v \in S_j$ such that $u \ne v$ and $(u, v), (v, u) \in \sigma_j$. Since $\sigma_j \subseteq \bar\sigma \subseteq \sigma$ this would imply that $H$ is redundant. Thus, if the least upper bound is nonredundant then each of the $H_j$ is nonredundant.

Suppose now that each $H_j$ is nonredundant and consistent. The schema $H$ is also consistent due to Corollary 36 and we get $\sigma^* \subseteq \bar\sigma \cup \bar\sigma^2$. Thus, to finish the proof of the theorem, it is sufficient to show that $\sigma' = \bar\sigma \cup \bar\sigma^2$ is nonredundant, i.e. it is impossible to have $(x, y), (y, x) \in \sigma'$ with $x \ne y$. The proof is by contradiction and it has three nontrivial cases:

(1) $(x, y) \in \sigma_i$, $(y, x) \in \sigma_j$, $i \ne j$. Then we have $x, y \in T(\mathcal{C})$ and, by Definition 30, $(y, x) \in \sigma_i$, which is impossible.

(2) $(x, y) \in \sigma_i$, $(y, x) \in \sigma_j\sigma_k$. If $j = k$ then, by the transitivity of $\sigma_j$, we have the situation in case (1). Suppose $i = j$ and let $(y, w) \in \sigma_i$ and $(w, x) \in \sigma_k$. Then, by transitivity of $\sigma_i$, we have $(x, w) \in \sigma_i$ and we are again in the situation of case (1). Otherwise let $(y, w) \in \sigma_j$, $(w, x) \in \sigma_k$. Then, by Definition 30, we have $y, w \in T(\mathcal{C})$ and, thus, $(y, w) \in \sigma_i$ which, by transitivity, implies $(x, w) \in \sigma_i$, another instance of the situation in case (1).

(3) $(x, y) \in \sigma_i\sigma_m$, $(y, x) \in \sigma_j\sigma_k$. The cases when $i = m$ or $j = k$ are reduced, by transitivity, to cases (1) and/or (2) above. Otherwise, let $(x, z) \in \sigma_i$, $(z, y) \in \sigma_m$, $(y, w) \in \sigma_j$, $(w, x) \in \sigma_k$. By Definition 30 we have $z, w \in T(\mathcal{C})$. If $m = j$ then, by transitivity of $\sigma_m$, we have $(z, w) \in \sigma_m$, thus $(z, w) \in \sigma_i$ and $(x, w) \in \sigma_i$, which is a reduction to case (1). If $m \ne j$ then $y \in T(\mathcal{C})$ and therefore, by Definition 30 and transitivity, $(x, w) \in \sigma_i$, another reduction to case (1). □

Example 38. We have seen that the collection $\mathcal{C}_2 = \{H_1, H_2, H_4\}$ from Example 31 is mergeable. The schema $\sup \mathcal{C}_2$, represented in Fig. 12, is consistent and nonredundant since schemas $H_1$, $H_2$ and $H_4$ have these properties.
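The construction of Theorem 35 can be followed step by step on finite schemas. The sketch below is an illustration under stated assumptions, not the authors' implementation: schemas are triples `(S, sigma, delta)` of Python sets, inconsistent types are taken to be those $x$ with $(x, x) \in \delta$ (as in the proofs above), and the input is assumed to be mergeable, which the code does not check. The helper names and the two example schemas are hypothetical.

```python
def compose(r, s):
    """Relational product: pairs (x, z) with (x, y) in r and (y, z) in s."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def closure(relation, types):
    """Reflexive and transitive closure of `relation` on `types`."""
    result = set(relation) | {(t, t) for t in types}
    while True:
        extra = compose(result, result) - result
        if not extra:
            return result
        result |= extra


def sup_mergeable(schemas):
    """Least upper bound described in Theorem 35 (input assumed mergeable)."""
    types = set().union(*(s for s, _, _ in schemas))
    sigma_bar = set().union(*(sg for _, sg, _ in schemas))
    delta_bar = set().union(*(d for _, _, d in schemas))
    # I-bar: types that are inconsistent in at least one component schema.
    i_bar = {x for (x, y) in delta_bar if x == y}
    i_times_s = {(x, y) for x in i_bar for y in types}
    sigma = closure(sigma_bar | i_times_s, types)           # sigma*
    delta = delta_bar | compose(sigma_bar, delta_bar) | i_times_s
    delta_sym = delta | {(y, x) for (x, y) in delta}         # delta^sym
    return types, sigma, delta_sym


# Hypothetical schemas sharing only the type y:
# g1: x is-a y;  g2: z is-a y, and y (hence z) is disjoint from w.
g1 = ({"x", "y"}, {("x", "x"), ("y", "y"), ("x", "y")}, set())
g2 = ({"y", "z", "w"},
      {("y", "y"), ("z", "z"), ("w", "w"), ("z", "y")},
      {("y", "w"), ("w", "y"), ("z", "w"), ("w", "z")})

types, sigma, delta = sup_mergeable([g1, g2])
print(("x", "w") in delta)  # x inherits the disjointness of y from w
```

In this example no type becomes inconsistent, and the only new fact in the merged schema is that `x` is disjoint from `w`, obtained from the term $\bar\sigma\bar\delta$.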
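[Fig. 12. Schema $\sup \mathcal{C}_2$.]

Given two relations $\sigma$ and $\delta$ on a set $S$ such that $\sigma$ is reflexive and transitive and $\delta$ is symmetric, the triple $(S, \sigma, \delta)$ is not, in general, an AP-schema, since $\delta$ and $\sigma$ need not satisfy the axioms (APiii)-(APv). Nevertheless, we have the following result:

Lemma 39. The relations $\sigma$ and $\delta$ can be extended to two relations $\sigma_\infty$ and $\delta_\infty$ such that:

(1) $(S, \sigma_\infty, \delta_\infty)$ will satisfy (APi)-(APv) and
(2) if $H' = (S, \sigma', \delta')$ is any other schema such that $\sigma \subseteq \sigma'$ and $\delta \subseteq \delta'$ then $(S, \sigma_\infty, \delta_\infty) \le (S, \sigma', \delta')$.

Proof. Consider the sequence of pairs of relations $\{(\sigma_i, \delta_i) \mid i \in \mathbb{N}\}$ defined by $\sigma_0 = \sigma$, $\delta_0 = \delta$ and

$$\sigma_{i+1} = (\sigma_i \cup (I_i \times S))^*, \qquad (8)$$

and

$$\delta_{i+1} = \delta_i \cup (I_i \times S) \cup (S \times I_i) \cup \sigma_i\delta_i \cup \delta_i\sigma_i^{-1}, \qquad (9)$$

where $I_i = \{x \mid x \in S,\ (x, x) \in \delta_i\}$ for $i \ge 0$. Let us remark that all relations $\sigma_i$ are reflexive and transitive and all relations $\delta_i$ are symmetric. Furthermore, they form two ascending chains:

$$\sigma_0 \subseteq \sigma_1 \subseteq \cdots \subseteq \sigma_i \subseteq \cdots, \qquad \delta_0 \subseteq \delta_1 \subseteq \cdots \subseteq \delta_i \subseteq \cdots$$

Therefore, the relation $\sigma_\infty$ given by

$$\sigma_\infty = \bigcup_{i \in \mathbb{N}} \sigma_i$$

is reflexive and transitive and $\delta_\infty$ defined by

$$\delta_\infty = \bigcup_{i \in \mathbb{N}} \delta_i$$

is symmetric. We claim that the triple $H_\infty = (S, \sigma_\infty, \delta_\infty)$ satisfies the axioms (APiii)-(APv). The set of inconsistent types is

$$I(H_\infty) = \bigcup_{i \in \mathbb{N}} I_i$$

due to the definition of $\delta_\infty$. Consider now a type $x \in I(H_\infty)$ and let $y$ be an arbitrary type from $S$. Since $x \in I_i$ for some $i \ge 0$ we have $(x, y) \in I_i \times S \subseteq \sigma_{i+1}$, which implies $(x, y) \in \sigma_\infty$. This shows that $H_\infty$ satisfies the axiom (APiii).

To prove that $H_\infty$ satisfies (APiv) consider three types $x, y, z \in S$ such that $(x, y) \in \delta_\infty$ and $(z, x) \in \sigma_\infty$. There exist $i, j \in \mathbb{N}$ such that $(z, x) \in \sigma_i$ and $(x, y) \in \delta_j$. Since we have the ascending chains $\{\sigma_i \mid i \in \mathbb{N}\}$ and $\{\delta_i \mid i \in \mathbb{N}\}$ we can write $(z, x) \in \sigma_k$ and $(x, y) \in \delta_k$, where $k = \max\{i, j\}$, which gives, in turn,

$$(z, y) \in \sigma_k\delta_k \subseteq \delta_{k+1} \subseteq \delta_\infty.$$

Finally, let us verify that $H_\infty$ satisfies (APv). Assume that $(x, x) \in \delta_\infty$ and let $y \in S$ be an arbitrary type. Since $x \in I_i$ for some $i \in \mathbb{N}$ it follows that $(x, y) \in I_i \times S \subseteq \delta_{i+1} \subseteq \delta_\infty$, which proves that $H_\infty$ satisfies (APv).

Consider now a schema $H' = (S, \sigma', \delta')$ such that $\sigma \subseteq \sigma'$ and $\delta \subseteq \delta'$. We shall prove by induction on $i \in \mathbb{N}$ that $\sigma_i \subseteq \sigma'$ and $\delta_i \subseteq \delta'$. The initial case for $i = 0$ follows immediately from the assumption made above. Suppose that $\sigma_i \subseteq \sigma'$ and $\delta_i \subseteq \delta'$. Let $(x, y) \in \delta_{i+1}$. We have to consider the following five cases:

(1) $(x, y) \in \delta_i$. We have $(x, y) \in \delta'$ due to the inductive hypothesis.
(2) $(x, y) \in I_i \times S$. In this case $(x, x) \in \delta_i \subseteq \delta'$, which implies $(x, y) \in \delta'$ due to the fact that $\delta'$ satisfies the axiom (APiv).
(3) $(x, y) \in S \times I_i$. This case is similar to the second case.
(4) $(x, y) \in \sigma_i\delta_i$. There is $u \in S$ such that $(x, u) \in \sigma_i$ and $(u, y) \in \delta_i$. Therefore, $(x, u) \in \sigma'$ and $(u, y) \in \delta'$, hence $(x, y) \in \sigma'\delta' \subseteq \delta'$.
(5) $(x, y) \in \delta_i\sigma_i^{-1}$. In this situation $(y, x) \in \sigma_i\delta_i$, which gives $(y, x) \in \delta'$. Since $\delta'$ is symmetric we obtain $(x, y) \in \delta'$.

Let now $(x, y) \in \sigma_{i+1}$. If $x = y$ we have clearly $(x, y) \in \sigma'$. Therefore, we may assume that there exists a sequence of elements of $S$, $t_0, \ldots, t_n$, with $n \ge 1$, such that $x = t_0$, $t_n = y$ and $(t_p, t_{p+1}) \in \sigma_i \cup (I_i \times S)$ for $0 \le p \le n - 1$. Each pair $(t_p, t_{p+1})$ belongs to $\sigma'$: either by the inductive hypothesis, or because $t_p \in I_i \subseteq I(H')$ and $H'$ satisfies (APiii). The transitivity of $\sigma'$ then gives $(x, y) \in \sigma'$. □

When $S$ is finite the construction above is effective, since the ascending chains

$$\sigma_0 \subseteq \sigma_1 \subseteq \cdots \subseteq \sigma_i \subseteq \cdots, \qquad \delta_0 \subseteq \delta_1 \subseteq \cdots \subseteq \delta_i \subseteq \cdots$$

must stabilize after no more than $|S|^2$ steps. The most complex operation in each step is the computation of the reflexive and transitive closure of $\sigma_i \cup (I_i \times S)$, which can be done in $O(|S|^3)$ time. Therefore, the algorithm (in its present form) requires a time of $O(|S|^5)$.

We are now in the position to articulate the outline of a design methodology for building a database schema from smaller schemas.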
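The iteration (8)-(9) from the proof of Lemma 39 can be run directly on finite relations. What follows is a minimal sketch, assuming relations are stored as Python sets of pairs; the function names are illustrative, and the closure is computed by naive repeated composition rather than by an $O(|S|^3)$ algorithm, so it does not reproduce the complexity bound quoted above.

```python
def compose(r, s):
    """Relational product: pairs (x, z) with (x, y) in r and (y, z) in s."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def closure(relation, types):
    """Reflexive and transitive closure of `relation` on `types`."""
    result = set(relation) | {(t, t) for t in types}
    while True:
        extra = compose(result, result) - result
        if not extra:
            return result
        result |= extra


def saturate(types, sigma, delta):
    """Iterate equations (8) and (9) until both relations stop growing."""
    sigma = closure(sigma, types)  # sigma_0 is assumed reflexive and transitive
    delta = set(delta)
    while True:
        bad = {x for (x, y) in delta if x == y}             # I_i
        i_times_s = {(x, y) for x in bad for y in types}
        s_times_i = {(y, x) for (x, y) in i_times_s}
        sigma_inv = {(y, x) for (x, y) in sigma}
        new_sigma = closure(sigma | i_times_s, types)       # equation (8)
        new_delta = (delta | i_times_s | s_times_i
                     | compose(sigma, delta)
                     | compose(delta, sigma_inv))           # equation (9)
        if new_sigma == sigma and new_delta == delta:
            return sigma, delta
        sigma, delta = new_sigma, new_delta


# Hypothetical input: x is-a y, x is-a z, and y is disjoint from z.
S = {"x", "y", "z"}
sigma0 = {("x", "y"), ("x", "z")}
delta0 = {("y", "z"), ("z", "y")}

sigma_inf, delta_inf = saturate(S, sigma0, delta0)
print({t for (t, u) in delta_inf if t == u})  # the types found inconsistent
```

On this input the iteration derives that `x` is disjoint from itself, which is the kind of inconsistency described in Example 28: a type placed below two disjoint types.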
Let now $\{H_i \mid i \in I\}$ be a family of schemas, not necessarily mergeable. Define $S = \bigcup\{S_i \mid i \in I\}$, $\sigma = (\bigcup\{\sigma_i \mid i \in I\})^*$ and $\delta = \bigcup\{\delta_i \mid i \in I\}$. By applying the previous lemma to the triple $(S, \sigma, \delta)$ we obtain the schema $H_\infty = (S, \sigma_\infty, \delta_\infty)$. In this manner we have the possibility of putting together schemas which are not mergeable. We shall denote by $R_i$ the subschema of $H_\infty$ whose set of types is $S_i$. As we have previously noticed, the family $\{R_i \mid i \in I\}$ is mergeable. Moreover, we have the following

Corollary 40. The schema $H_\infty$ is $\sup\{H_i \mid i \in I\}$ in the complete lattice $(AP(S), \le)$.

Proof. It is obvious that $H_\infty$ is an upper bound of the family $\{H_i \mid i \in I\}$ in $AP(S)$. Suppose that $H_i \le H' = (S, \sigma', \delta')$ in $AP(S)$ for all $i \in I$. We have

$$\bigcup_{i \in I} \sigma_i \subseteq \sigma' \qquad \text{and} \qquad \bigcup_{i \in I} \delta_i \subseteq \delta',$$

hence

$$\sigma \subseteq \sigma' \qquad \text{and} \qquad \delta \subseteq \delta'$$

in view of the reflexivity and transitivity of $\sigma'$ and of the symmetry of $\delta'$. Applying the previous lemma we obtain $H_\infty \le H'$, which shows that $H_\infty$ is, indeed, the least upper bound of the collection $\{H_i \mid i \in I\}$. □

Example 41. Consider the collection $\mathcal{C}_1 = \{H_1, H_2, H_3\}$ of non-mergeable schemas. Following the approach outlined above we consider the set of types $S = S_1 \cup S_2 \cup S_3$. Due to the non-mergeability, the schema $H_\infty$ given in Fig. 13 is inconsistent despite the fact that all participant schemas are consistent.

[Fig. 13. Schema $H_\infty = \sup \mathcal{C}_1$.]

5. CONCLUSIONS

The mathematical properties investigated above have significant consequences for the design of database schemas. The study of the complete lattice of schemas and the introduction of the notion of a family of mergeable schemas allowed us to formulate a design methodology for database schemas which is capable of eliminating redundancy and inconsistencies by analyzing the local component subschemas which appear naturally in a bottom-up approach to the design process.

The notions of view and repository are important for a clear differentiation between the notions of specialization and generalization. The introduction of schemas with repositories makes possible the formalization of the construction process of a schema. As we have shown, the order in which the types (together with their elementary schemas) are added to the database schema is semantically significant, since it gives us the possibility of identifying those types produced through generalization vs the types produced through specialization.

We believe that there are many more aspects of database schema semantics which deserve attention. Among other issues, we mention the extension of the notion of mergeability to schemas with repositories, the study of mappings between schemas which would formalize abstraction and generalization between databases, and the investigation of other types of constraints between the participating types.

REFERENCES

[1] A. V. Aho, J. E. Hopcroft and J. D. Ullman. Data Structures and Algorithms. Addison-Wesley, Reading, Mass. (1983).
[2] P. Atzeni and D. S. Parker. Formal properties of net-based knowledge representation schemes. In Proc. of the 2nd IEEE Conf. on Data Engineering, pp. 700-706 (1986).
[3] S. Abiteboul and R. Hull. IFO: a formal semantic database model. ACM Trans. Database Syst. 12(4), 525-565 (1987).
[4] A. Borgida, J. Mylopoulos and H. K. T. Wong. Generalization-specialization as a basis for software specification. In On Conceptual Modelling (Edited by M. L. Brodie, J. Mylopoulos and J. W. Schmidt), pp. 87-116. Springer, New York (1984).
[5] M. L. Brodie and D. Ridjanovic. On the design and specification of database transactions. In On Conceptual Modelling (Edited by M. L. Brodie, J. Mylopoulos and J. W. Schmidt), pp. 277-306. Springer, New York (1984).
[6] P. P. Chen. The entity-relationship model-toward a unified view of data. ACM Trans. Database Syst. 1(1), 9-36 (1976).
[7] R. Hull and R. King. Semantic Database Modelling: Survey, Applications, and Research Issues, TR-86-201. Computer Science Department, University of Southern California, Los Angeles (1986).
[8] R. Hull. A survey of theoretical research on typed complex database objects. In Databases (Edited by J. Paredaens), pp. 193-256. Academic Press, New York (1987).
[9] R. Reiter. On the integrity of typed first order data bases. In Advances in Database Theory (Edited by H. Gallaire, J. Minker and J.-M. Nicolas), Vol. 1, pp. 137-157. Plenum Press, New York (1981).
[10] J. M. Smith and D. C. P. Smith. Database abstractions: aggregation and generalization. ACM Trans. Database Syst. 2(2), 105-133 (1977).