I DATA & KNOWLEDGE ENGINEERING ELSEVIER
Data & Knowledge Engineering 26 (1998) 99-115
Supporting update propagation in object-oriented databases* Chengfei Liu, Hui Li, Maria E. Orlowska** Cooperative Research Center (CRC)for Distributed Systems Technology and School of Information Technology, The University of Queensland, Qld. 4072, Australia Received 6 December 1996; revised manuscript received 14 April 1997; accepted 13 May 1997
Abstract
Objects in a database are interrelated, When an update operation is applied to an object, it may also impact on its related objects, depending on the semantics of their relationships. Current OODBMSs provide no support for update propagation but hard-coding. In this paper, we study update propagation support for generic update operations in object-oriented databases. We take a declarative approach, specifying propagation policies for each identified reference attribute in classes of an object-oriented database schema. Propagation policies for generic update propagation are well defined. However, we also discover that potential conflicts among propagation policies may occur if the policies can be arbitrarily specified by a designer. Therefore, we promote the update propagation problem to a higher level, investigating possible dependencies between objects. As such, the designer only needs to specify the dependency property for each reference attribute. Propagation policies are predefined for each type of dependency. By introducing some restrictions on an object-oriented database schema, conflict-free propagation policies can be achieved. Implementation issues for update propagation support in object-oriented database systems are also addressed.
Keywords: Object-oriented databases; Update propagation; Propagation policy; Object dependency
I. Introduction Objects in a database are interrelated. W h e n an update operation is applied to an object, it m a y also i m p a c t on its related objects, d e p e n d i n g on the semantics o f their relationships. Let us c o n s i d e r an e x a m p l e for electronic d o c u m e n t p r o c e s s i n g [12]. S u p p o s e that a document consists o f a title, an author and a n u m b e r o f sections. A section in turn is c o m p o s e d o f paragraphs. A d o c u m e n t m a y share sections with other d o c u m e n t s . A section exists o n l y if it b e l o n g s to at least one d o c u m e n t and c a n n o t be deleted as long as there exists a d o c u m e n t using it. Similarly, a p a r a g r a p h m a y be shared a m o n g different sections. F o r a p a r a g r a p h to exist, there m u s t be at least one section containing it and * The work reported in this paper has been funded in part by the Cooperative Research Centres Program through the Department of the Prime Minister and Cabinet of the Commonwealth Government of Australia. ** Corresponding author. Tel.: (61-7) 3365-2989; fax: (61-7) 3365-1999; e-mail:
[email protected] 0169-023X/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved
PII: S 0 1 6 9 - 0 2 3 X ( 9 7 ) 0 0 0 2 2 - 0
100
C. Liu et al. / Data & Knowledge Engineering 26 (1998) 99-115
thus a document containing it. A paragraph cannot be deleted individually either. Annotation paragraphs may be added to documents, however, they are not shared among different documents. For an annotation paragraph to exist, the document which it annotates must also exist. Further, documents may contain images that are extracted from other files. The existence of images, however, does not depend on the documents using them. Due to these different relationships, when an update operation, say delete, is applied to a document object, the propagations to its related sections, annotation paragraphs and images are also different. (i) For each of its related sections, the section is deleted if it is no longer shared by other documents. The propagated deletion of a section object may impact on its related paragraphs in the same way. (ii) All of its annotation paragraphs are deleted unconditionally. (iii) None of its related images are deleted. Current object-oriented database systems provide no support for update propagation. Usually, a designer has to implement an ad hoc method in a class which does the update to the objects of the class as well as the propagation to all the related objects for each update operation. This hard-coding approach is obviously cumbersome, where the propagation rules are buried in the code, making them hard to understand and control. In addition, propagation rules cannot be changed dynamically. Also, similar forms of propagation defined on different object types cannot be reused. Ideally we shall be able to support update propagation in a declarative way, such that the designer only needs to specify what to do, not how to do it. In relational database systems, efforts have been put into support for referential integrity constraints [3,4,15,16]. This can be regarded as a restricted form of update propagation applying to the insert, update and delete operations by specifying referential integrity rules. However, referential integrity rules are not sufficient to specify all update propagation rules. For instance, in the above electronic documents example, the propagation rule of deletion on document objects to section objects across the relationship Share is hard to express using referential integrity rules. This propagation is somehow restricted, i.e., only the last document object which shares a section object will propagate the deletion to the section object. Since a document may contain a set of sections, and a section may be shared by many documents (i.e., not dependent exclusively to a document), the relationship Share between document objects and section objects is an m:n relationship. In relational database systems, an extra relation must be introduced to represent an m:n relationship. Therefore, in the example, a relation called Doe_Share_See is defined to reflect the relationship Share between document objects (modelled as Document relation) and section objects (modelled as Section relation). Document(title, author,-" .) Section(section#, • • .) Doc_Share_Sec(title, s e c t i o n # , . . - ) The primary key of each relation is underlined. There are two referential integrity constraints: Doc Share Sec.title C_Document.title Doc Share Sec.section# C_ Section.section# As we may observe, what referential integrity rules can specify in this scenario is limited to the impact on the tuples of the relation DocShare_Sec when an update operation is applied to either
C. Liu et al. / Data & Knowledge Engineering 26 (1998) 99-115
101
Document tuples or Section tuples. They cannot specify what should be done to Section tuples when the update operation is applied to Document tuples across the relation Doc_Share_Sec. Much work has been done in representing and enforcing integrity constraints in object-oriented database systems [5-7,9,10]. Although update propagation could be considered as a case of integrity constraint maintenance, the specification of update propagation rules, however, is slightly different from specification of integrity rules. An update propagation rule emphasises how the database can evolve from one valid state to another, while an integrity rule emphasises what are valid states of databases. Rumbaugh [17] proposed a simple mechanism for controlling operation propagation across relationships for general object-oriented programming languages. It is based on associating propagation attributes for particular operations with the relationships. However, this work has not addressed update propagation in object-oriented database systems specifically. There is no concern about potential conflict in propagation attribute declarations. In this paper, we focus on update propagation in object-oriented databases. We realize that automatic system support for application-specific update propagation is hard to achieve. Therefore, our discussion is concentrated on propagation of generic update operations. Similar to [17], propagation policies are specified for each generic update operation on reference attributes in all classes of an object-oriented database schema. Possible update propagation policies are carefully studied. However, we recognise the potential policy conflict problem if policies can be arbitrarily specified by a designer. Therefore, we promote the update propagation problem to a higher level. We investigate possible dependencies between arbitrary pairs of objects such that the dcsigner needs only to specify the dependency property for each reference attribute. Propagation policies for each type of dependencies are predefined. With some restrictions on an object-oriented database schema, the system will generate propagation policies automatically and guarantees conflict-free propagation.
2. Update propagation in OODBs 2.1. Object-oriented databases Object-oriented databases [2,11] were proposed to meet the needs of advanced database applications, such as CAD/CAM, software engineering, spatial databases and multimedia databases, where applications require more complex structures for objects, new data types, and the need to define nonstandard application-specific operations. In object-oriented databases, any real-world entity is uniformly modelled as an object. Each object has a state and a behaviour associated with it. The state of an object is defined by values of its attributes and the behaviour of an object is specified by the methods that operate on the object state. They are encapsulated together and accessed or invoked from outside the object through explicit message passing or function calls. An object is uniquely identified by a system-generated object identifier (OLD) which, unlike the primary key in relational databases, cannot be modified by applications. The value of an attribute of an object is also an object in its own right, which can be both primitive and non-primitive objects. A non-primitive object in turn consists of a set of attribute values. Furthermore, an attribute of an object may take on a single value or a set of values. In contrast, the relational data model only allows an attribute to take on a single primitive
102
C. Liu et al. / Data & Knowledge Engineering 26 (1998) 99-115
object as value. Objects with the same set of attributes and methods are grouped in classes. The intentional notion of a class corresponds to an Abstract Data Type definition. Definition 1. A class named as C is defined as C = ( ~, {Csup}, {A : CA}, {M : {P : Ce}}) (a) c~ is the extent of C, c~ = {o} where o is an object (instance) of C. (b) {Csup} is the set of super-classes of C. (c) {A : CA} is the set of attribute definitions of C, where A and CA represent name and domain type of an attribute, respectively. The value of attribute A of an object o is expressed by o.A. (d) {M :{P:Cp}} is the set of method definitions of C, where M, P and Cp are method name, parameter name and parameter domain type, respectively. An object-oriented database schema S is defined by a set of classes
S = {C~, C2,..-, C.} where Ci(1 <~i <~n) is a class name. For the discussion of this paper, we are only concerned with the extent and attributes of a class. Attributes can be classified as two types: literal attributes and reference attributes. A literal attribute takes immutable object(s) as value, while a reference attribute takes mutable object(s) as value. As defined in ODMG-93 object model [2], a relationship between objects of two classes C i and Cj in object-oriented databases can be expressed as a pair of reference attributes (traversal paths as in O D M G - 9 3 ) Are f and A~ef, where Are f is a reference attribute of C i with Cj as its domain type, and A~ef is a reference attribute of Cj with C i as its domain type. A~ef is called the inverse reference attribute of Are f and vice versa. To emphasise the reference relationship, we use the following notation to represent a reference attribute Are f of class C i with class Cj as its domain type, Aref: c~i---) ~j where c~i and c~i are extents of (7,. and Cj respectively. If Are f is a single-valued reference attribute, A ref of every object in ~; takes one object (i.e., its OID) in ~j as its value. If Ar~f is set-valued, A ref of every object in c~; can take a set of objects (OIDs) in c~j as its value. For the purpose of this discussion, we treat a single-valued reference attribute as also taking a set as value, but the set can hold at most one OID. The directional representation contrasts to the in-directional key value based representation of the relational model. In relational databases, a relationship is usually represented as a referential integrity constraint of primary key values in a relation Ri and foreign key values in another relation Rj. If the relationship is of re:n, an extra artificial relation R r must be created to represent the relationship, with referential integrity constraints between R; and R r, and between Rj and Rr. There is no direct constraint between Ri and Rj. This is why an update operation fails to be propagated either from tuples of R; to Rj or from tuples of Rj to R;. However, the m:n relationship can be expressed as a pair of set-valued reference attributes in object-oriented attributes, without introducing an extra class. In order to simplify the representation of the update propagation problem in object-oriented databases, we assume the so-called relational integrity constraints [9] always hold for any relationship,
C. Liu et al. / Data & Knowledge Engineering 26 (1998) 99-115
103
i.e., if a reference attribute Are f : ~i '--> ~y exists in class Ci, the inverse reference attribute A~e f of Are f also exists in the class C~. In the electronic document processing example, four classes Document, Section, Paragraph, Image can be defined for the object-oriented database schema S~do~, i.e., Sedo~ = {Document, Section, Paragraph, Image}. In the following, we give type definitions of these four classes. Every reference attribute has an inverse reference attribute specified by an inverse clause. class Document type { title: String; author: String; content: Set(Section)inverse Section: shared by; annotation: Set(Paragraph)inverse Paragraph: annotated_for; figure: Set(Image)inverse Image: referred_by
} class Section type { content: Set(Paragraph) inverse Paragraph: shared by; shared by: Set(Document)inverse Document: Content
} class Paragraph type { annotated_for: Document inverse Document: annotation; shared_by: Set{Section)inverse Section: content
} class Image type { image_file: File; referred by: Set(Document) inverse Document: figure
} 2.2. Problem specification The update propagation problem can be stated as follows: When an update operation is applied to an object, the update is not just applied to the object only, it may also propagate to its related objects across its relationships to these objects according to application-specific rules, until no new objects can be reached.
Usually update propagation is not symmetric, propagation on an object at one end of a relationship may not be the same as that at the other end.
104
C. Liu et al. / Data & Knowledge Engineering 26 (1998) 9 9 - 1 1 5
In most OODBMSs, an application designer has to implement a method for each update operation as well as its propagation for each class, no matter how similar the operations are. The semantics of an update operation on individual objects is mixed with the semantics of propagation along t h e relationships. As we may observe, relationships between objects are the main reason of propagation of a generic update operation. The selection of which objects the operation should propagate to can be separated as a property of the relationship between objects, rather than a property of the operation as a whole. As discussed above, a relationship in OODBs can always be represented as a pair of reference attributes. So it is possible for designers to simply declare a propagation policy on each reference attribute for each type of update operation. Given an object-oriented database schema, we define a reference graph to reflect all relationships between classes in the schema, Based on a reference graph, we further define a propagation graph for specifying propagation policies for a generic update operation.
Definition 2. Given an object-oriented database schema S = {C l , C2," • • ,Cn}, a reference graph of the schema S is defined as a labelled directed graph RG = (VRc, Eel;, LRc) where (a) VRc= S, (b) ERe is a finite set of arcs. If a reference attribute A ~f: T~i --> ~j is defined in class C i of S, then an arc e
~ ERG, (C) LRC is a set of labels for edges by applying a labelling function f : ERo ~ d × ~1, where ~ / i s a set of reference attribute names. The function f maps each arc e = (Ci, C~) E Eu~; to a pair ~ S~ X ,~, where A ref is the name of a reference attribute A ref : ~i ~ ~¢~J'and A ief is the name of the inverse reference attribute of Are~. Fig. 1 shows the reference graph for the electronic document database schema S~doc. =
Definition 3. Given an object-oriented database schema S = {C I , C 2 , " • " , Cn}, a propagation graph of the schema S for a generic update operation U is defined by PG = (Vpc,, EpG, Le6) where (a)
= s,
Document
Image
Section
(content,shared_by
Paragraph Fig. 1. Reference graph for electronic document example.
C. Liu et al. / Data & Knowledge Engineering 26 (1998) 99-115
105
(b) EpG is a finite set of arcs. If a reference attribute Aref: (~i " ~ (~j is defined in class Ci of S, then an arc e = (C/, Cj) ~ ER~, (c) Lec is a set of labels for edges by applying a labelling function g : Epz ~ ~¢ × ~4 × ~, where s¢ is a set of reference attribute names and ~ is a set of propagation policies for the operation U. The function g maps each arc e = (C i, Cj) E E p c to a triple (Aref, A~ef, P) E ~ × ~ X ~, where Are f is the name of a reference attribute Ar~f: ~, --4 4 ' A[ef is the name of the inverse reference attribute of A r~f, and P is a propagation policy defined on e for the operation U. Propagation policies for each generic update operation are defined later in Section 2.4. The propagation graph PG of a schema S for an update operation U can be constructed from the reference graph RG of S by simply augmenting the label with a propagation policy, i.e., associating a propagation policy with each reference attribute for the operation U. We will show the propagation graph of S~do~ later in this section after propagation policies for each generic update operation are introduced.
2.3. Generic update operations In object-oriented databases, a generic update operation consists of deleting an object, inserting an object, and modifying an attribute value. Usually, modification of literal attribute values of an object will not impact on other objects, provided the object-oriented database is well designed and update anomalies will not occur during updating [13]. Furthermore, since a reference attribute value is an OID or a set of OIDs, OIDs can only be added or removed (i.e., adding a reference or removing a reference), but cannot be modified. Therefore, the generic update operations which will result in propagation are restricted as follows: • delete an object • remove a reference • insert an object • add a reference
2.4. Propagation policies Each reference attribute Aref: ~ ~ 4 has a propagation policy defined tbr each generic update operation U. The propagation policy specifies how to propagate the operation U from an object in ~i to its related objects in Wj. Actually, a propagation rule can be either generic or application specific. Realising that automatic support for application specific update propagation is very hard, possibly nothing but triggering or hardcoding, therefore, we only address generic update propagation. After careful observation between arbitrary pairs of objects, we summarise the following generic propagation policies for the four types of generic update operations. The propagation policies for delete object operation: • lndependentlyPropagate(IP) If IP policy is defined on reference attribute Aref" (~i ~ %' and 3oik E ~ , 3o"i ~ % such that O1D(o ) E ok.Aref, ' then deleting o k' will propagate the delete operation to o/.J • DependentlyPropagate(DP)
C. Liu et al. / Data & Knowledge Engineering 26 (1998) 99-115
106
If DP policy is defined on reference attribute A re, : CCi---> %, and 3o~ E ~,, 3o~ ~ • such that OID(o~) E ok.Aref, then attempt to propagate the delete operation on o k" to o~. i J However, if the deletion of o jr fails, Oik is not allowed to delete. This implies that the deletion of o kj is dependent on the deletion of o Jr. An object cannot be directly deleted if the object is reserved for multiple objects. This is defined by the RV policy later in this section. An object o k cannot be deleted if either o k cannot be directly deleted or there exists another object o t such that DP policy is defined from o k to o r and o r cannot be deleted.
• RestrictedlyPropagate(RP) If RP policy is defined on reference attribute A ref: ~,--~ '~, and 3 o~. ~ %~i, 3o~ ~ ~j such that OID(o~) ok.A~e ' e, theft deleting o ki will propagate the delete operation to o~J only if ~3o~ ~ such that OID(o~)E o~.A~ef, where A~ef: ~j---> T0i is the inverse reference attribute of Are f. • Inverse(/v) I f / V policy is defined on reference attribute A ~.- %~---> ~,, and 3oik ~ ~ , ::1o~~ ~ such that i ) t OID(o~)~ok.Ar¢ e, then deleting o ki only causes removing OID(o~.) from o~.A~, where A ~f" ~j ~ c6~ is the inverse reference attribute of ar,t.
i Reserve(RV) If RV policy is defined on reference attribute a r~.: qg~--~ %, and 3o~ ~ c~, 3o~ ~ c~ such that O1D(o~) ~ o~.A~t, then o~ is simply not allowed to be deleted. This is the only policy that can cause a deletion fail directly. The propagation policies for remove reference operation:
• IndependentlyPropagate(IP) If IP policy is defined on reference attribute Aref: ~i "--') ~ , and 3o/~ ~ ~ , 3o~ C ~j such that OID(o~) ~ oik.A~f, then removing OID(o~) from oik.A~f will result in removing OID(o~) from j t °r.A r~f and propagating a delete operation to o Jr. • DependentlyPropagate(DP) If DP policy is defined on reference attribute Ar~f" ~ --~ T~j, and 3o~kC ~ , 3o~ ~ W~ such that OID(o~) ~ ok.A~e, then attempt to propagate the delete operation to o~. ' J However, if the deletion i of o~ fails, OID(o~) is not allowed to be removed from ok.A r~f" Otherwise, remove OID(o~) from o~.Are f and propagate the delete operation to o~. • RestrictedlyPropagate(RP) If RP policy is defined on reference attribute A ref : %~---> ~ , and qo~ ~ ~ , ~o~ ~ ~ such that OID(o~) ok.Ar~e, then removing OID(o~) from o'k.Ar~f will result in removing oIm(o~k) from i ~ o~.Z r~, o~.Z~ t. and propagating a delete operation to o~ only if ~:1o~ ~ ~ such that OIO (o~) where AI~e • ~ - - 4 ~ is the inverse reference attribute of Are f.
• Inverse(/v) I f / V policy is defined on reference attribute Zr~.: ~ ---> %~), and 3o~ ~ ~ , 3o~ G ~j such that OID(o~) ~ o'k.a~, then removing OID(o~) from o~.A~f will result in removing OID(o~k) from j o ~.A~f only. The propagation policies for insert object operation:
• None(NO) The insertion of an object o~k in ~, is not restricted to the existence of any object in % if policy
NO is specified on a reference attribute A ref" C~ __~ C~j for insert operation. • Bound(BD)
C. Liu et al. / Data & Knowledge Engineering 26 (1998) 99-115
107
The insertion of an object oik in ~i is bound to the existence of an object in specified on a reference attribute A ref: ~i --'> % for insert operation. The propagation policies for add reference operation:
if policy BD is
• Inverse(IV) If IV policy is defined on reference attribute Are f : (~i ~ ~j, then adding OID(o~) in o~.Are t will cause adding OID(o~)in o~.A~ef, where A~ef: ~j ---> ~i is the inverse reference attribute of Are,..
• NotAllowed(NA) If NA ipolicy is defined on reference attribute A ref' T~i~ ~j, then it is not allowed to add OID(o~)
into o h.A refThe propagation graph of the electronic document database schema Sedoc for delete operation is shown in Fig. 2. From the figure we can see when deleting a document object its propagations to objects via different reference attributes are different. (i) It will independently propagate to its related section objects, the document object will definitely be deleted. A section object cannot be deleted individually, it will be deleted by the last document object which references it. (ii) It will dependently propagate to its annotated paragraph objects. An annotated object can be deleted individually, resulting in removing the inverse reference of the document for which it annotates. (iii) It will just remove the inverse references of referred figure objects, not causing their deletion. 3. Policy conflict and prevention
3.1. Potential policy conflict Specifying propagation policies for update propagation is much easier than hard-coding individual methods. However, specifying a propagation policy depends on designer's perception towards the application. It is possible that the policies declared by a designer are incompatible or conflicting. Document
(content,shared_by, IP).~ Section
~igure,IV)
(ann°tatiI°n'an (anr ,ted,
ted for, annotation,IV)
(co ntant,shared by,I p)"-'.-~--.____~
Paragraph Fig. 2. Propagation graph for electronic document example.
~
unage
108
C. Liu et al. / Data & Knowledge Engineering 26 (1998) 99-115 .
RV
/RV
,(~
"@'
(a) Cyclic RV policy declaration
(b) Self-contradictory policy declaration
Fig. 3. Potential policy conflicts.
An object may reserve(RV) references for other objects. If there exist object Oip in ~ reserves reference for object OJq in 4 ' OJq reserves reference for object o,k~in ~k, and o,k reserves reference for oip, where ~i, ~j and c~ are extents of classes Ci, Cj and C~, respectively, then it will not be allowed to delete any object in the cycle. As shown in Fig. 3(a), such a cyclic RVpolicy declaration may cause a potential propagation policy conflict. Fig. 3(b) shows another potential conflict that may occur. If an object Oip may propagate a delete to another object OJq through one path of references, while oi, also reserves for o Jq, then the propagation is self-contradictory. In addition, it is also possible that the policies defined for one update operation are not compatible with the policies defined for another update operation. As shown in Fig. 4, the policy defined on reference attribute Are f for delete operation is incompatible with the policy defined on the inverse reference attribute A ~ef of A ref for insert operation. The policy BD defined for insertion says an object in ~ cannot exist without the existence of an object in ~ . While the policy/V defined for deletion says there is no requirement for propagation to objects in cCj if an object in ~. is deleted. Obviously, arbitrary policy definitions on reference attributes for the same or different operations can easily cause policy conflict and incompatibility. Bearing the problems in mind, we study the semantics of relationships and propose an approach for resolving the incompatible and conflicting policy declarations at a higher level.
3.2. Types of object dependencies
Kim et al. [12] presented a model of composite objects and studied four types of composite references. Halper et al. [8] and Artale et al. [1] elevated the part-whole relation to the status of a first-class citizen and further studied the semantics and functionality of a variety of part relations. In this paper, we emphasise update propagation, therefore, we consider dependencies between arbitrary pairs of objects, not limited to the part-whole relation. In this light, we classify types of dependencies between objects as follows.
©.
.o (a) Insert
©
©
iv (b) Delete
Fig. 4. Incompatible policy declaration.
.©
C, Liu et al. / Data & Knowledge Engineering 26 (1998) 99-115
109
Definition 4. Class Ci is said to be exclusively dependent on Class Cj via reference attribute A~¢f: ~/---> % iff
Voik ~ c~i, 7! o~ E c~i (OID(o~) E Oik.Zref) holds for every state of the database• We also call the reference attribute A~f to be exclusively dependent and its inverse reference attribute A~f: c~j __> ~ to be exclusively determinant• Definition 5. Class C~ is said to be sharedly dependent on Class Cj via reference attribute A~f: ~---> ~ iff i (a) Vo~ E ~ , 3o~ @ ~ (OID(o~) ~ ok.Z ~, ) holds for every state of the database, (b) o~k is not allowed to be deleted if 3o~ ~ cdj (OID(Jk) Eol.A~¢f), ~ ' where Are' f • (~j----)(t~i is the inverse reference attribute of Are f. We also call the reference attribute A ~ef to be sharedly dependent and A ~ef to be sharedly determinant. Definition 6. Class C~ is said to be multiply dependent on a set of classes attributes mrefl," " "Aref," iff
Cjl ,• • •
Cjm via reference
VOik ~ ~t" ~0~I ~ ~jl' • "'~0~ m ~ ~jm (OlD(°~I)E ok.aref~ i i A "'" /X OID(O~m) E ok.a,.ef .) holds for every state of the database• We also call the reference attributes A r~f,,... A,~f, to be multiply dependent and their inverse reference attributes A ~ f , " ' A ~ e f , to be multiply determinant. Definition 7. Class C~ and Class Cj is said independent via reference attribute A r~ : c~i ~ ~ iff there is no restriction on their objects. We also call the reference attribute A~ef and its inverse reference attribute A ~ef" C~j ___>% to be independent. A reference attribute is said to be dependent if it is exclusively dependent, sharedly dependent or multiply dependent• Likewise, a reference attribute is said to be determinant if it is exclusively determinant, sharedly determinant or multiply determinant. Usually, real update propagation happens only to the determinant reference attributes. In the next, we study the propagation policies for each type of these object dependencies•
3.3. Predefined propagation policies for different dependencies Fig. 5 shows propagation policies for different types of object dependencies. As we know, a delete operation can be divided into two parts: one is the operation to the object itself; another is the impact operation, i.e., propagation on all its references to other objects. Whenever an object is deleted, its references to other objects must be removed as well, therefore a remove reference operation sometimes is a sub-operation of delete operation, i.e., the policies for remove reference operation is implied by delete operation and thus is not shown in the figure. First, let us consider propagation policies for delete operation. If object oik in ~i is exclusively dependent on object o Jl in q~i' deleting o~ will dependently propagate(DP) to oik, while deleting o~k only results in deleting inverse reference. If object o~k in ~i is sharedly dependent on a set of objects O~ E c~j, o~k is reserved for all objects in O~ until they are all deleted. The deletion of every object in
C. Liu et al. / Data & Knowledge Engineering 26 (1998) 99-115
110
ration Delet~
Insert
Add Reference
q
cj
cj
Ci Exclusively Dependent on CJ
Ci
C!
q
cl
CI
cj
Ci Sharedly Dependent on Cj
Ci
Cjk
CJl
C!
q
CI
¢j
Ci Multiply Dependent on CJ
c|
Ci
c|
q
c|
cj
CI Independent from CJ
Ci
c!
c!
Fig. 5. Predefined propagation policies for different dependencies.
O~ will independently propagate(IP) to oik but only the last deletion will cause o~. to be deleted. If object oik in ~i is multiply dependent to multiple set of objects O~ C % , , . . . O~m ~ C %m via different reference attributes, deleting object in any of these sets will restrictedly propagate to o~k.Note, there is a difference between sharedly dependent and multiply dependent reference attributes, o~k is not allowed to delete individually in the former but is allowed in the latter. For independent relationships, deleting an object only causes the removal of its inverse reference. The propagation policies for insert and add reference operations are simple. In fact, there is no propagation problem. When inserting an object, there is no extra operation but there is a restriction if the object being inserted is dependent by definition on some other objects. When adding a reference, the only extra operation is to set the inverse reference. By categorising reference attributes according to different types of object dependencies, it is easy to see that policies predefined for different operations on relationships (i.e., a pair of reference attributes) with the same type of object dependency are compatible. However, it is permitted in a class definition to have reference attributes with all types of object dependencies. The following rules represent what kind of combination of these dependencies a single instance of a class can have.
C. Liu et al. / Data & Knowledge Engineering 26 (1998) 99-115
111
Rule 1. I f an object o is exclusively dependent on another object, it will not be allowed to have any dependencies (exclusive, shared, multiple) on other objects. Rule 2. If an object o is sharedly dependent on some objects, it will not be allowed to have any exclusive and multiple dependencies on other objects. However, new shared dependencies on other objects will be allowed. Rule 3. If an object o is multiply dependent on some objects, it will not be allowed to have any exclusive and shared dependencies on other objects. However, new multiple dependencies on other objects will be allowed. Rule 4. An object can have any number of independent reference to it, no matter whether it is dependent on some objects.
3.4. Conflict policy prevention As we may notice now, propagation is due to necessary enforcement of object dependencies. In the following, we first define a dependency graph of a given reference graph, then show that if a dependency graph is acvclic, the previous propagation policy conflict problems will no longer exist. Definition 8. Given an object-oriented database schema S = {Cj,. • • C,}, a dependency graph DG of the schema S is a subgraph of its reference graph RG, it can be constructed as follows: (a) EoG C ERG is derived by keeping those arcs in ER~, each of which is labelled with (Aref,A~ef), where A r~f is a dependent reference attribute name, (b) VD~ C VR~ is therefore reduced to keep only those which are connected by arcs in Eoa. Fig. 6 shows the dependency graph of an electronic documents database.
~
Document
Section
~
Paragraph
Fig. 6. Dependency graph for an electronic document example.
112
C. Liu et al. / Data & Knowledge Engineering 26 (1998) 99-115
Proposition 1. If the dependency graph DG of an object-oriented database schema S is acyclic, then the schema S is free of RV-cycle policy conflict. Proof: From Fig. 5 we can see RV policies are only defined on sharedly dependent reference attributes in the propagation graph PG of S. Since the dependency graph DG is acyclic and DG is a subgraph of the reference graph RG of S, there is no cycle of sharedly dependent reference attributes in RG, therefore, there is no RV cycle in PG.
Proposition 2. It the dependency graph DG of an object-oriented database schema S is acyclic, then the schema S is free of self-contradictory policy conflict. Proof: Suppose a self-contradictory policy conflict exists in the propagation graph PG of S. Then from Fig. 5, we can replace all IP policies with inverse RV policies in PG. As a result, a dependency cycle of sharedly dependent reference attributes is formed in PG thus RG and DG of S. This contradicts the acyclic dependency graph. The dependency graph shown in Fig. 6 is acyclic; therefore, the electronic document processing database schema S~aoc is free of both RV-cycle policy conflict and self-contradictory policy conflict.
4. Implementation In supporting update propagation, we extend the class definition of an object-oriented database schema to include the type of object dependency for each reference attribute appearing in the schema. This is similar to specifying referential integrity constraints and rules when creating a table in relational database systems. For example, the class definitions for Section class and Paragraph class are extended as follows. class Section type { content: Set inverse Paragraph: shared by shared_determinant; shared by: Set(Document)inverse Document: content shared_dependent
class Paragraph type { annotated for: Document inverse Document: annotation exclusivedependent; shared by: Set inverse Section: content shareddependent
}
C. Liu et al. / Data & Knowledge Engineering 26 (1998) 99-115
113
Based on the definitions of all classes of an object-oriented database schema, it is easy to construct a dependency graph for the database and to check whether the graph is acyclic. We can implement propagation policies in an abstract class. With the help of a class hierarchy and the inheritance facility of an OODBMS, all concrete classes can inherit the propagation methods defined at the high-level abstract class. Specialisation may be applied for the application-specific update propagation based on the propagation methods of a generic update operation. This can relieve lots of hard-coding work from a designer, while avoiding propagation conflicts and incompatibilities which may occur and is hard to check if implemented individually by designers. In supporting different types of reference attributes, the rules 1-4 should be enforced by the system. This can be accomplished by defining a new insert operation also in the abstract class which refines the insert operation implemented by the OODBMS. The refinement works simply by checking the preconditions defined by rules 1-4. It is possible that an update operation to an object may propagate to another object several times from different paths. Rules 1-4 put a restriction on propagation to any object. Since an object can have at most one type of dependency on other objects at a time, the types of multiple propagations to the same object, if they occur, should be the same. This guarantees that there are no propagation conflicts along different propagation paths. In addition, we maintain the mutual references for all relationships between objects. When an update propagation happens to an object, say delete, all references of the object to other objects will be removed. Therefore, recursive processing of update propagation is possible, not necessarily restricted to transaction-based processing as adopted in [17]. However, in supporting the DP policy, a two-phase recursive processing strategy is needed. At the first phase, check whether all objects which are exclusively dependent on the object are deletable. At the second phase, really apply the propagation.
5. Conclusion
Update propagation is an important topic in database systems. In most database systems, it is the database application designers' responsibility to hard-code the propagation semantics in transactions or methods of OODBMSs. In this paper, we discussed update propagation support in object-oriented database systems and tried to relieve the burden from designers as much as possible. Since automatic support of all application-specific update propagation is hard to achieve, we focused on update propagation of generic update operations, particularly the delete operation. As we know, propagation of update is caused by relationships between objects; therefore we took a declarative approach, specifying propagation policies on the reference attributes, rather than hiding propagation semantics in the update method of each class. We studied the propagation policies for generic update propagations and discovered that potential conflicts among propagation policy declarations may occur if a designer can arbitrarily declare propagation policies on reference attributes of a class. As such, we promoted the problem to a higher level so that designers only need to specify the dependency property for reference attributes. Propagation policies are predefined for each type of dependency. By introducing some restrictions on class definition, the potential conflicts will disappear. In the paper, we also addressed the implementation issues for update propagation support in object-oriented database systems.
114
C. Liu et al. / Data & Knowledge Engineering 26 (1998) 99-115
Currently, w e are supporting update p r o p a g a t i o n for a w o r k f l o w repository [14] using Objectstore O O D B M S b a s e d on the ideas introduced in this paper.
Acknowledgments T h e authors w o u l d like to thank Dr. A r t h u r Ter H o f s t e d e for m a n y useful c o m m e n t s that he has provided.
References [1] A. Artale, E. Franconi, N. Guarino and L. Pazzi, Part-whole relations in object-centered systems: An overview, Data and Knowledge Engineering 20 (1996) 347-383. [2] R.G.G. Cattell, (ed.), The Object Database Standard: ODMG-93 (Morgan Kaufmann, 1994). [3] C.J. Date, A Guide to the SQL Standard (Addison-Wesley, Reading, MA, 1987). [4] C.J. Date, Referential integrity and foreign keys: Further considerations, in Relational Database Writings 1985-1989 (Addison-Wesley, 1990). [5] M. Doherty, J. Peckham and V.F. Wolfe. Implementing relationships and constraints in an object-oriented database using a monitor construct, in Proc. of the Ist Int. Workshop on Rules in Database Systems, Edinburgh, Scotland, 30 August-1 September, 1993 (Springer, 1994)pp. 347-363. [6] A. Formica and M. Missikoff. Integrity constraints representation in object-oriented databases, in Int. Conf. on Information and Knowledge Management, Baltimore, MD, November 8-l l, 1993 (Springer, 1993) pp. 69-85. [7] N. Gehani and H.V. Jagadish, Ode as an active database: Constraints and triggers, in Proc. of the 17th VLDB Conference, Barcelona, Catalonia, Spain, September 3-6 (Morgan Kaufmann, 1991) pp. 327-336. [8] H. Halper, J. Geller and Y. Perl, "Part" relations for object-oriented databases, in Proc. of ER Conference, Karlsruhe, Germany, October 7-9, 1992 (Springer, 1992) pp, 406-422. [9] H.V. Jagadish and Xiaolei Qian. Integrity maintenance in an object-oriented database, in Proc. of the 18th VLDB Conf., Vancouver, Canada, August 23-27 1992 (Morgan Kaufmann, 1992) pp. 469-480. [10] Anton E Karadimce and Susan D. Urban, A framework for declarative updates and constraint maintenance in object-oriented databases, in Proc. of the 9th Int. Conf. on Data Engineering, Vienna, Austria, April 19-23 1993 (IEEE Computer Society, 1993) pp. 391-398. [11] A. Kemper and G. Moerkotte, Object-Oriented Database Management: Application in Engineering and Computer Science (Prentice-Hall, 1994). [12] Won Kim, Elisa Bertino and Jorge F. Garza, Composite objects revisited, in Proc. ACM-SIGMOD Intl. Conf. on Management of Data, Portland, Oregon, May 31-June 2, 1989 (SIGMOD Record 18(2) 1989) pp. 337-347. [13] Byung S. Lee, Normalization in OODB design, SIGMOD Record, 24(3) (1995) 23-27. [14] C. Liu, H. Li and M. E. Orlowska. Object-oriented design of repository for enterprise workflows, Technical Report DDU-TR-9602 (CRC for Distributed Systems Technology, University of Queensland, Q4072, Australia, June 1996). [15] Victor M. Markowitz. Referential integrity revisited: An object-oriented perspective, in Proc. of the 16th VLDB Conference, Brisbane, Queensland, Australia, August 13-16, 1990 (Morgan Kaufmann, 1990). [16] Victor M. Markowitz, Safe referential integrity structures in relational databases, in Proc. of the 17th VLDB Conference, Barcelona, Catalonia, Spain, September 3-6, 1991 (Morgan Kaufmann, 1991). [17] J. Rumbaugh, Controlling propagation of operations using attributes on relations, in OOPSLA "88 Proc., San Diego, CA, September 25-30, 1988 (SIGMOD Notices 23(11) 1988) pp. 285-296.
C. Liu et al. / Data & Knowledge Engineering 26 (1998) 99-115 Chengfei Liu is a senior research scientist of Cooperative Research Center for Distributed Systems Technology of Australia. He received his Ph.D., M.Sc. and B.Sc., all in Computer Science, from Nanjing University (China). He was an Associate Professor in the Department of Computer Science at East China University of Science and Technology. His research interests include Object-Oriented and Object-Relational Database Systems, Object-Oriented Development Methodology, Heterogeneous Database and Multidatabase Systems, Distributed Query Processing, CASE Tools, Advanced Transaction Models, Workflows and Transitional Workfiows. Hui Li is a Ph.D. Candidate of the School of Information Science, University of Queensland. He received his M.Sc. from Southwest Jiaotong University (China) in 1993, and B.Sc. from Nanjing University (China), both in Computer Science. His research interests include Query Processing, ObjectRelational Data Model, Transactional Workflow and expert systems.
115
Maria Elzbieta Oriowska is Profossor of Information Systems at the University of Queensland. She received her M.Sc (1974) and D.Sc. (1979) in Computer Science from Warsaw University, Poland. Her interest include The Theory of Relational Databases, Distributed Databases, Various aspects of Information Systems Design Methodologies (including Distributed Systems), enhancement of semantic data modelling techniques, transaction processing in distributed systems, concurrency control, distributed and federated database systems, and workflows technology. She has published over 100 lesearch papers in international journals and conference proceedings. Since 1992, she has been associated with the Distributed Systems Technology Centre (DSTC) in Brisbane as the Distributed Databases Unit Leader. She has served on the program committees for around 50 international conferences, and was elected a Very Large Databases (VLDB) Endowment Trustee in 1994.