Defining database dynamics with attribute grammars


Volume 14, number 3    INFORMATION PROCESSING LETTERS    16 May 1982

DEFINING DATABASE DYNAMICS WITH ATTRIBUTE GRAMMARS *

Dzenan RIDJANOVIC and Michael L. BRODIE
Department of Computer Science, University of Maryland, College Park, MD 20742, U.S.A.

Received 23 March 1981; revised version received 4 January 1982

Keywords: Semantic data models, specification of database integrity constraints, context-free and attribute grammars, conceptual modelling

1. Introduction

In the programming language area several formalisms are widely used for the precise definition of the syntax and semantics of programming languages and programs. In the database area the specification of the effects of operations on a database, with its many associated integrity constraints, is still an important open issue. This issue, which addresses database dynamics, has been informally called database semantics. One approach to the specification of database dynamics, borrowed from programming languages, is to describe database semantics by associated transactions (programs). Another approach, which has just begun to receive attention in programming languages, is to use integrity constraints, which are not input-output specifications for programs but rather specifications for constraints on data [14]. Due to the growing complexity and importance of database applications, the pragmatic database community requires precise definitions of database dynamics without the complexity of existing formal methods [4]. This paper proposes the use of attribute grammars for the simple, yet precise, specification of database integrity constraints.

2. Semantic data models

The relational data model (RDM) [5] presents powerful primitives for the representation of structural properties of database applications. These properties address database statics, which corresponds to the syntax of a programming language. Entities can be represented directly in relations, which imply fewer implementation details than, say, a list structure. Relationships can be represented explicitly in relations or implicitly by means of foreign keys. However, the RDM provides no direct means of defining and maintaining the properties of relationships. The RDM, and indeed most data models, provide low-level manipulation primitives (e.g., insert, update, delete) and inadequate means for composing application-oriented operations from the primitives. Compare the application properties that can be represented in the relation

[relation definition not preserved in the source]

with the properties represented by the operation

insert(482, RITZ, 4340, co1ID, 010682, 010982).

The insert operation requires more meaning to be expressed depending on the context in which it is used. This meaning is expressed through integrity constraints associated with application-oriented operations such as make-reservation, transfer-reservation and confirm-reservation.

Semantic data models (SDMs) attempt to extend the ability of data models to represent static and dynamic properties of database applications precisely and abstractly. The semantic hierarchy model (SHM) [13] adds to the RDM three forms of abstraction: classification, aggregation and generalization, with which to represent relationships. The abstractions can be considered structural composition rules. Classification is a form of abstraction in which a class is defined as a set of elements. This is the instance-of relationship, as in the relationship between a type and its instances. Aggregation is a form of abstraction in which a relationship between component classes is considered as a higher level aggregate class. This is the part-of relationship as used in semantic networks. Generalization is a form of abstraction in which a relationship between category classes is considered as a higher level generic class. This is the is-a relationship, also from semantic networks. Figs. 1, 2 and 3 illustrate the three abstractions. (Capital letters denote abstractions and small letters identify class elements.)

The extended semantic hierarchy model (SHM+) [2] adds both structural and behavioural abstractions for modelling to SHM. In the process of abstraction, details relevant to the problem at hand are emphasized while less relevant details are ignored. Another view of abstraction is the establishment of a one-to-many relationship. Classification relates one class to many elements (instances) of the class. Aggregation relates an aggregate class to many component classes. Generalization relates a generic class to many category classes. The three forms of abstraction do not provide a means of representing the natural set relationship amongst classes, namely, that a class is composed of a set of members of another class. Clearly, the properties of a set differ from those of its members. For example, an employee may have properties name, sex, salary and department while employees (set of employee) may have properties group-name, number-of-males, number-of-females and average-salary. The member-of relationship between classes is absent in the above abstractions; it is treated implicitly and is modelled using classification. SHM+ introduces a fourth form of abstraction called association to treat sets explicitly at the class level [3]. Association is a form of abstraction in which a set of member elements of the member class is considered as a higher level associate element of a set class. This is the member-of relationship. Association is illustrated in Fig. 4.

In [6] it is suggested that data and control structures are designed using the same principles. In [7,8] it is shown that Cartesian product, discriminated union, and sequence data structuring methods correspond respectively to functional composition, choice, and iteration operation constructs. This simple and appealing idea is fundamental to the 'structured programming' area (e.g., see [10,12]). This paper suggests a relationship between these important concepts and heretofore distinct concepts in the database area, thereby establishing a correspondence between structured programming and database design using semantic data models. Aggregation, generalization and association provide means of organizing behaviour as well as data. For example, an operation on RESERVATION, say make-reservation, can be represented as a sequence (functional composition) of operations on HOTEL and PERSON; an operation on EMPLOYEE, say hire-employee, can be represented as a case statement (general choice construct) on MANAGER, SECRETARY and CHAMBER-MAID; and an operation on EMPLOYEES, say raise-salary, can be represented as a set of operations (general iteration construct), e.g., for each E in EMPLOYEES raise-salary(E).

Fig. 1. Classification example. [Diagram: the PERSON class and its elements. Each named individual is an instance of the PERSON class; the arrow represents an instance-of relationship.]

Fig. 2. Aggregation example. [Diagram: RESERVATION with components HOTEL and PERSON. HOTEL is part of RESERVATION; the arrow represents a part-of relationship.]

Fig. 3. Generalization example. [Diagram: EMPLOYEE with categories MANAGER, SECRETARY and CHAMBER-MAID. MANAGER is an EMPLOYEE; the connecting line represents an is-a relationship.]

Fig. 4. Association example. [Diagram: HOTEL with components NAME, ADDRESS and EMPLOYEES, and EMPLOYEES over EMPLOYEE. One instance (member element) of the member class EMPLOYEE is a member of the instance of the set class EMPLOYEES; the arrow represents a member-of relationship.]

* This work is supported, in part, by the National Science Foundation under grant number MCS 77-22509.

0020-0190/82/0000-0000/$02.75 © 1982 North-Holland

3. A context-free grammar for database statics

In programming languages, context-free grammars (CFGs) are used to define syntax, and attribute grammars (AGs), based on CFGs, are used to define semantics. The main idea here is to use AGs to define database dynamics (integrity constraints). However, CFGs are used first to define database statics. In particular, CFG production rules are used for the specification of SHM+ abstractions (composition rules).

A CFG is usually denoted as G = (V,T,P,S) [9]. V and T are disjoint finite sets of variables and terminals, respectively. P is a finite set of productions. Each production rule is of the form A ::= α, where A is a variable and α is a string of symbols from (V ∪ T)* (* is the Kleene star). Finally, S is a special variable called the start symbol.

The structural part of the SHM+ is precisely denoted by M = (C,S,A,D). Thus C,S,A,D in SHM+ correspond to V,T,P,S, respectively, in a CFG. C is a finite set of classes. There are two basic kinds of classes: composite and simple. Classes are simple if they are not defined in terms of other classes. S is a finite set of simple class elements. We assume that C and S are disjoint. A is a set with four different kinds of composition rules, one for each form of abstraction in SHM+. D is a special database class. In Fig. 5(a) the structural part of the HOTEL RESERVATION database is presented and in Fig. 5(b)

its corresponding grammar. HOTEL RESERVATION can be considered as the database schema name, where a schema corresponds to the structural part of the SHM+.

In the programming language area a CFG is used to define a programming language, i.e., to generate all programs of that language. In the database area, the structural part of SHM+, denoted by M = (C,S,A,D), is used to define a data universe, i.e., to generate all possible databases of that universe. Thus, a data universe corresponds to a programming language and a database corresponds to a program. A derivation tree of a program is a proof that the program is syntactically correct, and a derivation tree of a database is a proof that the database is a part of the defined data universe, i.e., satisfies the schema. The relationship between the structural properties of the SHM+ and the CFG is represented in Table 1.

Fig. 5(a). HOTEL RESERVATION abstractions. [Diagram of the HOTEL RESERVATION classes and their elements; not recoverable from the source.]

Fig. 5(b). Grammar for HOTEL RESERVATION structure.

HOTEL RESERVATION = (C,S,A,D)

C = {RESERVATIONS, EMPLOYEES, PERSON, RESERVATION, EMPLOYEE, MANAGER, HOTEL, SECRETARY, NAME, ADDRESS, CHAMBER-MAID}

S = {Bell David, [two further person names illegible], Astoria, Hilton, New York, Detroit, Mary Brown, John Green, Peter Welsh, ...}

D = RESERVATIONS

A consists of the following:

RESERVATIONS ::= RESERVATION [set notation illegible]
RESERVATION  ::= HOTEL PERSON
HOTEL        ::= NAME ADDRESS EMPLOYEES
EMPLOYEES    ::= EMPLOYEE [set notation illegible]
EMPLOYEE     ::= MANAGER | SECRETARY | CHAMBER-MAID
PERSON       ::= Bell David | ...
NAME         ::= Astoria | ...
ADDRESS      ::= New York | Detroit
MANAGER      ::= Mary Brown | John Green | Peter Welsh
etc.

Table 1

CFG                        SHM+
-------------------------  ------------------------------------------------------
particular CFG:            particular SHM+ structural model (or database schema):
G = (V,T,P,S)              M = (C,S,A,D)
language L defined by G    data universe DV defined by M
program PR of L            database DB of DV
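As a minimal sketch of the correspondence between a schema and a grammar, the structural model M = (C,S,A,D) can be written down as a set of composition rules, and a candidate database checked against it by attempting to build its derivation tree. The Python names below (`Cls`, `Agg`, `Gen`, `Assoc`, `conforms`) and the nested-tuple encoding of a database are assumptions made for illustration and do not appear in the paper. Only part of the HOTEL RESERVATION rules is reproduced, and the element sets are abbreviated or invented where marked.

```python
from dataclasses import dataclass

# One rule type per SHM+ abstraction (the four kinds of rule in A).
@dataclass(frozen=True)
class Cls:            # classification: class ::= element | element | ...
    elements: frozenset

@dataclass(frozen=True)
class Agg:            # aggregation: class ::= component component ...
    parts: tuple

@dataclass(frozen=True)
class Gen:            # generalization: class ::= category | category | ...
    categories: tuple

@dataclass(frozen=True)
class Assoc:          # association: class ::= a set of members of one class
    member: str

# A: composition rules (partial; element sets are illustrative only).
A = {
    "RESERVATIONS": Assoc("RESERVATION"),
    "RESERVATION": Agg(("HOTEL", "PERSON")),
    "HOTEL": Agg(("NAME", "ADDRESS", "EMPLOYEES")),
    "EMPLOYEES": Assoc("EMPLOYEE"),
    "EMPLOYEE": Gen(("MANAGER", "SECRETARY", "CHAMBER-MAID")),
    "PERSON": Cls(frozenset({"Bell David"})),
    "NAME": Cls(frozenset({"Astoria", "Hilton"})),
    "ADDRESS": Cls(frozenset({"New York", "Detroit"})),
    "MANAGER": Cls(frozenset({"Mary Brown"})),
    "SECRETARY": Cls(frozenset({"secretary-1"})),        # invented placeholder
    "CHAMBER-MAID": Cls(frozenset({"chamber-maid-1"})),  # invented placeholder
}
D = "RESERVATIONS"  # the database class plays the role of the start symbol


def conforms(node, cls):
    """Return True if `node` is a derivation tree rooted at class `cls`.

    A node is a pair (class_name, value): a string for a simple class,
    a tuple of nodes for an aggregate, one node for a generic class,
    and a list of nodes for a set class.
    """
    if not (isinstance(node, tuple) and len(node) == 2):
        return False
    name, value = node
    if name != cls or cls not in A:
        return False
    rule = A[cls]
    if isinstance(rule, Cls):
        return isinstance(value, str) and value in rule.elements
    if isinstance(rule, Agg):
        return (isinstance(value, tuple) and len(value) == len(rule.parts)
                and all(conforms(v, p) for v, p in zip(value, rule.parts)))
    if isinstance(rule, Gen):
        return (isinstance(value, tuple) and len(value) == 2
                and value[0] in rule.categories and conforms(value, value[0]))
    # Remaining case: association.
    return isinstance(value, list) and all(conforms(v, rule.member) for v in value)


hotel = ("HOTEL", (
    ("NAME", "Astoria"),
    ("ADDRESS", "New York"),
    ("EMPLOYEES", [("EMPLOYEE", ("MANAGER", "Mary Brown"))]),
))
db = ("RESERVATIONS", [("RESERVATION", (hotel, ("PERSON", "Bell David")))])
# Components in the wrong order: no derivation tree exists for this value.
bad_db = ("RESERVATIONS", [("RESERVATION", (("PERSON", "Bell David"), hotel))])
```

Under these assumptions, `conforms(db, D)` succeeds because a derivation tree rooted at the database class can be built, while `conforms(bad_db, D)` fails. This mirrors the statement that a derivation tree is a proof that a database satisfies the schema.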

4. An attribute grammar for database dynamics

A major open problem with SDMs is the specification of database dynamics. In the programming languages area, attribute grammars [11] are used to describe language semantics. An attribute grammar (AG) is a CFG, G = (V,T,P,S), in which each variable from V is associated with a set of attributes. These attributes describe the properties of the variables. The values of attributes are given by semantic rules associated with the production rules from P. Thus, a 'meaning' may be assigned to a program in a CFG language by stating attributes of the variables in a derivation tree for that program.

In a similar way, meaning may be assigned to a database by using attributes of the classes in a derivation tree for that database. Each class from C in M = (C,S,A,D) is associated with a set of attributes. The attributes define the properties of insert, delete and update operations, and they can be defined by predicate rules (integrity constraints) associated with each abstract composition rule from A. There are four different types of abstract composition rules in SHM+ (for classification, aggregation, generalization and association) and each of them has predefined predicate rules which constitute the semantics of the SHM+. Thus, behavioural properties of a database are derived, on the one hand, from the behavioural characteristics of the abstractions in SHM+ (predefined model constraints), and on the other hand, from the dynamic properties of the database application (application constraints).

Operands for the database operations are class (simple or composite) elements. That is the basic reason that non-terminal nodes in a database derivation tree can be considered as database variables of certain classes. The same names will be used for database variables and classes, assuming that there is a one-to-one correspondence between them and that the structural and behavioural properties of the variables are determined by the properties of their classes.

Fig. 6 shows the make-reservation operation attribute (constraint) valid-reservation, which is defined by the other integrity constraints hotel-exists, person-exists (i.e., exists in a database) and valid-new-person. This is a 'synthesized' constraint, i.e., defined solely in terms of constraints of the descendants of the corresponding database variable (or class). The other type of integrity constraint is 'inherited', i.e., defined in terms of constraints of the ancestor of the database variable. For example, cancel-person is an operation 'inherited' constraint:

cancel-person(PERSON) <-> no-reservation(RESERVATION).

Fig. 6. Valid-reservation attribute.

RESERVATION ::= HOTEL PERSON

valid-reservation(RESERVATION) <-> hotel-exists(HOTEL) and (person-exists(PERSON) or valid-new-person(PERSON))

where <-> represents the logical biconditional connective.
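The synthesized constraint of Fig. 6 can be sketched as a predicate attached to the production RESERVATION ::= HOTEL PERSON and computed only from the attributes of its two descendants. This is an illustrative sketch, not the authors' implementation: the Python function names, the in-memory sets standing in for the stored database, and the body of `valid_new_person` are all assumptions, since the paper does not define that constraint.

```python
# Hypothetical database state: hotels and persons already stored.
known_hotels = {"Astoria", "Hilton"}
known_persons = {"Bell David"}


def hotel_exists(hotel):
    """Attribute of HOTEL: the hotel is already in the database."""
    return hotel in known_hotels


def person_exists(person):
    """Attribute of PERSON: the person is already in the database."""
    return person in known_persons


def valid_new_person(person):
    """Attribute of PERSON. Placeholder rule (an assumption): a new
    person needs a non-blank name that is not yet stored."""
    return bool(person.strip()) and person not in known_persons


def valid_reservation(reservation):
    """Synthesized attribute of RESERVATION ::= HOTEL PERSON.

    Defined solely in terms of attributes of the descendants HOTEL and
    PERSON, following the rule shown in Fig. 6.
    """
    hotel, person = reservation
    return hotel_exists(hotel) and (person_exists(person) or valid_new_person(person))
```

With this state, `valid_reservation(("Astoria", "Bell David"))` holds because both descendants exist, `valid_reservation(("Astoria", "Mary Brown"))` holds through the valid-new-person branch, and a reservation naming an unknown hotel is rejected regardless of the person.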

Operation constraints (attributes) are defined in terms of their 'local environments' (using the principle of abstraction), minimizing the interconnections between different parts of a database. This localization and partitioning of the semantic (predicate) rules makes the definition of database dynamics (integrity constraints) easier to understand and more concise, and explains the use of AGs.
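To illustrate this locality, the inherited constraint cancel-person(PERSON) <-> no-reservation(RESERVATION) can be sketched as a rule that reads nothing but the attribute of the enclosing RESERVATION node. The `Node` class, the attribute dictionary and the function name are hypothetical, and how no-reservation itself is computed is left open because the paper does not specify it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A database variable in a derivation tree (illustrative encoding)."""
    cls: str
    attrs: dict = field(default_factory=dict)
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach `child` below this node and return it."""
        child.parent = self
        self.children.append(child)
        return child


def inherit_cancel_person(person):
    """Inherited rule: cancel-person(PERSON) <-> no-reservation(RESERVATION).

    Only the parent's attribute is consulted, i.e. the local environment
    of the PERSON node; no other part of the tree is visited.
    """
    parent = person.parent
    if parent is None or parent.cls != "RESERVATION":
        raise ValueError("PERSON must be a component of a RESERVATION")
    person.attrs["cancel-person"] = parent.attrs["no-reservation"]
    return person.attrs["cancel-person"]


# The value of no-reservation is supplied by hand here (an assumption).
reservation = Node("RESERVATION", attrs={"no-reservation": False})
person = reservation.add(Node("PERSON"))
```

Because the rule depends only on the parent node, changing `reservation.attrs["no-reservation"]` and re-evaluating `inherit_cancel_person(person)` is enough to update the constraint; no other rule has to be revisited.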

5. Conclusion

Due to the growing complexity and importance of database applications, the pragmatic database community requires precise definitions of database dynamics without the complexity of existing formal methods. Semantic data models have been introduced to represent static and dynamic properties of database applications in a direct, precise and abstract way. So far, highly mathematical specification techniques for semantic data models have not been widely accepted in the database area. This paper has proposed a technique that has the flexibility to meet varying precision requirements, yet avoids the complexity of existing formal methods: context-free grammars are used to define database statics and, more importantly, attribute grammars are used to specify database dynamics (integrity constraints).

Acknowledgment

The authors wish to thank the referees for their constructive comments and helpful criticism.

References

[1] M.L. Brodie and S.N. Zilles, eds., Proc. Workshop on Data Abst., Databases and Conceptual Modelling, Special Issue SIGPLAN Notices 16(1) (1981); SIGMOD Record 11(2) (1981); SIGART Newsletter 74 (1981).
[2] M.L. Brodie, On modelling behavioural semantics of databases, Proc. 7th Internat. Conf. Very Large Databases, France, 1981.
[3] M.L. Brodie, Association: a database abstraction for semantic modelling, in: P.P. Chen, ed., Entity-Relationship Approach to Information Modelling and Analysis (ER Institute, Los Angeles, 1981).
[4] M.L. Brodie, Axiomatic definitions of data model semantics, Inform. Systems 7(2) (1982).
[5] E.F. Codd, A relational model for large shared data banks, Comm. ACM 13(6) (1970).
[6] C.A.R. Hoare, Notes on data structuring, in: O.-J. Dahl, E.W. Dijkstra and C.A.R. Hoare, eds., Structured Programming (Academic Press, New York, 1972).
[7] C.A.R. Hoare, Data reliability, SIGPLAN Notices 10(6) (1975).
[8] C.A.R. Hoare, Data structures, in: R.T. Yeh, ed., Current Trends in Programming Methodology, Vol. IV (Prentice-Hall, Englewood Cliffs, NJ, 1978).
[9] J.E. Hopcroft and J.D. Ullman, Introduction to Automata Theory, Languages and Computation (Addison-Wesley, Reading, MA, 1979).
[10] M.A. Jackson, Principles of Program Design (Academic Press, New York, 1975).
[11] D.E. Knuth, Semantics of context-free languages, Math. Systems Theory 2(2) (1968).
[12] G.J. Myers, Composite Structured Design (Van Nostrand, New York, 1978).
[13] J.M. Smith and D.C.P. Smith, Database abstraction: aggregation and generalization, ACM TODS 2(2) (1977).
[14] S.N. Zilles, Types, algebras and modelling, in: M.L. Brodie and S.N. Zilles, eds., Proc. Workshop on Data Abst., Databases and Conceptual Modelling, Special Issue SIGPLAN Notices 16(1) (1981).