Object systems over relational databases


M K Crowe

The purpose of this document is to present a set of mechanisms and concepts for object systems based on an external relational database. The object space may be shared among a set of applications which use the standard query language SQL as their principal data access mechanism. Methods are not a concern of this paper and may be handled by callout to separate execution engines. Internally a semantic data model is used, including reverse links. A new formalism for describing such a complex data model is presented in the paper. An object manager embodying these ideas is fully implemented for the Oracle database management system.

Keywords: object-oriented, relational model, data modelling

BACKGROUND AND MOTIVATION

Integrated software environments such as those found in many large software packages (e.g. for administration or computer-integrated design) need to share common persistent data. There is a growing trend towards the use of relational databases as a standard basis for this purpose. Although interesting experiments have been made with other systems such as persistent heap systems (e.g. PS-Algol) and object-oriented programming systems (Smalltalk), relational database management systems are preferred for large volumes of data because they offer a stable basis: the basic relational architecture is mature and supported by a strong range of reliable products. Moreover, the lifetime of the data is measured in decades, and organizations will not willingly entrust it to the latest fashionable persistent programming system, even supposing such a system could handle the required volumes of data.

When such collections of programs are sharing such data, each program on starting up needs to collect information from the shared store and convert it into an internal form. This is potentially wasteful, involving repeated access to the relational database management system (DBMS); moreover, the DBMS will treat these programs as competitors for the data, since there is no standard way of informing it that these programs are cooperating on a single task. In these circumstances the notion of a shared local agent which provides the shared data from local memory is attractive.

Department of Computing Science, University of Paisley, High Street, Paisley PA1 2BE, UK
Vol 35 No 8 August 1993

On the other hand, the relational data model is much more limited than the data structures available in high-level programming languages. A normalized relational system is likely to have logically related data spread over a number of tables, but the only way of expressing relationships between these tables may be by use of foreign keys, and the construction of additional relationship tables, which exacerbate the data fragmentation problem. Support for the reorganization of such data into logical groups would help in the programming task, so that the shared local data can use a different data model from the external DBMS. The object manager described in this paper can handle many-to-many relationships between data in the object space and data in the DBMS, so that data coming from many relational tables and traversing relationships and foreign keys can be linked together in a logically simpler object structure.

Researchers have proposed various forms of semantic and object-oriented extensions to database management systems [1-5], as the foundation for an object-oriented layer as a proposed standard basis for application development. Such a development would continue the trend whereby first operating systems, then hierarchical file systems, then relational databases have become the standard basis for application development. There have been two major problems with all of these proposals. First, they have required a non-standard query language. Secondly, entity-relationship modelling [6,7] has become a popular way of designing data models that are easily normalized, but this has tended to conflict with the concept of object identity [1], since it scatters information about an object through many tables, so that many transactions are needed to change an object.

The work described in this paper proposes a solution to both of these problems. It formed part of a research and development project [8] for developing an environmental resource manager for urban planning applications.
In this project, there were added requirements: the data being accessed were data from the database of a local authority, but potential changes (scenarios) were being considered, so the data values in the shared local data must be allowed to change without making modifications to the external DBMS. The object manager idea grew out of earlier work in a project [9] on tools for designing real-time fault-tolerant computing systems. With its origins in such widely separated fields, the development potentially addresses generic issues in computing and may therefore be of wider interest.

The basic approach of this paper is that the data in a relational database may be interpreted in an object-oriented way. In the general case, data from many tables can be combined to create the information about the objects in a class. The limiting simplest case is where a single relational table defines an object class. The paper explains the motivation for constructing object classes in this way, and presents a coherent approach which supports the selection, combination and composition of relational data to form object systems, and examines the relationships (morphisms) between such object systems.

Although the language of objects is used, the notion of method is rather less important in the present context. Functions and procedures, from the viewpoint of this paper, are mostly internal to the applications. The object-oriented ideas that are of most value are object identity and inheritance. Encapsulation and abstract data typing are provided to the extent that the only way to access data is through the object, and methods can be associated with object classes. Ordinary SQL is sufficient to access and modify data: there is no need to change it for object-oriented access, although, as is usual for such systems, some extra lower-level primitives have been added.

In the HERMES project [8], all SQL select statements are first handled by a geographical preprocessor (Cartech) that implements topological predicates and functions. For simplicity this aspect is ignored in the present paper.

It is a pleasure to acknowledge the contributions made to this work over the years by a number of researchers and other colleagues, especially Arvind Kaur, Mihail Hiladakis, John Oram and Paul Oldfield.

0950-5849/93/080449-13 © 1993 Butterworth-Heinemann Ltd

OVERVIEW OF THE OBJECT SYSTEMS APPROACH

This section introduces the main concepts from the point of view of the user of the object system approach. The implementation, which is discussed towards the end of the paper, uses an object manager interposed between the relational database and the user applications, to implement the object system concept.

Object classes and relational tables

The simplest possible way of associating object-oriented ideas with a relational system is to allow a relational table to correspond to an object class (or frame type), and each record in the table to correspond to an object of that class. Primary keys give an immediate interpretation of object identity: they identify a single record in the table. Other columns (fields) of the table then correspond in a natural way to attributes (instance variables) of objects of the class. This simple case, where each attribute has a single value, and attributes do not have subattributes, is sufficient for tables that have primary keys. (Most commercial database systems tolerate tables without primary keys, and for example, allow repeated tuples, but such tables cannot be used with the system described in this paper.)

In databases that have been designed using entity-relationship modelling, most entities will be of this form. For simplicity during this section, the concepts are presented for the case where the primary keys are all simple, that is, consist of a single column. Multiple-column primary keys are commonplace, however, and are discussed a little later. (See Figure 1.)

Attributes of an object can be updated in the ordinary way. But the requirements of object identity dictate that an update SQL statement cannot be used to alter the value of the primary key.
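The table-to-class correspondence can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: SQLite stands in for the external DBMS, the lower-case column names and the population figure are invented, and `load_class` and `update_attribute` are hypothetical helpers, not part of the object manager described in the paper.

```python
import sqlite3

# SQLite stands in for the external DBMS; schema modelled on TOWN in Figure 1.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE town (town_no INTEGER PRIMARY KEY, town_name TEXT, population INTEGER)"
)
# The population value is illustrative only.
conn.execute("INSERT INTO town VALUES (?, ?, ?)", (1123, "Paisley", 75000))


def load_class(conn, table, key):
    """Each row becomes one object; the primary key value is its identity."""
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    objects = {}
    for row in cur:
        record = dict(zip(columns, row))
        identity = record.pop(key)   # key column names the object
        objects[identity] = record   # remaining columns are its attributes
    return objects


def update_attribute(objects, key, identity, attr, value):
    """Ordinary attributes may change; the primary key may not."""
    if attr == key:
        raise ValueError("the primary key carries object identity")
    objects[identity][attr] = value


towns = load_class(conn, "town", "town_no")
update_attribute(towns, "town_no", 1123, "population", 76000)
```

The dictionary keyed by primary key is one simple way to make identity explicit; any structure that forbids changing the key after loading would serve the same purpose.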

Relationships and synthesised object classes

A relationship between two entities will be implemented in the database using a table with two columns, giving the primary key of each of the entities. (See Figure 2.) Using SQL, traversing such relationships is achieved using joins. The object manager supports this, of course, but also provides a much more natural and efficient mechanism as an alternative.

A good way of viewing this sort of relationship from an object-oriented viewpoint is that it defines set-valued attributes [10] of both the Town and Road objects. Then each Town has an associated list of roads it is on, and each Road has a list of towns that it passes through. This conceptual step, however, implies (a) that there should be set-valued attributes in the system, and suggests a notion of subattributes whereby the attributes of a Road

[Figure 1 (diagram not reproduced): table TOWN with columns Town#, Town_name, Population; table ROAD with columns Road#, Road_name, Class.]

Figure 1. The simplest case: entities defined by relational tables


Information and Software Technology


[Figure 2 (diagram not reproduced): table TOWN_ROAD ("Town x is on Road y") with columns Town_no, which refers to TOWN(Town#), and Road_no, which refers to ROAD(Road#).]

Figure 2. Relationships can be defined by relational tables containing foreign keys

might appear as subattributes of elements of the set; and (b) that if both sets are visible in the object system, internal mechanisms should ensure that they are automatically kept in step: an update to one should affect the other.

It turns out to be rather difficult to contrive a natural way of naming these synthesized attributes, or the extended Town and Road object classes that contain this additional information. In general, entities will participate in many relationships. It is best, therefore, to allow the users of the system to specify new names for the attributes and classes, and to add a synthesis mechanism whereby a new object class can be defined by a foreign key.

Thus, in the above example, the user has considerable choice in how to access the above information. The default mechanism would create an object class from the table Town_Road, but since no primary key is specified for the table, it will arbitrarily use Town_no as the object key, and give each Town_Road a set-valued attribute Road_no; the user can of course override this and specify which column(s) to use (within the object manager) as the primary key. By default, also, the object manager will obtain the foreign key constraints from the external database, and enforce them. The user may specify in addition that the foreign key information is used to synthesize a new object class called (say) Town1 containing added information about roads, by composing the Town object with the Town_Road relationship. (See Table 1.) (The definition of SF and other tables used by the object manager to define object classes is given later, under the heading 'Object manager system tables'.)

The object class Town1 is thereby defined, by default, as having Town_no as its primary key, and simple attributes from Town together with a set-valued attribute Road_no. This object class is defined using two relational tables, which are consulted when any object is accessed for the first time, and which are updated if necessary when the user requests changes to Town1 to be written out to the external database.

Table 1. SF

Frame_Type     Attribute   Refclass   Defines
town_road      Town_no     town       town1
town1          Road_no     road       town2
capital_city   City#       town       capital
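The composition steps recorded in Table 1 can be sketched in Python as follows. This is a toy model under stated assumptions, not the object manager's implementation: classes are plain dictionaries keyed by primary key, the attribute values are invented, and the function names are hypothetical.

```python
# Illustrative data only; the attribute values are assumptions, not from the paper.
town = {1123: {"town_name": "Paisley"}}
road = {"A737": {"road_class": "A"}, "A726": {"road_class": "A"}}
town_road_rows = [(1123, "A737"), (1123, "A726")]  # rows of TOWN_ROAD


def default_relationship_class(rows):
    """Default Town_Road class: Town_no is the key, Road_no is set-valued."""
    grouped = {}
    for town_no, road_no in rows:
        grouped.setdefault(town_no, set()).add(road_no)
    return grouped


def synthesize_town1(town, town_road):
    """Town1: simple attributes of Town plus the set-valued attribute Road_no."""
    return {
        key: {**attrs, "road_no": set(town_road.get(key, ()))}
        for key, attrs in town.items()
    }


def synthesize_town2(town1, road):
    """Town2: Road attributes appear as subattributes of each Road_no element."""
    result = {}
    for key, attrs in town1.items():
        simple = {a: v for a, v in attrs.items() if a != "road_no"}
        simple["road_no"] = {r: road[r] for r in sorted(attrs["road_no"])}
        result[key] = simple
    return result


town_road = default_relationship_class(town_road_rows)
town1 = synthesize_town1(town, town_road)
town2 = synthesize_town2(town1, road)
```

Note that `town2` refers to the same `road` dictionaries rather than copying them, which loosely mirrors the claim that synthesis does not duplicate data.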


The process can be iterated using the foreign key information for Road_no, to define a new object class called (say) Town2, as shown in Table 1. This also has Town_no as its primary key, and the attributes from Town, but now has the attributes for Road appearing as subattributes of the set-valued attribute Road_no. Alternatively, the user could traverse the relationship from the Road object, provided the primary key of the Town_Road class is specified to be Road_no. In order to do both, the user may specify a second object class Town_Road1 that accesses the same relational table TOWN_ROAD, but with the primary key differently defined.

As to the internal mechanisms for keeping the relationship-sets in step, these turn out to be largely enforced by the same mechanisms that are required to provide the concept of object identity. In fact, the mechanism for synthesized object classes does not result in duplication of data within the object manager: the same object (instance) in the object manager's memory is retrieved by traversing Town2 to a given road as by accessing it directly as a member of the Road object class. However, where more than one object class is constructed from the same relational table, duplication within the object manager does occur, and care is taken to propagate updates among such replicated sets.

Synthesized object classes have been presented here for the special case of binary relationship tables, but the construction iterates well to ternary and higher relationships. More generally, any foreign key can be used to define a synthesized object class. For example, the notion of inheritance is another special case of synthesized object class (although for historical reasons, specialization of objects may also be specified directly). Suppose, for example, that some towns have administrative roles in relation to regional government, and their attributes are specified in a table CAPITAL_CITY. Suppose that the same primary key is used for cities as for towns; then City# is a foreign key referencing TOWN(Town#), and the synthesized object class Capital can thus be defined: its attributes will consist, by default, of all the attributes from Town together with all the attributes from Capital_City. For multi-column primary keys, mechanisms exist to specify the other components of the foreign key. These are described later.
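The identity property described in this section (a road reached through Town2 is the same in-memory instance as the road accessed directly) can be illustrated with a small identity map. This is a hedged sketch: `ObjectSpace` and `load_road` are hypothetical names, and the loader merely stands in for a fetch from the external DBMS.

```python
class ObjectSpace:
    """Toy identity map: at most one in-memory instance per (class, key)."""

    def __init__(self):
        self._instances = {}

    def get(self, cls, key, loader):
        ident = (cls, key)
        if ident not in self._instances:
            # First access: consult the (simulated) external database.
            self._instances[ident] = loader(key)
        return self._instances[ident]


def load_road(key):
    """Hypothetical loader standing in for a DBMS fetch; values are invented."""
    return {"road_no": key, "road_class": "A"}


space = ObjectSpace()

# Direct access to the Road class.
direct = space.get("road", "A737", load_road)

# Traversal from a town follows the set-valued attribute to the same instances.
town2 = {
    "town_no": 1123,
    "roads": [space.get("road", r, load_road) for r in ("A737", "A726")],
}
via_town = town2["roads"][0]

# An update made through one access path is visible through the other.
direct["road_class"] = "B"
```

Because both paths resolve to one instance, no separate propagation step is needed in this case; propagation only becomes necessary when two classes hold separate copies of data from the same table.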

Table 2.

Town#   Town_name   Road#
1123    Paisley     A737
1123    Paisley     A726
1123    Paisley     A741

Object classes and SQL

SQL can be used in a natural way to access and modify data within object classes of all these types. In particular, new values of set-valued attributes can be added or deleted using the ordinary INSERT and DELETE statements, and subattributes can be modified or cleared using the UPDATE statement.

First consider the INSERT statement. Under the influence of the default model where a row in a table corresponds to an object, it is easy to see that an INSERT can be used to create a new object. But if there are set-valued attributes or subattributes, some of the records in a table may refer to the same object. As an example, consider the class Town1 synthesized above. (See Table 2.) (Recall that here, Road# is a set-valued attribute, so that three elements of the set for Town# = 1123 are shown.) Inserting a new record in Table 2 with Town# 1123 will not create a new Town object, but will add a new element of the set-valued attribute Road# to the object 1123.

CREATE TABLE can be used to create a new simple object class with primary and foreign key information. The creation of classes with set-valued attributes or subattributes, and dynamic synthesis and specialization of object classes, can be done via the system tables used by the object manager, but the changes to these tables must be written out to the external database before the object manager attempts to access the information. The best way is to restart the object manager with the new information.

Views are also supported in the object manager. Views in the external database are treated in the object manager in the same way as tables: in particular they can be modified, but the modifications cannot be written out to the external database. This facility is provided as a practical matter for the HERMES project, but modifications inside the object manager to views can lead to data inconsistency. The only safe thing to do would be to disallow any modification to views. (See Figure 3.)

CREATE VIEW establishes a new pseudo-object class whose attributes are obtained from other classes by a process of renaming and selection. Views of this kind cannot be modified.

A SELECT statement can be interpreted readily in an object-oriented way. The FROM clause in a select statement gives a list of views or classes, which generalize the concept of table. The column names in a select statement are attribute names, searched for in the hierarchies of attributes defined by the classes, and prefixes can disambiguate this search.

The synthesized object class concept introduced earlier provides the notion of inheritance in this system. The form of inheritance it provides is dynamic, as the following example shows. Suppose that there are two relational tables PERSON and EMP, where PERSON (corresponding to a simple class Person) gives a set of personal attributes of people identified by a Person#, and EMP is a table with primary key Emp# that gives employment information and refers to the PERSON table via the foreign key Person#. Then the mechanism described above can create a class Employee as the synthesized object class uniting the information from these two tables, where Person# is a primary key. (This could be created as a view in an ordinary relational DBMS, but then it could not be updated.)

As every Employee is a Person, it is correct to say that Employee is a subclass of Person and inherits all its attributes and methods. Selecting from PERSON will yield only the information that was there before, but selecting from EMPLOYEE will now give all information from both EMP and PERSON. Moreover, many SQL statements can be much simpler: we can write

    SELECT NAME, EMPLOYER FROM EMPLOYEE

[Figure 3 (diagram not reproduced): labels recoverable from the diagram are Workstation 1, Workstation 2, File Server, om0, om_oracle, om_ingres and applic3. Typical OM activity snapshot. The two instances of om0 are different.]

Figure 3. Software architecture for a local area network


instead of

    SELECT A.NAME, B.EMPLOYER FROM PERSON A, EMP B WHERE A.PERSON_NO = B.PERSON_NO

In the object system we can use the SQL DELETE statement on EMPLOYEE. If we say

    DELETE FROM EMPLOYEE WHERE NAME = 'Fred'

all record of Fred's salary, etc., will disappear; but Fred will still be a Person. On the other hand,

    DELETE FROM PERSON WHERE NAME = 'Fred'

will automatically remove all trace of Fred from both classes.
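For comparison, the behaviour described above can be approximated in a plain relational DBMS. The sketch below uses SQLite from Python and is only an emulation under stated assumptions: the view, the ON DELETE CASCADE constraint, the helper functions and the sample rows (including the employer name) are all invented for illustration, and none of this is how the object manager itself is implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
CREATE TABLE person (person_no INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE emp (
    emp_no    INTEGER PRIMARY KEY,
    person_no INTEGER REFERENCES person(person_no) ON DELETE CASCADE,
    employer  TEXT
);
-- EMPLOYEE as a read-only view uniting PERSON and EMP.
CREATE VIEW employee AS
    SELECT a.person_no, a.name, b.employer
    FROM person a JOIN emp b ON a.person_no = b.person_no;
INSERT INTO person VALUES (1, 'Fred'), (2, 'Mary');
INSERT INTO emp VALUES (10, 1, 'Acme'), (11, 2, 'Acme');
""")


def delete_employee(conn, name):
    """Emulates DELETE FROM EMPLOYEE: drops the EMP part, keeps the Person."""
    conn.execute(
        "DELETE FROM emp WHERE person_no IN "
        "(SELECT person_no FROM person WHERE name = ?)",
        (name,),
    )


def delete_person(conn, name):
    """Emulates DELETE FROM PERSON: the cascade removes the EMP rows too."""
    conn.execute("DELETE FROM person WHERE name = ?", (name,))


rows_before = conn.execute(
    "SELECT name, employer FROM employee ORDER BY name"
).fetchall()

delete_employee(conn, "Fred")
fred_persons = conn.execute(
    "SELECT COUNT(*) FROM person WHERE name = 'Fred'").fetchone()[0]
fred_employees = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE name = 'Fred'").fetchone()[0]

delete_person(conn, "Mary")
mary_emp_rows = conn.execute(
    "SELECT COUNT(*) FROM emp WHERE person_no = 2").fetchone()[0]
```

The extra helper functions are exactly the bookkeeping that the synthesized Employee class is meant to make unnecessary: in the object system a single DELETE on the class name is said to suffice.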

Application program interface

The object manager (OM) provides a shared object space for a collection of clients. The information in this object space is obtained as required from the external DBMS or provided by its clients. Each instance of the object manager can be shared among a set of applications, typically running on the same workstation. By default, each OM is local to a workstation and is identified by a small integer (0, 1, ...) which it automatically allocates itself on startup. It obtains its set of object classes and relevant databases on startup from a small set of standard database tables which are retrieved from a given database (which is specified for each instance). In this way a set of OMs provides independent contexts of shared data for applications running on that workstation, with lightweight manipulation of this shared data avoiding access to the database system.

Groups of applications cooperating on the same data should share the same instance(s) of OM, but applications that are not cooperating should start another OM. It is easy to make an OM available to other workstations by simply copying the file containing its Internet address from its position in /tmp on one workstation to the corresponding file on another, but this should be limited to cases where the workstations are agreeing to pool transient data. The workstation's virtual memory is used for the shared memory.

The application program interface uses the C calling sequence, instead of the more usual embedded SQL which requires a preprocessor. There may be more than one object manager operating on the workstation, identified by a small integer (the application can access more than one of these). The principal interface functions are:

• Sql: Allows an application to send an SQL statement to OM.
• Select: Allows an application to obtain the results of an SQL select statement from OM.
• Describe: Allows an application to obtain the table structure that would result from a Select using the given SQL query.


The results of a Select statement are returned in a table structure giving the number of rows and columns, a list of column headings and formats, and a set of arrays of data items, one for each column in the table. There is a library function FreeTable to reclaim the memory used in the application for this structure.
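The column-oriented shape of this result structure can be pictured with a short Python sketch. The real interface uses the C calling sequence and its exact declarations are not given in the paper, so every name here (`ResultTable`, `from_rows`, the field names and the format strings) is an assumption made purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ResultTable:
    """Hypothetical mirror of the structure returned by Select."""
    n_rows: int
    n_cols: int
    headings: List[str]        # one heading per column
    formats: List[str]         # one format per column
    columns: List[List[str]]   # one array of data items per column

    def row(self, i):
        """Reassemble row i from the per-column arrays."""
        return [col[i] for col in self.columns]


def from_rows(headings, formats, rows):
    """Build the column-oriented structure from row-oriented data."""
    if rows:
        columns = [list(col) for col in zip(*rows)]
    else:
        columns = [[] for _ in headings]
    return ResultTable(len(rows), len(headings), list(headings), list(formats), columns)


# Sample data is invented for illustration.
table = from_rows(
    ["NAME", "EMPLOYER"],
    ["%s", "%s"],
    [("Fred", "Acme"), ("Mary", "Acme")],
)
```

In Python the garbage collector reclaims the structure, so nothing corresponding to FreeTable is needed in this sketch; in C the caller would have to release it explicitly.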

Object manager system tables

The relational tables described here provide the mechanism for specifying the construction of an object system over the existing relational data. All of these tables can be empty, and then by default each relational table in the database appears in the object system as a class in the manner described above. Details on object classes and attributes are obtained from a standard set of database tables when OM is started up. These tables are currently defined as follows (the word frametype is to be read as synonymous with class):

fi (frametype, tname, cname, attr)

This table gives the correspondence between table and class names, and column names and attributes, for any cases where the defaults are inconvenient. If a name is used in the object manager where a class would be expected, and the name is not given in this table, the object manager assumes that the default behaviour of the system is required to define a class from a table of that name in the database, even if this table is already being accessed in other ways by the object manager. Moreover, a many-many relationship between classes and tables is supported, where the attributes of a class can be drawn from various tables. Such replicated access is managed correctly: modification propagates to all copies.

fs (frametype, specializes)

This table gives a way of specifying inheritance directly. It is not generally used: composition using the SF table is preferred. When a class specializes another, all attribute and key information is inherited, together with the relationships with the database, and this inherited information can be extended for the new class (including the addition of further components to the primary key).

fk (frametype, keypos, attr)

Most primary key information can be obtained from the external database. Primary keys can also be specified using the FK table provided the primary key information in the external database does not conflict with this. Primary keys can have several components, identified by the keypos field (values 1, 2, ...).

aa (frametype, attrname, subattr, rpt)

This table allows subattributes to be specified. By default all non-key attributes are non-repeating subattributes of the last component of the primary key, so entries in this table can be used to define further attributes, or to override the default rpt value of 'F'.

al (frametype, attrname, refclass, keypos, attr)

Most foreign key information can be obtained from the external database or from the SF table described in this section. This table allows additional links (foreign keys) to be specified. The attrname field gives the last component of the foreign key, and other components if any are specified using the last two fields.

sf (frametype, attrname, refclass, defines)

This table is used for specifying composition of foreign keys to define a new class. In the class specified by the frametype field, the attrname field is the last component of the foreign key, and references a class specified by the refclass field. This composition defines a new class whose name is given in the defines field. (See previous examples.) Earlier components of the foreign key, if any, can be specified using the AL table, or obtained from the external database.

Information contained in these tables can be used to give access to any tables in any DBMS accessible by OM. The general format for a table name is

    [[dbms.]dbname.]table

For ORACLE the dbname part has the form user/password.

FORMALISM SUPPORTING THE OBJECT MANAGER CONCEPTS

The following formalism for the Object Manager concepts is needed in order to place the semantics of the operations described earlier on a sound basis. In this presentation, it is assumed that all inheritance is provided using the synthesized class mechanism described previously. (The general case of inheritance can be treated using a transitive antisymmetric relation on the set of classes, which is used to factor the other constructs described in the next sections.) It is also assumed that modifications to views are disallowed. For simplicity we also restrict to string-valued attributes: the existence of other formats does not add anything new.

Classes

Classes can be identified by name: for convenience we will identify classes with a set of strings $\Gamma \subseteq \Sigma^+$, where $\Sigma^+$ denotes the set of nonempty (variable-length) strings. We denote the empty string by NULL; the set of all variable-length strings is denoted $\Sigma$.

Example In the simplest mapping of a relational database to the object system, a table or view defines a class. We will refer to this mapping as the relational model, RM.

Each class $C$ has a primary key $k_C = (k_1, \ldots, k_n) \in \Sigma^{+n}$ with $n \geq 1$.

Example In RM, the primary key corresponds to the names of the columns making up the primary key in the associated database table or view.

Each class $C$ defines a domain $\mathcal{O}_C$ of objects. Each object $T$ has a name $\nu(T) \in \Sigma^{+n}$, where $n$ is such that $k_C \in \Sigma^{+n}$. The mapping $\nu : \mathcal{O}_C \to \Sigma^{+n}$ is one-to-one. We will denote the inverse partial map also by $C$, thus $C : \Sigma^{+n} \to \mathcal{O}_C$, with $C(\eta) = T$ where $\nu(T) = \eta \in \Sigma^{+n}$.

Example In RM, each row of the table or view $C$ is an object of $C$, and the primary key specifies it uniquely. Each component of the primary key typically has an index defined on it, so that looking up any one of the set of values of the key gives a subset of rows: the last component of the primary key yields a unique row.

Classes in this section do not include view classes (those defined within the object manager by CREATE VIEW).

Attributes

A class defines a network of attributes, identified by name. Attributes may have subattributes [11]; this is what defines the network structure, and attributes may have sets of values. If there is no restriction on the cardinality of the set of values, the attribute is said to be a 'repeating attribute', but most attributes can have at most one value (i.e. the set of values has cardinality 0 or 1). The value of a non-repeating attribute is the single element of its set of values, if it exists, or NULL if the set of values is empty.

Remark In this formalism, components of the primary key are considered to be attributes of the class. This usage of the term attribute is not usual in the literature: attributes normally give properties of objects. But the presence of composite primary keys complicates the formalism, and it is to try to keep the complexities under control that this strange usage is introduced. With the terminology adopted here, objects appear as the values of the attribute $k_n$, the last component of the primary key. If the primary key is simple (consists of one component), and using the terminology of RM for simplicity, then the value of the primary key is the object name, and the remaining columns define the 'attributes of the object' in the more usual terminology.

Example In RM, non-primary columns are non-repeating attributes and have no subattributes; they are subattributes of the last component of the primary key.

Each (repeating) attribute gives an index in which to look up its values. In particular, the first component of the primary key defines an index: if $n = 1$ the entries in this index are the object names, and the index gives the objects.

For simplicity in our formalism, we identify an attribute with its name $a$, so that the set of attributes of $C$ is identified with a subset $A_C$ of the set of nonempty strings $\Sigma^+$. The set of attributes that are repeating is denoted $R_C$, with $R_C \subseteq A_C$. Since components of the key are considered to be attributes, they are clearly repeating attributes, and form a subset $K_C \subseteq R_C$, where $K_C$ is the underlying set of the vector $k_C$.

Then the subattribute relationship within a class $C$ defines a relation $\sigma_C \subseteq \Sigma \times \Sigma$, such that if $a_1 \sigma_C a_2$ then $a_1$ is a subattribute of $a_2$. An attribute may be a subattribute of itself, though this is rather unusual (like recursive data structures in computer science): this is why attributes form a network and not in general a tree. Each component of the key is a repeating subattribute of the previous component. These data are required to satisfy the following conditions, some of which were expressed in words above:

(A1) For the $k_i$ where $k_C = (k_1, \ldots, k_n)$ is the primary key of $C$, $k_i \in R_C$, and $k_i \sigma_C k_{i-1}$ for $i > 1$.

Example For RM, the set of attributes of a class $C$ consists of the set of column names of the table $C$. The primary key concept corresponds, and (A1) amounts to a definition of $\sigma$ and $R$.

(A2) For any $a \in A_C$, there exists exactly one sequence $p(a) = (a_0, a_1, \ldots, a_n)$ where $a_0 = k_1 \in k_C$, $a_n = a$, and $a_i \in A_C$, $a_i \sigma_C a_{i-1}$ for $1 \leq i \leq n$.

There is a relation $\mathrm{Fkey} \subseteq \Gamma \times \Sigma^{+*} \times \Gamma$ which defines foreign key information, subject to the following condition:

(A3) If $\mathrm{Fkey}(C, (a_1, \ldots, a_m), C')$, where $k_C = (k_1, \ldots, k_n)$ and $k_{C'} = (k'_1, \ldots, k'_{n'})$, then
(a) for each $i$ with $1 \leq i \leq m$, $a_i \in A_C$, and if $i > 1$ then $p(a_{i-1}) \subseteq p(a_i)$;
(b) $m = n'$ and $A_C \cap A_{C'} = \emptyset$;
(c) if $a_i = k_j$ for some $i$, $j$ then $i = j$.

Remark (a) is required for compatibility with (A1), since the foreign key must have the structure of a key; (b) ensures that the foreign key's length is compatible with the referenced primary key (the second part is a renaming issue since the formalism assumes no conflict between attribute names); (c) ensures that if the foreign key contains a component of the primary key, it must contain all the earlier components of the key, and is in accordance with intuition though it may not be logically necessary.

There is a partial function $\mathrm{Synth} : \mathrm{Fkey} \to \Gamma$ corresponding to the SF table above. This mapping is subject to the following conditions:

(A4) If $\mathrm{Synth}(C, (a_1, \ldots, a_m), C') = C''$, then
(a) $k_{C''} = k_C \cup \{k'_m\}$, $A_{C''} = A_C \cup A_{C'}$, $R_{C''} = R_C \cup R_{C'}$;
(b) the relationship $\sigma_{C''}$ is defined as follows: $a \, \sigma_{C''} \, b$ iff $a \, \sigma_C \, b$, or $a \, \sigma_{C'} \, b$, or $a = k'_m \in A_{C'}$, $b \in A_C$ and $a_m \, \sigma_C \, b$.

Remark These conditions define the synthesized class $C''$. Note that $a_m \in A_C$ because of (A3)(a). Using this concept there is a natural interpretation of an entity-relationship model system (ERM) as an object system. Assuming that the ERM is implemented in a relational database, first construct an object class for each entity. Now each binary relationship $\rho$ records a relationship between two entities $C$ and $C'$ such that $(s,t) \in \rho$ when $C(s) \, \rho \, C'(t)$, and so $\rho$ contains two foreign keys. $\rho$ can be considered to yield two object classes, one with the foreign key for $C$ as its primary key, and one with that for $C'$. In this way the earlier discussion can be implemented in this formalism. The construction is not limited to binary relationships, since condition (A3)(a) allows multiple components of a foreign key to be found from ancestors in the instance hierarchy, and so the construction can be applied recursively.

We define the set of base classes (non-synthesised classes) as $\Delta = \Gamma - \mathrm{Synth}(\mathrm{Fkey})$.

Instances

To define the general semantics of attributes, including subattributes, repeating attributes, and links, it is convenient to introduce the notion of instance.

Remark Intuitively, objects and attribute values are instances. An object will in general have a list of values for each attribute (as repeating attributes are allowed), and each value defines a new instance (as subattributes are allowed). Thus the result of indexing an instance by a string gives a new instance. To find an object in a table in RM requires an indexing operation using each component of the primary key. With the above definition of the set of attributes, this mechanism is that of traversing subattributes and repeating attributes.


For each $C \in \Lambda$, let the domain $\mathcal{T}_C$ of instances of $C$ be defined formally as follows. First, $\mathcal{T}_C$ contains a distinguished element $e_C$. For each $\tau \in \mathcal{T}_C$, there is a partial map $\tau : \Sigma \times \Sigma \to \mathcal{T}_C$ satisfying the following axioms.

(I1) $\tau(a,s)$ is undefined unless $a \in A_C$ and one of the following is true: $a = k_1$ and $\tau = e_C$, where $k_C = (k_1, \ldots, k_n)$ is the primary key of $C$; or $\tau = \tau'(a',s')$ for some $\tau'$, $a'$, $s'$ with $a \sigma_C a'$.

Remark We can write $\tau(a,s) = \tau'(a',s')(a,s)$.

(I2) $\tau(a,s)$ is defined for at most one string $s$ unless $a \in R_C$ (i.e. unless $a$ is a repeating attribute).

Notice that in the spirit of definitions such as the above, nothing else can be said about instances. Thus for any $\tau \in \mathcal{T}_C$, either $\tau = e_C$, or $\tau = \tau'(a,s)$ for some unique strings $a$, $s$ and instance $\tau' \in \mathcal{T}_C$; this leads to a representation of any $\tau \in \mathcal{T}_C$ as $\tau = e_C(a_1,s_1)\ldots(a_m,s_m)$ for $m \ge 0$, and a consequence of the above rule (I1) is that then $p(a_m) = (a_1, \ldots, a_m)$. We express this as a condition:

(I3) If $\tau(a,s) = \tau'(a',s')$ then $\tau = \tau'$, $a = a'$ and $s = s'$.

Objects are instances: there is a natural identification $\mathcal{O}_C \subseteq \mathcal{T}_C$ as follows: given $T \in \mathcal{O}_C$, if $s = (s_1, \ldots, s_n) = v(T)$, then $T = C(s) = e_C(k_1,s_1)\ldots(k_n,s_n)$. If the value of attribute $a$ of $T$ is the set $S$, then there are instances $T(a,s)$ for each $s \in S$, and so on for subattributes of $a$.
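The indexing rules (I1) and (I2) can be illustrated with a minimal Python sketch. It is not the object manager's actual data structure: the names `ClassDef`, `Instance` and `index`, and the Town class with its attributes, are assumptions made for illustration only. An instance is represented by its path $e_C(a_1,s_1)\ldots(a_m,s_m)$, so uniqueness of the representation, as required by (I3), holds by construction.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ClassDef:
    """Hypothetical description of a class C (all names are illustrative)."""
    name: str
    key: tuple            # k_C = (k_1, ..., k_n)
    repeating: frozenset  # R_C, which by (A1) includes the key components
    sub: frozenset        # pairs (a, b) meaning "a is a subattribute of b"


@dataclass
class Instance:
    cls: ClassDef
    path: tuple = ()      # ((a_1, s_1), ..., (a_m, s_m)); the empty path is e_C
    children: dict = field(default_factory=dict)

    def index(self, a, s):
        """Return the instance self(a, s), creating it when (I1) and (I2) allow."""
        if self.path:
            # (I1): a must be a subattribute of the last attribute on the path
            allowed = (a, self.path[-1][0]) in self.cls.sub
        else:
            # (I1): only the first key component may be applied to e_C
            allowed = a == self.cls.key[0]
        if not allowed:
            raise KeyError(f"(I1) forbids attribute {a!r} here")
        if (a, s) in self.children:
            return self.children[(a, s)]
        # (I2): a non-repeating attribute may carry at most one value
        if a not in self.cls.repeating and any(b == a for b, _ in self.children):
            raise ValueError(f"(I2): {a!r} is not a repeating attribute")
        child = Instance(self.cls, self.path + ((a, s),))
        self.children[(a, s)] = child
        return child


# Illustrative class: a town with a single name and a set of roads.
town = ClassDef(
    name="Town",
    key=("town_no",),
    repeating=frozenset({"town_no", "road_no"}),
    sub=frozenset({("name", "town_no"), ("road_no", "town_no")}),
)
e_town = Instance(town)
t1 = e_town.index("town_no", "T1")   # the object Town('T1')
t1.index("name", "Paisley")
t1.index("road_no", "A737")
m8 = t1.index("road_no", "M8")       # a second value of a repeating attribute
```

Storing the whole path in each instance is wasteful but keeps the sketch close to the notation; an implementation would more plausibly keep a parent pointer.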


We disallow redundant instances:

(I4) If $k_C \in \Sigma^n$ and $e_C(k_1,s_1)\ldots(k_i,s_i)$ is defined, then there is some $e_C(k_1,s_1)\ldots(k_n,s_n) \in \mathcal{O}_C$.

Remark This condition merely deals with the instances needed to construct multi-component keys. The condition could be rewritten: if $e_C(k_1,s_1)\ldots(k_i,s_i)$ is defined, and $k_C \in \Sigma^n$, there is an $s \in \Sigma^n$ with $(s_1, \ldots, s_i) \subseteq s$ such that $C(s)$ is defined.

Foreign keys must be well behaved. Temporarily defining the partial function $\phi(\tau,a) = s$ if $\tau = e_C(a_1,s_1)\ldots(a_m,s_m)$, $a = a_i$ for some $i$ and then $s = s_i$,

(I5) If $\mathrm{Fkey}(C,(f_1,\ldots,f_n),C')$, then for every $\tau = e_C(a_1,s_1)\ldots(a_m,s_m)$ such that $\phi(\tau,f_n)$ is defined with $f_n = a_i$ (this implies that $\phi(\tau,f_j)$ is defined for $j < n$), $C'(\phi(\tau,f_1),\ldots,\phi(\tau,f_n)) \in \mathcal{O}_{C'}$.

This gives the mechanism for defining the instance set $\mathcal{T}_{C''}$ for a synthesized class $C'' \in \Gamma - \Lambda$. If $\mathrm{Synth}(C,(f_1,\ldots,f_n),C') = C''$, with $\tau$ as in (I5) and $T$ the object of $C'$ that it references, then whenever $\tau' = T(a'_{n+1},s'_{n+1})\ldots(a'_p,s'_p) \in \mathcal{T}_{C'}$, we have $\tau'' = e_{C''}(a_1,s_1)\ldots(a_i,s_i)(a'_{n+1},s'_{n+1})\ldots(a'_p,s'_p) \in \mathcal{T}_{C''}$, and likewise any initial subsequence thereof. In particular we identify as an object $T'' = e_{C''}(a_1,s_1)\ldots(a_i,s_i) \in \mathcal{O}_{C''}$.
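The function $\phi$ and the foreign key rule (I5) can be sketched in a few lines of Python. This is an illustrative reading of the rule, not the implemented check: the function names, the representation of an instance as a tuple of (attribute, value) pairs, and the DEPT/EMP style attribute names are all assumptions.

```python
def phi(path, a):
    """phi(tau, a): the value paired with attribute a on the path of tau, or None."""
    for attr, value in path:
        if attr == a:
            return value
    return None


def violates_i5(path, fkey_attrs, referenced_keys):
    """True when the path carries a complete foreign key with no referenced object.

    path            -- ((a_1, s_1), ..., (a_m, s_m)), representing an instance of C
    fkey_attrs      -- (f_1, ..., f_n), the foreign key from C to C'
    referenced_keys -- set of key tuples of the existing objects of C'
    """
    values = tuple(phi(path, f) for f in fkey_attrs)
    if values[-1] is None:
        # (I5) only applies once the last component f_n is defined
        return False
    return values not in referenced_keys


# Illustrative data: an employee instance referring to department '10'.
emp = (("dept_no", "10"), ("emp_no", "7"))
ok = violates_i5(emp, ("dept_no",), {("10",)})       # department exists
dangling = violates_i5(emp, ("dept_no",), {("20",)})  # department missing
```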

Object identity

Next consider the disjoint union

$$\mathcal{T} = \bigoplus_{C \in \Lambda} \mathcal{T}_C.$$

The above notation for instances explicitly names the class $C$, and thus distinguishes the elements of this disjoint union. At the end of the previous section it was noted that for a $C'' \in \Gamma - \Lambda$, $\mathcal{T}_{C''}$ can be identified as a subset of $\mathcal{T}$.

Example Suppose the class Employee is synthesized from Person using the foreign key person# in a table Emp. Then the above identification means that we regard the Person whose name is 'Fred' as the same object as the Employee whose name is 'Fred'. On the other hand we cannot ask for

SELECT SALARY FROM PERSON WHERE NAME = 'Fred';

since Persons do not have Salary. But we can ask

SELECT SALARY FROM EMPLOYEE WHERE NAME = 'Fred';

even though Person Fred and Employee Fred are the same object. Using the language of joins, the last query here is the same as

SELECT EMP.SALARY FROM EMP, PERSON WHERE PERSON.NAME = 'Fred' AND PERSON.PERSON# = EMP.PERSON#;

Furthermore, if we

SELECT NAME FROM PERSON

we will expect a larger set of names than if we

SELECT NAME FROM EMPLOYEE

This example motivates the definition of object given in the previous section.

Object system

Formally construct the domain of all attributes as a disjoint union

$$A = \bigoplus_{C \in \Gamma} A_C,$$

and similarly construct $R$ and $\sigma$. Consider the domain of instances $\mathcal{T}$ together with the mapping $k$ and $\mathrm{Synth}$ and relations $\sigma$, $R$, $\mathrm{Fkey}$ as an object system. There is a category of such object systems $\mathcal{T}$: given $\mathcal{T}$ and $\mathcal{T}'$, a morphism $\alpha : \mathcal{T} \to \mathcal{T}'$ satisfies the following properties:

(O1) If $a_1 \sigma a_2$, then $\alpha(a_1)\,\sigma'\,\alpha(a_2)$ (i.e. $\alpha$ preserves the subattribute relation).

(O2) If $a \in R_C$ then $\alpha(a) \in R'_{\alpha(C)}$ (i.e. $\alpha$ preserves the repeating attribute property).

(O3) If $\tau' = \alpha(\tau)$, then $\alpha(\tau(a,s)) = \tau'(\alpha(a),s)$ (i.e. $\alpha$ preserves attributes).

(O4) If $\mathrm{Fkey}(C,(f_1,\ldots,f_n),C')$, then $\mathrm{Fkey}'(\alpha(C),(\alpha(f_1),\ldots,\alpha(f_n)),\alpha(C'))$ (i.e. $\alpha$ preserves foreign keys).

(O5) If $\mathrm{Synth}(C,(f_1,\ldots,f_n),C') = C''$, then $\mathrm{Synth}'(\alpha(C),(\alpha(f_1),\ldots,\alpha(f_n)),\alpha(C')) = \alpha(C'')$ (i.e. $\alpha$ preserves synthesized classes).

The importance of the morphism concept is that the construction that OM carries out to collect its data from the relational database is a morphism in this sense from RM to the object system defined for OM. We can now in fact speak of the object system RM. Note that it is not a requirement of morphisms that keys are preserved; for example, we might have $k_C \in \Sigma^m$ and $k'_{\alpha(C)} \in \Sigma^n$ with $m \ne n$. On the other hand, the subattribute requirement (O1) would require that at least they must agree on $k_1$, and foreign keys will impose additional constraints. Since keys are not in general preserved, there is also no preservation of object identity through a morphism.

As simple examples of morphisms, notice that consistent renaming of an attribute or class name is a morphism (but data cannot be changed). If $\mathcal{T}$ is an object system, then forgetting some or all of the classes, attributes or link relations defines a brain-damaged object system $\mathcal{T}'$ and there is a natural morphism $\mathcal{T}' \to \mathcal{T}$ corresponding to the restoration of the missing structure. In particular there is a natural morphism from the empty object system $\mathcal{T}_\emptyset$ to any other. We will denote natural morphisms corresponding to the definition of structure by $\delta$. There are also natural morphisms corresponding to the addition of data, but we reserve $\delta$ for morphisms that do not involve alterations to data, that is, $\delta$ is always onto (it is not always one to one as the examples below show). But there are no morphisms corresponding to forgetting of structure or deletion or updating of data.

These definitions define a category: in particular, there is an identity morphism $\iota_{\mathcal{T}} : \mathcal{T} \to \mathcal{T}$, and the composition of two morphisms is a morphism. Disjoint union forms a natural binary operation $\oplus$ on object systems. If $\Gamma' \subseteq \Gamma_{\mathcal{T}}$, there is a natural object system $\mathcal{T}' = \mathcal{T}|_{\Gamma'}$, obtained by forgetting the classes not in $\Gamma'$, and we can write $\mathcal{T}' \subseteq \mathcal{T}$; in such a case we can form the complement $\mathcal{T} - \mathcal{T}'$ in a natural way; but of course there is no way in general of reconstructing $\mathcal{T}$ from $\mathcal{T}'$ and $\mathcal{T} - \mathcal{T}'$, since relations across the two pieces will have been lost.

Relational operators

Given an object system $\mathcal{T}$ and a class $C$, suppose a subset of attributes $A \subseteq A_C$ is selected such that if $a_1 \sigma a_2$ and $a_1 \in A$ then $a_2 \in A$. Then a new object system $\mathcal{T}'$ can be defined, adding to $\mathcal{T}$ a new class $C'$, the projection $\mathrm{PR}_A C$, with:

$A_{C'} = A$, $R_{C'} = R_C \cap A$, $k_{C'} = k_C \cap A$, $\sigma_{C'} = \sigma_C|_A$, instances $\mathcal{T}'_{C'} = \{e_{C'}(a_1,s_1)\ldots(a_m,s_m)$ whenever $e_C(a_1,s_1)\ldots(a_m,s_m) \in \mathcal{T}_C$ and all $a_i \in A\}$, and $(C',(f_1,\ldots,f_n),C'') \in \mathrm{Fkey}_{\mathcal{T}'}$ whenever $(C,(f_1,\ldots,f_n),C'') \in \mathrm{Fkey}_{\mathcal{T}}$ and all $f_i \in A$.

There is a natural morphism $\delta : \mathcal{T} \to \mathcal{T}'$.

Given an object system $\mathcal{T}$ and classes $C$ and $D$, a new object system $\mathcal{T}'$ can be defined, adding to $\mathcal{T}$ a new class $C'$, the cartesian product $C\,\mathrm{CP}\,D$, with:

$A_{C'} = A_C \oplus A_D$, $R_{C'} = R_C \oplus R_D$, $k_{C'} = (k_1,\ldots,k_{n+m})$ where $k_C = (k_1,\ldots,k_n)$, $k_D = (k_{n+1},\ldots,k_{n+m})$, $\sigma_{C'}$ defined so that $a \sigma_{C'} b$ iff $a \sigma_C b$ and $b \ne k_n$, or $a \sigma_C k_n$ and $b = k_{n+m}$, or $a = k_{n+1}$ and $b = k_n$, or $a \sigma_D b$, instances $\mathcal{T}'_{C'} = \{e_{C'}(a_1,s_1)\ldots(a_k,s_k)$ whenever for some $j \le k$, $e_C(a_1,s_1)\ldots(a_j,s_j) \in \mathcal{T}$, $e_D(a_{j+1},s_{j+1})\ldots(a_k,s_k) \in \mathcal{T}$ and $a_{j+1} \sigma_{C'} a_j\}$, and $(C',(f_1,\ldots,f_n),C'') \in \mathrm{Fkey}_{\mathcal{T}'}$ if $(C,(f_1,\ldots,f_n),C'') \in \mathrm{Fkey}_{\mathcal{T}}$ or $(D,(f_1,\ldots,f_n),C'') \in \mathrm{Fkey}_{\mathcal{T}}$.

There is a natural morphism $\delta : \mathcal{T} \to \mathcal{T}'$.

Remark The condition on foreign keys is merely to extend the existing foreign key constraints to the new class.

In the definition of cartesian product, suppose now that the attribute sets share some initial components of the primary keys. Then we can construct instead the reduced cartesian product $C' = C\,\mathrm{RP}\,D$, with:

$A_{C'} = A_C \cup A_D$, $R_{C'} = R_C \cup R_D$, $k_{C'} = (k_1,\ldots,k_{n+m})$ where $k_C = (k_1,\ldots,k_n)$, $k_D = (k_1,\ldots,k_i,k_{n+1},\ldots,k_{n+m})$, $\sigma_{C'}$ defined so that $a \sigma_{C'} b$ iff $a \sigma_C b$ and $b \ne k_n$, or $a \sigma_C k_n$ and $b = k_{n+m}$, or $a = k_{n+1}$ and $b = k_n$, or $a \sigma_D b$ and $b \ne k_i$, instances $\mathcal{T}'_{C'} = \{e_{C'}(a_1,s_1)\ldots(a_k,s_k)$ such that $a_{j+1} \sigma a_j$ for $1 \le j < k$, and there are $\tau_1 \in \mathcal{T}_C$, $\tau_2 \in \mathcal{T}_D$ such that for each $j$ with $1 \le j \le k$, at least one of $\phi(\tau_1,a_j)$, $\phi(\tau_2,a_j)$ is defined and if defined their value is $s_j\}$, and $(C',(f_1,\ldots,f_n),C'') \in \mathrm{Fkey}_{\mathcal{T}'}$ if $(C,(f_1,\ldots,f_n),C'') \in \mathrm{Fkey}_{\mathcal{T}}$ or $(D,(f_1,\ldots,f_n),C'') \in \mathrm{Fkey}_{\mathcal{T}}$.

There is a natural morphism $\delta : \mathcal{T} \to \mathcal{T}'$.

Unless $\mathrm{Synth}$ is already defined totally, a new object system $\mathcal{T}'$ can be constructed with an additional synthesized class $C'$ using some $(C_1,(f_1,\ldots,f_n),C_2) \in \mathrm{Fkey}$. If in $\mathcal{T}'$, $\mathrm{Synth}(C_1,(f_1,\ldots,f_n),C_2) = C'$ is defined using condition (A4), the class $C'$ is the natural join $C_1 \bowtie C_2$. There is a natural morphism $\delta : \mathcal{T} \to \mathcal{T}'$.

Tables and selection

The result of selection in a relational system is a relation. Unfortunately, the result of selection in an object system is not an object, but a table. A table is just a rectangular matrix of (possibly null) strings: $t = \{t_{ij}\} \in \Sigma^{m \times n} \subseteq \Sigma^*$. A vector of attributes of a given class, $v \in A_C^n$, defines a table $t$ with $n$ columns whose rows consist of values of a coherent set of instances of $\mathcal{T}_C$ in the following sense. (Here $n$ is not necessarily the number of components of $k_C$.)

(T1) (Coherence) For each $i$, $j$, if $t_{ij}$ is not null, there is a (unique) instance $\tau_{ij} = e_C(a_1,s_1)\ldots(a_p,s_p)$ with $a_p = v_j$, $t_{ij} = s_p$, and for $1 \le r < p$, if $a_r = v_k$ for some $k$ then $t_{ik} = s_r$.

(T2) (Exhaustion) For each pair of instances $\tau = e_C(a_1,s_1)\ldots(a_p,s_p)$ with $a_p = v_j$, and $\tau' = e_C(a'_1,s'_1)\ldots(a'_{p'},s'_{p'})$ with $a'_{p'} = v_{j'}$, such that $a_1 = a'_1$, and $s_r = s'_r$ whenever $a_r = a'_r$, there is at least one row $i$ with $t_{ij} = s_p$ and $t_{ij'} = s'_{p'}$.

Remark The condition ensures that the table contains all coherent combinations of instances. A combination is included in the table if the values of matching attributes coincide. Note that in the case $\tau = \tau'$, the condition reduces to ensuring that every suitable instance (i.e. $a_p = v_j$ for some $j$) appears in the table.

(T3) (Nontriviality) Not all the entries in a row of $t$ are null.

(T4) (Irredundancy) No two rows of $t$ are constructed using the same set of $\tau_{ij}$.

Thus, for example, if an object $T$ has just one repeating attribute $a \in v$ whose value is the set $S$, there will be a row in the table for each $s \in S$. If two repeating attributes $a$, $a'$ are in $v$, neither being a descendant of the other, with value sets (for $T$) $S$ and $S'$, then there will be a row in the table for each pair $s$, $s'$ with $s \in S$ and $s' \in S'$.

This construction defines a partial function $\mathrm{Table}: A^* \to \Sigma^*$. The partial nature of the function comes from the fact that in this section all of the attributes for a table must belong to the same class.
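The effect of conditions (T1) to (T4) in the simplest case can be sketched in Python as follows. The sketch deliberately covers only a single object whose attributes in $v$ are not descendants of one another, so that the rows are just the combinations of the value sets; the general case with nested subattributes is not handled. The function name `table_rows` and the town data are assumptions made for illustration.

```python
from itertools import product


def table_rows(obj_values, v):
    """Rows of the table for ONE object and an attribute vector v.

    obj_values -- attribute name -> set of strings (a single-valued attribute
                  is a set with one element); a missing attribute yields nulls
    v          -- tuple of attribute names, none a descendant of another
    """
    # An attribute with no value contributes a single null (None) column entry.
    columns = [sorted(obj_values.get(a, ())) or [None] for a in v]
    # (T2): every combination of values appears; (T4): each appears once.
    rows = product(*columns)
    # (T3): a row consisting only of nulls is not part of the table.
    return [row for row in rows if any(cell is not None for cell in row)]


# Illustrative object: a town with two repeating attributes.
town_t1 = {
    "town_no": {"T1"},
    "road_no": {"A737", "M8"},
    "market_day": {"Sat", "Wed"},
}
rows = table_rows(town_t1, ("town_no", "road_no", "market_day"))
```

With the data above there is one row for each pair of a road and a market day, as the text describes for two repeating attributes.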


In the query language, an SQL select statement selecting from a single class now defines a table using the following algorithm. First construct the vector $v$ consisting of all the attributes mentioned in the select part of the select statement, together with any key components not mentioned, and any attributes referred to in the where-condition. Then the result of the select statement is the table $t_v$, after deleting all rows that do not satisfy the where-condition, and projecting onto the columns mentioned in the select part.

There is a natural way of defining where-conditions and projections as operations on tables, selecting a subset of rows and columns respectively. Intuitively a where-condition should impose some condition on the values contained in a row, and a projection should select a subvector of the vector $v$ that defined the table. The details are omitted here.

Joins are implemented as a selection from the reduced cartesian product of the tables mentioned in the select statement (after possible renaming of attributes where columns are to match), whose where-condition selects for a match of the column pairs referred to in the join condition, followed by a projection to remove one of the (or all but one of a set of) matching columns. Alternatively, joins could be defined directly, by modifying the above condition on instances to require that if $a_j$ is one of the shared key attributes, both $\phi(\tau_1,a_j)$ and $\phi(\tau_2,a_j)$ are defined and have $s_j$ as their common value.

Remark In the current implementation of the object manager, internally, tables contain instances $\tau_{ij}$ rather than the values $t_{ij}$, so that tables can be used for 'positional updates' to the database similarly to cursors in relational databases. However, neither the formalism of relational calculus, nor the formalism presented here, is concerned with imposing an order on data, so positional updates probably should not be supported.
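The single-class select algorithm described above (build the vector, build the table, filter, project) can be sketched in Python. The function `select`, its parameters and the employee data are hypothetical; in particular `build_table` stands in for the table construction and is simply supplied by the caller here.

```python
def select(build_table, select_cols, key_cols, where_cols, predicate):
    """Sketch of a single-class SELECT over an object system.

    build_table -- callable taking the vector v and returning rows as tuples
                   aligned with v (a stand-in for the table construction)
    select_cols -- attributes named in the select part
    key_cols    -- components of the primary key
    where_cols  -- attributes referred to in the where-condition
    predicate   -- callable taking a row as a dict and returning True or False
    """
    # Step 1: v = selected attributes, plus unmentioned key components,
    # plus attributes used by the where-condition.
    v = list(select_cols)
    for a in list(key_cols) + list(where_cols):
        if a not in v:
            v.append(a)
    # Step 2: build the table for v.
    rows = [dict(zip(v, row)) for row in build_table(v)]
    # Step 3: delete the rows that do not satisfy the where-condition.
    kept = [row for row in rows if predicate(row)]
    # Step 4: project onto the columns mentioned in the select part.
    return [tuple(row[c] for c in select_cols) for row in kept]


# Illustrative data; all values are strings, as in the formalism.
employees = [
    {"emp_no": "1", "name": "Fred", "salary": "1200"},
    {"emp_no": "2", "name": "Mary", "salary": "900"},
]


def employee_table(v):
    return [tuple(e.get(a) for a in v) for e in employees]


well_paid = select(
    employee_table,
    select_cols=["name"],
    key_cols=["emp_no"],
    where_cols=["salary"],
    predicate=lambda row: int(row["salary"]) > 1000,
)
```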

OBJECT MANAGER IMPLEMENTATION

Data structures

In the implementation of OM, there is a data structure for each class, instance and attribute. Each class indexes the attributes of its primary key, each attribute indexes its subattributes, and each attribute knows which class defined it. The values of attributes are implemented in the following way. For each attribute $a$ enumerate the attributes $a'$ such that $a' \sigma_C a$: this allows the attribute information for an instance to be stored in an array. For each attribute $a$ and instance $\tau$, index the set $\{\tau(a,s)\}$ using the distinct strings $s$. The enumeration of attributes used here is performed at startup of OM, when it obtains the subattribute relationships from the database.

To enable manipulation of attributes in OM, they could be treated as if they were objects belonging to a class Attribute that has no inheritors or inheritees. Then, if we identify $A$ as a subset of $\Sigma$ as we have done above, identifying a string $a$ with an attribute $a$ of that name, we would have a natural identification of $a$ with Attribute($a$). Thus attributes may be defined for later use using SQL statements that refer to the classes FrameType or Attribute.

As explained previously, there is an additional layer of complexity in the OM implementation which results from allowing the attribute names and corresponding database columns to be different. This is useful when a class is being constructed from an existing set of database tables, and was referred to in the RM example under the guise of disambiguation of attribute names. In fact, a relation defines the correspondence of classes to tables and columns to attributes.

Finally, OM allows data from a table to be used in more than one class. This is not supported by the morphism concept for object systems, and in fact the implementation uses the following construction. For each FrameType $C$, there is a method defined which associates the class $C$ with a set of projected database tables in the manner of the examples of the last section. Let $\mathrm{RM}_C$ denote the associated subdatabase, so that we have natural morphisms

$$\mathrm{RM} \longleftarrow \mathrm{RM}_C \xrightarrow{\ \psi_C\ } \mathcal{T}_C$$

where $\psi_C$ is simply the natural map given by the instance construction above. Then this construction extends to natural morphisms

$$\mathrm{RM} \xleftarrow{\ \iota\ } \bigoplus_C \mathrm{RM}_C \xrightarrow{\ \psi\ } \mathcal{T}$$

where $\iota(x) = x$ for $x \in \mathrm{RM}_C$ simply forgets the component $C$. In fact this construction implements a many-to-many relation between RM and $\mathcal{T}$. Parts of a database table may be used to define more than one class; and a class may combine information from more than one database table. This feature was used earlier. The constructions are simple, nevertheless, and so it is a relatively easy matter to verify that the OM implementation is correct.

Data manipulation in the object manager

All of the above discussions on object systems and morphisms relate to the starting state of the object manager, before any SQL statements or requests from the application are processed. From this point on, the object system behaves for many purposes similarly to a relational system. This section returns to the issues discussed previously in the light of the above formalism.

For example, in the implementation of a SELECT statement, there are two phases, compilation and execution. In the first phase, a list of tableaux [12] is constructed, one for each OR clause in the WHERE condition. First, the views mentioned in the SELECT statement are removed, and the underlying classes are used instead, with appropriate renaming of attributes and additions to the WHERE condition. Each row of a tableau controls selection from a join. The keys and subattribute relations are used to construct a hierarchy of attributes for this search. When we SELECT from a class $C$ according to some search condition, we test the search condition on all objects of class $C$, traversing the hierarchy to collect lists of values corresponding to the row of the tableau. These lists are then taken in combination and tested against the WHERE condition.

An important side effect of selection is that data is fetched as required from the external database. Once fetched, it is remembered, and not fetched again. To enforce rule (I5) about foreign keys, it is also necessary to explore backward foreign key references. For reasons associated with concepts of inheritance, which will be discussed in another paper, forward links are also explored. All fetching of data from the external database is under the control of the fragment manager, described in the next section.

INSERT can be used to create new instances in the object system (strictly speaking to create a new object system differing from the old by the addition of new data). The new instances may constitute a new object, or may add values of a set-valued attribute of an object. For each member $a$ of the attribute list, all of $p(a)$ must be supplied in the attribute list and the associated value list is used to ensure that requests for instances of the form $e_C(a_1,s_1)\ldots(a_m,s_m)$ are supplied to the object system. This proceeds recursively on the length, so that at each stage, given $\tau$, a new instance $\tau(a,s)$ is requested.

If this is a leaf request (no descendants), the request is ignored if $s$ is NULL; otherwise, if $a \notin R_C$, it is an error if $\tau(a,x)$ is already defined for any $x$, and if $a \in R_C$ it is an error if $\tau(a,s)$ is defined. If it is not a leaf request, then the request is illegal if $s$ is NULL; otherwise, if $a \notin R_C$, it is an error if $\tau(a,x)$ is defined for some $x \ne s$. If the request passes these tests, then the instance $\tau(a,s)$ is created if necessary. Note that consequential modifications will occur to synthesized classes, and classes defined by a relational expression.

When we DELETE from a class $C$ according to some search condition, we remove sufficient instances so that there are no instances of class $C$ satisfying the search condition. If an instance is removed, then all descendants of that instance are also removed. However, an instance cannot be removed if such removal would result in violation of rule (I5) on foreign keys. With a little care, these semantics can be made to have the desired effects on modification of repeating (set-valued) subattributes.
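The INSERT rules for a single instance request can be written out as a small Python function. This is a sketch of the tests as stated in the text, not the object manager's code: the function name, the representation of the existing children of $\tau$ as a set of (attribute, value) pairs, and the returned status strings are assumptions.

```python
def request_instance(children, repeating, a, s, is_leaf):
    """Apply the INSERT tests to a request for tau(a, s).

    children  -- set of (attribute, value) pairs already defined under tau
    repeating -- the set R_C of repeating attributes
    a, s      -- requested attribute and value (None stands for NULL)
    is_leaf   -- True when the request has no descendants
    """
    existing = {x for (b, x) in children if b == a}
    if is_leaf:
        if s is None:
            return "ignored"                 # NULL leaf requests are dropped
        if a not in repeating and existing:
            raise ValueError(f"{a!r} already has a value")
        if a in repeating and s in existing:
            raise ValueError(f"{a!r} already contains {s!r}")
    else:
        if s is None:
            raise ValueError("NULL is illegal for a non-leaf request")
        if a not in repeating and existing - {s}:
            raise ValueError(f"{a!r} already has a different value")
    if (a, s) in children:
        return "exists"                      # created only if necessary
    children.add((a, s))
    return "created"


# Illustrative use: a town with a repeating road_no attribute.
town_children = {("road_no", "A737")}
first = request_instance(town_children, {"road_no"}, "road_no", "M8", is_leaf=True)
second = request_instance(town_children, {"road_no"}, "name", None, is_leaf=True)
```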

Example If a road R now bypasses town T, in the example shown earlier, this can be effected in the object manager by

DELETE FROM Town1 WHERE town_no = T AND road_no = R;

Consequential modifications will occur to synthesized classes, and classes defined by a relational expression.

Because of the requirements of object identity, it is not possible to use UPDATE to modify a primary key. Otherwise UPDATE works exactly as one would expect. The WHERE clause selects a set of instances $\tau$, as in the SELECT statement, and for each $\tau$, the SET clause is used to modify the descendants of $\tau$. Consequential modifications will occur to synthesized classes, and classes defined by a relational expression.

DBMS interface: fragments

OM maintains a set of data structures to record the delicate relationship between DBMS tables and OM instances. In general an object class will require many database tables. It is also quite legal for the same database table (or portions thereof) to be used in more than one object class, but this must be explicitly requested in the information initially supplied to OM. This means that changes to one object may result in consequential changes to other objects. For example, consider two tables DEPT and EMP where the primary key of EMP consists of Dept# and Emp#; Dept# is a foreign key for the DEPT table; and EMPLOYEE is synthesized using this foreign key. Then if the Accounts department is renamed Management Services by the appropriate update to the DEPT table:

UPDATE DEPT SET DEPTNAME = 'Management Services' WHERE DEPTNAME = 'Accounts';

the new department name will be visible immediately in the EMPLOYEE records.

The fragment concept is designed to minimize access to the DBMS, at the cost of maintaining additional information in memory; but because of the many-to-many nature of the relationship, much of this controlling information is needed in any case. Changes made to the information stored in OM are not written out to the DBMS unless this is specifically requested. For each database table (or view) that OM accesses:

(1) OM keeps a note of what columns each database table possesses.

(2) OM maintains a list of fragments that have been retrieved from the database: fragments are 'mixed' fragments (horizontal and vertical fragmentation). In fact each fragment has the form of a projection onto a set of columns and WHERE clauses using = constant and ≠ constant conditions on repeating attributes only. An equality fragment will contain a list of associated instances in OM; an inequality fragment indicates that a parent instance (together with its children) contains all information in an area of the table that has been accessed to date. However,
(a) fragments form a tree reflecting the subattribute structure of the OM instance network: this allows simplification of the fragments so that they contain WHERE clauses on only one column, so that the WHERE clause consists of a single equality condition or a sequence of ≠ conditions;
(b) we also record if a fragment has been deleted, added, updated, or that no data was found in the table when the fragment was requested;
(c) for non-key columns in the table, a bitmask in the leaf fragment records which columns have been retrieved from the DBMS.

In the fragment hierarchy, care is taken to ensure that
(a) no ancestors of add, delete or update fragments are inequality fragments, although ordinary equality fragments can have inequality ancestors and vice versa;
(b) an inequality fragment is split when an equality selection on it will actually yield more tuples (i.e. after a database access has confirmed that more tuples need to be added). Thus an equality fragment must correspond to one or more instances in OM: if more than one, they must be in different objects;
(c) a delete fragment must be a leaf in the fragment hierarchy.

A fragment that has an inequality WHERE clause will be split if (i) OM needs to obtain more information from the database, or (ii) so that any change to the database table will be contained in a fragment which has an equality constraint, as do all its ancestors. Splitting an inequality fragment results in creation of an equality fragment at that level, and the addition of a further inequality to the inequality fragment; any subfragment information must be copied (there are no associated instances to be affected by this).

(3) OM also records the set of top-level instances in OM that are bound to the database table; for each such binding we note the relevant fragment (top of a fragment hierarchy), and the attributes corresponding to the columns in the database table. Each OM instance records a list of such bindings.

(4) OM also records the owner and timestamp associated with the table, as obtained from the database interface.

For each object class (frametype) OM keeps a note of the associated database tables: in general there is a many-to-many relationship between classes and tables. Associated with each binding between class and table there is a mechanism for restricting the binding to certain attribute values, and a mapping between columns and attributes.

When a SELECT seeks information that must be fetched from the database, an appropriate fragmentary enquiry is made to the database (i.e. the result will become a single fragment), either specifying a single value for a column, or retrieving all values of that column, subject to conditions on the parent columns in the instance hierarchy, and the associated instances are built in OM.

The object manager takes special care over foreign keys. When defining a new class, the object manager checks to see if any classes reference it with foreign keys. When any instance is created, the object manager checks both forward and reverse links, obtaining any referenced or referring objects from the external database. (This is done in a way that tolerates circular references.) In order to validate update and delete statements, the associated classes are checked before the operation is performed. In this way the object manager can ensure that no referenced object is deleted, and can make the necessary adjustments to links when foreign keys are updated.

If the changes made within OM are requested to be made to the database, the fragment list is traversed and a sequence of SQL statements is flushed to the DBMS, adding, deleting, or updating records for each leaf fragment in the hierarchy. In all cases, the fragment hierarchy contains sufficient information to build the correct WHERE clause, and for add and update the instance information in OM is supplied to the DBMS.
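How a fragment hierarchy can carry enough information to build a WHERE clause, and what splitting an inequality fragment involves, can be sketched in Python. The class `Fragment`, its fields and the column names DEPT_NO and EMP_NO are assumptions for illustration; the status flags, bitmasks and instance bindings described above are omitted, and the naive string quoting would need proper escaping or bind parameters in real use.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Fragment:
    """One node of a hypothetical fragment tree: a condition on ONE column."""
    column: str
    equals: Optional[str] = None                   # set for an equality fragment
    excluded: list = field(default_factory=list)   # constants of an inequality fragment
    parent: Optional["Fragment"] = None

    def condition(self):
        """This fragment's own condition as a list of SQL comparisons."""
        if self.equals is not None:
            return [f"{self.column} = '{self.equals}'"]
        return [f"{self.column} <> '{c}'" for c in self.excluded]

    def where_clause(self):
        """Conjunction of the conditions from the root down to this fragment."""
        parts, node = [], self
        while node is not None:
            parts = node.condition() + parts
            node = node.parent
        return " AND ".join(parts)

    def split(self, value):
        """Split an inequality fragment: exclude value here and return the
        new equality fragment created at the same level."""
        if self.equals is not None:
            raise ValueError("only inequality fragments can be split")
        self.excluded.append(value)
        return Fragment(self.column, equals=value, parent=self.parent)


# Illustrative hierarchy: all employees of department '10', then employee '7'.
dept = Fragment("DEPT_NO", equals="10")
rest = Fragment("EMP_NO", parent=dept)   # inequality fragment with no exclusions yet
before = rest.where_clause()
emp7 = rest.split("7")
```

After the split, `emp7` selects exactly one employee while `rest` covers everything else already fetched for the department, which mirrors the rule that changes should sit under equality fragments only.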

External database interface

Databases on the network are accessed using a database interface developed from that of SMART [9], with changes to obtain information about tables from the database. The interface is implemented as local-network-wide server processes, one per DBMS, which are shared among the OMs. Typically, such a database server would be on a file server, but OM imposes no restrictions on this. Thus the object manager uses an auxiliary database server to obtain its information from the external database. More than one DBMS can be used by an object manager, and each database server may have several object managers as its clients. The external database service is currently fully implemented for Oracle, and partially implemented for Ingres. The object manager uses the following services:

• sqlsend: sends an SQL statement for execution by the DBMS
• sqlselect: sends an SQL select statement; the DBMS returns format information and a vector of strings
• sqlcommit: commits changes to the DBMS

In addition a number of primitives are defined to obtain information about tables and views, such as the column descriptions, primary and foreign keys; and also for timestamp control, disconnection, and to stop the server.
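The three services can be mimicked with a small Python class. Only the service names come from the text; the method signatures, the return shape of `sqlselect`, and the use of an in-memory SQLite database in place of the Oracle or Ingres server are assumptions made so that the sketch is runnable.

```python
import sqlite3


class SqliteDatabaseServer:
    """Stand-in for the auxiliary database server, backed by SQLite."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")

    def sqlsend(self, statement):
        """Send an SQL statement for execution by the DBMS."""
        self._conn.execute(statement)

    def sqlselect(self, statement):
        """Send a select; return (format information, rows of strings)."""
        cursor = self._conn.execute(statement)
        # Column names stand in for the "format information" of the text.
        fmt = [description[0] for description in cursor.description]
        rows = [
            tuple(None if value is None else str(value) for value in row)
            for row in cursor.fetchall()
        ]
        return fmt, rows

    def sqlcommit(self):
        """Commit changes to the DBMS."""
        self._conn.commit()


# Illustrative session using the department renaming example.
server = SqliteDatabaseServer()
server.sqlsend("CREATE TABLE DEPT (DEPTNO TEXT, DEPTNAME TEXT)")
server.sqlsend("INSERT INTO DEPT VALUES ('10', 'Accounts')")
server.sqlsend(
    "UPDATE DEPT SET DEPTNAME = 'Management Services' "
    "WHERE DEPTNAME = 'Accounts'"
)
server.sqlcommit()
fmt, rows = server.sqlselect("SELECT DEPTNO, DEPTNAME FROM DEPT")
```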

Methods and messages SQL data manipulation languages generally do not support method invocation or programming concepts such as iteration, if-then-else. One view is to say that the DBMS part of the object system is not much concerned with methods, apart from the methods implicit in, say, SQL. From this point of view a method is something that can be delivered to an execution engine that applies it to the object ~3. The view taken of this aspect depends Information and Software Technology

M K CROWE

on the balance between the DBMS and the programming system, as discussed above. Methods are stored in the database together with the class definitions, and can be retrieved from these definitions and interpreted by an execution engine. Primitives to support this process can be added easily enough, once a suitable execution environment (such as C + + ) has been selected. If necessary, an extension to SQL could easily be devised, e.g. we could have SELECT

...

SEND

methname

WITH

application programmer in practice needs to write much simpler SQL, avoiding joins whose only purpose is to traverse a foreign key relationship (an example of this was given earlier). Members of the ESPRIT research community are entitled to receive the software for educational or research purposes free of charge, subject to the usual confidentiality undertakings.

... ;

REFERENCES where the SELECT constructs a set of objects to which the given message is to be sent, and the parameter list for the method is supplied in the WITH clause. The information about the method in the object type definition would identify the execution engine to be used and give other implementation-dependent information (e.g. the name of the function).

CONCLUSIONS A mechanism for layering object systems on top of relational databases has been defined, together with an associated formalism for analysis of the behaviour and semantics of post-relational database systems. A given relational database system can be used to define many object systems, so that object systems are seen as a way of organizing the information in a relational system. A number of relational operators exist in the object system, and the standard relational query language SQL can be used for data retrieval and manipulation in the object system. The value of the system in the H E R M E S project is partly that it acts as an intermediate layer above the RDBMS, so that changes for scenarios can be made in the object system and seen by applications using the object system, without being made to the RDBMS. But it is felt to be of wider interest in that using the synthesis construction and the extension concepts of set-valued attributes and subattributes objects can be treated as logical entities, with the object manager collecting information relating to these objects from the many relational tables involved. The resulting objects are not normalized: this is the whole p o i n t - - b u t operations on these objects, with the semantics presented in this paper are correctly transmitted to the RDBMS if the object system is commanded to do so. This means that the


