Schema design using a data structure matrix


Inform. Systems Vol. 3, pp. 81-91. © Pergamon Press Ltd. 1978. Printed in Great Britain

LEIF B. METHLIE

Norwegian School of Economics and Business Administration, N-5000 Bergen, Norway

(Received 25 March 1977; revised 20 June 1977)

Abstract-A formal design procedure for a data base schema, based on the conceptual framework of the Codasyl DBTG, is presented. A data structure matrix is used to identify and interrelate entry types of the data base. The design procedure rests upon a description of the set of information messages conceptually communicated with the data base. The messages are determined by an information analysis. The derived schema is a feasible DBTG-schema, i.e. the information requirements are retained in the entry and set types defined. However, the schema is not evaluated with regard to efficiency. The data structure matrix is compatible with the well-known data structure diagram. However, the matrix includes the structural elements as well as entry descriptions. Finally, the design procedure is demonstrated on a case study.

SCHEMA DESIGN TASKS

In this paper a procedure leading towards a physical realization of a data base managed by a data base management system (DBMS) is outlined. A DBMS is a generalized tool for data structure definition and data manipulation. It makes an integrated collection of data available to a wide variety of users. It allows centralized control of the data, which is necessary for efficient data administration.

DBMS technology in some form or another can be traced back 15-20 years. However, the report of the CODASYL Data Base Task Group (DBTG) [1] was a landmark in the development of data base technology. The DBTG specifications include several languages to be used to define and manipulate data. The design procedure described in this paper makes it possible to define a schema (a "schematic" diagram) of a data base utilizing the conceptual framework of DBTG. The schema can then be described in the DBTG data definition language (DDL).

The design of a data base managed by a generalized DBMS can be subdivided into the following tasks:
(i) schema design
(ii) allocation to physical storage
(iii) design of fall back and recovery procedures
(iv) design of restructuring and reorganization procedures.

Schema design is based on knowledge of the information to be supported by the data base. The information requirements or messages (cf. conceptual framework in the next section) may be deduced by a formal procedure, an information analysis, or merely stated. The messages carry information to the information user directly or to a process deducing new information. Occurrences of the messages are entities which are conceptually (not necessarily physically) communicated with the data base.

One problem in the design of data bases today is that there seems to be no formal procedure available to deduce data structures from specifications of information requirements. In smaller systems this may not be a significant problem. In larger systems, however, the problem of identifying appropriate entry types and entry relations may be great. One possible solution to the schema design problem is to implement the defined message types as entry types by a one-to-one mapping. This will lead to a simple file structure with no inter file relations. However, the data redundancy will be high, leading to great restructuring problems and complicated updating procedures. The schema design procedure presented in this paper is an attempt to formalize the mapping of information requirement specifications (messages) into a set of data structures compatible with a DBTG-schema.

As can be seen from Fig. 1, the schema design process can be subdivided into two tasks: design of a DBTG-schema and evaluation of this schema, with possible iterations between these two tasks. The basic input to the design process is the data representation of the information messages, transaction input/output data such as transaction frequencies, etc., and system objectives specifying, among other things, response time requirements.

Fig. 1. Outline of the schema design process.

The first task in the schema design process is the design of a feasible schema, which is a schema satisfying the necessary conditions:
(i) the information requirements determined by the information analysis
(ii) compatibility with a DBTG-schema.

To obtain a solution satisfying the performance requirements, certain performance measures such as retrieval efficiency, updating and insertion efficiency, data base creation time, storage space, etc. must be taken into consideration. Performance evaluations may lead to changes in the basic schema structure. However, these changes must not violate the necessary conditions stated above.

Design of a DBTG-schema may be further subdivided into:
(i) definition of entry types
(ii) definition of set types
(iii) intra entry structuring (entry lay-out)
(iv) inter entry type structuring (file structuring)
(v) inter entry structuring (ordering of entry occurrences).

Inter entry structuring is concerned with the way of placing and retrieving entry occurrences. This task is closely related to the allocation of data structures on physical storage media and will not be dealt with in this paper.

CONCEPTUAL FRAMEWORK

A collection of data, structured in one way or another in a data base, has no value of its own. Only by interpretation can data symbols give information to support decisions and actions. Interpretation implies assigning meaning to data so that entities outside the message itself, real world objects, can be recognized. Interpretation is a complementary process to representation, where terms referring to real world phenomena are represented by data. Representation and interpretation are parts of the communication process.

We will use the terms object, property, object relation and time when we discuss the real world realm. An object is a unique part of the real world which we are interested in. It may be concrete or abstract, like a person, enterprise, event, etc. A property is a characteristic of an object. A property plays a role in describing an entity, for example, identifying it or characterizing it. An object relation is a special feature of objects. Very often we are interested in properties of a relation between objects more than in those of a single object. For example, in a buying transaction, quantity ordered is a property neither of the customer nor of the articles ordered, but of the relation between customer and article. Time is a fundamental entity in the real world. In Fig. 2 concepts of the real world are illustrated.

Fig. 2. Illustration of real world concepts.

Information about real world objects is communicated between different persons in an organization by means of messages. A message is a theoretical concept relating signs to real world entities through an encoding and decoding process. Can a message carry information? This will obviously depend on the reference frames of the sender and receiver of the message. In the following, an operational definition of a message is given that is restricted to semantic aspects. A message is considered to consist of a set of reference terms, called attributes, which by their names can be associated with entities outside the message itself. The semantics of a message is concerned with the labeling of these entities and with the structure of the message. The message concept follows from work by Langefors [2] and corresponds, to some extent, to a fact representation by Senko [3]. In mathematical terms the form of a message may be regarded as an n-tuple of a relation. The general structure of a message is

$(I, P, T)$

where $I$ is an identifier vector denoting attributes which uniquely identify an object ($I$ is a scalar) or an object relation, $P$ is a property vector characterizing the object or the object relation given by $I$, and $T$ is a time vector. If this vector is omitted, it means "for all times".

It is convenient in systems work to separate the message description from its value contents, i.e. to split the attributes into attribute names and attribute values. A set of messages is said to belong to the same kind if the distinct messages in the set differ only in the values of the identifier and property attributes, and of time if this is included. The attribute names of a message define a message type. The collection of message types to be supported by the data base will be called an information structure schema of the system. An example of an information structure schema is shown in Fig. 8.

From knowledge of the information structure schema we may now be able to design the data structures. The basic entity describing data structures is the entry, which is an ordered set of data items, groups or group relations [1]. In the real world realm we distinguished between an object and an object relation. We want to retain this distinction through the mapping procedure. Thus we define two kinds of entry types, the object entry type and the relational entry type. In the following entry definitions we shall omit the time dimension introduced in the message above, assuming that an entry is valid as long as it exists or, if there exist two or more entries representing a phenomenon at different times, that they will be divided into generations of entries. An object entry type can be written as:

$((ED)(G_1, \ldots, G_n))$

ED is the entry defining group consisting of one or more elements, identifiers, all of which refer to the same object type and each of which is sufficient to uniquely identify the object type. Each identifier may be represented by data items and/or groups. For example, both order date, with the attribute name ODATE, and order number, named ONUM, refer to the object type named ORDER. ONUM is an item, while ODATE is a group consisting of the items YEAR, MONTH and DAY. In other words, we may use different attributes to identify the same object. Thus, the identifying attributes must be analysed with respect to which objects they refer to. In the basic schema alternative we want a one-to-one map between an object type and an entry type.

The second entry type to be defined is the relational entry type:

$((ED_1, \ldots, ED_m)(G_1, \ldots, G_n))$

The relational entry type describes a set of entries representing object relations and their associated properties. $(ED_1, \ldots, ED_m)$ is an entry defining vector. Each element $ED_i$ in the object relation has the same properties as the entry defining group of the object entry type described above. Note that for the entry defining vector to be complete, we require at least one identifying attribute from each element, i.e. each associated object in the relation must be represented. The existence of an entry type is declared by naming it.

Intra entry structuring is concerned with assigning attributes to each entry type and the ordering of these attributes within the entry. In the basic schema definition we are concerned with which attributes to assign to which entry types. In subsequent analysis the ordering may be changed due to performance evaluations.

It is appropriate to distinguish between inter entry type structuring, or file structuring, and inter entry structuring. Inter entry type structuring is concerned with connecting different entry types. Such a connection we call a set type. A set type consists of one owner-entry type and one or more member-entry types. Just as a distinction was drawn between entry type and entry (occurrence), so a distinction is made between set type and set occurrence. The existence of a set type is declared by naming it and stating its owner-entry type and its member-entry type or types. A set occurrence is an occurrence of its owner-entry type together with zero or more occurrences of each of its member-entry types. In every set occurrence, the following relations exist:
(i) given an owner-entry, it is possible to access the related member-entry(-ies) of that set occurrence
(ii) given a member-entry, it is possible to access the related owner-entry of that set occurrence
(iii) given a member-entry, it is possible to access other member-entries of the same set occurrence.

Furthermore, a given member-entry cannot simultaneously belong to more than one set occurrence of the same set type. In other words, an entry cannot be a member of two or more owner-entries of the same set type.

The entry and set concepts can be used to define different types of data structures. A hierarchy is a common data structure. In this structure, one entry owns 0 to n occurrences of another entry type, each of those occurrences in turn owns 0 to n occurrences of a third entry type, etc. No entry owns any entry that owns it, either directly or through other entries. We call this a 1-to-n relationship. An example of a hierarchical data structure is a representation of the relationship between departments and employees, where one department has many persons associated with it, but no person is associated with more than one department.

A second common data structure is the network. In this structure we have a many-to-many relationship: that is, an owner entry type can be related to 0 to n occurrences of another entry type, each of which is an owner entry of 0 to n occurrences of the first entry type. We have a 1-to-many relationship in both directions. An example is a customer who may buy one or more articles, while the articles may be sold to one or more customers. In this case we define a relational entry type, to which the attributes associated with the relation between the two objects represented in the two entry types are assigned: for example, the quantity of a specific article bought by a specific customer.
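
The access and membership rules for set occurrences can be made concrete with a small sketch. It is written in Python purely for illustration and is not DBTG DDL or DML; the class names `SetType` and `SetOccurrenceStore`, their methods, and the sample keys are hypothetical and introduced only for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SetType:
    """A named connection between one owner-entry type and its member-entry types."""
    name: str
    owner_type: str
    member_types: tuple


@dataclass
class SetOccurrenceStore:
    """Holds all occurrences of one set type (hypothetical helper)."""
    set_type: SetType
    members_of: dict = field(default_factory=dict)  # owner key -> list of member keys
    owner_of: dict = field(default_factory=dict)    # member key -> owner key

    def add_owner(self, owner):
        # A set occurrence exists as soon as its owner exists (zero members allowed).
        self.members_of.setdefault(owner, [])

    def connect(self, owner, member):
        # A member-entry cannot belong to two occurrences of the same set type.
        if member in self.owner_of:
            raise ValueError(f"{member!r} already belongs to {self.owner_of[member]!r}")
        self.add_owner(owner)
        self.members_of[owner].append(member)
        self.owner_of[member] = owner

    def members(self, owner):      # rule (i): owner -> related members
        return list(self.members_of[owner])

    def owner(self, member):       # rule (ii): member -> related owner
        return self.owner_of[member]

    def siblings(self, member):    # rule (iii): member -> other members
        return [m for m in self.members_of[self.owner_of[member]] if m != member]


# Hierarchy example: one department owns many persons.
person_of_department = SetType("PERSON-OF-DEPARTMENT", "DEPARTMENT", ("PERSON",))
store = SetOccurrenceStore(person_of_department)
store.add_owner("D2")            # a department with no persons yet
store.connect("D1", "Ann")
store.connect("D1", "Bob")
```

Keeping both directions (`members_of` and `owner_of`) in the store is one simple way to support all three access rules; a real DBMS would realize them through its own storage structures.
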
Two set types are established to relate the relational entry type to its two object entry types. In general, if the order of a relation is n, we will have n different set types. The file concept will not be used here. A file is a collection of entry occurrences which can be described by a common entry type and uses a common access algorithm to select complete entry occurrences.

DATA STRUCTURE MATRIX

From the conceptual framework defined above, we can summarize the concepts needed to define the data structures as:
(i) two kinds of entry types: the object entry types and the relational entry types
(ii) set types
(iii) two kinds of attributes: identifying attributes denoted by I and property attributes denoted by P
(iv) attributes which may be either items or groups. A group may be a repeating group denoted by P(N).

The objective of our schema design task is to describe the data structures in these terms. Furthermore, we should be able to deduce the data structures from the defined information structures by an analytic approach called a reduction procedure.

We propose to use a data structure matrix to identify and interrelate entry types of a data base. The data structure matrix fulfills four objectives:
(i) The matrix is the basis for the reduction procedure converting the information structures into data structures.
(ii) It is a suitable documentation form of the data base schema.
(iii) The matrix form is suitable for further analysis and evaluation.
(iv) It is a DBTG-schema representation.

The data structure matrix is divided into four partitions as shown in Fig. 3. The data structure matrix is a $(m + n) \times (m + k + r)$ matrix, where m is the number of object types, n is the number of identified object relations, k is the number of identifying attributes and r is the number of property attributes. Each row of the matrix represents an entry type. The first m rows represent object entry types, and the last n rows represent relational entry types. The name of each entry type is stated in each row $(O_1, \ldots, O_m, R_1(O_i \ldots O_j), \ldots, R_n(O_i \ldots O_j))$.
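
As a rough illustration of these dimensions, the following Python sketch builds an empty matrix and reports which partition a cell falls in. The function names `empty_matrix` and `partition` are hypothetical. The partition labels assume that the first index distinguishes object rows (1) from relational rows (2) and the second distinguishes set-type columns (1) from attribute columns (2), which matches how the partitions are used in the rest of this section. The sketch also assumes that object type names and attribute names never coincide.

```python
def empty_matrix(object_types, relations, id_attrs, prop_attrs):
    """Build an empty (m + n) x (m + k + r) data structure matrix.

    Rows: m object entry types followed by n relational entry types.
    Columns: m owner columns (set types), then k identifying and
    r property attribute columns.
    """
    rows = list(object_types) + list(relations)
    cols = list(object_types) + list(id_attrs) + list(prop_attrs)
    cells = {row: {col: "" for col in cols} for row in rows}
    return rows, cols, cells


def partition(row, col, object_types):
    """Return which of the four partitions A11, A12, A21, A22 a cell lies in."""
    row_part = "1" if row in object_types else "2"   # object row or relational row
    col_part = "1" if col in object_types else "2"   # set-type column or attribute column
    return "A" + row_part + col_part


# m = 2 object types, n = 1 relation, k = 2 identifiers, r = 1 property
objects = ["CUSTOMER", "ARTICLE"]
relations = ["CUST-ART"]
rows, cols, cells = empty_matrix(objects, relations, ["CUST", "ART"], ["QUANT"])
```

Here the matrix has 3 rows and 5 columns, i.e. (2 + 1) x (2 + 2 + 1).
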

The name of the entry type will be a description of the object class represented by the entry type. Note that this name may be different from the name(s) of the entry defining group(s) and that we may have more than one identifying attribute referring to the same object.

Example: PERSONNAME and CIVILNO are two identifying attributes of the same object type "person". They may be used in different messages to give information in different contexts. The name of the object entry type may be PERSON.

A relational entry type is given a name of its own. Example: Quantity (QUANT) bought of a certain article (ART) by a specific customer (CUST) can be denoted by ((CUST, ART) QUANT). The name of the relation may be CUST-ART and is stated in the matrix in the lower n rows ($R(O_i \ldots O_j)$).

Fig. 3. Data structure matrix.

Set types are stated in partitions $A_{11}$ and $A_{21}$. In partition $A_{11}$ we state a set type between two object entry types, and in partition $A_{21}$ we state the set types between a relational entry type and its owner-entry types. Relational entries are accessed VIA owner-entries. We use the names of the object classes to identify the m columns of partitions $A_{11}$ and $A_{21}$. An element $e_{i,j}$ of partition $A_{11}$ signifies a relation between two object entry types denoted by $O_i$ and $O_j$. An element $e_{i,j}$ of partition $A_{21}$ signifies a relation between a relational entry type $R(O_i \ldots O_j)$ and the owner-entry types defined by the object classes $O_i \ldots O_j$. The value of an element can be 1 or N, indicating a one-to-one or a one-to-many relation. The name of the set type is of the form X-OF-Y, where X is the name of the member-entry type (row) and Y is the name of the owner-entry type (column).

Example: PERSON and DEPARTMENT are names of two object entry types. However, information about persons belonging to a specific department is wanted from the data base. This information can be represented by a set with DEPARTMENT as owner-entry type and PERSON as member-entry type. The relationship is one-to-many and an "N" is assigned in the appropriate element of partition $A_{11}$, as shown below.

                 PERSON    DEPARTMENT
    PERSON                     N
    DEPARTMENT

The set type is given the name PERSON-OF-DEPARTMENT.

Quantity is a property of a buying transaction identified by the object relation (customer, article). We identify a relational entry type to which we give the name CUST-ART, and if we want to access this file VIA both the CUSTOMER-file and the ARTICLE-file (we have a many-to-many relationship), we establish the two set types (CUST-ART)-OF-ARTICLE and (CUST-ART)-OF-CUSTOMER, and we put N's in the appropriate elements of the matrix as shown in Fig. 4. If we want to relate quantity not only to customer and article but, for instance, to the salesman (person) as well, we may define an object relation of order three and establish three set types. Three N's are inserted in the columns corresponding to the object types of the relation.

Fig. 4. Illustration of set types.

Attributes are assigned to entry types in partitions $A_{12}$ and $A_{22}$. We distinguish between identifying attributes and property attributes and we denote them I and P respectively. The property attribute may be a repeating group, in which case the element is denoted P(N). Each attribute may be either an item or a group. One column is used for each attribute. Decompositions of groups are not specified in the matrix.

Example: Two messages informing about a project use different identifying attributes.
M1: (PROJNAME, COST)
M2: (PROJNO, READY-DATE)
Note that we do not decompose DATE into YEAR, MONTH, DAY with one column for each. The attribute part of the matrix is shown in Fig. 5. A combination of an I and a P element within an entry type represents a binary association (see Ref. [3]) or an elementary message type (see Ref. [2]).

Fig. 5. Assignment of attributes.
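
The examples of this section can be replayed in a short sketch that stores the matrix as a mapping from (row, column) to a cell value. This is an illustrative Python rendering, not part of the original procedure: the helper names `add_set_type` and `add_message` are hypothetical, the spellings PROJNAME and PROJNO are this sketch's reading of the attribute names in messages M1 and M2, and putting parentheses only around relational entry names is an assumption drawn from the two naming examples.

```python
matrix = {}  # (row, column) -> "1", "N", "I", "P" or "P(N)"


def add_set_type(matrix, member, owner, value="N", relational=False):
    """Record a set type in partition A11 or A21 and return its X-OF-Y name.

    The member-entry type is the row, the owner-entry type is the column.
    """
    matrix[(member, owner)] = value
    label = f"({member})" if relational else member
    return f"{label}-OF-{owner}"


def add_message(matrix, entry, identifiers, properties):
    """Assign the attributes of one message type to an entry type (A12 or A22)."""
    for name in identifiers:
        matrix[(entry, name)] = "I"
    for name in properties:
        matrix[(entry, name)] = "P"


names = [
    add_set_type(matrix, "PERSON", "DEPARTMENT"),
    add_set_type(matrix, "CUST-ART", "ARTICLE", relational=True),
    add_set_type(matrix, "CUST-ART", "CUSTOMER", relational=True),
]

add_message(matrix, "PROJECT", ["PROJNAME"], ["COST"])        # message M1
add_message(matrix, "PROJECT", ["PROJNO"], ["READY-DATE"])    # message M2

# The PROJECT row now carries two identifying and two property attributes.
project_row = {col: v for (row, col), v in matrix.items() if row == "PROJECT"}
```

A sparse mapping is used here only because most cells of a data structure matrix are empty; a dense table would serve equally well.
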

REDUCTION PROCEDURE

In large data information

structures

compatible needed.

bases, a formalized

with

a data

This section

DBTG-schema

to

entries

system,

is

leading

As we

set types

above

of elements:

specifications.

in one

another

message.

The

may

type

in of

structure

with

(object type

classes)

with they

is represented

an object

entry

structure

matrix.

Note

in the

type

identify

On the

same

kind,

following

to

seven

4.2 The

with

e.g. persons,

the

same

identifying

SEQ.

NO.

object

the

object

m rows

of the data

type

is given

be identified AND

objects

a

may

be grouped

“teachers”

and

attribute

into

two

name

information

relations.

Each

structure

in

message

having

attributes

referring

describes

an object

relation

tional

messages

referring

to identical

entry

same

kind

performed

by a relational

is a matter

to identify to identical

3. List all attribute

as described

in step

if there exists different

I, is

attribute

representing

names

the lower

property

attribute

in the column

columns

followed

of

attributes

are listed

by

property

the

(upper

assigned

between objects of the same or different kinds.

(lower

entry

referred

by assigning

is at-

object

matrix.

types) entry

the

role

type). type of the

above). of

in the some

to and

to the entry

via the owners

has a P-value each

entry

being type,

both the

and an I-value

in

is substituted

by

P-value

1 or N is assigned entry

as shown

rows (entry

PERSON

entry

is set to I.

types,

The value

referred

relationship

plays

of the

types

type (member

are accessed

attribute

entry

in the elements

are assigned

of the row of the member

of the data structure

attribute

to by the identifying

(owner

entry

attributes

of the element

7. Empty

type

“N’s”

and descriptive

a set-type.

and of the

to

sets. (cf. the customer-article

diferent

object)

corresponding

attributes

the entries an

the

the department-per-

to the object

the relational

identifier

level

to the complement (cf.

in

to which

“P(N)” is entered in the eolumn of the attribute name (partition A&. Set types

No identifying

5. If

an “N”

type

or

PROJECT

Fig. 6. Conversion

entry

object).

corresponding

because

type

above).

relation

“P”

columns

entry

of the relation.

n : n relationships. A property

the object

message to choose

by entering

the

is assigned level

type

object

level object

to the relational

of the owner *Relationships

is redun-

relationships. The pro-

corresponding

son relationship

objects

attributes

to the

is established

A,, opposite

6. If an attribute in the column-heading

message the

efficiency.

is assigned

partition

content

attributes

objects.

Al2 and Am. Identifying leftmost

by a

rows of the matrix.

between

I : n (hierorchicol)

opposite

relations

reference

In a relational

relationship

of performance

object

file and described

type in the n lower of analysis

object

relation

as an object

by the identifying

R(0, . . . 0,). Rela-

has an object

entry

two or

to diferent

more identifying

to

cor-

(repeating

type is treated

appropriate

order

and

P(N)

in 4.1. Which

as described

tributes.

SOCIAL

of

object

identifying and

type

4.3.2

“students”

the

dant. The message

PROJECTthe

in the columns

multivalued

to, one of the identifying

relation

by two

of basically

types,

are represented

reference

opposite P

and

I relationships.*

I:

A set type

by

valued

message

are defined

2. Inspect

in the

identified

entry

PROJNO hand

with

partitions

types

file and described

class may

classes

referring

object

names

respectively.

a one-to-one

perty

to one of the first m columns.

object

relational

information

the

to. Each

object

different

identify

to

in the first

other

e.g.

object

to by I. I and P denote

to single

referred

structures.

in the

respect

Each

attributes,

NAME.

terms

refer

that an object

different

a single

attributes.

4.2.1 of

set types

of the data

by an object

name and is assigned

struc-

by the fol-

(two or more identifiers):

a property-

be an identifier

the necessary

identifier

schema

matrix

are entered

attribute

attributes

respond group)

and time

will

referred

property

objective

procedure

properties

identified

being

well

is described all

has

*- P” and “P(N)”

4.2.2

structure

The

message

appropriate

steps: I. Analyse

in the information

structure

procedure:

“I”. the

properties

an attribute

and establish

procedure

order.

message

to the data

4.1 The

with

the conceptual

is a semantic

reduction

the information

each

schema

lowing

are grouped requirements.

from

identifiers.

message.

such properties

recall

a message

However,

attribute

are

is one important

may

framework

ture

to a

associated

relations

information

of set types

kinds

attributes

or object

the

three

The

are

in alphabetical

4. Convert

(one identifier):

necessary

satisfy

procedure.

retain

which

management

procedure

and

to

Identification the

structures

alternative.

By the reduction into

to reduce

deals with a procedure

the same kind of objects order

data

base

attributes

procedure

types)

type

to the element and the column

in Fig. 6. are deleted.

Fig. 7. Flow chart of the reduction procedure.

A flow chart of the reduction procedure is shown in Fig. 7. This completes the first version of the schema. As stated above, the data structure matrix thus arrived at represents a feasible solution to our schema design problem. It will contain a minimum of data redundancy and thus be advantageous with regard to storage space. Further structuring may now be carried out; it may, for instance, be appropriate to separate all repeating groups from the entry types they are assigned to. This can easily be seen from the matrix. Separate entry types may be defined for these attributes and corresponding set types established. Further evaluation of the schema with regard to performance properties will not be dealt with in this paper.

CASE STUDY

Finally, we will show how the data structure matrix can be applied to a specific case concerning the ordering of auxiliary equipment by subscribers of a telephone company. The information system will support clerks with information about existing subscribers and their equipment, ordering information, etc. An information analysis has been carried out, giving as a result a specification of the information structures (messages) to be supported by the data base. For our purpose the information structure schema shown in Fig. 8 will be an adequate starting point for applying the method described in this paper.

We start by analyzing the identifier attributes in the information structure schema to identify the object types. In the schema we find that the attributes ODATE and ONUM refer to the same object type, viz. "order". For the rest of the identifiers there is a one-to-one map between object types and identifying attributes. We give each object type a unique name, which will also be used as the name of the corresponding object entry, and insert them in the first four rows and columns of the matrix.
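
This first step, grouping identifying attributes by the object type they refer to, can be sketched as follows. The Python below is only an illustration. That ODATE and ONUM both refer to "order" is taken from the text; the pairs SUBSCR/SUBSCRIBER and EQUIP/EQUIPMENT are assumptions based on names appearing elsewhere in the case study. Only three object types are shown, since the fourth is not named in this passage.

```python
from collections import defaultdict

# Which object type each identifying attribute refers to.
# ODATE and ONUM -> ORDER is stated in the text; the other pairs are assumed.
refers_to = {
    "ODATE": "ORDER",
    "ONUM": "ORDER",
    "SUBSCR": "SUBSCRIBER",   # assumed
    "EQUIP": "EQUIPMENT",     # assumed
}


def group_by_object_type(refers_to):
    """Collect the identifying attributes of each object type."""
    groups = defaultdict(list)
    for attribute, object_type in refers_to.items():
        groups[object_type].append(attribute)
    return dict(groups)


entry_defining_groups = group_by_object_type(refers_to)

# Each object type name labels both a row and a set-type column of the matrix.
row_names = sorted(entry_defining_groups)
```

Two identifiers mapping to ORDER yield a single ORDER row, which keeps the one-to-one map between object types and object entry types.
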

Fig. 8. Information structure schema (messages against attribute names; legible attribute names: ADDR, ALL-QUANT, DATUM, EQUIP, IDATE, INDEX, ONUM, QUANT, SUBSCR, SIGN, UPRICE).

Schema

Object

relations

this case study The attributes

four

attributes.

procedure

in the next

rows,

in

in two groups

identifying

matrix

list them

to step

in the

structure We

now

matrix

obtained

the lay-out

as shown

in Fig. 9.

now proceed

EQUIP-S((EQUIP. object

relationship,

relational each

entry

of OTYPE Furthermore. and

one

from

and

columns

we establish from

relation attribute

the same

than one type

We locate

EQUIP

the via-access

EQUIPMENT.

entry

Two

N’s

The above

matrix

For

in the ORDER-entry

type

of

structure

schema

such a

assigned

to the

I’s

message.

the

However. messages.

(EQUIP/EQUIP)

is

step 4 and we can proceed of ODATE

and ONUM

type to I in both elements.

According

to step 6 we look for I’s and P’s in an attribute

column

but in different

plays

descriptive

role

entries.

We

in the

identifier

type

the P in the ORDER-entry SCRIBER.

of the data structure

note that

ORDER)

element

matrix

(step 3 completed)

is inserted

in the matrix

SUBSCR

ORDER-entry

in the SUBSCRIBER-entry

ORDER

EQUIP-S

The

objects

called

the values

DATUM

Fig. 10. The message

we

note may

EQUIP-COMBI.

or more

complete

EQUIPMENT

Fig. 9. Lay-out

messages

between

in the same

path are

nine

like any other relational

entry

to step 5. We change

in the relational

the object

by two

is treated

relational

having

structure

defined.

of

a P

other

of message

EQUIP

the

we have

We thus insert

name

After data

in Fig. I I. A special

In the information

the message A

the

a relation

is defined

A:,. the

in Fig. IO.

shown

describes

kind.

is The

in the matrix.

and QUANT.

a set type

of more

orders.

ONUM

and QUANT

by establishing ORDER

while

in many

of both

in the OTYPE type.

QUANT)).

type (EQUIP/ORDER)

occurrence

occurrences entry

and quantity

message

message

inserted

at the matrix

one

is a many-to-many

may consist

may be found

messages

The first message

order)

i.e. an order

data

same

(OTYPE.

(equipment,

of equipment

equipment

matrix.

ONUM).

relation

of a specific

to step 4 and transfer

by one to the data structure

type

having

be given to the transfer have

in partition

particular

looks as shown

arrive

attribute-par-

this

After

3 in the

R9

matrix

to these columns

converted

titions. We

assigned

references.

and according we

using a data structure

relation

object

are now sorted

and property reduction

are then inserted

we have

design

type type.

type

by a

in partition

and

a

is an

We substitute

I in the (SUBA,,.

Fig. 11. All messages inserted (step 4 completed).

Fig. 12. Data structure matrix of the case study after the reduction procedure has been completed.

As we already have established a 1:n relationship between "subscriber" and "order", the reverse relationship does not, however, require a separate set type to be defined. It indicates the requirement of access to the owner-entry from each member-entry in the set.

Step 7 tells us to delete empty entry types. In our case the relational entry type (ORDER/SUBSCR) is an empty entry because it is a hierarchical relationship.

We also note that the ORDER-entry and SUBSCRIBER-entry types have many common attributes, such as ADDR, PRIO and PHONE. This redundancy problem has three solutions:
(i) store the attribute values in both entries
(ii) assign the attributes to either the SUBSCRIBER- or the ORDER-entry type and establish the necessary access-paths by defining set types
(iii) separate these attributes into one or more (sub-)entry types and establish the necessary set types between these entry types and the ORDER- or SUBSCRIBER-entry type.

Which solution to choose will require further analysis of retrieval and update frequencies. In this case we will leave this problem at this stage. We have now reached a feasible DBTG-schema as shown in Fig. 12.

Fig. 13. Illustration of one row of partition $A_{21}$ of the matrix.

One notation widely used to model entities and sets is the data structure diagram. The data structure diagram was first introduced by Bachman [4]. It can be shown that this diagram can easily be drawn from the data structure matrix. This graphic notation uses two fundamental concepts: a rectangle and an arrow. A rectangle enclosing a name denotes an entry type. The second concept, the arrow, represents a set type. The entry type located at the tail of the arrow is the owner entry type, and the entry type located at the head is the member
entry type. The information found in the data structure diagram is the information found in partitions $A_{11}$ and $A_{21}$ in the data structure matrix. Take the relational entry type (EQUIP/ORDER). It is related to the two entry types EQUIPMENT and ORDER by the two set types (EQUIP/ORDER)-OF-EQUIPMENT and (EQUIP/ORDER)-OF-ORDER. The data structure diagram of this relationship is shown in Fig. 13.

We have also introduced "double-bottom" rectangles to denote entry types with indirect accesses. The complete data structure diagram of the case study is shown in Fig. 14.

Fig. 14. Data structure diagram of the case study.

The schema design procedure has been based on the conceptual framework defined by DBTG. The DBTG specifications also include several languages to be used to describe and manipulate data. The data structure matrix can easily be described in a Schema Data Description Language (DDL). Thus it can be implemented in any DBTG-based data base management system. An implementation of the above data structure matrix has been done in DMS 1100 on the Univac 1110 computer.
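
The correspondence between the set-type partitions of the matrix and the arrows of a data structure diagram can be sketched in a few lines of Python. The function name `diagram_arrows` is hypothetical, and the cell values below (two N's and a P for QUANT in the (EQUIP/ORDER) row) are illustrative assumptions rather than a transcription of Fig. 12.

```python
def diagram_arrows(matrix, object_types):
    """Derive data structure diagram arrows (owner -> member) from A11 and A21.

    `matrix` maps (row, column) to a cell value. Only cells whose column is an
    object type (a set-type column) and whose value is "1" or "N" become arrows.
    """
    arrows = []
    for (member, owner), value in matrix.items():
        if owner in object_types and value in ("1", "N"):
            relational = member not in object_types
            label = f"({member})" if relational else member
            arrows.append((owner, member, f"{label}-OF-{owner}"))
    return sorted(arrows)


object_types = {"EQUIPMENT", "ORDER"}
matrix = {
    ("EQUIP/ORDER", "EQUIPMENT"): "N",
    ("EQUIP/ORDER", "ORDER"): "N",
    ("EQUIP/ORDER", "QUANT"): "P",   # attribute cell, not part of the diagram
}

# Each tuple is (tail = owner entry type, head = member entry type, set type name).
arrows = diagram_arrows(matrix, object_types)
```

Attribute cells are ignored because the diagram carries only the structural information of the set-type partitions, while the matrix also carries the entry descriptions.
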

REFERENCES

[1] Codasyl: Feature analysis of generalized data base management systems. Codasyl Systems Committee Technical Report (May 1971).
[2] B. Langefors: Information systems. In Information Processing 74, pp. 937-945. North-Holland, Amsterdam (1974).
[3] M. E. Senko: DIAM II. Semantic Binaries and ANSI SPARC Database Technology, Online London (1976).
[4] C. W. Bachman: Data Structure Diagrams, Data Base. ACM (Summer 1969).