Inform. Systems Vol. 3, pp. 81-91. © Pergamon Press Ltd. 1978. Printed in Great Britain
SCHEMA DESIGN USING A DATA STRUCTURE MATRIX
LEIF B. METHLIE
Norwegian School of Economics and Business Administration, N-5000 Bergen, Norway
(Received 25 March 1977; revised 20 June 1977)
Abstract-A formal design procedure for a data base schema, based on the conceptual framework of the Codasyl DBTG, is presented. A data structure matrix is used to identify and interrelate entry types of the data base. The design procedure rests upon a description of the set of information messages conceptually communicated with the data base. The messages are determined by an information analysis. The derived schema is a feasible DBTG-schema, i.e. the information requirements are retained in the entry and set types defined. However, the schema is not evaluated with regard to efficiency. The data structure matrix is compatible with the well known data structure diagram. However, the matrix includes the structural elements as well as entry descriptions. Finally, the design procedure is demonstrated on a case study.

SCHEMA DESIGN TASKS

In this paper a procedure leading towards a physical realization of a data base managed by a data base management system (DBMS) is outlined. A DBMS is a generalized tool for data structure definition and data manipulation. It makes an integrated collection of data available to a wide variety of users. It allows centralized control of the data, which is necessary for efficient data administration. DBMS technology in some form or another can be traced back 15-20 years. However, the report of the CODASYL Data Base Task Group (DBTG)[1] was a landmark in the development of data base technology. The DBTG specifications include several languages to be used to define and manipulate data. The design procedure described in this paper makes it possible to define a schema (a "schematic" diagram) of a data base utilizing the conceptual framework of DBTG. The schema can then be described in the DBTG data definition language (DDL).

The design of a data base managed by a generalized DBMS can be subdivided into the following tasks:
(i) schema design
(ii) allocation to physical storage
(iii) design of fall back and recovery procedures
(iv) design of restructuring and reorganization procedures.

Schema design is based on knowledge of the information to be supported by the data base. The information requirements or messages (cf. conceptual framework in next section) may be deduced by a formal procedure, an information analysis, or merely stated. The messages carry information to the information user directly or to a process deducing new information. Occurrences of the messages are entities which are conceptually (not necessarily physically) communicated with the data base.

One problem in the design of data bases today is that there seems to be no formal procedure available to deduce data structures from specifications of information requirements. In smaller systems this may not be a significant problem. In larger systems, however, the problem of identifying appropriate entry types and entry relations may be great. One possible solution of the schema design is to implement defined message types as entry types by a one-to-one mapping. This will lead to a simple file structure with no inter file relations. However, the data redundancy will be high, leading to great restructuring problems and complicated updating procedures. The schema design procedure presented in this paper is an attempt to formalize the mapping of information requirement specifications (messages) into a set of data structures compatible with a DBTG-schema.

As can be seen from Fig. 1, the schema design process can be subdivided into two tasks: design of a DBTG-schema and evaluation of this schema, with possible iterations between these two tasks. The basic input to the design process is the data representation of the information messages, transaction input/output data such as transaction frequencies, etc., and system objectives specifying, among other things, response time requirements.

Fig. 1. Outline of the schema design process.

The first task in the schema design process is the design of a feasible schema, which is a schema satisfying the necessary conditions:
(i) the information requirements determined by the information analysis
(ii) compatibility with DBTG-schema.

To obtain a solution satisfying the performance requirements, certain performance measures such as retrieval efficiency, updating and insertion efficiency, data base creation time, storage space, etc. must be taken into consideration. Performance evaluations may lead to changes in the basic schema structure. However, these changes must not violate the necessary conditions stated above.

Design of a DBTG-schema may be further subdivided into:
(i) definition of entry types
(ii) definition of set types
(iii) intra entry structuring (entry lay-out)
(iv) inter entry type structuring (file structuring)
(v) inter entry structuring (ordering of entry occurrences).

Inter entry structuring is concerned with the way of placing and retrieving entry occurrences. This task is closely related to allocation of data structures on physical storage media and will not be dealt with in this paper.

CONCEPTUAL FRAMEWORK
A collection of data, structured in one way or another in a data base, has no value of its own. Only by interpretation can data symbols give information to support decisions and actions. Interpretation implies assigning meaning to data so that entities outside the message itself, real world objects, can be recognized. Interpretation is a complementary process to representation, where terms referring to real world phenomena are represented by data. Representation and interpretation are parts of the communication process.

We will use the terms object, property, object relation and time when we discuss the real world realm. An object is a unique part of the real world which we are interested in. It may be concrete or abstract, like person, enterprise, event etc. A property is a characteristic of an object. A property plays a role in describing an entity, for example, identifying it or characterizing it. An object relation is a special feature of objects. Very often we are interested in properties of a relation of objects more than of a single object. For example, in a buying transaction, quantity ordered is not a property of a customer nor of the articles ordered, but of the relation between customer and article. Time is a fundamental entity in the real world. In Fig. 2 concepts of the real world are illustrated.

Fig. 2. Illustration of real world concepts.

Information about real world objects is communicated between different persons in an organization by means of messages. A message is a theoretical concept relating signs to real world entities through an encoding and decoding process. Can a message carry information? This will obviously depend on the reference frames of sender and receiver of the message. In the following, an operational definition of a message is given that is restricted to semantic aspects. A message is considered to consist of a set of reference terms, called attributes, which by their names can be associated with entities outside the message itself. The semantics of a message is concerned with labeling of these entities and with the structure of the message. The message concept follows from work by Langefors[2] and corresponds, to some extent, to a fact representation by Senko[3]. In mathematical terms the form of a message may be regarded as an n-tuple of a relation. The general structure of a message is

$(I, P, T)$

where $I$ is an identifier vector denoting attributes which uniquely identify an object ($I$ is a scalar) or an object relation, $P$ is a property vector characterizing the object or the object relation given by $I$, and $T$ is a time vector. If the time vector is omitted, it means "for all times".

It is convenient in systems work to separate the message description from its value contents, i.e. to split the attributes into attribute names and attribute values. A set of messages is said to belong to the same kind if the distinct messages in the set differ in values only for identifier and property attributes, and time if this is included. The attribute names of a message define a message type. The collection of message types to be supported by the data base will be called an information structure schema of the system. An example of an information structure schema is shown in Fig. 8.

From knowledge of the information structure schema we may now be able to design the data structures. The basic entity describing data structures is the entry, which is an ordered set of data items, groups or group relations[1]. In the real world realm we distinguished between an object and an object relation. We want to retain this distinction through the mapping procedure. Thus we define two kinds of entry types, the object entry type and the relational entry type. In the following entry definitions we shall omit the time dimension introduced in the message above, assuming that an entry is valid as long as it exists, or, if there exist two or more entries representing a phenomenon at different times, that they will be divided into generations of entries. An object entry type can be written as:

$((ED)(G_1 \ldots G_n))$
ED is the entry defining group consisting of one or more elements, identifiers, all of which refer to the same object type and each of which is sufficient to uniquely identify the object. Each identifier may be represented by data items and/or groups. For example, both order date, with the attribute name ODATE, and order number, named ONUM, refer to the object type named ORDER. ONUM is an item, while ODATE is a group consisting of the items YEAR, MONTH and DAY. In other words, we may use different attributes to identify the same object. Thus, the identifying attributes must be analysed with respect to which objects they refer to. In the basic schema alternative we want a one-to-one map between an object type and an entry type. The second entry type to be defined is the relational entry type:

$((ED_1 \ldots ED_n)(G_1 \ldots G_m))$.
The relational entry type describes a set of entries representing object relations and their associated properties. $(ED_1 \ldots ED_n)$ is an entry defining vector. Each element $ED_i$ in the object relation has the same properties as the entry defining group of the object entry type described above. Note that for the entry defining vector to be complete, we require at least one identifying attribute from each element, i.e. each associated object in the relation must be represented. The existence of an entry type is declared by naming it.

Intra entry structuring is concerned with assigning attributes to each entry type and the ordering of these attributes within the entry. In the basic schema definition we are concerned with which attributes to assign to which entry types. In subsequent analysis the ordering may be changed due to performance evaluations.

It is appropriate to distinguish between inter entry type structuring, or file structuring, and inter entry structuring. Inter entry type structuring is concerned with connecting different entry types. Such a connection we call a set type. A set type consists of one owner-entry type and one or more member-entry types. Just as the distinction was drawn between entry type and entry (occurrence), so a distinction is made between set type and set occurrence. The existence of a set type is declared by naming it and stating its owner-entry type and its member-entry type or types. A set occurrence is an occurrence of its owner-entry type together with zero or more occurrences of each member-entry type. In every set occurrence, the following relations exist:
(i) given an owner entry, it is possible to access the related member-entry(-ies) of that set occurrence
(ii) given a member-entry, it is possible to access the related owner-entry of that set occurrence
(iii) given a member-entry, it is possible to access other member-entries of the same set occurrence.

Furthermore, a given member-entry cannot simultaneously belong to more than one set occurrence of the same set type. In other words, an entry cannot be a member of two or more owner-entries of the same set type.

The entry and set concepts can be used to define different types of data structures. A hierarchy is a common data structure. In this structure, one entry owns 0 to n occurrences of another entry type, each of those occurrences in turn owns 0 to n occurrences of a third entry type, etc. No entry owns any entry that owns it, either directly or through other entries. We call this a 1-to-n relationship. An example of a hierarchical data structure is a representation of the relationship between departments and employees, where one department has many persons associated with it, but no person is associated with more than one department.

A second common data structure is the network. In this structure we have a many-to-many relationship: that is, an owner entry type can be related to 0 to n occurrences of another entry type, each of which is an owner entry of 0 to n occurrences of the first entry type. We have a 1-to-many relationship in both directions. An example is a customer who may buy one or more articles, while the articles may be sold to one or more customers. In this case we define a relational entry type, to which attributes associated with the relation between the two objects represented in the two entry types are assigned: for example, quantity of a specific article bought by a specific customer.
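The three access relations and the membership restriction described above can be illustrated with a small sketch. This is not part of the DBTG specification or of the paper's procedure; the class `SetType`, its method names and the sample keys (`SALES`, `SMITH`, `C1`, `A1`, ...) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SetType:
    """A named connection between one owner entry type and its member entry types."""
    name: str
    owner_type: str
    member_types: tuple
    # Maps each member entry key to the owner entry key of its set occurrence.
    owner_of: dict = field(default_factory=dict)

    def connect(self, owner_key, member_key):
        """Place a member entry in the set occurrence owned by owner_key."""
        current = self.owner_of.get(member_key)
        if current is not None and current != owner_key:
            # A member entry may belong to only one set occurrence per set type.
            raise ValueError(f"{member_key!r} already belongs to {current!r}")
        self.owner_of[member_key] = owner_key

    def members(self, owner_key):
        """Relation (i): from an owner entry to its member entries."""
        return [m for m, o in self.owner_of.items() if o == owner_key]

    def owner(self, member_key):
        """Relation (ii): from a member entry to its owner entry."""
        return self.owner_of[member_key]

    def siblings(self, member_key):
        """Relation (iii): other member entries of the same set occurrence."""
        return [m for m in self.members(self.owner(member_key)) if m != member_key]


# Hierarchy (1-to-n): one department owns many persons.
person_of_department = SetType("PERSON-OF-DEPARTMENT", "DEPARTMENT", ("PERSON",))
person_of_department.connect("SALES", "SMITH")
person_of_department.connect("SALES", "JONES")

# Network (n-to-n): a relational entry type owned by both customer and article.
of_customer = SetType("(CUST-ART)-OF-CUSTOMER", "CUSTOMER", ("CUST-ART",))
of_article = SetType("(CUST-ART)-OF-ARTICLE", "ARTICLE", ("CUST-ART",))
quantity = {}  # property of the relation, keyed by (customer, article)
for cust, art, qty in [("C1", "A1", 5), ("C1", "A2", 2), ("C2", "A1", 7)]:
    quantity[(cust, art)] = qty
    of_customer.connect(cust, (cust, art))
    of_article.connect(art, (cust, art))
```

In this sketch the many-to-many relationship is never stored directly: each relational entry is a member of exactly one set occurrence in each of the two set types, which keeps the single-owner restriction intact.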
Two set types are established to relate the relational entry type to its two object entry types. In general, if the order of a relation is n, we will have n different set types. The file concept will not be used here. A file is a collection of entry occurrences which can be described by a common entry type and uses a common access algorithm to select complete entry occurrences.

DATA STRUCTURE MATRIX
From the conceptual framework defined above, we can summarize the concepts needed to define the data structures as:
(i) two kinds of entry types: the object entry types and the relational entry types
(ii) set types
(iii) two kinds of attributes: identifying attributes denoted by I and property attributes denoted by P
(iv) attributes which may be either items or groups. A group may be a repeating group, denoted by P(N).

The objective of our schema design task is to describe the data structures in these terms. Furthermore, we should be able to deduce the data structures from the defined information structures by an analytic approach called a reduction procedure. We propose to use a data structure matrix to identify and inter-relate entry types of a data base. The data structure matrix fulfills four objectives:
(i) The matrix is the basis for the reduction procedure converting the information structures into data structures.
(ii) It is a suitable documentation form of the data base schema.
(iii) The matrix form is suitable for further analysis and evaluation.
(iv) It is a DBTG-schema representation.

The data structure matrix is divided into four partitions as shown in Fig. 3. The data structure matrix is a (m + n) × (m + k + r) matrix, where m is the number of object types, n is the number of identified object relations, k is the number of identifying attributes and r is the number of property attributes. Each row of the matrix represents an entry type. The first m rows represent object entry types, and the last n rows represent relational entry types. The name of each entry type is stated in each row ($O_1 \ldots O_m$, $R_1(O_i \ldots O_j) \ldots R_n(O_i \ldots O_j)$).

Fig. 3. Data structure matrix. (Rows: object entry types and relational entry types. Columns: object types as owners, holding the set types in partitions A11 and A21, followed by the attributes in partitions A12 and A22.)

The name of the entry type will be a description of the object class represented by the entry type. Note that this name may be different from the name(s) of the entry defining group(s) and that we may have more than one identifying attribute referring to the same object. Example: PERSONNAME and CIVILNO are two identifying attributes of the same object type "person". They may be used in different messages to give information in different contexts. The name of the object entry type may be PERSON.

A relational entry type is given a name of its own. Example: Quantity (QUANT) bought of a certain article (ART) by a specific customer (CUST) can be denoted by ((CUST, ART) QUANT). The name of the relation may be CUST-ART, and it is stated in the matrix in the lower n rows ($R(O_1 \ldots O_n)$).

Set types are stated in partitions A11 and A21. In partition A11 we state a set type between two object entry types, and in partition A21 we state the set types between a relational entry type and its owner-entry types. Relational entries are accessed via owner-entries. We use the names of the object classes to identify the m columns of partitions A11 and A21. An element $(e_{ij})$ of partition A11 signifies a relation between two object entry types denoted by $O_i$ and $O_j$. An element $(e_{ij})$ of partition A21 signifies a relation between a relational entry type $R_i(O_j \ldots O_k)$ and the owner-entry types defined by the object classes $O_j \ldots O_k$. The value of $(e)$ can be 1 or N, indicating a one-to-one or a one-to-many relation. The name of the set type is of the form X-OF-Y, where X is the name of the member-entry type (row) and Y is the name of the owner-entry type (column).

Example: PERSON and DEPARTMENT are names of two object entry types. However, information about persons belonging to a specific department is wanted from the data base. This information can be represented by a set with DEPARTMENT as owner-entry type and PERSON as member-entry type. The relationship is one-to-many and an "N" is assigned in the appropriate element of partition A11 as shown below. The set type is given the name PERSON-OF-DEPARTMENT.

                 PERSON    DEPARTMENT
    PERSON                     N
    DEPARTMENT

Quantity is a property of a buying transaction identified by the object relation (customer, article). We identify a relational entry type which we give the name CUST-ART, and if we want to access this file via both the CUSTOMER-file and the ARTICLE-file (we have a many-to-many relationship), we establish the two set types:

(CUST-ART)-OF-ARTICLE
(CUST-ART)-OF-CUSTOMER

and we put N's in the appropriate elements of the matrix as shown in Fig. 4. If we want to relate quantity not only to customer and article but, for instance, to the salesman (person) as well, we may define an object relation of order three and establish three set types. Three N's are inserted in the columns corresponding to the object types of the relation.

Fig. 4. Illustration of set types.

Attributes are assigned to entry types in partitions A12 and A22. We distinguish between identifying attributes and property attributes and we denote them I and P respectively. The property attribute may be a repeating group, in which case the element is denoted P(N). Each attribute may be either an item or a group. One column is used for each attribute. Decompositions of groups are not specified in the matrix.

Example: Two messages informing about a project use different identifying attributes.

M1: (PROJ.NAME, COST)
M2: (PROJ.NO, READY-DATE)

Note that we do not decompose DATE into YEAR, MONTH, DAY with one column for each. The attribute part of the matrix is shown in Fig. 5. A combination of an I and a P element within an entry type represents a binary association (see Ref. [3]) or an elementary message type (see Ref. [2]).

Fig. 5. Assignment of attributes.
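As a rough illustration of the matrix layout described in this section, the sketch below stores the four partitions in one dictionary keyed by (row, column). The class `DataStructureMatrix` and its methods are hypothetical names introduced here; the paper does not prescribe any implementation. For brevity only the PROJECT identifiers are listed as identifying attributes, so the column count is smaller than a full matrix for these object types would be.

```python
class DataStructureMatrix:
    """(m + n) x (m + k + r) matrix: rows are entry types, columns are owners then attributes."""

    def __init__(self, object_types, relations, identifiers, properties):
        self.object_types = list(object_types)   # m object entry types
        self.relations = dict(relations)         # n relational entry types -> related objects
        self.rows = self.object_types + list(self.relations)
        self.attributes = list(identifiers) + list(properties)  # k + r attribute columns
        self.columns = self.object_types + self.attributes
        self.cells = {}                          # (row, column) -> "1", "N", "I", "P" or "P(N)"

    @property
    def shape(self):
        return (len(self.rows), len(self.columns))

    def assign(self, entry_type, attribute, value):
        """Partitions A12 and A22: assign an attribute to an entry type."""
        if entry_type not in self.rows or attribute not in self.attributes:
            raise KeyError((entry_type, attribute))
        if value not in ("I", "P", "P(N)"):
            raise ValueError(value)
        self.cells[(entry_type, attribute)] = value

    def set_type(self, member, owner, value="N"):
        """Partitions A11 and A21: relate a member row to an owner column, return the X-OF-Y name."""
        if member not in self.rows or owner not in self.object_types:
            raise KeyError((member, owner))
        if value not in ("1", "N"):
            raise ValueError(value)
        self.cells[(member, owner)] = value
        label = f"({member})" if member in self.relations else member
        return f"{label}-OF-{owner}"


matrix = DataStructureMatrix(
    object_types=["PERSON", "DEPARTMENT", "CUSTOMER", "ARTICLE", "PROJECT"],
    relations={"CUST-ART": ("CUSTOMER", "ARTICLE")},
    identifiers=["PROJ.NAME", "PROJ.NO"],
    properties=["COST", "QUANT", "READY-DATE"],
)
names = [matrix.set_type("PERSON", "DEPARTMENT")]
names += [matrix.set_type("CUST-ART", owner) for owner in matrix.relations["CUST-ART"]]
for attribute, value in [("PROJ.NAME", "I"), ("PROJ.NO", "I"), ("COST", "P"), ("READY-DATE", "P")]:
    matrix.assign("PROJECT", attribute, value)
matrix.assign("CUST-ART", "QUANT", "P")
```

Here m = 5, n = 1, k = 2 and r = 3, so `matrix.shape` is (6, 10), and `names` holds the three set type names formed by the X-OF-Y convention.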
REDUCTION PROCEDURE

In large data bases, a formalized procedure leading to data structures compatible with a data base management system is needed. This section deals with a procedure to reduce the information structures to DBTG-schema specifications. By the reduction procedure the attributes are grouped into entries and set types which retain the information requirements. As we recall from the conceptual framework above, a message is described by three kinds of elements: identifiers, properties and time. However, an attribute which is a property in one message may be an identifier in another message. Identification of such properties is one important objective of the reduction procedure. The procedure is described by the following seven steps:

1. Analyse each message in the information structure schema with respect to the object types (object classes) its identifying attributes refer to. Each object type is given a name and is assigned to one of the first m rows and one of the first m columns of the data structure matrix, where it is represented by an object entry type. Note that an object type may be identified by different attributes, e.g. persons may be identified by NAME and SOCIAL SEQ. NO. On the other hand, objects of the same kind may be grouped into different object classes, e.g. "teachers" and "students".

2. Inspect the information structure schema to identify object relations. Each message having two or more identifying attributes referring to different objects describes an object relation. Each object relation is represented by a relational entry type in the n lower rows of the matrix, $R(O_1 \ldots O_n)$.

3. List all attribute names in the column-heading of partitions A12 and A22. Identifying attributes are listed in the leftmost columns (k columns), followed by property attributes in alphabetical order.

4. Convert each message to the data structure matrix by the following procedure:
4.1 The message has a single identifying attribute (one identifier): "I" is entered opposite the object entry type in the column of the identifying attribute, and "P" or "P(N)" is entered in the columns of the single valued and multivalued (repeating group) property attributes respectively.
4.2 The message has two or more identifying attributes (two or more identifiers):
4.2.1 The identifiers refer to a single object: one of the identifying attributes is redundant and the message is treated as described in 4.1. Which identifier to choose is a matter of analysis of performance efficiency.
4.2.2 The identifiers refer to different objects in a 1:1 or 1:n (hierarchical) relationship* (cf. the department-person relationship above): I, P and P(N) are entered and a set type is established.
4.2.3 The identifiers refer to different objects in an n:n relationship (cf. the customer-article relationship above): P or P(N) is assigned to the relational entry type, and set types are established by entering "N's" in partition A21 opposite the relational entry type, in the columns of the owner object types. No identifying attributes are assigned to the relational entry type because the entries are accessed via the owners.

5. If an attribute has both a P-value and an I-value in the same entry type, the value of the element is set to I.

6. If an attribute has a P-value in one entry type and an I-value in another, it plays both a descriptive and an identifying role. A set type is established: the P-value is substituted by 1 or N in the element defined by the row of the member entry type and the column of the owner entry type (the object referred to by the identifying attribute), as shown in Fig. 6.

7. Empty rows (entry types) are deleted.

*Relationships between objects of the same or different kinds.

Fig. 6. Conversion
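The flow of the seven steps can be approximated by the sketch below. It is a simplified reading, not the paper's algorithm: it assumes each identifying attribute is already mapped to its object type (steps 1 and 2), ignores repeating groups and the ordering of columns in step 3, and treats every message with identifiers from different objects as a relation. The function `reduce_messages`, the sample messages and the `/`-joined relation names are hypothetical; only the attribute names are borrowed from the case study.

```python
def reduce_messages(messages, object_of):
    """messages: list of (identifiers, properties); object_of: identifier -> object type."""
    cells, set_types = {}, {}
    object_rows, relation_rows = [], []

    # Steps 1, 2 and 4: find entry types and enter I and P values.
    for identifiers, properties in messages:
        objects = sorted({object_of[i] for i in identifiers})
        for obj in objects:
            if obj not in object_rows:
                object_rows.append(obj)
        for ident in identifiers:
            cells[(object_of[ident], ident)] = "I"   # identifiers stay with their object
        if len(objects) == 1:
            entry = objects[0]
        else:
            entry = "/".join(objects)                # relational entry type
            if entry not in relation_rows:
                relation_rows.append(entry)
            for obj in objects:
                set_types[(entry, obj)] = "N"        # member row, owner column
        for prop in properties:
            cells.setdefault((entry, prop), "P")

    # Steps 5 and 6: a property that is also an identifier somewhere.
    for (entry, attr), value in list(cells.items()):
        if value != "P" or attr not in object_of:
            continue
        owner = object_of[attr]
        if owner == entry:
            cells[(entry, attr)] = "I"               # step 5: same entry type
        else:
            del cells[(entry, attr)]                 # step 6: replace P by a set type
            set_types[(entry, owner)] = "N"

    # Step 7: delete entry types with no attributes, and set types that refer to them.
    used = {entry for entry, _ in cells}
    rows = [r for r in object_rows + relation_rows if r in used]
    set_types = {k: v for k, v in set_types.items() if k[0] in used and k[1] in used}
    return rows, cells, set_types


OBJECT_OF = {"ONUM": "ORDER", "ODATE": "ORDER", "SUBSCR": "SUBSCRIBER", "EQUIP": "EQUIPMENT"}
MESSAGES = [
    (("ONUM",), ("ODATE", "SUBSCR")),
    (("SUBSCR",), ("ADDR",)),
    (("EQUIP", "ONUM"), ("QUANT",)),
    (("EQUIP",), ("UPRICE",)),
    (("ONUM", "SUBSCR"), ()),
]
rows, cells, set_types = reduce_messages(MESSAGES, OBJECT_OF)
```

With these sample messages the relation between order and subscriber carries no attributes of its own, so its row is removed in the last step while the set type between the ORDER and SUBSCRIBER rows, created from the SUBSCR property, remains.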
Fig. 7. Flow chart of the reduction procedure.
A flow chart of the reduction procedure is shown in Fig. 7. This completes the first version of the schema. As stated above, the data structure matrix thus arrived at represents a feasible solution to our schema design problem. It will contain a minimum of data redundancy and thus be advantageous with regard to storage space. Further structuring may now be carried out; for instance, it may be appropriate to separate all repeating groups from the entry types they are assigned to. This can easily be seen from the matrix. Separate entry types may be defined for these attributes and corresponding set types established. Further evaluation of the schema with regard to performance properties will not be dealt with in this paper.

CASE STUDY
Finally, we will show how the data structure matrix can be applied to a specific case concerning ordering of auxiliary equipment by subscribers of a telephone company. The information system will support clerks with information about existing subscribers and their equipment, ordering information etc. An information analysis has been carried out, giving as a result a specification of the information structures (messages) to be supported by the data base. For our purpose the information structure schema shown in Fig. 8 will be an adequate starting point to apply the method described in this paper.

We start by analyzing the identifier attributes in the information structure schema to identify the object types. In the schema we find that the attributes ODATE and ONUM refer to the same object type, viz. "order". For the rest of the identifiers there is a one-to-one map between object types and identifying attributes. We give each object type a unique name, which will also be used as the name of the corresponding object entry, and insert them in the first four rows and columns of the matrix.
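The grouping of identifiers by object type can be sketched as follows. The mapping below is partial and assumed: only ODATE and ONUM referring to "order" is stated in the text, while the SUBSCR and EQUIP assignments are inferred from the entry type names used later in the case study. The function name `group_identifiers` is hypothetical.

```python
from collections import defaultdict


def group_identifiers(identifier_to_object):
    """Collect the identifying attributes that refer to each object type."""
    grouped = defaultdict(list)
    for identifier, object_type in identifier_to_object.items():
        grouped[object_type].append(identifier)
    return dict(grouped)


# Assumed, partial mapping; the fourth object type of the case is not shown.
case_identifiers = {
    "ODATE": "ORDER",
    "ONUM": "ORDER",
    "SUBSCR": "SUBSCRIBER",
    "EQUIP": "EQUIPMENT",
}
entry_defining_groups = group_identifiers(case_identifiers)
object_entry_types = list(entry_defining_groups)  # names of rows and owner columns
```

Each key of `entry_defining_groups` names one object entry type, and its list holds the identifiers of the entry defining group, so ORDER is the only type here with two alternative identifiers.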
Fig. 8. Information structure schema. (Table of messages against attribute names; the legible attribute names include ADDR, ALL-QUANT, DATUM, EQUIP, IDATE, INDEX, ONUM, QUANT, SUBSCR, SIGN and UPRICE.)
Object relations are identified in the same way and inserted in the next rows of the matrix. According to step 3 in the reduction procedure, the attributes are now sorted in two groups, identifying and property attributes, and we list them in the columns of the attribute-partitions. We have now obtained the lay-out of the data structure matrix (step 3 completed) as shown in Fig. 9.

Fig. 9. Lay-out of the data structure matrix.

We now proceed to step 4 and transfer the messages one by one to the data structure matrix. The first message is inserted in the matrix as shown in Fig. 10.

Fig. 10. The message inserted in the matrix.

Particular attention may be given to the transfer of messages having two or more identifying attributes. One such message describes a relation between the objects (equipment, order). This is a many-to-many relationship, i.e. an order may consist of more than one type of equipment, while the same type of equipment may be found in many orders. A relational entry type (EQUIP/ORDER) is defined. The via-access path is defined by establishing one set type from EQUIPMENT and one from ORDER: two N's are assigned to these columns in partition A21. Furthermore, there is a relation between objects of the same kind, called EQUIP-COMBI. The relational entry type (EQUIP/EQUIP) is treated like any other relational entry type. After all messages have been converted we arrive at the matrix shown in Fig. 11.

Fig. 11. All messages inserted (step 4 completed).

This completes step 4 and we can proceed to step 5. We change the values of ODATE and ONUM in the ORDER-entry type to I in both elements. According to step 6 we look for I's and P's in the same attribute column but in different entries. We note that SUBSCR plays a descriptive role in the ORDER-entry type and an identifier role in the SUBSCRIBER-entry type. We substitute the P in the ORDER-entry type by a set type value in the (SUBSCRIBER, ORDER) element of partition A11.
As we have already established a 1:n relationship between "subscriber" and "order", the reverse relationship does not require a separate set type to be defined. Step 7 tells us to delete empty entry types. In our case the relational entry type (ORDER/SUBSCR) is an empty entry because it is a hierarchical relationship.

We also note that the ORDER-entry and SUBSCRIBER-entry types have many common attributes, such as ADDR, PRIO and PHONE. This redundancy problem has three solutions:
(i) store the attribute values in both entries
(ii) assign the attributes to either the SUBSCRIBER- or the ORDER-entry type and establish necessary access-paths by defining set types
(iii) separate these attributes into one or more (sub-)entry types and establish necessary set types between these entry types and the ORDER- or SUBSCRIBER-entry types.

Which solution to choose will require further analysis of retrieval and update frequencies. In this case we will leave this problem at this stage. We have now reached a feasible DBTG-schema as shown in Fig. 12.

Fig. 12. Data structure matrix of the case study after the reduction procedure has been completed.

One notation widely used to model entities and sets is the data structure diagram. The data structure diagram was first introduced by Bachman[4]. This graphic notation uses two fundamental concepts: a rectangle and an arrow. A rectangle enclosing a name denotes an entry type. The second concept, the arrow, represents a set type. The entry type located at the tail of the arrow is the owner entry type, and the entry type located at the head is the member entry type. It indicates a path of access to the owner-entry from each member-entry in the set. It can be shown that this diagram can easily be drawn from the data structure matrix. The information found in the data structure diagram is the information found in partitions A11 and A21 in the data structure matrix. Take the relational entry type (EQUIP/ORDER). It is related to the two entry types EQUIPMENT and ORDER by the two set types (EQUIP/ORDER)-OF-EQUIPMENT and (EQUIP/ORDER)-OF-ORDER. The data structure diagram of this relationship is shown in Fig. 13.

Fig. 13. Illustration of one row of partition A21 of the matrix.

We have also introduced "double-bottom" rectangles to denote entry types with indirect accesses. The complete data structure diagram of the case study is shown in Fig. 14.

Fig. 14. Data structure diagram of the case study.

The schema design procedure has been based on the conceptual framework defined by DBTG. The DBTG specifications also include several languages to be used to describe and manipulate data. The data structure matrix can easily be described in a Schema Data Description Language (DDL). Thus it can be implemented in any DBTG-based data base management system. An implementation of the above data structure matrix has been done in DMS 1100 on the Univac 1110 computer.
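The correspondence between the data structure diagram and partitions A11 and A21 can be sketched as a simple conversion from matrix elements to arrows. This is an illustrative sketch only: the function `diagram_arrows` is hypothetical, and the set type name ORDER-OF-SUBSCRIBER is formed here by applying the X-OF-Y convention rather than taken from the paper.

```python
def diagram_arrows(set_cells, relational=()):
    """Turn set type elements {(member row, owner column): "1" or "N"} into diagram arrows.

    Each arrow is (tail, head, set type name): the tail is the owner entry type
    and the head is the member entry type.
    """
    arrows = []
    for (member, owner), _value in set_cells.items():
        label = f"({member})" if member in relational else member
        arrows.append((owner, member, f"{label}-OF-{owner}"))
    return arrows


# Elements taken from partitions A21 (first two) and A11 (last) of the case study.
case_cells = {
    ("EQUIP/ORDER", "EQUIPMENT"): "N",
    ("EQUIP/ORDER", "ORDER"): "N",
    ("ORDER", "SUBSCRIBER"): "N",
}
arrows = diagram_arrows(case_cells, relational={"EQUIP/ORDER"})
```

Every non-empty element of the two set type partitions yields exactly one arrow, which is why the diagram carries no information beyond what those partitions already hold.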
REFERENCES
[1] Codasyl: Feature analysis of generalized data base management systems. Codasyl Systems Committee Technical Report (May 1971).
[2] B. Langefors: Information systems. In Information Processing 74, pp. 937-945, North-Holland, Amsterdam (1974).
[3] M. E. Senko: DIAM II. Semantic Binaries and ANSI SPARC Database Technology, Online London (1976).
[4] C. W. Bachman: Data Structure Diagrams, Data Base. ACM (Summer 1969).