Modelling man-machine interface in a data base environment

Modelling man-machine interface in a data base environment

Int. J. Man-Machine Studies (1983) 18, 55-70 Modelling man-machine interface in a data base environment ISRAEL SPIEGLER Faculty of Management, Tel A...

734KB Sizes 0 Downloads 29 Views

Int. J. Man-Machine Studies (1983) 18, 55-70

Modelling man-machine interface in a data base environment ISRAEL SPIEGLER

Faculty of Management, Tel Aviv University, Ramat Aviv, Tel Aviv 69978, Israel (Received 15 July 1981, and in revised form 26 February 1982) A method for the conceptual representation of queries in a man-machine data base environment is presented. The conceptual model is also used to categorize the queries into distinguishable classes. This provides users with a framework for stating their requests, and gives data base professionals a tool for designing interface languages and data base structures.

1. Introduction More and more data bases appear in m a n y new application areas, and their use increases rapidly. By separating physical data from programs that handle them, data base technology begins to replace the intermediary function in conventional computer data processing. This function is usually p e r f o r m e d by p r o g r a m m e r s or analysts, who approach the data knowing both their structure and location. This is no longer necessary as users are now able to interface with the computer and state directly what they want, without having to specify how or where to get the answer. Divorcing users from the how and where of data is a significant step in m a n - m a c h i n e interface. However, the interface can be meaningful only as long as the user view and machine view about data (i.e. the format under which the data are stored) concur. Lack of congruity between these two views m a y lead users to approach computer data either not knowing what to ask, or asking questions which the computer cannot answer. Even in stating the what part of their request, users m a y come up with forms which vary according to their particular views. Nevertheless, we think that the different ways of stating a query converge onto a c o m m o n conceptual form that represents the query unambiguously. It is this aspect of the interface that we are attempting to model. Specifically, this p a p e r proposes a method of formally defining the interface between a user and data a b o u t a predefined environment stored in the computer, thus linking the two views of data; the user view and the machine view. The scope of this p a p e r covers an intermediate level between the theoretical and practical aspects of query research. Other research in this area has b e e n conducted on the theoretical level, for example, by Codd (1972), who studied the completeness of selection capabilities in the data base; Chandra & Harel (1980), who introduced the computability of queries and Vandijck (1977), who investigated relational retrieval languages. A m o n g the practical developments were S Q L - - t h e language of System R (Blasgen et al., 1981) and Q u e r y - b y - E x a m p l e (Zloof, 1977). The result of this work is not another query language, but rather a method that aids designers of query languages, and a means by which users m a y gain a better understanding of their interface with a data base. 55 0020-7373/83/010055 + 16503.00/0

O 1983 Academic Press Inc. (London) Limited

56

i. SPIEGLER

The paper briefly presents the interface requirements and data views. Next it introduces the basic elements of the model in the form of value and function types of the data base. These are incorporated into a model which formally defines the data base functions. Finally, the model is used for the classification of user queries. Examples of queries related to a sample data base are given throughout the paper. T h e y are summarized and presented in model conceptual form in the Appendix.

2. Interface requirements Three basic requirements need to be met before man-machine interface can be established: (a) common frame-of-reference, (b) agreement on interface structure and (c) predefined terms. A common frame-of-reference assures that the interfacing parties are "talking" about the same thing; that is, both have agreed on the situational context and have a common "data base". Other names have also been used to identify the context: Falkenberg (1976), for instance, called it the Universe of Discourse. Usually, the context consists of an environment with entities of sufficient interest for an organization about which it collects and stores data. The second requirement, that of structure, defines the associations that exist among entities of the environment, as well as the links between entities and their attributes. An agreement on interface structure is the basis for any further communication about stored data that represent the situational context. An example of such a structure is a hierarchy that describes the relationships among divisions, departments and employees. Any given context can be described by more than one structure, depending on the "view" taken on the data. The third requirement for interface assumes the existence of a predefined glossary of unambiguous terms, with which both man and machine are familiar during the interface. The terms are used as names of attributes, that is, NAME, S A L A R Y , M A N A G E R ; or as names of verbs, decriptors, predicates or connectives needed to form the interface.

3. Views of data Any given environment may be modelled in several views of its related data. A view of data is a logical representation of the structure of that data, and of its basic terms. The three basic views of data are: the user, logical and physical views. They are shown in the columns of Table 1 together with two columns for reality and interface language. The Table summarizes the terminology used in the different views as related to the interface requirements of context, structure and terms. Column 1 in Table 1 defines reality, which consists of entities, attributes and links between them. The data base view, shown in columns 2 and 3, consists of logical and physical views; Nijssen (1976) named them "conceptual" and "internal" views while Engles (1972) used the concepts of "file-organization" and "data-organization", respectively, where "file-organization" is the correspondence between the information structure and the structure of the data and "data-organization" is the correspondence

MODELLING

MAN-MACHINE

INTERFACE

57

r~ o

0

O

O

o "r,

~'B

<

-~, 0

~- e - i

0 o

8

58

I. S P I E G L E R

between the structure of the data and the structure of storage. Graphically, the two views can be shown as follows:

File -organization or Iogic,~i view

Data-organization or physical view

The context of the logical view is the data base. The structure, named schema, consists of files and logical records. Typical structures in the logical view are: hierarchies, networks or relations. The lowest level in the logical view is the data-element. The physical view (column 3) refers to actual data-values, stored on computer media. H e r e the environment, or context, is the device used for storage, such as disk or tape. The structure consists of areas, files and records. Examples of storage structures, used in physical views, are: sequential, index-sequential, direct, list, ring or inverted structures. The lowest level in the physical view is the field, containing one data-value of an attribute in the data base. There need not be a one-to-one correspondence between structures of the logical and the physical views, since a logical structure may be implemented in more than one physical structure. For example, a network data base may physically be stored in index-sequential, list or ring structures. Usually, the physical structure is invisible to users who rely on the Data Base Management System (DBMS) to provide automatically for the correspondence between logical and physical structures. Requirements for context, structure and terms apply also to the interface language (column 4), in the form of propositions, syntax or grammar, and words, respectively. The interface language linking the user and the data base, may take various forms such as menu-driven, English-like, keyword-base, procedural or other formats. All such forms are assumed to converge into one conceptual form, which is the target of this work. The user view of data, a concept lacking a clear and distinct definition, has been the subject of several studies (e.g. Nijssen, 1976; Vandijck, 1977; Holland, 1980; Prowse, 1980). Data base designers tend to believe that the logical view of the data base, as reflected in their design, is the user view. This may not always be the case, since in a given environment, modelled by one logical view into a data base, there may coexist different users each having his own view of entities and attributes in that environment. One can only speculate on the diversity of those views which may be in hierarchical, functional or associative structures. It is beyond the scope of this work to model the user view on data. The aim of this research is to link the user view (column 5) with the logical view of the data base (column 2) via a common conceptual representation of functions, which can be the base for an interface query language (column 4).

4. Types of values in data bases Data bases are organized collections of named data-values, of interest to organizations. Values can be facts, quantities or measurements, physically stored in the data base.

MODELLING

MAN-MACHINE

INTERFACE

59

Storage values may look like J O N E S 8 0 0 0 S M I T H . The logical structure assigns meaning to these facts: N A M E , S A L A R Y , M A N A G E R , respectively. In other words, values are raw data, from which the logical structure can produce information. This is done much the same as applying a " t e m p l a t e " of structure and names to physical data values. Values take different forms in a data base environment, and can be arranged into the following groups.

1. Internal values: values actually stored in the data base, such as numeric, alphabetic, string, coded, pointer or binary. 2. External values: values supplied by the user, in his interface with the data base, which may or may not match stored internal values. For example, a user may ask for salaries greater than 22,000, where that n u m b e r is actually not stored. Other examples in this category are data-element names, file names, language verbs, keywords, punctuation and reading aids. 3. Derived values: new values which result from operations performed on data. These may be computed numeric fields, string splits or concatanations, lists and sub-lists. We use these value groups to define the functions in a data base environment.

5. Functions in data base queries Three basic functions are needed to interface with, and obtain information from, a data base. They are subset, retrieval and information functions. The subset function (suf) is used for reducing files or relations into smaller files, or sub-relations, by means of user-specified subsetting criteria. The retrieval function (ref) is used for stating which internal values, stored under certain data-element names, are requested by the user. In order to reduce the file or relation, this function may first invoke a subset function before processing the statement of the retrieval function. The information function (inf) is used for producing derived values, by means of external or internal values together with the suf and ref functions. Figure 1 shows the three functions, their input, output and interrelationships. Attention must be given to the input and output of each function, since their interrelationships suggest a transfer of different data types. This is discussed next. Subset functions (suf) take as their input data-element names and external values and return a set of entries, qualified under subsetting criteria. An entry is defined in this paper as the set of all attribute values associated with one entity in the data base. In relational data base terminology (Date, 1981), an entry is referred to as n-tuple, where " n " is the degree of a relation--the number of data-elements (domains) in the relation. The sample data base used in this paper, shown in Table 2, depicts a relational structure as the logical view. It describes a department store environment, and has four relations: EMP, SALES, S U P P L Y and TYPE. Entries in the E M P relation are numbered for illustrative purposes only. An example of a subsetting criterion is: " S A L A R Y G R E A T E R T H A N 11500". It consists of a data-element (SALARY), an external value (11,500), and a predicate

60

i.

SPIEGLER

Data

elements

External values

Subset ~.

~.

function

(suf)

I

"t

Entries*

1

Retrieval function (ref)

Internal values

1

).

Information function (inf)

J

Derived r- vclues

F I G l . D a t a base interface functions. *An entry is the set of all attributable values associated with one entity in the data base; also referred to as a logical record.

(GREATER THAN). In the sample data base (Table 2) this subsetting criterion, applied to the EMP relation, will return three qualified entries: 3, 6 and 8. The input of the retrieval function (ref) are entries, obtained via the subset function, and data-element names supplied by the user. The output of the retrieval function are internal values requested by the user. For example, the query PRINT NAME WHERE SALARY GREATER THAN 11500. TABLE 2

Sample database Y EMP

entry: (1)

(2) (3) (4) (5) (6) (7) (8)

SUPPLY

t Source:

NAME

SALARY

MANAGER

DEPT

DEPT

ITEM

JONES MORGAN LEWIS NELSON LONG HOFFMAN MURPHY SMITH

8000 10000 12000 6000 7000 16000 8000 12000

SMITH LEE LONG MURPHY MORGAN MORGAN SMITH HOFFMAN

HOUSEHOLD COSMETICS STATIONERY TOY COSMETICS COSMETICS HOUSEHOLD STATIONERY

STATIONERY HOUSEHOLD STATIONERY COSMETICS TOY TOY TOY COSMETICS

DISH PEN PENCIL LIPSTICK PEN PENCIL INK PERFUME

ITEM

SUPPLIER

PEN PENCIL INK PERFUME INK DISH LIPSTICK DISH PEN PENCIL

PENCRAF'F FLIC PENCRAFT BEAUTEX FLIC CHEMCO BEAUTEX FLIC BEAUTEX PENCRAFT

TYPE

IBM Query-By-Example User Guide; S H 2 0 - 2 0 T S - 0 .

ITEM

COLOR

SIZE

DISH LIPSTICK PERFUME PEN PENCIL INK INK PENCIL PENCIL

WHITE RED WHITE GREEN BLUE GREEN BLUE RED BLUE

M L L S M L S L L

MODELLING

MAN-MACHINE

61

INTERFACE

would be answered LEWIS HOFFMAN SMITH. All data-element names (shown as columns in Table 2) may be used, provided they are related to the entries qualified by the subset function. Thus the query P R I N T D E P T W H E R E S A L A R Y G R E A T E R T H A N 11500. is legal and would be answered STATIONERY COSMETICS. The reply indicates that these two departments have employees with salaries greater than 11,500. Consider now the query P R I N T I T E M W H E R E S A L A R Y G R E A T E R T H A N 11500. with the answer DISH PENCIL LIPSTICK PERFUME. Care should be exercised with such queries as there is no direct relation between ITEM and S A L A R Y . In order to obtain a valid answer to such a query, an association must be made, via a common data element, in this case, D E P T . The procedure to be followed by the data base software in handling this query is: 1. Apply subsetting criterion; entries qualified: EMP (3, 6, 8). 2. ITEM in E M P relation? yes--retrieve, stop. 3. Common data-element to EMP and any relation containing I T E M (e.g. SALES)? n o - - q u e r y cannot be answered, inform the user and stop. 4. Get qualified internal values of common data-element in E M P relation. 5. Based on 4, qualify entries on SALES; entries qualified: SALES (1, 3, 4, 8). 6. Return the qualified internal values of ITEM. The user should not be required to state explicitly the association; rather he should be informed by the system when no association exists, so that he would know that the query is not answerable. Information functions (inf) return new or derived values. Their inputs are operators, needed to perform data manipulation on internal values produced by retrieval and subset functions. The following query illustrates an information function: PRINT 11500.

AVERAGE

The result may be 13333.

SALARY

WHERE

SALARY

GREATER

THAN

62

L SPIEGLER

Another example, query COUNT NAME WHERE DEPT EQ 'COSMETICS' may be answered 3. The information function is performed through the application of operators: averaging salaries qualified under a subset function, as in the first example; or counting names returned via a retrieval function as in the second example. Note that any data-element name would give the same answer in the second example; that is, N A M E , S A L A R Y , M A N A G E R or D E P T .

6. Formal definition of functions We now present the formal definition of these functions and interrelate them into a cohesive model. First, the notation used in the definition: A. Upper-case words are called terminals and are used for: a. Data-element (field) names: S A L A R Y , D E P T , b. Operators, predicates, quantifiers such as EQ, A V G , SUM, or c. External values: 11500, 'COSMETICS'. B. Lower-case words are used as variables in the model. They are always defined by other variables a n d / o r terminals. Examples: ref, val. C. Vertical bar (]) represents a choice; a ]b means a or b. D. Brackets [ ] indicate an optional feature; that is, col[, col] means that one, or more, column names may appear in the request. E. Ellipses (...) mean optional repetition of a term; that is, [, col] . . . . F. Single quotes delimit alphabetic strings; for example, D E P T E Q 'COSMETICS'. G. Parentheses ( ) or comma are used as element delimiters. 6.1. SUBSET FUNCTION (suf) As previously stated, this function reduces files or relations according to subsetting criteria. Denoting data-elements as "col" (columns) and internal values as "val", let: suf = pred 1(col, val), where p r e d l is a relational predicate, such as EQ, GT, GE, LT, LE, NE. Adding to this definition set operators, which tie two (or more) subset functions, we get: suf = pred2(suf)(suf), where pred2 represents the logical A N D and OR. We also add the unary NOT, defined as pred3, to get the complete definition of the subset function: suf = pred 1 (col, val) Ipred2 (suf) (suf) ]pred3 (suf).

(1)

The recursive definition of suf allows for multiple calls to that function. Examples of this feature are presented below following the discussion of the other functions.

MODELLING MAN-MACHINE INTERFACE

63

6.2. RETRIEVAL FUNCTION (ref) The basic form of the retrieval function is: ref = col [col(suf). By this definition the retrieval function (ref) has the ability to state any data-element name (col), and get all values, (PRINT M A N A G E R ) or get internal values, after a subset function has been applied, (PRINT N A M E W H E R E . . . ) . There are, however, other more complex forms of retrieval such as: a. Whole relation to be retrieved; b. Series of data-elements (not only one) to be returned; c. Calculation performed on a data-element, for example, S A L A R Y / 1 2 (giving monthly salary); d. Forming a new relation from disparate columns, for example, M G R ( M A N A G E R , ITEM, SUPPLIER). Accordingly, the definition of

ref is extended, to include those forms of retrieval

ref = rel Irel(suf), where rel = rel-name [col[, coll... Iarith(col, val) ]rel-name(col[, coll...).

(2) (3)

"rel-name" represents a relation name and "arith" defines the arithmetic operators: +, --, *, /. 6.3. INFORMATION FUNCTION (inf) This function operates on results of a retrieval function, in the following manner: inf = calc(ref) Ibool(ref).

(4)

"talc" represents set quantifiers, such as AVG, SUM, MAX, MIN, COUNT, and "bool" represents the Boolean operator EXIST, which returns true/false when this function is applied. 6.4. DEFINITION OF VALUE (val) In order to allow the specification of different value types discussed in section 4 (internal, external and derived), as part of the functions, value is defined to be: val = 'string' ]expr ](ref) ](element) IALL,

(5)

where string = alphanumeric values such as 'WHITE', expr = const ]arith(const, const),

(6)

"const" is an external value, such as 11500; or an arithmetic operation performed on two constants recursively, ref = ability to get internal values via a call to retrieval function. This is used in compound queries illustrated below, element = calc(ref),

(7)

64

I. SPIEGLER

ability to obtain a derived value, such as A V G ( S A L A R Y ) , in a subset function A L L = a term referring to all stored values u n d e r a certain d a t a - e l e m e n t name. T h e use of the a b o v e definition of val facilitates linking together of information, retrieval and subset functions. It also adds flexibility to the functions t h r o u g h the recursive nature of the val definition. This allows the r e p r e s e n t a t i o n of c o m p o u n d e d queries of the form: " W h o are the employees earning m o r e than the average salary". Illustrations of queries r e p r e s e n t e d in model f o r m are discussed in section 7.

query inf ref rel

= = = =

(inf) I(ref) calc(ref) fbool(ref) rel [rel(suf) rel-name Icol[, c o l l . . . ]arith(col, val)] rel-name(col[, col]...) suf = predl(col, val[, val]...) Ipred2(suf)(suf) I pred3(suf) val = 'string']exprl(ref) l(element)lALL expr = const ]arith(const, const) element = calc(ref) predl = EQ t G T I G E I L T I L E I N E pred2 = A N D [ OR pred3 = NOT arith = +1-I*l/ calc = AVGI SUM IMAXIMINI COUNT bool = EXIST

Information function Retrieval function

Subset function Definition of value

Relational predicates Logical predicates Negation predicate Arithmetic operators Calculation operators Boolean operator

rel-name and col are names of relations and data-elements as defined in the data base. const and string are constants and strings as defined in programming languages. FIG. 2. Logical representation of model.

Figure 2 shows the entire m o d e l in a t o p - d o w n m a n n e r . It is a c o m p a c t c o n c e p t u a l r e p r e s e n t a t i o n of features n e e d e d in u s e r / d a t a base interface. As such, it does n o t claim to be a user query language, but rather a m e a n s of specifying, in a c o m m o n way, the different formats of user interface. To illustrate, each of the following q u e r y formats can be represented in the simple m o d e l form: (NAME(GT(SALARY,11500))) 1.

GU EMP

A B C G N P E M P ( S A L A R Y = 11500) go to A B C 2.

F I N D IN F I L E E M P W I T H S A L A R Y = 1 1 5 0 0 DISPLAY NAME.

MODELLING MAN-MACHINE INTERFACE 3.

4.

65

ENTER MODE query S E L E C T DATA NAMES: 01 N A M E 02 S A L A R Y 03 M A N A G E R 04 D E P T 01 ENTER SELECTION CRITERIA salary gt 11500 PRINq; N A M E W H E R E S A L A R Y IS G R E A T E R T H A N 11500.

Query language designers may use the conceptual form as a base for implementing an interactive m e t h o d by adding connectives, verbs or other conversational aids. The same conceptual form may also be used to generate DBMS routines that perform the actual access to the data.

7. Classification of user queries In addition to the definition of man-machine interface functions, the model is used to categorize user queries in a data base environment. Figure 3, a hierarchical representation of the model, depicts the scope of each query class along with its relevant model elements. We shall now define the various query classes, illustrated with examples, and described in conceptual form. For a full parsing of the examples see the Appendix. CLASS A: SIMPLE RETRIEVAL Class A queries cover requests for retrieving a relation or any part of it (such as a column):

Logical form: query = (rel) Example 1: Get all items sold. (ITEM)

Example 2: Print supplier names. (SUPPLIER)

Required user knowledge: data element or relation names. This class represents all unqualified data retrieval questions, where the user wishes to get the data as they are stored in the data base. CLASS B: SINGLE QUALIFIED RETRIEVAL Class B queries represent questions having one single condition, and operating on the same relation or file: Logical form: query = (col(predl(col, val))) Example 3: Who earns more than 11500? ( N A M E ( G T ( S A L A R Y , 11500))) Example 4: Which items are sold by the toy department? ( I T E M ( E Q ( D E P T , 'TOY')))

66

I. SPIEGLER

Required user knowledge: n a m e s a n d value r a n g e s of d a t a e l e m e n t s in t h e r e l a t i o n o r file. A n s w e r s to q u e s t i o n s in this class a r e s u b - r e l a t i o n s of d a t a s t o r e d in t h e d a t a b a s e . CLASS C: MULTIPLE QUALIFIED RETRIEVAL Q u e s t i o n s c o v e r e d in this class h a v e m u l t i p l e c o n d i t i o n s s t a t e d in o n e or m o r e r e l a t i o n s :

Logical form: q u e r y = (col(pred2(suf)(suf))) E x a m p l e 5: W h o a r e the p e o p l e w h o e a r n m o r e t h a n 10000 a n d w o r k in t h e t o y department? (NAME(AND(GT(SALARY, 10000))(EQ(DEPT, 'TOY')))) E x a m p l e 6: W h o e a r n s m o r e t h a n 10000 b u t d o e s n o t w o r k in t h e c o s m e t i c s department? (NAME(AND(NOT(EQ(DEPT, ' C O S M E T I C S ' ) ) ( G T ( S A L A R Y , 10000)))) E x a m p l e 7: F i n d all g r e e n i t e m s s o l d b y the t o y d e p a r t m e n t . (ITEM(AND(EQ(COLOR, 'GREEN'))(EQ(DEPT, 'TOY'))))

~176 . . * " ' " , ~ 1 7 6 ..

......y . . . . . . . f\,...,. ref

.--/re'

,.

\.

.."

inf

..""

"\

IX

i!,

I

o st cons~/

"~/

\

.../

FIG. 3. Graphical representation of model in hierarchical form. Query classes: . . . . . . , class A; B; - - - , class C; - - . --, class D; . . . . . , class E.

, class

MODELLING

MAN-MACHINE

INTERFACE

67

Required user knowledge: data-element and relation names and familiarity with operators. Examples 5 and 6 illustrate multiple conditions operated on one relation--EMP. Example 7 shows "linking" two r e l a t i o n s - - T Y P E and SALES. The mechanism of linking, via a common data element (ITEM), should be transparent to the user rather than be reflected in the interface language, as is shown in the model representations of this query. Any man-machine system should facilitate such linking provided there is a common data element through which the link can be made. CLASS D: COMPOUND RETRIEVAL This class defines questions having internal values (retrieved from the data base) as part of the subset criteria. The values are not supplied externally but rather derived internally from the data base:

Logicalfrom: query = col(predl(col, (ref)))) Example 8: Find all the people earning more than Lewis. ( N A M E ( G T ( S A L A R Y , ( S A L A R Y ( E Q ( N A M E , 'LEWIS'))))))

Example 9: Get the people earning more than Lewis and working in the department that sells pens. ( N A M E ( A N D ( G T ( S A L A R Y , ( S A L A R Y ( E Q ( N A M E , 'LEWIS')))) (EQ(DEPT, ( D E P T ( E Q ( I T E M , 'PEN'))))))) Required user knowledge: data-element and relation names and operators. Note that in both questions we must obtain first Lewis' salary, or the department that sells pens, before the answer can be retrieved. Class D queries represent recursive questions, where the retrieved value is used as a parameter in the subset function. This may be done in operating on one relation, as is shown in example 8, or in linking two relations--EMP and SALES, as is illustrated in example 9. From the user's point of view these questions are not much different from questions of previous classes. The more complex nature of these queries comes forth in the model representation and, indeed, in the DBMS routines implementing Class D facilities. CLASS E: INFORMATION REQUESTS This category includes queries for obtaining derived values, by means of calculations performed on results of retrieval functions:

Logical form: query = (inf) I (col(pred 1 (col, (element))))

Example 10: What is the sum of salaries in the household department? (SUM(SALARY(EQ(DEPT, 'HOUSEHOLD'))))

Example 11: Who earns more than the average salary? ( N A M E ( G T ( S A L A R Y , ( A V G ( S A L A R Y ( E Q ( S A L A R Y , ALL)))))))

Required user knowledge: data-element and relation names, logical and calculation operators. Note that calculation operators may be used both in obtaining the final answer, such as summing the qualified salaries in example 10, or in the intermediate stage, to find

68

i. SPIEGLER

the average salary first, and then use it as the value c o m p a r e d to S A L A R Y in the subsetting criterion.

8. Conclusion A conceptual query representation was defined by the identification and formal presentation of the basic user query functions. The result of the conceptual definition is an unambiguous statement of the u s e r / d a t a base interface and a clear classification of query types. The classification also gives the range and boundaries of questions answerable by the computer. The proposed model may aid designers of m a n - m a c h i n e interface languages in identifying the functions a target language is to provide for users. A formal definition of interface functions may also benefit data base designers as the structure of the data base has a direct and p a r a m o u n t effect on its possible use. This research was supported in part by Army Research Institute for the Behavioral and Social Sciences, Contract Number DAJA37-81-C-0065.

References BLASGEN, M. W. et at. (1981). System R: An architectural overview. IBIV[System Journal, 20 (1), 41-62.

CHANDRA, A. K. & HAREL, D. (1980). Computable queries for relational data bases. Journal of Computer and System Sciences, 21(2), 156-i78. CODD, E. F. (1970). A relational model for large shared data bases. Communications of the Association for Computing Machinery, 13(6), 377-387. CODD, E. F. (1972). Relational completeness of data base sublanguages, in RUSTIN,R., Ed., Data Base Systems, pp. 65-98. Englewood Cliffs, New Jersey: Prentice-Hall. DATE, J. J. (1981). An Introduction to Data Base Systems, 3rd Edition. New York: AddisonWesley.

ENGLES, R. W. (1972). A tutorial on data base organization. Annual Review of Automatic Programming, 7 (1), 1-63. FALKENBERG, E. (1976). Signification: The key to unify data base management. Information Systems, 2 (1), 19-28. HOLLAND, R. H. (1980). DBMS: Developing user views. Datamation, 141-144 (February). IBM, Query-By-Example User Guide, SH20-2078-0. JAMES, E. B. (1980). The user interface. The Computer Journal, 23(1), 25-28. NIJSSEN, G. M. (1976). A gross architecture for the next generation DBMS. Proceedings IFIP Conference Modelling in Data Base Management Systems, January, pp. 1-24. PROWSE, P. (1980). The data base approach. The Computer Journal, 23(1), 9-12. VANDIJCK, E. (1977). Towards a more familiar relational retrieval language. Information Systems, 2 (4), 159-169. ZLOOF, M. M. (1977). Query-by-example: A data base language. IBMSystem Journal, 4/1977, 324-343.

MODELLING

MAN-MACHINE

69

INTERFACE

Appendix EXAMPLES OF QUERIES PARSED TO MODEL CONCEPTUAL FORM

Query Class

1. Get all items sold. (ITEM) 2. Print supplier names. (SUPPLIER) 3. Find names of employees earning more than 11,500. (NAME(GR(SALARY, 11500)))

Jcol J suf

J lval

A A B

I f

Iref

1

4. Which items are sold by the toy department. (ITEM(EQ(DEPT, 'TOY'))) 5. Get the people who earn more than 10,000 and work in the toy department. (NAME(AND(GR(SALARY, 10000))(EQ(DEPT, 'TOY'))))

l suf

11 suf

I suf

6. Who earns more than 10,000, but not working in the cosmetics department. (NAME(AND(NOT(EQ(DEPT, 'COSMETICS')))(GT(SALARY, 10000))))

[suf

J Isuf I

C

I

[suf

I

7. Find all green items sold by the toy department. (ITEM(AND (EQ(COLOR'GREEN'))(EQ(DEPT'TOY')))) 8. Find all people earning more than Lewis. (NAME(GR(SALARY, (SALARY(EQ(NAME, 'LEWIS'))))))

l suf I col

C D

J ]

II val = ref

I suf

I

[ ref ] 9. Find people earning more than Lewis and working in a department selling pens. (NAME(AND(GR(SALARY, (SALARY(EQ(NAME, 'LEWIS'))))) [ v a l = r e f j suf J I I suf (EQ(DEPT, (DEPT(EQ(ITEM, 'PEN')))))))

Isuf [

I I suf [ ref

C

J I r

Iref

I suf

B

Pal = ref

suf

L ]

1 f l

D

70

I. S P I E G L E R

10. Obtain the sum of salaries in the household department. (SUM(SALARY(EQ(DEPT, 'HOUSEHOLD'))))

rsuf

E

J

[ref [ [inf I 11. Who earns more than the average salary (NAME(GR(SALARY, (AVG(SALARY(EQ(SALARY, ALL))))))) ]snf I ref [ val = element [ref

l suf

tcol

J Eval I

] l l

1

[

E