
Engng Applic. Artif. Intell. Vol. 5, No. 5, pp. 413-423, 1992

0952-1976/92 $5.00 + 0.00 Copyright © 1992 Pergamon Press Ltd

Printed in Great Britain. All rights reserved

Contributed Paper

An Approach to Embed Knowledge in Database Systems

CHEE-KIONG SOH, Nanyang Technological University, Singapore

AI-KAH SOH, Nanyang Technological University, Singapore

KUM-YEW LAI, Massachusetts Institute of Technology, Cambridge

Most tools for developing knowledge-based systems do not integrate well with databases. Integration is hampered by slow database queries or by overly limited database interfaces. This paper describes a novel approach to the knowledge-data integration problem: embedding knowledge directly into the database system. The approach entails a simple but powerful extension of a database system to increase its ability to represent and manipulate knowledge. A description is given of KBase, an environment for embedding knowledge in databases using the facilities and extended constructs of the familiar dBase database programming environment. Finally, the applicability of the approach is illustrated with a developmental prototype, CICONSA, for scheduling construction projects.

Keywords: Database systems, knowledge-based systems, knowledge-embedded database systems, dBase, KBase, CICONSA.

INTRODUCTION

Database systems now form the basis of many production application systems. The technology for the purposes of data storage has matured. This maturity is also reflected in the burgeoning research that concentrates on how to get more out of the stored data, instead of on traditional transaction-level issues. For instance, there is a large amount of work done on deductive databases.1,2 On the other hand, contemporary knowledge-based systems have grown larger as they are put into production use. The permanent storage of knowledge is now an increasingly pressing research issue. Some investigations have been conducted on how to store knowledge items. In particular, the work on object-oriented databases is most prominent.3,4

The simultaneous requirements for the permanence of information storage and the knowledge processing of such information suggest that the study of systems that provide these two requirements is an important one in its own right. Because database systems are now the most widely understood forms of data storage and, likewise, knowledge-based systems have been widely used to support knowledge processing, the research effort to provide the twin requirements of storage and knowledge processing could fruitfully start with the integration of database and knowledge-based systems.

This paper begins by explaining what is meant by "database" and "knowledge-based systems". After establishing this common understanding with the reader, a framework is presented that classifies many existing techniques for integrating databases and knowledge-bases. Within this framework, an attempt can be made to classify and better understand existing attempts to integrate databases and knowledge-bases. In particular, there is a need for a simple, integrated design that works reasonably well for most engineering applications, rather than a comprehensive design that requires an unreasonable amount of change in existing installations. This is essentially the 80-20 rule: it would be desirable to achieve 80% of the benefits of integration at only 20% of the costs. With this objective, it is argued that a novel means of integration, resulting in what could be called "knowledge-embedded database systems", is an appealing one that has the potential to make a real contribution in practice. The idea of knowledge-embedded database systems is illustrated with an implemented environment called KBase.

Correspondence should be sent to: Chee-Kiong Soh, School of Civil and Structural Engineering, Nanyang Technological University, Nanyang Avenue, Singapore 2263.

KBase is an extension of the popular dBase5 database system so that knowledge engineers can easily build intelligent database applications that accomplish



the twin objectives of data storage and knowledge processing. The distinguishing contribution of KBase is not in advancing the state-of-the-art in databases or knowledge-based systems per se. It is in the practical integration of databases and knowledge-bases in a convenient way that takes into account the many existing dBase applications in the commercial and engineering world. That is, the contribution lies in the elegant integration of two existing technologies instead of advancing the state-of-the-art in either one. Not only is it easy to integrate databases and knowledge-bases with KBase, but applications built using KBase are surprisingly elegant and robust.

The next section describes KBase, an environment for developing knowledge-embedded database systems. Its applicability is illustrated with a brief description of an application system built with it. The system, called CICONSA, supports scheduling in civil construction (e.g. the construction of low-rise reinforced concrete buildings). The paper concludes by attributing these pleasant features to the well-tested technologies (e.g. dBase is very widely used) and the fact that KBase is an open environment which facilitates the addition of features and computing services not anticipated by its designers. One important direction that this research leads to is the generalization of the ideas to relational database technology, and to more-general kinds of knowledge-representation schemes.

DATABASE AND KNOWLEDGE-BASED SYSTEMS

In this paper, "database" is used synonymously with "database system" or "database management system", while "knowledge-base" is used synonymously with "knowledge system" or "knowledge-based system". Because of the pervasiveness of relational databases, this paper will concentrate only on such databases as opposed to others. Databases emphasize effectiveness and robustness in data storage, retrieval, and security. On the other hand, knowledge-bases emphasize the expressiveness and flexibility of encoded knowledge. The similarities and differences between databases and knowledge-bases can be examined to get a better sense of how systems can be designed that integrate the advantageous features of both kinds of systems.

At a functional level, both databases and knowledge-bases store and organize information in computer systems in order to help people solve problems. In particular, both usually store information in a declarative manner but use imperative mechanisms to manipulate the stored information.6,7 For instance, relational databases store declarative information in the form of tables,8 while knowledge systems might store declarative information using a knowledge-representation language.9,10 Databases use data-definition and data-manipulation languages for manipulating and retrieving data, while knowledge systems use inferences to reason

about the stored knowledge. This separation between declarative and imperative information in both database and knowledge-based systems can potentially be exploited in integration.

Of course, there are differences between database and knowledge-based systems, which make a straightforward integration of the two kinds of systems difficult. The most obvious difference is that the declarative component of databases is relatively static, while it is the imperative component of knowledge-bases that is relatively static. This kind of difficulty is generally called an "impedance mismatch" in the artificial intelligence literature. This difference between database and knowledge-based systems spells trouble for a naive integration. Another kind of difficulty is an inability to exploit people's understanding of aspects of the technologies. For instance, it is difficult to migrate information between database tables and knowledge structures, and simultaneously take advantage of the current understanding about database design and knowledge representation. For imperative components, it is not trivial to translate optimized database queries to inference mechanisms that continue to be optimized. One important reason for the difficulty in re-using knowledge from one kind of system in another is the coupling of the knowledge in static and dynamic components.

These observations about the similarities and differences between databases and knowledge-bases will be a running theme in the framework for integration proposed by the authors. They also suggest that one approach to fruitful integration would be a close coupling of database and knowledge systems. This close coupling has the potential to resolve the first difficulty of impedance mismatch by removing any requirements for mapping between parts of the database and knowledge systems. If an integrated system can use a database table-like declarative component for representing knowledge, then it would be easy to extend the component to cope with the dynamic aspects of knowledge representation. Similarly, an imperative component that has an integrated inference and query mechanism need not consider translating static inference processes into dynamic queries, and vice versa. Inference processes will remain static and queries will remain dynamic.

A close coupling of database and knowledge systems might also help resolve the second difficulty of making use of the knowledge in individual areas of the system. By making use of both database tables and knowledge-representation schemes within the same system, use can simultaneously be made of normalized databases, as well as of knowledge-level features such as inheritance. Similarly, if the system has an imperative component that consists of inference mechanisms as well as database queries, then it would be simple to optimize just the queries without the need to consider optimizing the imperative mechanism as a whole.

This kind of close coupling between database and knowledge systems might best be understood within the context of existing methods for integration. In the next section, such a framework is presented.


A FRAMEWORK FOR THE INTEGRATION OF DATABASES AND KNOWLEDGE-BASES

A simple, traditional approach to the problem of integrating a database and a knowledge-based system is to use a flat file as an intermediary. To transfer information from the knowledge-based system to the database, the former writes the information into the file, possibly using delimiters to mark certain syntactic forms. The flat file is transferred to the database system, which reads it and stores it as rows. The database system transfers information to the knowledge-based system with the converse procedure. This straightforward approach is perfectly suitable for knowledge-based applications that reason over a well-specified data set, and in which the interaction between the database and knowledge-based systems is kept to a minimum. Often, though, the data a knowledge system needs may not be known until analysis within the system is well underway. Without the right network connections, the interaction between the two systems can involve an almost arbitrarily complex process.

Perhaps the most debilitating limitation of this approach is that it does not scale well: it works less and less well as the number of databases or knowledge-bases increases. A common file format understood by all integrated databases and knowledge-bases would be improbable because: (1) it would be difficult for knowledge engineers to anticipate what databases or knowledge-bases they might add to the integrated configuration; and (2) even if the additional databases or knowledge-bases could be reprogrammed so that they can read and write using the common format, it is unlikely that such reprogramming would result in an efficient integration.

Because the integration of databases and knowledge-bases is a non-trivial but important problem, there has been a fair amount of research into various solutions. These solutions by themselves may look ad hoc, but they can be better seen as various trade-off points within a unified framework. The framework is primarily an architectural one: it considers, in a systematic way, various possibilities for locating the functionality required for integration. Figure 1 illustrates five approaches which form a fairly exhaustive set of possibilities for integration.

(1) Build a full bridge between the database system and the knowledge-based system. This is essentially the "naive" approach considered at the beginning of this

section. For the purposes of the practical integration of knowledge-bases and databases, this approach falls short: it does not scale up well. Most early work on knowledge-base/database integration, such as Ref. 11, uses this approach (see Fig. 1a).

Fig. 1. Possible methods of integrating databases and knowledge-bases.

(2) Build a bridging interface from the knowledge-based system. This interface can connect to a number of different databases. An example of this approach is KEE-Connection,12 which is an interface from a knowledge-based system built using the KEE Knowledge Engineering Environment. KEE-Connection can connect to a number of database systems. At a lower level of connectivity, KEE-Connection can use different network protocols to connect to these database systems. The advantage of this approach is that it can connect to existing database systems, provided these systems are anticipated in the design of the interface (see Fig. 1b).


(3) Build a bridging interface from the database system. Examples of this approach include all the enhancements to database systems, such as object-oriented databases3 and older research on semantic data models13 (see Fig. 1c).

(4) Build database development tools into the knowledge-based system (Fig. 1d). A principal disadvantage of this approach is that it requires the construction of a database system after (or in the course of) building the knowledge system. This order of construction is almost always the inverse of that found in the business and engineering world. More often, a database will already have existed for a number of years; as the people concerned become aware of knowledge-based systems technology, they want to enrich the capabilities already available in their databases. This observation, that production database systems are much more likely to precede knowledge systems, immediately suggests the final approach.

(5) Build knowledge-base development tools into the database system (Fig. 1e). Since extending a database system for knowledge processing is non-trivial (although the use of the resulting system is simple), this is likely to have about the same cost as the approach above, yet it does not suffer the cost of connecting the database and knowledge-based systems. Such systems could be described as "knowledge-embedded" database systems.

In order to further develop and test this idea of knowledge-embedded database systems, KBase, an environment for building such systems, was constructed. Furthermore, KBase has been used to build an application system, CICONSA. This exercise in the development of KBase and a KBase application provides an insight into how knowledge-embedded database systems might be realized and used in practice.

THE KBASE ENVIRONMENT

KBase is designed on the basis of the above observation about the advantages of knowledge-embedded database systems. Its design also hinges on the following additional observations about the current state of the art in knowledge-based systems development and implementation:14

(1) Much of the training required for developing knowledge-based systems stems from the unfamiliar programming languages and environments, rather than from the need to understand new artificial intelligence techniques; and

(2) Current tools for developing knowledge-based systems are either very comprehensive but slow, and therefore difficult to understand or use, or very fast but overly simplistic, and therefore not very useful.

The design of KBase is intended to strike a balance between comprehensiveness and speed. KBase is an attempt to achieve some of the more useful features, including total compatibility with dBase (since KBase is a programming extension of dBase*), within an open architecture so that the developer can build his or her customized features on top of the ones already provided. Furthermore, customizing and extending KBase is as easy as programming in dBase. For example, it is simple to write user-defined functions, and even to call programs outside the KBase system. In addition, it is easy to link routines written in dBase with those written in "C" or assembler. One might call this the "reduced feature set knowledge-based system development environment" approach, following the philosophy of "reduced instruction set computers".17,19 This type of environment provides the necessary primitives to build knowledge-based systems in a rapid and convenient way, through the use of the high-level programming language to program additional capabilities.

A highly desirable consequence of the design decision to leave out exotic features while providing an open architecture is that high run-time speed can be achieved. KBase applications run fast for three reasons: (1) KBase applications can be compiled by C language compilers; (2) KBase is a knowledge-embedded database system which exploits the speed and robustness of the well-developed dBase programming environment; and (3) KBase can be made to run fast using meta-level knowledge20 for the selective invocation of knowledge sources.

The knowledge bases

KBase allows the user to build a knowledge base of objects, rules and user-defined functions (i.e. in addition to those functions provided in KBase). The knowledge-base files created by KBase are in dBase format. Each knowledge base, consisting of a collection of objects, rules and user-defined functions, corresponds to a KBase application. At any time during development, a new knowledge base can be created, or an existing knowledge base can be renamed or deleted. The user can also switch to another knowledge base to be used through the rest of the session.

(1) Objects

Conceptually, each object is represented as a frame with an unordered list of fields and a value for each field. From a user's perspective, an object is shown in a template format such as that shown below.

* KBase is built using Clipper,15 a dBase compiler. Clipper is in turn built using the "C" programming language.16


Editor for the PILED_FOUNDATION object
Type    : concrete
Number  : 24
Shape   : square
Size    : 305_mm
Length  : 16_m

This template format is natural; it is common to list information in this manner, as in forms. It is also expressive, because one object can point to another. For instance, "concrete" is another object with its own fields and values representing all the properties and characteristics of the concrete material. Finally, an object-oriented representation is flexible and general enough to represent both what one commonly thinks of as objects and the relations between them.
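To illustrate this object-to-object pointing, the "concrete" object referred to above could itself be edited in the same template format. The fields and values shown here are purely hypothetical (they do not appear in the published example) and serve only to indicate how material properties might be recorded:

Editor for the CONCRETE object
Grade        : grade_30
Density      : 2400_kg_per_m3
Curing_time  : 7_days
Supply       : ready_mixed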

(2) Rules

Another advantage of an object-oriented representation using a relational database language is the consistency with which both the declarative and the procedural knowledge can be encoded. The object knowledge base forms the declarative part: in a simple sense, it represents the state of the world. The rules form the procedural part: they represent how the state of the world can be changed. The consistency is derived from the use of the field-value format in the predicates and actions of rules.

A rule has three parts: its name, its predicate, and its action. The predicate says when the rule should fire, and the action says what to do if the predicate is true. A sample rule is shown below:

Editor for rule INSTALL_PILE
RULE SET : PILED_FOUNDATION
IF       : (GET("Type","pile") = "concrete" .OR. GET("Type","pile") = "steel")
           .AND. GETN("Length","pile") <= 20_m
           .AND. (GET("Type","soil") = "loose" .OR. GET("Type","soil") = "medium")
THEN     : PROMPT("You may install the piles as a single piece.")

Rules are separated into rule sets, each of which has a name for identification. The next section will show how the rule sets are used. KBase emphasizes consistency in the user's and developer's interface: editing a rule is similar to editing an object. This is made possible by the way rules and objects are represented in KBase using the dBase language. The user can choose "Edit" under the "Rules" menu; from a selected rule set, KBase then asks for the rule to edit. Editing, adding, deleting, renaming, duplicating, clearing, and flushing rules are similar to the same operations for objects. The rules in a rule set are fired in the order in which they are displayed in the rule set.
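In practice a rule set contains several such rules covering complementary situations. The following companion rule is a purely hypothetical sketch (it is not part of the published example) showing how the same field-value predicates could handle the case of longer piles:

Editor for rule INSTALL_LONG_PILE
RULE SET : PILED_FOUNDATION
IF       : (GET("Type","pile") = "concrete" .OR. GET("Type","pile") = "steel")
           .AND. GETN("Length","pile") > 20_m
THEN     : PROMPT("You may have to install the piles in segments and splice them.")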

(3) Functions and the function library

dBase programming is widely known and easy. KBase exploits this by using the same programming language in its code: the user can use the same commands and functions as in dBase to build additional functions of his or her own. A function is like an object with the following fixed set of fields: "Parameters", "Private", "Body", and "Return". "Parameters" contains the values passed to the called function, technically the "formals" or the "arguments". "Private" is the declaration of local variables used in the body of the function; the scope of these variables is within the function and the other functions that it might call, i.e. its callees. "Body" contains a series of dBase commands and other functions to be executed when the function is called. The function returns the value of the expression or variable in the "Return" field.

Editing a function, again, is like editing an object. A template is displayed so that the user can edit its values:

Editor for the INSTALL_PILE_ACTIVITY function
PARAMETERS : crew_size, install_pile
PRIVATE    : activity_duration
BODY       : activity_duration = GETN("crew_size", install_pile) * GETN("efficiency", crew_size)
RETURN     : IIF(activity_duration > 0, GETN("Number","pile") / activity_duration, 0)

In the example above, "install_pile_activity" is passed "crew_size" and "install_pile" as arguments. Its body then computes "activity_duration", which is declared as a local variable. If "activity_duration" is positive, "install_pile_activity" returns the number of piles (to be installed) divided by "activity_duration"; otherwise, it returns 0.

With user-defined functions and a comprehensive function library, the user can enrich the knowledge-base environment. Rules can be more powerful because they can take arbitrary Boolean combinations of functions, and can execute complex functions as their actions. Here are two examples.

The "QUERY" function is used in a KBase application to ask the user about unknowns in the system. "QUERY" takes the following as its arguments:

(1) A question.
(2) A picture template which determines what the user can type. For example, "999" means that the KBase application will accept only 3 digits as the answer to the question.
(3) A valid-check expression. If this returns .T., then the answer is acceptable; if it returns .F., then it is not. The power of the KBase functionality is shown in the manner in which both menu-based and form-based interfaces can be cleanly integrated in the "QUERY" function. For a form-based interface, the user can just use a picture template. For a menu-based interface, the "ONE_OF" function may be used as the valid-check expression. The expression "ONE_OF('piled_foundation', 'mat_foundation', 'strip_foundation', 'footing')" produces four items in a menu so that the user may choose one of them.
(4) A string that tells the user where to find the answer to the question.
(5) A string that tells the user why this question is being asked.

The following is an example of the QUERY function:

QUERY("How would you rate the efficiency of your workers?", "",
      "ONE_OF('good','average','poor')",
      "Check with your site foremen.",
      "To compute the duration of pile installation.")

The second example is the "INFER" function. This system-provided function takes the following arguments: (1) a rule set to consider; and (2) a logical expression, that is, the goal. "INFER" runs all the rules in the rule set until the goal is achieved or it is unachievable, i.e. no rule in the rule set applies or, if any applies, it is not going to help. For instance, in the example shown below, the function call runs over all the rules in the rule set named "Piled_foundation". The goal expression requires that the activities of installing the piled foundation be known, and that the duration and precedence of each of the activities be known.

INFER("Piled_foundation",
      "KNOWN('Activities','foundation') .AND. KNOWN('Duration','activities') .AND. KNOWN('Precedence','activities')")

In the example above, the "INFER" function may be called, say, by a rule which detects when a building should have a piled foundation and which, when fired, calls "INFER" to find the details of installing the foundation.

KBase has a special function called "First_Function". This is the first function automatically called by the application program when it is started by a user; it starts the program running.

Editor for the FIRST_FUNCTION function
PARAMETERS :
PRIVATE    : mfile
BODY       : CLEAR
             @ 10,0 TEXT
                CICONSA
                CIVIL CONSTRUCTION SCHEDULE ADVISOR
                Copyright (1990) Chee-Kiong Soh
             ENDTEXT
             @ 23,0 SAY "At any point, you can type 'Esc' to interrupt."
             WAIT
             QUERY("What is the function of your building?")
             INFER("Goal", "Do_something")
             showresult()
             IF QUERY("Would you like to file these conclusions?", "", "ONE_OF('Yes','No')", "", "") = "Yes"
                mfile = QUERY("What is the file name to use?", "XXXXXXXX", "", "", "")
                IIF("" # mfile, showresult(mfile), .F.)
             ENDIF
RETURN     :

A typical "First_Function", as shown above, sets up the screen and asks a few initial questions. It then sets the inference engine running with an "INFER" function call. At the end, it calls the user-defined function "showresult", which shows the details of the recommended schedule for the construction of low-rise reinforced concrete buildings. It then gives the end-user* the option of saving the recommendations into a dBase file prior to activating the coupled external project management system, Hornet.21

* The user of KBase as a development tool is differentiated from the user of CICONSA as an application tool, by calling the former "user" and the latter "end-user".

While it may generally be sufficient to build customized functions in the KBase programming language, KBase has an open architecture which allows integration with user-written C and/or assembler code.

The following section briefly describes a developmental prototype knowledge-embedded database system, CICONSA, for assisting in the scheduling of building construction projects. Readers are referred to Ref. 22 for more details on the system.

CICONSA: CIVIL CONSTRUCTION SCHEDULE ADVISOR

The architecture of CICONSA is shown in Fig. 2. CICONSA attempts to mimic the behavior of a scheduling expert who originates building construction schedules. CICONSA, via an end-user interface, interactively solicits all the necessary factual input, preprocesses all this information, and then generates an output that can be utilized for scheduling. The development prototype has been designed to generate two types of schedule, namely the rough and the detailed schedules. A rough schedule is intended for quick estimates of the duration of activities (say, during tendering), while a detailed one is for more-accurate models and estimates (say, during construction).

CICONSA is implemented using KBase and is coupled to a commercial project management system, Hornet, via the dBase III Plus database management system. The construction activities, together with their estimated durations and precedence, are first generated by CICONSA. The output is then passed on to Hornet for the scheduling process, taking full advantage of its established algorithms and report forms. The dBase III Plus database system provides the necessary common interface for the transfer and storage of information between the constituent components of this integrated system. The implemented application system thus has easy access for the storage and retrieval of data, such as crew productivity data, stored in dBase

files/tables. Results of consultations with the application system can also be stored in dBase files; these can be read by Hornet via its mask and instruction files. This automatic transfer of input data releases the end-user from the burdensome task of manual input, which is prone to input error.

Overview of CICONSA

CICONSA's approach offers two main advantages: (1) it avoids the introduction of entirely new software, as programming in KBase is as easy as programming in dBase, a popular and commonly used package; and (2) there is no duplication of effort to redevelop the scheduling algorithms of a project planning package such as Hornet. The representation and manipulation of CICONSA's domain knowledge are briefly outlined below.

Fig. 2. Architecture of CICONSA. (The end-user consults CICONSA, whose knowledge bases comprise objects, rules and user-defined functions encoding factual, heuristic and algorithmic knowledge; dBase III Plus provides storage of productivity data and consultation results, and pre-processed data are transferred, via mask and instruction files, to Hornet's project management utilities for scheduling analysis, report generation, graphics, etc.)

Activity description

Activity description in CICONSA is represented in the following form: (action element) (object element). This form of representation has been used by other researchers,23,24 and has been shown to be effective in defining activities in a way that allows reasoning about their constituent elements. For example, the (action) elements of a "structural" activity are predefined internally as:

excavate   (if the object element is of foundation type)
form       (fixing formwork)
rebar      (fixing reinforcement)
*place     (placing concrete)
*cure      (curing of concrete)
*strip     (stripping of formwork)

* these three could be aggregated to form a single action.

The (object) elements of an activity are the components of a building:

column_footing
ground_beam
ground_slab
etc.

For example, the activity (place_column_footing) is the action of placing concrete on the object (column_footing). Descriptions of all the activities leading to the completion of a component can thus be defined. Aggregation of action elements and object elements can also be used to describe an activity definition at a higher level of abstraction. For instance, the activity (install_foundation) is an aggregation of the actions (excavate, form, rebar, etc.) and the objects (column_footing, ground_beam, etc.). One can thus include more constituent elements to describe grouped activities. This can better serve the end-user's need for different levels of reporting detail or analysis.
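As an illustration of how these elements combine (using the action and object lists above), the detailed activities for a single column footing would read: excavate_col_footing, form_col_footing, rebar_col_footing, place_col_footing, cure_col_footing and strip_col_footing, with the last three optionally aggregated into one action as noted above.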

Activity durations

The estimation of the duration of each identified activity is based on the quantity of work and the average productivity of the crew:

    d = (Q / p) × (f1, f2, etc.)

where
    d   = duration (unit of time)
    Q   = quantity (unit of work)
    p   = productivity of crew (unit of work / unit of time)
    f's = adjustment factors to be included.
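As a purely illustrative calculation (the figures below are assumed, not taken from CICONSA's productivity tables): if 24 column footings are to be concreted, the crew's productivity is 8 footings per day, and a single adjustment factor of 1.25 is applied for difficult site conditions, then

    d = (24 / 8) × 1.25 = 3.75 days.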

The productivity rates of work are stored in dBase III tables. By storing these productivity data in dBase, they can easily be retrieved and updated by the end-users, via CICONSA's interface or via dBase III directly. At the same time, the separation of this "active" knowledge from CICONSA's knowledge base is advantageous to the end-user, who can modify the productivity data to suit his/her own conditions, as these data are values derived from either standard sources25 or elicited heuristics. For example, productivity rates of work vary from company to company, depending on construction experience and available resources. The end-user may change these rates to reflect his/her own rates and, through repeated modifications, establish his/her own standards from past cases.
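The layout of such a productivity table is not specified in the paper; the following hypothetical sketch is given purely to illustrate the idea of keeping crew productivity rates in a dBase file outside the knowledge base (the field names, units and figures are all assumptions):

Activity           Crew_size   Output_per_day   Unit
excavate_footing   4           12               footings
place_concrete     6           30               m3
fix_rebar          5           1.5              tonnes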

Activity precedence

When all the activities leading to the construction of the facility have been inferred, and their durations estimated, encoded precedence rules infer the logical precedence among these activities. In this version of CICONSA, only rules related to physical relationships and technological dependencies have been implemented; other possible rules, such as resource constraints and preferential logic from past experience in the field, will be incorporated later.

As explained earlier, each activity description is uniquely defined to consist of (action) and (object) elements. These are stored in array form (convenient for symbolic processing) to facilitate knowledge manipulation on their precedence. First, the rules test whether two activities are of the same object type; if they are, the activities' (action) elements are compared. For example, (rebar) of activity A (rebar_beam) and (place) of activity B (place_beam) are compared; the rule on technological dependence (reinforcement must be fixed before concrete is placed) then dictates that activity A precedes activity B. Second, activities are tested on the (object) element if they are of different objects but of the same action type; physical rules on the construction sequence then apply.

To sequence the activities, it is necessary to infer all the activities that must precede any particular activity. Every pair of activities is tested and their relationship determined as either preceding or not preceding; if neither precedes the other, they must be parallel activities. To make practical use of this list, an algorithm has been written to further reduce all such preceding activities to those that are immediately preceding. The relationships in these precedence rules are made transparent to the end-user through a template. For example, the sequence of installation of column_footing, ground_beam and ground_slab is shown in the template below.

The following are the construction sequences inferred by the system. You may change them by typing over the suggested responses.

FOUNDATION
col_footing precedes ground_beam    Y
col_footing precedes ground_slab    Y
ground_beam precedes ground_slab    Y

The end-user can change this precedence logic via the template interface, by typing over the system's precedence. Optionally, the end-user can manually input the precedence between any pair of activities. At this point, a list of the preceding activities of all the identified activities has been built up.

Scheduling

As stated earlier, all scheduling computations for CICONSA are performed by the project management software, Hornet. All the planning results recommended by CICONSA are converted to an intermediate dBase file, in delimited format, as required by Hornet for the transfer. Hornet's input mask screen and instruction files are prepared and customized to read the file automatically and to store it in the Hornet project files. The end-user is then able to make use of all the scheduling facilities of Hornet. Prior to doing so, however, the end-user is at liberty to run as many consultations as he/she considers necessary to cover all the likely scenarios. The end-user can also choose not to run another consultation but to edit records of the recommended list of final results directly, via CICONSA's editing interface. A data-modification facility is also available in Hornet.

Interactive consultation

On starting CICONSA, an interactive dialogue is initiated to solicit input from the end-user. Via the interface, the end-user establishes the following input, from which the system infers the activities and their durations:

-- General data describing the type and size of the building, component details, and site conditions.
-- The trades and their crew sizes to be employed.
-- Construction methods: concrete placing and excavation methods. (Currently, only these two variations in technology are included in the knowledge base. Others, such as piling and forming methods, will be encoded at a later stage.)

For the detailed schedule analysis, more project-specific questions are prompted to the end-user than during the rough schedule analysis, where most data are inferred by the system. Only relevant questions are asked. For example, if the end-user's response to the prompt "Enter the type of schedule" is "rough", then the detailed questions on the numbers and dimensions of the various modules in the system are skipped; these quantities of work are instead inferred from the set of general data supplied by the end-user.


Where applicable, validity checks have been built in. For example, if the end-user's answer to the prompt "Enter the area of the overall size of the site" is smaller than the size of the building, then the remark "the overall size of the site cannot be smaller than the area of the building" is displayed on the screen. Such selective solicitation of relevant information, together with validity checks, is a desirable feature that projects the image of the "intelligent" behavior of a human expert.

In addition to soliciting planning parameters, each question has a "Help" menu which allows the end-user to ask the system why a particular question is asked and where to find the expected answer. For example, when prompted with a question on the height of a storey, the end-user is able to ask and learn from CICONSA's suggestion that the height of the storey can be obtained from the drawings, and that the normal range for a typical building is about 2.7 to 4.0 m. Through these prompted messages and explanations, CICONSA can also function as a computer-aided tutor for junior engineers and inexperienced end-users.

Based on the end-user's input during the interactive consultation, CICONSA's inference engine performs activity definitions and duration estimations for all the components. Inference is achieved via manipulation of the "static" factual knowledge stored in the object knowledge base, using the production rules and the user-defined functions encoded in the "active" knowledge base. The "static" knowledge refers to the information required to describe the physical system and/or the assumed states addressed in a problem, such as the physical components of a building and their dimensions and geometry. On the other hand, "active" knowledge, such as the choice of construction methods in concreting, and gang sizes, forms the procedural knowledge that instructs the system to change the state of the problem considered.

The scope of the current version of CICONSA, as well as its hierarchical breakdown of work into sub-units, is shown in Fig. 3. Currently, CICONSA can only deal with the simple construction activities of low-rise residential concrete frame buildings. Different functional types of buildings, additional components of a building such as roofing, interior partitions and finishes, and possibly "non-structural" activities such as plumbing, H.V.A.C., etc., will be considered in future development.
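The Help information described above maps naturally onto the fourth and fifth arguments of the QUERY function introduced earlier. The following is a hypothetical sketch of how the storey-height question might be posed; the picture template, the empty valid-check expression and the exact wording of the explanation strings are assumptions, not taken from CICONSA's knowledge base:

QUERY("What is the height of a storey (in metres)?", "9.9", "",
      "The storey height can be obtained from the drawings; for a typical building it is about 2.7 to 4.0 m.",
      "To estimate the quantities of work for the structural frame.")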

At the end, the end-user may view the results of the consultation on screen or ask CICONSA to write them to a file. An example of the screen output is shown below:

CONSULTATION RESULTS

DURATION FOR INSTALL_COL_FOOTING
Duration for excavating  = 4.0 days
Duration for forming     = 1.5 days
Duration for rebar       = 3.0 days
Duration for concreting  = 1.0 days
Duration for curing      = 7.0 days
Duration for stripping   = 0.5 days

Fig. 3. CICONSA's structural operation. (The hierarchical breakdown runs from the building, through the rough or detailed analysis and the building type, down to components such as the foundation, frame and exterior wall, and finally to sub-units such as the column footing, ground beam and ground slab.)

At this point, the end-user can decide whether to repeat the consultation with different input, quit, or proceed to infer the precedences. If precedences are required, the "precedence" function is activated to infer the precedence among the various activities by reasoning on the (action) and (object) elements of every pair of activities, using the methodology described earlier. When the operations of this function have been completed, the results are written to a dBase database file, "pa.dbf", where all the precedents of the activities are recorded. To make use of this list, "pa.dbf" is reduced by an algorithmic function to "ipa.dbf", where only the immediately preceding activities are stored. For instance, "excavate_col_footing" cannot be an immediate precedent of "rebar_col_footing", since it is a precedent of "form_col_footing", which also precedes "rebar_col_footing".

Finally, the "Hornet_form" function is activated to create "act_list.dbf", making use of the precedence relationships in "ipa.dbf" and the estimated durations. This is then converted to an intermediate file in delimited form using the dBase command "COPY TO act_list.txt TYPE DELIMITED". The end-user can then exit from CICONSA and start Hornet to analyse the results obtained with CICONSA.

This intermediate file, prepared in the appropriate format, is then transferred to Hornet via Hornet's input mask and instruction files. When the instruction file is executed in Hornet, it displays the customized input mask, followed by Hornet's own activity update screen as the data are read in; this continues until the end of the data file is reached. The end-user is then ready to make use of the scheduling facility of Hornet to generate schedules and reports.
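As an illustration of the reduction from "pa.dbf" to "ipa.dbf" described above, consider the column-footing actions alone and assume, purely for illustration, that they form the simple linear chain excavate, form, rebar, place, cure, strip (the _col_footing suffix is omitted for brevity):

Activity              Precedents in "pa.dbf"                Immediate precedent in "ipa.dbf"
form_col_footing      excavate                              excavate
rebar_col_footing     excavate, form                        form
place_col_footing     excavate, form, rebar                 rebar
cure_col_footing      excavate, form, rebar, place          place
strip_col_footing     excavate, form, rebar, place, cure    cure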

CONCLUSION

This paper has introduced the approach of embedding knowledge directly in database systems, and has described KBase to support the hypothesis that knowledge-embedded database systems are not only desirable but are surprisingly easy to understand and therefore to develop and use. Given an appropriate domain, this approach should give higher mileage than conventional development efforts. Knowledge-based systems which require large databases may be built in a shorter time and should be easier to maintain. Hence, this knowledge-embedded database system approach may admit the inexpensive duplication, distribution, and preservation of expertise and data, and training in their use. Many copies of the system may be made and distributed to areas where the expertise is lacking or access to information is difficult. In business terms, this may reduce the exposure of corporations to the shortage of expertise and to tedious data retrieval.

It is important to emphasize again that the idea of knowledge-embedded database systems is not confined to dBase systems. Indeed, it is believed that this idea can be applied to the much more general class of relational database systems. A future research direction will be to investigate how one might generalize the ideas presented in this paper to the much wider scope of relational databases.

REFERENCES

1. Czejdo B. et al. TANGUY: integrating the database, rule-based and object-oriented paradigms. Proc. 2nd Int. Symp. on Database Systems for Advanced Applications, Japan (1991).


2. Mylopoulos J. and Brodie M. L. (Eds) Readings in Artificial Intelligence and Databases. Morgan Kaufmann, California, U.S.A. (1988).
3. Kim W. and Lochovsky F. H. (Eds) Object-Oriented Concepts, Databases, and Applications. ACM Press, New York (1989).
4. Zdonik S. B. and Maier D. (Eds) Readings in Object-Oriented Database Systems. Morgan Kaufmann, California, U.S.A. (1990).
5. Ashton-Tate, dBase III Plus, Version 1.10, U.S.A. (1986).
6. Ullman J. D. Principles of Database Systems, 2nd edn. Computer Science Press, Rockville (1982).
7. Winograd T. Frame representations and the declarative/procedural controversy. In Representation and Understanding (Bobrow D. G. and Collins A., Eds). Academic Press, New York (1975).
8. Codd E. F. A relational model of data for large shared data banks. Commun. ACM 13, 377-387 (1970).
9. Bobrow D. G. and Winograd T. An overview of KRL, a knowledge representation language. Cognitive Sci. 1, (1), 3-46 (1977).
10. Kowalski R. Logic for Problem Solving. North-Holland, Amsterdam (1979).
11. Al-Saadoun S. S. and Arora J. S. Interactive design optimization of framed structures. ASCE J. Comput. Civil Engng 3, (1), 60-74 (1989).
12. Intellicorp Inc., KEE-Connection, U.S.A. (1987).
13. Hammer M. and McLeod D. Database description with SDM: a semantic database model. ACM Trans. Database Syst. 6, (3), 351-386 (1981).
14. Soh C. K., Soh A. K. and Lai K. Y. KBASE: a customizable tool for building dBase compatible knowledge-based systems. Adv. Engng Software 11, (3), 136-148 (1989).
15. Nantucket Corp., The Clipper Compiler, U.S.A. (1986).
16. Kernighan B. W. and Ritchie D. M. The C Programming Language. Prentice-Hall, Englewood Cliffs, NJ (1978).
17. Kane G. MIPS RISC Architecture. Prentice-Hall, Englewood Cliffs, NJ (1988).
18. Patterson D. A. Reduced instruction set computers. Commun. ACM, pp. 8-21 (1985).
19. Tabak D. Reduced Instruction Set Computer--RISC Architecture. Research Studies Press, Wiley (1987).
20. Davis R. and Buchanan B. G. Meta-level knowledge: overview and applications. Proc. IJCAI-77, pp. 920-927 (1977).
21. Claremont, HORNET: Project Management System for Microcomputers. Claremont Controls Ltd, U.K. (1984).
22. Wong W. P., Soh C. K. and Phang K. W. An integrated expert system to assist in the generation of building construction schedules. Proc. 2nd IES Information Technol. Conf., Vol. 1, pp. 65-76, Singapore (1991).
23. Marshall G., Barber T. J. and Boardman J. T. Methodology for modelling a project management control environment. IEE Proc. pp. 287-300 (1987).
24. Darwiche A. et al. OARPLAN: generating project plans by reasoning about objects, actions and resources. J. Artif. Intell. Engng Design, Analysis Manufacturing 2, (3), 169-181 (1988).
25. Bentley J. I. W. Construction Tendering and Estimation. E. & F.N. Spon, London (1987).