
Expert Systems With Applications, Vol. 2, pp. 251-258, 1991

0957-4174/91 $3.00 + .00 © 1991 Pergamon Press plc

Printed in the USA.

Towards Standards for the Validation of Expert Systems

PATRICK R. HARRISON AND P. ANN RATCLIFFE*

U.S. Naval Academy, Annapolis, MD

Abstract--The paper provides a basis for the standardization of the validation of knowledge-based systems. The place of validation within the development process is discussed and a model is proposed. Knowledge-based system validation problems are decomposed using sequences of independent and dependent generic tasks. A model for validation of KBS causal processes as well as performance outcomes is presented. Practical applications to real systems are described.

1. PURPOSE

THE PURPOSE OF this paper is to provide a basis for standardizing the validation of expert or knowledge-based systems (KBS). This does not imply a lockstep set of metrics for measuring performance, because the range of possibilities is much too large. It does mean a set of conceptual standards for placing validation in proper perspective in the development process. Standards are discussed for the decomposition of validation tasks and for determining the set of measurements that define a minimal validation effort. A model for validating KBS processes as well as outcomes is also presented. Examples from several KBS projects are used to demonstrate the practical use of these concepts in validating actual systems. The paper builds upon ideas introduced in Harrison (1989). Background for the general problem of expert system validation, and for the validation of software in general, is discussed in a number of references (Adrion, Branstad, & Cherniavsky, 1982; Buchanan & Shortliffe, 1984; Culbert, Riley, & Savely, 1989; Davis & Lenat, 1982; Geissman & Schultz, 1988; Guttag, Horowitz, & Musser, 1978; Guttag, 1980; Hayes-Roth, Waterman, & Lenat, 1983; O'Keefe, Balci, & Smith, 1987; Shortliffe, 1976; Zualkernan, Tsai, & Volovik, 1986).

2. THE GESTALT

The paper takes as its point of departure the idea that all knowledge-based systems (KBS) can be viewed as describing models of systems in the world (Clancey, 1983, 1985, 1989; Clancey & Shortliffe, 1984; Wilkins, Clancey, & Buchanan, 1987).

* Now with Jorge Scientific Corporation, Arlington, VA. Requests for reprints should be sent to Patrick R. Harrison, Computer Science Department, U.S. Naval Academy, Annapolis, MD 21402.

The KBS contains a metalanguage, a set of possible instantiations, a set of interfaces to other systems in the world, and a set of constraints and defaults that define its limitations. By definition, no model is complete. Therefore, the KBS is not only a model of a system in the world, but also a model of how it was abstracted from the world. It must have constraints and default characteristics that define how it deals with its own abstraction.

The KBS as a model of a system in the world has two attributes that are particularly important for this approach to validation. First, the KBS can be classified based on the degree to which the model explicitly describes the underlying causality of the processes it models. Second, the KBS inferencing process can be decomposed in a meaningful way into smaller subprocesses or tasks for the purpose of validation.

A simple rule-based expert system that contains rules representing compiled knowledge would be classified as a model that does not explicitly describe the underlying causality of the processes being modeled. Causality is implicit in a collection of rules (Chandrasekaran, 1983). The KBS implements how to make it work but not why it works. A multiple fault diagnosis system that explicitly describes structure and function for each component, as well as causal relations between components, would be at the opposite end of the spectrum of models (Davis, 1984; Hamscher, 1988). It makes the underlying causal structure explicit. The validation process for KBS systems at opposite ends of this continuum is different: the degree to which they can be validated and the kinds of validation techniques that are applicable differ (Clancey, 1989; White & Frederiksen, 1990).

A KBS can be decomposed by tasks, called generic tasks by Clancey (1985). The behavior of the KBS can be thought of in terms of the application of a sequence of generic tasks to a model of a system in the world.


The application of the task or a set of tasks generates a temporally ordered sequence of model instantiations and definable outcomes. Describing the behavior of a KBS in terms of a sequence of tasks provides a basis for decomposing the KBS into more manageable modules. The validated modules then provide a foundation for overall validation in terms of both process and outcomes. This does not deny the importance of interaction between components; it simply provides a conceptual building block so that validation can proceed in a reasonable way from individual modules to the complete system. It provides a basis for incremental validation.

3. EXAMPLES

Two systems currently being developed by the authors are called VEG and FIRAS (Kimes, Harrison, & Ratcliffe, 1991). These systems represent process models at opposite ends of the classification continuum.

At one end is a typical classification problem implemented as a rule-based system with causal relations implicit in rules or sets of rules. An example is VEG, which solves a remote sensing problem using reflectance data from a satellite. Raw reflectance data are evaluated and used to generate a set of data abstractions called strings. Based on the strings and a set of tentative assertions about the strings, a set of extraction techniques is generated. These techniques are applied, generating a restricted reflectance data set that is subjected to further analysis and refinement using additional heuristics and rules.

At the other end is a model-based system that explicitly describes causality in terms of the underlying structural and functional relationships in a system. An example is FIRAS, a design aid that uses heuristic knowledge to configure a hypothetical ship for active and passive fire fighting given a risk posture and a structural/functional database. Using heuristic knowledge, costs and benefits are calculated from the application model for different risk postures.

With both VEG and FIRAS, the practical limits of what can be accomplished in terms of validation are defined by the limits inherent in the qualitative process models used for representation. A KBS that explicitly represents causality allows for a more complete validation effort.

4. ASSUMPTIONS

It is assumed that a standardized hybrid shell such as KEE will provide the inference engines and general control facilities for the development of the knowledge base. The practical significance of these shells is that they provide a great deal of representational flexibility. It will be assumed that the shell works in a correct fashion, as described in its documentation.

Therefore, from a validation point of view, the shell can be considered constant while focus is on the development and refinement of the knowledge base, and thus on the model of the world the KB represents. In general, a hybrid environment such as KEE, ART, or KNOWLEDGE CRAFT represents the best alternative for KBS prototyping. Any one of the three is preferable to a system developed from scratch, because it is very difficult to write interfaces, inference engines, support tools, and object systems that are valid; the validation problem becomes much more difficult to manage as a consequence.

It is also assumed that objects in the system are represented in frame-like structures. Frames can encapsulate functional operations, attribute-value relations, network relations, object history, and so on. Conceptually, an object frame abstracts the complexity and clutter of objects in the world. Frames are models of objects in the world. They provide a data structure that can manage a complex concept easily. A number of articles provide information for evaluating tools (Richer, 1986; Rothenberg, Paul, Kameny, Kipps, & Swenson, 1987; Szolovits, 1987). Kline and Dolins (1989) discuss the relationship between KBS design decisions and tool choice.
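To make the frame assumption concrete, the following minimal sketch shows a frame-like structure in Python. It is purely illustrative: the class name, the slot names, and the ISA-style default inheritance are our own simplifications, not the object system of KEE, ART, or KNOWLEDGE CRAFT.

class Frame:
    """A frame: named slots holding values, relations, and history."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # ISA link to a more general frame
        self.slots = {}           # attribute-value relations
        self.history = []         # object history, e.g. slot updates

    def set(self, slot, value):
        self.history.append((slot, value))
        self.slots[slot] = value

    def get(self, slot):
        # Local value first, then inherit a default along the ISA chain.
        if slot in self.slots:
            return self.slots[slot]
        if self.parent is not None:
            return self.parent.get(slot)
        raise KeyError(slot)

# Usage: a specific compartment inherits a default from a generic frame.
compartment = Frame("compartment")
compartment.set("fire.spread.risk", "low")
electronics_room = Frame("electronics.room", parent=compartment)
electronics_room.set("contains", ["radar.module"])
print(electronics_room.get("fire.spread.risk"))   # inherited: "low"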

5. DEFINITIONS

Validation is the collection of measurement operations that ensure three things:
1. the system behaves in a consistent manner;
2. the system does what it is supposed to do;
3. the system does what it is supposed to do in context.

Definitions of validation often include the concept of verification. Verification is defined as the formal analysis of the KBS to determine whether it has implemented its specification. Most KB systems start with requirements that define a general direction and, hopefully, some measurement operations on the system. Specifications evolve within the prototyping loop and are often limited to functional statements. In view of this, and in view of the practical limitations of formal verification techniques, verification is not a focus in this paper. Some of the procedures discussed in the section on consistency are verification techniques.

Consistency addresses the formal and internal qualities of the KB. These include the formal properties of the KB that ensure completeness (Cragun & Steudel, 1987; Nguyen, 1987; Nguyen, Perkins, Laffey, & Pecora, 1985, 1987; Stachowitz & Combs, 1987; Stachowitz, Combs, & Chang, 1987a, 1987b; Stachowitz, Chang, Stock, & Combs, 1987; Stachowitz & Chang, 1990; Suwa, Scott, & Shortliffe, 1982), minimality, and sufficiency (Genesereth & Nilsson, 1987; Reggia, Nau, & Wang, 1984). Consistency checks are included in the standards for validation. Consistency in the knowledge bases must be assumed before the validation of application models or performance can proceed. It must be assumed that problems in subsumption, cycles, redundancy, minimality, completeness, and sufficiency have been addressed. Failure to do this confounds errors of inconsistency with errors of performance and model validity.

Internal consistency includes both logical consistency and behavioral consistency. The two concepts are distinct, though one necessarily precedes the other. Logical consistency refers to the formal qualities of the evolving KB mentioned above. The work of Stachowitz and others has detailed the considerations necessary to ensure logical consistency. Figure 1 summarizes the consistency checks that can be made on individual rules (Stachowitz & Combs, 1987; Stachowitz et al., 1987a, 1987b; Stachowitz, Chang et al., 1987; Stachowitz & Chang, 1990; Suwa, Scott, & Shortliffe, 1982); see Harrison (1989) for details. These tests can also be applied to chains of rules (Nguyen, 1987; Nguyen et al., 1985, 1987). Ginsberg, Weiss, and Politakis (1988) extended these ideas further, incorporating some ideas borrowed from the truth maintenance literature, to detect all the inconsistencies and redundancies that exist in a KB.

The consistency checks summarized in Figure 1 are purely syntactic. Stachowitz has extended these ideas to incorporate semantic considerations. The Expert Systems Validation Associate (EVA) that he is developing at Lockheed incorporates all the above checks, but it also checks for reachability, redundancy, cycles, inconsistencies, and conflicts caused by generalization, synonymy, and compatibility. For example, given that STUDENT ISA PERSON, EVA will detect that the rule PERSON(x) ⇒ MORTAL(x) subsumes the rule STUDENT(x) ⇒ MORTAL(x).

Behavioral consistency refers to behavioral invariance (also called sensitivity) under varying conditions that should produce the same results. For example, if data is input in a slightly different order, or if two test cases differ only in ways that are not pertinent to the hypotheses being evaluated, the results should be basically the same. If the KBS cannot be relied upon to produce the same outcomes when given a particular class of problem that should produce the same solution class, then the system is inconsistent. Though this may seem an obvious point, it is an important one. Order effects and a variety of side effects are common problems that produce undesirable performance variation.

Redundancy: A & B ⇒ C is equivalent to B & A ⇒ C
Conflict: A & B ⇒ C conflicts with A & B ⇒ ¬C
Subsumption: A ⇒ C subsumes A & B ⇒ C
Unnecessary conditions: B is unnecessary if A & B ⇒ C and A & ¬B ⇒ C
Unreachable conditions: in A ⇒ C, C is unreachable if A does not match a fact or the RHS of another rule
Cycles: A ⇒ B, B ⇒ C, C ⇒ A

FIGURE 1. Consistency checks on a knowledge base.
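Because these checks are purely syntactic, they are easy to mechanize. The following sketch implements the redundancy, conflict, subsumption, and cycle checks of Figure 1 in Python, under the simplifying assumption that a rule is a pair (antecedents, consequent) of atomic propositions with negation written as a leading "~"; it is illustrative only, not the EVA implementation.

def negate(p):
    return p[1:] if p.startswith("~") else "~" + p

def redundant(r1, r2):
    # A & B => C is equivalent to B & A => C
    return r1 != r2 and set(r1[0]) == set(r2[0]) and r1[1] == r2[1]

def conflicting(r1, r2):
    # A & B => C conflicts with A & B => ~C
    return set(r1[0]) == set(r2[0]) and r1[1] == negate(r2[1])

def subsumes(r1, r2):
    # A => C subsumes A & B => C
    return set(r1[0]) < set(r2[0]) and r1[1] == r2[1]

def has_cycle(rules):
    # A => B, B => C, C => A: search the implication graph for a loop.
    graph = {}
    for antecedents, consequent in rules:
        for a in antecedents:
            graph.setdefault(a, set()).add(consequent)
    def reachable(node, goal, seen):
        return any(nxt == goal or
                   (nxt not in seen and reachable(nxt, goal, seen | {nxt}))
                   for nxt in graph.get(node, ()))
    return any(reachable(p, p, {p}) for p in graph)

rules = [(("A", "B"), "C"), (("B", "A"), "C"), (("A",), "C"), (("C",), "A")]
print(redundant(rules[0], rules[1]))   # True
print(subsumes(rules[2], rules[0]))    # True
print(has_cycle(rules))                # True: A => C, C => A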

The determination that the system is doing what it is supposed to do includes both validation based on requirements and validation based on qualitative measurements of the process models the KBS develops. Uncertainty is subsumed in the process model validation. Requirements from this approach are called competency-based, to emphasize that they must be described as explicit and measurable competencies that the KBS should be able to demonstrate. Measurable implies that either sui generis or constructed validation criteria can be generated from the requirements. If a requirement/specification cannot be operationalized in terms of measurement operations, then it cannot be validated in any substantial way, and the client must rely on face validity in evaluating that particular requirement/specification. It is important to have an idea of what might fall into this category early in development, since the inability to define measurement operations on a requirement/specification might be critical to the decision to continue with the project.

Validation in context, which is also called certification, refers to the validation of the KBS in its operational environment. The KBS is always part of a larger system in which it must be certified. In our view, certification of a KBS is done as part of a normal software-engineering process. It is not done as part of a prototyping effort.
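Behavioral consistency, as defined above, can be probed mechanically by rerunning the same case with its input facts permuted and checking that the conclusion is invariant. The sketch below assumes a hypothetical entry point, run_kbs, that maps an ordered list of facts to a conclusion; the toy system shown is invented for illustration.

import itertools

def order_invariant(run_kbs, facts, max_permutations=24):
    """True if run_kbs yields one conclusion for every tested ordering."""
    outcomes = set()
    for perm in itertools.islice(itertools.permutations(facts),
                                 max_permutations):
        outcomes.add(run_kbs(list(perm)))
    return len(outcomes) == 1

# Usage with a toy system that (correctly) ignores input order:
def toy_kbs(facts):
    return "fire.risk.high" if "false.overhead" in facts else "fire.risk.low"

print(order_invariant(toy_kbs, ["false.overhead", "electronic", "occupied"]))
# True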

6. THE PLACE OF VALIDATION IN THE DEVELOPMENT PROCESS

Two things must be done to put the validation of KB systems in perspective. First, a working model for the software development of a KBS is needed; this is provided in this paper as a working concept. Second, means are needed for reducing the complexity of the validation process by decomposing the problem into conceptually distinct parts that can be looked at more clearly and completely.

Figure 2 shows a general model for KBS development that uses a prototyping loop to develop the knowledge base. The assumption in this approach is that the purpose of the prototyping loop is to define a functional specification for the KBS. The KBS taken outside of the prototyping loop provides a partially validated, functional specification that can be revalidated against itself as the KBS is integrated into a general software development model. This recognizes that a good deal of what comprises a finished KBS product is not KBS technology but good software-engineering technology. The development of interfaces and database requirements are examples.


REQUIREMENTS & NEEDS ANALYSIS
  Define Competencies & Measurement Operations
PROTOTYPING (SPECIFICATION) LOOP
  LOOP
    Knowledge Acquisition
    Knowledge Level Analysis & Organization
    Domain & Control Knowledge Representation
    Validate Internal & Formal Properties
  UNTIL TESTABLE LEVEL OF COMPLETENESS ACHIEVED IN KB
  Inference Engine Design & Integration
  Validation and Competency Testing
  Modification, Augmentation and Assessment of Requirements
UNTIL PROTOTYPE IS COMPLETE
EMBED KBS IN LIFE-CYCLE MANAGEMENT PROCESS
  Validation of KBS in Context

FIGURE 2. General model for KBS development.

The requirements and needs analysis step includes the definition of competencies and measurement operations that provide a basis for a partial validation of the evolving KBS. Criteria for validation and the definition of measurement operations on outcomes or model parameters will, of course, evolve as the system is developed. For this reason, an outer loop is provided that includes a modification, augmentation, and assessment of requirements step. This recognizes that a problem in the validation step could refer to a KB problem or to a problem in the validation requirement itself. This step allows the validation process to have a developmental component. When the prototyping loop is completed, the validation process for the KBS becomes part of the KBS definition. It provides a basis for validation of the KBS in the larger system context, and it provides a basis for continued validation in the face of later development and maintenance operations on the system.

The prototyping phase contains two loops. The inner loop provides for the acquisition and development of knowledge. It separates analysis at the knowledge level from implementation (Newell, 1980, 1982). It also requires that the internal consistency of the KB be addressed prior to validation of the KBS; a failure to do this confounds issues of consistency with issues of validity. The outer loop requires that the knowledge be represented and integrated into the development environment before validation of the KBS proceeds. Validation proceeds in several steps: validate the KB and then validate the KBS; rethink the validation process in view of requirements; modify or augment requirements, competencies, or measurement operations as necessary and proceed. It is assumed that the validation process will define the completion of the prototyping loop.

Validation in context is noted as part of the life cycle. This does not imply that context is ignored within the prototyping loop. Context is always important and should be included as much as possible in the validation process. The implication is that full validation of the KBS must eventually be viewed within the operational context and within the larger system of which it is part. From our view, this is always the case. Even a product that is developed strictly within a hybrid environment such as KEE includes interfaces, database hooks, and so on that are subject to standard software-engineering practice. It is within this wider context that the KBS must demonstrate its practicality. Harrison (1989) discusses this in detail.



7. GENERIC TASKS AS A DECOMPOSITION PRINCIPLE

KB systems use many kinds of knowledge to solve a problem. The problem-solving behavior can be likewise complex, involving a sequence of distinct tasks. The idea of generic tasks provides a means of describing the inference process for a complex KBS in terms of a sequence of operations applied to the KBS. It provides a basis for decomposing a KBS into smaller, conceptually distinct processes with definable outcomes. This is essential for managing validation studies. One can meaningfully ask, "if the system identifies, predicts, and then controls, how well does it do each of these conceptually distinct tasks?" Generic tasks describe the plan for solving the problem. Clancey (1985) organized the generic tasks in the manner shown in Figure 3 and proposed that a large number of problems can be described using this organization.

1. construct (constructed criteria such as judgments)
   a. specify
   b. design
      1. configure
      2. plan
   c. assemble
      1. modify
2. interpret (sui generis criteria)
   a. identify
      1. monitor
      2. diagnose
   b. predict
   c. control

FIGURE 3. Organization of generic tasks.
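One way to exploit this decomposition for incremental validation is to pair each generic task with its own outcome check and validate the chain task by task, as in the sketch below. The task bodies and checks are placeholders that echo the IDENTIFY and PREDICT steps of the VEG sequence described below; they are not VEG code.

def identify(raw):
    # Data abstraction: collapse raw readings into distinct "strings".
    return {"strings": sorted(set(raw))}

def predict(model):
    # Select candidate numeric techniques for the abstractions.
    return dict(model, techniques=["t1", "t2"])

def check_identify(out):
    return bool(out["strings"])            # at least one abstraction

def check_predict(out):
    return len(out["techniques"]) > 0      # at least one technique

pipeline = [(identify, check_identify), (predict, check_predict)]

state = ["r3", "r1", "r1"]                 # toy reflectance readings
for task, check in pipeline:
    state = task(state)
    assert check(state), f"validation failed at {task.__name__}"
print("all task-level validations passed:", state)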

Standards for Expert Systems Validation MODIFY, PLAN, and CONFIGURE. Use raw reflectance data from a satellite to IDENTIFY string data and an assertion set for each string. Use the assertion set, the string data, and a rulebase to PREDICT numeric techniques for the complete analysis of the string data abstractions. MODIFY data set to create restricted data set. Test N ranked techniques for accuracy and generate execution PLAN. C O N F I G U R E vegetation canopy characteristics from the refined data outcomes from N executed techniques. FIRAS ASSEMBLES an abstracted ship model from structural and functional data. This model, along with a rulebase and a complex set of constraints, is used to CONFIGURE a fire-fighting system. The configured model is then used to PREDICT the costs and benefits for the system under the risk model. Obviously the sequence of tasks in a KBS could change or be elaborated in a number of ways as the system is developed. In fact, an advantage of task analysis is that it offers an abstracted view of system evolution. Also note that the description o f a KBS in terms of generic tasks does not tell us whether or not the KBS explicitly models causal relations. VEG currently does not explicitly model causality whereas FIRAS does. FIRAS can explain the rationale for a configuration at a detailed level. It uses basic knowledge about fires, metals, cables, paints, ship configurations (pertinent to fire development and spread), and compartment attributes. For both VEG and FIRAS, generic task analysis provides a means for decomposing the system into independent and dependent subprocesses. These can be validated either in parallel or sequentially before the overall validation effort is completed. 8. P R O C E S S M O D E L S Clancey (1985) makes a clear distinction between the problem to be solved and the method used to solve it. One method is called heuristic classification. It is described in terms of three components; data to data abstraction, solution abstraction to specific solution, and the heuristic mapping from data abstraction to solution abstraction. The basic idea of the heuristic classification method is that data are used to define a data abstraction such as typicality or a data class. Heuristics are then used to map this into a solution class, which is then refined by using associations between data and solution in a nonhierarchical opportunistic way. Later Clancey (1989) extended this concept to include simulation process models that define a continuum of KB systems that implement different causal models. Figure 4 shows an organization of KBS process models based on these ideas. The figure organizes elements from Clancey (1989). The models are divided into two large groupings. The heuristic classification models are rule-based systems that do not represent

                 KBS PROCESS MODELS
                /                  \
         HEURISTIC              SIMULATION
       CLASSIFICATION          /          \
                           STATE .... STRUCTURE/FUNCTION

FIGURE 4. The organization of process models.

The heuristic classification models are rule-based systems that do not represent explicitly the underlying causality of the system. The simulation models range from state transition networks to full structure/function descriptions and are intended to represent a continuum of increasingly explicit causal representations.

Process models provide a standard for determining the overall validation approach for a KBS. The decomposition principle based on generic tasks provides a basis for organizing a KBS into a sequence of generic subprocesses. This suggests a validation model and a set of principles for defining more clearly exactly what validation of a particular KBS means and how it might be accomplished. It also provides a basis for defining classes of measurement operations that are likely to be needed for validation, and constraints on validation that might generalize to particular process models or to specific generic tasks.

VEG represents a straightforward heuristic classification problem. Raw reflectance data are used to generate a data abstraction in the form of assertions and strings. The strings and assertions are used to identify a set of numeric models. The numeric models are applied to the strings, producing a refined solution set. This solution set is further refined in terms of various error terms, producing a very restricted and refined reflectance database for making assertions about vegetation canopy characteristics. Figure 5 shows a rule from VEG.

FIRAS, on the other hand, would be classified as a simulation model. It explicitly builds an abstracted model of the ship system that contains structural/functional relations in the form of object descriptions, object relations, and causal rules. For example, if a compartment has a false overhead, then the probability of fire spread to adjacent compartments is very high. Figure 6 shows some FIRAS rules.


(TECHNIQUE-RULE
  (IF (THE ?SAMPLE IS IN CLASS TARGET.DATA)
      (THE ?SAMPLE IS 2 STRINGS)
      (THE AZIMUTH-OF-THE-DISTINCT-STRINGS IS ?AZIMUTH)
      (LISP (BOTH-STRINGS-ARE-HALF ?SAMPLE ?AZIMUTH)))
  (THEN (TRY TECHNIQUE-2HALF-STRINGS)))

FIGURE 5. A sample rule from VEG.
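The heuristic classification method behind such rules can be caricatured in a few lines: abstract the data, map the abstraction to a solution class through a heuristic association, and refine the class against the raw data. In this sketch the abstraction rule, the mapping table, and the threshold are invented for illustration; they are not VEG's knowledge base.

def abstract_data(reflectances):
    # Data -> data abstraction: a qualitative shape of the samples.
    spread = max(reflectances) - min(reflectances)
    return "flat.string" if spread < 0.1 else "peaked.string"

HEURISTIC_MAP = {                  # data abstraction -> solution class
    "flat.string": "sparse.canopy",
    "peaked.string": "dense.canopy",
}

def refine(solution_class, reflectances):
    # Solution abstraction -> specific solution, using the raw data again.
    mean = sum(reflectances) / len(reflectances)
    return f"{solution_class} (mean reflectance {mean:.2f})"

data = [0.31, 0.29, 0.33]
print(refine(HEURISTIC_MAP[abstract_data(data)], data))
# sparse.canopy (mean reflectance 0.31)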

9. VALIDATION CRITERIA

Generic tasks provide a means of decomposing a KB system into conceptual units for the purpose of validation. The generic task concept carries with it the idea of a general outcome: assemble, plan, identify, and so on. An overarching general outcome provides a basis for one aspect of validation, namely developing criteria for measuring predictive validity. If the task falls into the general category construct, chances are the criteria will have to be constructed and will only indirectly measure the competency of the KB relative to the task. If the task falls into the general category interpret, it is more likely to have associated with it directly measurable, or sui generis, criteria. The latter general category will therefore be easier to validate. Obviously, the general outcomes must be elaborated into clear measurement operations.

10. COMPETENCY-BASED CRITERIA

A competency is a behavior defined by its measurement operations viewed within an environmental context (Borgida, Greenspan, & Mylopoulos, 1985; Brachman & Levesque, 1982, 1985). Competencies define operations and measurements on outcomes and on KBS process model applications. Competency-based criteria emphasize both the functionality the system must demonstrate and the constraints under which it must be demonstrated. In VEG, the system must be able to process reflectance data, producing a self-consistent set of conclusions about the vegetation canopy such that error terms have certain characteristics. In FIRAS, the system must be able to build fire systems for existing platforms that agree with what was actually done on those platforms. It must also work within a severe set of constraints defined by ship-building codes. In addition, it must agree with expert judgments as to what the relationship between risk posture and cost should be.

Waters (1988) discusses validation using constraint models. He uses a very limited set of fault diagnosis problems to illustrate the basic concept of constraint propagation. He then goes on to discuss type checking and automatic system construction as two kinds of validation activity that can be accomplished using constraints and constraint propagation. For type checking, consider each function call, procedure call, or operation linked by input and output ports. Type is propagated throughout the system using the links and the transfer functions across nodes. Automatic program construction models data objects and processes.

Data objects represent units of the system in terms of construction. Process nodes represent operations on these data objects. Process nodes have input and output links representing data objects created or used by the process. Time dependencies in the network represent one kind of constraint that can be propagated. The importance of the article is that it recognizes the usefulness of constraint modeling as a means of defining model validation criteria and measurement operations. Davis (1984) and Dechter (1990) provide additional ideas for the development of constraint models.

Competency-based validation implies that outcomes have to be achieved within constraints; competency must include both outcomes and constraints. The importance of this is that it ties together the causal model with task outcomes to provide a comprehensive validation approach. Uncertainty, from this view, is evaluated in terms of the behavior of the system with respect to constraints: an underconstrained model could behave in a variable or uncertain way. Harrison and Ratcliffe (1990) discuss competency-based validation and criterion development in detail.
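Waters' type-checking idea can be sketched as a small propagation loop over a network of process nodes joined by ports: each node's transfer function consumes the types posted on its input ports and posts a type on its output port. The node names, port names, and types below are hypothetical.

nodes = {
    # node: (input ports, output port, transfer function over types)
    "read.sensor": ((), "raw",
                    lambda: "reflectance.vector"),
    "make.strings": (("raw",), "strings",
                     lambda t: "string.set"
                     if t == "reflectance.vector" else "TYPE.ERROR"),
    "pick.technique": (("strings",), "plan",
                       lambda t: "technique.plan"
                       if t == "string.set" else "TYPE.ERROR"),
}

def propagate(nodes):
    """Push types along the links until every port is typed.

    Assumes the network is acyclic and every input is eventually produced.
    """
    port_types = {}
    pending = dict(nodes)
    while pending:
        for name, (ins, out, fn) in list(pending.items()):
            if all(p in port_types for p in ins):
                port_types[out] = fn(*(port_types[p] for p in ins))
                del pending[name]
    return port_types

print(propagate(nodes))
# {'raw': 'reflectance.vector', 'strings': 'string.set',
#  'plan': 'technique.plan'}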

11. SUMMARY

In this paper, we have tried to develop a model of validation that uses generic tasks to provide a decomposition principle and a language for KB system description. We have also defined process models to provide a schema for developing validation approaches that are more comprehensive and that provide conceptual standardization. It is clear that validation requirements for heuristic classification problems and simulation problems will be different. It is also clear that validation will be affected by the ability of the KB to produce sui generis criteria for measurement operations.

IF ?compartment is to be protected
AND type of ?compartment is electronic
AND ?compartment contains ?module
AND ?compartment ?module is electronic.equipment.enclosure
THEN determine if ?module is open to ?compartment

IF ?compartment is to be protected
AND ?module in ?compartment is not open
AND power to ?module in ?compartment can be isolated
THEN use ?gas in ?module

IF ?compartment is to be protected
AND ?module in ?compartment is not open
AND all modules in ?compartment are isolated
THEN use water sprinkler in ?compartment

FIGURE 6. Sample rules from FIRAS.
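Rules of this form suggest execution by forward chaining with pattern variables. The sketch below implements a minimal matcher over simplified facts; the predicates and the single rule are stand-ins for illustration, not FIRAS's actual rule language.

def unify(pattern, fact, bindings):
    """Extend bindings so pattern matches fact, or return None."""
    if len(pattern) != len(fact):
        return None
    b = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if b.get(p, f) != f:
                return None
            b[p] = f
        elif p != f:
            return None
    return b

def match_all(conditions, facts, bindings):
    """Yield every binding set satisfying all conditions against the facts."""
    if not conditions:
        yield bindings
        return
    for fact in facts:
        b = unify(conditions[0], fact, bindings)
        if b is not None:
            yield from match_all(conditions[1:], facts, b)

facts = {("protect", "engine.room"),
         ("module.closed", "engine.room", "pump.module"),
         ("power.isolable", "engine.room", "pump.module")}

conditions = [("protect", "?c"),
              ("module.closed", "?c", "?m"),
              ("power.isolable", "?c", "?m")]
action = ("use.gas", "?m")

for b in match_all(conditions, facts, {}):
    print(tuple(b.get(term, term) for term in action))
# ('use.gas', 'pump.module')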

Standardization in this sense precedes the development of standards for actual testing. The next steps are to develop standards in terms of the kinds of constraints and forms of validation that should be required for different combinations of process models and generic task sequences. It is also important to elaborate the generic task classes and subclasses and to build an operational syntax for defining measurement operations on competencies. Finally, the work of Neches, Politakis, Rich, Waters, and Weiss shows promise for automating some of these ideas (Neches, Swartout, & Moore, 1985; Politakis & Weiss, 1984; Rich & Waters, 1986a, 1986b; Weiss, Politakis, & Ginsberg, 1986).

REFERENCES

Adrion, W.R., Branstad, M.A., & Cherniavsky, J.C. (1982). Validation, verification and the testing of computer software. ACM Computing Surveys, 14, 159-192.
Borgida, A., Greenspan, S., & Mylopoulos, J. (1985). Knowledge representation as the basis for requirements specifications. IEEE Computer, 18, 82-90.
Brachman, R.J., & Levesque, H.J. (1982, August). Competence in knowledge representation. In Proceedings of The National Conference on Artificial Intelligence (pp. 189-192). Stanford, CA: American Association for Artificial Intelligence.
Brachman, R.J., & Levesque, H.J. (Eds.). (1985). Readings in knowledge representation. Los Altos, CA: Morgan Kaufmann.
Buchanan, B.G., & Shortliffe, E.H. (1984). Rule-based expert systems: The MYCIN experiments of the Stanford Heuristic Programming Project. Reading, MA: Addison-Wesley.
Chandrasekaran, B. (1983). On evaluating AI systems for medical diagnosis. AI Magazine, 4(2), 34-37.
Clancey, W.J. (1983, August). The advantages of abstract control knowledge in expert systems. In Proceedings of The National Conference on Artificial Intelligence (pp. 74-78).
Clancey, W.J. (1985). Heuristic classification. Artificial Intelligence, 27, 289-350.
Clancey, W.J. (1989). Viewing knowledge bases as qualitative models. IEEE Expert, 4(2), 9-23.
Clancey, W.J., & Shortliffe, E.H. (Eds.). (1984). Readings in medical artificial intelligence. Reading, MA: Addison-Wesley.
Cragun, B.J., & Steudel, H.J. (1987). A decision-table-based processor for checking completeness and consistency in rule-based expert systems. International Journal of Man-Machine Studies, 26, 633-648.
Culbert, C., Riley, G., & Savely, R.T. (1989). An expert system development methodology that supports verification and validation. ISA Transactions, 28(1), 15-18.
Davis, R. (1984). Diagnostic reasoning based on structure and behavior. Artificial Intelligence, 24(1), 347-410.
Davis, R., & Lenat, D.B. (1982). Knowledge-based systems in artificial intelligence. New York: McGraw-Hill.
Dechter, R. (1990). Enhancement schemes for constraint processing: Backjumping, learning, and cutset decomposition. Artificial Intelligence, 41(3), 273-312.
Geissman, J.R., & Schultz, R.D. (1988). Verification and validation of expert systems. AI Expert, February, 26-33.
Genesereth, M.R., & Nilsson, N.J. (1987). Logical foundations of artificial intelligence. Los Altos, CA: Morgan Kaufmann.
Ginsberg, A., Weiss, S.M., & Politakis, P. (1988). Automatic knowledge base refinement for classification systems. Artificial Intelligence, 35, 197-226.
Guttag, J.V. (1980). Notes on type abstraction (version 2). IEEE Transactions on Software Engineering, SE-6(1), 13-23.
Guttag, J.V., Horowitz, E., & Musser, D.R. (1978). Abstract data types and software validation. Communications of the ACM, 21, 1048-1064.
Hamscher, W.C. (1988). Model-based troubleshooting of digital systems (MIT Report No. AI-TR 1074). Cambridge, MA: MIT.
Harrison, P.R. (1989). Testing and evaluation of knowledge-based systems. In J. Liebowitz & D.A. DeSalvo (Eds.), Structuring expert systems: Domain, design and development (pp. 303-329). Englewood Cliffs, NJ: Yourdon Press.
Harrison, P.R., & Ratcliffe, P.A. (1990, May). Validation and performance criteria. Paper presented at the ORSA meeting, Las Vegas.
Hayes-Roth, F., Waterman, D.A., & Lenat, D.B. (Eds.). (1983). Building expert systems. New York: Addison-Wesley.
Kimes, D.S., Harrison, P.R., & Ratcliffe, P.A. (1991). A knowledge-based expert system for inferring vegetation characteristics. International Journal of Remote Sensing, in press.
Kline, P.J., & Dolins, S.B. (1989). Designing expert systems. New York: Wiley.
Neches, R., Swartout, W.R., & Moore, J.D. (1985). Enhanced maintenance and explanation of expert systems through explicit models of their development. IEEE Transactions on Software Engineering, SE-11, 1337-1351.
Newell, A. (1980). Physical symbol systems. Cognitive Science, 4, 135-183.
Newell, A. (1982). The knowledge level. Artificial Intelligence, 18, 87-127.
Nguyen, T.A. (1987, February). Verifying consistency of production systems. In Proceedings of The Third Conference on Artificial Intelligence Applications (pp. 4-8). Washington, DC: IEEE Computer Society Press.
Nguyen, T.A., Perkins, W.A., Laffey, T.J., & Pecora, D. (1985). Checking an expert system knowledge base for consistency and completeness. In Proceedings of the International Joint Conference on Artificial Intelligence (pp. 375-378). Menlo Park, CA: American Association for Artificial Intelligence.
Nguyen, T.A., Perkins, W.A., Laffey, T.J., & Pecora, D. (1987). Knowledge base verification. AI Magazine, 8, 69-75.
O'Keefe, R.M., Balci, O., & Smith, E.P. (1987). Validating expert system performance. IEEE Expert, Winter, 81-89.
Politakis, P.G. (1983). Using empirical analysis to refine expert system knowledge bases. Unpublished master's thesis, Computer Science Department, Rutgers University, New Brunswick, NJ.
Politakis, P.G., & Weiss, S. (1984). Using empirical analysis to refine expert system knowledge bases. Artificial Intelligence, 22, 23-28.
Reggia, J.A., Nau, D.S., & Wang, P.Y. (1984). Diagnostic expert systems based on a set covering method. In M.J. Coombs (Ed.), Developments in expert systems. New York: Academic Press.
Rich, C., & Waters, R.C. (Eds.). (1986a). Artificial intelligence and software engineering. Los Altos, CA: Morgan Kaufmann.
Rich, C., & Waters, R.C. (1986b). Towards a requirements apprentice: On the boundary between informal and formal specifications (MIT Report, AI Memo 907). Cambridge, MA: MIT.
Richer, M.H. (1986). An evaluation of expert system development tools (Report No. KSL 85-19). Knowledge Systems Laboratory, Computer Science Department, Stanford University, Stanford, CA.
Rothenberg, J., Paul, J., Kameny, I., Kipps, J.R., & Swenson, M. (1987). Evaluating expert system tools (Report No. R-3542-DARPA). Santa Monica, CA: Rand Corporation.
Shortliffe, E.H. (1976). Computer-based medical consultations: MYCIN. New York: Elsevier.
Stachowitz, R.A., & Combs, J.B. (1987). Validation of expert systems. In Proceedings of the Hawaii International Conference on Systems Sciences (pp. 686-695).
Stachowitz, R.A., Combs, J.B., & Chang, C.L. (1987a). Validation of knowledge-based systems. In Proceedings of the Second AIAA/NASA/USAF Symposium on Automation, Robotics and Advanced Computing for the National Space Program.
Stachowitz, R.A., Chang, C.L., Stock, T.S., & Combs, J.B. (1987). Building validation tools for knowledge-based systems. In Proceedings of the Space Operations and Robotics (SOAR) Workshop.
Stachowitz, R.A., Combs, J.B., & Chang, C.L. (1987b). Performance evaluation of knowledge-based systems. In Proceedings of the AFIT AOG/AAAIC Joint Conference.
Stachowitz, R.A., & Chang, C.L. (1990, May). Completeness checking of expert systems. Paper presented at the ORSA Meeting, Las Vegas.
Suwa, M., Scott, A.C., & Shortliffe, E.H. (1982). An approach to verifying completeness and consistency in a rule-based expert system. AI Magazine, Fall, 16-21.
Szolovits, P. (1987). Expert systems tools and techniques. In W.E.L. Grimson & R.S. Patil (Eds.), AI in the 1980s and beyond: An MIT survey (pp. 43-74). Cambridge, MA: MIT Press.
Waters, R.C. (1988). System validation via constraint modeling (MIT Report, AI Memo 1020). Cambridge, MA: MIT.
Weiss, S.M., Politakis, P., & Ginsberg, A. (1986). Empirical analysis and refinement of expert system knowledge bases. In Proceedings of the IEEE Tenth Annual Symposium on Computer Applications in Medical Care (pp. 53-60). Washington, DC: IEEE Computer Society Press.
White, B.Y., & Frederiksen, J.R. (1990). Causal model progressions as a foundation for intelligent learning environments. Artificial Intelligence, 42, 99-157.
Wilkins, D.C., Clancey, W.J., & Buchanan, B.G. (1987). Knowledge base refinement by monitoring abstract control knowledge. International Journal of Man-Machine Studies, 27, 281-293.
Zualkernan, I., Tsai, W.T., & Volovik, D. (1986). Expert systems and software engineering: Ready for marriage? IEEE Expert, 1, 24-31.