Int. J. Human-Computer Studies (1998) 49, 523-546 Article No. hc980217
Reuse, CORBA, and knowledge-based systems

JOHN H. GENNARI, HEYNING CHENG, RUSS B. ALTMAN and MARK A. MUSEN

Stanford Medical Informatics, Stanford University, Stanford, CA 94305-5479, USA. email:
[email protected]
By applying recent advances in the standards for distributed computing, we have developed an architecture for a CORBA implementation of a library of platform-independent, sharable problem-solving methods and knowledge bases. The aim of this library is to allow developers to reuse these components across different tasks and domains. Reuse should be cost-effective; therefore, the library will include standard problem-solving methods whose semantics are well understood and are described with a language for stating the requirements and capabilities of a component. In addition, when a developer needs to adapt a component to a new task, the adaptation costs should be minimal. Thus, we advocate the use of separate mediating components that isolate these adaptations from the original component. We demonstrate our approach with an example: an implementation of a problem-solving method, a knowledge-base server, and mediating components that adapt the method to different knowledge bases and tasks.

© 1998 Academic Press
1. Cost-effective reuse

Researchers in both knowledge-based systems and in software engineering have looked to reuse as a methodology for reducing the high cost of software development and maintenance. With a reuse approach to software construction, developers adapt existing software components at a fraction of the cost of developing a system from scratch. Reuse can save development costs when (1) the overhead cost of building a component for reuse is low, (2) the frequency with which developers reuse components is high and (3) the cost of finding and adapting a component is low. Unfortunately, there are a number of obstacles to overcome before component reuse is cost-effective. Reuse is not cost-effective when the cost of building, finding and adapting a library component is greater than the cost of building a solution from scratch. Reuse is cost-effective only when the developer can find and understand a component quickly, and when the component solves a significant problem: one that would be expensive to solve with software built and debugged from scratch.

Development and maintenance of knowledge-based systems without reuse is known to be expensive. Originally, first-generation knowledge-based systems consisted of an inference engine and a knowledge base that included both facts about the domain and rules that controlled the processing of those facts. Unfortunately, these systems did not scale well to large knowledge bases, as the set of rules and facts quickly became unwieldy and difficult to maintain (Bachant & McDermott, 1984). In response to this problem, researchers isolated knowledge about the process used to solve some problem from
knowledge of specific facts about a particular domain. Thus, second-generation knowledge-based systems are composed of two large-grained components: (1) a problem-solving method and (2) a knowledge base used by the method (David, Krivine & Simmons, 1993). Although the design of second-generation knowledge-based systems allows developers to build systems that scale to very large problems, significant reuse of components has yet to be demonstrated.

One of the obstacles to reuse for knowledge-based systems is the inability to share components across development environments. Typically, different environments have idiosyncratic ways of specifying components, and thus, developers building a reuse library for one architecture cannot use components developed in a different environment. To address this problem, we advocate the use of the Common Object Request Broker Architecture (CORBA) standard for platform-independent communication and component definition (Orfali, Harkey & Edwards, 1996).

Our vision for the development of knowledge-based systems is that developers will be able to build systems at reduced costs by retrieving and adapting components from a distributed and platform-independent reuse library. We believe (1) that components such as problem-solving methods are particularly appropriate for cost-effective reuse, (2) that an effective strategy for minimizing component adaptation costs is to construct separate mediating components that filter and transform information among components and (3) that a reuse library built with a standard for cross-platform communication and distributed computing, such as CORBA, maximizes the potential reuse frequency for components and amortizes the cost of developing a large reuse library. While the first claim has been made by many in the field (e.g. Chandrasekaran, 1986; Schreiber, Wielinga, Akkermans, van de Velde & de Hoog, 1994; Motta, Stutt, Zdrahal, O'Hara & Shadbolt, 1996; Breuker, 1997), and a few have been actively pursuing the second claim (e.g. Fensel & Groenboom, 1997), we are not aware of any other experiments with knowledge-base component reuse and CORBA.

In this paper, we present an initial example in support of our vision, using a set of example tasks and knowledge bases from the field of molecular biology, and a well-known problem-solving method, propose-and-revise (Marcus, Stout & McDermott, 1988). We have constructed a simple knowledge-base server using the ideas of the Open Knowledge Base Connectivity protocol (Chaudhri, Farquhar, Fikes, Karp & Rice, 1998; see also related work of Karp, Myers & Gruber, 1995) and have built mediating components that connect the method to the appropriate knowledge bases in the server. Our example demonstrates the reuse of a problem-solving method across multiple tasks, where the adaptations of this method are isolated in mediating components, and where all components (method, mediator and knowledge-base server) can be implemented as distinct services available over the network.

Although a single example cannot be used to prove claims about effort saved via reuse, or measure the ease with which components can be retrieved and adapted from the reuse library, our work is a first step toward such studies. Over time, reuse cases must be built and components made available so that developers of knowledge-based systems can assess the value of an entire reuse library of components.
Before presenting our reuse example, we describe the construction of second-generation knowledge-based systems, including our architecture for building systems from reusable knowledge-base components.
2. A reuse architecture for knowledge-based systems

Our work to develop a reuse architecture for knowledge-based systems is part of Protégé: a long-term project to build a toolset and methodology for the construction of domain-specific knowledge-acquisition tools and knowledge-based systems from reusable components (Puerta, Egar, Tu & Musen, 1992; Gennari, Tu, Rothenfluh & Musen, 1994; Eriksson, Shahar, Tu, Puerta & Musen, 1995). Protégé is one of several environments for the construction of second-generation knowledge-based systems—other examples include VITAL (Shadbolt, Motta & Rouge, 1993; Motta et al., 1996) and CommonKADS (Schreiber et al., 1994). These environments are designed to help system developers build knowledge-based systems from reusable components: problem-solving methods and knowledge bases. As we describe below, the Protégé methodology helps developers design these components for reuse: developers can apply the problem-solving method to different knowledge bases, and developers may use different methods over the same knowledge base.

2.1. METHODS, KNOWLEDGE BASES, AND ONTOLOGIES
A problem-solving method captures knowledge about how to accomplish some class of tasks, whereas a knowledge base provides the data or information about the domain that is necessary for some problem-solving method to operate. Examples of problem-solving methods include constraint satisfaction, reactive planning, or the temporal abstraction of data (Tu, Eriksson, Gennari, Shahar & Musen, 1995). These methods are algorithmic procedures that can be at least somewhat domain-independent: they can be applied to different sets of data to solve problems in different domains. A knowledge base captures information about a domain that is more static, relative to the method or methods that use the knowledge base. Although knowledge bases are typically designed as input data for some problem-solving method, a knowledge base may be designed for several methods, and it may capture information about the domain that is independent of the problem at hand. Thus, knowledge bases can be at least somewhat method-independent: they can serve as sources of information for more than one problem-solving method (Musen & Schreiber, 1995). To allow developers to understand and reuse both methods and knowledge bases, both of these components are accompanied by ontologies that more formally describe the terms and relations used by that component (Gruber, 1993; Guarino & Giaretta, 1995). In our environment, these ontologies describe at a more abstract level the set of objects used by the component. Thus, an ontology for a constraint-satisfaction method defines the abstract notion of a constraint as an object with particular attributes or slots, and with particular inheritance relationships to other objects in the ontology. When the method is invoked, this abstract notion of constraint must be instantiated with a particular set of constraints that the method then attempts to satisfy. Similarly, an ontology for a knowledge base describes the abstract classes, their relationships and attributes, and these classes are instantiated by ground-level facts in the knowledge base. These ontologies are essential for component reuse, because they allow developers some insight into the semantics of a method or a knowledge base. Thus, the ontology for a knowledge base specifies the vocabulary for that domain, and allows method
developers to formulate queries of the knowledge base in the terms of that ontology. If the knowledge base is designed for reuse, it should be able to respond to a range of queries. Furthermore, as we describe in Section 5, if a set of knowledge bases share a common representation language, then they can be grouped together and made available by a single knowledge-base server.

The ontology for a problem-solving method specifies the classes of inputs and outputs used by the method. As with the knowledge-base ontology, this method ontology simplifies the reuse of a method. For other developers to use our problem-solving methods, we must provide application-programming interfaces (APIs) for each method, and we can derive these interfaces from method ontologies that describe inputs and outputs. As we describe in Section 6, these method ontologies are part of a richer method-description language that would specify method semantics such as goals, competence and decomposition into sub-methods.
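To make the role of a method ontology concrete, the sketch below renders the kind of abstract classes a constraint-satisfaction method ontology defines as plain C++ records. The class and field names are illustrative assumptions rather than the actual Protégé definitions; the point is that nothing in them refers to elevators, trucks or molecules.

```cpp
// Illustrative sketch only: one possible rendering of a constraint-satisfaction
// method ontology as C++ records. The names are hypothetical, not the Protege
// definitions; note that nothing here mentions any particular domain.
#include <iostream>
#include <string>
#include <vector>

// A state variable that the method assigns values to during problem solving.
struct StateVariable {
    std::string name;
    std::string value;                   // empty until the method assigns it
};

// The abstract notion of a constraint: an expression over state variables
// that can be evaluated to true (satisfied) or false (violated).
struct Constraint {
    std::string name;
    std::vector<std::string> variables;  // names of the variables it mentions
    std::string expression;              // evaluable expression
};

// A fix describes how to revise the design when a constraint is violated.
struct Fix {
    std::string constraintName;          // which constraint this fix addresses
    std::string action;                  // e.g. "advance to the next candidate value"
};

int main() {
    // At run time these abstract classes are instantiated with the particular
    // constraints, fixes and variables of some task.
    Constraint c{"c1", {"a", "b"}, "distance(a, b) < 30.0"};
    Fix f{"c1", "choose the next candidate value for a"};
    std::cout << "constraint " << c.name << " over " << c.variables.size()
              << " variables; fix: " << f.action << "\n";
}
```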
2.2. REUSING METHODS AND KNOWLEDGE BASES WITH MEDIATORS
In general, it is cost-effective for developers to reuse both problem-solving methods and knowledge bases when building knowledge-based systems. For example, a developer assembling a knowledge-based system for a computer hardware domain should be able to retrieve and reuse either a building-block method, such as a constraint-satisfaction algorithm, or a knowledge-base component, such as a repository of information about standard pieces of computer hardware. However, such a scenario raises the issue of component adaptation costs: the costs of modifying and adapting existing components to fit a new task. Unless the method and the knowledge base are matched perfectly, the developer must either change the knowledge base to match the input and output specifications of the method, or modify the method to use exactly the terms specified by the knowledge-base ontology. Either of these solutions modifies existing components, and will therefore inhibit the ease with which those components might be reused by other developers. Therefore, we advocate the use of separate mediating components that isolate any customizations required to use a particular problem-solving method with a particular knowledge base.

Figure 1 shows our view of these three types of components (methods, knowledge bases and mediators) that developers can configure to solve knowledge-based problems. Unlike problem-solving methods and knowledge bases, the mediating components are specific to a particular task—they encapsulate information about how to connect and adapt a specific method to a specific knowledge base to solve a particular problem.

FIGURE 1. CORBA support for a reuse architecture.

Figure 1 shows all components interconnected via the Common Object Request Broker Architecture (CORBA), a platform-independent standard that supports object-oriented distributed computing. This software standard allows objects and associated methods to operate across a network in a hardware- and language-independent manner (Orfali et al., 1996). All CORBA components include an interface definition—a stub declaration of an object's attributes and methods specified in the interface definition language (IDL). Any developer who wishes to use a CORBA component would compile that component's IDL specification into the machine-specific form for inclusion into a local application. For knowledge-base components, this specification is related to the ontology for that component. However, IDL specifies only a syntactic description of a component's interface elements; it does not specify any semantic information about the component. Nonetheless, once an IDL specification is defined for a component, the CORBA standard makes it easy to reconfigure that component to allow connections to different clients or servers.

Our aim is to support component interoperation—to allow developers in different knowledge-base environments, or using different development and representation languages, to share components. Without sharing across environments, each group of developers would need to build up their own library of components, rather than using components shared from other architectures. CORBA supports this goal by establishing a standard of communication. Whether this standard is CORBA or an alternative, we argue that the high cost of developing a large reuse library should be distributed across developers by the use of standards for defining and describing software component interfaces.

In this paper, we describe a distributed implementation of a knowledge-based system, built as a set of reusable components as shown in Figure 1. These include a legacy problem-solving method, a knowledge-base server and mediating components that solve two configuration problems. This example demonstrates reuse of a problem-solving method, and shows how the CORBA standard facilitates communication among components.
3. Configuration problems in molecular biology

In molecular biology, transfer RNA (tRNA) and the ribosomal macromolecule are part of the cellular machinery responsible for translation of RNA to protein in living organisms. Researchers are interested in determining the three-dimensional shape of such structures, since such information may be essential if we are to understand the mechanisms of protein synthesis. In other words, the task for the researcher is to determine the three-dimensional position or configuration of the ribosome or the tRNA macromolecule, given experimental evidence about the sequence and known structure of
these macromolecules. To build a knowledge-based system to solve this problem, we need (1) a specification of a knowledge base of experimental molecular biology information, and (2) a method that can solve the configuration problem of determining three-dimensional structure. Fortunately, neither of these components needs to be built from scratch: researchers in this area have already begun to share knowledge bases about ribosome sequence, and a number of legacy computational methods are available (Chen, Felciano & Altman, 1997; Altman, Abernathy & Chen, 1997).

We describe two variations of this configuration problem: (1) for the ribosome itself, and (2) for transfer RNA. In both cases, the general problem can be solved by constraint satisfaction: given experimental information about the molecule that includes constraints among components, find a configuration of the molecule such that no constraint is violated. Our solutions to these two problems require two different knowledge bases: one for the tRNA structure, and one for the ribosome. However, as we show, the two problems are sufficiently similar that they can be solved by a single problem-solving method.
3.1. THE 30 S SUBUNIT CONFIGURATION PROBLEM
The 30 S ribosome subunit is made of a single chain of RNA bases and a set of 21 unconnected proteins. Current experimental techniques provide four types of information. First, they provide the location in three dimensions of the 21 proteins. Second, they provide the primary sequence of RNA bases. Third, they provide the location in the primary sequence of geometric components known as secondary structures, such as double helices and coils, that have a known, regular structure. Fourth, they provide distance constraints between the components and the fixed proteins, as well as among the components themselves. Thus, given this experimental information, the configuration task is to find sets of locations and orientations for each component such that no distance constraint is violated. The secondary structure for this ribosomal knowledge base specifies 10 helices; Figure 2 shows these helices in a solution position, where all distance constraints are satisfied. The knowledge base required to solve this problem can be described by its ontology; in our framework, the ontology is specified with the classes and attributes shown in Figure 3. The Object class specifies the 10 helices that make up the ribosomal secondary structure; the Representation class describes the size of each helix; the Location-file is an explicit list of possible locations in three-space for that helix; and the Constraint class specifies the distance constraints between helices. This ontology is a simpler, more task-specific version of a general-purpose ontology and knowledge base that we are building for a wider variety of tasks (Chen et al., 1997). Our goal is to use the same problem-solving method to solve different but related problems in molecular biology. Thus, it is important to note differences and similarities in the ontologies of the two problems. For example, in order to reduce the total number of possible locations (and thus, the computational burden on the constraint-satisfaction algorithm), this knowledge base includes information from a pre-processing step that computes an explicit list of candidate helix locations. As we will see, this information and the corresponding ontology elements are missing from the tRNA configuration problem.
FIGURE 2. A configuration of the 30 S subunit of the ribosome. Cylinders represent helices; ellipsoids represent proteins.
FIGURE 3. An ontology for the ribosomal 30 S subunit configuration task.
3.2. THE tRNA CONFIGURATION PROBLEM
Transfer RNA (tRNA) is a critical component of the molecular machinery that translates the DNA genetic code into proteins. It interacts with the ribosome to provide individual protein components that are specifically matched to segments of the DNA sequence. The three-dimensional structure of tRNA is one of only a few RNA structures that is known at high resolution from X-ray crystallographic studies. Thus, this problem has a known gold-standard solution, unlike the configuration problem for the ribosome 30 S subunit.
FIGURE 4. An ontology for the tRNA configuration task.
As with the ribosome configuration problem, our aim is to find a position in three-space for all elements of tRNA such that no distance constraints are violated. However, unlike the knowledge base for the ribosome problem, this knowledge base does not include an explicit list of positions for each helix; there is no pre-processing step that produces a list of possible locations. Instead, each helix has an associated sampling step for each dimension. As we shall show, this information allows the knowledge-based system to produce dynamically a list of possible helix locations for use by the problem-solving method. Figure 4 shows a schematic view of the ontology for the tRNA knowledge base, including a Sampling-rate class. In the Protégé framework, all ontologies (such as those shown in Figures 3 and 4) are represented with a simple frame-based formalism, and viewed and manipulated within the Protégé Ontology Editor.

Reuse of components for related tasks minimizes adaptation costs. In the example that we describe here, we reused the propose-and-revise method for two configuration tasks. Although the tasks are clearly similar, all of the differences between the ontologies of Figures 3 and 4 must be accommodated by the reuse developer. In addition to conceptual differences, such as how sampling rates are represented, even simple differences in terminology must be addressed. For example, in the ribosome ontology, the constraint class has attributes named Obj1, Obj2, and so on, while in the matching constraint class in the tRNA ontology, the corresponding attributes are labelled Helix1, Helix2, and so on. In Section 5, we show how our mediating components address these differences between ontologies.
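The contrast between the two domain ontologies can be sketched schematically. The C++ records below are illustrative assumptions only (the real ontologies are Protégé frame definitions), but they show the kinds of structural and terminological differences a mediator must later reconcile: explicit location lists versus sampling steps, Obj1/Obj2 versus Helix1/Helix2, and, in this sketch, a single distance limit versus separate upper and lower bounds.

```cpp
// Schematic comparison of the two domain ontologies, rendered as C++ records.
// All type and field names are illustrative assumptions; the real ontologies are
// Protege frame definitions, not C++ code.
#include <string>
#include <vector>

// --- Ribosome (30 S subunit) knowledge base ---
// Candidate helix locations are precomputed and listed explicitly.
struct RibosomeObject {
    std::string name;                        // one of the 10 helices
    std::vector<std::string> locationFile;   // explicit list of candidate positions
};
struct RibosomeConstraint {
    std::string Obj1, Obj2;                  // participants named Obj1, Obj2, ...
    double maxDistance;                      // a single distance limit (assumed here)
};

// --- tRNA knowledge base ---
// No precomputed locations; each helix carries a sampling step per dimension,
// from which candidate positions are generated at run time.
struct TRnaHelix {
    std::string name;
    double samplingStepX, samplingStepY, samplingStepZ;   // the Sampling-rate class
};
struct TRnaConstraint {
    std::string Helix1, Helix2;              // same role, different attribute names
    double lowerBound, upperBound;           // two bounds rather than one limit
};

int main() {
    RibosomeConstraint r{"helix-3", "helix-7", 28.5};
    TRnaConstraint t{"acceptor-stem", "d-arm", 10.0, 25.0};
    (void)r; (void)t;   // the point is the structural difference, not behaviour
}
```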
4. A legacy method: propose-and-revise

An important feature of a reuse architecture is the ability to accommodate legacy methods and algorithms. In the molecular biology domain, there are many legacy algorithms that researchers would like to apply to new data sets. For example, there are analysis tools that simply determine whether a set of constraints is consistent with a set of
helix positions in three-space. In our example, there is an algorithm for predicting location information given a set of constraints to be satisfied. The molecular biologist wishes to apply such analysis tools to many different data sets. As more and more researchers collaborate, they will create many different data sets reflecting different experimental techniques as well as different macromolecules. Thus, for our reuse approach to be effective, we must support the reuse and distribution of legacy methods.

In this section, we present the problem-solving method known as "propose-and-revise" (Marcus et al., 1988). This method performs a simple backtracking search through the space of candidate designs, repeatedly modifying the design as constraint violations are flagged. The algorithm proceeds as follows.

(1) Propose an initial design.
(2) Check for any constraint violations.
(3) If there are no constraint violations, succeed; otherwise, either (a) choose the best fix for a violated constraint, or (b) if no fixes are available, backtrack to the most recent choice point.
(4) Revise the design by applying the selected fix.
(5) Go to step 3.

Propose-and-revise is a legacy system from the knowledge-based systems literature. Our ability to adapt this algorithm to problems from the field of molecular biology is a demonstration of the flexibility of our generic problem-solving method. The algorithm was originally designed by Marcus for an engineering task: configuring elevator components to be consistent with customer specifications and with safety requirements (Marcus et al., 1988). Our first implementation of this method was for use with a knowledge base of elevator components, as part of the Sisyphus-2 benchmarking project in 1992 (Schreiber & Birmingham, 1996). This project compared the abilities of different environments by testing them against a common knowledge-based problem, as detailed in the elevator design task specification (Yost & Rothenfluh, 1996).

Although our implementation of propose-and-revise was designed for the elevator configuration problem, we wrote the method to be reusable, with a method ontology of generic search and constraint-satisfaction terminology, rather than with an ontology of elevator-specific concepts. Our method ontology is a simple specification of the requirements of propose-and-revise, with classes for constraints, fixes and the state variables that participate in constraint and fix specifications. As we describe in Section 5, this generic design has allowed us to reuse the module for other problems in other domains.
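Read as code, the steps above amount to the control loop sketched below. This is a minimal, domain-free illustration with a toy constraint and toy fixes, not the CLIPS implementation discussed later; in practice the constraints, fixes and state variables come from a knowledge base through a mediator.

```cpp
// Minimal, domain-free sketch of the propose-and-revise control loop.
// The toy "design" is a pair of integer state variables; the single constraint and
// the candidate fixes stand in for what a knowledge base would supply via a mediator.
#include <iostream>
#include <optional>
#include <stack>
#include <vector>

struct Design { int x = 0, y = 0; };                // the state variables

// Step 2: report a violated constraint, if any (toy constraint: x + y >= 10).
std::optional<int> firstViolation(const Design& d) {
    if (d.x + d.y < 10) return 0;
    return std::nullopt;
}

// Step 3a: candidate fixes for the violated constraint, best first.
std::vector<Design> fixesFor(int /*constraint*/, const Design& d) {
    return { {d.x + 1, d.y}, {d.x, d.y + 1} };      // revise x, or revise y
}

std::optional<Design> proposeAndRevise() {
    std::stack<std::vector<Design>> choicePoints;   // untried fixes, for step 3b
    Design current{};                               // step 1: propose initial design
    while (true) {
        if (!firstViolation(current)) return current;       // step 3: success
        auto fixes = fixesFor(0, current);
        if (fixes.empty()) {                                // step 3b: backtrack
            while (!choicePoints.empty() && choicePoints.top().empty())
                choicePoints.pop();
            if (choicePoints.empty()) return std::nullopt;  // search exhausted
            current = choicePoints.top().back();
            choicePoints.top().pop_back();
            continue;
        }
        current = fixes.front();                            // steps 3a and 4: apply fix
        fixes.erase(fixes.begin());
        choicePoints.push(fixes);                           // remember the choice point
    }                                                       // step 5: repeat
}

int main() {
    if (auto d = proposeAndRevise())
        std::cout << "solution: x=" << d->x << " y=" << d->y << "\n";
}
```

The stack of untried fixes records the choice points needed for step 3(b).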
5. Reuse of propose-and-revise

Over the past 5 years, we have adapted and reused the legacy code for propose-and-revise for a series of different tasks. The way in which we implemented the reuse of this method for new tasks reflects our progression toward a more general-purpose architecture for reuse. A review of our different uses of propose-and-revise will help explain our current approach to reuse.

After building the method for elevator configuration, we initially adapted the method for reuse in a similar, artificial test problem for UHaul configuration (Gennari et al., 1994). In this domain, the system is given information about various truck sizes and rental costs and then selects the appropriate truck and total cost, based on customer
information about the volume of goods to be shipped. This domain was engineered to be similar to that of elevator configuration; for example, the customer's selection is upgraded to a larger truck if a capacity constraint is violated, much as an elevator cable might be upgraded to a stronger cable if a safety constraint were violated. Although the UHaul task is merely a simplified version of the elevator-configuration task, there are still adaptation costs for reusing the propose-and-revise problem-solving method. For example, the knowledge bases for UHaul equipment and for elevators have different terminologies. Information in these knowledge bases must be translated into the terminology of the propose-and-revise method ontology, thereby adapting the knowledge bases to the requirements of the problem-solving method (Gennari et al., 1994). We isolated these adaptations as a set of declarative mappings that translated the domain-specific knowledge base to a method-specific knowledge base.

Using this same approach, we reported on the adaptation of propose-and-revise to the ribosome-configuration problem (Gennari, Altman & Musen, 1995). The ribosome knowledge base was designed independently (without knowledge of the propose-and-revise method) and originally used with the Protean problem-solving method (Altman, Weiser & Noller, 1994). Unlike our initial demonstration of reuse across elevator and UHaul problems, the adaptation of propose-and-revise to the ribosome configuration problem demonstrated our ability to reuse method code with real-world problems that were designed to be solved by other problem-solving methods.

However, in all our work to date, all code was written within a single software environment: the CLIPS production-system language that supports both an object-oriented language for knowledge bases, and a rule-based system for problem-solving methods. The requirement that all users of our components build systems within CLIPS greatly constrained our ability to reuse components. Furthermore, our implementations used CLIPS to load the complete knowledge base as part of the run-time system, and this approach will not scale to large knowledge bases. Finally, our declarative mappings could not support dynamic queries to the domain knowledge base. That is, our architecture required that all domain knowledge be translated as a pre-processing step before invoking the problem-solving method. This requirement is awkward and especially problematic in the tRNA configuration task, where the knowledge base does not include an explicit list of possible helix locations.

These problems led us to design the reuse architecture introduced in Figure 1: a distributed architecture in which methods, knowledge bases and mediators are independent components that communicate via the CORBA standard. In Figure 5, we show an instantiation of this architecture with specific examples of method, mediator and knowledge-base components. Our architecture is compatible with any development environment that complies with the CORBA standard. The use of a knowledge-base server allows problem-solving methods to retrieve knowledge-base information at run time, without loading or pre-processing an entire knowledge base. In Sections 5.1-5.3, we describe each component in Figure 5 in greater detail.
5.1. THE PROPOSE-AND-REVISE COMPONENT
As described in Section 4, our implementation of propose-and-revise is built up from legacy code written 5 years ago in CLIPS. To wrap this code and turn the method into
a CORBA service, we built a C++ component that invokes CLIPS. This component communicates both with client components that invoke the method, and with the knowledge base of constraints and variables required as input to propose-and-revise. Thus, this component is both a service for clients that may invoke the method, and also a client to knowledge-base servers that provide information for the method. Clients invoke the problem-solving method upon some knowledge base and then supply the method with any additional run-time inputs from the user. The knowledge-base server responds to queries from propose-and-revise (via mediating components), allowing the method to retrieve information from the knowledge base. Figure 5 shows the client, method and knowledge-base server in left-to-right order.

FIGURE 5. The propose-and-revise method with two mediators and knowledge bases.

As described earlier, any CORBA component includes an IDL specification that is a formal specification of its methods and objects. As a service, the propose-and-revise component includes an IDL specification for the following methods: (1) "connect", which allows a client to inform the method as to which knowledge base the method will use; (2) "getInputs", which retrieves from the knowledge base the list of required run-time inputs, and (3) "solve", which invokes propose-and-revise, passing in the run-time inputs as parameters, and returning the set of output variables and their associated values. For the molecular-biology tasks described in Section 3, there are no run-time inputs since all inputs for the method are included in the molecular biology knowledge base, whereas for the UHaul and the elevator-configuration problems, some information must be elicited from the client component at run time. The invocation of the legacy CLIPS code is within the implementation of "solve", and input and output parameters are transformed between CLIPS and C++ representations.

As shown in Figure 5, the method component connects to a particular knowledge base as a client, and retrieves required information for processing. For the propose-and-revise method, all instances of variables, constraints and fixes must be available before inference can begin. The propose-and-revise component may also make queries of the knowledge base during inference, as it tries to apply certain types of fixes to resolve a constraint violation. However, whether before or during inference, all queries are formed in terms of the method ontology: propose-and-revise does not include any
knowledge about elevators, molecular biology or UHaul equipment. Thus, we need mediators that translate between problem-solving methods and the knowledge bases that they are querying. Before we describe mediators, we present the design of a method-independent knowledge-base server component.
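The sketch below suggests, in C++, the kind of client-side interface that compiling the propose-and-revise component's IDL might yield; the type names and signatures are illustrative assumptions rather than the actual IDL, and the in-process mock merely shows the calling pattern.

```cpp
// Rough sketch of the client-side view of the propose-and-revise service, with the
// three operations named above. All type names and signatures are assumptions, not
// the actual IDL; real CORBA stubs would involve an ORB, object references and
// IDL-defined types rather than plain C++ containers.
#include <iostream>
#include <map>
#include <string>
#include <vector>

class ProposeAndReviseService {
public:
    virtual ~ProposeAndReviseService() = default;

    // (1) Tell the method which knowledge base (via its mediator) it will use.
    virtual void connect(const std::string& knowledgeBaseName) = 0;

    // (2) Ask which run-time inputs are still required (empty for the
    //     molecular-biology tasks, non-empty for elevators and UHaul).
    virtual std::vector<std::string> getInputs() = 0;

    // (3) Run propose-and-revise; the result maps each output state variable
    //     to its assigned value.
    virtual std::map<std::string, std::string>
    solve(const std::map<std::string, std::string>& runtimeInputs) = 0;
};

// A trivial in-process stand-in, only to show the calling pattern a client follows.
class MockService : public ProposeAndReviseService {
public:
    void connect(const std::string& kb) override { kb_ = kb; }
    std::vector<std::string> getInputs() override { return {}; }   // no run-time inputs
    std::map<std::string, std::string>
    solve(const std::map<std::string, std::string>&) override {
        return {{"helix-3.location", "candidate-17"}};             // placeholder result
    }
private:
    std::string kb_;
};

int main() {
    MockService service;
    service.connect("ribosome-30S");                 // hypothetical knowledge-base name
    for (const auto& [variable, value] : service.solve({}))
        std::cout << variable << " = " << value << "\n";
}
```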
5.2. THE KNOWLEDGE-BASE SERVER COMPONENT
As seen in both Figures 1 and 5, an important component of our architecture is the notion of reusable knowledge bases that are available as independent resources. Thus, a knowledge base developed within one environment could be accessed and used in another environment, in spite of differences between the corresponding knowledge-representation languages. In support of this type of knowledge-base interoperation, researchers have developed a generic frame-based protocol for communication among knowledge-representation systems known as the Open Knowledge Base Connectivity (OKBC) protocol (Chaudhri et al., 1998; see also related work of Karp et al., 1995). This protocol formally defines the terminology and axioms for a frame-based system, with concepts such as class, slot and instance, and then specifies a set of queries and functions that comprise an interface to any frame-based knowledge-representation system.

As mentioned in Section 2, our reuse research is within the context of the Protégé environment. Thus, our ontologies and knowledge bases are manipulated and built with Protégé-generated knowledge-editing tools, and all knowledge bases built with these tools share a simple, frame-based knowledge-representation language based on CLIPS. Because this language is frame-based, we can build a Protégé knowledge-base server that uses a subset of OKBC as its interface specification. Consistent with the rest of our architecture, this server is a CORBA component, and OKBC calls into a particular knowledge base are serviced via the CORBA standard for communication. The use of CORBA and OKBC allows for the broadest range of access by developers into any of our Protégé knowledge bases. Developers can make queries of frame-like knowledge bases with these commands; for example, they can retrieve classes via get-class-all-subs, slots via get-frame-slots, or instances of classes via get-class-all-instances.

Currently, we have implemented only a subset of the OKBC accessor functions, in part because the semantics of the knowledge model assumed by OKBC is somewhat different from the knowledge model used by Protégé and CLIPS (Grosso, Gennari, Fergerson & Musen, 1998). Before we can build a complete OKBC server, we must resolve these differences, so that the semantics of arbitrary OKBC queries are interpreted correctly by the Protégé knowledge-base server. Although OKBC was not designed specifically for CORBA, it was designed as a communications layer, and therefore has been easy to convert into IDL. OKBC provides the semantic layer for accessing our knowledge bases—without OKBC, IDL provides only syntactic communication information. With OKBC, any component that understands the protocol and the assumptions of the OKBC knowledge model can access knowledge bases stored on our server. As we describe in Section 5.3, we have implemented mediating components that use our knowledge-base server to retrieve information from knowledge bases for use by the propose-and-revise problem-solving method.
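To give a feel for how a client uses the server, the sketch below wraps the three OKBC operations named above behind a small C++ interface and exercises it against a toy in-memory knowledge base. The signatures and the toy data are illustrative assumptions; the actual server answers the corresponding OKBC calls over CORBA against Protégé knowledge bases.

```cpp
// Sketch of a narrow client-side view of the knowledge-base server, modelled on the
// OKBC operations named above. The C++ signatures and the toy in-memory data are
// assumptions; the real server answers the corresponding OKBC calls over CORBA
// against Protege knowledge bases.
#include <iostream>
#include <map>
#include <string>
#include <vector>

class FrameKbClient {
public:
    virtual ~FrameKbClient() = default;
    // get-class-all-subs: every subclass of the named class.
    virtual std::vector<std::string> getClassAllSubs(const std::string& cls) = 0;
    // get-frame-slots: the slots (attributes) defined on a frame.
    virtual std::vector<std::string> getFrameSlots(const std::string& frame) = 0;
    // get-class-all-instances: every instance of the named class.
    virtual std::vector<std::string> getClassAllInstances(const std::string& cls) = 0;
};

// A toy in-memory knowledge base standing in for the remote Protege server.
class ToyRibosomeKb : public FrameKbClient {
    std::map<std::string, std::vector<std::string>> subs_{{"Object", {"Helix"}}};
    std::map<std::string, std::vector<std::string>> slots_{
        {"Constraint", {"Obj1", "Obj2", "max-distance"}}};
    std::map<std::string, std::vector<std::string>> instances_{
        {"Constraint", {"constraint-12", "constraint-13"}}};
public:
    std::vector<std::string> getClassAllSubs(const std::string& c) override {
        return subs_[c];
    }
    std::vector<std::string> getFrameSlots(const std::string& f) override {
        return slots_[f];
    }
    std::vector<std::string> getClassAllInstances(const std::string& c) override {
        return instances_[c];
    }
};

int main() {
    ToyRibosomeKb kb;
    for (const auto& slot : kb.getFrameSlots("Constraint"))
        std::cout << "Constraint slot: " << slot << "\n";
    for (const auto& inst : kb.getClassAllInstances("Constraint"))
        std::cout << "instance: " << inst << "\n";
}
```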
5.3. MEDIATING COMPONENTS
For the propose-and-revise method to solve the two tasks described in Section 3, it must access two separate knowledge bases. Rather than custom-tailor the method to adapt to these separate knowledge bases, we isolate all adaptation of the method to mediating components. As shown in Figure 5, there is a separate mediator for each of the two tasks. In addition, we built a third mediator that allows the method to solve the UHaul task, and that accesses a third knowledge base that contains UHaul information.

All mediators stand between the method and the knowledge-base server, as shown in Figure 6. On the method side, mediators provide a service for the problem-solving component, responding to method-specific queries for information; on the knowledge-base side, mediators are clients to the general-purpose knowledge-base server, sending queries to particular knowledge bases via OKBC. There may be several mediators that retrieve information from a single knowledge base for different methods, or, as in our example, several mediators that service a single problem-solving method and that retrieve information from different knowledge bases.

If mediators were large and expensive to build, our reuse architecture would not be cost-effective. If developers needed to connect N problem-solving methods with N knowledge bases, then they would have to build N² mediators. Fortunately, it is rarely appropriate to connect all knowledge bases to all methods, since some methods simply cannot be applied to some knowledge bases. Furthermore, as seen in Figure 6, mediators all share the same structure, especially if they use either the same knowledge base or the same method. Building a set of mediators for the same component is therefore inexpensive. Because all three of our mediators service queries from the propose-and-revise method, they all share the same server-side IDL specification; they all receive the same requests for information from the method.

In general, a mediator implements a method service via some set of OKBC queries to the appropriate knowledge base. The responses to these queries (received from the knowledge-base server) use the classes defined in the domain ontology for that knowledge base. The mediator must then translate or map this domain-specific data into the terms of the method ontology before returning them as responses to queries from the propose-and-revise server.
FIGURE 6. A method, mediator, and knowledge-base server.
For example, propose-and-revise must receive the set of all constraints before starting the "solve" process that searches for solutions. The method component expects constraints to be in a generic format that includes an expression attribute that can be evaluated to true or false. To match this data-structure requirement to different knowledge bases, specific mediators must transform portions of the knowledge base to create constraints that match this format. Thus, for the tRNA mediator, the attributes in the tRNA knowledge base for upper-bound and lower-bound must be converted into a single expression attribute (an expression that ANDs together the upper-bound and lower-bound values). From the method component's point of view, the mediator acts as a virtual knowledge base, conforming to the method ontology, and responding to method-specific queries. However, each mediator may implement method services in different ways, translating and adapting information from the knowledge-base server to be compatible with the method. The transformation of constraint information described above is an example of a simple static mapping. In the next subsections, we describe in greater detail both this type of static mediation, and a more dynamic, run-time mediation as implemented by the three mediators we constructed for the propose-and-revise method.

5.3.1. Static mediation

For any mediator for the propose-and-revise method, the initial request for information from the method is to initialize and load the knowledge base. For propose-and-revise, a large amount of data must be available before the problem solver can proceed: all information about constraints, fixes for those constraints, and state variables. Thus, as soon as any of our mediators receives a "LoadKB" request from the method, in addition to connecting to the correct knowledge base, the mediator immediately begins requesting and transforming information from the knowledge-base server for use by the method. A successful completion of the "LoadKB" request indicates both that the knowledge base has been successfully loaded from files into the Protégé knowledge-base server, and that all static mapping of the initial information has been completed by the mediator. Thus, the mediator copies and converts information from the server component into the mediator component. When the method component next makes requests, such as "get all constraints" or "get all fixes", the mediator can supply this information directly, without additional requests from the knowledge-base server.

For both molecular-biology problems, it is necessary to augment the problem-solving method with domain-specific procedural information—in particular, with knowledge about how to compute distances between helices. For the tRNA problem, this distance is computed between the tops and bottoms of the helices, based on the dimensions of each helix. The ribosome problem requires a somewhat different distance function: one that determines the distance between two points expressed relative to the local coordinate systems of the two helices. These local systems are related to the global coordinate system by translations and rotations. In both cases, these functions are not specified in the domain knowledge bases, but are instead stored in their respective mediating components, and are passed to the method as a set of CLIPS "domain functions" used to adapt the behavior of the method for specific knowledge bases.
Because the entire body of these functions is passed to the method, the run-time invocation of helix distance computation can remain within the method component.
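Returning to the constraint transformation described at the start of this subsection, the sketch below shows, under assumed type names and an assumed expression syntax, the kind of static mapping the tRNA mediator performs at LoadKB time: each domain constraint is rewritten once into the single evaluable expression attribute the method requires, and the result is cached so that a later request to get all constraints needs no further server queries.

```cpp
// Minimal sketch of the static-mapping step a tRNA mediator might perform at LoadKB
// time. Type names and the expression syntax are illustrative assumptions; the real
// mediator obtains the domain constraints through OKBC queries to the server.
#include <iostream>
#include <string>
#include <vector>

struct TRnaConstraint {               // as stored in the tRNA knowledge base
    std::string Helix1, Helix2;
    double lowerBound, upperBound;
};

struct MethodConstraint {             // as required by the method ontology
    std::string name;
    std::string expression;           // must be evaluable to true or false
};

// AND the two bounds together into the single expression attribute the method expects.
MethodConstraint toMethodForm(const TRnaConstraint& c, int index) {
    std::string d = "distance(" + c.Helix1 + ", " + c.Helix2 + ")";
    return {"constraint-" + std::to_string(index),
            "(and (>= " + d + " " + std::to_string(c.lowerBound) + ") "
            "(<= " + d + " " + std::to_string(c.upperBound) + "))"};
}

int main() {
    // In the real system these arrive from the knowledge-base server during LoadKB;
    // afterwards the mediator answers "get all constraints" from its own cache.
    std::vector<TRnaConstraint> fromKb{{"acceptor-stem", "d-arm", 10.0, 25.0}};
    std::vector<MethodConstraint> cache;
    int index = 0;
    for (const auto& c : fromKb)
        cache.push_back(toMethodForm(c, index++));
    std::cout << cache[0].expression << "\n";
}
```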
The notion of domain-specific functional or procedural knowledge is not unique to propose-and-revise. In fact, we believe that our ability to specify this type of knowledge in mediating components will provide the developer with a powerful mechanism for adapting a method to a particular domain and knowledge base. Our aim is to support easy method adaptation, yet to provide method implementations that have no domain-specific knowledge.

5.3.2. Dynamic mediation

Although propose-and-revise could be implemented to solve our configuration tasks with static mediation alone, this type of mediation is not sufficient for many problem-solving methods, and does not scale to applications with large knowledge bases. For some methods, the requests for information from a knowledge base may be entirely dependent on user-provided run-time inputs. Such methods need to be able to query the knowledge-base server at run time, and therefore need mediators that can process their requests dynamically, at run time. The choice between static and dynamic mediation depends on the problem-solving method and on the knowledge base. In general, dynamic mediation is appropriate for any application that needs only a small portion of a large knowledge base, whereas static mediation could be used for applications that need all or most of the knowledge base.

For example, in the ribosome-configuration problem, the knowledge base includes lists of possible helix locations based on the pre-processing sampling procedure. Some helices have several hundred possible locations. Since the propose-and-revise method halts as soon as a single solution is found, and since the search space is densely populated with solutions, it is likely that most of these possible locations will never be examined by the method. Therefore, our mediator allows the method to query for locations as needed at run time. At initialization time, our mediator sends only the initial location of each helix to the problem-solving method. Then, as the method discovers constraint violations and wishes to try another configuration of a helix, it makes a run-time query to receive the next location for a given helix.

Unfortunately, knowledge about how to request a new helix position is clearly domain-specific information. If this knowledge is encoded into the problem-solving method, then that component is no longer as generic, and it will have fewer opportunities for reuse. To keep the method domain-independent, we encode information about how to request a new helix location in the mediating component. Thus, the mediator adapts the generic problem-solving request of "fix a constraint violation" to the domain-specific request for a new helix location from a particular knowledge base.

In particular, we have implemented the capability for run-time knowledge-base access via a flexible messaging protocol between the method and mediating components. This protocol allows the mediator to send a set of messages to the method component at initialization time, and then at run time, the method invokes a particular message at some point during processing. In our examples, the mediator informs the method at initialization time that "requesting a new helix location" is the appropriate message to use when fixing a constraint violation. At run time, the method sends this message, parameterized with a particular constraint violation, back to the mediator, which responds by querying the knowledge-base server for the particular helix location.
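The sketch below illustrates this messaging idea in miniature, with hypothetical names and an in-process callback in place of the CORBA protocol: at initialization the mediator registers the messages it will answer and nominates one of them for fixing constraint violations; at run time the method sends that message back and the mediator answers it, here from a canned location list that stands in for a query to the knowledge-base server.

```cpp
// Sketch of the initialization-time / run-time messaging idea, with hypothetical
// names and an in-process callback in place of the CORBA protocol. The mediator
// registers the messages it will answer and names the one to use for fixing a
// violation; at run time the method sends that message back and the mediator
// answers it (here from a canned list standing in for a knowledge-base query).
#include <functional>
#include <iostream>
#include <map>
#include <string>
#include <vector>

using MessageHandler = std::function<std::string(const std::string& helix)>;

class Method {
    std::map<std::string, MessageHandler> handlers_;
    std::string fixMessage_;
public:
    // Initialization time: the mediator declares its messages and nominates the
    // one appropriate for fixing a constraint violation.
    void registerMessage(const std::string& name, MessageHandler h) { handlers_[name] = h; }
    void setFixMessage(const std::string& name) { fixMessage_ = name; }

    // Run time: a violation involving `helix` triggers the nominated message.
    std::string fixViolation(const std::string& helix) {
        return handlers_.at(fixMessage_)(helix);
    }
};

int main() {
    // Mediator side: for the ribosome task, "get-object-location" walks an explicit
    // list of candidate positions (fetched, in the real system, from the server).
    std::map<std::string, std::vector<std::string>> locations{
        {"helix-3", {"location-0", "location-1", "location-2"}}};
    std::map<std::string, int> cursor;

    Method method;
    method.registerMessage("get-object-location",
        [&](const std::string& helix) { return locations[helix][++cursor[helix]]; });
    method.setFixMessage("get-object-location");

    // Method side: a constraint violation on helix-3 asks for its next location.
    std::cout << method.fixViolation("helix-3") << "\n";   // prints location-1
}
```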
All three of our mediators for propose-and-revise include some form of dynamic mediation. For the tRNA mediator, the same "get-object-location" message is declared. However, since the knowledge base for the tRNA task specifies sampling rates, instead of explicitly listing helix locations, the mediator implementation of "get-object-location" does not make any queries to the knowledge-base server. Instead, helix-location information can be generated from a logical location number plus the sampling rate within the mediator. In this way, the mediator behaves as a virtual knowledge base, answering run-time queries from the method without accessing the knowledge-base server.

For the UHaul mediator, the run-time query analogous to "get-object-location" is "get-upgrade": a function that retrieves information from the knowledge base about how to upgrade a piece of equipment. For either the UHaul or the elevator configuration tasks, the knowledge base includes explicit information about how to upgrade parts along implicit dimensions of size, strength, or cost. For example, in the UHaul domain, there is a constraint violation if the vehicle storage capacity is insufficient for the customer's needs. In this case, the equipment must be upgraded according to a sequence stored in the knowledge base. Thus, the "get-upgrade" message includes three parameters: the name of the class indicating the type of equipment, the slot name within that class containing the upgrade sequence information, and the name of the old model.

Our ability to build three different application systems with this dynamic messaging mechanism and with the same problem-solving method is a first demonstration of the versatility of our approach. The only requirement of messaging is that the method server include a mechanism for passing back the message call to the mediator. Developers are then free to implement arbitrary functionality in the mediating component for a given message call. This approach allows developers to adapt method components to new tasks, thereby encouraging more frequent method reuse in a wide range of applications.
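Two of these mediator-side implementations can be sketched under assumed names and signatures: a tRNA-style get-object-location that derives a candidate position from a logical location number and the sampling steps, without any server query, and a UHaul-style get-upgrade that takes the three parameters listed above, with its knowledge-base lookup stubbed out.

```cpp
// Two further mediator-side message implementations, sketched with hypothetical
// names and signatures. The tRNA version derives a candidate location from a
// logical location number and the sampling steps (no server query); the UHaul
// version takes the three "get-upgrade" parameters named above, with its
// knowledge-base lookup stubbed out.
#include <iostream>
#include <string>

struct Location { double x, y, z; };

// tRNA mediator: location number n maps onto the sampling grid for a helix.
Location getObjectLocation(int n, double stepX, double stepY, double stepZ,
                           int samplesPerAxis) {
    return { (n % samplesPerAxis) * stepX,
             ((n / samplesPerAxis) % samplesPerAxis) * stepY,
             (n / (samplesPerAxis * samplesPerAxis)) * stepZ };
}

// UHaul mediator: in the real mediator the body would issue an OKBC query for the
// upgrade-sequence slot of the equipment class and return the model after oldModel.
std::string getUpgrade(const std::string& equipmentClass,
                       const std::string& upgradeSlot,
                       const std::string& oldModel) {
    if (equipmentClass == "Truck" && upgradeSlot == "upgrade-sequence"
        && oldModel == "10-foot")
        return "14-foot";                      // stand-in for the stored sequence
    return oldModel;
}

int main() {
    Location l = getObjectLocation(7, 2.5, 2.5, 2.5, 4);
    std::cout << "candidate location: " << l.x << " " << l.y << " " << l.z << "\n";
    std::cout << "upgrade: " << getUpgrade("Truck", "upgrade-sequence", "10-foot") << "\n";
}
```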
6. Discussion and future work

The implementation that we presented in Section 5 is a proof-of-concept of our approach for the construction of knowledge-based systems from reusable components. Our goal is to reduce the development time and cost of building large knowledge-based systems. With only a single example of reuse, we can neither make claims about the amount of work saved via reuse, nor measure the ease with which components can be found, retrieved and adapted from the reuse library. However, our work is the first step needed to carry out these studies: over time, reuse cases must be built and components made available so that developers of knowledge-based systems can assess the value of an entire reuse library of components.

We have argued that CORBA provides a useful communications standard that helps make components in a reuse library more accessible. The inability to share components across different development environments is a significant obstacle to cost-effective reuse. Until reuse libraries become large enough to help solve a variety of problems, it may not be cost-effective for developers to use such libraries: the costs of searching for, understanding, and adapting components are often higher than the benefit gained by reuse. The CORBA standard allows developers in different environments to contribute to a large library of components. Thus, we argue that this software-engineering standard should be embraced by developers of reuse libraries for knowledge-based systems.
To achieve cost-effective reuse, we must increase the frequency with which components are reused, reduce the cost of finding and adapting components, and reduce the cost of adding new components to a reuse library. To meet these goals, we are extending our work, as we describe in the next three subsections. First, we are developing a language for describing and indexing problem-solving methods. Such a method-description language will reduce the cost of finding and understanding a legacy component from a library. Second, we are expanding our set of CORBA implementations of problem-solving methods, including methods that are decomposable into sub-methods. Third, we are investigating ways in which to reduce the cost of constructing the mediators that adapt components to new tasks.
6.1. A METHOD-DESCRIPTION LANGUAGE
If a developer wishes to reuse an existing component to solve a new task, it is necessary for him/her to understand that component. If components are small and if the semantics of components are well known, then it is easy for developers to reuse components. For example, any developer who understands trigonometry can reuse a component that computes cos(x). However, for knowledge-based systems development, where components are typically large and complex, it is unreasonable to expect either that users will understand a component a priori, or that users can read and understand the software code that specifies a component's behavior. Therefore, components in such a reuse library must be accompanied by a more abstract description of how and what the component does. Developers can use this description to understand a component, and this understanding should help them know whether or not a component is appropriate for their task. Furthermore, if this description includes formal specifications, it would be possible to build tools that help developers select appropriate methods from a reuse library, and validate whether or not a method (with the adaptations made by some mediator) is appropriate for a given problem and knowledge base.

In the implementation described in Section 5, the only component descriptions that we used were the IDL specifications for each component. Although this supports component inter-operability, it does not guarantee that developers share and understand the semantics of components. That is, IDL specifies the syntactic definition of the inputs and outputs, but it does not specify higher-level semantics such as the goals of the method, the assumptions the method makes about its inputs, or the type of problem the method is designed to solve.

The need for method specifications to help organize and index a library of problem-solving methods has been recognized by many researchers (Angele, Decker, Perkuhn & Studer, 1996; Fensel & Groenboom, 1997; Motta & Zdrahal, 1998). One way to organize methods in a library is by the abstract task or problem type that each method addresses (Breuker & van de Velde, 1994). Example tasks might be planning, diagnosis or prediction. Additionally, problem-solving methods can be indexed by their goals and result types (Gil & Melz, 1996). Thus, the goal of propose-and-revise might be to "find a state where all constraints are satisfied", whereas the result type might be a configuration of all state variables. However, while these organizations are useful, goals and task descriptions are not by themselves sufficient for describing software components in a reuse library.
For our needs, we envision a method-description language designed to help developers select, understand and adapt methods from the reuse library (Gennari, Grosso & Musen, 1998). In addition to the syntactic interface information provided by IDL, this language should include the following elements; a schematic sketch follows the list.

- It would specify the goals or capabilities of the method. This might include a description of the problem type that the method is designed to solve.
- It would specify the constraints across inputs and outputs. This specifies what the method accomplishes, assuming it runs correctly, and is known as the competence of the problem-solving method (Akkermans, Wielinga & Schreiber, 1994). For example, if propose-and-revise runs successfully, every output variable will have an assigned value.
- It would specify constraints about inputs and outputs. This captures the formal assumptions that the method makes about inputs and outputs. For example, all input constraints for propose-and-revise must be computable, i.e. they must be expressed in a language that allows an algorithm to evaluate them as true or false.
- It would include a description of the control flow within the method. As we describe in Section 6.2, problem-solving methods are often decomposed into a set of sub-methods. Therefore, the method-description language must include information about these sub-methods, including a specification of the control flow among them.
- It would include a history of usage of the problem-solving method. Although a history is more descriptive than analytic, it would provide examples of use from which developers could build.
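As a schematic sketch of what an entry in such a language might contain, the record below fills in these five elements for propose-and-revise. The field names and the C++ rendering are our own illustrative assumptions; the language itself remains to be designed.

```cpp
// Hypothetical shape of an entry in the proposed method-description language,
// filled in for propose-and-revise. The field names and the C++ rendering are
// illustrative only; the language itself is still being designed.
#include <iostream>
#include <string>
#include <vector>

struct MethodDescription {
    std::string name;
    std::string goal;                      // capability / problem type
    std::vector<std::string> competence;   // constraints across inputs and outputs
    std::vector<std::string> assumptions;  // constraints about inputs and outputs
    std::vector<std::string> subMethods;   // decomposition (control-flow detail omitted)
    std::vector<std::string> knownUses;    // usage history
};

const MethodDescription proposeAndRevise{
    "propose-and-revise",
    "find a state where all constraints are satisfied",
    {"every output state variable has an assigned value"},
    {"every input constraint is computable (evaluable to true or false)"},
    {"Goalp", "Revise", "Transition"},     // the decomposition of Figure 7
    {"Sisyphus-2 elevator configuration", "UHaul configuration",
     "ribosome 30 S configuration", "tRNA configuration"}
};

int main() {
    std::cout << proposeAndRevise.name << ": " << proposeAndRevise.goal << "\n";
}
```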
Our initial goal for a method-description language is to support the construction of mediators. A formal description of components and their requirements could help developers in three ways: (1) the specification could help developers find and select appropriate problem-solving methods, (2) the formal descriptions of constraints about inputs and outputs could be used to semi-automatically construct mediators and (3) the specification (plus a proof engine) could be used to verify that a mediator is correct and complete with respect to a domain knowledge base.

In some ways, a method-description language is similar to the specification language for design patterns (Gamma, Helm, Johnson & Vlissides, 1995). Although design patterns differ substantially from problem-solving methods, a library of patterns and a library of methods both require means for developers to search for and understand appropriate patterns or methods. Thus, patterns are described in a structured manner, including features such as "known uses" that would be appropriate for either problem-solving methods or design patterns. Both sorts of reuse libraries are trying to specify information about the semantics of their elements: information beyond what is captured by an IDL specification and at a more abstract level than what is communicated by source code.

If the capabilities of problem-solving method components can be specified, then these components can become agents that communicate and interact via the Knowledge Query and Manipulation Language, or KQML (Finin, McKay, Fritzson & McEntire, 1994). If a set of agents share a common language, then KQML is the communication protocol for making and processing queries in this language. Ideally, one agent could broadcast information about the type of problem that the developer wants solved, another agent about the knowledge available in a knowledge-base server, and other
agents about the capabilities of problem-solving methods. Thus, the KQML protocol would allow agents to associate appropriate problem-solving methods and knowledge bases with a particular problem, thereby obviating the need for a developer to search a reuse library for appropriate components. However, KQML provides only the query and response mechanism among agents, and does not specify the language used to communicate information about component capabilities or semantics. Thus, before we can use KQML, we must first complete the specification of our method-description language.

The development of a method-description language is one of the aims of the KARL and New KARL work at the University of Karlsruhe (Angele et al., 1996; Fensel, Angele & Studer, 1997). These are formal and executable specifications of problem-solving methods. In fact, one of the aims of New KARL is to capture and differentiate knowledge about (1) inputs and outputs, (2) method decomposition into sub-methods, (3) control-flow information, (4) inference structure and (5) pre- and post-conditions (Angele et al., 1996). This list of types of knowledge about a method is very similar to the sort of information we need to capture in our method-description language. More recently, this line of research has been extended to include "adapters" that include mappings between a domain model and the requirements and goals of a problem-solving method (Fensel & Groenboom, 1997). As we develop our language, we will continue to work with these researchers, sharing work and ideas where appropriate.

6.2. DECOMPOSABLE PROBLEM-SOLVING METHODS
6.2. DECOMPOSABLE PROBLEM-SOLVING METHODS
We believe that developers will be able to reuse problem-solving methods more easily if the methods are broken into sub-methods, i.e. if methods are decomposable (Chandrasekaran, 1986; Steels, 1990; Musen, Tu, Das & Shahar, 1996). For example, Figure 7 shows a decomposition of propose-and-revise into three sub-methods: Goalp, Revise and Transition. This decomposition is not the only one possible; it is simply one that we have implemented for the elevator-configuration task (Rothenfluh, Gennari, Eriksson, Puerta, Tu & Musen, 1996).

FIGURE 7. The propose-and-revise method decomposed into three sub-methods.

Although our implementation included this decomposition of the method, in the reuse example of Section 5 the problem-solving method was wrapped as a single CORBA component. If, instead, each sub-method in Figure 7 were available as a separate CORBA component, developers could modify a method simply by exchanging one sub-method for another. For example, one developer may need an efficient implementation of the Revise sub-method that uses a dependency network to recalculate state variables when applying a fix. For a different task, such an implementation might not be desirable: efficient revision is unimportant for tasks with few state variables, and the requirement of a dependency network could be an unnecessary burden. If these two implementations of Revise—one with a dependency network and one without—could be wrapped as CORBA components with similar (or identical) IDL specifications, then the problem-solving method could be configured to use one or the other version of Revise, depending on the task. Thus, if methods are designed to be decomposable when they are added to the reuse library, developers can reuse large portions of the components, yet remain free to replace sub-methods that are not appropriate for their task.

This reuse scenario is just one that would be possible with a CORBA library of decomposable problem-solving methods. Another example that we are pursuing is the reuse of the temporal-abstraction problem-solving method (Shahar, 1997). This method takes time-stamped data as input and produces abstractions of those data as output—in the medical domain, it might take red-blood-cell counts over time and infer, as an abstraction of those data, that a patient is anemic. The method can be used alone for some tasks (as embodied in the Résumé system; Shahar & Musen, 1993), or it can be reused as a sub-method of a larger problem-solving method to solve problems such as planning for medical care (Tu et al., 1995). To date, we have reused temporal abstraction as a sub-method in a number of different configurations, but always within the CLIPS programming environment. We are working to wrap both temporal abstraction and our planning methods as CORBA components to be added to our reuse library.

As we said earlier, if methods are decomposable, the method-description language must include information about how sub-methods are organized within the larger method. For example, the description of propose-and-revise would have to indicate the control flow among the three sub-methods. We are currently exploring different possible languages for expressing control-flow knowledge. Gil and Melz (1996) have implemented propose-and-revise as a method within their LOOM environment; their implementation includes both a decomposition of the method into a set of components and a language for describing the control flow among these components. However, just as our previous implementations are sharable only within the CLIPS system, their implementation is sharable only within the LOOM environment. We are currently collaborating with them to combine our CORBA implementations with a method-description language that incorporates elements from their implementation.
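As a rough illustration of the sub-method-interchange scenario (not our actual implementation; the interface and class names are invented), the Java sketch below shows two implementations of a Revise sub-method that share a single interface, so that a propose-and-revise driver can be configured with either one. In a CORBA setting, each implementation would be wrapped as a component whose IDL corresponds to this interface.

import java.util.Map;

// Hypothetical common interface for the Revise sub-method.
interface Revise {
    // Apply a fix and return the updated assignment of state variables.
    Map<String, Object> applyFix(Map<String, Object> state, String fix);
}

// Straightforward version: recompute every state variable after each fix.
// Adequate when the task has only a few state variables.
class SimpleRevise implements Revise {
    public Map<String, Object> applyFix(Map<String, Object> state, String fix) {
        // ...recompute all state variables here...
        return state;
    }
}

// Version that maintains a dependency network, so that only the variables
// affected by a fix are recalculated; worthwhile for large constraint networks.
class DependencyNetworkRevise implements Revise {
    public Map<String, Object> applyFix(Map<String, Object> state, String fix) {
        // ...consult the dependency network and recompute affected variables...
        return state;
    }
}

// The driver is configured with whichever Revise implementation suits the task;
// the other sub-methods in Figure 7 could be made pluggable in the same way.
class ProposeAndRevise {
    private final Revise revise;
    ProposeAndRevise(Revise revise) { this.revise = revise; }
}

Because both classes satisfy the same interface, swapping one for the other requires no change to the rest of the method.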
6.3. COMPONENT ADAPTATION
For every instance of reuse, there is some adaptation cost: the cost of adapting or reconfiguring the component to the task at hand. For example, when developers replace
a sub-method (as in Section 6.2) or build a mediating component (as in Section 5) to apply a problem-solving method to a new knowledge base and task, they are adapting the component to that new task. We want to make reuse cost-effective; therefore, our approach includes strategies for minimizing these adaptation costs.

Some adaptation costs are minimized by our use of the CORBA standard, which allows developers to wrap legacy code with a standard API so that they do not have to work at the level of inter-process communication when integrating and adapting legacy components. We have demonstrated our ability to wrap legacy code as a CORBA component with the propose-and-revise method. We are using the same approach to wrap the temporal-abstraction method and to add that method to our library of reusable CORBA components.

We believe that our use of mediating components also minimizes adaptation costs. As shown in Figure 6, all our mediating components share a canonical form: an implementation of a set of services required by the problem-solving methods in terms of a set of OKBC calls to a knowledge-base server. Thus, all mediating components for a given method component must offer the same set of services. From an implementation perspective, once a developer has built one mediator, successive adaptations to new domains become easier and easier, because all the mediators for these adaptations share so much structure. Thus far, our experience has been that adaptation costs decrease as the developer gains more experience with the method and with our architecture.

Our long-term goal is to reduce adaptation costs by building tools that perform semi-automatic mediator construction. Unlike automatic programming, mediator construction should be a tractable problem, because the mediators share structure. For example, given the IDL of a problem-solving method, we can easily generate a skeletal mediator that provides stub definitions, based on the method IDL, for all queries that the method may make of the mediator. Better yet, given a formal method-description-language specification of the requirements of a method, we can generate actual mediating functions that map information in a knowledge base to those requirements. As a specific example, for queries where the method ontology matches the knowledge base fairly well, we may be able to provide an implementation of the query using knowledge of OKBC and of the ontology for the knowledge base. A more sophisticated tool might ask developers for the terms in the knowledge-base ontology that correspond to particular method queries, and then construct the appropriate OKBC requests of the knowledge-base server.

The purpose of such a mediator-constructing tool is to decrease the adaptation costs associated with component construction. To understand how to decrease these costs, we must know what sorts of adaptation are typically required. Thus, we must build a library of reusable components, make the components available to a variety of knowledge-base system developers, and collect a set of reuse examples from which we can learn about typical component adaptations.

The example of method reuse that we present in this paper represents our initial attempt to learn about reuse in our architecture for the construction of knowledge-base systems. Our architecture emphasizes the development of separate mediating components and the use of the CORBA standard for component communication.
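As a purely illustrative rendering of the skeletal-mediator idea described above, the Java sketch below uses hypothetical interfaces: the KbServer interface merely stands in for an OKBC binding (it is not the actual OKBC API), and the service, class and slot names are invented, so none of this should be read as our implementation.

import java.util.List;

// Stand-in for an OKBC connection to a knowledge-base server (hypothetical API).
interface KbServer {
    List<String> getSlotValues(String frame, String slot);
    List<String> getInstances(String className);
}

// Hypothetical set of services that a propose-and-revise-style method
// requires of its mediator; a real set would be derived from the method IDL.
interface ProposeAndReviseServices {
    List<String> getStateVariables();
    List<String> getConstraints(String stateVariable);
    List<String> getFixes(String violatedConstraint);
}

// Generated skeleton for one domain: each stub maps a method query onto the
// ontology of the target knowledge base (class and slot names are invented).
class RibosomeMediator implements ProposeAndReviseServices {
    private final KbServer kb;
    RibosomeMediator(KbServer kb) { this.kb = kb; }

    public List<String> getStateVariables() {
        return kb.getInstances("helix-position");
    }
    public List<String> getConstraints(String stateVariable) {
        return kb.getSlotValues(stateVariable, "constraints");
    }
    public List<String> getFixes(String violatedConstraint) {
        return kb.getSlotValues(violatedConstraint, "fixes");
    }
}

A mediator-construction tool would generate the ProposeAndReviseServices stubs from the method's IDL; the developer, or the tool itself, would then supply the bodies that map method queries onto the ontology of the particular knowledge base.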
Both mediating components and the CORBA standard support the use of legacy code, and both allow developers to isolate their component adaptations. We have presented only a single case of method reuse here; however, we are
expanding both our library of reusable components and the set of example cases of component adaptation and reuse. By growing this set of reuse cases, we will eventually be able to measure the cost savings due to reuse, and to learn to build additional tools that help developers reuse components in a cost-effective manner.

This work has been supported in part by the Defense Advanced Research Projects Agency, and by the National Science Foundation (IRI-9257578). We would like to thank all members of the knowledge modeling group for their contributions to our ideas. We would also like to thank three anonymous reviewers for constructive suggestions and Lyn Dupré for editorial assistance.
References
AKKERMANS, H., WIELINGA, B. & SCHREIBER, G. (1994). Steps in constructing problem-solving methods. Proceedings of the 8th Banff Knowledge Acquisition for Knowledge-Based Systems Workshop, pp. 29.1–29.21. Banff, Canada.
ALTMAN, R. B., ABERNATHY, N. F. & CHEN, R. O. (1997). Standardizing representations of the literature: combining diverse sources of ribosomal data. Proceedings of the 5th International Conference on Intelligent Systems for Molecular Biology, pp. 15–24. Halkidiki, Greece.
ALTMAN, R. B., WEISER, B. & NOLLER, H. F. (1994). Constraint satisfaction techniques for modeling large complexes: Application to the central domain of the 16S ribosomal subunit. Proceedings of the 2nd International Conference on Intelligent Systems for Molecular Biology, pp. 10–18. Stanford, CA.
ANGELE, J., DECKER, S., PERKUHN, R. & STUDER, R. (1996). Modeling problem-solving methods in New KARL. Proceedings of the 10th Banff Knowledge Acquisition for Knowledge-Based Systems Workshop, pp. 1.1–1.18. Banff, Canada.
BACHANT, J. & MCDERMOTT, J. (1984). R1 revisited: four years in the trenches. AI Magazine, 5, 21–32.
BREUKER, J. A. & VAN DE VELDE, W., Eds. (1994). The CommonKADS Library for Expertise Modeling. Amsterdam: IOS Press.
BREUKER, J. (1997). Problems in indexing problem solving methods. Workshop on PSMs for Knowledge-Based Systems, pp. 16–36. Nagoya, Japan.
CHANDRASEKARAN, B. (1986). Generic tasks for knowledge-based reasoning: High-level building blocks for expert system design. IEEE Expert, 1, 23–30.
CHAUDHRI, V., FARQUHAR, A., FIKES, R., KARP, P. & RICE, J. (1998). The Open Knowledge Base Connectivity Protocol 2.02. [Online] Available http://www.ai.sri.com/~okbc/spec.html. February 1998.
CHEN, R. O., FELCIANO, R. & ALTMAN, R. B. (1997). RIBOWEB: linking structural computations to a knowledge base of published experimental data. Proceedings of the 5th International Conference on Intelligent Systems for Molecular Biology, pp. 84–87. Halkidiki, Greece.
DAVID, J., KRIVINE, J. & SIMMONS, R., Eds. (1993). Second Generation Expert Systems. Berlin, Germany: Springer-Verlag.
ERIKSSON, H., SHAHAR, Y., TU, S. W., PUERTA, A. R. & MUSEN, M. A. (1995). Task modeling with reusable problem-solving methods. Artificial Intelligence, 79, 293–326.
FENSEL, D. & GROENBOOM, R. (1997). Specifying knowledge-based systems with reusable components. Proceedings of the 9th International Conference on Software Engineering & Knowledge Engineering (SEKE-97), pp. 349–357. Madrid, Spain.
FENSEL, D., ANGELE, J. & STUDER, R. (1997). The knowledge acquisition and representation language KARL. IEEE Transactions on Knowledge and Data Engineering.
FININ, T., MCKAY, D., FRITZSON, R. & MCENTIRE, R. (1994). KQML—a language and protocol for knowledge and information exchange. In K. FUCHI & T. YOKOI, Eds. Knowledge Building and Knowledge Sharing. Amsterdam: Ohmsha and IOS Press.
GAMMA, E., HELM, R., JOHNSON, R. & VLISSIDES, J. (1995). Design Patterns: Elements of Reusable Object-Oriented Software. New York: Addison-Wesley.
GENNARI, J. H., ALTMAN, R. B. & MUSEN, M. A. (1995). Reuse with PROTÉGÉ-II: From elevators to ribosomes. Proceedings of the Symposium on Software Reuse, pp. 72–80. Seattle, WA.
GENNARI, J. H., GROSSO, W. & MUSEN, M. (1998). A method-description language: An initial ontology with examples. Proceedings of the 11th Banff Workshop on Knowledge Acquisition, Modeling and Management. Banff, Canada.
GENNARI, J. H., TU, S. W., ROTHENFLUH, T. E. & MUSEN, M. A. (1994). Mapping domains to methods in support of reuse. International Journal of Human-Computer Studies, 41, 399–424.
GIL, Y. & MELZ, E. (1996). Explicit representations of problem-solving strategies to support knowledge acquisition. Proceedings of the 13th National Conference on Artificial Intelligence (AAAI-96), pp. 469–476. Portland, OR.
GROSSO, W., GENNARI, J., FERGERSON, R. & MUSEN, M. (1998). When knowledge models collide (How it happens and what to do). Proceedings of the 11th Banff Workshop on Knowledge Acquisition, Modeling and Management. Banff, Canada.
GRUBER, T. R. (1993). A translation approach to portable ontology specifications. Knowledge Acquisition, 5, 199–220.
GUARINO, N. & GIARETTA, P. (1995). Ontologies and knowledge bases: Toward a terminological clarification. In N. J. I. MARS, Ed. Towards Very Large Knowledge Bases, pp. 25–32. Amsterdam: IOS Press.
KARP, P., MYERS, K. & GRUBER, T. (1995). The generic frame protocol. Proceedings of the International Joint Conference on Artificial Intelligence, pp. 768–774.
MARCUS, S., STOUT, J. & MCDERMOTT, J. (1988). VT: an expert elevator designer that uses knowledge-based backtracking. AI Magazine, 9, 95–112.
MOTTA, E., STUTT, A., ZDRAHAL, Z., O'HARA, K. & SHADBOLT, N. (1996). Solving VT in VITAL: A study in model construction and knowledge reuse. International Journal of Human-Computer Studies, 44, 333–401.
MOTTA, E. & ZDRAHAL, Z. (1998). A library of problem-solving components based on the integration of the search paradigm with task and method ontologies. International Journal of Human-Computer Studies (this issue).
MUSEN, M. & SCHREIBER, A. T. (1995). Architectures for intelligent systems based on reusable components. Artificial Intelligence in Medicine, 6, 189–199.
MUSEN, M. A., TU, S. W., DAS, A. K. & SHAHAR, Y. (1996). EON: a component-based approach to automation of protocol-directed therapy. Journal of the American Medical Informatics Association, 3, 367–388.
ORFALI, R., HARKEY, D. & EDWARDS, J. (1996). The Essential Distributed Objects Survival Guide. New York: Wiley.
PUERTA, A. R., EGAR, J. W., TU, S. W. & MUSEN, M. A. (1992). A multiple-method knowledge-acquisition shell for the automatic generation of knowledge-acquisition tools. Knowledge Acquisition, 4, 171–196.
ROTHENFLUH, T. E., GENNARI, J. H., ERIKSSON, H., PUERTA, A. R., TU, S. W. & MUSEN, M. A. (1996). Reusable ontologies, knowledge-acquisition tools, and performance systems: PROTÉGÉ-II solutions to Sisyphus-2. International Journal of Human-Computer Studies, 44, 303–332.
SCHREIBER, A. TH., WIELINGA, B., AKKERMANS, J. M., VAN DE VELDE, W. & DE HOOG, R. (1994). CommonKADS: a comprehensive methodology for KBS development. IEEE Expert, 9, 28–37.
SCHREIBER, A. TH. & BIRMINGHAM, W. P., Eds. (1996). Special issue on the Sisyphus-VT initiative. International Journal of Human-Computer Studies, 44, 275–568.
SHAHAR, Y. (1997). A framework for knowledge-based temporal abstraction. Artificial Intelligence, 90, 79–133.
SHAHAR, Y. & MUSEN, M. A. (1993). Résumé: a temporal-abstraction system for patient monitoring. Computers and Biomedical Research, 26, 255–273.
SHADBOLT, N., MOTTA, E. & ROUGE, A. (1993). Constructing knowledge-based systems. IEEE Software, 10, 34–38.
STEELS, L. (1990). Components of expertise. AI Magazine, 11, 30–49.
TU, S. W., ERIKSSON, H., GENNARI, J. H., SHAHAR, Y. & MUSEN, M. A. (1995). Ontology-based configuration of problem-solving methods and generation of knowledge-acquisition tools: Applications of PROTÉGÉ-II to protocol-based decision support. Artificial Intelligence in Medicine, 7, 257–289.
YOST, G. R. & ROTHENFLUH, T. R. (1996). Configuring elevator systems. International Journal of Human-Computer Studies, 44, 521–568.