Development of a knowledge-based design support system
T Smithers, M X Tang, N Tomes, P Buck*, B Clarke*, G Lloyd*, K Poulter*, C Floyd† and E Hodgkin†
A notion of design has been developed that is fundamentally different from others in the field. The focus has been on creative design across a number of domains, and a model of design as an exploratory activity, rather than as a form of search, has been developed. The exploration of a design problem's characteristics is an activity that creates and bounds the space within which possible design solutions can be located. Seeing design as an exploration and mapping of parameter space highlights the inherent complexity of the creative design process, and it has implications for the specification of knowledge-based design systems. The resulting design philosophy places the human designer at the heart of the exploration process, with the computer system, using integrated AI techniques, acting to support him/her throughout the design process. The Castlemaine project has adopted the philosophy of design support, and it evaluates the model of design exploration within the domain of pharmaceutical small-molecule design through the specification of a knowledge-based design-support system. The paper describes the background to the Castlemaine project, the research programme, and the status of the project in 1991.

Keywords: design support, pharmaceutical drugs, system architecture, design exploration, reason maintenance, blackboard control

Department of Artificial Intelligence, University of Edinburgh, Edinburgh EH1 2QL, UK
*Logica Cambridge Ltd., 104 Hills Rd, Cambridge CB2 1LQ, UK
†British Bio-technology Ltd., Brook House, Watlington Road, Oxford OX4 5LY, UK
Paper presented at Artificial Intelligence in Design '91 Conf. Edinburgh, UK (25-27 Jun 1991). Revised paper received 15 November 1991. Accepted 15 November 1991
Vol 5 No 1 March 1992

Research work at the University of Edinburgh, UK, has demonstrated that the development of effective computer-based support for designers needs to be driven by better modelling of the design process, and that an effective way of developing such an understanding is by the building of artificially intelligent systems that can be used to evaluate the knowledge processes involved in design 1. This methodology has been adopted in the formulation of the Castlemaine project, which is a two-and-a-half-year industrial and academic collaborative research project that is sponsored through the UK Information Engineering Directorate‡. The aim of this project is the evaluation and development of (a) the domain-independent model of the design process, and (b) an associated knowledge-based design-support system with an architecture that attempts to embody some of this model.

The applicability of the 'exploration-based model of design' developed by the Artificial Intelligence Department at the University of Edinburgh, UK, will be tested in two ways: first, by the construction of a design-support system that integrates various knowledge-based systems techniques to support the design activities of pharmaceutical drug designers, and second, by the use of a design case study from another design domain against which the design model can be evaluated.

So that the objectives set may be achieved, the project participants are required collectively to have an appreciation of the common design processes, to be expert in the fields of molecular design, to be capable of applying knowledge-based system techniques, and, preferably, to include a user of computer-based molecular-design tools.
As a result of these requirements, the Castlemaine project is a collaboration of four partners:
• Logica Cambridge, UK, which has experience in the development of conventional software and knowledge-based systems,
• the Department of Artificial Intelligence at the University of Edinburgh, UK, where a domain-independent understanding of the design process is currently being developed,
• British Bio-technology, UK, which brings a knowledge of molecular design to the project, and which is a user of computer-based molecular-modelling systems,
• CamAxys, UK, which is a company that has extensive database experience.

‡The Castlemaine project is one of the industrial and academic collaborative projects that has been set up under the UK Department of Trade and Industry Information Engineering Directorate initiative, which is the UK-government-directed information-technology research programme that followed on from the Alvey programme.

0950-7051/92/010031-10 © 1992 Butterworth-Heinemann Ltd

Figure 1. Hierarchy of drug-discovery approaches (legible labels: drug discovery, rational drug discovery, indirect drug design)
Figure 2. Drug fits into target receptor's binding site (labels: drug molecule, binding site)

The Castlemaine project will span two and a half years, and it is organized around the construction of two prototype systems. The purpose of the first of these, which is the preliminary prototype that is to be built in the first half of the project, is to assess the applicability of the exploration-based model of design developed by Edinburgh University to the domain of molecular design. The evaluation of the preliminary prototype will support and inform the definition of the architecture of the main prototype, which will be constructed in the latter half of the project. Towards the end of the project, a case study in software design will be selected against which the design model and the main prototype can be assessed.

The following sections present early project results. The domain of indirect drug design is described, the knowledge-acquisition and knowledge-representation techniques are presented, and an outline of the architecture of the preliminary-prototype drug design-support system is given.
DOMAIN: INDIRECT DRUG DESIGN

Towards rational drug discovery

Knowledge-elicitation sessions with British Bio-technology have emphasized the division of rational drug design into the two conceptually different approaches of direct drug design and indirect drug design. These are located in the hierarchy of drug-discovery approaches in Figure 1.

Traditional drug discovery has relied on the highly serendipitous method of screening a large set of existing candidate substances for ones with the desired physiological effect. As pharmacological knowledge has increased, so the typical candidate set has grown to contain many thousands of potential molecules. The screening method, although successful in the past, is now increasingly seen as being wasteful of research resources, and also as being unlikely to produce the ideal drug.

Increasingly, the approach is to design the drug at the molecular level, on the basis of an understanding of the relationship between drug structure and drug activity. This approach can be termed rational drug discovery, and it is based on the precept that the pharmacological activity of a drug is a direct consequence of its binding to the 'target' receptor molecule. That is, when a drug binds to a receptor, some biological change takes place, such as the opening of an ion channel. Any small molecule, of whatever origin, that binds to a receptor is more generally termed a 'ligand'; the paper restricts itself to referring to 'drugs', for the sake of simplicity. Drugs with a better 'fit' to a receptor bind more strongly, and their activity is thus improved. The popular metaphor of the lock and key is a useful one here, in which the drug 'key' is specifically designed to fit and operate the receptor 'lock' (see Figure 2). Fit, however, is not purely a matter of geometry, but is largely determined by the complementary distribution of chemical and electronic properties across the molecule and its receptor.

The requirements of the receptor's binding site, which is the part to which the drug is specifically attached, need to be modelled with a view to the definition of a drug molecule that can meet these requirements. When the nature of the binding site is known, the task of drug design entails the production of a complementary molecule. If the molecular structure and geometry of the receptor are known, for example via X-ray crystallography, then the drug designer knows the receptor's requirements exactly, and can use the direct approach in designing drugs for that site.
Unfortunately, the geometries of most pharmacologically interesting receptors have not been characterized at an atomic level, and the process of obtaining the molecular structure for a single receptor type may take many years.
Indirect drug design

More often, the designer has to take the indirect approach to drug design. The nature and size of a receptor binding site can often only be inferred from the drugs that the receptor most readily accepts. This approach is analogous to an attempt to infer the inside configuration of a lock from an examination of the keys that best fit it. Most molecules are flexible, and it is not known which 3D configuration any given molecule will take. This means that indirect approaches cannot directly model the geometry of the receptor. Obviously, the binding-site model produced from these techniques can only be an incomplete approximation.

The indirect drug-design approach models the 'pharmacophore', an arrangement of chemical properties that are thought to be necessary for specific binding to a particular receptor type. This model is derived from the analysis of a set of drugs, all of which produce the desired biological activity to some degree, and which are assumed to bind to the same receptor. This model is used to guide the design of new chemical molecules that target the same receptor type. The drug designed must be specific in that it binds to the target receptor, but not to any other type of receptor; overgeneralization produces unwanted side effects in the recipient of the drug.

In indirect drug design, the structure of the receptor and its binding site can be entirely unknown, or perhaps only partially postulated in terms of important features; the effort here is concentrated on the elucidation of the features and functional groups of the drugs that contribute to high activity. Analyses of structure-activity relationships (SAR) identify the salient structural features of the compounds that contribute to the pharmacophore model, and rank these structures in terms of their importance for activity. Conservative modifications of existing compounds are made with the goal of producing a novel drug with improved receptor fit and high activity. The basic assumption of this approach is that properties that are related to binding are effective over a short range. This means that any changes made to modify a compound are local in extent, and they do not affect the behaviour of the molecule as a whole.

There are additional critical factors that affect the way in which the drug is transported through, and metabolized by, the body, and these form part of the context for the novel compound's design.
These pharmacokinetic factors are given increasing importance later on in the design project, once an active compound has been successfully described, and the focus of development moves to the optimization of the compound for activity in the mammalian body.

The Castlemaine consortium has chosen indirect drug design as the subdomain of molecular design to be focused on for the first prototype system. The reasons for this choice are as follows:
• Indirect drug design relies on substantial amounts of expert knowledge and experimental techniques. This is the kind of task in which knowledge-based systems have a valuable part to play. The Product Formulation Expert System, developed through the Alvey programme 2, provides an exemplar of experientially based synthesis design support.
• More than 95% of the drug design work carried out by pharmaceutical companies is indirect drug design. The popularity of indirect design is not the main reason for choosing it for support. However, it is, naturally, more comfortable to support a mainstream approach.
• Direct drug design is essentially based round the geometric manipulation of well defined objects with the use of well defined data. This kind of task is one for which traditional computing techniques are more suitable than knowledge-based systems techniques, although 'expert systems' for this approach have been postulated 3.
MODELLING OF REASONING AND DEFINITION OF OBJECTS

Four main subtasks within the field of indirect drug design that appeared to be appropriate for support within the preliminary-prototype system were identified through a series of hierarchically organized knowledge-acquisition techniques. These subtasks spanned the whole of the initial drug-design process. They were as follows:
• Analyse a compound: Compounds are partitioned into pharmacologically meaningful building blocks that form the basis of an analysis of functional properties.
• Build a pharmacophore model: Patterns of functional properties are identified across a selected group of compounds. The overall pattern is called the pharmacophore model, and this is described in terms of a number of property features and the relationships between these features.
• Explain the structural relationships within the pharmacophore model: Structure-activity relationship data is elucidated; the relationship between the activity of a compound and its structure is described in relation to the features within a particular pharmacophore.
• Specify a novel compound: The pharmacophore patterns and SAR data are used to specify a new drug, or to suggest modifications that can be made to an existing drug to produce a novel drug with improved activity.
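The way the four subtasks feed one another can be illustrated with a small sketch. It is not taken from the Castlemaine system: the property table, the activity values, and every function name are hypothetical, compounds are reduced to lists of named building blocks, and a pharmacophore is reduced to the set of properties shared by all the compounds in the group.

```python
# Toy lookup: building block -> functional properties (illustrative values only).
BLOCK_PROPERTIES = {
    "4-substituted pyridine": {"aromatic"},
    "2-substituted pyridine": {"aromatic"},
    "amine": {"basic"},
    "butyl chain": {"lipophilic"},
}


def analyse(blocks):
    """Subtask 1: map each building block of one compound to its properties."""
    return {block: BLOCK_PROPERTIES[block] for block in blocks}


def build_pharmacophore(analyses):
    """Subtask 2: keep the properties delivered by every compound in the group."""
    per_compound = [set().union(*analysis.values()) for analysis in analyses]
    return set.intersection(*per_compound)


def explain_sar(compounds, feature):
    """Subtask 3: rank the blocks that deliver `feature` by compound activity."""
    best = {}
    for blocks, activity in compounds:
        for block in blocks:
            if feature in BLOCK_PROPERTIES[block]:
                best[block] = max(activity, best.get(block, activity))
    return sorted(best, key=best.get, reverse=True)


def specify_novel(blocks, feature, ranking):
    """Subtask 4: swap the block delivering `feature` for the top-ranked variant."""
    return [ranking[0] if feature in BLOCK_PROPERTIES[b] else b for b in blocks]


# Hypothetical group of compounds: (building blocks, measured activity).
compounds = [
    (["4-substituted pyridine", "amine"], 8.0),
    (["2-substituted pyridine", "amine", "butyl chain"], 5.0),
]

analyses = [analyse(blocks) for blocks, _ in compounds]
pharmacophore = build_pharmacophore(analyses)
ranking = explain_sar(compounds, "aromatic")
novel = specify_novel(compounds[1][0], "aromatic", ranking)
```

In this toy run the shared properties are "aromatic" and "basic", and the weaker compound is modified by exchanging its aromatic block for the variant found in the more active compound. The real subtasks rely on expert knowledge and on relationships between features, which this sketch leaves out.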
Figure 3 shows the breakdown of the indirect drug-design task into these four main subtasks. For the preliminary prototype system, each subtask has a separate support system (see rectangular boxes), which, in turn, comprises a number of independent knowledge sources (see rounded boxes).

Figure 3. Four main tasks of indirect drug design (support systems: analyse a compound; build a pharmacophore model; explain structure-activity relationships within pharmacophore feature; produce a new compound)

The task analysis led on to the modelling of the decision processes involved in the drug-design activities. This model is reflected in the design of the prototype system, which supports the decisions that a drug designer would normally make.

The early stages of drug design require an analysis of the design problem's characteristics, i.e. the parameterization and constraining of a model of the design problem. In pharmaceutical design, this analysis is largely evidence-driven, and it is based on a theoretical understanding of how chemical properties are distributed, and how they contribute differentially to the binding of a drug molecule to a receptor. The medicinal chemist takes a variety of different views of a molecule, focusing, for example, on the molecule's electronic properties, its size, and how different pieces of it respond to the presence of water.

It was necessary to reflect these different views in the objects with which the prototype design-support system could reason. First, a compound can be viewed as a number of constituent physical structures, each of which has homogeneous chemical characteristics that can be reasoned about unambiguously. For the project, these have been termed molecular components. Second, a compound can be seen as delivering a number of chemical properties, and these properties are distributed locally across the compound. Any physical structure may be involved in the delivery of more than one type of property, and so there is not a one-to-one mapping between structures and chemical properties. The units that deliver chemical properties within a compound have been termed molecular fragments, and a fragment is made up of one or more molecular components. The fragments may, by this definition, overlap each other in structural terms.

Components and fragments are defined for individual molecules. Two other objects are defined that relate to the modelling of the drug-design problem. These are the pharmacophore model and the pharmacophore model's features, and these relate to a group of compounds. The feature is defined by a pattern of chemical properties that is found across a set of compounds that is being analysed. As such, it is contextual, and it is abstracted from the physical structures of the molecule.

Figure 4 represents the main subtasks that are supported (see rounded boxes), and the representations that they operate on (see rectangular boxes). Physical structures are represented as molecular diagrams, and property groupings are represented as tinted areas. The top-level feedback loop is given for simplicity, but there
are, of course, intermediate feedback loops in which aspects of problem analysis and synthesis are explored.

One of the major objectives of indirect drug design is the determination of the pharmacophore. The pharmacophore can be seen as a hypothesis about the arrangement of chemical properties that are required for the desired biological activity to be achieved. As such, it integrates across the patterns of properties found for individual compounds, identifying the common properties and the relationships between these properties. Each of the common subpatterns within this model is a feature. In terms of medicinal chemistry, a feature is an identifiable set of properties that work cooperatively on a local basis to affect the way in which the compound binds to its receptor.

The pharmacophore model is analysed in terms of the structural variation of the molecules that contribute to the definition of each of its features. The usefulness of any piece of structural information is related to how active the compound is in binding to the target receptor. Compound activities are referenced through the molecule knowledge base. This analysis of structure-activity relationships forms an important part of the problem definition that, in combination with the pharmacophore model, allows possible design solutions to be generated.

Figure 4. Castlemaine model of indirect drug-design activities (legible labels: molecule knowledge base; molecular components; molecular fragments; pharmacophore features & relationships; novel molecule; example ranking 'aromatic-a: 4-substituted pyridine > 2-substituted pyridine')

A pharmacophore model is described in terms of the properties (that is, the functions) that are delivered by each of its features. There are a number of ways in which variant structures can deliver the same set of properties. Design options are generated by the specification of variants in molecular structure within each of the pharmacophore features, these being chosen in such a way as to maximize the activity of the compound as a whole. A molecular structure that can be used to replace a different molecular structure, while a given profile of properties is retained, is called an isostere. A pharmacophore model in combination with its related structure-activity analysis can be used to determine plausible isosteric replacements for the design of novel compounds.

DESIGN

Pharmaceutical drug-design projects fall largely into the category of creative design, rather than innovative design or routine design 4. The search is for compounds of significant and demonstrable novelty that will merit patents. This is not to say that each novel drug does something different from its predecessors; it may achieve the same physiological effect, but it must use new science to
achieve this effect in a creative way. If a design solution is not innovative, then it is not patentable.

Design can be divided broadly into the phases of formulation, synthesis and evaluation 5, and much of the effort of indirect drug design is concentrated on the formulation phase. Most current research on design bypasses this early stage of formulation, and considers design projects from synthesis onwards. Design formulation is consequently poorly understood and poorly modelled.

An artefact can be defined in terms of its function, behaviour and structure, and Gero 4 suggests that the interrelationships of function, behaviour and structure are stored as schemata, or prototypes, that the designer calls on. Gero and Roseman 6 suggest that the functional description is used to derive the artefact's expected behaviours. These expected behaviours dictate which actual behaviours are derived from the structural description.

The functionality of a novel drug is stated in physiological terms (to lower blood pressure, for example). This physiological level of functional description cannot readily be used to provide a description of the expected behaviours for a drug molecule at the atomic level, as there is no known set of correspondences between physiological function and the anatomy and behaviour of the receptor molecule. The functionality for the design project can be restated at a lower level of description to make the design problem more tractable: to produce biological activity at a particular level in a specified receptor. However, this does not alter the basic fact that the expected behavioural attributes of the drug molecule cannot be derived, as the behaviour of the drug's receptor is not known.

The structural vocabulary of pharmaceutical design is straightforward; a molecule must be composed of atoms, and these are largely drawn from the organic subset of atoms.
The laws of physics dictate how atoms combine into molecules, and determine the behavioural properties of the resulting structures and substructures. In most other domains of design, there is a known mapping between behaviours and structures; in molecular design, the behaviours of individual atoms are contextual, because of the complex interrelationships of physical forces. In medicinal chemistry, behaviour can be derived from structure only on a local basis. That is, given a small number of connected atoms, their chemical behaviour can be hypothesized with reasonable certainty, as many physical properties are short-range, and thus largely noncontextual. There are, however, longer-range properties that modulate local behaviours, reducing the certainty of the hypothesis about their relationship to structure. As the number of atoms under consideration increases, so the certainty about behaviour decreases, because of the complexity of the interrelated physical properties. In this respect, pharmaceutical design is quite unlike design in other domains such as architectural or engineering design, where the relationships between behaviours and structures can be enumerated. How well does indirect drug design fit with a description of schema-based design? There are no preexisting schema that can be used to inform the novel design, because of the creative nature of the design, and the lack of knowledge of the receptor's behaviours, and so there is no means of deriving expected behaviours from the functional description. Instead, the expected behaviours
must be generalized from the actual behaviours and structures of a number of molecules that have similar functionality. Only when the expected behaviours have been inferred can the design of a novel molecule proceed into the synthesis phase, and these expected behaviours may have to be revised in the process of exploration of design solutions. It might be possible to look at the process of modelling the pharmacophore as being analogous to the construction of a schema that is used to inform the specification of design instances; however, as the model relates to a specific body of evidence, the notion of the pharmacophore as a schema lacks generality. The starting point for a drug-design project is usually some molecule that exhibits the desired functionality, and that provides a lead. The molecule may not exhibit this functionality very strongly, and it may, indeed, exhibit the inverse of the desired function; all that is required is some correlation with the desired functionality. Often, several candidate molecules can be identified that show similar functionalities, or the single starting compound is used to generate structural variants that provide evidence for analysis. These molecules are solutions (of varying quality) to aspects of the design problem in hand, and they are the starting point for the new design. This suggests some similarity with case-based design s,v, where earlier design episodes are used to provide analogical solution processes or methods of exploration. A case-based model of the design process has been applied by Maher to design synthesis, but not to design formulation. Case-based reasoning relies on episodic information about previous similar designs that can provide a link from function to behaviours and structure. Episodic information of sufficient specificity is seldom available in relation to a particular receptor target. 
A starting molecule's value is principally as an analogous solution, and it is limited to those behaviours that can be inferred from the interrelationships of its structural elements. Episodic information is brought to bear on a new problem in the more general form of the designer's repertoire of domain methods.

Dasgupta s characterizes design as an evolutionary process that is modelled on the scientific hypothesis-testing method. In Dasgupta's hypothesis-testing model, the designer sets out to test and evaluate the current hypothesis. Indirect drug-design formulation has much in common with scientific hypothesis testing, and it is consonant with the exploration-based model of design s, which is outlined in the next section of the paper. The designer specifically seeks evidence from which hypotheses can be tested. These hypotheses are evolved through exploration of aspects of the design problem that are dependent on the strength and utility of the available evidence. Where evidence is lacking, or hypotheses are not testable, further data may be generated. The exploration is based on substantive and methodological considerations. This exploration results in a model that embodies a set of hypotheses about the nature of the drug receptor, and this model is used to generate testable design specifications. The evaluation of a design specification may lead to the refinement or reconstruction of the model through further exploration of the characteristics of the design solutions that can be derived from the current model. There are two types of exploration, therefore, the first being part of the design formulation that is directed at the production of a set of expected behaviours, and the second being part of the design specification and evaluation that is directed at the production of the set of actual behaviours that specify the final design. These types of exploration are interdependent.

Exploration-based model of design

The exploration-based model of design characterizes design as an exploratory process, and describes how knowledge that underlies the design is organized, applied and generated, as shown in Figure 5. The design exploration process evolves a user-defined initial design-requirement description into a final design-requirement description. The initial design-requirement description is usually a weak initial statement of the expected functionality of the artefact to be designed that may be incomplete, ambiguous and inconsistent. As the design proceeds, more of the space of possible designs is explored. Through this exploration, the initial design-requirement description becomes more thoroughly defined; this results finally in a complete and consistent description of the artefact's functionality. This is the final design-requirement description. A final design specification that is consistent with the final design-requirement description is also developed. This is a technical description of the solution that will deliver the functionality described in the final requirement description.

Figure 5. Knowledge process underlying exploration-based design (labels: knowledge application, knowledge generation, knowledge transfer)

The design exploration process is a collection of interconnected activities, such as search, analytical assessment, problem decomposition, parameterization, synthesis and optimization, that is performed in a sequential or concurrent manner. The record of activities, decisions made, and rationale behind each decision constitutes a history of the design process. The history, the final requirements description, and the design specification form the design-description document. The design space is explored by the application of knowledge of the domain together with knowledge about how to design in the domain. Domain knowledge partially defines the space of possible designs to be explored. Domain knowledge, together with knowledge about the methods that can be used to explore that particular domain, define the design knowledge base.

ARCHITECTURE OF PRELIMINARY PROTOTYPE
The exploration-based model of design grew out of, and informed the design of, the Edinburgh Designer System (EDS), which was part of the Alvey Design to Product project 9. EDS was a system that integrated various representations, reasoning and control subsystems to support mechanical design. The exploration-based model of design and EDS influenced the architecture of the Castlemaine preliminary-prototype system.

The architecture of the Castlemaine preliminary prototype encompasses the portion of the model with the tinted background shown in Figure 5 for the domain of indirect drug design. In the development of the architecture, the primary goal has been to provide a design-support system in which the user is actively involved in problem solving, rather than an autonomous problem-solving system, which is the aim of many conventional expert systems. This is because the designer's exploration of a problem may take many different routes and involve different problem decompositions, posing problems of control for an autonomous system.

A design-support system must take on the role of assistant, and appear to be competent in the design domain. For this to be so, a number of requirements have been identified that must be addressed by the architecture:
• Complexity management: The system should enable complex design explorations to be undertaken by transferring some of the cognitive load from the users to the system. This cognitive load is associated with various design activities, such as the recording and structuring of the information generated by the consideration of large numbers of possible design choices.
• User support: The system should act as an assistant to the user by cooperatively working with the user, by relieving users of some of their workload by performing the mundane design subtasks and providing support for the more complex ones. The system should be able to account to the user for the reasoning processes that it follows.
• Context management: For many design tasks, the user wishes to explore simultaneously multiple avenues to a solution. This requires some form of context management so that mutually inconsistent solutions can be examined. • Reason maintenance: Some design subtasks may use default reasoning. A reason-maintenance system enables the dependencies between inferences made during problem solving to be recorded, and consistency between them to be maintained. • Knowledge representation: The system must be able to represent, in a structured fashion, the different types of knowledge commonly used during design, such as those found in textbooks and design handbooks. In addition, the system must be capable of applying this in a timely fashion to the performance of various design subtasks, as the user requires. These requirements have been satisfied to various extents 37
by the architecture of the preliminary prototype, shown in Figure 6, which is described below.

[Figure 6. Architecture of preliminary prototype. Labels recoverable from the figure: user interface; developer interface; menu hierarchy; user control panel; Lisp listener; user information panel; object browser; molecule display window; interaction manager; gateway; external application; external database; drug design knowledge base; initial object hierarchy; hierarchy elaboration; control; rule sets; support systems (analyse compounds, build pharmacophore model, explain SAR, specify a novel compound); design history (user assertions, dependencies).]

The architecture is based on a blackboard model of control that is centred around the concept of multiple agents that communicate via a global workspace, referred to as the blackboard, to solve a problem cooperatively [10]. In the Castlemaine prototype system, there are two types of agents: the support systems, with their related knowledge sources, and the user, who is treated as a high-priority knowledge source.

There are four support systems, each of which performs some design-support task. Each support system corresponds to one of the four major tasks within indirect drug design determined from the task analysis described above. A support system's overall task can be decomposed into self-contained subtasks, each of which is implemented as a knowledge source. A knowledge source, in this implementation, comprises forward-chaining rules. The support-system and knowledge-source hierarchy is shown in Figure 3.

There is a need to control the invocation of knowledge sources so that design subtasks are performed in a computationally efficient manner and at the user's command. This is achieved through a mechanism based on GoldWorksII rule sets. A rule set is an object that controls the activation and deactivation of a set of rules assigned to it. When a rule set is deactivated, none of its rules are used to generate any items for the agenda in the pattern-matching phase of the match-fire cycle. If all the rule sets are deactivated, nothing gets onto the agenda. Rules within a rule set can control the activation or deactivation of any rule set, which means that a rule set can have a low-priority rule whose role is to deactivate the rule set itself, and then activate an alternative rule set. In this way, different rule sets can be chained together to perform some complex task, with an efficient use of the limited system resources.
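The chaining of rule sets described above can be illustrated with a minimal sketch. It is written in Python rather than GoldWorksII, and it does not reproduce the GoldWorksII rule-set interface: the class names `Rule`, `RuleSet` and `Engine`, the helper `hand_over`, the rule-set names and the facts are all hypothetical and chosen only for illustration. The sketch assumes that the highest-priority matching rule fires on each cycle.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Rule:
    name: str
    priority: int                            # higher priority fires first
    condition: Callable[["Engine"], bool]
    action: Callable[["Engine"], None]


@dataclass
class RuleSet:
    name: str
    rules: List[Rule] = field(default_factory=list)
    active: bool = False


class Engine:
    """Toy match-fire loop in which only active rule sets feed the agenda."""

    def __init__(self) -> None:
        self.rule_sets: Dict[str, RuleSet] = {}
        self.facts: Set[str] = set()
        self.fired: List[str] = []

    def add(self, rule_set: RuleSet) -> None:
        self.rule_sets[rule_set.name] = rule_set

    def set_active(self, name: str, active: bool) -> None:
        self.rule_sets[name].active = active

    def agenda(self) -> List[Rule]:
        # Match phase: rules in deactivated rule sets are never considered.
        matched = [rule for rs in self.rule_sets.values() if rs.active
                   for rule in rs.rules if rule.condition(self)]
        return sorted(matched, key=lambda rule: -rule.priority)

    def run(self, max_cycles: int = 50) -> None:
        for _ in range(max_cycles):
            agenda = self.agenda()
            if not agenda:                   # nothing active, or nothing matches
                return
            agenda[0].action(self)           # fire phase
            self.fired.append(agenda[0].name)


def hand_over(current: str, following: str = "") -> Rule:
    """Low-priority rule that switches its own rule set off and the next one on."""
    def action(engine: Engine) -> None:
        engine.set_active(current, False)
        if following:
            engine.set_active(following, True)
    return Rule(f"leave-{current}", 0, lambda engine: True, action)


def demo() -> List[str]:
    engine = Engine()
    engine.add(RuleSet("analyse", [
        Rule("find-fragments", 10,
             lambda e: "compound" in e.facts and "fragments" not in e.facts,
             lambda e: e.facts.add("fragments")),
        hand_over("analyse", "build-model"),
    ]))
    engine.add(RuleSet("build-model", [
        Rule("propose-model", 10,
             lambda e: "fragments" in e.facts and "model" not in e.facts,
             lambda e: e.facts.add("model")),
        hand_over("build-model"),
    ]))
    engine.facts.add("compound")
    engine.set_active("analyse", True)       # e.g. requested by the user
    engine.run()
    return engine.fired
```

In this sketch, the hand-over rule fires only once the higher-priority rules in its rule set no longer match, so each rule set completes its work before control passes to the next one. Once the last rule set has deactivated itself, the agenda is empty and the loop stops.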
Rule sets can be used for forward chaining, backward chaining and goal-directed forward chaining. This provides a higher-level control over knowledge sources, and allows a degree of system-resource control and focusing of control to be achieved. Multiple rule sets can be defined and organized as a chain by placing generic rules in each rule set that control its activation and deactivation. Usually, when a rule set is activated, all the other rule sets are deactivated. This effectively focuses the system's resources on the relevant knowledge sources. It also prevents two knowledge sources that share the same objects in the dynamic knowledge base from generating conflicting agenda items. Rule sets can be constructed in line with the design-task decomposition so that each knowledge source is assigned to a unique rule set. The invocation of knowledge sources can thus be controlled by the activation and deactivation of the relevant rule set in the system. This may be done either by the rule set itself, or by the user of the system. In the Castlemaine drug design-support system architecture, the user has been given the highest priority in the control of the knowledge-source invocation. A main menu hierarchy is provided in the user interface to allow the user to activate a rule set so that explicit control of the inference process can be achieved. This rule-set-based control scheme allows several important issues, such as the design-task decomposition, control and focusing of the system's resources, to be consistently embodied in the architecture.

The core of the knowledge representation for the preliminary prototype is the drug-design hierarchy that has been constructed within the GoldWorksII frames system. In addition to the comprehensive hierarchy of drug-design concepts generated during knowledge acquisition, there is a version that has been optimized for the implementation of the preliminary prototype.
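As a rough illustration of what a frames hierarchy provides, the following Python sketch shows slot inheritance along is-a links and a value derived over has-parts links. It is not the GoldWorksII frames system. The concept names are borrowed from Figure 7, but their arrangement here, the `Frame` class, the `domain` and `weight` slots, the instance names and the `total` helper are assumptions made for the example.

```python
from typing import Any, List, Optional


class Frame:
    """A named frame with slots, one is-a parent and a list of parts."""

    def __init__(self, name: str, is_a: Optional["Frame"] = None, **slots: Any) -> None:
        self.name = name
        self.is_a = is_a
        self.slots = dict(slots)
        self.parts: List["Frame"] = []       # has-parts links

    def get(self, slot: str) -> Any:
        # Look for the slot locally, then walk up the is-a chain.
        frame: Optional[Frame] = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.is_a
        raise KeyError(slot)

    def is_kind_of(self, ancestor: "Frame") -> bool:
        frame: Optional[Frame] = self
        while frame is not None:
            if frame is ancestor:
                return True
            frame = frame.is_a
        return False


def total(whole: Frame, slot: str) -> float:
    """Derive a value for a frame from the same slot on each of its parts."""
    return sum(part.get(slot) for part in whole.parts)


# Concept frames present at the start of a session (arrangement assumed).
chemical_concept = Frame("chemical concept", domain="indirect drug design")
molecular_component = Frame("molecular component", is_a=chemical_concept)
molecular_fragment = Frame("molecular fragment", is_a=chemical_concept)

# Instances added as the hierarchy is elaborated (hypothetical values).
component_1 = Frame("component-1", is_a=molecular_component, weight=12.0)
component_2 = Frame("component-2", is_a=molecular_component, weight=16.0)
fragment_1 = Frame("fragment-1", is_a=molecular_fragment)
fragment_1.parts = [component_1, component_2]
```

Here `fragment_1.get("domain")` is answered by inheritance from the root concept, while `total(fragment_1, "weight")` shows one simple way in which an attribute of a whole could be determined from the attributes of its parts.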
The drug-design hierarchy at the outset of a session initially holds a set of compounds and concepts for the modelling of compounds at different levels of abstraction, as outlined above. This corresponds to the domain knowledge in the design model. During a session, as the design process proceeds, the hierarchy is elaborated with new instances that represent structural and functional aspects of compounds, and pharmacophore models with their features. The design specification for a novel compound would be derived from the information stored in the frame hierarchy, as shown in Figure 7. The boxes represent chemical concepts, and the arrows represent relationships between them. For example, a molecular fragment, implemented as an object, comprises molecular components, also implemented as objects. A molecular fragment has the attributes primary property and secondary property, determined from the attributes of the molecular components comprising it.

[Figure 7. Relationships between chemical concepts. Concepts legible in the figure: drug, chemical line notation, molecular component, molecular fragment, chemical concept, secondary property, pharmacophore quality, pharmacophore. Relationship labels: is-a, has-parts, connected-to, determines, formed-from.]

The design history records the changes that take place in the dynamic database, and it holds a justification-based chronological record of the design process. The design history is built up during inference by the support systems, and from input given by the user. It records

• assertions made by the user during a design session,
• the reasoning history, i.e. the inferences in the form of changes to the drug-design hierarchy made by rule firings, and the supporting justifications for a rule's triggering,
• the temporal ordering of rule firings.

This information is used by the system to provide an explanation of its inferencing. For the first prototype system, the history record also includes snapshots of significant past states, which enable the user to return to an earlier stage in the design process, and take an alternative route. This is a simple form of context management.

The success of a design-support system largely depends on the ease of interaction with the user. The user interface of the preliminary prototype has been designed so that the user has full control over the system, enabling him/her to pursue his/her intentions as desired during the design process, with the system acting as an assistant. Also, the interface allows the communication of large volumes of data associated with indirect drug design between the user and the system. This functionality is provided by the four major subcomponents that comprise the user interface, as follows:

• The menu hierarchy allows the user to invoke support systems through the control module.
• The user control panel enables the user to load/save compounds from/to text files, or to add, delete or modify compounds in the drug-design hierarchy. Also, a facility is provided so that users can search and browse through the drug-design hierarchy. The user control panel makes use of the gateway.
• The user information panel displays system-status information that informs the user whether some expected work has been performed by the system.
• The molecule-display window is a multiple window display area that displays simultaneously up to nine molecules. Active graphics images are used so that the user can interact with the system through the graphical displays. This is particularly important in drug design, where medicinal chemists are used to thinking about and manipulating 2D graphical representations.

There is a requirement to convert the textual chemical formulae of compounds into the internal representation (instances in the drug-design hierarchy), which can be reasoned about, and vice versa, to retrieve formulae from the internal representation. These chemical formulae may be provided or needed by the user, a chemical database, or an external applications package, such as a molecular-modelling system. In the preliminary prototype, this function is performed by the gateway.

The interaction manager is concerned with the workings of subsystems within the whole design-support system, and it is, consequently, closely linked to the functioning of the control system and the user interface. It manages the interactions between the subsystems, between foreign languages, to different interfaces and to external applications or databases.
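The record-keeping role of the design history can be illustrated with a small Python sketch. It is a simplified stand-in, not the prototype's implementation: the `DesignHistory` and `Entry` classes, their method names, and the example keys and rule name are all hypothetical. The sketch also assumes that returning to a snapshot discards the entries recorded after it, which the paper does not state.

```python
import copy
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Entry:
    step: int                                # temporal ordering
    kind: str                                # "assertion" or "inference"
    key: str
    value: Any
    rule: Optional[str] = None               # rule that fired, if any
    justifications: Tuple[str, ...] = ()     # keys that supported the firing


class DesignHistory:
    def __init__(self) -> None:
        self.state: Dict[str, Any] = {}
        self.entries: List[Entry] = []
        self._snapshots: Dict[str, Tuple[int, Dict[str, Any]]] = {}

    def _record(self, kind: str, key: str, value: Any,
                rule: Optional[str] = None,
                justifications: Iterable[str] = ()) -> Entry:
        entry = Entry(len(self.entries) + 1, kind, key, value,
                      rule, tuple(justifications))
        self.entries.append(entry)
        self.state[key] = value
        return entry

    def assert_fact(self, key: str, value: Any) -> Entry:
        """Record an assertion made by the user."""
        return self._record("assertion", key, value)

    def infer(self, rule: str, key: str, value: Any,
              because: Iterable[str]) -> Entry:
        """Record a change made by a rule firing, with its justifications."""
        return self._record("inference", key, value, rule, because)

    def snapshot(self, label: str) -> None:
        self._snapshots[label] = (len(self.entries), copy.deepcopy(self.state))

    def restore(self, label: str) -> None:
        """Return to an earlier state so that another route can be taken."""
        count, state = self._snapshots[label]
        self.entries = self.entries[:count]
        self.state = copy.deepcopy(state)
        # Snapshots taken on the abandoned route are no longer meaningful.
        self._snapshots = {name: snap for name, snap in self._snapshots.items()
                           if snap[0] <= count}

    def explain(self, key: str) -> str:
        for entry in reversed(self.entries):
            if entry.key == key:
                if entry.kind == "assertion":
                    return f"step {entry.step}: {key} was asserted by the user"
                reasons = ", ".join(entry.justifications)
                return (f"step {entry.step}: {key} was inferred by rule "
                        f"{entry.rule} from {reasons}")
        return f"{key} has no recorded history"
```

A session might assert a fact, take a snapshot, let a rule add an inference, ask `explain` why the inferred value holds, and then call `restore` to go back to the snapshot and try a different route.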
SUMMARY

The development of effective computer support for designers needs to be driven by better modelling of the design process, and an effective way of developing such an understanding is by the building of systems that integrate artificial-intelligence techniques through which this process can be modelled. This methodology has been adopted by the Castlemaine project, and this has resulted in the specification of an architecture for a molecular drug design-support system.

In the first phase of the project, a prototype design-support system to aid molecular drug designers has been constructed. This prototype integrates a number of knowledge-based-system techniques. The construction and evaluation of this prototype has further tested the 'exploration-based model of design' that was developed at the University of Edinburgh. Work in the second half of the project is extending the exploration-based model of design through the development of a main prototype design-support system. The aim is to make the architecture of the main prototype more generic, so that systems based on this architecture can be applied to other domains, such as software design, as well as to drug design.
ACKNOWLEDGEMENTS

The work discussed in this paper is supported by grant number GR/F3567.8 from the UK Science and Engineering Research Council, and through the UK Department of Trade and Industry's Information Engineering Directorate.
REFERENCES

1 Smithers, T, Conkie, A, Doheny, J, Logan, B and Millington, K 'Design as intelligent behaviour: an AI in design research programme' in Gero, J (Ed.) Artificial Intelligence in Design Springer-Verlag, UK (1989)
2 Skingle, B 'An introduction to the PFES project' Proc. Avignon 90: 10th Int. Wkshp. Expert Systems & Their Applications (1990) pp 907-922
3 Lewis, R A and Dean, P M 'Automated site-directed drug design: the formation of molecular templates in primary structure generation' Proc. Royal Soc. Lond. B Vol 236 (1989) pp 141-162
4 Gero, J S 'Prototypes: a new schema for knowledge-based design' Working Paper Architectural Computing Unit, University of Sydney, Australia (1987)
5 Maher, M L 'Process models for design synthesis' AI Magazine (Winter 1990) pp 49-58
6 Gero, J S and Rosenman, M A 'A conceptual framework for knowledge based design research at Sydney University's Design Computing Unit' in Gero, J (Ed.) Artificial Intelligence in Design Springer-Verlag, UK (1989)
7 Zhao, F and Maher, M L 'Using analogical reasoning to design buildings' Eng. Comput. Vol 4 (1988) pp 107-119
8 Dasgupta, S 'The structure of the design process' Adv. Comput. Vol 28 (1989) pp 1-67
9 Smithers, T 'The Alvey large scale demonstrator project Design to Product' in Bernhold, T (Ed.) Artificial Intelligence in Manufacturing, Key to Integration North-Holland (1987) pp 251-261
10 Nii, H P 'Blackboard systems: Part 1' AI Magazine Vol 7 (1986) pp 38-53
Knowledge-Based Systems