Object-oriented design and programming in medical decision support


Computer Methods and Programs in Biomedicine, 36 (1991) 239-251 239 © 1991 Elsevier Science Publishers B.V. All rights reserved 0169-2607/91/$03.50 ...




© 1991 Elsevier Science Publishers B.V. All rights reserved 0169-2607/91/$03.50 COMMET 01247

Section II. Systems and programs

Heather Heathfield 1, Jim Armstrong 2 and Nigel Kirkham 3

1 IBM (UK) Scientific Centre, Winchester, U.K., 2 Information Technology Research Institute, Brighton Polytechnic, Brighton, U.K. and 3 Histopathology Department, Royal Sussex County Hospital, U.K.

The concept of object-oriented design and programming has recently received a great deal of attention from the software engineering community. This paper highlights the realisable benefits of using the object-oriented approach in the design and development of clinical decision support systems. These systems seek to build a computational model of some problem domain and therefore tend to be exploratory in nature. Conventional procedural design techniques do not support either the process of model building or rapid prototyping. The central concepts of the object-oriented paradigm are introduced, namely encapsulation, inheritance and polymorphism, and their use illustrated in a case study, taken from the domain of breast histopathology. In particular, the dual roles of inheritance in object-oriented programming are examined, i.e., inheritance as a conceptual modelling tool and inheritance as a code reuse mechanism. It is argued that the use of the former is not entirely intuitive and may be difficult to incorporate into the design process. However, inheritance as a means of optimising code reuse offers substantial technical benefits.

Decision support; Inheritance; Medical system; Object-oriented; Software engineering

1. Introduction

In recent years the computer science community has shown increasing interest in the object-oriented design paradigm [1-3]. The conventional procedural design model utilises functional decomposition to identify the set of tasks required to solve a problem [4,5]. In contrast, the object-oriented approach seeks to identify the objects of a domain and their behaviours. Proponents of the object-oriented design paradigm have claimed

Correspondence. H. Heathfield, IBM UK Scientific Centre, Athelstan House, St. Clement Street, Winchester, Hampshire, SO23 9DR, U.K.

that it provides considerable benefits over procedural design techniques, including encapsulation, re-use and extendibility [6]. This paper highlights the realisable benefits of using the object-oriented approach in the design and development of clinical decision support systems. Such systems are often intended as research vehicles to explore the potential role of decision support in a given domain, and therefore present a unique set of software engineering problems. We begin by examining the problematic aspects of designing and developing clinical decision support software. The limitations of procedural design techniques are indicated. A brief introduction to the object-oriented paradigm is


given. This is not intended to be a comprehensive account and for further information the reader is referred to the following texts [7-9]. We illustrate the use of object-oriented design with an example taken from the domain of breast histopathology. This discussion concentrates upon the design aspects of knowledge representation and inference. In particular, the dual roles of inheritance are explored, i.e., inheritance as a conceptual modelling tool and inheritance as a code reuse mechanism. Our study indicates that using inheritance to model abstract domain hierarchies is not necessarily an intuitive process. However, inheritance as a means of optimising code reuse offers substantial technical benefits. Finally, we discuss the object-oriented design paradigm in the context of general problem solving philosophies for clinical decision support systems.

2. The nature of software in clinical decision support

The influence of computers on medical practice has already been substantial, particularly in the areas of record keeping, data acquisition and analysis and information storage and retrieval. Hospital information systems are widely used, providing both communication links and information management functions to medical personnel. Computer technology is being employed to facilitate rapid and convenient access to biomedical literature. For example, the MEDLINE system's bibliographic database contains in excess of 900000 references to recent journal literature [10]. In addition to computer software which functions in an information management capacity, computer programs may be designed to assist clinicians in decision making *. The relatively new discipline of artificial intelligence and increasing interest in expert systems has stimulated

* Information management software may indirectly affect the decisions made by medical personnel. However, decision support systems provide explicit advice based on patient-specific data.

much research into the provision of clinical decision support tools. Experimental systems have been built to assist in a plethora of diagnostic and treatment planning problems [11]. Software intended to provide information management services draws on established techniques of data processing. Its objectives are relatively well defined, and its role is clearly understood by medical personnel. This contrasts with the comparatively new discipline of artificial intelligence, where techniques are unfamiliar and end-user requirements are largely unexplored. The evolutionary nature of such systems is reflected in the literature, and despite the 10000 documented studies of clinical decision support systems, few are used in routine clinical practice [12].

The essentially exploratory nature of most clinical decision support systems means that they cannot be formally specified at their outset. In part this is due to the lack of understanding between computer scientists and clinicians. Clinicians often find it difficult to articulate their requirements. This problem is compounded by the fact that they often have no conception of the potential benefits of decision support systems, nor of their limitations. Consequently system development tends to proceed through prototyping, and is characterised by numerous cycles of clinical test and system amendment, with user feedback determining which aspects of the system require modification. This technique of prototyping poses a particular set of software design problems.

Decision support systems are intended to assist the clinician in decision making. The requirements for a decision support system are usually difficult to specify, and thus the first approach to design involves building a computational model of the problem domain. This process of model building is not elegantly represented in the task-based decomposition of procedural design paradigms. The design units produced by task-based decomposition are procedures that perform tasks. They are artefacts of the design process and relate to the proposed solution, not the problem domain. Given that communication between the system designer and clinician may be


difficult, it is desirable that design units are closely identified with the real-world concepts which they model, providing a common set of entities to which both parties can relate.

Functional decomposition is the principal technique of the procedural design paradigm [13]. It places emphasis on the order in which actions are to be executed *. Thus it is possible for slight changes in a control sequence to yield entirely different design structures. This premature binding of temporal relations is not compatible with the evolutionary nature of the problem-solving process. For example, during the initial stages of system development, it is unlikely that the clinician will be able to identify with certainty the required sequence of treatment planning activities. The incremental nature of prototyping necessitates retaining maximum flexibility with respect to sequencing constraints.

Functional decomposition also stresses the external interface of a system, i.e., the manner in which the system interacts with the outside world (this differs from the user-interface which determines the form of information presented to the user). For example, the question of whether patient data will be input to the system from the user or from another source such as a database, can have important implications for a functional design. Interface design is difficult and often subject to much experimentation during prototyping. It is therefore most appropriate to use a design technique which enables the interface to be largely decoupled from the main system structure.

The process of prototyping may demand numerous system changes at each stage. This task can become formidable if these changes are not localised to the relevant modules, and so propagate throughout the system. Functional decomposition places its emphasis principally upon functions, not data. Thus, many functions in many places may act upon the same set of data. Consequently, any changes to the underlying data structures of a system can necessitate extensive updating of the functions that manipulate those data.

It is desirable that all software be 'correct', i.e., perform exactly the tasks defined by the requirements and specification. However, since decision support systems have no fixed and rigorous specification, it is essential that they be 'robust', i.e., possess the ability to function even in abnormal conditions. In particular, a system should be capable of recognising an abnormal situation that lies outside the scope of its problem-solving ability and acting accordingly. The task of checking the correctness of a system is aided by a design method which enables individual objects, and the manipulations that can be performed on these objects, to be separately tested as autonomous units.

Given the immense effort and cost that is involved in producing quality software for any domain application, the reusability of software is an important issue. The commonality that exists in many elements of clinical decision support systems indicates that substantial reuse may be possible if the appropriate design techniques are employed.

* There are alternative procedural techniques (termed data structure analysis) in which the data structures are used to assist in the process of functional decomposition, e.g. Jackson [4], Yourdon [5] and Booch [7].

3. The object-oriented philosophy

The object-oriented design paradigm seeks to mimic the way that people form models of the real world. In contrast to procedural design methods, it de-emphasises the underlying computer representation. Its major modelling concept is that of the object, which is used to symbolise real-world entities and their interactions. Objects are entities which have state and behaviour. They can be implemented in computer systems as data and a set of operations defined over those data. The object-oriented methodology extends through both the analysis and design phases of software development to implementation via object-oriented programming languages. The analysis phase constructs a model of the problem domain by identifying a set of interacting entities. Software-based models of these entities and the relationships between them are then assembled


to form the basic design architecture of the system. The design phase retains and directly represents the objects identified during analysis and so facilitates better communication between users and designers. The close association between these two phases differs radically from the distinct nature of analysis and design in procedural techniques. The object-oriented paradigm was first introduced through the programming language Simula [14] and fully developed in Smalltalk [15], and has since progressed into analysis and design techniques. However, these techniques are not well established, and as a consequence, much of the design terminology used reflects specific language constructs. To confound the problem, various individuals have defined the term object-oriented differently. This section aims to provide an appreciation of the underlying philosophy of the object-oriented paradigm: in particular it introduces the concepts of encapsulation, inheritance and polymorphism. However, it does not claim to be comprehensive and seeks to avoid specific language nomenclature.

3.1. Encapsulation

The real world is composed of entities or 'objects', for example, people, cars and buildings. Each object has a distinct set of properties (i.e., a state or set of data). Furthermore, there is a set of meaningful operations or actions that can be applied to each object. Thus, the object 'car' would contain a description of the car's attributes such as its make, colour and licence number, plus some associated behaviours such as drive forwards, drive backwards or stop. This form of object recognition is equally apparent in the medical domain, and gives rise to objects such as a patient - Ms Smith, a treatment - chemotherapy, and a hospital - St Elsewhere. The manner in which an object integrates both state and behaviour is termed 'encapsulation'. The concept of encapsulation restricts access to the internal state of an object, allowing only a pre-specified set of operations to act upon that object (termed the 'interface' or 'specification'). The object is abstracted in the sense that we are

concerned only with its external behaviour, not the internal details of its data structure. This separation enables an object's interface to be mapped to several different implementations. It also permits the internal representation of an object's state to be revised without affecting other objects which communicate with it, provided its interface remains unaltered.
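As a minimal sketch of this separation (the class and method names below are illustrative assumptions, not taken from the Histology System), two patient classes can expose the same interface while holding their state in different internal representations. Client code that relies only on the interface is unaffected by the change.

```python
class ListBackedPatient:
    """Hypothetical patient object that stores symptoms in a list."""

    def __init__(self, name):
        self._name = name
        self._symptoms = []  # internal state, not accessed by clients

    def name(self):
        return self._name

    def add_symptom(self, symptom):
        if symptom not in self._symptoms:
            self._symptoms.append(symptom)

    def has_symptom(self, symptom):
        return symptom in self._symptoms

    def symptom_count(self):
        return len(self._symptoms)


class SetBackedPatient:
    """Same interface, revised internal representation (a set)."""

    def __init__(self, name):
        self._name = name
        self._symptoms = set()

    def name(self):
        return self._name

    def add_symptom(self, symptom):
        self._symptoms.add(symptom)

    def has_symptom(self, symptom):
        return symptom in self._symptoms

    def symptom_count(self):
        return len(self._symptoms)


def summarise(patient):
    """Client code: uses only the interface, never the private state."""
    return f"{patient.name()}: {patient.symptom_count()} symptom(s)"
```

Either class can be passed to `summarise`, which is the sense in which one interface maps to several implementations.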

3.2. Classes and inheritance

Given the countless number of individual objects within the world, reasoning about single entities becomes complex and cumbersome. Classification is an important human activity which strives to construct abstractions describing sets of objects, rather than just individual objects. This grouping together of objects enables us to assume some basic similarities (of both state and behaviour) between individual members of the group. For example, all patients have a name, a set of symptoms and an illness. In object-oriented terminology, a set of similar objects described by an abstraction is called a 'class'. The definition (as opposed to actual software implementation) of a class interface is often referred to as an 'abstract data type'. Its internal state may be termed 'private data', whilst its available operations are called 'methods' or 'functions'. In the real world we often classify objects in a hierarchical fashion. Objects may be grouped into classes, which are in turn further grouped into more general classes. The concept of 'inheritance' describes this hierarchical classification process. It enables the definition, and by implication the implementation, of a class to be based upon that of an existing class. Thus, if we have defined the basic class Person, we can inherit its attributes into the new class Patient, which can be further inherited into the classes In-patient and Out-patient, extending their specificity as necessary (Fig. 1). In object-oriented terminology the base class may also be called the 'parent' or 'superclass', whilst those that inherit from it are termed the 'child' or 'subclass'. Inheritance supports 'extensibility' within a system, i.e., enables basic class concepts to be extended. It seeks to

realise the goal of constructing software systems from reusable components. A class definition is a template from which representations of individual objects can be created. Such an object which results from the instantiation of this definition is termed an 'instance'.

Fig. 1. Classes and inheritance.

3.3. Polymorphism

Objects communicate via 'messages'. A message sent to an object will invoke a particular operation belonging to that object. The distinction between a method or function and a message is subtle but important. For example, the message 'apply treatment' applied to the object patient will cause some medical procedure to be carried out. However, each individual patient will suffer from a different illness, and so we would expect the type of treatment applied to vary accordingly. In the object-oriented paradigm, the capability of each member of a set of objects to respond in a different manner to the same message is termed 'polymorphism' (i.e., the ability to take many forms).

4. A case study: The Histology System

This section describes the application of object-oriented design in the development of a decision support system intended to assist pathologists in the histopathological diagnosis of breast disease. Whilst it illustrates the practical advantages of using the object-oriented design approach in prototyping, it also highlights difficulties encountered in the use of inheritance. We begin by giving a brief outline of the problem domain and general aims of the project. This is followed by an examination of the analysis and design procedure. As our intention is to clarify the role of inheritance in system design, we give a description of the two forms of inheritance, namely 'abstract inheritance' and 'code inheritance', before proceeding with details of the knowledge representation and decision support aspects of the Histology System. The use of object-oriented techniques in interface design is well documented (e.g., Refs. 2, 6 and 9) and therefore will not be discussed here. Finally, we look at how object-oriented design can accommodate the test and refinement cycle.

4.1. The domain problem

Breast carcinoma is the leading cause of death for women aged 25 to 54 [16]. However, not all breast carcinomas are the same, and there are in excess of 100 different microscopically identifiable types. Many of these histological types are associated with a distinct prognosis which has important implications for the choice of patient treatment [17]. The technique of histological typing is problematic for three reasons:

1. There are many histological types of breast disease, some of which are rare and not often seen in routine practice.
2. There are benign and malignant counterparts which can exhibit similar appearances.
3. There are numerous features which have to be taken into consideration when making a diagnosis.

The difficult nature of histological diagnosis is reflected in the low rates of inter- and intra-observer agreement obtained [18], and presents a genuine need for decision support. The aim of the project was to produce a computer-based system (named the 'Histology System') which could act as an intelligent assistant to the pathologist. Its success depended on an accurate characterisation of pathologists' diagnostic skills and identification of their strengths and weaknesses. However, at the outset these diagnostic skills were poorly understood, and the detailed requirements of potential end-users not fully elucidated. From the designers' viewpoint, it was intended that the system be employed as a research vehicle, providing a framework within which to test the suitability of several novel inference models. It was also anticipated that the initial implementation of the interface would be replaced at a later date.

4.2. The design process

Object-oriented design involves defining the objects of a domain and their required behaviour. Intuitively, we can perceive three major components of the Histology System:

1. A body of domain knowledge, i.e., the Knowledge Base.
2. A model of problem-solving which can apply domain knowledge to a diagnostic problem, i.e., the Inference Model.
3. A means of managing communication between the user and system, i.e., an Interface Manager. (Note that this is in fact a component of the solution domain.)

Having identified these three components, we can now recognise the need for additional units. In particular, there must be some object which can facilitate user interrogation and browsing of the knowledge base (a Browser). Furthermore, we require an object which can coordinate a consultation session between the system and user (a Decision Support Module). Fig. 2 shows these

Fig. 2. The five major components of the Histology System.

objects. Whilst the object-oriented design approach concentrates on identifying the entities that are present in the solution to a problem, it is not limited to physical objects. Abstract concepts such as the Browser may be necessary to bridge the gap between the problem domain and solution domain. Furthermore, some objects can be more intuitively viewed as computational processes rather than objects, for example, the Inference Model component. The five components shown in Fig. 2 are not composed simply of individual objects, but collections of conceptually related objects that form subsystems. Each top-level object provides an external interface to the underlying constituent objects. These clusters of objects form natural groupings, which incorporate the necessary classes to implement the concept, and together provide a useful design abstraction. Table 1 lists the behaviours and external properties of these subsystems. We can now undertake a more detailed design of each subsystem, identifying the individual objects and the relationships between them, that comprise that subsystem.
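One way to picture a top-level object that exposes an external interface to its subsystem is sketched below. The paper states that the system was meant to test several inference models; the sketch shows a coordinating object that depends only on an inference interface, so that models can be exchanged. All class and method names, and both toy questioning strategies, are assumptions made for illustration and are not the Histology System's actual design.

```python
from abc import ABC, abstractmethod


class InferenceModel(ABC):
    """External interface of the inference subsystem (hypothetical)."""

    @abstractmethod
    def generate_questions(self, findings):
        """Return the attributes to ask about, given known findings."""


class AskEverything(InferenceModel):
    """Toy strategy: ask about every attribute not yet known."""

    def __init__(self, attributes):
        self._attributes = list(attributes)

    def generate_questions(self, findings):
        return [a for a in self._attributes if a not in findings]


class AskFirstUnknown(InferenceModel):
    """Toy strategy: ask about one unknown attribute at a time."""

    def __init__(self, attributes):
        self._attributes = list(attributes)

    def generate_questions(self, findings):
        unknown = [a for a in self._attributes if a not in findings]
        return unknown[:1]


class DecisionSupport:
    """Coordinates a session using only the InferenceModel interface."""

    def __init__(self, model):
        self._model = model

    def next_questions(self, findings):
        return self._model.generate_questions(findings)
```

Swapping `AskEverything` for `AskFirstUnknown` requires no change to `DecisionSupport`, which is the kind of flexibility a prototyping cycle benefits from.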

4.3. The dual roles of inheritance

Inheritance is a central concept of many schemas for modelling knowledge, particularly in semantic networks and frame-based representations. Object-oriented languages have reflected the concept of abstract inheritance, enabling


problem domain knowledge to be directly represented in a program's structure. However, object-oriented languages also enable the programmer to optimise resource sharing within a system, by allowing members of a common class hierarchy to share code. This section examines in detail these two uses of inheritance, which we will term 'abstract inheritance' and 'code inheritance'. We discuss their application in the Histology System, and seek to clarify the relationship between them in section 4.6.

Abstract inheritance

In semantic terms, an inheritance relation between two entities is termed an 'IS_A' relation. This is defined by Brachman [19]; given an entity X, and an entity Y, to imply that 'Y IS_A X' it must be shown that:

1. Entity Y shares all the attributes of X (i.e., attribute conformance).
2. The set of values or entities described by Y is a subset of the set of values or entities described by X (i.e., the subset rule).

A relation which satisfies 1 and 2 above is known as a 'strict' IS_A relation. However, it can still be utilised to represent a range of subtly different concepts [19]:

1. Generic/generic relations. Here the IS_A relation is employed to describe relations between classes describing 'sets' of entities. It implies that a child concept is more specific than a parent concept. For example, 'invasive ductal carcinoma (of the breast) IS_A invasive carcinoma', i.e., the set of all invasive ductal carcinomas is a subset of the set of all invasive carcinomas. Such relations are often termed 'IS_A_KIND_OF' to distinguish them from the generic/individual relation given below.
2. Generic/individual relations. Here the IS_A relation is used to represent the manner in which individual entities are related to generic concepts. For example, 'patient X's carcinoma, the invasive ductal carcinoma, IS_A invasive carcinoma', i.e., this particular instance of invasive ductal carcinoma belongs to the set of invasive ductal carcinomas, which in turn belongs to the set of all invasive carcinomas.

When dealing with simple and familiar objects, identifying the strict IS_A relation may not be unduly problematic. However, when defining the relationship between less familiar objects it may be difficult to determine which properties to emphasise.
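The two conditions for a strict IS_A relation can be expressed as a short check. The sketch below is only an illustration: the `Concept` type, the attribute names and the case identifiers are invented for the example and carry no clinical meaning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A concept as a set of attributes plus the set of entities it describes."""
    name: str
    attributes: frozenset
    members: frozenset


def is_a_strict(y, x):
    """True if 'y IS_A x' holds in the strict sense."""
    attribute_conformance = x.attributes <= y.attributes  # y shares all of x's attributes
    subset_rule = y.members <= x.members                  # y describes a subset of x's entities
    return attribute_conformance and subset_rule


# Invented example data.
invasive_carcinoma = Concept(
    "invasive carcinoma",
    frozenset({"malignant", "invasive"}),
    frozenset({"case 1", "case 2", "case 3"}),
)
invasive_ductal_carcinoma = Concept(
    "invasive ductal carcinoma",
    frozenset({"malignant", "invasive", "ductal"}),
    frozenset({"case 1", "case 2"}),
)
```

With these definitions `is_a_strict(invasive_ductal_carcinoma, invasive_carcinoma)` holds, whereas the reverse direction fails both conditions.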

TABLE 1
Behaviours and external properties of object subsystems

Subsystem: Knowledge Base
  Behaviour: Store disease knowledge
  External interface: Load and retrieve disease information

Subsystem: The Knowledge Base Browser
  Behaviour: Search the knowledge base and process complex queries
  External interface: Process complex queries, e.g., Find all diseases with a particular set of attributes in common

Subsystem: Decision Support Unit
  Behaviour: Coordinate a diagnostic session: formulate the initial hypothesis; update the hypothesis according to new data; apply various categories of knowledge to the diagnostic problem
  External interface: Accept initial patient findings; Request patient data; Provide explanations; Provide supplementary information to the user

Subsystem: Inference Model
  Behaviour: Model a particular problem-solving strategy
  External interface: Generate questions

Subsystem: Interface Manager
  Behaviour: Coordinate all user-system communication
  External interface: Present menus and information to the user; pass user commands to the appropriate subsystem

IS_A relations in which the attribute conformance principle is relaxed are termed 'non-strict'. Thus, generic/generic IS_A relations can exploit cancellation to model meaningful exceptions to common generalisations. For example, it may not be unreasonable to claim 'sclerosing adenosis is an invasive carcinoma with cancelled malignancy'. For generic/individual relations, cancellation can enable an individual to be distinguished from the class of examples of which it is perceived to be a member. For example, 'this specific instance of an invasive intraductal carcinoma IS_A invasive intraductal carcinoma with no mitosis'. However, cancellation may be used to express relationships that are not conceptually understandable. For example, 'lobular carcinoma in situ IS_A invasive intraductal carcinoma with the invasive and intraductal attributes cancelled'. Whilst cancellation facilitates efficient re-use in knowledge representation, the point at which it ceases to be intelligible is a matter for debate [20].

The benefits of abstract inheritance can be summarised as follows:

1. It provides a powerful modelling tool which can be used to analyse the structure of any problem space, i.e., it is domain independent.
2. It has a precise definition (if strict IS_A is used), thereby allowing design decisions to be articulated and justified.
3. It reduces redundancy of information in system models and documentation, by explicitly capturing commonality between behaviour and structure of system components.
4. It facilitates improved communication between program designer and domain expert.
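The cancellation mechanism discussed in this section can be sketched as attribute inheritance with selected attributes removed. The helper names `inherit` and `conforms`, and the attribute dictionaries, are assumptions introduced for the example; they show only how cancellation breaks attribute conformance, not how any particular frame system implements it.

```python
def inherit(parent, cancelled=(), added=None):
    """Copy the parent's attributes, dropping cancelled ones and adding new ones."""
    child = {k: v for k, v in parent.items() if k not in cancelled}
    child.update(added or {})
    return child


def conforms(child, parent):
    """Attribute conformance: the child retains every attribute of the parent."""
    return all(k in child for k in parent)


# Invented attribute set for the parent concept.
invasive_intraductal_carcinoma = {"invasive": True, "intraductal": True, "mitosis": True}

# Generic/individual use: one attribute cancelled for a specific instance.
this_instance = inherit(invasive_intraductal_carcinoma, cancelled=("mitosis",))

# The questionable use: most of the parent is cancelled.
lobular_carcinoma_in_situ = inherit(
    invasive_intraductal_carcinoma,
    cancelled=("invasive", "intraductal"),
    added={"lobular": True, "in situ": True},
)
```

Neither child satisfies `conforms`, so both relations are non-strict; the second retains only one of the parent's three attributes, which hints at why its intelligibility is open to question.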

Code inheritance

In contrast to abstract inheritance, code inheritance provides the programmer with a means of manipulating patterns of resource sharing within an object-oriented system. It is employed to relate two classes A and B, which share a subset of data and interface methods, such that class A can delegate responsibility for managing that subset to class B. For example (using a simple illustration for clarity) consider the class 'Point' and class 'Display Point'. The class Point has attributes X and Y for storing co-ordinate information and an interface protocol of the form GetX, GetY, SetX, SetY, which enables the attributes to be accessed. The Display Point class uses the same protocol as the class Point, but further specialises its behaviour with the additional interface method Display. To avoid duplicating definitions and code in its implementation, the class Display Point inherits from the class Point. The advantages conferred by code inheritance include:

1. It reduces the amount of code necessary to implement a given program.
2. It facilitates extensions and alterations to a program without rendering versions dependent on old code obsolete.
3. It enables code extensions to be automatically broadcast to members of an inheritance hierarchy, through the addition of new behaviour to the superclass.
4. It increases programmer productivity by means of 1 and 2 above.
5. It reduces implementation costs by means of 1, 2, 3 and 4 above.
6. It reduces adaptive maintenance costs by means of 2 and 3 above.
7. It reduces the complexity of solutions by representing them as extensions and specialisations of one another.
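The Point and Display Point illustration can be written out directly. The sketch below uses Python naming conventions for the GetX, GetY, SetX, SetY and Display protocol; the textual form returned by `display` is an assumption, since the paper does not say how a point is displayed.

```python
class Point:
    """Stores co-ordinates and exposes the accessor protocol."""

    def __init__(self, x=0, y=0):
        self._x = x
        self._y = y

    def get_x(self):
        return self._x

    def get_y(self):
        return self._y

    def set_x(self, x):
        self._x = x

    def set_y(self, y):
        self._y = y


class DisplayPoint(Point):
    """Inherits the accessor code from Point and adds only Display."""

    def display(self):
        # Assumed output format, chosen for the example.
        return f"({self.get_x()}, {self.get_y()})"
```

`DisplayPoint` contains no accessor code of its own, so any method later added to `Point` becomes available to it without further change, which corresponds to advantages 1 and 3 in the list above.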

4.4. The Knowledge Base

The Knowledge Base is composed of Disease objects. Within each Disease object are several items of associated information:

1. General disease knowledge such as occurrence rates.
2. The differential diagnoses of the disease.
3. Commonly associated diseases.
4. A set of attributes and attached values that characterise the disease.
5. Symbolic certainty factors which represent the degree of confidence in a 'disease-attribute-value' association and reflect the descriptive terms commonly employed in the domain of breast histopathology.

Given a disease d_i and an attribute a_k then:

A = a_k always occurs with d_i;
H = a_k has a high likelihood of occurring with d_i;
M = a_k has a medium likelihood of occurring with d_i;
L = a_k has a low likelihood of occurring with d_i;
N = a_k never occurs with d_i.

An abridged example of a disease frame is shown below:

DISEASE papilloma
  CLASS benign lesions;
  OCCURRENCE a rare lesion occurring primarily in middle age;
  DIFFERENTIAL DIAGNOSIS multiple papilloma, papillary carcinoma in situ;
  ASSOCIATED DISEASES sclerosing adenosis, epitheliosis;
  SET clinical features
    tumour location = subareolar(H);
    age group = middle age(H);
    nipple discharge = blood stained(L);
  END
  SET microscopic features
    stromal proliferation = yes(N), no(A);
    papillary growth = yes(A), no(N);
    double cell layer = present(A);
    cytological atypia = yes(N), no(A);
    foci of papillary growth = single(A), multiple(N);
    apocrine metaplasia = absent(H);
    lesion features = necrosis(M), haemorrhage(M);
  END
END
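As a rough sketch of how the 'attribute-value-certainty factor' triples of such a frame might be held in a program, the fragment below encodes part of the papilloma frame above. The `Certainty` and `AttributeValue` types and the `values_at` helper are assumptions made for illustration; the paper does not give the actual data structures.

```python
from dataclasses import dataclass
from enum import Enum


class Certainty(Enum):
    """The five symbolic certainty factors."""
    A = "always"
    H = "high"
    M = "medium"
    L = "low"
    N = "never"


@dataclass(frozen=True)
class AttributeValue:
    attribute: str
    value: str
    certainty: Certainty


# A subset of the microscopic features from the papilloma frame.
papilloma_microscopic = [
    AttributeValue("papillary growth", "yes", Certainty.A),
    AttributeValue("papillary growth", "no", Certainty.N),
    AttributeValue("double cell layer", "present", Certainty.A),
    AttributeValue("apocrine metaplasia", "absent", Certainty.H),
    AttributeValue("lesion features", "necrosis", Certainty.M),
]


def values_at(entries, *levels):
    """Select the (attribute, value) pairs held at the given certainty levels."""
    return [(e.attribute, e.value) for e in entries if e.certainty in levels]
```

For example, `values_at(papilloma_microscopic, Certainty.A, Certainty.N)` picks out only the categorical associations.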

Each disease frame is represented by an instance of the class Disease. This integrates the individual data items into a single co-ordinated entity. For example, the differential diagnosis and associated disease information are both held as String-List objects. Each attribute set (e.g., Clinical features) becomes an individual object with its own list of 'attribute-value-certainty factor' objects. These distinct entities are assembled together by declaring instances of them within the implementation of the class Disease. Such internal details are hidden, and the only access to the enclosed object instances is through those operations defined in the Disease class interface. This relationship, known as the 'component relation', facilitates the reuse of general concepts, by allowing them to be used to build new concepts. However, unlike inheritance, which expresses a specialisation of an existing class definition, the component relation provides a service in the implementation of a class. Thus, the class Attribute Set represents a set of disease attributes; it does not define what a disease is. Table 2 shows the interface to the Disease object.

Disease information (i.e., the Disease name and a pointer to an instance of the Disease class) is stored in a database (Disease Table). The efficient location of an individual Disease is achieved through the use of a hashing function. The Disease Table provides methods to access individual Diseases by name. For example, the method 'Put' inserts a disease entry into the Disease Table, indexes this against the specified name and returns a pointer to the new Disease object. The method 'Get' returns a pointer to the Disease object associated with a given name. The Disease pointers can then be used to call the access methods of the class Disease. Many data retrieval operations are likely to be based upon disease attributes, rather than individual disease names. Therefore, in order to ensure effective search, it was necessary to hold an additional data table, which indexes on disease attributes (Attribute Table) rather than disease names. To conceal specific details of the search strategies employed in data retrieval operations, an

TABLE 2 The Disease class interface abstraction (abridged) Operation

Description

PutClass PutDifferential PutSet

Input disease class information Input differential diagnosis information Add a new set of disease attributes. Return a pointer to the set, enabling access to AttributeSet access methods Return the disease class Return the list of differential diagnoses Return a list of pointers to attribute sets

ReadClass ReadDifferentials ReadSets

248

additional 'Query' Class was implemented. This class accesses data through either the Disease Table or Attribute Table, to process a range of knowledge base queries in a manner that is transparent to the user. For example, 'Find all diseases belonging to the class X', and 'Find all diseases with attribute X in common'. The Browser Class facilitates communication between the Query Class and Interface Manager, enabling the end user to interrogate the knowledge base in a hierarchical manner, either by disease class, individual disease or disease attribute [21].
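The component relation and the name-indexed Disease Table described above can be illustrated with a minimal Python sketch. This is not the Histology System's code: the method names are loosely adapted from Table 2, a Python dict stands in for the hashing function, and the feature name and value are placeholders rather than clinical data.

```python
class AttributeSet:
    """A named set of 'attribute-value-certainty factor' entries."""

    def __init__(self, name):
        self.name = name
        self._entries = []  # hidden from callers

    def put(self, attribute, value, certainty):
        self._entries.append((attribute, value, certainty))

    def attributes(self):
        return [attribute for attribute, _, _ in self._entries]


class Disease:
    """Assembled from component objects (component relation, not inheritance)."""

    def __init__(self):
        self._disease_class = None
        self._differentials = []  # plays the role of a StringList
        self._sets = []           # AttributeSet components

    def put_class(self, disease_class):
        self._disease_class = disease_class

    def put_differential(self, name):
        self._differentials.append(name)

    def put_set(self, name):
        # Create the component and return it so the caller can fill it in.
        attribute_set = AttributeSet(name)
        self._sets.append(attribute_set)
        return attribute_set

    def read_class(self):
        return self._disease_class

    def read_differentials(self):
        return list(self._differentials)

    def read_sets(self):
        return list(self._sets)


class DiseaseTable:
    """Name-indexed store; the dict is an assumed stand-in for the hash table."""

    def __init__(self):
        self._by_name = {}

    def put(self, name):
        disease = Disease()
        self._by_name[name] = disease
        return disease

    def get(self, name):
        return self._by_name.get(name)


table = DiseaseTable()
dcis = table.put("ductal carcinoma in situ")
dcis.put_class("in situ carcinoma")
dcis.put_differential("lobular carcinoma in situ")
features = dcis.put_set("Clinical features")
features.put("feature A", "present", "H")  # placeholder entry
```

A Disease here has attribute sets without being one, which is the distinction the text draws between the component relation and inheritance.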

4.5. The Inference Model

The Histology System utilises a novel inference technique based upon an extension of the hypergraph model described in Gondran and Minoux [22]. (A theoretical description of this methodology has been given elsewhere [23], [24].) In brief, the hypergraph model embodies two intuitive concepts: firstly, the principle of minimum effort, i.e., avoid questions that are not directly relevant to a given problem, and secondly, the idea that definitive information is preferable to dubious information. A hypergraph operates upon a binary matrix (of the form disease versus attribute, where each matrix entry indicates a causal association between the disease and attribute), to isolate the minimum cover, or smallest number of attribute tests necessary to differentiate between all disease pairs. The minimum cover upholds the rationale of minimum effort. To support the principle that definitive information is preferable to dubious information, the inference algorithm begins by considering only categorical knowledge (at the 'AN' certainty level). If a unique diagnosis cannot be reached by asserting the attributes in the current minimum cover, then the process is repeated sequentially using less certain information (i.e., H, M and L levels, respectively).

There is a considerable representational gap between the hypergraph model and the general concept of inference. Thus, we require several additional classes to unite the two aspects.

1. A Hypergraph class. This takes an integer matrix and calculates the minimum cover. It has a low level interface that indexes matrix entries purely in numerical terms.
2. An Inference class, which provides a more abstracted interface to the hypergraph functions. It reasons about diseases and attributes rather than numerical entities, and holds additional information, e.g., current hypotheses, rejected hypotheses and known data.
3. A Translation Class. This takes a list of disease frames (i.e., the initial hypothesis set) and produces an integer matrix representation at the appropriate certainty level, which can be input to the Hypergraph class.

These three classes are brought together by the Decision Support class, which guides a consultation session through the four certainty levels and processes users' assertions and queries. The interface to this class is shown in Table 3.

TABLE 3
The Decision Support class interface (abridged)

Operation               Description
Initialise              Initialise a consultation session
NextQuestion            Obtain the next question in the minimum cover
AssertAttribute         Input user asserted data
EraseAttribute          Erase previously asserted data and backtrack
IncrementLevel          Increment the level of certainty
ReadCurrentHypotheses   Return the list of current hypotheses
ReadCurrentRejects      Return the list of rejected hypotheses
Explain                 Provide an explanation for a question
Confirm                 Return a list of attributes necessary to confirm a disease hypothesis
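To make the notion of a minimum cover concrete, the following Python sketch finds a smallest set of attribute columns that tells every pair of disease rows apart in a binary disease-versus-attribute matrix. It is an exhaustive search suitable only for small matrices and is an assumption made for illustration; it is not the hypergraph algorithm of [22]-[24], and the function names and example matrix are hypothetical.

```python
from itertools import combinations


def distinguishes(matrix, attrs):
    """True if every pair of rows differs on at least one chosen column."""
    projected = [tuple(row[a] for a in attrs) for row in matrix]
    # All pairs are separated exactly when no two projected rows coincide.
    return len(set(projected)) == len(projected)


def minimum_cover(matrix):
    """Return a smallest tuple of column indices separating all rows, or None."""
    n_attrs = len(matrix[0])
    for size in range(n_attrs + 1):  # try the smallest subsets first
        for attrs in combinations(range(n_attrs), size):
            if distinguishes(matrix, attrs):
                return attrs
    return None  # two rows are identical, so no cover exists


# Rows are diseases, columns are attributes (illustrative values only).
example = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
]
cover = minimum_cover(example)  # (0, 1): no single column separates all pairs
```

In the scheme described above, such a search would be run first on the matrix built from 'AN' knowledge, and repeated on the H, M and L matrices only if no unique diagnosis results.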

4.6. The use of inheritance in the Histology System

The Histology System made predominant use of the code inheritance mechanism. Despite initial expectations to the contrary, abstract inheritance was only of limited use [25]. Whilst we anticipated its application in the definition of disease taxonomies, analysis revealed that these tend to be more definitional than evidential in the domain of breast histopathology. For example, the diseases ductal carcinoma in situ and lobular carcinoma in situ are commonly arranged as subgroups of in situ carcinoma [26]. However, whilst they have some descriptive features in common, few of these are employed in diagnosis. As our knowledge base was compiled with the task of differential diagnosis in mind, we were primarily interested in diagnostic features. Thus, the representation of disease hierarchies would have required much use of cancellation (the overriding of inherited properties in subclasses).

Abstract inheritance was employed in the design of the interface. For example, the Interface Manager uses Selection Windows, Multi-selection Windows and View Windows. These are all types of Basic Window, and so inherit attributes and behaviour from the Basic Window. This technique is well established and therefore easily reproduced.

Code inheritance was used extensively in the specialisation of solution-oriented classes. For example, a generic class 'LinkList' was used to construct 'StringList' and 'IntegerList' classes. Likewise, a generic class 'Matrix' formed the basis of 'IntegerMatrix' and 'CharacterMatrix' classes. This type of code inheritance could arguably be viewed as abstract, in the sense that it does represent IS_A relations, such as StringList IS_A LinkList. However, it was used mainly with the motive of code reuse.

It is interesting to speculate upon the reasons for the failure to utilise abstract inheritance in the Histology System. It may be that there are in fact few domain objects that can be intuitively organised into a conceptual hierarchy. Alternatively, this failure may be attributed to our lack of knowledge and experience in using abstract inheritance. It should be noted that inheritance in object-oriented programming was originally seen by many as purely a code sharing mechanism, e.g., Cox [27] notes "Inheritance ... is an aid to building systems; an implementation issue, not a design issue". However, recent methods of object-oriented analysis include inheritance identification from the earliest stages [8]. This confusion over whether inheritance is a modelling concept or a code reuse mechanism highlights the fact that the benefits of both are somewhat disparate. Whilst the use of IS_A may promote the construction of intuitive domain models, unrestrained emphasis upon code reuse leads to a lack of structure.
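The code inheritance described in this section, in which 'StringList' and 'IntegerList' are built from a generic 'LinkList', might look as follows in Python. Only the three class names come from the text; the node structure, the item_type attribute and the method names are assumptions made for this sketch.

```python
class _Node:
    """One cell of the singly linked list."""

    def __init__(self, item):
        self.item = item
        self.next = None


class LinkList:
    """Generic linked list; subclasses only narrow the permitted item type."""

    item_type = object

    def __init__(self):
        self._head = None
        self._tail = None

    def append(self, item):
        if not isinstance(item, self.item_type):
            raise TypeError(f"expected {self.item_type.__name__}")
        node = _Node(item)
        if self._head is None:
            self._head = node
        else:
            self._tail.next = node
        self._tail = node

    def __iter__(self):
        node = self._head
        while node is not None:
            yield node.item
            node = node.next


class StringList(LinkList):
    item_type = str  # all list behaviour is reused from LinkList


class IntegerList(LinkList):
    item_type = int
```

Each subclass adds a single line, which reflects the reuse motive noted above: StringList IS_A LinkList holds, but the relation was chosen to share code rather than to model the domain.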

4.7. Test and refinement

This section illustrates the manner in which the key concepts of object-oriented design and programming can ease the task of prototyping. In particular it explains how encapsulation, inheritance and polymorphism have enabled the Histology System to be repeatedly amended without resorting to major changes in the system's structure.

Encapsulation hides the internal data structures and implementation code from the external interface of an object. Thus, as long as interface stability is retained, regardless of detailed changes that occur in either data representation or implementation, communication with other objects will not be affected. Internal changes will be localised to the particular object involved and not propagated throughout the system.

In the Histology System encapsulation proved useful in several ways. As system development progressed it became apparent that the originally defined knowledge representation schema was not sufficiently rich to capture the required domain expertise. Therefore the data structures of the knowledge base were augmented and additional methods added to the appropriate classes to allow manipulations upon the new data. This did not affect the previously defined methods in any way, or the objects that used these methods.
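The effect of encapsulation described above can be sketched in Python as two versions of one class that share an interface but differ internally. The class names AttributeSetV1 and AttributeSetV2, their methods and the 'note' field are hypothetical and do not reproduce the actual schema change made in the Histology System.

```python
class AttributeSetV1:
    """Original representation: a flat list of (attribute, value) pairs."""

    def __init__(self):
        self._pairs = []

    def put(self, attribute, value):
        self._pairs.append((attribute, value))

    def read(self, attribute):
        # The most recent entry for an attribute wins.
        for name, value in reversed(self._pairs):
            if name == attribute:
                return value
        return None


class AttributeSetV2:
    """Augmented representation: a dict of records carrying an extra field."""

    def __init__(self):
        self._records = {}

    def put(self, attribute, value):
        self._records[attribute] = {"value": value, "note": None}

    def read(self, attribute):
        record = self._records.get(attribute)
        return None if record is None else record["value"]

    def annotate(self, attribute, note):
        # New method for the new data; put() and read() are unchanged.
        self._records[attribute]["note"] = note

    def read_note(self, attribute):
        return self._records[attribute]["note"]


def describe(attribute_set, attribute):
    """Client code written against the original put/read interface only."""
    return f"{attribute}: {attribute_set.read(attribute)}"
```

Because describe relies only on read, it behaves identically with either version; the richer representation is visible only to callers that use the added methods.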


The Hypergraph class possesses methods to perform complex mathematical calculations on its internal data. These were initially implemented using a heuristic algorithm. However, this was judged to be unacceptably slow by end-users and so was subsequently re-implemented in a more efficient manner. Alterations in the implementation code of individual methods have no effect on any other aspect of the system.

The major use of inheritance within the Histology System occurred in the Interface Manager. Here a basic window class was extended into several more specialised versions. Inheritance enabled new window forms to be created in a quick and efficient manner.

Polymorphism in the form of dynamic binding allows the set of classes which inherit from a base class to be extended without requiring explicit changes to class references in the program code. For example, mid-way through the development cycle, the need for a trace facility on the decision making process was identified. Therefore, a new variation on the basic Window was needed that could display and format text output. This new window was defined by inheriting from and extending the base Window class. The use of dynamic binding meant that pointer references in the program to a basic Window could automatically call the appropriate methods of the new window type.
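The trace window example above can be sketched in Python, where method calls are always dynamically bound. The class names echo the text, but the render method, its output format and the show_all helper are assumptions made for illustration.

```python
class BasicWindow:
    """Base class: holds a title and renders content on one line."""

    def __init__(self, title):
        self.title = title

    def render(self, content):
        return f"[{self.title}] {content}"


class TraceWindow(BasicWindow):
    """Later addition: formats multi-line text output with line numbers."""

    def render(self, content):
        lines = content.splitlines()
        body = "\n".join(f"{i:>3}: {line}" for i, line in enumerate(lines, 1))
        return f"[{self.title}]\n{body}"


def show_all(windows, content):
    """Existing client code: knows only that each item is a BasicWindow."""
    return [window.render(content) for window in windows]


outputs = show_all([BasicWindow("Info"), TraceWindow("Trace")], "step one\nstep two")
```

The show_all function needed no change when TraceWindow was introduced; the call to render is resolved at run time according to the class of each object, which is the property the text attributes to dynamic binding.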

5. Discussion

The design and development of the Histology System has demonstrated that genuine benefits can be gained by using object-oriented design and programming to prototype clinical decision support software. The three major concepts of the object-oriented paradigm have contributed particular advantages:

Encapsulation de-couples class behaviour from implementation, therefore enabling the designer to concentrate on the domain model. During implementation it allowed substantial internal changes to a class to be made whilst preserving external properties.

Whilst inheritance was not employed predominantly as a modelling concept, it did facilitate code re-use through the extension of previously defined class definitions.

Polymorphism helps produce coherent and intelligible programs by reducing the number of messages in the system, since many implementations can be associated with the same message.

To illustrate the significant amount of re-use of Histology System software, it should be noted that approximately 40% has been used in other applications, including individual base classes and the Interface Manager as a whole. Data abstraction enabled other system developers to understand the behaviour modelled by a class without needing to be aware of implementation details.

Whilst our experiences with object-oriented design and programming have been encouraging, certain caveats should be mentioned. Classifying objects within a system is generally thought to be an intuitive process. The fundamental classes of a system should be obtained simply by looking at the classes of physical objects in the external reality handled by the system. However, in practice there are often several ways in which a set of domain objects may be organised and it may be difficult to choose which properties to accentuate [28]. Furthermore, the dual roles of inheritance may present design problems.

Software engineering has been identified by contemporary computer scientists as holding the cure for the difficulties of complex system development. Whilst the methods, mechanisms and tools of software engineering are important contributors to a system's success, they should be the natural result of a well-developed philosophy for solving the application problem. However, computer scientists often gain extensive experience of a limited number of software methods and tools.
This may constrain them to view a problem in terms of these specific computer artefacts, and so prevent them from developing a general set of principles, concepts and strategies for dealing with the situation [29]. When a problem is viewed in terms of specific mechanisms, this perspective brings a priori conditions of what can and cannot be accomplished. Thus from the outset, the solution domain is restricted, often being forced into a framework of computer-related artefacts that conflict with the natural problem philosophy. A philosophy that is natural for the solution of a problem, or class of problems, will lead to paradigms and models, followed by methods, mechanisms and tools that are consistent with the methodology.

The use of objects as an abstract unit of design moves us closer to the actual objects found in the problem domain. The object-oriented paradigm provides a coherent philosophy that is fully supported by the mechanisms and tools of object-oriented programming languages, and as such has reduced the conceptual disparity between real-world problems and software solutions.

References

[1] B.J. Cox, There is a silver bullet, Byte Mag. 15 (1990) 206-267 (McGraw-Hill).
[2] D. McGregor and T. Korson (editors), Object-oriented design, Commun. ACM 33 (1990) 38-159.
[3] S.R. Alpert, S.W. Woyak, H.J. Scrobe and L.F. Arrowood, Object-oriented programming in AI, IEEE Expert 5 (1990) 6-27.
[4] M.A. Jackson, Principles of Program Design (Academic Press, London, 1975).
[5] E.N. Yourdon and L.L. Constantine, Structured design: Fundamentals of a discipline of computer program and systems design (Prentice-Hall, Englewood Cliffs, 1979).
[6] B. Meyer, Object-oriented software construction (Prentice Hall International (UK) Ltd, 1988).
[7] G. Booch, Software Engineering with Ada (Benjamin/Cummings, 1986).
[8] P. Coad and E. Yourdon, Object-oriented analysis (Yourdon Press, 1990).
[9] G. Winstanley and H.A. Heathfield, Expert systems shells and languages, in: Artificial Intelligence in Engineering, ed. G. Winstanley, pp. 104-118 (John Wiley, New York, 1991).
[10] E.R. Siegel, M.M. Cummings and R.M. Woodsmall, Bibliographic-retrieval systems, in: Medical Informatics: Computer applications in health care, eds. E.H. Shortliffe and L.E. Perreault, pp. 434-465 (Addison-Wesley, 1990).
[11] O.M. Goodyear, M. Hobsley, D.G. Jameson, J. Scurr and A. Shamsolmaali, Expert systems in the medical data environment: An example from breast surgery, in: Proceedings of the 6th Annual Meeting of Expert Systems in Medicine, pp. 1-10 (British Medical Informatics Society, June 1990).
[12] D. Cramp and O.M. Goodyear, Expert systems in medicine - Report on a European survey, Healthcare Informatics Foundation 1989 (ISBN 1 872774008).
[13] N. Wirth, Program development by stepwise refinement, Commun. ACM 14 (1971) 221-227.
[14] O. Dahl and K. Nygaard, SIMULA - An algol-based simulation language, Commun. ACM 9 (1966) 671-678.
[15] D. Robson and A. Goldberg, SMALLTALK-80: The language and its implementations (Addison-Wesley, Reading, MA, 1983).
[16] E.R. Fisher, R. Sass and B. Fisher, Pathologic findings from the national surgical adjuvant project for breast cancer (protocol No. 4), Cancer 53 (1984) 712-723.
[17] J.M. Dixon, D.L. Page and T.J. Anderson, Long-term survivors after breast cancer, Br. J. Surg. 72 (1985) 445-448.
[18] S.J. Cutler, M.M. Black, G.H. Friedell, R.A.R. Vidone and I.S. Goldenberg, Prognostic factors in cancer of the female breast. Reproducibility of histopathologic classification, Cancer 19 (1966) 75-82.
[19] R.J. Brachman, What IS-A is and isn't: An analysis of taxonomic links in semantic networks, IEEE Computer, October (1983).
[20] R.J. Brachman, 'I lied about the trees' or, defaults and definitions in knowledge representation, The AI Mag., Fall (1985) 80-93.
[21] H.A. Heathfield, G. Winstanley and N. Kirkham, A menu-driven knowledge base browsing tool, J. Med. Informat. 15 (1990) 151-159.
[22] M. Gondran and M. Minoux, Graphs and Algorithms (John Wiley, New York, 1979).
[23] H.A. Heathfield, G. Winstanley and N. Kirkham, A decision-support system for the differential diagnosis of breast disease, Biomed. Eng. J. 13 (1990) 51-57.
[24] H.A. Heathfield, D. Bose and N. Kirkham, A model of differential diagnosis in histopathology using the theory of hypergraphs and social choice, Artificial Intell. Med. J. 3 (1991).
[25] J.M. Armstrong and H.A. Heathfield, An object-oriented medical decision support system, Proceedings of the IEE Colloquium on Applications and Experiences of Object-Oriented Design, London, January 1991.
[26] D.L. Page and T.J. Anderson, Diagnostic Histopathology of the Breast (Churchill Livingstone, Edinburgh, 1987).
[27] B. Cox, Object-oriented programming: an evolutionary approach (Addison-Wesley, Reading, MA, 1986).
[28] M.B. Rosson and E. Gold, Problem-solution mapping object-oriented design, Proceedings of the 1989 International Conference on Object-oriented Programming, Systems and Languages (Addison-Wesley, Reading MA, ISBN 0-0201-52249-7) 7-10.
[29] P. Rechenberg, Programming languages as thought models, Struct. Progr. 11 (1990) 105-115.