Int. J. Bio-Medical Computing, 21 (1987) 223-235
Elsevier Scientific Publishers Ireland Ltd.
PULMONOLOGIST: A COMPUTER-BASED DIAGNOSIS SYSTEM FOR PULMONARY DISEASES
ANJANA KAR (a), GERALD E. MILLER (b) and SALLIE V. SHEPPARD (c)
(a) Computing Services Center, (b) Bioengineering Division, Department of Industrial Engineering and (c) Laboratory for Software Research, Texas A&M University, College Station, TX 77843 (U.S.A.)
(Received January 20th, 1987)
(Accepted April 28th, 1987)

PULMONOLOGIST is a prototype expert system designed as a decision-making aid for physicians in providing accurate diagnosis and treatment of lung diseases. Disease information in the system has been encoded in the form of rules and schemata. The system has been developed using the expert system building tool ART (Automated Reasoning Tool) from Inference Corporation. This paper gives a brief introduction to the expert system concept, and includes a detailed description of the diagnosis system. This system will be tested against clinical data in future studies.

Keywords: Expert system; Knowledge base; Schemata; Rules

Introduction
A major focus of expert systems research has been the development of software diagnostic capabilities in the area of medical science. One of the most famous and successful expert systems is Mycin (Harmon and King, 1985), developed at Stanford University in the mid-1970s, which helps physicians in selecting antibiotics for patients with infectious blood diseases. Several evaluations have shown Mycin’s ability to perform at or near the level of human experts (Duda and Shortliffe, 1983).
This paper reports on the development of a prototype expert system for the diagnosis and treatment of pulmonary (lung) diseases, produced by a joint study between the Bioengineering Program and the Computer Science Department at Texas A&M University. The software runs on a Symbolics machine, and was written using the Automated Reasoning Tool (ART) from Inference Corporation.
Such a system may not replace an expert but can instead function as a training aid for medical personnel by helping them perform some difficult but well-understood procedure. As expert system technology develops further, more medical expertise can be captured and placed in the knowledge base. In time, such systems can be used not only in training, but in supporting diagnostic decision-making where medical expertise is not readily available. Prototype systems, such as the one described in this paper, provide vehicles for experimentation with the tools and techniques, leading to the development of full diagnostic expert systems.
0020-7101/87/$03.50 © 1987 Elsevier Scientific Publishers Ireland Ltd. Printed and Published in Ireland

The Expert System Concept
An expert system is a computer program that can help solve complex real-world problems in specific domains such as medical specialties. In these systems the knowledge from a human expert or from some collection of ‘expertise’ such as a manual or textbook in some problem domain is collected and encoded into a knowledge base. The problem solving expertise encoded in such systems can then be used to make useful inferences for the user of the system.
Figure 1 illustrates the architecture of a knowledge-based expert system (Harmon and King, 1985). The expert system concept can be divided into two parts, a knowledge base and an inference engine, shown as rectangles in the figure. The knowledge base contains facts and rules that embody the expert’s knowledge unique to a particular domain. A fact represents a single piece of information. Rules are general guidelines that are used to solve a problem.
[Figure 1: block diagram. Recoverable labels: Knowledge base (Rules, Facts); Expert or Knowledge Engineer.]
Fig. 1. Architecture of a Knowledge Based System.
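
The separation between a knowledge base (facts and rules) and an inference engine can be illustrated with a minimal sketch in Python rather than ART. Everything named here (`Rule`, `run_inference`, the two toy rules and their conclusions) is hypothetical and chosen only for illustration; the naive matching loop is an assumption and is not a description of how ART matches rules internally, nor do the toy rules carry any medical authority.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """An IF/THEN rule: when every condition is a known fact, add the conclusion."""
    name: str
    conditions: frozenset
    conclusion: str


def run_inference(facts, rules):
    """Naive inference engine: keep firing rules until nothing new is derived.

    The knowledge base passed in (facts and rules) is never modified;
    derived facts are collected in a separate set that is returned.
    """
    known = set(facts)  # working copy, the knowledge base stays untouched
    fired = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.conditions <= known and rule.conclusion not in known:
                known.add(rule.conclusion)
                fired.append(rule.name)
                changed = True
    return known, fired


# Toy knowledge base (hypothetical, for illustration only).
FACTS = frozenset({"dyspnea", "audible-wheezes"})
RULES = [
    Rule("suspect-obstruction",
         frozenset({"dyspnea", "audible-wheezes"}),
         "airway-obstruction-suspected"),
    Rule("suggest-exam",
         frozenset({"airway-obstruction-suspected"}),
         "physical-exam-suggested"),
]

known, fired = run_inference(FACTS, RULES)
print(sorted(known))
print(fired)
```

Because the control loop knows nothing about the domain, the same `run_inference` function could in principle be reused with a different set of facts and rules, which is the point of keeping the two parts apart.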
In Fig. 1, the knowledge base is connected to the working memory, which is a small portion of memory that is activated at any particular time. In an expert system, the static knowledge of facts and rules is stored in the knowledge base. When the system begins a consultation process the values of various attributes are determined, and this dynamic information is stored in the working memory. When the consultation process ends, the working memory is deleted by the system, but the knowledge base still contains all the facts and rules as they were added initially.
An inference engine takes the statements in the knowledge base and ‘executes’ them. It also decides the order in which inferences are made. Unlike the knowledge base, an inference engine may be common to a number of domains that have similar characteristics (Myers, 1986). It basically provides the control that drives the expert system.
Knowledge in an expert system is compiled by a knowledge engineer, who organizes and encodes the knowledge obtained from the domain expert in the form of facts and rules. This process is known as the Knowledge-Acquisition process. Since the knowledge engineer typically has far less knowledge of the problem domain than the expert, communication problems may impede the process of transferring expertise into a program (Hayes-Roth et al., 1983).
The user interface in a knowledge based expert system is essential because it helps the users to interact with the system. User interface features include explanation facilities, on-line help systems, etc. An explanation subsystem must be included in the system if the user of the system needs to examine the system’s reasoning process. This allows the user to ask questions like ‘why’ or ‘how’ at any stage of the consultation process.
There are two main phases in the development of an expert system as shown in Fig. 2 (Hayes-Roth et al., 1983). The first phase involves identifying and conceptualizing the problem the system is to solve.
During this phase an expert is selected, knowledge sources and resources are acquired, the problem is clearly defined and the key concepts of the problem are uncovered. In the second phase, formalization, implementation and testing of the system take place. This phase includes constant reformulation of concepts and refinement of the implemented system. When the prototype system performs as desired, the knowledge engineer and the expert are ready to expand on the prototype and build a complete system.
The prototype system is the initial version of an expert system which is designed to test the effectiveness of the facts and rules in solving a particular problem. The prototype system can be expanded by adding rules and modifying the reasoning process. Thus, expert systems can be developed and maintained incrementally by building a kernel, and then adding more information to it.
[Figure 2: flowchart of development stages with a refinement loop. Recoverable labels: IDENTIFICATION (identify problem characteristics); CONCEPTUALIZATION (find concepts to represent knowledge); FORMALIZATION (organize knowledge); IMPLEMENTATION (formulate rules to embody knowledge); TESTING (validate rules that organize knowledge); Refinements.]
Fig. 2. Stages of Expert System Development.
Knowledge based systems are quite different from conventional programs in this regard. In conventional systems, since each statement is executed in a sequential order, it is difficult to add more information to the program, because one has to find the exact place in the program to add the information. Such additions may change the entire design of a conventional program. This is not the case in knowledge based systems, however, since the knowledge is completely separated from the control. Rules are executed by the inference engine whenever there are necessary facts in the knowledge base or working memory to enable them. A new rule can always be added since there is no nesting of rules. Because each rule behaves like a separate module, incremental development of expert systems is possible and is encouraged.

Representation of Medical Knowledge
PULMONOLOGIST is a prototype expert system for the diagnosis and treatment of lung diseases. It has been developed using ART, which is a complete software tool-kit for building expert systems. In ART, a knowledge base is composed of facts, schemata and rules (Clayton, 1985a). Facts are single unrelated pieces of information about objects, whereas a schema is a collection of related facts about a single object.
A knowledge base developed using ART has two states, initial and current. The initial state contains the initial facts and all the rules that ART requires to begin reasoning about an application (Clayton, 1985b). Facts that are asserted or retracted by the rules do not affect the initial state of the knowledge base. When the reasoning process begins, rules and facts in the initial state interact, thereby changing the information in the knowledge base (i.e. working memory as described in the previous section). This is reflected in the current state of the knowledge base, which at any point in time contains the conclusions that ART has reached up to that point. The initial state of the knowledge base can always be recovered by resetting ART, which then erases the current state of the knowledge base.
Medical knowledge in PULMONOLOGIST is represented in terms of schemata and rules. A schema is a data structure in which all knowledge about a particular object is stored together. A schema can have zero or more slots. Each slot describes a characteristic of the schema. A slot consists of a slot-name and a slot-value. A slot-name relates the values in a schema slot. Slot values can be determined dynamically by the user or by deduction. The rules manipulate the facts and slot-values in the knowledge base, thereby producing a solution to the problem.
Each disease in the knowledge base is represented as a schema containing slots such as probable-symptoms, diagnostic-tests and probable-treatment. Probable-symptoms are the sensations reported by the patient to the physician. The laboratory tests that may be performed on the patient before the final diagnosis can be made are referred to as diagnostic-tests. Specific information for this diagnosis system was obtained from medical physiology and pathology textbooks such as The Merck Manual.
These texts list the symptoms and signs of various diseases and the laboratory tests that have to be performed before a diagnosis can be made, and recommend possible treatment for those diseases. This knowledge was then represented in PULMONOLOGIST in the form of schemata.
Figure 3 shows an example of a disease schema written for PULMONOLOGIST. In the figure, the disease Bronchial-Asthma is represented as a schema consisting of 5 slots, in which the slots probable-symptom, diagnostic-test and probable-treatment hold multiple values. The slot probable-symptom contains the symptoms which might appear in a patient who has the disease being considered. The next slot in the schema is total-symptoms, which has a single slot value of 7, because there are 7 probable-symptoms for this disease. The 4 diagnostic tests (blood exam, sputum exam, chest X-ray and physical exam) that may be performed on the patient if the disease is diagnosed are contained in the slot named diagnostic-test. The 7 symptoms and 4 diagnostic-tests of this disease make a grand total of 11 slot values to be investigated before the disease can be diagnosed.
    (defschema Bronchial-Asthma
      (is-a lung-disease)
      (probable-symptom (slot-how-many multiple-values)
        dyspnea
        chronic-cough
        respiratory-distress
        audible-wheezes
        tachypnea
        tachycardia
        chest-pain)
      (total-symptoms 7)
      (diagnostic-test (slot-how-many multiple-values)
        blood-exam
        sputum-exam
        chest-x-ray
        physical-exam)
      (grand-total 11)
      (probable-treatment (slot-how-many multiple-values)
        Adrenergic-agent-for-treating-acute-attack
        Theophylline-for-long-term-continuous-therapy
        Corticosteroids-if-all-treatments-have-failed
        Disodium-chromoglycate-for-maintenance-therapy))
Fig. 3. Example of a disease schema in ART.
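
For readers unfamiliar with ART, the schema of Fig. 3 can be mirrored in a general-purpose language. The following is only a sketch in Python: the `Schema` class, its `get` method and the `affected-organ` slot on the parent are hypothetical names introduced for illustration, and the lookup along the is-a link (the second line of the figure) is an assumed, simplified model of schema inheritance rather than ART's actual mechanism. The slot values themselves are copied from Fig. 3.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Schema:
    """A named collection of slots with an optional is-a parent."""
    name: str
    is_a: Optional["Schema"] = None
    slots: dict = field(default_factory=dict)

    def get(self, slot_name):
        """Look a slot up locally first, then along the is-a chain."""
        schema = self
        while schema is not None:
            if slot_name in schema.slots:
                return schema.slots[slot_name]
            schema = schema.is_a
        raise KeyError(slot_name)


# Hypothetical parent schema; `affected-organ` is an assumption used only
# to show a value being found through the is-a link.
lung_disease = Schema("lung-disease", slots={"affected-organ": "lung"})

bronchial_asthma = Schema(
    "Bronchial-Asthma",
    is_a=lung_disease,
    slots={
        "probable-symptom": [
            "dyspnea", "chronic-cough", "respiratory-distress",
            "audible-wheezes", "tachypnea", "tachycardia", "chest-pain",
        ],
        "diagnostic-test": [
            "blood-exam", "sputum-exam", "chest-x-ray", "physical-exam",
        ],
        "probable-treatment": [
            "Adrenergic-agent-for-treating-acute-attack",
            "Theophylline-for-long-term-continuous-therapy",
            "Corticosteroids-if-all-treatments-have-failed",
            "Disodium-chromoglycate-for-maintenance-therapy",
        ],
    },
)

# Derive the two counting slots from the lists instead of typing them by hand.
n_symptoms = len(bronchial_asthma.slots["probable-symptom"])
n_tests = len(bronchial_asthma.slots["diagnostic-test"])
bronchial_asthma.slots["total-symptoms"] = n_symptoms
bronchial_asthma.slots["grand-total"] = n_symptoms + n_tests

print(bronchial_asthma.get("total-symptoms"), bronchial_asthma.get("grand-total"))
print(bronchial_asthma.get("affected-organ"))
```

Deriving `total-symptoms` and `grand-total` from the lists is a design choice of this sketch; in Fig. 3 both values are written explicitly in the schema.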
If a patient has all 7 symptoms and the 4 tests confirm the disease, then the disease will be diagnosed with 100% probability. The treatment that will be recommended for the disease is listed in the slot probable-treatment. Bronchial-Asthma is thus defined as a schema. Note on the second line of Fig. 3 that the schema is-a lung-disease. This is-a relationship provides a linkage between this schema and a general schema for lung-disease declared elsewhere in the program. This means that Bronchial-Asthma inherits all properties associated with a lung-disease. Inheritance transfers characteristics from one schema to another. With an inheritance relation, information originally defined in one schema can be ‘copied’ and used in a related schema (Clayton, 1985a). Thus, information that is true for all lung-diseases can be added to the general lung-disease schema once, and it will automatically propagate along inheritance links into the schemata of all lung-diseases.
In PULMONOLOGIST, the facts drive the rules in that a rule is fired only when the required situation arises. In the prototype system, the rules are written in an IF/THEN format. The IF portion (Left Hand Side) of a rule contains several different conditions which must be satisfied before the rule can be executed. Manipulation of the facts and schema slots takes place in the THEN portion (Right Hand Side) of a rule. Expert system software such as ART allows the user to create variable or generic patterns, which can match several different patterns (facts) in the knowledge base.
    (defrule retract-confirmed
      ?x <- (dis-total ?d)
      ?y <- (disease-retracted ?r)
      (schema ?disease
        (total-symptoms ?total)
        (symptoms-absent ?abs&:(((?abs*100)/?total) > 50)))
      =>
      (retract ?disease)
      (retract ?x ?y)
      (assert (disease-retracted =(?r + 1)))
      (assert (dis-total =(?d + 1))))
Fig. 4. Example of an ART rule.
Figure 4 gives an example of a rule which retracts a disease from the working memory (the current state of the knowledge base), if the patient exhibits fewer than 50% of the total-symptoms of a particular disease. In ART, a variable name begins with a question mark (?). Thus, in Fig. 4, the variable ?x is assigned the pattern dis-total ?d, where ?d corresponds to the number of diseases investigated so far in the diagnostic process. The variable ?y is assigned a pattern in which ?r is the number of diseases that have been deleted from the current state of the knowledge base. The ?disease schema pattern can locate any disease schema with a matching pair of variables ?total and ?abs, where ?total denotes the total number of symptoms of that disease, and ?abs refers to the number of symptoms of the disease that were not displayed by the patient. This rule will fire only if (?abs*100)/?total is greater than 50, and all other patterns on the left hand side of the rule are matched. When the rule is executed, the disease that matched the schema pattern is retracted from the current state of the knowledge base. The patterns that are bound to the variables ?x and ?y are also deleted. Since one disease is retracted by the rule, the number of diseases retracted is incremented by one. Also, the number of diseases investigated so far by the reasoning process is increased to ?d + 1.
The current prototype expert system has 15 disease schemata in the knowledge base, but it can easily be extended by increasing the number of diseases in a format similar to that of Fig. 3. Rules were written using generic patterns, which may match several different patterns in the knowledge base. These rules work for any number of diseases contained in the knowledge base.
Thus, if the knowledge base is extended, the rules do not have to be altered, and the system is expected to provide accurate diagnoses as before.
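
The effect of the rule in Fig. 4 can be traced with a small Python sketch. This is not ART code and models only this one rule: the function name `retract_unlikely`, the dictionary layout, the placeholder disease names and the symptom counts are all hypothetical, and the use of ordinary (non-integer) division is an assumption about how the percentage is computed.

```python
def retract_unlikely(diseases, counters):
    """Apply the test of Fig. 4 to every disease in `diseases`.

    `diseases` maps a disease name to a dict with the slots
    "total-symptoms" and "symptoms-absent".
    `counters` holds "dis-total" and "disease-retracted".
    New objects are returned; the inputs play the role of the initial
    state and are left unchanged.
    """
    remaining = {}
    new_counters = dict(counters)
    for name, slots in diseases.items():
        absent_pct = slots["symptoms-absent"] * 100 / slots["total-symptoms"]
        if absent_pct > 50:
            # Right hand side of the rule: drop the disease, bump both counters.
            new_counters["disease-retracted"] += 1
            new_counters["dis-total"] += 1
        else:
            remaining[name] = slots
    return remaining, new_counters


# Hypothetical working data: the counts are invented for illustration.
candidates = {
    "Bronchial-Asthma": {"total-symptoms": 7, "symptoms-absent": 1},
    "Disease-B": {"total-symptoms": 6, "symptoms-absent": 4},
    "Disease-C": {"total-symptoms": 4, "symptoms-absent": 2},
}
start = {"dis-total": 0, "disease-retracted": 0}

remaining, counters = retract_unlikely(candidates, start)
print(sorted(remaining))
print(counters)
```

In this example `Disease-B` is dropped because 4 of its 6 symptoms are absent, while `Disease-C` is kept because exactly half of its symptoms are absent and the test requires strictly more than 50.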
The Diagnostic Process
In medical expert systems, a diagnostic process may be divided into four general areas as follows: (1) communication of information about the patients to the system; (2) comparison of patient information with the available medical knowledge base; (3) diagnostic decision-making by the rules; (4) treatment of patient recommended after the final diagnosis.
Initially all 15 disease schemata in PULMONOLOGIST are considered as potential diagnoses and are activated in working memory. PULMONOLOGIST begins the diagnostic process by asking the user about the symptoms displayed by the patient. The user responds to these questions with a ‘yes’ if the patient displays the symptom, ‘no’ otherwise. The rules in the knowledge base are marked with the symptoms present as confirmed-symptom, and the symptoms absent as negative-symptom. If a patient has fewer than 50% of the probable-symptoms of a disease, then that disease schema is deleted from the list of possible diseases. This deletion takes place in the current state of the knowledge base; the initial state remains unaltered. If all the symptoms of a certain disease are displayed by the patient, then all other disease schemata are retracted from the current state of the knowledge base, and the system diagnoses the disease with a 100% probability.
If a patient has more than 50% of the symptoms of a disease, then a confirmation level of the disease is calculated. For example, if 5 out of 7 symptoms of a disease are present in a patient, the confirmation level of that disease will be (5/7) * 100, i.e. 71%. As soon as a disease with a higher confirmation level is found, the other diseases with lower confirmation levels are deleted from the current state of the knowledge base by the system. If two diseases have the same highest confirmation level, then both diseases are considered for diagnosis.
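
The symptom-based selection just described can be summarized in a short Python sketch. It is an illustration under stated assumptions, not the system's code: the names `confirmation_level`, `diagnose` and `Hypothetical-Disease-X` are invented, a disease with exactly 50% of its symptoms present is assumed to be kept (the text only specifies the cases below and above 50%), and exact fractions are used so that ties are compared without rounding error.

```python
from fractions import Fraction


def confirmation_level(present, total):
    """Percentage of a disease's probable symptoms shown by the patient."""
    return Fraction(present * 100, total)


def diagnose(disease_symptoms, observed):
    """Return {disease: level} for the disease(s) with the highest level.

    Diseases with fewer than 50% of their symptoms present are dropped.
    """
    observed = set(observed)
    levels = {}
    for disease, symptoms in disease_symptoms.items():
        present = len(observed & set(symptoms))
        level = confirmation_level(present, len(symptoms))
        if level >= 50:  # assumption: exactly 50% is kept
            levels[disease] = level
    if not levels:
        return {}
    best = max(levels.values())
    # Ties at the highest level are all reported.
    return {d: lvl for d, lvl in levels.items() if lvl == best}


KNOWLEDGE = {
    # Symptom list taken from Fig. 3.
    "Bronchial-Asthma": [
        "dyspnea", "chronic-cough", "respiratory-distress",
        "audible-wheezes", "tachypnea", "tachycardia", "chest-pain",
    ],
    # Invented entry, present only to show a disease being dropped.
    "Hypothetical-Disease-X": ["symptom-a", "symptom-b", "dyspnea", "symptom-c"],
}

observed = ["dyspnea", "chronic-cough", "respiratory-distress",
            "audible-wheezes", "tachypnea"]
result = diagnose(KNOWLEDGE, observed)
for disease, level in result.items():
    print(f"{disease}: {round(level)}%")
```

With 5 of the 7 asthma symptoms observed, the sketch reproduces the worked figure from the text, (5/7) * 100, which rounds to 71%.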
When a disease is diagnosed based on the symptoms of the patient, the disease with its confirmation level is displayed on the screen, together with the diagnostic tests suggested to be performed on the patient. The system queries the user about the diagnostic tests, and the user has to answer whether the required tests were performed or not. If the tests were performed, the user is asked whether each of the test results confirmed the diagnosis based on the symptoms. If the diagnosis was confirmed by the test result, then the confirmation level of the disease is increased, otherwise it is decreased. If the laboratory test was not performed on the patient, then that test is marked as a negative-test.
At the end of this session, the diagnosed diseases are displayed with their probability, together with the probable-symptoms of that disease that were not displayed by the patient, the diagnostic-tests suggested to be performed in order to confirm the diagnosis, and the recommended treatment for that disease. If more than one disease is diagnosed, then the patient will most likely have the disease with the highest probability, but the laboratory tests have to be performed before an accurate diagnosis can be made. If the symptoms do not match any of the diseases in the knowledge base, then this is reported to the user. The diagnosis system will be tested against clinical data in future studies.

An Interactive Session with PULMONOLOGIST
An example session begins by displaying the introductory window (Fig. 5) created by using the graphics capability of ART. Next, the user is questioned about the symptoms displayed by the patient in the following manner:
    Does the patient exhibit DYSPNEA (Y/N)?
The system lists all the symptoms that have been observed in the patient in a window (Fig. 6) as they are entered. Once all the symptoms have been entered, a diagnosis is made based on the symptoms using the schemata for the diseases in the knowledge base. For example, for the symptoms displayed by the patient as in Fig. 6, the system lists the two diseases diagnosed on the screen (Figs. 7 and 8). The final diagnosis is made based upon the laboratory tests that were performed, and treatment is recommended by the system (Fig. 9).
    Welcome to PULMONOLOGIST
    This system will diagnose and suggest treatment for 15 possible Lung Diseases.
    The user will be prompted for
      (1) the symptoms exhibited by the patient
      (2) the diagnostic tests to be performed.
    Each disease diagnosed will have a probability associated with it, and the patient will most likely have the disease with the highest probability.
Fig. 5. Introductory window in PULMONOLOGIST.
[Figure 6: screen window headed "Symptoms displayed by the patient:", listing the symptoms entered, including DYSPNEA, RESPIRATORY-DISTRESS and TACHYCARDIA; the remaining entries are illegible in the source.]
Fig. 6. Symptoms displayed by the patient.
[Figure 7: screen window reading "DISEASE DIAGNOSED BY THE SYSTEM: CYSTIC-FIBROSIS", with a probability value and a suggested diagnostic test (a sweat test); the remaining text is illegible in the source.]
Fig. 7. First disease diagnosed by the system.
[Figure 8: screen window reading "DISEASE DIAGNOSED BY THE SYSTEM: BRONCHIAL-ASTHMA", with a probability value and suggested diagnostic tests beginning with BLOOD-EXAM; the remaining text is illegible in the source.]
Fig. 8. Second disease diagnosed by the system.

Enhancements
A proposed enhancement to the current system involves extending the initial state of the knowledge base by increasing the number of lung diseases. Also, the system need not be limited to one type of disease. Other forms of disease, such as heart disease, can be added to the knowledge base without much difficulty.
Another extension to the existing system would be to implement a natural language capability (Roach et al., 1985). This will involve building a ‘language-understanding’ system that would classify possible ‘words’ according to contextual and grammatical rules (Harmon and King, 1985). This will allow the user to interact with the system by means of ordinary language.
A third enhancement could be the addition of an explanation facility. This would give the user the freedom of asking why a rule was fired or how a conclusion was reached in the diagnostic process. The system would respond to such queries and explain its reasoning process. This will make the consultation process more acceptable to the user, and will help the human expert find errors in the system’s reasoning process when they occur.
Lastly, the current system could be converted to a rule-based system, by removing all the disease schemata and incorporating IF/THEN rules, such as:
    if patient has symptom1 and symptom2 and symptom3 then patient has disease1 with x% probability.
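
The proposed IF/THEN form could look roughly like the following Python sketch. It is purely illustrative: the rule table, the function `fire_rules`, the second rule and both percentages are hypothetical stand-ins for the unspecified x% in the text, and requiring all listed symptoms to be present is an assumption about how such rules would match.

```python
# Each hypothetical rule: (required symptoms, disease, probability in percent).
# The percentages are placeholders, not values from the system.
RULES = [
    (frozenset({"symptom1", "symptom2", "symptom3"}), "disease1", 80),
    (frozenset({"symptom2", "symptom4"}), "disease2", 60),
]


def fire_rules(observed, rules=RULES):
    """Return (disease, probability) for every rule whose IF part is satisfied."""
    observed = set(observed)
    return [(disease, pct)
            for required, disease, pct in rules
            if required <= observed]


print(fire_rules(["symptom1", "symptom2", "symptom3"]))
```

In this form the user supplies only the symptoms that are present, and every rule whose conditions are all contained in that set fires.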
    DIAGNOSIS AND TREATMENT
    DISEASE DIAGNOSED: CYSTIC-FIBROSIS IS 90% PROBABLE.
    The SYMPTOMS absent are: DIABETES
    Suggested TREATMENT:
      ADEQUATE-PROTEIN-INTAKE
      MULTIVITAMIN-TABLETS
      CHEST-PHYSICAL-THERAPY
    DISEASE DIAGNOSED: BRONCHIAL-ASTHMA IS [value illegible in source] PROBABLE.
    The SYMPTOMS absent are: CHEST-PAIN
    Suggested TREATMENT:
      ADRENERGIC-AGENT-FOR-TREATING-ACUTE-ATTACK
      THEOPHYLLINE-FOR-LONG-TERM-CONTINUOUS-THERAPY
      CORTICOSTEROIDS-IF-ALL-TREATMENTS-HAVE-FAILED
      DISODIUM-CHROMOGLYCATE-FOR-MAINTENANCE-THERAPY
Fig. 9. Final diagnosis and treatment suggested by the system.
The system must ask the user to input the symptoms of the patient in the first step of the diagnostic process. After the user inputs all the symptoms of the patient, an appropriate rule will be executed, and the required diagnosis and treatment will be made. This will save the user a lot of time by not having to answer ‘no’ for most of the symptoms that the patient does not have, and will make the system more efficient.

Conclusions
The emergence of computer-aided tools that support knowledge engineering makes it practical for software engineers without extensive artificial intelligence training to build and maintain knowledge based expert systems. Such systems may gain widespread utilization as they become operable on common personal computer systems as opposed to their current use on large minicomputers. Expert systems can never take the place of an expert physician in patient diagnosis, but can serve as a valuable aid in identifying complex diseases as well as a useful training device.

References
Clayton, B.D., 1985a, ART Programming Tutorial, I, II, III, Inference Corporation, Los Angeles, California.
Clayton, B.D., 1985b, ART Reference Manual, Inference Corporation, Los Angeles, California.
Duda, R.O. and Shortliffe, E.H., 1983, Expert systems research, Science, 220, 261-268.
Harmon, P. and King, D., 1985, Expert Systems: Artificial Intelligence in Business, John Wiley & Sons, Inc.
Hayes-Roth, F., Waterman, D.A. and Lenat, D.B., 1983, Building Expert Systems, Addison-Wesley Publishing Company.
Myers, W., 1986, Introduction to expert systems, IEEE Expert, 1, 100-109.
Roach, J., Lee, S., Wilcke, J. and Ehrich, M., 1985, An expert system for information on pharmacology and drug interactions, Comp Bio Med, 15, 11-23.
The Merck Manual, 1982, Merck & Co. Inc., Rahway, New Jersey.