MATHEMATICAL AND COMPUTER ASSISTED PROCEDURES IN CLINICAL DECISION MAKING
Paul Phillip Sher, M.D.*
Abstract
Numerous mathematical and computer assisted procedures have been developed and tested as aids in clinical decision making. With pressures to curtail unnecessary utilization of diagnostic tests, these models may play an increasingly important role. In practice, additional data may benefit the construction of mathematical models but may not necessarily benefit clinicians. With improvements in computer technology, laboratory medicine is in a strategic position to influence the direction of diagnostic testing and clinical decisions in a more cost effective manner.
In the past decade, developments in microprocessor and computer technology have provided easy access to computational power once available only in large, costly systems. This potential has helped expand interest in mathematical and computer assisted strategies to improve the utilization of laboratory data and provide an aid to clinical decision making. There is a growing realization that the volume of data generated by the clinical laboratory conceals important information, in the form of patterns and associations, that could be valuable in the diagnostic process but currently goes unrecognized. Improving the information content of laboratory tests has been dealt with in a variety of models.¹ The use of mathematical and computer strategies to evaluate tests has two major goals: to make possible more accurate decision making and to provide an internal laboratory tool to determine appropriate testing strategies and reduce the redundancy of tests ordered. There are three basic areas in which computers can provide help: collecting data from patients, differential diagnosis of possible diseases (computer assisted diagnosis), and direct analysis of test results. It is with the third area, the direct analysis of test results, that the laboratory is most concerned. It is generally recognized that there is more information in the multidimensional expression of data. Yet providing increasing amounts of data does not necessarily help the clinician. Evidence exists that there is a limit to the amount of information that a
clinician can handle before it has a deleterious effect. DeDombal² demonstrated that clinicians can be overloaded with information, so that the provision of additional "statistically valuable information" confuses rather than helps. The computer, on the other hand, can handle increasing amounts of information without overload. There are numerous examples of techniques to reduce the amount of information to either optimal subsets or optimal diagnostic strategies.³ An example of optimal strategy could be parallel versus sequential testing.⁴ Others have approached the problem by developing extensive computer assisted diagnostic modules incorporating the necessary mathematical and statistical algorithms into easily used computerized models.⁵ Numerous authors have addressed the theoretical considerations in developing and using computer aided diagnosis.⁶⁻¹⁰ I would like to review the diagnostic models that have been applied to this problem, together with some of the current problems and those that will have to be addressed in the future if we are to achieve more effective utilization of laboratory tests.
DIAGNOSTIC MODELS
There are a number of diagnostic models, which can be classified by the particular mathematical, statistical, or decision theory approach taken (Table 1).
Accepted for publication February 15, 1980.
*Associate Professor of Clinical Pathology, New York University School of Medicine. Director of Clinical Laboratories, Department of Pathology, University Hospital, New York University Medical Center, New York, New York.
HUMAN PATHOLOGY--VOLUME 11, NUMBER 5 September 1980
TABLE 1. DIAGNOSTIC MODELS
1. Models of physicians' thought process
2. Models of physical processes
3. Statistical model approaches
   Clustering techniques and numerical taxonomy
   Simple Bayesian models
   Sequential Bayesian models
   Discriminant function analysis
4. Logical approaches
   Flow charts
   Sequential methods
   Decision trees
Models of Physicians' Thought Processes
This has not been a well studied area in medical diagnosis. Physicians' thought processes have generally been regarded as too complex to model, and organizing them into a workable diagnostic scheme can be a difficult task. One such approach is Internist, a computer based consultative system that attempts to mimic a "master internist."¹¹
Models of Physical Processes
These models attempt to use information relating to the physiological inter-relationships in disease.¹²,¹³ At present they do not offer a general approach, but are uniquely specific for a given problem.
Statistical Models
CLUSTERING TECHNIQUES. In an attempt to define clinically homogeneous and useful groups, each patient is compared with every other patient, and clustering methods are employed to form groups such that, in any given cluster, patients resemble each other more than they resemble members of other clusters. This approach is not useful in a real-time decision making system but is of value in developing taxonomic classifications of diseases and in evaluating the response to therapy.¹⁴⁻²⁰ A modification of this approach, with greater potential in analyzing multiple test results, is multidimensional spatial analysis, which provides a mechanism for quantitating similarities and differences between groups of individuals on the basis of multiple inter-related parameters.²¹
Bayesian Probability Models
Over 200 years ago Thomas Bayes²² proved a fundamental theorem in conditional probability. His mathematical concept allows the use of an a priori probability of diseases and conditional probabilities of signs, symptoms, or laboratory tests to calculate the posterior probability. The basic form of Bayes' theorem can be stated as:
$$P(D/S) = \frac{P(D)\,P(S/D)}{P(S)}$$
where P denotes the probability of occurrence, S represents all data (symptoms, signs, laboratory tests), D represents a disease, and the notations P(D/S) and P(S/D) signify the probability of D given S and of S given D, respectively. P(S/D), P(D), and P(S) may be derived from survey data or from subjective estimates. Simple Bayesian approaches allocate each complete set of data to a disease class.²³⁻²⁶ Sequential Bayesian approaches develop a diagnostic tree search in which the greatest probability (informativeness) directs the selection of the next step.²⁷⁻²⁹ In some schemes, informativeness can be replaced by utility or cost to the patient.³⁰
Bayesian approaches have received much attention, and they represent almost 60 per cent of all algorithms in the literature. In spite of criticisms, Bayesian models are popular because they are statistically "robust" and easily applied. They are efficient users of computer time and can even be set up to operate on a pocket calculator.³¹
Some basic problems with Bayesian approaches are that:
1. Prior probabilities may differ widely both geographically and from institution to institution.
2. Conditional probabilities in certain circumstances may result from subjective estimates that are less accurate than survey data, and unusual symptom complexes may be difficult to handle correctly.³²
3. Multiple diseases in the same patient can distort the results of Bayesian probability. Diseases should be mutually exclusive and exhaustive, and the findings are assumed to be independent, although models can be developed that allow for dependency among the variables used.³³
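As a minimal sketch of how Bayes' theorem in the form given in this section might be applied, the Python fragment below computes P(D/S) for a set of mutually exclusive diseases and then repeats the update for a second finding, in the spirit of the sequential Bayesian approaches. The function names, the disease labels D1 to D3, and all probabilities are hypothetical and chosen only for illustration; the sequential step assumes that the findings are conditionally independent given the disease.

```python
def posterior(priors, likelihoods):
    """Return P(D/S) for each disease D given one observed finding S.

    priors:      {disease: P(D)}, assumed mutually exclusive and exhaustive
    likelihoods: {disease: P(S/D)} for the observed finding S
    """
    # P(S) is the sum over all diseases of P(D) * P(S/D).
    joint = {d: priors[d] * likelihoods[d] for d in priors}
    p_s = sum(joint.values())
    if p_s == 0:
        raise ValueError("finding has zero probability under every disease")
    return {d: joint[d] / p_s for d in joint}


def sequential_posterior(priors, findings):
    """Apply Bayes' theorem one finding at a time.

    Each posterior becomes the prior for the next finding, which is valid
    only if the findings are conditionally independent given the disease.
    """
    current = dict(priors)
    for likelihoods in findings:
        current = posterior(current, likelihoods)
    return current


# Hypothetical numbers, for illustration only (not taken from any survey).
priors = {"D1": 0.10, "D2": 0.05, "D3": 0.85}
finding_1 = {"D1": 0.90, "D2": 0.40, "D3": 0.05}  # P(S1/D)
finding_2 = {"D1": 0.20, "D2": 0.70, "D3": 0.05}  # P(S2/D)

after_one = posterior(priors, finding_1)
after_two = sequential_posterior(priors, [finding_1, finding_2])
print(after_one)
print(after_two)
```

Changing the priors while holding the conditional probabilities fixed shows how strongly the posterior depends on local disease prevalence, which is the first of the problems listed for Bayesian approaches.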
DISCRIMINANT FUNCTION ANALYSIS. Originally developed by Fisher³⁴ for taxonomic classification and introduced by Zieve and coworkers³⁵,³⁶ to laboratory diagnosis, discriminant analysis techniques have been widely applied in diagnostic models as well as in the laboratory in evaluating the usefulness of individual tests applied singly or in groups.³⁷⁻⁴⁰ In multivariate discriminant analysis, linear functions (equations) are computed from the test values and tested to separate two or more disease groups in an optimum manner. If the individual tests are introduced in a stepwise manner (stepwise discriminant analysis), the discrimination of individual tests and combinations of tests can be evaluated and redundant tests isolated. From a laboratory perspective, this allows for the evaluation of the contribution of individual tests and for the development of an appropriate testing repertoire to maximize discrimination. Limitations and restrictions of the use of discriminant analysis are sometimes overlooked. Lachenbruch⁴¹ reviewed the misuses of discriminant analysis. He found that problems fell into four areas: study goals unfocused, improper sampling procedure, assumptions violated, and variables poorly defined or selected. The major statistical assumptions that are often neglected are that there are no missing
or erroneous data, the sample size must be large enough to allow for reliable estimates, the observations to be allocated by the discriminant function must come from one of the populations from which the learning samples are drawn, and the distribution of the data must be multivariate normal (Gaussian) with equal covariance matrices.
A new approach utilizing discriminant analysis deserves special note. Robertson and colleagues⁴² have applied discriminant function analysis in following biochemical individuality. Linear discriminant functions on individuals can identify future specimens for the same individual with 90 per cent accuracy over a two year follow-up period. In addition, they produced graphic representations (computer drawn faces or nonlinear mapping). The authors conclude that subtle differences and similarities in graphic facial representations may enhance the usefulness of multivariate data.
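The two-group linear discriminant function described in this section can be illustrated with a short Python sketch. It assumes two laboratory tests per patient, equal covariance matrices in the two groups, and equal prior probabilities and misclassification costs (hence the midpoint cutoff). The function names and the test values are hypothetical; this is not the procedure of any of the studies cited.

```python
def mean_vector(rows):
    """Mean of each of the two test values over a group of patients."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def pooled_covariance(group_a, group_b):
    """Pooled within-group 2x2 covariance (assumes equal covariance matrices)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (group_a, group_b):
        m = mean_vector(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(group_a) + len(group_b) - 2
    return [[s[i][j] / dof for j in range(2)] for i in range(2)]


def fit_discriminant(group_a, group_b):
    """Return (weights, cutoff) of the linear function w . x separating A from B."""
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    s = pooled_covariance(group_a, group_b)
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    if det == 0:
        raise ValueError("pooled covariance matrix is singular")
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    # Fisher's weights: inverse pooled covariance times the difference in means.
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Midpoint between the projected group means (equal priors and costs).
    cutoff = 0.5 * (w[0] * (ma[0] + mb[0]) + w[1] * (ma[1] + mb[1]))
    return w, cutoff


def classify(x, w, cutoff):
    """Allocate one patient's pair of test values to group A or group B."""
    score = w[0] * x[0] + w[1] * x[1]
    return "A" if score > cutoff else "B"


# Hypothetical learning samples: (test 1, test 2) for each patient.
group_a = [(2.0, 3.0), (3.0, 3.5), (2.5, 4.0), (3.5, 4.5)]
group_b = [(6.0, 1.0), (7.0, 2.0), (6.5, 1.5), (7.5, 2.5)]

w, cutoff = fit_discriminant(group_a, group_b)
print(classify((3.0, 3.0), w, cutoff))
print(classify((7.0, 1.0), w, cutoff))
```

Checking the function only against its own learning samples, as the test values here would allow, overstates its accuracy; a separate test population is needed for an honest estimate.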
LOGICAL APPROACHES
Whether they take the form of flow charts, sequential questioning, or decision trees, the logical approaches tend to mimic the classic differential diagnosis schemes, with decision nodes and binary branches. Most of our laboratory testing strategies are currently based on these approaches. In more sophisticated approaches, both probability and utility can be incorporated into the tree structure.⁴³
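A decision tree that carries both probability and utility can be evaluated by "folding back" from the outcomes to the root. The sketch below shows one way this might be done in Python. The node encoding, the function names, and every probability and utility (on an arbitrary 0 to 1 scale, with a 0.01 penalty for performing the test) are assumptions made for illustration only.

```python
def expected_utility(node):
    """Fold back a decision tree and return its expected utility.

    A node is one of:
      ("outcome", utility)
      ("chance", [(probability, child), ...])   probabilities must sum to 1
      ("decision", {label: child, ...})         the best branch is chosen
    """
    kind, payload = node
    if kind == "outcome":
        return payload
    if kind == "chance":
        total_p = sum(p for p, _ in payload)
        if abs(total_p - 1.0) > 1e-9:
            raise ValueError("chance probabilities must sum to 1")
        return sum(p * expected_utility(child) for p, child in payload)
    if kind == "decision":
        return max(expected_utility(child) for child in payload.values())
    raise ValueError(f"unknown node type: {kind}")


def best_choice(decision_node):
    """Return the label of the branch with the highest expected utility."""
    kind, branches = decision_node
    if kind != "decision":
        raise ValueError("best_choice expects a decision node")
    return max(branches, key=lambda label: expected_utility(branches[label]))


# Hypothetical problem: disease prevalence 0.3; a test with sensitivity 0.9
# and specificity 0.9; utilities are illustrative, not measured values.
tree = ("decision", {
    "treat_all": ("chance", [(0.3, ("outcome", 0.90)),
                             (0.7, ("outcome", 0.80))]),
    "treat_none": ("chance", [(0.3, ("outcome", 0.40)),
                              (0.7, ("outcome", 1.00))]),
    "test_first": ("chance", [
        (0.27, ("outcome", 0.89)),  # diseased, test positive, treated
        (0.03, ("outcome", 0.39)),  # diseased, test negative, untreated
        (0.07, ("outcome", 0.79)),  # healthy, test positive, treated
        (0.63, ("outcome", 0.99)),  # healthy, test negative, untreated
    ]),
})

print(best_choice(tree), expected_utility(tree))
```

Replacing the utilities with costs (and `max` with `min`) would turn the same fold-back into a cost-minimizing testing strategy.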
VALIDITY OF MATHEMATICAL MODELS
The advantages and disadvantages of the various models should not be our primary concern. Croft compared the most common mathematical-diagnostic models, utilizing an extensively documented set of cases of liver disease with 2428 patients constituting 20 diseases. Each case had 50 attributes. Since model selection constitutes the most difficult aspect of the utilization of these techniques, one must ask, How important is the particular mathematical-statistical model selected? The author studied the strengths and weaknesses of 10 models utilizing a training population (1991 patients) and a test population (437 patients). All models produced 51 to 64 per cent correct diagnoses with all diseases and attributes. The authors concluded that "All the models produce similar diagnostic results, indicating that the development of increasingly more sophisticated models of this type may be a fruitless exercise."⁴⁴
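The comparison described above rests on fitting each model to a training population and scoring it on a separate test population. A minimal Python sketch of that bookkeeping follows. It reuses the population sizes quoted in the text (1991 and 437), but the cases are synthetic, the "model" is only a majority-class baseline standing in for any real diagnostic model, and all function names are hypothetical.

```python
import random
from collections import Counter


def split_cases(cases, n_train, seed=0):
    """Shuffle once and split into a training and a test population."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def fit_majority_model(training):
    """Baseline 'model': always predict the most common training diagnosis."""
    most_common = Counter(label for _, label in training).most_common(1)[0][0]
    return lambda attributes: most_common


def percent_correct(model, test):
    """Percentage of test cases whose predicted diagnosis matches the label."""
    hits = sum(1 for attributes, label in test if model(attributes) == label)
    return 100.0 * hits / len(test)


# Synthetic cases: (attributes, diagnosis). Not real patient data.
cases = [((i,), "D1" if i % 3 else "D2") for i in range(2428)]

training, test = split_cases(cases, n_train=1991)
model = fit_majority_model(training)
score = percent_correct(model, test)
print(len(training), len(test), round(score, 1))
```

Scoring a trivial baseline of this kind alongside the candidate models gives a floor against which a reported percentage of correct diagnoses can be judged.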
FUTURE ACCEPTANCE AND USE OF DIAGNOSTIC MODELS
In order to derive maximum benefit from these approaches, clinical laboratory medicine and allied
fields must work toward overcoming serious deficiencies in our present medical "systems." The full utilization of the potential of microprocessor-computer technology requires a commitment to improving the standardization of medical terminology and data as well as an integration of health information systems with diagnostic models and problem solving techniques. In addition, we must be aware of the difficulties in gaining acceptance of these approaches by physicians. Branscomb⁴⁵ points to the fact that in our present computer revolution, humans may not be prepared to interact with a computer and that continued human factors research will be necessary to enhance human-machine communications.
We are now entering a new age: the information age. Data will be the natural resource and information its refined product. The conservation of our resources may be viewed in this analogy as the optimizing of the data needed to provide useful information. The mathematical and computer assisted strategies we develop and test now can play an important role in the direction of laboratory medicine in the future.
REFERENCES
1. Wagner, G., Tautu, P., and Wolber, U.: Problems of medical diagnosis--a bibliography. Meth. Inform. Med., 17:55, 1978.
2. DeDombal, F. T., Horrocks, J. C., Staniland, J. R., and Guillou, P. J.: Pattern recognition: a comparison of the performance of a computer-based system. Meth. Inform. Med., 11:32, 1972.
3. DeDombal, F. T.: Computer-assisted diagnosis. In Whitby, L. G., and Lutz, W. (Editors): Principles and Practice of Medical Computing. Edinburgh, Churchill Livingstone, 1971.
4. Taylor, T. R.: Computer guided diagnosis. J. Roy. Coll. Phys. Lond., 4:188, 1970.
5. DeDombal, F.
T., Leaper, D. J., Staniland, J. R., Horrocks, J. C., and McCann, A. P.: Computer-aided diagnosis of abdominal pain. Brit. Med. J., 2:9-13, 1979.
6. Gorry, G. A., and Barnett, G. O.: Sequential diagnosis by computer. J.A.M.A., 205:840, 1968.
7. Lusted, L. B.: Introduction to Medical Decision Making. Springfield, Illinois, Charles C Thomas, 1968.
8. Croft, D. J.: Is computerized diagnosis possible? Comput. Biomed. Res., 5:351, 1972.
9. Gorry, G. A.: Computer-assisted clinical decision making. Meth. Inform. Med., 12:45, 1973.
10. Fisher, L., Kronmal, R., and Diehr, P.: Mathematical aids to medical decision-making. In Schuman, L. J., Speas, R. D., and Young, J. P. (Editors): Operations Research in Health Care. Baltimore, Johns Hopkins University Press, 1975, p. 365.
11. Glitz, W.: University of Pittsburgh computer helps physicians diagnose disease. Science and Health Report 195. Bethesda, Maryland, Department of Health, Education and Welfare, National Institutes of Health, 1978.
12. Lively, W. M., Szygenda, S. A., and Mize, C. E.: Modelling techniques for medical diagnosis. I. Heuristics and learning programs in selected neonatal hepatic disease. Comput. Biomed. Res., 6:393-410, 1973.
13. Mize, C. E., Lively, W. M., and Szygenda, S. A.: Modelling techniques in medical diagnosis. II. Differential diagnosis of neonatal hepatitis and biliary atresia. Comput. Biomed. Res., 9:239-245, 1976.
14. Manning, R. T., and Watson, L.: Signs, symptoms and systematics. J.A.M.A., 198:1180, 1966.
15. Hayhoe, F. G. J., Quaglino, D., and Doll, R.: The cytology and cytochemistry of acute leukemias: a study of 140 cases. Medical Research Council Special Report Series 304. London, Her Majesty's Stationery Office, 1964.
16. Baron, D. N., and Frazer, P. M.: Medical applications of taxonomic methods. Brit. Med. Bull., 24:236, 1968.
17. Jones, J. H.: The application of numerical taxonomy to the separation of colonic inflammatory disease. In Rose, J. (Editor): Computers in Medicine. Dorchester, John Wright and Sons, 1972.
18. Zinsser, H., Bonner, R., Lemlich, A., and Roots, L.: Pyelonephritis: a study of a disease in depth. Proceedings of the Fourth IBM Medical Symposium. Endicott, New York, 1962, pp. 371-402.
19. Bouchaert, A.: Computer diagnosis of goiters. I. Classification and differential diagnosis. J. Chron. Dis., 24:299, 1971.
20. Winkel, P., Juhl, E., and Tygstrup, N.: The clinical significance of classifications of cirrhosis. A comparison between conventional criteria and numerical taxonomy. Scand. J. Gastroenterol., 11:33, 1976.
21. Thompson, H. D., Jr., and Woodbury, M. A.: Clinical data representation in multidimensional space. Comput. Biomed. Res., 3:58, 1970.
22. Seal, H. L.: Bayes, Thomas. In Kruskal, W. H., and Tanur, J. M. (Editors): International Encyclopedia of Statistics. New York, The Free Press, 1978, p. 7.
23. Boyle, J. A., Greig, W. R., Franklin, D. A., Harden, R. M., Buchanan, W. W., and McGirr, E. M.: Construction of a model for computer assisted diagnosis: application to the problem of non-toxic goitre. Quart. J. Med., 35:565, 1966.
24. Lodwick, G. S.: Solitary malignant tumors of bone--the application of predictor variables in diagnosis. Roentgenology, 1:293, 1966.
25. Reale, A., Maccacaro, G. A., Rocca, E., D'Intino, S., Gioffre, P. A., Vestri, A., and Motolese, M.: Computer diagnosis of congenital heart disease. Comput. Biomed. Res., 1:533, 1968.
26. Stern, R. B., Knill-Jones, R.
P., and Williams, R.: Clinician versus computer in the choice of 11 differential diagnoses of jaundice based on formalised data. Meth. Inform. Med., 13:79, 1974.
27. Warner, H. R., Rutherford, B. D., and Houtchens, B.: A sequential Bayesian approach to history taking and diagnosis. Comput. Biomed. Res., 5:256, 1972.
28. Cobelli, C., and Salvan, A.: A medical record and a computer program for diagnosis of thyroid disease. Meth. Inform. Med., 14:126, 1975.
29. Knill-Jones, R. P., Stern, R. B., Girmes, D. H., Maxwell, J. D., Thompson, R. P. H., and Williams, R.: Use of sequential Bayesian model in diagnosis of jaundice by computer. Brit. Med. J., 1:530, 1973.
30. Gorry, G. A., Kassirer, J. P., Essig, A., and Schwarz, W. B.: Decision analysis as the basis for computer aided management of acute renal failure. Am. J. Med., 55:473, 1973.
31. Sherman, H.: A pocket diagnostic calculator program for computing Bayesian probabilities for nine diseases with sixteen symptoms. Comput. Biomed. Res., 11:177, 1978.
32. Leaper, D. J., Gill, P. W., Staniland, J. R., Horrocks, J. C., and DeDombal, F. T.: Computer assisted diagnosis of abdominal pain using estimates provided by clinicians. Brit. Med. J., 4:350, 1972.
33. Norusis, M. J., and Jacquez, J. A.: Diagnosis. I. Symptom nonindependence in mathematical models for diagnosis. Comput. Biomed. Res., 8:156, 1975.
34. Fisher, R. A.: The use of multiple measurements in taxonomic problems. Ann. Eugenics, 7:179, 1936.
35. Zieve, L.: On interpreting variations in laboratory tests as illustrated by an analysis of liver function tests. Postgrad. Med., 35:A46, 1964.
36. Zieve, L., and Hill, E.: An evaluation of factors influencing the discriminative effectiveness of liver function tests. I. The utilization of multiple measurements in medicine. Gastroenterology, 28:759, 1955.
37. Werner, M., Brooks, S. H., and Cohnen, G.: Diagnostic effectiveness of electrophoresis and specific protein assays, evaluated by discriminate analysis. Clin. Chem., 18:116, 1972.
38. Ramsoe, K., Tygstrup, N., and Winkel, P.: The redundancy of liver tests in the diagnosis of cirrhosis estimated by multivariate statistics. Scand. J. Clin. Lab. Invest., 26:307, 1970.
39. Zieve, L., and Hill, E.: An evaluation of factors influencing the discriminative effectiveness of a group of liver function tests. III. Relative effectiveness of hepatic tests in cirrhosis. Gastroenterology, 28:785, 1955.
40. Sher, P. P.: Diagnostic effectiveness of biochemical liver function tests, as evaluated by discriminant analysis. Clin. Chem., 23:627, 1977.
41. Lachenbruch, P. A.: Some misuses of discriminant analysis. Meth. Inform. Med., 16:255, 1977.
42. Robertson, E. A., VanSteirteghem, A. C., Byrkit, J. E., and Young, D. S.: Biochemical individuality and the recognition of personal profiles with a computer. Clin. Chem., 26:30, 1980.
43. Weinstein, M. D., and Fineberg, H. V.: Cost-effectiveness analysis for medical practices: appropriate laboratory utilization. In Benson, E. S., and Rubin, M. (Editors): Logic and Economics of Clinical Laboratory Use. New York, Elsevier, 1978, p. 3.
44. Croft, D. J., and Machol, R. E.: Mathematical methods in medical diagnosis. Ann. Biomed. Eng., 2:69, 1974.
45. Branscomb, L. M.: Information: the ultimate frontier. Science, 203:143, 1979.
University Hospital, 560 First Avenue, New York, New York 10016