Mathematics and Computers in Simulation XXIV (1982) 425-429 North-Holland Publishing Company
COMPLEX BIOLOGICAL MODELS: THEIR CONSTRUCTION AND EFFECTIVE USE

David GARFINKEL
Moore School of Electrical Engineering, University of Pennsylvania, Philadelphia, PA 19104, U.S.A.
The need for complex models in biology, and reasons such as esthetics and tradition for resistance to their use, are discussed. Examples considered are an existing cardiac metabolism model, and a possible comprehensive and complex diabetes model, with their associated problems. Characteristic problems of complex modeling include how complex a model should be, combination and interaction of models from different sources, the kind of underlying information involved, and the usual problems of model credibility. Specific problems of biological modeling include effective interaction with experimenters to secure timely solution of currently important problems (e.g., by experimental design), and effective interaction within the community of biological modelers.
In most current biological modeling efforts, experimenters in appropriate disciplines, such as pharmacokinetics, construct and fit relatively simple models to represent one or more sets of experimental data they have obtained. Such experimenters may not claim to be building models, and may not even be aware that they are doing so. Rather, they feel they are reducing or calculating out their experimental data. Other experimenters may deliberately build more complex models and use them to design experiments as well as to interpret results, particularly the results of multiple experiments. At the other extreme methodologically are complex "synthetic" models, which combine enough information that it must necessarily come from more than one source, and might even involve more than one modeler. We are here primarily concerned with the construction and use of such complex models, including their relationship with simpler models and with experiment.

The Need for Complex Models

If most biological models are simple, why are complex ones needed? Most of biology is inherently complex. Often complex models are built to assist in studying biological systems where (1) a complex system cannot be collapsed or lumped into a simple one without losing information or structure; (2) many subsystems, or unit structures, or other constituent elements are known to exist (and may have been well studied in isolation). Here it is necessary to examine or check such elements under the relevant conditions to determine their significance or insignificance, rather than neglecting them for the sake of convenience.

A claim that complex models (or complexity in general) are necessary is likely to be esthetically unappealing. DeMillo et al [1], in studying the acceptance of mathematical theorems, found that most mathematical theorems that are
proved are subsequently neglected for esthetic reasons, primarily because they are complex. Similar considerations enter into modeling as well as many other human activities: a simple, neat "solution" is preferred to a complex and messy one, and the preference may be strong enough to override a considerable lack of validity. However, complex and messy situations do exist and must be understood. Inherently complex systems should not be misrepresented through simplification just because this would be esthetically pleasing. An uncritical search for simplicity can lead to endless disputes over the relative merits of different simplifications. In modeling, "Occam's razor" is as applicable as in other scientific activities: of two possible alternative explanations for a given body of facts, choose the simpler. However, this does not justify "simplifying" the body of facts. This situation was well summarized by Einstein: "Everything should be made as simple as possible, but not simpler".

How Complex Should a Model Be

If complex models are necessary, how complex should they be? This issue deserves, and has received, serious consideration. There is a widespread feeling that a complex model with many parameters can be manipulated to the builder's taste: "You can fit an elephant to a straight line if given enough variable parameters." This sentiment appears to have been expressed most strongly by those who have never tried it, and it is not commonly modified to allow for the number of constraints on those parameters. Complexity, large numbers of parameters, and freedom to maneuver are not synonyms: a variable parameter may be the simplest possible representation of e.g. some model subsystem that would otherwise have to be represented in detail, increasing both model complexity and the modeler's freedom to manipulate it. Perhaps the principle being properly expressed here is that
0378-4754/82/0000 0000/$02.75 © 1982 IMACS/North-Holland
D. Garfinkel / Complex biological models
a model should be well determined by the data it fits, and have a minimal number of degrees of freedom. This implies that complex models should be based on or determined by large bodies of information, rather than that they cannot be reliable. Conversely, a simple model which fits only one set of data is somehow less powerful than a more complex one which fits several sets of data obtained under different conditions. It must be noted, however, that simplification in some particular situations may be desirable and valid.

Additional factors affecting how complex a model should be include:

(1) The purpose to which the model is to be put.
(2) The disciplinary area involved. Biochemists, physiologists, and clinicians may regard as desirable different degrees and kinds of complexity when considering the same subject. Biochemists will often be concerned with a relatively small subject area under localized conditions in great detail, physiologists with a broader subject area in less detail, and clinicians more with how well (and reliably, in the sense of making few bad mistakes) the model predicts variables important in patient care rather than those explaining the underlying phenomena. These differences in expectations among the several disciplines mentioned do cause difficulty.
(3) The extent to which the structure of a model is fixed by information other than the data to which it is being fitted (e.g., a structure may have been determined by other means). Simplifying a model too much may cause loss of much of the structure. Even inappropriate linearization of a nonlinear situation may cause difficulties.
(4) The complexity of a model may be limited by problems in communicating, displaying, or analyzing the results it yields, perhaps in very large volume. (A pile of printout paper is not a satisfactory outcome of the modeling process, even if it inherently contains much valuable information.)
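The principle that a model should be well determined by the data it fits can be illustrated with a small numerical sketch. The five data points below are invented for illustration, and the function names are hypothetical. A cubic with four free parameters reproduces the four points it is fitted to exactly, leaving no residual degrees of freedom, while a two-parameter straight line leaves two; the held-out fifth point then shows which of the two is better determined. This is only a toy comparison and says nothing against complex models that are constrained by large bodies of information.

```python
def lagrange_polynomial(xs, ys):
    """Polynomial of degree len(xs)-1 through every point (one parameter per point)."""
    def model(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return model


def least_squares_line(xs, ys):
    """Ordinary least-squares straight line (two free parameters)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def residual_degrees_of_freedom(n_points, n_free_parameters):
    """Data points left over once every free parameter has been pinned down."""
    return n_points - n_free_parameters


# Invented, roughly linear data; the last point is held out of both fits.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.9, 5.2, 6.8, 9.1]
fit_xs, fit_ys = xs[:4], ys[:4]

cubic = lagrange_polynomial(fit_xs, fit_ys)
line = least_squares_line(fit_xs, fit_ys)

for name, model, n_params in (("cubic", cubic, 4), ("line", line, 2)):
    dof = residual_degrees_of_freedom(len(fit_xs), n_params)
    error = abs(model(xs[4]) - ys[4])
    print(f"{name}: residual degrees of freedom = {dof}, "
          f"error on held-out point = {error:.2f}")
```

With these invented numbers the cubic, despite matching its own four points exactly, misses the held-out point by far more than the line does.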
Extreme viewpoints on complexity in the diabetes modeling literature are those of Albisser et al [2] and of Bergman [3]: "If we look back in the past, we find that many of our predecessors endeavored to simplify physiology ..... mathematical models have to consolidate knowledge. If they don't consolidate knowledge, they are approximations which probably won't work" [2]. "Large models of endocrine systems have proven useful to explain certain phenomena .... The application of models for purposes of clinical diagnosis, however, requires the identification of system parameters. For this purpose greatly simplified system models are required" [3].

A Specific Example: Cardiac Metabolism and the Role of Mg²⁺ Ion

The author has been involved in the construction of a series of similar but not identical complex models of
cardiac metabolism (e.g. [4]) whose detailed description is not appropriate here. These represent cardiac metabolism in terms of 60 to 80 enzymes under a variety of physiological and pathological conditions including anoxia and ischemia. These models are based on some thousands of data points and literature values, have their structure determined by the biochemical textbooks, and are mathematically represented by several hundred simultaneous nonlinear differential equations. Variations of these models may be needed to represent disturbances of cardiac metabolism due to diabetes (since diabetes changes the enzyme composition and phosphorylation state).

Only one specific point regarding this cardiac metabolism will be described here: the modeling of the behavior of Mg²⁺, which is a significant regulator of this metabolism. Recently Wu et al [5] claimed that it is not a metabolic regulator, as the level they determined in cardiac muscle is too high. The work of Wu et al was an attempted refutation of that of Gupta and Moore [6], who had determined a Mg²⁺ level in skeletal muscle about a factor of 5 lower (more in line with that calculated with our models). In both cases the level of Mg²⁺ in muscle had supposedly been determined by nuclear magnetic resonance techniques, with some ancillary measurements. Careful examination (including considerable calculation) of these conflicting papers showed that the basic experimental measurements were much the same, almost to within experimental error, and that the ancillary measurements were not different enough to account for the different results. Determining the Mg²⁺ level from the measurements requires mathematical modeling. The divergence in these results arises primarily in the process of modeling the experimental measurements, and some of this modeling is of poor quality.
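How strongly an indirectly determined level can depend on the model used to interpret a measurement can be seen in a deliberately simplified sketch. It assumes a single 1:1 binding equilibrium between Mg²⁺ and ATP, with dissociation constant Kd = [Mg²⁺][ATP]/[MgATP], so that a measured fraction φ of ATP in the complexed form implies [Mg²⁺] = Kd·φ/(1 − φ). This is not the procedure of either paper discussed above; the function name, the bound fraction, and the Kd values are all hypothetical and chosen only for illustration.

```python
def free_mg_mM(bound_fraction, kd_mM):
    """Free Mg2+ (mM) implied by a 1:1 binding equilibrium.

    bound_fraction: fraction of total ATP present as MgATP (0 <= value < 1).
    kd_mM: assumed dissociation constant of MgATP in mM.
    """
    if not 0.0 <= bound_fraction < 1.0:
        raise ValueError("bound_fraction must lie in [0, 1)")
    if kd_mM <= 0.0:
        raise ValueError("kd_mM must be positive")
    return kd_mM * bound_fraction / (1.0 - bound_fraction)


# Hypothetical inputs: one "measurement", two assumed constants.
measured_fraction = 0.90
for assumed_kd in (0.05, 0.10):
    level = free_mg_mM(measured_fraction, assumed_kd)
    print(f"Kd = {assumed_kd:.2f} mM -> free Mg2+ = {level:.2f} mM")

# A small shift in the measured fraction also moves the result noticeably,
# because the expression steepens as the fraction approaches 1.
for fraction in (0.90, 0.93):
    level = free_mg_mM(fraction, 0.05)
    print(f"fraction = {fraction:.2f} -> free Mg2+ = {level:.2f} mM")
```

In this toy setting the same measured fraction yields levels that differ in direct proportion to the assumed constant, which is the sense in which the reported level is a product of modeling as much as of measurement.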
Examination of the other literature on this subject in muscular and other tissues shows a wide range (more than an order of magnitude) of reported Mg²⁺ levels obtained by various techniques. The use of a complex model at least permits cross-checking these reported Mg²⁺ levels against other kinds of information in a way not possible by experiment only.

Three points emerge from this examination of Mg²⁺:

(1) In performing our modeling, it was necessary to draw conclusions about what still appears to be an important subject from what is best described as contradictory and uncertain information, so that the Mg²⁺ level cannot be considered accurately known. Furthermore, it is not obvious that this information is being rendered less uncertain by the routine performance of additional experiments.
(2) The Mg²⁺ level in mammalian tissues generally cannot be measured directly, so that the process involves indirect measurement and modeling. At least some of the difficulty here is due to poor modeling performed by the experimenters.
(3) The design of experiments and handling of information regarding Mg²⁺ in such related fields as enzyme kinetics is routinely done poorly, e.g. by routinely doing experiments with the Mg²⁺ level higher than even the largest reported "physiological" value. To quote from a review by Morrison [7]: "It is unfortunate that studies on many metal-activated enzymes, especially those subject to allosteric activation or inhibition, have been undertaken using conditions that preclude interpretation of the data."

A Specific Complex Modeling Need: Diabetes

An example of an apparent need for a complex model (or family of models) is posed by diabetes. "Diabetes" actually describes a family of perhaps 30 diseases involving the hormone insulin and its effect on metabolism, as by causing aberrations of blood glucose level and of the amount and activity of enzymes that the tissues make (e.g. by changing the phosphorylation state of enzymes). Its literature is very large and growing. Diabetes also has effects at the whole-body or patient level and causes secondary diseases, particularly those affecting the circulatory system. According to the diabetes literature, our present control of this disease family by medication leaves much to be desired. A considerable modeling effort involving a fair number of workers and different model types has been performed (a large current compilation is the book by Cobelli and Bergman [8]). Published diabetes-related models include:

(1) Models of the release of insulin by the pancreatic islet cells that make it, e.g., Grodsky [9] and Porte and Pupo [10].
(2) Models of insulin behavior in the body as a whole, e.g. Sherwin et al [11].
(3) Models of actions of individual organs which are important in the glucose regulatory process, such as the liver regulation model of Cramp and Carson [12], and of major individual pathways involved, such as the gluconeogenesis model of Achs, Anderson and Garfinkel [13,14].
(4) Models of the behavior of the glucose-insulin system in the body as a whole, of varying degrees of complexity, ranging from the simple one of Service et al [15] to complex ones like that of Guyton et al [16], with some tendency to cluster around the maximum number of variables, about four, that one can reasonably determine by experiments on a patient, e.g. Cerasi et al [17], and Bergman et al [18].

There are as yet no published models representing behavior of receptors and transport carriers (e.g., the "down-regulation" of insulin receptors resulting from high blood insulin levels), nor are there representations of the naturally resulting hierarchical interactions (e.g., insulin sensitivity should be affected by the state of the insulin receptors but in the present state of the art it is simply a model parameter). The existing models generally do not permit us to calculate what effect a given change at the molecular level
would have at the whole-body level, or what a given whole-body phenomenon indicates must be happening at the molecular level. "Basic biochemical explanations are the Holy Grail of western medicine" but such basic explanations have not been adequately coupled to clinical management by existing models, nor do these models adequately represent all that is going on at one particular level, such as all influences that induce the B-islet cells to secrete insulin.

A really comprehensive model of diabetes would not only be complex, but would probably strain existing modeling techniques. Additional development, primarily of a technical nature, appears required:

(1) A very large amount of information would have to be incorporated, of sufficient variety and amount that no one person could be expert in all of it, and no one laboratory could provide all of it. It might be necessary to combine information from different research areas where the workers make different basic assumptions, work under different conditions, or follow different basic conventions (sometimes without documenting them adequately). Enough information would be involved to pose difficulties for the unaided human mind [20], as the cognitive psychologists have shown that there is a considerable limit on the amount of information we can handle simultaneously.
(2) Such a model (or assembly of models) would have to be larger and more complex than existing biological models. Several tissues would have to be represented in the same (or greater) degree of detail as our cardiac metabolism model (e.g. [4]).
(3) The processes involved would take place on time scales from seconds to years. To some extent this problem can be alleviated by separating processes that occur on different time scales.
(4) If a model is to be clinically useful, it must withstand governmental (e.g. FDA) scrutiny, which implies adequate documentation and testing, and probably orderly feedback from users.
It must also be operable by clinicians who are not computer experts, i.e., its implementation must be "friendly".
(5) Such a model would be used for different purposes by different users. Effective use of such models for many purposes imposes requirements of timeliness: if a model is to be used to design tomorrow's experiment or treatment, its behavior must be calculated and reported by that time. Conversely, such a model must be kept abreast of current developments (which means "easy to maintain and modify").

The comprehensive diabetes model described above is only one example of a possible very complex biological model. Other possible examples, which might impose different requirements on the modeling process, are detailed models of the body's immunological defense system, with its varied types of B and T antigenic cells and their evolution, as described by Cebra et al [21], or a cancer chemotherapy model involving several simultaneous treatments, or a detailed ecosystem model.

Some Applications of Complex Models

Once models of the type discussed above, and their properties and construction techniques, are well established, they (or their submodels) can be used for a variety of functions:

Assist experimenters in determining whether their conceptual and experimental models are numerically realistic and do indeed have the properties the experimenters think they have.
Help design experiments, calculate (including indirect calculation of quantities that cannot be measured directly) and interpret the results (including resolution of disputes). It is particularly desirable to be able to design critical experiments if possible.
Design optimal management methods.
Help crosscheck information from different areas or levels of biology. Construction of sizeable models forces such cross-checking, which is often not done otherwise.
Help in teaching the relevant subject matter.

An example of a large model performing several of these functions is the cardiovascular model of Guyton and associates [22], which has been used to design and interpret experiments, and is now being distributed in a teaching version ("HUMAN").

Required Technical Developments

Construction of models of this order of complexity necessarily assumes the availability of such standard modeling tools as differential equation solvers and continuous simulation languages (although further attention to the proper selection and combination of these tools is needed) and means of efficiently and conveniently performing other more or less standard calculations. Also needed, but less standard, are tools for interpretation, display, and communication of model behavior.
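As a minimal sketch of what such standard tools amount to, the following integrates a two-variable toy system with a hand-written fourth-order Runge-Kutta step and then perturbs each parameter in turn to see which ones most strongly affect the result. The equations, parameter names, and values are hypothetical and are not taken from any model cited in this paper; a real model of the kind discussed would have hundreds of equations and would normally use an established solver.

```python
# Hypothetical toy system in arbitrary units: glucose is produced at a
# constant rate and removed at a rate that rises with insulin; insulin is
# secreted above a glucose threshold and cleared at a first-order rate.
PARAMS = {
    "production": 1.0,
    "uptake": 0.01,
    "insulin_sensitivity": 0.05,
    "secretion": 0.1,
    "threshold": 5.0,
    "clearance": 0.2,
}


def derivatives(state, p):
    glucose, insulin = state
    d_glucose = p["production"] - (p["uptake"] + p["insulin_sensitivity"] * insulin) * glucose
    d_insulin = p["secretion"] * max(glucose - p["threshold"], 0.0) - p["clearance"] * insulin
    return (d_glucose, d_insulin)


def rk4_step(state, p, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = derivatives(state, p)
    k2 = derivatives(shifted(state, k1, dt / 2), p)
    k3 = derivatives(shifted(state, k2, dt / 2), p)
    k4 = derivatives(shifted(state, k3, dt), p)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def final_glucose(p, t_end=300.0, dt=0.1, start=(10.0, 0.0)):
    """Integrate from `start` and return the glucose value at t_end."""
    state = start
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, p, dt)
    return state[0]


def relative_sensitivities(p, rel_step=0.01):
    """Central-difference estimate of d ln(output) / d ln(parameter)."""
    base = final_glucose(p)
    result = {}
    for name, value in p.items():
        up = dict(p, **{name: value * (1 + rel_step)})
        down = dict(p, **{name: value * (1 - rel_step)})
        result[name] = (final_glucose(up) - final_glucose(down)) / (2 * rel_step * base)
    return result


sensitivities = relative_sensitivities(PARAMS)
for name, value in sorted(sensitivities.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:20s} {value:+.3f}")
```

Ranking parameters by the magnitude of these values is one simple, one-at-a-time form of sensitivity analysis: it suggests which elements of a model deserve the most careful determination and which can tolerate rough estimates.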
Technical developments required to meet these needs and others indicated above include:

(1) Construction and checking of complex biological models requires biological information, both quantitative and qualitative, in sufficiently large amounts to cause problems [20]. Biological expertise is normally involved in meeting these needs (the biological expert and the modeler may be the same or different persons). A biological expert can usually supply some, but not necessarily all, of the quantitative information needed for a model, and not necessarily rapidly. Resolution of disagreements over data, as described in connection with the muscle Mg²⁺ problem, may require explicit research rather than only the judgement of an expert (or several of them, acting as peer reviewers). Here explicit data bases are required but they need not be particularly large (as compared to business data bases). Specific problems requiring attention include the handling of "soft", contradictory, and incomplete information of uncertain detailed structure, and extraction from it of complete but relevant information for the ongoing modeling with minimal delay, fuss, error, or manual motion. Models may fail because their structure is incorrect, or because the data to which they are being fitted is incorrect, incomplete, or incorrectly interpreted. More attention to interaction between explicit data bases and models is also needed.
(2) Strategies and Modeling Practices: Initially the problem to be solved must be defined and matched to a model structure, so that the actual biological questions are answered as well as possible, and so that feasible experiments to test the model conclusions can readily be designed. A model should be most thoroughly based on those underlying facts that are best known. The most attention should be given to those elements of the model that most strongly affect its behavior, and those that have to be estimated or guessed at should be those that least strongly affect its behavior. Identity of these elements can be determined from the characteristic sensitivity structure of the model, which should be determinable by appropriate sensitivity analysis and does not change greatly during the course of construction. Plausible alternative models should be considered during construction and the best one among them chosen when possible (which includes the design of critical experiments to make this choice).
(3) A complex model must be adequately documented. The model of Guyton and associates has been criticized as being limited in accessibility to users other than the builders because of limited documentation. This important task includes the description of what the model is, how it got to be that way (so that the reader could reproduce the model construction process), what it is based on, what its properties are, and what it predicts, and how reliably. Part of this, but only part, will be done in the publication process.
Generally accepted standards for this activity have not yet been established. This author has had the experience of having referees ask for considerably different documentation to accompany similar papers describing similar models. However, there is beginning to be a literature on model documentation (e.g. [23]), which suggests thoroughness of documentation so great as to make an already difficult and unpopular task even more odious. Some means of computer assistance with the documentation therefore appears desirable.
(4) Model credibility, which is largely a social problem, has received considerable attention in the modeling literature. Some technical credibility considerations that particularly apply to complex biological models are that a biological model must adequately fit the data on which it is based, and must be biologically realistic (models are often oversimplified for the sake of mathematical tractability, or to reflect engineering practice). It should also be robust, in the sense that a small change
in a structure or numerical value should not radically change its behavior.

Required Social Developments

Some of the problems which have been described require developments which are more nearly of a social than a technical nature. Many of these developments require some action by the community of biological modelers.

Some of the problems mentioned arise from the strong empirical or experimentalist tradition among biologists, which makes it difficult for them to admit that theory and models are relevant to their work, much less use them effectively. Weakening of this tradition is most often found when performance of additional experiments does not resolve a biological problem, and the experimenters start to look for assistance. An example was described by a recent visitor: an experimenter first encountering the full complexities of metabolic control may simply "bounce" - but if he bounces back (probably having acquired some respect for the complexity of his subject) he may then take more interest in applying modeling and other theory. Effective use of modeling in biology does require its effective interaction with experiment.

Some required developments are the straightforward ones of setting standards and developing better communication in biological modeling. The preceding discussion of complexity in diabetes models indicates what is involved in setting complexity standards. Other types of standards are needed - what is good or bad modeling practice, what is acceptable or unacceptable, communicable or uncommunicable, how modeling should relate to experiment (which requires technical as well as social developments). As may happen when the workers in a given area of research have entered it from many different directions, the effective standards have too much of a tendency to be determined by the criterion "This isn't what I do - therefore it must be wrong".

Communication about simple models has
not been a difficult problem, but communication about more complex models certainly is. It takes more than one person to communicate, and development of a collective means for doing so is important. This is prerequisite to any substantial effort to combine or interface models from different sources, which would be important in modeling so large and complex a subject as diabetes. The process of combining models has been described in some detail by Norwich [24]. Also prerequisite to this process are effective means of checking for compatibility of assumptions and analysis as well as underlying data.

Perhaps the most important action now required of the community of biological modelers is to admit that it is a community with common interests, and to begin acting accordingly. As pointed out by Norwich [24] (in a chapter subtitled "The Need for Worldwide Cooperation") "We modelers are individualists all, working independently --" - and certainly not benefiting
our subject area (or making life easier for each other) in the process. The volume of biological modeling now being done is sizeable, and growing. We are reaching the point where some form of collective action in defining and properly performing this activity is needed if it is to progress properly.

Acknowledgements: Supported by NIH grants AM 19525 and HL 15622. Stimulating discussions with Dr. Michael C. Kohn are appreciated.
References

1. DeMillo, R.A., R.J. Lipton, & A.J. Perlis, Comm. ACM 22 271 (1979)
2. Albisser, A.M., Y. Yamasaki, H. Broekhuyse, & J. Tiran, Ann. Biomed. Engg. 8 539 (1980)
3. Bergman, R.N., Simulation Today, No. 54, p. 213, Aug. 1977
4. Kohn, M.C., M.J. Achs, & D. Garfinkel, Am. J. Physiol. 237 R153, R159, R167, R174, R181 (1979)
5. Wu, S.T., G.M. Pieper, J.M. Salhany, & R.W. Eliot, Biochemistry 20 7399 (1981)
6. Gupta, R.K. & K.D. Moore, J. Biol. Chem. 255 3987 (1980)
7. Morrison, J.F., Advances in Enzymology 63 257 (1979)
8. Cobelli, C. & R.N. Bergman, eds., "Carbohydrate Metabolism", Wiley, N.Y. 1981
9. Grodsky, G.M., J. Clin. Invest. 51 2047 (1972)
10. Porte, D., Jr., & A.A. Pupo, J. Clin. Invest. 48 2309 (1969)
11. Sherwin, R.S., K.J. Kramer, J.D. Tobin, P.A. Insel, J.E. Liljenquist, M. Berman, & R. Andres, J. Clin. Invest. 53 1481 (1974)
12. Cramp, D.G. & E.R. Carson, in Carbohydrate Metabolism, ed. C. Cobelli & R.N. Bergman, Wiley, N.Y. 1981, p. 349
13. Achs, M.J., J.H. Anderson, & D. Garfinkel, Computers & Biomedical Research 4 65 (1971)
14. Anderson, J.H., M.J. Achs, & D. Garfinkel, Computers & Biomedical Research 4 107 (1971)
15. Service, F.J., G.D. Molnar, J.W. Rosevear, E. Ackerman, L.C. Gatewood, & W.R. Taylor, Diabetes 19 644 (1970)
16. Guyton, J.R., R.O. Foster, J.S. Soeldner, M.H. Tan, C.B. Kahn, L. Koncz, & R.E. Gleason, Diabetes 27 1027 (1978)
17. Cerasi, E., G. Fick, & M. Rudemo, Eur. J. Clin. Invest. 4 267 (1974)
18. Bergman, R.N., C.R. Bowden, & C. Cobelli, in Carbohydrate Metabolism, ed. C. Cobelli & R.N. Bergman, Wiley, N.Y. 1981, p. 269
19. Rheingold, H. & H. Levine, Talking Tech, Wm. Morrow & Co., N.Y. 1982
20. Garfinkel, D., J. Theoret. Biol. 96 3 (1982)
21. Cebra, J.J., J.A. Fuhrman, P.A. Gearhart, J.L. Hurwitz, & R.D. Shakin, Recent Advances in Mucosal Immunity, K.W. Sell, ed. (in press)
22. Guyton, A.C., T.C. Coleman, & H.J. Granger, Ann. Rev. Physiol. 34 13 (1972)
23. Gass, S.I., R.H.F. Jackson, L.S. Joel, & P.B. Saunders, Comm. ACM 24 728 (1981)
24. Norwich, K.H., in Carbohydrate Metabolism, ed. C. Cobelli & R.N. Bergman, Wiley, N.Y. 1981, p. 419