Ecological Modelling, 18 (1983) 171-186 Elsevier Science Publishers B.V., Amsterdam - Printed in The Netherlands
DEVELOPMENT AND APPLICATION OF DESIRABLE ECOLOGICAL MODELS
W.G. CALE, Jr. Environmental Sciences Program, University of Texas at Dallas, Richardson, Texas 75080 (U.S.A.)
R.V. O'NEILL and H.H. SHUGART Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37830 (U.S.A.)
(Accepted for publication 22 December 1981)
ABSTRACT
Cale, Jr., W.G., O'Neill, R.V. and Shugart, H.H., 1983. Development and application of desirable ecological models. Ecol. Modelling, 18: 171-186.
The concepts of model validation and model realism are reconsidered in terms of the domain of applicability of a model. Ambiguity in both the application and the conceptualization of model validation is a problem for clearly assessing the usefulness of models. Model desirability (as well as the concepts of model adequacy and model reliability) is developed in a set theoretic context which is intended to reduce the ambiguity that often obstructs clear discussions of model validation. A set of criteria for defining desirable models is outlined and a list of principles for increasing model desirability is provided. It is shown that models of known adequacy can be used to reduce sample size in traditional statistical sampling procedures. A derivation using models and Bayes' theorem is provided as an example of such an application.
INTRODUCTION
Over the last two decades, ecological models have grown in importance for theoretical and applied research. However, ecologists have not yet agreed on an approach to building, analyzing, and evaluating models. Often such issues are debated within the narrow confines of mathematical formalism. Schools of thought are divided into linear and nonlinear methods, differential and difference equations, stochastic and deterministic models (Wiegert, 1975a). Recently, thinking has centered on such questions as hierarchical organization (Webster, 1979) and general system theory (Patten and Auble, 1981). Yet, it is clear that mathematics is not the central issue limiting the development and use of ecological models.
A far more important question is the evaluation of model performance. Borrowing terminology from other fields of simulation, ecologists attempt to "validate" their models. Validation tests the model against independent data and often fails. The inability to construct "valid" models has become a serious impediment to the acceptance and use of ecological models. Using terms and definitions inappropriate to the ecological problem continues to impede progress. This paper develops a conceptual framework within which ecological models can be evaluated in a relevant and rigorous context. A desirable ecological model is characterized by five criteria. A set of principles is presented to ensure that models become progressively more useful in ecological research. Measures of model performance are presented which assess the likelihood that model output accurately represents system behavior. Finally, a technique is given for integrating model results into field sampling programs.
Reconsidering Validation and Realism
As usually stated, the objective is to produce models that are "valid" and "realistic". These terms are ambiguous and must be replaced with concepts that are more precise and quantifiable. The Venn diagram (Fig. 1) is taken from Mankin et al. (1977). The universe, P, contains all possible vectors of numbers. There exist points in P which correspond to the outcome of every possible ecological experiment or model simulation. The set S, represented by a circle, contains only those vectors that represent potentially measurable behaviors for the real ecological system. The set M contains all possible outputs from a model. Assuming the model contains some ecological insight, some outputs will match system behavior. These outputs are indicated by Q, the overlap between M and S. At the same time, there will be some system behaviors which are not simulated by the model (i.e. S - Q). There will also be model outputs which are unrealistic (i.e. M - Q). It is important to recognize that M (Fig. 1) represents all ecological models. By definition, models are homomorphic with the system (Patten, 1971) and a model represents only a portion of system behavior. There will always be system behavior that is not included in the model and there will always be unrealistic model behavior. Viewed in this light, the conventional use of the term "realistic" is ambiguous. Naively, the concept means that the model never produces unrealistic results, i.e. the set M - Q is empty. This is unlikely to be satisfied by any ecological model. Realism could also mean that the model is perfect, i.e. M = S = Q, which is impossible by the definition of a model. At the other extreme, realism may mean that the model produces some realistic behavior,
Fig. 1. The intersection of model output M with ecosystem state S is shown as Q. P is the universe of all vectors of real numbers. Redrawn from Mankin et al. (1977).
i.e. Q is not empty. This is an unsatisfactory criterion because it is satisfied even if the model matches only the data used to build the model in the first place. The greatest problem is that the term implies that the model is something more than a partial representation of the system. It implies that the model and the system are the same. The term "valid" is also ambiguous. Conventionally, a validation test demonstrates that the set Q contains at least a single point. This is critical information which establishes that Q is not empty. However, the test says little about the size of Q, which is the information of real interest. If the validation test fails, it establishes that there is a single point in M - Q. Arguing from the definition of a model, we can assume that there are points in both Q and M - Q. Thus, the success or failure of a single validation test adds little to our understanding. Mankin et al. (1977) approached this problem with the concept of model usefulness. A useful model simulates all (or at least some) of the system behavior needed to solve a specific research problem. This concept permits us to redefine the validation problem in terms of a domain of applicability, i.e. the range of conditions for which the model simulates system behavior (Simulation Council, 1978). The term implies that there are some conditions for which the model is useful and others for which it is useless. The critical information is whether a model's domain of applicability is adequate for the task at hand.
Mankin et al. (1977) introduced measures which can be applied to the concept of domain of applicability (Table I). Adequacy is the size of Q relative to S, and reliability is the size of Q relative to M. Adequacy can be estimated as the number of times the model matches the system divided by the number of experiments one has attempted to simulate. Reliability can be estimated as the number of times the model matches the system divided by the number of times the model has been run. The domain of applicability provides the tool needed to evaluate ecological models. Recognizing that ecological models are useful in some circumstances and useless in others allows us to avoid the ambiguities of "real" or "valid". However, the concept of a useful model focuses too strongly on specific research problems. The progress of ecology requires something more; a model should be applicable over a range of specific problems. Such a model can be progressively tested and progressively improved. The purpose of this paper is to suggest criteria for such "desirable" models and to outline a procedure for systematically improving adequacy and reliability. There is one additional point which must be made before we proceed. We assume throughout the following that a satisfactory method is available for establishing agreement between model output and experimental data. That is, some statistical test has been devised that permits the identification of a point in Q. Testing the null hypothesis that model output and data are equal is a significant technical problem. However, this problem does not bear directly on the purpose of this paper. We will simply assume that some defensible approach is available to the investigator.
TABLE I
Operational definitions of model adequacy and reliability

Model adequacy: $a = \mu(Q)/\mu(S) \approx n_q/n_s$
Model reliability: $r = \mu(Q)/\mu(M) \approx n_q/n_m$

$a$ = adequacy.
$r$ = reliability.
$\mu(Q)$ = a measure of hypervolume corresponding to points of agreement between model and experimental observation.
$\mu(S)$ = a measure of system hypervolume.
$\mu(M)$ = a measure of model hypervolume.
$n_q$ = number of observations in which the model and system agree.
$n_s$ = number of observations made on the system.
$n_m$ = number of observations made from the model.
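The count-based estimates in Table I can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example tallies are hypothetical, and the statistical test that decides whether a model run agrees with an observation is taken as given, as the text stipulates.

```python
def adequacy(n_q: int, n_s: int) -> float:
    """Estimate adequacy as n_q / n_s (agreements per observation on the system)."""
    if n_s <= 0 or not 0 <= n_q <= n_s:
        raise ValueError("need n_s > 0 and 0 <= n_q <= n_s")
    return n_q / n_s


def reliability(n_q: int, n_m: int) -> float:
    """Estimate reliability as n_q / n_m (agreements per model observation)."""
    if n_m <= 0 or not 0 <= n_q <= n_m:
        raise ValueError("need n_m > 0 and 0 <= n_q <= n_m")
    return n_q / n_m


# Hypothetical tallies: 20 system observations, 15 model runs, 12 agreements.
n_q, n_s, n_m = 12, 20, 15
print(f"adequacy    = {adequacy(n_q, n_s):.2f}")
print(f"reliability = {reliability(n_q, n_m):.2f}")
```

Both estimates share the numerator $n_q$; they differ only in whether agreement is judged against what the system was observed to do or against what the model was asked to produce.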
Criteria For Desirable Models
(1) A desirable model has a domain of applicability sufficient to cover a range of ecological problems. There will always be a place for specific-purpose models designed for a single study. However, the progress of ecology is enhanced by models that integrate knowledge across a wide range of conditions. These models lead to improved understanding as one proceeds through a sequence of studies. Some traditional models, such as the logistic model of population growth, do not fare well when judged by this criterion. The logistic model has been criticized by ecologists, but for the wrong reasons. The model is not unrealistic: it is quite realistic to describe population growth by an intrinsic rate of increase and a carrying capacity determined by the environment. It is not that the model cannot be validated: there are, certainly, restricted laboratory conditions under which growth occurs according to this formulation. The real problem with the model is its narrow domain of applicability. It is useless for most ecological problems. A Venn diagram (such as Fig. 1) for the logistic model would show that its adequacy, Q/S, is small. The domain of applicability of the model is small relative to what one would like to see covered by a population growth model. Thus, even though the logistic model may be "real" and "valid", it is not desirable. For a model to make a useful contribution to the development of ecology, its domain of applicability must be large enough to cover the range of conditions that are encountered in ecological research. (2) A desirable model produces few nonsense results over the range of conditions of interest to the ecologist. This criterion is complementary to the first. Not only should the model cover a wide range of conditions but it should not produce absurdities anywhere over this range, i.e., the model should have high reliability, Q/M.
If this criterion is satisfied, the model can be used for new ecological studies with some degree of confidence that the model will not produce nonsense. An instructive example of an unreliable model is the von Foerster model of world human population growth (Table II). Model parameters were derived from 24 independent estimates of world population over the past two millennia. Agreement between model output and data is extremely good (Caswell, 1976). As further proof of the predictive capability of the model, one can point to (1) a favorable comparison between the simulated population in 2000 A.D. and the estimate made by the United Nations, and (2) the near identity between estimates of world population made in 1975 by the Population Reference Bureau (3.97 billion) and model predictions made in 1960 (3.65 billion). The von Foerster model can clearly be "validated". But is it desirable? An
TABLE II
The von Foerster model of world human population size

Differential equation: $\dfrac{dN}{dt} = a N^{(1 + 1/K)}$
Solution equation: $N(t) = N_0 \left( \dfrac{t^* - t_0}{t^* - t} \right)^{K}$

$N$ = world human population size.
$a$ = constant.
$K$ = constant slightly less than 1.
$t^*$ = constant $= t_0 + K / \left( a\,[N(0)]^{1/K} \right)$.
examination of its solution shows that when t reaches t*, the model predicts an infinite population size. In the words of Caswell (1976), "...this model is a member of a class of models which reach, not approach but reach, infinite values in finite time". (Emphasis is in the original.) This prediction is, of course, nonsensical. Despite its obvious virtues, the model is undesirable because the set Q becomes empty if the model predictions are extrapolated sufficiently far forward in time (Fig. 2). (3) A desirable model has both an algebraic formulation which follows ecological principles, and parameters that can be defined ecologically and measured directly. The Michaelis-Menten function provides an example of a model which satisfies our first two criteria but violates the third. The function describes uptake as proportional to the concentration of available nutrient, $X_1$, and the size of the receiving compartment, $X_2$, and as inversely proportional to nutrient supply plus a constant; rate = $a X_1 X_2 / (X_1 + S)$. Here, $a$ is the maximum uptake rate and $S$ is the half-saturation constant. This function describes nutrient uptake for a wide range of aquatic organisms and over a wide range of conditions (Criterion 1). Because it is an asymptotic function, it cannot take on absurd values (Criterion 2). However, the function contains a parameter, $S$, which cannot be measured directly. It must be estimated statistically from data on uptake rates over a range of nutrient concentrations. As a result, one loses sight of the fact that uptake involves a variety of ecological phenomena such as limiting biochemical processes and limitations due to other nutrients. The parameter $S$ subsumes a number of these processes, and it is difficult to state how $S$ will change when any of the individual processes change. $S$ is a description of data rather than a description of ecological processes.
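The Michaelis-Menten uptake function just described can be written out directly. The sketch below is illustrative only: the function name and the parameter values are hypothetical, chosen to show the two properties the text relies on, namely that the rate is half of its ceiling when the nutrient concentration equals S and that it never exceeds that ceiling.

```python
def uptake_rate(x1: float, x2: float, a: float, s: float) -> float:
    """Michaelis-Menten uptake: rate = a * X1 * X2 / (X1 + S)."""
    if x1 < 0 or x2 < 0 or a < 0 or s <= 0:
        raise ValueError("need x1, x2, a >= 0 and s > 0")
    return a * x1 * x2 / (x1 + s)


# Hypothetical values: maximum uptake rate a, half-saturation constant s,
# and receiving compartment size x2.
a, s, x2 = 2.0, 0.5, 10.0
ceiling = a * x2  # asymptote approached as the nutrient concentration grows

for x1 in (0.0, s, 5.0, 1.0e6):
    rate = uptake_rate(x1, x2, a, s)
    print(f"X1 = {x1:>9.1f}  rate = {rate:8.4f}  fraction of ceiling = {rate / ceiling:.4f}")
```

Note that S enters only through the shape of the curve, which is why it has to be inferred from a series of uptake measurements rather than measured on its own.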
To account for limitations by two or more nutrients, some models simply multiply a series of Michaelis-Menten functions (e.g. DiToro et al., 1975). This violates the third criterion because the algebraic formulation of the model no longer conforms to ecological principles. Although synergistic
Fig. 2. The situation where a validated model has a domain of applicability which vanishes when projected forward in time. The von Foerster Model exemplifies this undesirable characteristic.
effects may occur, there is no ecological basis for assuming that uptake declines geometrically as more nutrients are explicitly considered. A more serious violation of the third criterion occurs when the model is an arbitrary function fitted to data. The Weierstrass theorem (Scheid, 1968, p. 267) demonstrates that a polynomial can always be found which fits the data. However, the predictive power of such a function is poor when environmental conditions change. An arbitrary function may satisfy the
requirements of a specific study, but the model is useless for other studies outside the measured environmental conditions. The strategy of letting data determine model structure can lead to undesirable models. The pattern of hypothesis formulation, testing and revision is not a part of this purely inductive approach. Desirable models must be generated from an ecological understanding of the system and revised as that understanding increases. Data may suggest the form that a model should take, but the suggestion must be founded on ecological principles. If a model conforms to the third criterion, it is more likely to satisfy the first two criteria as well. The model is more likely to be applicable to a wide range of conditions (Criterion 1) if it is ecologically based, instead of being an arbitrary polynomial. It is also less likely to produce absurd results (Criterion 2) if it incorporates known environmental constraints. (4) A desirable model has been tested with data other than that used to estimate its parameters. This is the conventional validation criterion. That the model is able to simulate an independent data set obviously increases one's confidence that it has a useful domain of applicability and will not produce nonsense. Thus, the ability of the FORET model to simulate the effects of chestnut blight (Shugart and West, 1977) and the ability of the MS. CLEANER model (Collins, 1980) to simulate the behavior of Lake Øvre Heimdalsvatn are excellent examples of the benefits to be derived from testing models. At the same time, we must emphasize that validation does not, by itself, guarantee a desirable model. Figure 3 shows that this test only demonstrates that there are two points in Q. The first corresponds to conditions used to quantify the model. Validation shows only that there is one additional set of conditions where model and system match.
If conditions for the validation are close to the first data set, the test yields little information about the dimensions of the domain of applicability. A validation test is a necessary criterion for desirability, but a far from sufficient one. (5) A desirable model treats the common properties of ecological phenomena rather than the special properties of a specific case. This criterion is closely related to the preceding four and is probably not an independent criterion. However, it serves to emphasize an important point. The more general the class of phenomena covered by a model, the more likely it is to be desirable. This criterion should not be interpreted to mean that specific, special-purpose models are useless: there will always be room for specific models, designed for singular and unique purposes. Such models enable an investigator to focus on unique phenomena which regulate a process, but, nevertheless, such models are very limited. Typically, they are useful in the original study but cannot be applied by other investigators in related problem areas.
Fig. 3. Validation does not establish the size of Q but only the existence of single points therein. A model which has been verified (point 1) and has passed a validation trial (point 2) has met only one of five characteristics of a desirable model.
A desirable model would emphasize common properties useful over a range of conditions, and such a model has greater potential for increasing our ecological understanding.
Improving Model Desirability
All of the above criteria for a desirable model are open-ended: that is, the desirability of a model can always be improved. The possibility for systematic improvement is one of the greatest benefits that modeling offers to ecology. However, to ensure that model desirability increases, certain principles must be followed. (1) Modeling should proceed deductively from ecological principles rather than inductively from data. Modeling strictly from induction (i.e. from fitting functions to ecological data) encounters the pitfalls outlined under Criterion 3 in the preceding section. The problem is particularly evident when the modeler uses convenient but arbitrary functions such as power series or Fourier series where the coefficients are easily fitted by statistical procedures. Such functions are notorious for unrealistic excursions when they are extrapolated beyond measured conditions. Commonly, the fitted coefficients have no ecological significance. A model designed to fit the data may fill the requirements of a study, but such formulations make little contribution towards making ecological models progressively more desirable.
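The excursions of fitted power series mentioned above are easy to reproduce. The following sketch is a constructed example, not taken from the paper: it assumes a hypothetical saturating response x/(1 + x), samples it at six points, and passes the unique degree-5 polynomial through those points using Lagrange interpolation (chosen here only because it needs no external library).

```python
def saturating(x: float) -> float:
    """Hypothetical 'true' response: rises toward 1 and never exceeds it."""
    return x / (1.0 + x)


def lagrange_eval(xs, ys, x):
    """Evaluate at x the polynomial that passes through all points (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# "Measured conditions": six samples on the interval [0, 5].
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [saturating(x) for x in xs]

inside = lagrange_eval(xs, ys, 2.5)    # within the sampled range
outside = lagrange_eval(xs, ys, 10.0)  # extrapolated beyond it

print(f"x =  2.5: polynomial {inside:.4f}, true {saturating(2.5):.4f}")
print(f"x = 10.0: polynomial {outside:.4f}, true {saturating(10.0):.4f}")
```

Inside the sampled range the polynomial is nearly indistinguishable from the response it was fitted to. At x = 10 it returns 20, although the underlying response can never exceed 1, and none of its coefficients corresponds to an ecological quantity.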
Modeling, therefore, should be deductive. Functions derived from ecological principles are seldom convenient for curve-fitting. This complicates analysis, but to the considerable advantage of the overall development of ecological understanding. Models deduced from ecological principles are far more likely to be of use in related studies. (2) Models should be ecological abstractions, not mathematical formalisms. It has often been argued that ecological systems are complex and, therefore, only a mathematically complex model can be "realistic". Replacing a linear model by a nonlinear one may increase its adequacy (Q/S) by increasing the range of conditions it can simulate (Criterion 1). However, this gain may be more than counterbalanced by a loss of reliability (Q/M) because the nonlinearities permit an increase in absurd behavior (Criterion 2). A priori, a nonlinear model may be no more "realistic" than a linear one. Linear models may be undesirable because they have a limited domain of applicability. Thus, simply increasing mathematical complexity of the model has an unknown effect on model desirability. Therefore, the model should describe ecological understanding without a priori biases about mathematical form. (3) Modeling should increase the range of conditions over which individual rate processes can be simulated. This principle can be explained by way of an example. Many ecological rate processes are functions of temperature. A variety of processes follow the Q10 law which states that a rate increases by some factor (usually about 2.0) for each 10°C increase in temperature. Depending on the individual problem, use of the Q10 function may be completely satisfactory. However, the law is restricted to intermediate temperatures. When temperature increases or decreases beyond this range, the model is unreliable. Both reliability and adequacy can be increased by using a function that describes the rate process over all possible temperatures.
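The Q10 rule described above can be sketched as follows. The function name, the 20°C reference temperature and the unit reference rate are assumptions made for illustration; only the factor of 2.0 per 10°C comes from the text.

```python
def q10_rate(temp_c: float, rate_ref: float = 1.0,
             temp_ref_c: float = 20.0, q10: float = 2.0) -> float:
    """Q10 rule: the rate is multiplied by q10 for every 10 degC above temp_ref_c."""
    return rate_ref * q10 ** ((temp_c - temp_ref_c) / 10.0)


# Intermediate temperatures, where the rule is normally applied,
# followed by a temperature far outside that range.
for temp_c in (10.0, 20.0, 30.0, 60.0):
    print(f"{temp_c:5.1f} degC -> relative rate {q10_rate(temp_c):6.2f}")
```

Because the rule is a pure exponential, it keeps doubling at every step: at 60°C it reports a rate sixteen times the reference value, with no upper limit and no decline at high temperature. That is the unreliable behavior outside the intermediate range that a function valid over all possible temperatures is meant to remove.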
Such functions have been proposed by Bledsoe et al. (1971), O'Neill et al. (1972) and Sharpe and DeMichele (1977). (4) Modeling should disaggregate parameters and functions into component processes. Many model parameters, such as carrying capacity, involve a number of processes. When the parameters are disaggregated, the response of each component process to a range of conditions can be dealt with explicitly. The domain of applicability is systematically increased. By considering constraints which operate on each process, it is possible to simultaneously increase the reliability of the model by reducing the number of input conditions that result in absurd outputs from the model. An excellent example of disaggregation is found in the work of Wiegert (1973, 1974, 1975b). Ecological phenomena associated with the intrinsic rate of increase and carrying capacity in the logistic model have been disaggregated in his formulations. His models consider threshold effects with respect
to both resources and the consuming population, respiratory losses, mortality, assimilation efficiency, ingestion rate, predation, immigration, and emigration. The model also ensures that maintenance energy costs are met before growth occurs. Aside from the impressive list of processes included, the resultant equation is easily interpreted ecologically. Because the model considered ecological processes, it could be applied over a series of studies. This permitted improvements in the model and led to increasingly accurate predictions (Wiegert, 1975b). (5) The level of resolution of a model should be limited to the resolution of available data. The last two principles emphasize the intimate relationship that must be maintained between modeling and experimentation. The preceding principles tend to dissociate the deductive process of modeling from data gathering. This tendency is counterbalanced by the final two. In essence, this principle states that taking the level of resolution of a model beyond the resolution of data is pointless. Modeling is deductive, but strictly structured by data. One gains nothing, for example, by modeling daily primary production when solar radiation is known only as an average monthly value. (6) Models should be tested at the limits of their domain of applicability. Conventional validation tests yield little information about model behavior. This is particularly true if test conditions are similar to those for which the model was parameterized. The sixth principle states that far more information can be obtained by testing the model more rigorously. A model represents a complex hypothesis about system behavior over a range of conditions. The model can be used to predict behavior near the extremes of this range. This prediction can then be tested experimentally. If the prediction is correct, the domain of applicability is known to extend to that extreme. If the prediction is incorrect, the domain of applicability does not extend that far. 
In either case, insight is gained about the range of conditions over which the model is useful. It is only by such a critical test that a validation experiment increases our understanding, and it is only by such interaction between the deductive modeling process and the inductive hypothesis testing process that ecological theory progresses.
Applications
A statistical procedure which integrates a model of known adequacy into an experimental field study is developed. We use $\bar{x}$ to indicate a measured state in the ecosystem and $\hat{x}$ to indicate an ecosystem state predicted by a model, m. The method we propose for integrating desirable models into experimental design incorporates statistical hypothesis testing. Of interest will be the relationships that exist between the quantities $\bar{x}$, $\hat{x}$ and $x$, where $x$ is the true state of the system. The symbol $\to$ is used to indicate agreement in a statistical sense. For example, $\hat{x}_i \to \bar{x}_i$ would indicate that the ith model run agrees statistically with the ith experimental observation, and $\hat{x}_i \nrightarrow \bar{x}_i$ would indicate disagreement. The hypothesis-testing procedure we use throughout the remaining discussion is based on the assumption that a null hypothesis, $H_0$, is true. An example of a possible $H_0$ would be that $x$ is within the interval $[0.9\bar{x}, 1.1\bar{x}]$. Let $A$ be the event that the experimental results indicate that $H_0$ should be accepted, and let $A^c$ be the event that the experimental results indicate rejection of $H_0$. For a given significance level, $\alpha$, the probabilities of events $A$ and $A^c$ are:

$P(A) = P(\bar{x} \to H_0 \mid H_0 \text{ is true}) = 1 - \alpha$

and

$P(A^c) = P(\bar{x} \nrightarrow H_0 \mid H_0 \text{ is true}) = \alpha$

Let $B$ be the event that the model agrees with the experimental results ($\hat{x} \to \bar{x}$), and let $B^c$ represent the event that the model disagrees with the experimental results ($\hat{x} \nrightarrow \bar{x}$). The model is independent of these experimental results. The model can now be used in the hypothesis testing procedure. Event $A$ is illustrated in the two-dimensional representation of the product space $S \times M$ shown in Fig. 4a. The region of acceptance is shown in the shaded portion, the width of which depends on the definition of agreement, $\bar{x} \to H_0$. The boundaries are vertical lines since agreement is independent of the model. Event $B$ is shown in Fig. 4b. The dotted line indicates perfect agreement between the model and experiment. The boundaries of the region of agreement are made specific for a particular definition of agreement. However, this figure illustrates that the boundaries of the region of agreement bear some relation to the 45° line representing perfect agreement (they are definitely neither vertical nor horizontal lines). Figure 4c illustrates the intersection of events $A$ and $B$. To use the model to strengthen a hypothesis-testing procedure, we must know the probability that the model will agree with the experimental evidence. We have this from our a priori information about the adequacy and reliability of the model. In fact, adequacy is the probability of agreement between model and experiment. Agreement can occur in two ways: the model can agree when experimental evidence indicates acceptance of the null hypothesis, or the model can agree when experimental evidence indicates rejection. Adequacy is given by

$a = P(\hat{x} \to \bar{x} \mid \bar{x} \to H_0)(1 - \alpha) + P(\hat{x} \to \bar{x} \mid \bar{x} \nrightarrow H_0)\,\alpha$
Fig. 4. Regions in hyperspace where (a) experimental results support a null hypothesis, (b) the model agrees with the experimental results, and (c) is the intersection of the two events.
Solving for $P(\hat{x} \to \bar{x} \mid \bar{x} \to H_0)$ yields:

$P(\hat{x} \to \bar{x} \mid \bar{x} \to H_0) = \dfrac{a - P(\hat{x} \to \bar{x} \mid \bar{x} \nrightarrow H_0)\,\alpha}{1 - \alpha}$

If we assume a useful model has adequacy in the range $0.5 \le a \le 1$, direct substitution when $\alpha = 0.05$ shows that $P(\hat{x} \to \bar{x} \mid \bar{x} \to H_0)$ is always within ±6% of adequacy for all possible values of $P(\hat{x} \to \bar{x} \mid \bar{x} \nrightarrow H_0)$. How can we use this a priori information to increase the chance of making a correct decision; i.e., can we decrease the probability of making a Type I error?
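The "direct substitution" behind the ±6% statement can be repeated numerically. This is a rough check under stated assumptions: the function name and the 0.01 grid spacing are choices made here, and the scan does not test whether every combination of inputs yields a valid probability.

```python
def p_agree_given_accept(a: float, p_agree_given_reject: float, alpha: float) -> float:
    """Solve a = P(B|A)(1 - alpha) + P(B|A^c) alpha for P(B|A)."""
    return (a - p_agree_given_reject * alpha) / (1.0 - alpha)


alpha = 0.05
grid = [i / 100 for i in range(101)]  # 0.00, 0.01, ..., 1.00

# Largest absolute gap between P(B|A) and adequacy over 0.5 <= a <= 1
# and every value of P(B|A^c) between 0 and 1.
worst_gap = max(
    abs(p_agree_given_accept(a, p, alpha) - a)
    for a in grid if a >= 0.5
    for p in grid
)
print(f"largest |P(B|A) - a| on the grid: {worst_gap:.4f}")
```

Algebraically the gap equals $\alpha(a - p)/(1 - \alpha)$, where $p$ stands for the probability of agreement given rejection, so it cannot exceed $\alpha/(1 - \alpha)$, about 0.053 when $\alpha = 0.05$.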
An estimate of the a posteriori probability of accepting the null hypothesis when it is true, given that our model indicates that it should be accepted, is needed. Applying Bayes' theorem (Fisz, 1963), we have

$P(\bar{x} \to H_0 \mid \hat{x} \to \bar{x}) = P(A \mid B) = \dfrac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid A^c)\,P(A^c)}$

Substituting the previously computed probabilities, with $P(B \mid A)$ approximated by $a$ and $P(B \mid A^c)$ by $1 - a$, we have

$P(\bar{x} \to H_0 \mid \hat{x} \to \bar{x}) = P(A \mid B) = \dfrac{a(1 - \alpha)}{a(1 - \alpha) + (1 - a)\alpha} = \dfrac{1 - \alpha}{1 - (2 - 1/a)\alpha}$
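The Bayesian expression above can be evaluated for a few adequacy values to see how agreement with the model changes the probability of correctly accepting the null hypothesis. The function name and the sample adequacies are hypothetical; the formula is the one just derived.

```python
def posterior_accept(a: float, alpha: float) -> float:
    """P(A|B) = a(1 - alpha) / (a(1 - alpha) + (1 - a) alpha)."""
    if not (0.0 < a <= 1.0 and 0.0 < alpha < 1.0):
        raise ValueError("need 0 < a <= 1 and 0 < alpha < 1")
    return a * (1.0 - alpha) / (a * (1.0 - alpha) + (1.0 - a) * alpha)


alpha = 0.05
for a in (0.3, 0.5, 0.8, 0.95):
    print(f"adequacy {a:.2f}: P(A|B) = {posterior_accept(a, alpha):.4f} "
          f"(without the model: {1.0 - alpha:.4f})")
```

At an adequacy of exactly 0.5 the model adds nothing and the result equals $1 - \alpha$; above 0.5 it is larger, and below 0.5 agreement with the model actually lowers it.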
Provided adequacy is in the previously defined range, $0.5 < a < 1$, an improvement in our estimate is assured. Viewed differently, the expression indicates the degree of adequacy a model must have to aid in experimental design. The Bayesian probability expression provides a general formulation for conditioning the significance level ($\alpha$) on the measured a priori adequacy ($a$) of a model's performance. An example of the importance of this formulation is found in the familiar problem of determining the number of samples needed to estimate a mean with some desired confidence limit. From the classic technique of Stein (1945), the number of samples needed to estimate the mean of some ecosystem response within a confidence limit, $d$, is

$n = t^2 s^2 / d^2$
where $n$ = the number of samples needed, $t$ = the tabulated value of the t statistic for the desired confidence level and for degrees of freedom of the initial sample, $d$ = the half-width of the desired confidence limit, and $s^2$ = the sample variance. $s^2$ is estimated by a preliminary sample, and the size of that preliminary sample determines the degrees of freedom for the t statistic. Now consider a case in which a model (of known adequacy) predicts a mean response within the range ($\pm d$) of the sample mean obtained in the preliminary sample mentioned above. Then

$n^* = t_*^2 s^2 / d^2$

where $n^*$ = the number of samples needed, given the agreement between the preliminary data and the model; $t_*$ = the tabulated value of the t statistic for the desired confidence region, for degrees of freedom of the initial sample, conditioned by model agreement; and $s^2$, $d$ are as above. Conditioning of $t^2$ to $t_*^2$ needs clarification. For a selected Bayesian probability [$P(A \mid B)$] and a known adequacy, $\alpha$ may be computed. For adequacies in the range $0.5 < a < 1$, this calculated $\alpha$ will be larger than the original significance level. This has the effect of reducing $t$ to a new value, $t_*$, which therefore yields an $n^*$ less than $n$. Incorporation of an acceptably
adequate model in a sampling program can reduce the required number of field samples by an amount n - n*.
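A minimal sketch of this sample-size reduction follows. Everything numerical in it is an assumption made for illustration: the adequacy of 0.8, the preliminary sample of 10 (9 degrees of freedom), the variance and the half-width are hypothetical, and the t values are ordinary two-sided table entries for 9 degrees of freedom (2.262 at 0.05 and 1.833 at 0.10). The conditioned significance level is obtained by inverting the Bayesian expression for $P(A \mid B)$.

```python
import math


def conditioned_alpha(a: float, target: float) -> float:
    """Significance level alpha for which P(A|B) equals the target probability."""
    return a * (1.0 - target) / (target * (1.0 - a) + a * (1.0 - target))


def stein_sample_size(t_value: float, s2: float, d: float) -> int:
    """n = t^2 s^2 / d^2, rounded up to a whole number of samples."""
    return math.ceil(t_value ** 2 * s2 / d ** 2)


# Hypothetical inputs: adequacy 0.8, target P(A|B) of 0.95.
alpha_star = conditioned_alpha(0.8, 0.95)
print(f"conditioned alpha = {alpha_star:.3f} (original 0.05)")

s2, d = 4.0, 1.0   # preliminary-sample variance and desired half-width
t_plain = 2.262    # table value, 9 d.f., alpha = 0.05
t_star = 1.833     # table value, 9 d.f., alpha = 0.10; a conservative stand-in,
                   # since the conditioned alpha computed above is larger than 0.10

n_plain = stein_sample_size(t_plain, s2, d)
n_star = stein_sample_size(t_star, s2, d)
print(f"n = {n_plain}, n* = {n_star}, samples saved = {n_plain - n_star}")
```

Using the table entry at 0.10 rather than the exact t value at the conditioned level understates the saving; an exact value would require a t quantile routine, which is left out here to keep the sketch dependency-free.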
Concluding Remarks
Ecological models which treat ecosystem processes (a modeling approach that meets our criteria for increasing model desirability) have been seen by several ecologists as a strategy for developing accurate simulations across a broad range of inputs. Collins (1980), for example, after calibrating and testing her phytoplankton-zooplankton model on data from Øvre Heimdalsvatn, a subalpine lake in Central Norway, initialized and ran the model under conditions characterizing the Vorderer Finstertaler See, an Austrian high mountain lake. The excellent fit in both cases led her to conclude (p. 646) that, "...increased realism at the process level provides both increased predictive capacity and generality". As greater understanding is obtained in such cases, the usefulness of models (as measured by their adequacy and reliability) increases, thereby enlarging the domain of applicability. Evaluating model performance and incorporating model output into experimental design are important steps in the logical evolution of modeling research. We illustrated how this might be done for Stein's test. It may be possible to extend statistical tables (e.g., F-test, t-test, etc.) found in elementary statistics books by conditioning them on various levels of a priori Bayesian corrections. Such a development would be important for the automatic inclusion of models into experimental sampling programs. It is difficult to estimate model adequacy and reliability, but the value of such estimates can be expressed in reduced cost and effort in collecting samples. Ecosystem science may not be ready to claim that its models are completely valid over all possible circumstances, but models which have adequacies over 50% can be successfully incorporated into research programs.
Because greater demands are being placed on simulation modelers in the environmental sciences to answer complex questions about issues such as environmental effects or regulatory policy, it is essential that models be used in a proper manner. The methods we propose take into account the limitations and strengths of a model when weighing its utility in assisting the researcher.

ACKNOWLEDGEMENT
Research supported by the National Science Foundation's Ecosystem Studies Program under Interagency Agreement DEB-77-25781 with the U.S. Department of Energy under contract W-7405-eng-26 with the Union Carbide Corporation. Publication No. 2140, Environmental Sciences Division, Oak Ridge, Tennessee 37830.
REFERENCES

Bledsoe, L.J., Francis, R.C., Swartzman, G.L. and Gustafson, J.D., 1971. PWNEE: A grassland ecosystem model. Technical Report No. 64. Grassland Biome, US/IBP, Fort Collins, CO.

Caswell, H., 1976. The validation problem. In: B.C. Patten (Editor), Systems Analysis and Simulation in Ecology, Vol. IV. Academic Press, NY, pp. 313-325.

Collins, C.D., 1980. Formulation and validation of a mathematical model of phytoplankton growth. Ecology, 61: 639-649.

DiToro, D.M., O'Connor, D.J., Thomann, R.V. and Mancini, J.L., 1975. Phytoplankton-zooplankton-nutrient interaction model for Western Lake Erie. In: B.C. Patten (Editor), Systems Analysis and Simulation in Ecology, Vol. III. Academic Press, NY, pp. 424-474.

Fisz, M., 1963. Probability Theory and Mathematical Statistics. Wiley, NY, 677 pp.

Mankin, J.B., O'Neill, R.V., Shugart, H.H. and Rust, B.W., 1977. The importance of validation in ecosystem analysis. In: G.S. Innis (Editor), New Directions in the Analysis of Ecological Systems, Part I. Simulation Councils, La Jolla, CA, pp. 63-71.

O'Neill, R.V., Goldstein, R.A., Shugart, H.H. and Mankin, J.B., 1972. Terrestrial ecosystem energy model. EDFB MR-72 19. Oak Ridge National Laboratory, Oak Ridge, TN.

Patten, B.C., 1971. A primer for ecological modeling and simulation with analog and digital computers. In: B.C. Patten (Editor), Systems Analysis and Simulation in Ecology, Vol. I. Academic Press, NY, pp. 3-121.

Patten, B.C. and Auble, G.T., 1981. Systems theory of the ecological niche. American Naturalist, 116: 893-922.

Scheid, F., 1968. Theory and Problems of Numerical Analysis. McGraw Hill, NY, 422 pp.

Sharpe, P.J.H. and DeMichele, D.W., 1977. Reaction kinetics of poikilotherm development. J. Theor. Biol., 64: 649-670.

Shugart, H.H. and West, D.C., 1977. Development of an Appalachian deciduous forest succession model and its application to assessment of the impact of chestnut blight. J. Environ. Manage., 5: 161-179.

Simulation Council, 1978. Terminology for credibility testing. Technical Committee on Model Credibility, Society for Computer Simulation, La Jolla, CA.

Stein, C., 1945. A two-sample test for a linear hypothesis whose power is independent of the variance. Ann. Math. Stat., 16: 243-258.

Webster, J.R., 1979. Hierarchical organization of ecosystems. In: E. Halfon (Editor), Theoretical Systems Ecology. Academic Press, NY, pp. 119-129.

Wiegert, R.G., 1973. A general ecological model and its use in simulating algae-fly energetics in a thermal spring community. In: P.W. Geier, L.R. Clark, D.J. Anderson and H.A. Nix (Editors), Insects: Studies in Population Management, Vol. 1. Canberra, Australia, pp. 85-102.

Wiegert, R.G., 1974. Competition: a theory based on realistic, general equations of population growth. Science, 185: 539-542.

Wiegert, R.G., 1975a. Simulation models of ecosystems. Annu. Rev. Ecol. Syst., 6: 311-338.

Wiegert, R.G., 1975b. Simulation modeling of the algae-fly components of a thermal ecosystem: effects of spatial heterogeneity, time delays, and model condensation. In: B.C. Patten (Editor), Systems Analysis and Simulation in Ecology, Vol. III. Academic Press, NY, pp. 157-181.