Ecological Modelling, 4 (1978) 41--50
© Elsevier Scientific Publishing Company, Amsterdam -- Printed in The Netherlands
ADEQUACY OF ECOSYSTEM MODELS
EFRAIM HALFON and MARCELLO G. REGGIANI *
Canada Centre for Inland Waters, Burlington, Ont. L7R 4A6 (Canada)
* Direzione Ricerche Selenia, 00131 Rome (Italy)
(Received 24 August 1976)
ABSTRACT
Halfon, E. and Reggiani, M.G., 1978. Adequacy of ecosystem models. Ecol. Modelling, 4: 41--50.
A formal approach is developed to assess the adequacy of mathematical models to represent a given ecosystem. The procedure is based on the hypothesis that two or more models of an environment can be compared by using a vectorial approach: several model properties are analyzed and related to the model's capability to simulate the observed behaviour and to describe the ecosystem processes. The models are thus ordered from the standpoint of their adequacy. One preliminary model, a more recent one, and four lumped versions of the latter have been tested for adequacy. These models were developed by Wiegert and they describe the energetics of an algal--fly community in a thermal spring. Results obtained by our procedure agree with Wiegert's but some new points have been emphasized.
INTRODUCTION
A mathematical model of an environment is the result of several simplifications and assumptions. An ecological system can be modelled in several ways according to the project purpose. The model may have more-or-less complicated features such as time delays, time-varying coefficients, nonlinearities, spatial heterogeneities, etc. Likewise, the choice of sub-systems or model compartments is arbitrary. In fact, a system can be subdivided into many sub-systems according to ecological information and modelling ease. Thus, several alternative models can be derived for the same environment and usually no objective method is used to select a particular model instead of another, given some modelling goals. The choice of the compartments involves a conceptualization of the system under study so that maximum information can be obtained from the model. Once the compartments are chosen, other problems of model building such as system identification (Halfon, 1975) and simulation (Patten, 1975) can be approached. However, the process of conceptualization is still the most fundamental because once decisions are made at this level, all results and conclusions derived from the model will be dependent on this choice.
Wiegert (1975) has approached the aggregation problem and developed a base model of algal--fly community energetics in a thermal spring; he then lumped some compartments to test the differences between several levels of model aggregation. Wiegert's approach is implemented here by a method that allows the objective evaluation of model structure and model goals so that an optimal strategy of model development is possible. A formal procedure to test the adequacy of both distributed and lumped models has been developed following an idea proposed by Reggiani and Marchetti (1975). This method is based on the hypothesis that a set of numbers is generally necessary in order to compare the relative merits of the models. These numbers can be considered the components of a vector, here called "the vector-performance or vector-distance". Thus, our procedure can be named the "vector-approach method". This approach is superior to the "scalar-approach model", where a single number (a scalar-performance index) is said to be sufficient to compare the relative merits of the models. As will be shown below, the scalar hypothesis is not well suited to our problem, and the drawbacks of the scalar method will be pointed out. In this paper the vector approach is employed to find the most adequate model among those previously developed by Wiegert (1973, 1975). Our results and Wiegert's agree well. Our formal approach, however, guarantees the validity of his results and also improves his conclusions.
METHOD
The above-mentioned point of view can be expressed in a simple mathematical form. If the goals to be reached in the modelling effort are known, then it is possible to link every model to a set of numbers, each corresponding to a single goal and defined so as to decrease as the model adequacy improves. The numbers so defined are the components of the vector-distance between the model and the object $\Omega$. The latter is defined as the model for which all the goals are perfectly reached. Let $\mathcal{M}_1$ and $\mathcal{M}_2$ be two models and $\Omega\mathcal{M}_1$, $\Omega\mathcal{M}_2$ the previously defined vector
Fig. 1 (left). Hasse diagram of ordered models. Model 1 is better than Model 2, Model 2 better than Model 3, etc.
Fig. 2 (right). Hasse diagram of partially ordered models. Both models $\mathcal{M}_2$ and $\mathcal{M}_3$ are better than $\mathcal{M}_1$, but they are not comparable with each other. Thus, it is not immediately clear which model should be chosen as the best.
comparable to each other ($\mathcal{M}_2$ is better than $\mathcal{M}_3$ as far as the first component is concerned, but the opposite is true for the second component). In the present example, the Hasse diagram for the models $\mathcal{M}_1$, $\mathcal{M}_2$, $\mathcal{M}_3$ is of the type shown in Fig. 2. Under these circumstances, it is not immediately clear which model should be chosen as the best. This kind of difficulty frequently arises in many fields of human knowledge, e.g. in modelling physical systems, optimization problems, etc. In order to overcome this difficulty, the method of the function of merit has been widely proposed in the past. It consists of searching for a suitable scalar function of the vector-distance components and defining as "the best" the model for which this function is a minimum. Since the function of merit is a scalar quantity, problems of non-comparability cannot arise under this hypothesis, and the models can always be represented by a chain in the Hasse diagram. Unfortunately, a given function of merit can seldom be accepted as appropriate. Moreover, the choice of one function of merit instead of another greatly affects the results. As a consequence, we completely disregard the figure-of-merit method as misleading, simple though it is. A simple example can clarify the previous arguments. Let $M'$ and $M''$ be the vector-distance components of the model $\mathcal{M}$ and let $F = M' + 2M''$ be the chosen figure of merit. Model $\mathcal{M}_2$ previously considered has the components $M'_2 = 2$, $M''_2 = 3$; thus $F_2 = 2 + 2 \times 3 = 8$. Analogously, for the model $\mathcal{M}_3$, $M'_3 = 3$, $M''_3 = 2$ and $F_3 = 3 + 2 \times 2 = 7$. As $F_3 < F_2$, the model $\mathcal{M}_3$ would be considered more adequate than $\mathcal{M}_2$. Conversely, if the function of merit is $\varphi = 2M' + M''$, then $\varphi_2 = 7 < \varphi_3 = 8$ and $\mathcal{M}_2$ would have to be considered more adequate than $\mathcal{M}_3$. The conclusions are opposite to each other.
They depend upon the choice of the figure of merit. As a consequence, whenever the definition of the figure of merit cannot be firmly grounded on a theoretical basis, the results can be completely misleading and the method cannot be employed. This situation would occur in the present work and arises very frequently in other fields. See, for example, the arguments of Reggiani (1976) and Naill (1976) against the use of the figure-of-merit method in implementing optimization techniques applied to a model of the world. See also Reggiani and Marchetti (1975) with reference to the problem of the simplification of linear systems.
The approach we propose is completely different from the previous one and can be performed in three successive steps. The first step is the preselection of the models: the models for which one or more components of the vector-distance exceed certain definite limit values (depending on the problem at hand) are disregarded. In the second step, the remaining models are suitably plotted (by means of a Hasse diagram or other) in such a way that their relative merits can be appreciated at a glance. The third step is the decision-making process. This step has not yet been formalized mathematically as well as the previous ones. Nevertheless, in any practical problem such as the one dealt with in the present work, the most suitable model can be selected provided that the final purpose of the modelling effort is well known. This purpose may differ greatly between modellers, and also for the same modeller at different stages of his study. For example, a modeller may look for a very simple model to get a first approximate understanding of the phenomena involved; another one (or the same one at a later stage of his study) may look for a very complete model in order to gain a good insight into reality, and so on. This point will be clarified in the following section.
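The reversal produced by the two figures of merit in the worked example can be checked numerically. The following is a minimal Python sketch, not part of the original procedure: the component values for the two models come from the example in the text, while the function names F, phi and dominates are illustrative assumptions.

```python
# Vector-distance components (M', M'') of the two models in the worked example.
M2 = (2, 3)
M3 = (3, 2)


def F(m):
    """Figure of merit F = M' + 2M''."""
    return m[0] + 2 * m[1]


def phi(m):
    """Alternative figure of merit phi = 2M' + M''."""
    return 2 * m[0] + m[1]


def dominates(a, b):
    """True if a is no worse than b in every component and strictly better in one.

    Smaller components mean a more adequate model.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


print("F:  ", F(M2), F(M3))        # F ranks M3 ahead of M2
print("phi:", phi(M2), phi(M3))    # phi ranks M2 ahead of M3
print("dominance:", dominates(M2, M3), dominates(M3, M2))
```

Neither model dominates the other component-wise, so the vector approach leaves the pair unordered, whereas each scalar function forces an order and the two orders contradict each other.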
A peculiarity of our method is that it gives a good understanding of the relative merits of the models under consideration and suggests ways to obtain further improvements in the models, if needed. This kind of information is not available with the method of the function of merit. The example previously considered can be used to clarify our claim. Let us suppose that the figure of merit taken into consideration is $F = M' + M''$. In this case both models $\mathcal{M}_2$ and $\mathcal{M}_3$ would be considered equally good ($F_2 = F_3 = 5$), without any further reference to the components of the models. In our method, however, the study of the relative merits of the models is carried out by comparing the components of the models directly and, as a consequence, the final model is chosen on the grounds of a decision-making process, not by relying on an algebraic formula binding the figure of merit to the components of the model. The method we are proposing here in the field of ecology can be defined as a "pseudo-metric approach", as opposed to the "metric approach" peculiar to the method of the figure of merit. In fact, the figure of merit can be viewed as a sort of distance between any model $\mathcal{M}$ and the object $\Omega$. Analogously, our vector-distance can be named a pseudo-distance, a terminology largely employed in the field of functional analysis. In other terms, the concept of distance leads to the study of metric spaces, while the concept of the vector-distance (or pseudo-metric distance) leads to the study of pseudo-metric spaces.
MODEL DESCRIPTION AND ADEQUACY CRITERIA
One preliminary model (Wiegert, 1973), a more recent one, and four lumped versions of the latter had been tested for adequacy (Wiegert, 1975). Two goals were defined: which model simulates most accurately the steady-state behaviour of the energetics of an algal--fly community in a thermal spring, and which model is more ecologically realistic while retaining good simulation capabilities? The assumptions on which these models are based, their structure and the differential equations are presented in the above-referenced papers and will not be repeated. Here, only some general features of the models are used to assess their adequacy. These properties must not depend on the degree of lumping, since this would introduce a bias in the method. Therefore, the following parameters have been chosen to fill the vector elements which describe each model:
(1) Number of model parameters (this is an index of model complexity);
(2) Number of compartments (also an index of model complexity and of the ability of the model to describe ecosystem components);
(3) Time delays. This element represents any special feature of the model; time delays were chosen since they were used by Wiegert (1975). In the vector, 0.33 represents an extensive use of time delays, 0.66 represents a moderate use, and 1 no use. The formula used is 1/(1 + number of time delays);
(4) Goodness of fit weighted for the complexity of the model (formula used: $\left(\sum (\mathrm{pred} - \mathrm{obs})^2 / \mathrm{pred}\right) \cdot 100 / \text{number of compartments}$).
This last vector element was divided into two components: goodness of fit of the algae compartment (FIT ALGAE) and goodness of fit of the fly compartment (FIT FLIES). When a compartment was modelled by life stages, these were summed together to give an estimate of the total population. This approach was used to have a further element of choice, i.e. if two models were equally adequate, one might simulate one compartment better than the other. Data used for testing the goodness of fit are given in Table I. Therefore, the vectors used to assess the adequacy of the six models have five components, and the numerical values are presented in Table II. Model W refers to Wiegert (1973). Model A is the more refined base model and models B--E are successively lumped versions of A (Wiegert, 1975).

TABLE I
Predicted and measured steady state standing crops of algae and flies. Prediction taken from day 318 (modified from Wiegert, 1975)

                         Total algae   Total flies   Percentage of flies in each age class
                         (kcal/m2)     (amt.)        Eggs   1--2 larvae   3 larvae   Pupae   Adults
Measured in field            889          7.3         2.1       5.1         42.1      33.5    17.3
Prediction from model
  W *                        937         12.2         1.1       3.8         59.6      22.7    12.9
  A                        1,030          6.0         1.6       3.7         40.7      34.0    20.0
  B                        1,087          5.5         1.3       3.1         39.7      39.8    16.1
  C                        1,031          7.3         1.7       7.7         44.6      22.6    24.0
  D                        1,069          0           --        --          --        --      --
  E                        1,069          4.2         --        --          --        --      --

* Model W is presented in Wiegert (1973); all the others in Wiegert (1975).

TABLE II
Vector-distance components (actual values). See text

Model   Number of    Number of       Special features   Goodness of fit   Goodness of fit
        parameters   compartments    (time delays)      (algae)           (flies)
W           46            6               1.0                40.98             32.80
A           46            6               0.33              321.70              4.69
B           44            6               0.33              601.10              9.82
C           44            6               0.66              325.96              0.0
D           32            4               0.66              757.72              ∞
E           21            2               1.0             1,515.43            114.40

RESULTS
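As a check on Table II, the two goodness-of-fit components can be recomputed from the standing crops in Table I with the formula given for vector element (4). The Python sketch below is illustrative only: the names OBSERVED, PREDICTED and fit are assumptions, each compartment is treated as a single summed term, and returning infinity when the predicted value is zero is an assumed convention for model D, whose fly population became extinct.

```python
# Measured steady-state standing crops from Table I: (total algae, total flies).
OBSERVED = (889.0, 7.3)

# Model predictions from Table I, with the number of compartments from Table II:
# model -> (total algae, total flies, number of compartments).
PREDICTED = {
    "W": (937.0, 12.2, 6),
    "A": (1030.0, 6.0, 6),
    "B": (1087.0, 5.5, 6),
    "C": (1031.0, 7.3, 6),
    "D": (1069.0, 0.0, 4),
    "E": (1069.0, 4.2, 2),
}


def fit(pred, obs, n_compartments):
    """Goodness of fit: (pred - obs)^2 / pred * 100 / number of compartments.

    Assumption: a zero prediction is reported as an infinitely poor fit.
    """
    if pred == 0:
        return float("inf")
    return (pred - obs) ** 2 / pred * 100 / n_compartments


for name, (algae, flies, n) in PREDICTED.items():
    fit_algae = fit(algae, OBSERVED[0], n)
    fit_flies = fit(flies, OBSERVED[1], n)
    print(f"{name}: FIT ALGAE = {fit_algae:.2f}, FIT FLIES = {fit_flies:.2f}")
```

Under these assumptions the printed values can be compared directly with the last two columns of Table II.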
Models D and E were eliminated since they had a much worse fit than the others (Table II). Model D was eliminated since the fly populations became extinct under the postulated conditions (Wiegert, 1975). Model E was characterized by the value 1515 for the fourth vector component (Table II, FIT ALGAE), 37 times worse than model W and about five times worse than A and C. The remaining four models (A, B, C, W) were then analyzed. This analysis consists of ordering the models, i.e. finding which ones are closer to the real system given the two above-mentioned goals. These four models have the same number of compartments; therefore, this vector element cannot be used to compare them and it is not given further consideration. Also, the models are almost equally complex (44 to 46 parameters) and, therefore, in this instance, no analysis is done on this vector element. This procedure is supported by the theory proposed by Reggiani and Marchetti (1975) under the heading "Tolerance": some classes of models can be grouped together since their distances are less than a certain tolerance vector (component-wise). This results in an obvious simplification of the decision process. Thus, the ordering of the models must be done in relation to the three remaining elements (time delays, FIT ALGAE, FIT FLIES). The optimal model, $\Omega$, should be ecologically realistic, i.e. have a maximum number of time delays (Wiegert, 1975), and have a goodness of fit equal to zero.
The Hasse diagram, used to order the models A, B, C, and W in relation to $\Omega$, is presented in Fig. 3. W cannot be compared directly with A, B, or C in this diagram, since it does not have time delays. Model W is the best at simulating the algae but the poorest at simulating the flies. Also, B and C cannot be compared directly (different number of time delays). However, each is farther away from $\Omega$ than A. Models A, B, and C can also be compared on the basis of ecological realism -- in this instance, the number of time delays. Graphically, this is done by using a Vogt diagram (Fig. 4). This analysis indicates that B has no advantage over A. In fact, it has as many time delays as A but has a worse "goodness of fit". Therefore, no advantage can be obtained by lumping model A into B. Model C, however, cannot be eliminated altogether. In fact, it cannot simulate the system better than A, but it is a simpler model. Therefore, this lumped model could be useful when computer time and memory are limiting and when only a steady state response is sought. The transient response of this model is quite different from the observed system behaviour. At this stage, only three models (A, C, W) remain to be analyzed in detail before any conclusion on the ecological validity of the models can be expressed.
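The pairwise comparisons among the four retained models can be sketched in Python as follows. The component values (time-delay index, FIT ALGAE, FIT FLIES) are copied from Table II; the function name compare and the tolerance vector TOL are illustrative assumptions and do not come from the paper. In particular, the tolerance of 5 on FIT FLIES is a hypothetical value, chosen only to show how a component-wise tolerance of the kind described by Reggiani and Marchetti (1975) can change a comparison.

```python
from itertools import combinations

# (time-delay index, FIT ALGAE, FIT FLIES) from Table II; smaller is better.
MODELS = {
    "W": (1.00, 40.98, 32.80),
    "A": (0.33, 321.70, 4.69),
    "B": (0.33, 601.10, 9.82),
    "C": (0.66, 325.96, 0.00),
}

# Hypothetical tolerance vector (assumption, not a value from the paper).
TOL = (0.0, 0.0, 5.0)


def compare(a, b, tol=(0.0, 0.0, 0.0)):
    """Compare vector a with vector b component by component.

    A component counts in favour of one model only if it is smaller by more
    than the tolerance. Returns 'better', 'worse', 'equal' or 'incomparable'
    from the point of view of a.
    """
    a_wins = any(x < y - t for x, y, t in zip(a, b, tol))
    b_wins = any(y < x - t for x, y, t in zip(a, b, tol))
    if a_wins and b_wins:
        return "incomparable"
    if a_wins:
        return "better"
    if b_wins:
        return "worse"
    return "equal"


for m, n in combinations(MODELS, 2):
    strict = compare(MODELS[m], MODELS[n])
    relaxed = compare(MODELS[m], MODELS[n], TOL)
    print(f"{m} vs {n}: strict = {strict}, with tolerance = {relaxed}")
```

With zero tolerance, A and C come out incomparable in this sketch, because C fits the flies better (0.0 against 4.69). With the assumed tolerance that difference counts as a tie and A ranks above both B and C, while B and C remain incomparable with each other and W remains incomparable with the other three.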
For the detailed analysis of the three remaining models (A, C, W), the models are plotted on a diagram (Fig. 5) proposed by Reggiani and Marchetti (1974). In this graph, the models "are dislocated according to the Hasse diagram (Fig. 3) but all the branch lines are discarded. The order is considered as defined by the fourth and fifth component of the performance vector". The numbers in the dotted squares (unbroken squares) are the rounded values of FIT ALGAE (FIT FLIES).
The final step to test the model adequacy depends on the goals. The results are as follows:
(a) none of the three remaining models (A, C, W) is completely satisfactory. No model has a good fit to both algae and flies. When an understanding of the ecological factors controlling the system is sought, model C is not complex enough. Models A and W are more complex, but their lack of goodness of fit to the algae (model A) or to the flies (model W) implies that their description of the ecological system should be improved. Such improvement is now taking place (R.G. Wiegert, personal communication);
(b) if there is interest only in the behaviour of the algae (flies), model W (C) is more adequate than model C (W);
(c) when a simple model is required and a realistic simulation is sought, model C is adequate;
(d) model W lacks time delays and therefore is not sufficiently realistic, even if its FIT ALGAE is very good.
Fig. 3 (left). Hasse diagram: Model A is better than Models B and C. Models B and C cannot be compared (different number of time delays). Model W cannot be compared directly with the other three models (no time delays). The diagram indicates that models A and W are more satisfactory than B and C.
Fig. 4 (right). Vogt diagram used to analyse the effect of time delays in relation to goodness of fit. For explanation see text.
Fig. 5. Reggiani and Marchetti diagram. The models are ordered by considering the fourth and fifth components of the performance vector, goodness of fit algae and goodness of fit flies, respectively. Each model is joined to the next worse model by means of an oriented arc, i.e., the arc points toward the model with a higher value of the vector element. The two graphs are superimposed and identified by different kinds of lines: dotted lines, FIT ALGAE; unbroken lines, FIT FLIES. Arcs and nodes are labelled with component values relating to the preceding element. This diagram indicates that model C is better at simulating flies, model W better at simulating algae.
DISCUSSION
As was expected at the beginning of this work, our conclusions generally agree with those formulated by Wiegert (1975). The models are simple enough for an empirical study, and different conclusions would have invalidated our method. Model A is the most realistic and can be considered superior to all lumped models (B--E). Model W, an early version of model A, describes the algae accurately but is ecologically less realistic. The conclusion is that no model is completely satisfactory, either because of the wrong choice of compartments or the wrong setting of the differential equations. This result is encouraging since it can be used to stimulate further research. The method is not required to give a positive answer in all cases.
From the ecological standpoint, we have eliminated three models (B, D, E) because the assumptions on which they were based were not realistic or did not produce satisfactory results. These models are certainly not adequate to describe or to simulate the organisms in the thermal spring community. The others can be used according to the limitations described in the previous section. A detailed analysis of the model properties, such as the inclusion of time delays, spatial heterogeneity of the algae, assumptions, construction of the differential equations, etc., has not taken place in this paper because it does not belong to the stage of adequacy testing. Here, it is important to note that the method described in this paper has been used to analyze and compare six models of an environment without having to consider the model structure and compartment choice. This choice should be left to the ecologist interested in a given ecosystem. Our method has produced some results not emphasized by Wiegert, such as the ordering of the models. The choice of the vector elements used to compute model adequacy was made so as to emphasize both model complexity and ecological realism. These elements proved to be satisfactory in this study but they may be changed in the analysis of models of other ecosystems.
The method can be used to compare several degrees of lumped models or completely different models describing an ecosystem. Both kinds of models were used in this study. This method cannot be used a priori to study model aggregation as Zeigler (1974) proposed, but it does not require much effort once all models to be analyzed work properly. The analyzed models can be further improved -- other models are still in the "thinking stage" -- but testing the adequacy of models during their development can help the understanding of the processes involved and the model limitations so that some new line of research can be begun.
REFERENCES
Halfon, E., 1975. The system identification problem and the development of ecosystem models. Simulation, 38: 149--152.
Naill, R.F., 1976. Optimizing models of social systems. IEEE Trans. Syst. Man Cybern., SMC-6 (3): 201--207.
Patten, B.C. (Editor), 1975. System Analysis and Simulation in Ecology, Vol. 3. Academic Press, New York, N.Y., 601 pp.
Reggiani, M.G., 1976. Comments on "On Optimization Techniques Applied to the Forrester Model of the World". IEEE Trans. Syst. Man Cybern., SMC-6 (3): 201.
Reggiani, M.G. and Marchetti, F.E., 1974. The pseudometric view in problems involving vector-valued performance criteria. Alta Freq., 43(7): 462--467.
Reggiani, M.G. and Marchetti, F.E., 1975. On assessing model adequacy. IEEE Trans. Syst. Man Cybern., SMC-5 (3): 322--330.
Wiegert, R.G., 1973. A general ecological model and its use in simulating algal--fly energetics in a thermal spring community. In: P.W. Geier, L.R. Clark, D.J. Anderson and H.A. Nix (Editors), Insects: Studies in Population Management. Vol. 1, Occasional Papers. Canberra, A.C.T., pp. 85--102.
Wiegert, R.G., 1975. Simulation modeling of the algal--fly components of a thermal ecosystem: effects of spatial heterogeneity, time delays and model condensation. In: B.C. Patten (Editor), System Analysis and Simulation in Ecology, Vol. 3. Academic Press, New York, N.Y., pp. 157--181.
Zeigler, B.P., 1974. A conceptual basis for modeling and simulation. Int. J. Gen. Systems, 1: 213--228.