Estimation of winter respiration rates and prediction of oxygen regime in a lake using Bayesian inference

Estimation of winter respiration rates and prediction of oxygen regime in a lake using Bayesian inference

Ecological Modelling 182 (2005) 183–197 Estimation of winter respiration rates and prediction of oxygen regime in a lake using Bayesian inference夽 Ol...

638KB Sizes 2 Downloads 61 Views

Ecological Modelling 182 (2005) 183–197

Estimation of winter respiration rates and prediction of oxygen regime in a lake using Bayesian inference夽 Olli Malvea,∗ , Marko Laineb , Heikki Haariob,c b

a Finnish Environment Institute, P.O. Box 140, FI-00251 Helsinki, Finland Department of Mathematics and Statistics, University of Helsinki, P.O. Box 68, FI-00014 Helsinki, Finland c Lappeenranta University of Technology, P.O. Box 20, FI-53851, Lappeenranta, Finland

Received 27 March 2003; received in revised form 29 April 2004; accepted 12 July 2004

Abstract In this paper, we estimate the winter respiration (oxygen depletion per unit area of hypolimnetic surface) in a hyper-eutrophic shallow lake (Tuusulanj¨arvi) in the northern hemisphere (Finland, northern Europe, latitude 60◦ 26 , longitude 25◦ 03 ) under ice-cover periods in the years 1970–2003. We present a dynamic nonlinear model that can be used for predicting of the oxygen regime in following years and to dimensioning of needed artificial oxygenation efficiency that will prevent fish kills in the lake. We use Bayesian estimation of respiration using Markov chain Monte Carlo (MCMC) method (Adaptive Metropolis–Hastings algorithm). This allows for analysis and predictions that take into account all the uncertainties in the model and the data, pool information from different sources (laboratory experiments and lake data), and to quantify the uncertainties using a full statistical approach. The mean estimated respiration in the study period was 301 ± 105 mg m−2 d−1 , which is on the upper limit of winter respiration of eutrophic Canadian lakes on the same latitude. The reference rate of the respiration k (d−1 ) at 4 ◦ C indicated cyclic behavior of about 9-year amplitude and had a statistically significant negative trend through out the study period. The temperature coefficient and respiration rate of the model prove to be highly correlated and unidentifiable with the given data. The future winters can be predicted using the posterior information coming from the past observations. As new observations arrive, they are added to the analysis. Methods are shown to be applicable to the dimensioning of artificial oxygenation devices and to the anticipation of the need for oxygenation during the winter. © 2004 Elsevier B.V. All rights reserved. Keywords: Lake winter respiration; Model uncertainty; Dynamical model; Bayesian modelling; MCMC

夽 This research is a part of the Academy of Finland’s MaDaMe

project Development of Bayesian methods with applications in geophysical and environmental research. ∗ Corresponding author. Tel.: +358 9 4330 0359; fax: +358 9 4030 0390. E-mail addresses: [email protected] (O. Malve), [email protected] (M. Laine), [email protected] (H. Haario). 0304-3800/$ – see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.ecolmodel.2004.07.020

1. Introduction In this research, we model winter respiration in Lake Tuusulanj¨arvi as total consumption of the oxygen in the lake (mg m−2 d−1 ), which includes both the consumption in the lake water and on the bottom sediment. Our principle aim is not to develop a complicated oxygen

184

O. Malve et al. / Ecological Modelling 182 (2005) 183–197

model, but to study the changes in the condition of the lake as manifested by yearly average respiration, to predict future oxygen regime in the lake, to dimension probabilistically artificial oxygenation efficiency and to illustrate the multiplicity of model uncertainties and the strengths of the MCMC method in tracking these. The identifiability and uncertainty analysis of large environmental simulation models (Scavia, 1980; Adams VanHarn, 1998; Brun et al., 2001; Omlin et al., 2001a,b; Reichert and Vanrolleghem, 2001) have by now manifested some shortcomings in the scientific basis of complicated ecological model equations and of their parameter ranges. This is why the use of complicated water quality models as quantitative forecasting tools without comprehensive and well-designed measuring campaigns has been challenged and why new interest has been developed in the application of methods that can better trace unidentifiabilities and uncertainties in the models (e.g. MCMC) (Harmon and Challenor, 1996; Adams VanHarn, 1998; Omlin and Reichert, 1999; Annan, 2001; Borsuk et al., 2001a). As a subject of research and management, water resources are complicated, multiform and sometimes respond to external disturbances in an unpredictable way. Moreover, our measurements are spatially and temporally limited. Data analysis using traditional statistics relies on simplifications, typically in forms of linearization and large sample arguments, both of which can lead to unrealistic estimates of the parameters and especially of their accuracy. On the other hand, traditional physics has encouraged the use of complicated theoretical models which may contain parameters that cannot be estimated accurately with the available data or due to nonlinearities of the model. In the environmental sciences, the results of the analyses are frequently used in decision-making, and the more accurately the uncertainties can be evaluated, the more accurately we can then evaluate the risks of the decisions. Bayesian statistical inference with modern computational methods have provided very useful tools for assessing the uncertainties in environmental studies (Harmon and Challenor, 1996; Kokkonen, 1997; Adams VanHarn, 1998; Omlin and Reichert, 1999; Annan, 2001; Borsuk et al., 2001a). In this research, we have applied Bayesian data analysis with a computational tool called the Markov chain Monte Carlo (MCMC) method. The usefulness of the method is demonstrated with an example of lake mod-

elling. The time evolution of winter respiration in eutrophic Lake Tuusulanj¨arvi is estimated, the long-term impact of loading reduction and artificial aeration is assessed and oxygen regime in the lake in a future winter is predicted. Modelling of the interaction between a lake’s oxygen regime and all relevant ecological variables may be based on a complicated theoretical framework. The difficulty lies in the design and implementation of experiments that minimize parameter uncertainty, in the quantification of parameter uncertainty, and in the propagation of uncertainty to the predictions. In our particular example, a simplified model was selected to clearly illustrate the multiplicity of uncertainties and the strengths of the MCMC method in tracking them. Limited observational resources, typical in lake management practice, supported the use of a simple model structure. The efficiency and operational status of the artificial oxygenation devices over 30 years could not be traced very accurately and the temperature dependence of respiration was inaccurately known, which caused uncertainties in the inference. Uncertainties in the results were quantified and studied thoroughly and MCMC methods were evaluated with the given example. We present the basic principles of Bayesian methodology needed for environmental modelling. In addition to the standard MCMC methodology used in some recent papers on environmental problems (Borsuk, 2001; Borsuk et al., 2001a; Qian et al., 2003) we also show useful ideas not commonly implemented. The modelling of the error in the control variables is used to account for all the relevant uncertainties of the phenomenon. We also show how predictive probabilities can easily be calculated for trends in the time evolution of the model. Use of the adaptive MCMC algorithm (Haario et al., 2001) makes it feasible to do calculations with a high dimensional parameter vector, that is, when we have a large (from 50 up) number of unknown parameters.

2. Materials Lake Tuusulanj¨arvi is located in the northern hemisphere in southern Finland, latitude 60◦ 26 , longitude 25◦ 03 . The lake is hyper-eutrophic and shallow, its hydrologic and morphometric characteristics are shown in Table 1. The previously mesotrophic state of the lake

O. Malve et al. / Ecological Modelling 182 (2005) 183–197 Table 1 Hydrology and morphology of Lake Tuusulanj¨arvi (Anonymous, 1984) Surface area (km2 ) Volume (m3 ) Maximum depth (m) Average depth (m) Length (maximum) (km) Theoretical water residence time (days) Area of drainage basin (km2 ) Lake percentage of drainage basin (%)

6.0 19 × 106 10 3.2 7.5 250 92 8.4

shifted to hyper-eutrophy in the 1960s due to sewage discharge. In the beginning of the 1970s, the winter oxygen regime was in critical condition. The oxygen regime was slightly improved in 1973 when winter aeration began. The situation was further improved with reductions in nutrient loading. Sewage discharge was diverted in 1979. Summer aeration started in 1980. The high trophic condition in the lake remained, and blooms of blue–green algae occurred every summer after loading reduction (50% in phosphorus loading) in 1979. The phosphorus load from agriculture (4500 kg year−1 = 0.75 g m−2 year−1 ) still surpasses the lake’s tolerance level, which is why the reduction of phosphorus content of the water body by decreasing both external and internal phosphorus loading has been required. In the beginning of the 1970s, the lake’s winter net oxygen consumption was estimated to be 200 000 kg on average. The flux of pumped dissolved oxygen has been approximated by aerator consultants to be about 100 t on average. It has a large yearly variation due to technical problems and duration of ice-cover. This leaves significant uncertainty concerning the estimated dissolved oxygen fluxes, and affects the lake respiration estimates. The prior distribution for the flux was calculated from the information available in technical reports. In 1971, before aeration was started no dissolved oxygen was actually pumped (average 0 kg d−1 , S.D. 20 kg d−1 ). In 1972–1980, the amount of oxygen dissolved into the lake’s hypolimnion by Nokia aerator was 1350 kg d−1 (S.D. 675 kg d−1 ). In 1982–1990, water from the hypolimnion was pumped by Hydixor aerator to the lake surface, where it was mechanically aerated and pumped back to the hypolimnion. Reported dissolved oxygen flux was 240 kg d−1 (S.D. 24 kg d−1 ). In 1990–1997, the average oxygen flux pumped by Planox aerator was 450 kg d−1 (S.D. 45 kg d−1 ). From

185

1997 onwards Lappalainen (1994), oxygen was not dissolved at all (average 0 kg d−1 , S.D. 20 kg d−1 ), but oxygen-rich water from the epilimnion was pumped by Mixox aerator into the hypolimnion. In 2003, the aerator was not used at all. Other restoration measures such as dilution of the lake water with nutrient-poor water from a neighboring water body as well as more recent biomanipulation have been performed to decrease the trophic state and internal loading of the lake. Those measures also have an implicit impact on the lake respiration and on the oxygen regime. The lake water was sampled at two-meter intervals at the deepest point of the lake (point P10 in Fig. 1) (maximum depth 10 m) by the Uusimaa regional environment center and Central Uusimaa Federation of Municipalities for the Water Protection during the period 1968–2003. Samples were collected two to seven times each winter. Oxygen concentration and temperature were analysed with standard analysis methods. Vertical average and standard deviation were calculated. In March 2001, oxygen concentrations were also measured in situ at nine points (P1–P9 in Fig. 1) to determine the area of aerator impact.

3. Methods 3.1. Modelling of winter oxygen regime In standard lake aeration planning techniques, the average winter respiration rate in the lake is typically estimated with linear regression where the y variable is the oxygen content of the water body (mg m2 ) and the x variable is the time after the beginning of the icecover period (d). The slope of the regression line repre-

Table 2 The lake oxygen consumption model equation dCO2 feed = −kCO2 θ Tobs −Tref + dt vol oxygen concentration in the lake (g m−3 ) CO 2 k total respiration rate constant (d−1 ) θ temperature coefficient of respiration rate Tobs observed temperature in lake water (◦ C) Tref reference temperature (4 ◦ C) Feed 103 × dissolved oxygen flux (kg d−1 ) vol volume of aerator impact (m3 )

186

O. Malve et al. / Ecological Modelling 182 (2005) 183–197

Fig. 1. Bathymetric map of Lake Tuusulanj¨arvi. Location of water and bottom samples, aerator, area of aerator impact, and depth-volume curve.

sents the respiration rate (mg m−2 d−1 ) (Lorenzen and Fast, 1977). By coupling the temperature dependency of the respiration according to the Arrhenius formulation (Bowie et al., 1985) and the time-varying oxygen flux from the aerator, we introduced nonlinearity and dynamics to the model. The oxygen concentration is the average vertical concentration in the area of aerator impact. The model used in this study (Table 2), describes the average oxygen regime in the area of aerator impact

(1 km2 , Fig. 1). Due to the fact that biological oxygen demand (BOD) is under the detection limit in winter periods, it was not included in the model. Similar formulation has been used in the modelling of estuarine and coastal oxygen dynamics (Borsuk, 2001; Borsuk et al., 2001a,b). As we were interested in the time evolution of the respiration in the lake, the rate parameter k was estimated separately for each winter period. The tem-

O. Malve et al. / Ecological Modelling 182 (2005) 183–197

187

perature dependency was, however, assumed to have common value for all years. Fitting separate rate parameters for each year is also reasonable, as such timevarying factors as weather condition, nutrient loading, inflow–outflow rates and plankton population composition, which are not included in the model, must still contribute to the total oxygen consumption. Uncertainties in the model are connected to the total respiration rate parameter k, to the temperature coefficient θ, to pumped flux of dissolved oxygen and to the variance of vertically averaged oxygen concentration. The definition procedure of the prior values for these parameters is presented in Section 3.3. By their character the respiration rate parameter k and the temperature coefficient θ are correlated due to the model formulation. Correlation may constitute severe identifiability problem in parameter estimation but does not necessarily have severe impact on distributions of model predictions.

cent advances in the MCMC computing, together with constantly increasing CPU resources, have made even larger problems tractable, see for example, Haario et al. (2001). In the classical least squares estimation, we usually assume that the control variables (the x variables) are measured without error or at least with error that is negligible compared with the observational error in the dependent variables. Bayesian approach offers an intuitive way to model these uncertainties. An observation with error is considered as an extra parameter and a prior distribution describing the error structure is attached to it. A drawback with this approach, however, is that the number of the parameters of the model can grow quite large, which in turn imposes extra burden on the calculations.

3.2. Bayesian modelling

To use the model (1) in Bayesian MCMC settings we need to specify a prior distribution for the parameters. If we assume prior independence of the parameter values it is sufficient to specify a marginal density for each component of the parameter vector. Although in principle any kind of prior distributions can be used in MCMC calculations we have used only Gaussian priors with possible upper and lower limits for the values (e.g. positivity constraints). For the total respiration rate constants k1 , . . . , k31 (one for each winter period) it was reasonable to use non-informative priors. These parameters were originally of primely interest and we wanted to explore their posteriors without any prior constraints (other than positivity). The temperature dependency of the rate parameter θ in the form used here is not very well identified by the model. Preliminary modelling suggested a heavy nonlinear correlation between k and θ. A proper prior distribution was acquired from a laboratory experiment (sediment sampling points are shown in Fig. 1) (Lehtoranta and Malve, 2001). The distribution suggested by the experiment was Gaussian N(1.45, 0.4), and it was used as a prior for θ. For the unknown variance σ 2 of the observation error , a non-informative conjugate prior was used. Calculating the model (1) given the data and parameters requires solving a differential equation using numerical methods. To solve the equation (1) numer-

Unidentifiability of the parameters can be caused by limited availability of observational data or by the structure of the nonlinear model equations. This can sometimes be solved with better design of the experiments and reparametrizing the model, but many times we are stuck with the available inaccurate data and modelling is to be done in conventional units. Bayesian approach has shown to be powerful way to quantify the uncertainties in the whole the modelling procedure. This is show in many examples in ecological modelling (Adams VanHarn, 1998; Annan, 2001; Borsuk, 2001; Borsuk et al., 2001a; Harmon and Challenor, 1996; Omlin and Reichert, 1999; Reckhow, 2002; Qian et al., 2003). Unidentifiability is easily detected by inspecting the posterior distributions and in many times it does not even matter if the parameter themselves are correlated, because the model predictions are what we are interested on. These predictions are not affected by the unidentifiability if taken into account accordingly. Nonlinear models with large number of correlated parameters causes additional challenges to the computations, however, MCMC simulation of values from the posterior distribution of the parameters is a computational method that can be applied to wide variety of modelling problems (Gamerman, 1997). Re-

3.3. Applying Bayesian modelling to the lake model

188

O. Malve et al. / Ecological Modelling 182 (2005) 183–197

ically, we need to specify an initial condition for the oxygen concentration, i.e. the value of CO2 at the initial time point 0 for each year. However, this value is itself an observation subject to error, and solving the equation with fixed initial values causes bias. A better approach is to let the initial concentrations vary also, meaning that we have an extra parameter for each winter period. As with the other parameters, the MCMC algorithm will produce posterior distributions for these values as well. The term feed/vol in the model (1) corresponds to the amount of fresh oxygen feed dissolved in the lake water. For the value of feed we have the manufacturers information but the value is subject to some doubt. The imprecise knowledge of the volume of aerator impact also causes uncertainties. Gaussian priors for the oxygen feed in the five periods were (mean ± S.D. (kg d−1 )): (1) 1970–1972: 0 ± 20, (2) 1973–1982: 1350 ± 675, (3) 1983–1990: 240 ± 24, (4) 1991–1997: 450 ± 45, (5) 1998–2002: 0 ± 20. Note that in the period (1) no pump was installed yet, while in the latest period (5) the aerator only mixed the water, keeping the ice cover open near the pump. To sum up the model and the parameters so far: we have 31 respiration rate parameters k1 , . . . , k31 , one for each winter period, together with 31 initial values of the concentrations. The temperature dependency constant θ is assumed to be independent of the time and thus adds only one parameter. Error in the x variables approach for the oxygen feed term of the model brings five more parameters. Together with the unknown observation error σ 2 , the total number of parameters totals up to 69.

For example, this way we can calculate trends in the yearly respiration of the lake oxygen regime during the winter and have full posterior distribution describing the uncertainties in the trend estimates. In this way, any predictions based on the model include all the relevant background information and uncertainties. Unidentifiability of model parameters, which results from the parameter correlation, non-linearity of the model and from the characteristics of the data is also revealed by the MCMC chain. The posterior correlations and two-dimensional plots of the parameter combinations can be used as diagnostics of parameter identifiability. High posterior correlations between parameters make estimates of single parameters themselves useless if the dependency in the other parameters are not clearly stated, but usually have little impact on prediction distributions as the correlations are in consistent way taken into account in the predictions. The oxygen time profile can be predicted for a new year using the posterior information of the fitted model. This can even be done without any new observations. The posterior from the previous years is treated as a prior for the new year. Previous years can be weighted in a suitable matter. If we believe that winters closer to the current situation are better for the prediction (we believe that there is a trend in the parameter values) we can give decreasing weight to past year’s posteriors. This has been done in Section 4.3. Monte Carlo simulations with different aerator efficiencies together with the model predictions can be used to asses the risks of oxygen depletion or fish kills, Section 4.4.

3.4. Using the MCMC chain 4. Results The MCMC chain produced by the algorithm is a (n:o simulations) × (n:o parameters) matrix of samples from the posterior distribution of the multi-dimensional parameter vector. All statistical reasoning on the model and the parameters can be based on this matrix. Common usage is the calculation of posterior means, standard deviations and correlations from the chain together with some illustrative plots such as histograms and density estimates. In addition, any model derived values (calculations based on the model) are given predictive posterior distributions when calculated for each (or randomly sampled subset of) parameter in the chain.

4.1. Posterior distributions for the lake model parameters This simple model with separate rate parameter k estimated for each year gives very good agreement with the winter time observations and allows us to study the changes in the respiration over the years and the effect of the external oxygen feed. Fig. 2 show the data used together with the fit obtained by Bayesian methods. The model predicted oxygen concentration 95% posterior limits are shown by gray area around the median

O. Malve et al. / Ecological Modelling 182 (2005) 183–197

189

Fig. 2. Observed oxygen concentrations (circles) (g m−3 ) and temperatures (◦ C) (lower solid line) at ice-covered period in years 1970–2000. The x-axis is time from the start of the ice-cover period. The dots are the observed vertically averaged O2 concentrations. Smaller dots and the dashed line show observed temperatures (◦ C). The solid line with gray area around shows the median and the 95% region of posterior predictive distrubution.

estimate. The rate parameter k for each year is plotted in Fig. 3. The prior and posterior distributions of the oxygen feed for the three middle periods of the different aerators are plotted in Fig. 4. These were the periods when the pump was operated in a way that increased the available oxygen in the water (compared to just mix-

ing). Priors were based on the manufacturers announcements. The prior/posterior acceptance is good except for the first period of 1973–1982 where the prior mean 1350 (±675) is changed to posterior value 124 (±63). It seems that the ability of the aerator to dissolve oxygen to the water was over estimated at that time. In the later aeration periods this was not the case.

190

O. Malve et al. / Ecological Modelling 182 (2005) 183–197

Fig. 3. Box plots of the posterior distribution of the fitted oxygen respiration rate constant k (d−1 ) of the lake model (1) for each year.

The average respiration (mg m−2 d−1 ) in each year can be calculated from the fitted model as average oxygen consumption (mg m−2 d−2 ) multiplied by the average depth. It was calculated for each parameter realization of the MCMC chain thus giving the posterior predictive distribution of the respiration, as explained in Section 3.4. Fig. 5 shows the posteriors for estimated yearly values of the respiration. The mean estimated average respiration for the entire period 1970–2000 was 301 mg m−2 d−1 . This value is in the upper level of 11 eutrophic Canadian lakes on the same latitudes (latitude 64°) where the winter respiration was reported to be 131–306 mg m−2 d−1 by

Welch and Bergmann (1985). Annual variation in respiration was also quite high (S.D. 105 mg m−2 d−1 ). Average respirations in the 1970s, 1980s, 1990s and 2000–2002 were 327, 311, 281, and 270 mg m−2 d−1 showing slight (but not statistically significant) downward trend. Laboratory experiments conducted at Lake Tuusulanj¨arvi in April–May 2001 showed the value for the bottom sediment respiration to be 130 mg m−2 d−1 at +3 ◦ C and 230 at +4 ◦ C (Lehtoranta and Malve, 2001). Estimated respiration from the lake data in winter 2000–2001 was 215 (S.D. 97), which is on the same level as respiration in the laboratory experiments.

Fig. 4. A prior distribution of the oxygen feed (kg d−1 ) (dotted line) compared with the posterior distribution (solid line).

O. Malve et al. / Ecological Modelling 182 (2005) 183–197

191

Fig. 5. Average respiration (mg m−2 d−1 ) calculated from the MCMC chain. Box plots corresponds the predictive posterior distribution of estimated respiration.

The estimated oxygen respiration rate constants and the respiration values represent the area of aerator impact. In spite of that, those values are quite near the lake averages because most of the respi-

ration in Lake Tuusulanj¨arvi consists of the bottom sediment respiration, which is horizontally quite homogenously distributed (Lehtoranta and Malve, 2001).

Fig. 6. Two-dimensional MCMC chain from a preliminary fit where the respiration constant k was held fixed over all the years (with Tref = 20). The picture illustrates a heavy nonlinear dependency of the two parameters k and θ (the temperature dependency parameter), and also the evolution of the MCMC algorithm when the chain is started a far from the core of the posterior distribution.

192

O. Malve et al. / Ecological Modelling 182 (2005) 183–197

Fig. 7. Prior and posterior distributions of the temperature dependency parameter θ.

4.2. Parameter identifiability The model includes a temperature dependency term θ. However the temperature variation of the water in the winter time and during the aeration period varies quite a little. The temperature is typically steadily rising from about 1 ◦ C in the beginning of ice to some 2 ◦ C when the ice melts. At the same time the temperature dependency confounds itself with the k parameter.

As the temperature rises the concentration decreases at the same time. This makes it practically impossible to distinguish the temperature effect from the effect of the decrease because the respiration rate being proportional to the actual concentrations. In Fig. 6, a model with common k and θ for each year is fitted without any prior constraints. It clearly shows the nonlinear relationship of these parameters and unidentifiability arising from that. Laboratory experiment to determine the temperature dependency parameter θ (Lehtoranta and Malve, 2001) suggested a value 1.45 with standard deviation 0.2 so the no temperature effect (θ = 1) still remained valid alternative. In the literature several values for theta are given, but mostly for temperatures well above 4 ◦ C. Predictions given with model with θ fixed as 1 gave practically almost as good results as with θ > 1. However to allow predictions to depend on the temperature conditions and thus to study their effects, parameter θ was held in the final model. Fig. 7 shows the used prior together with the posterior from the final model with separate respiration rate k for each year, while Fig. 8 shows the posterior correlations of the parameters in the final model for one particular winter period.

Fig. 8. Two-dimensional marginal posterior distributions of k, θ, and Oxygen feed at winter 1993–1994. The two contour lines correspond to 50 and 90% regions.

O. Malve et al. / Ecological Modelling 182 (2005) 183–197

193

Fig. 9. Smoothed rate constant k with two levels of smoothing. Upper plot with lowess parameter h = 0.2 correspond to about 6-year trend, and the lower with h = 0.4 about 12-year trend. Gray levels give 50, 90, and 95% limits of the posterior.

Fig. 10. Smoothed respiration with two levels of smoothing, as in Fig. 9 for the k parameter. Gray levels give 50, 90, and 95% limits of the posterior.

194

O. Malve et al. / Ecological Modelling 182 (2005) 183–197

4.3. Time evolution of lake winter respiration The trend in different time scales of the respiration rate can be calculated as a derivate of a smoothed rate versus year curve. Using different smoothing parameters we can detect trends over different time periods. Smoothing could be done for instance with a moving average. We have used “lowess”—a locally weighted regression smoother (Cleveland, 1985). In Figs. 9 and 10, posterior limits for smoothed values of the rate parameter k and for the average respiration are calculated together with the value of the trend (one year change). The smoothing used here corresponds to an approximate 10-year trend. Note that here again we get posterior confidence limits of the trend as we calculate the trend curves over the parameters sampled in the MCMC chain.

Trend plots in Figs. 9 and 10 show statistically significant fluctuations and cyclic behavior in yearly values of the parameter k and the respiration in 5–10-year timescales. There seems to be a delay in the expected cyclic rise of the rate parameter k in the most recent years and the overall long time trend shows decreasing values for the rate. In the respiration, no overall trend over the whole study period can be detected. 4.4. Predicted oxygen regime in future winters Combining the posteriors from the past years, we can have data based priors for rate parameter k, for the initial concentration at the beginning of the winter and for the feed term. With these distributions we can simulate possible O2 time series and calculate prediction intervals. As soon as we receive data, for example, a

Fig. 11. Predicting a new winter. Four plot in the upper part show how prediction limits for the concentration decrease as more data becomes available. The lower plot displays how the posteriors distribution of parameter k becomes more accurate with new observations.

O. Malve et al. / Ecological Modelling 182 (2005) 183–197

first observation for that winter, we can fit a new model with the information from the previous years as prior and have a new posterior predictions. As new observations arrive we again update the model accordingly. Fig. 11 shows an example how the posterior distribution of the parameter k evolves as new observations become available. To predict the possible oxygen depletion and fish kills we want to know whether there is a risk of the oxygen concentration to go below 4 g m−1 . This is done by predicting the concentration at the end of the ice covered period. As the length of the winter is unknown beforehand we must allow uncertainty for it too. The distribution of the length was taken to be empirical distribution of the observed past winters. Fig. 12 shows one prediction from the data in Fig. 11. The prediction procedure can be used to plan the aeration measures, to optimize aerator efficiency and to help in the real time process control of aerators. Monte Carlo simulations for new years using the model and the posterior distributions together with simulated values for aerator efficiency and temperature profiles gives us predictive distributions of possible oxygen depletion scenarios. This can be done “on-line” with new observations.

195

5. Discussion and summary In this study, the estimation of the long-term evolution of lake winter respiration and the prediction of lake oxygen regime in future winters were used as examples of how uncertainties can be taken into account with the Bayesian approach and MCMC methods. The respiration, the rate of oxygen depletion per unit of hypolimnetic surface, has been used as a trophic state index (Hutchinson, 1938; Mortimer, 1941). The mean estimated respiration in whole the study period was 301 ± 105 mg m−2 d−1 , which is on the upper limit of winter respiration of eutrophic Canadian lakes on the same latitude reported by Welch and Bergmann (1985). The rate parameter k which is the reference rate of the respiration (d−1 ) at 4 ◦ C indicated cyclic behavior of about 9-year amplitude and some signs of global long term decrease. It had a statistically significant negative trend through out the study period, indicating a decrease in respiration rate independent of temperature. Coupling this with the fact that the temperature increased in the 1990s, it can be hypothesized that respiration would have decreased earlier and more steeply if the temperature had remained stable. The temperature dependency θ was also estimated from the data. Due to very little variability in the ob-

Fig. 12. Predicting a new winter. First plot on the right shows predictive posterior distributions for the amount of oxygen in the water at the end of winter. The four distributions correspond to the four observations in Fig. 11. The middle plot shows the probability that the concentration will go under 4 g m−3 after we have observed the second concentration. Plot on the right shows how the estimated estimated fresh oxygen feed that would be needed to keep oxygen amount above 4 g m−3 depends on the length of the winter after the second observation.

196

O. Malve et al. / Ecological Modelling 182 (2005) 183–197

served temperatures during the winters and to high non linear correlation between it and the rate parameter k, the temperature effect could not be identified by itself. Prior information for θ obtained from separate laboratory experiment solved the unidentifiability and allowed a proper posterior distribution for it. The poor acceptance of the prior oxygen feed term of the model compared to the posterior (Fig. 4) at the first aeration period 1973–1982 suggested that the ability of the aerator to dissolve oxygen to the water was over estimated at that time. In the later aeration periods, this was not the case. The reduction of the waste water loading in 1979 did not have an immediate impact on respiration rates. The overall decrease of respiration rate and of its confidence limits could have been a response to restoration measures and stabilization of lake trophic state. The Bayesian approach depends on the ability to define prior distributions on all the unknowns of the model. In ecological modelling the priors can be based on previous experiments and observations and also on the physical constrains of the phenomena. The demand of quantizising the uncertainties forces a fruitful dialog between modellers and specialists. Markov chain Monte Carlo (MCMC) method is a computational tool for Bayesian modelling. It can handle complicated and over parametrized environmental models with confounding badly identifiable effects. Prior information is combined with the model and the data to produce posterior information and can be used in analysis and predictions that take into account all the uncertainties in the model and the data. The benefits coming from using Bayesian estimation are that we are able to combine and pool information from different sources (laboratory experiments and lake data) and to quantify the uncertainties with full statistical approach using prior and posterior distributions. We can predict the future winters using the posterior information coming from past observations and combine it with new observations to produce predictive posterior distributions. This allows us, for example, to dimension the aerator to secure a target level of oxygen concentration with a given margin of safety. The unidentifiability of model parameters might prevent us to separate the effects of parameters but does not affect the model predictions because Bayesian computations take the full multidimensional distribu-

tions of the parameters into account without resorting to linearizations or other approximations.

Acknowledgments This research has been funded by the Academy of Finland. We are grateful to Mauri Pekkarinen from Central Uusimaa Federation of Municipalities for the Water Protection and Jarmo V¨aa¨ riskoski from the Uusimaa Regional Environment Centre for transmission of data. The head of the Water Resources Laboratory, Professor Pertti Vakkilainen, and Mr. Hannu Sirviö from the Finnish Environment Institute are warmly thanked for their help in the preparation phase of this research.

References Adams VanHarn, B.A., 1998. Parameter distributions for uncertainty propagation in water quality modeling. Ph.D. thesis, Department of Environment, Graduate School of Duke University. Annan, J.D., 2001. Modelling under uncertainty: Monte Carlo methods for temporally varying parameters. Ecol. Model. 136, 297– 302. Anonymous, S., 1984. Tuusulanj¨arven kunnostussuunnitelma. Keski-Uudenmaan vesiensuojelun kuntainliitto (Restoration plans for Lake Tuusulanj¨arvi), Vantaa, Finland (in Finnish). Borsuk, M.E., 2001. A graphical probability network model to support water quality decision making for the neuse river estuary, north carolina. Ph.D. thesis, Department of the Environment in the Graduate School of Duke University, in press. Borsuk, M.E., Higdon, D., Stow, C.A., Reckhow, K.H., 2001a. A Bayesian hierarchical model to predict benthic oxygen demand from organic matter loading in estuaries and coastal zones. Ecol. Model. 143, 165–181. Borsuk, M.E., Stow, C.A., Luettich Jr., R.A., Paerl, H.W., Pinckney, J.L., 2001b. Modelling oxygen dynamics in an intermittently stratified estuary: estimation of process rates using field data. Estuarine Coastal Shelf Sci. 52, 33–49. Bowie, G., Mills, W., Porcella, D., Campbell, C., Pagenkopf, J., Rupp, G., Johnson, K., Chan, P., Gherini, S., Chamberlin, C., 1985. Rates, constants, and kinetic formulations in surface water modeling. Tech. Rep. EPA/600/3-85/040, US Environmental Agency, ORD, Athens, GA, ERL. Brun, R., Reichert, P., Künsch, H., 2001. Practical identifiability analysis of large environmental simulation models. Water Resources Res. 37 4, 1015–1030. Cleveland, W.S., 1985. The Elements of Graphing Data. Wadsworth, Monterrey, CA. Gamerman, D., 1997. Markov Chain Monte Carlo — Stochastic Simulation for Bayesian Inference. Chapman & Hall.

O. Malve et al. / Ecological Modelling 182 (2005) 183–197 Haario, H., Saksman, E., Tamminen, J., 2001. An adaptive Metropolis algorithm. Bernoulli 7 2, 223–242. Harmon, R., Challenor, P., 1996. A Markov chain Monte Carlo method for estimation and assimilation into models. Ecol. Model. 101, 41–59. Hutchinson, G., 1938. On the relation between oxygen deficit and productivity and topology of lakes. Int. Rev. Gesamten Hydrobiol. 36. Kokkonen, T., 1997. Parameter identification in groundwater models. Part I. Bayesian approach to inverse groundwater problem. Licentiate thesis, Faculty of Civil and Environment Engineering, Helsinki University of Technology, Helsinki, Finland, 106 pp. Lappalainen, K., 1994. Positive changes in oxygen and nutrient contents in two Finnish lakes induced by Mixox hypolimnetic oxygenation method. Verh. Inter- nat. Verein. Limnol. 25, 2510– 2513. Lehtoranta, J., Malve, O., 2001. Tuusulanj¨arven pohjasedimentin hapenkulutuskoe 26.4.–4.5.2001. Tech. rep., SYKE, Helsinki, Oxygen consumption rate of Lake Tuusulanj¨arvi sediments (in Finnish). Lorenzen, M., Fast, A., 1977. A guide to aeration/circulation techniques for lake management. Tech. Rep. EPA-600/3-77-004, Corvallis Environmental Research Laboratory, Office of Research and Development, US Environmental Protection Agency, Corvallis, OR.

197

Mortimer, C.H., 1941. The exchange of dissolved substances between mud and water, 1 and 2. J. Ecol. 29. Omlin, M., Brun, R., Reichert, P., 2001a. Biogeochemical model of lake Zürich: sensitivity, identifiability and uncertainty analysis. Ecol. Model. 141 13, 105–123. Omlin, M., Reichert, P., Forster, R., 2001b. Biogeochemical model of lake Zürich: model equations and results. Ecol. Model. 141 13, 77–103. Omlin, M., Reichert, P., 1999. A comparison of techniques for the estimation of model prediction uncertainty. Ecol. Model. 115, 45–59. Qian, S.S., Stow, C.A., Borsuk, M.E., 2003. On Monte Carlo methods for Bayesian inference. Ecol. Model. 159 (2), 269–277. Reckhow, K.H., 2002. Bayesian approaches in ecological analysis and modeling. In: Canham, C.D., Cole, J.J., Lauenroth, W.K. (Eds.), The Role of Models in Ecosystem Science. Princeton University Press. Reichert, P., Vanrolleghem, P., 2001. Identifiability and uncertainty analysis of the river water quality model no. 1. Water Sci. Technol. 43 7, 329–338. Scavia, D., 1980. Uncertainty analysis of lake eutrophication model. Ph.D. thesis, Environmental and Water Resources Engineering in the University of Michigan. Welch, H., Bergmann, M.A., 1985. Winter respiration of lakes at Saqvaqjuac, N.W.T. Can. J. Fish. Sci. 25, 521–527.