The implementation of the ecosystem module of a coastal environmental model: Port Phillip Bay, Australia

The implementation of the ecosystem module of a coastal environmental model: Port Phillip Bay, Australia

Environmental Modelling & Software 15 (2000) 357–372 www.elsevier.com/locate/envsoft The implementation of the ecosystem module of a coastal environm...

287KB Sizes 3 Downloads 51 Views

Environmental Modelling & Software 15 (2000) 357–372 www.elsevier.com/locate/envsoft

The implementation of the ecosystem module of a coastal environmental model: Port Phillip Bay, Australia Alexander G. Murray a

a,*

, John S. Parslow a, Stephen J. Walker b, Jason R. Waring

a

Department of Zoology, University of Aberdeen, Tillydrone Avenue, Aberdeen, Scotland, UK b CSIRO Land and Water, GPO Box 1666, Canberra, ACT, Australia Received 6 May 1999; accepted 8 February 2000

Abstract When implemented as a computer program, an ecosystem model is only a part of a larger programming environment. This programming environment includes other programs, non-model program components, program design rules, data files, and associated analysis analytical tools. These components should be divided to allow programmers to focus on their areas of expertise, but must then be rejoined in such a way as to minimise debugging and execution overheads. We describe this larger programming environment as it surrounds a model of the ecosystem of Port Phillip Bay, Australia. The ecosystem model requires a transport model to allow spatial modelling; this transport model uses currents from a computationally intensive hydrodynamic model. Implementation of the ecosystem model also requires non-model code, such as routines to initialise parameters or the integration method. Their design determines program reliability and performance. A modular structure allows different parts of the model to be independently modified; this makes for efficient programming. We describe formal design rules used to enhance readability and information content of the model’s parameter names. To execute, the model must access data files and a record of the run must be kept — a Unix shell program serves both these functions. The data files may require software tools for generation or manipulation. Output from the model also requires post-processing for visualisation and analysis. The model is thus only a part of a network of software, whose development must be coordinated to ensure reliability and efficiency.  2000 Published by Elsevier Science Ltd. All rights reserved. Keywords: Marine ecosystem design

Software Availability Name of Software: bm ecosystem modules Developer: CSIRO Marine Research Contact Address: GPO Box 1538, Hobart, Tas 7001, Australia First available: 1996 Hardware requirements: SUN/Solaris Workstation Software Requirements: None Program Language: C++ Program Size: 547 KB Availability: Currently internal use only

* Corresponding author. Tel.: +44 1224 876544; fax: +44 1224 295511. E-mail address: [email protected] (A.G. Murray).

1. Introduction The concept of an environment is literally all encompassing. Modelling a particular ecosystem requires the description of a wide range of both biogeochemical and physical environmental processes using many different scientific disciplines and this can often require the collaboration of workers in these very different fields (Blackford and Radford, 1995). The modelled processes often require large amounts of forcing data to drive them and this may require researchers from other sciences. Finally output from the model must be analysed and visualised. The development of graphical user interfaces (GUIs) can be a complex task best handled by specialist programmers, although simpler post-processing may require only simple programs. Development and operation of an environmental model may therefore require a complex structure of its own to handle these interactions.

1364-8152/00/$ - see front matter  2000 Published by Elsevier Science Ltd. All rights reserved. PII: S 1 3 6 4 - 8 1 5 2 ( 0 0 ) 0 0 0 1 6 - 5

358

A.G. Murray et al. / Environmental Modelling & Software 15 (2000) 357–372

Because an ecosystem is complex, the ecosystem model’s programs are often computationally expensive. It is possible to use the separation of tasks to minimise computation overheads. Pre-processing or post-processing of activities allows the model to be restricted to a core component, thus speeding execution. For example physical oceanographic processes are largely unaffected by biology and hence the physical model can be preprocessed, producing a relatively simple transport model. Readability is a factor in program design that can be as important as execution speed; this is of critical importance where many programmers work on a program which is designed to be used over an indefinite period (Brookes, 1982). An essential strategy in ensuring readability is the minimisation of information exchange so that programmers working on one part of the program are able to treat other parts of the program as a black box. Consistent design standards and nomenclature also enhance readability. In this paper we will discuss the structures used in the modelling of a particular marine ecosystem, that of Port Phillip Bay, Australia. The 1950 km2 bay is a valuable natural resource, being used for fisheries, recreation, conservation, transport, and treated effluent disposal. Due to its narrow entrance, and to shallow sandbanks surrounding this, the water of the bay is quite isolated from the ocean — turnover of the bay’s water taking around 1 year. This isolation makes the ecosystem in the bay dependent on local processes and so amenable to management or sensitive to mismanagement. The bay is the site of the city of Melbourne and a particular environmental concern is the response of the ecosystem to changing inputs of nutrients via rivers and the Western Treatment Plant, a sewage treatment plant. As a result of these concerns, a scientific study was funded by Melbourne Water with the aim of establishing the “health of the bay” and this study has recently been reported (Harris et al., 1996). As a part of this study we developed a model of the Port Phillip Bay ecosystem (Murray and Parslow, 1997; Murray and Parslow, 1999b). This model links into a transport model which handles the movement of dissolved and particulate materials, including their inputs to the bay, and which describes the physical environment of the bay (Walker, 1999). The integrated model uses pre-generated current fields, output by a hydrodynamic model (Walker, 1999), and is forced by a variety of inputs either based on data or derived from further models, notably the model of nutrient inputs from major rivers derived by Sokolov and Black (1999). One possible approach to program design is to treat data objects (the boxes in Fig. 2) as the foci: object oriented programming. We have used this approach in Ada to model a marine ecosystem using parallel processes

(Tasks) which each contain a data object and which interact via Rendezvous (Murray, 1990). In the case of the Port Phillip Bay model our approach has been a more conservative top-down functional design. Because most of our objects have quite simple behaviour and all the processes involve interaction between objects, an object orientated approach would considerably increase the complication of programming. This would lead to less readable and more inefficient code. A problem with ecosystem description is that object boundaries are difficult to define. Ensuring the objects do not overlap, or leave gaps, may be more difficult if these objects are handled separately. The future application of the model to other environments means new program variable descriptions are likely to be required. One advantage of an OOP approach is that it is possible to add or remove complete objects (model variables) relatively easily. This is particularly the case in OO languages which allow a basic class to be defined and for particular instances of objects and their associated processes to be inherited from this class. For example, in the marine ecosystem there are many similar, but subtly different, types of phytoplankton (single celled plants). In different ecosystems, or to answer different questions, it may be necessary to include different phytoplankton types (which may be defined on the basis of size, taxonomy or some other property). Under an OO approach it is possible to create different types of phytoplankton as instances of a general class. When new variables are added, testing the model, as opposed to the program, may remain a very timeconsuming task. Since work on the Port Phillip Bay modelling began, in the mid 90s, computing power has substantially increased, thus reducing the computational overhead problem. OO design methodologies have also improved (Firestone et al., 1998), as have our own model analysis methods (Murray and Parslow, 1999a). It is intended, in the longer term, to produce an object orientated version of this model for future application to other marine ecosystems. The ecosystem model (Murray and Parslow, 1999b) is implemented at the centre of a complex programming environment. This environment includes routines within the ecosystem model’s own part of the program that handle the book book-keeping matters of initialisation, data fetching, and model integration methods, which are not part of the model’s specification per se. The programming environment also includes rules for design such as the formal specification of the names of model parameters. The integrated program in which the ecosystem model exists includes the physical transport model, which itself depends on pre-processing in a hydrodynamic model. The program requires input and output files whose access by the main program is coordinated using a Unix shell program. The preserved Unix shell also acts as a record of the exact circumstances underlying the

A.G. Murray et al. / Environmental Modelling & Software 15 (2000) 357–372

run. Finally, there are are a variety of software tools used to process the output files and tools used to create and to modify nutrient-input forcing files. The programming environment thus extends way beyond the original model to include: other non-model program components, other models, design rules, data data-processing tools and input and output data files.

2. The integrated model structure 2.1. Introduction The core of the integrated environmental model consists of an ecosystem model, which handles the local ecological transformation of material, and a transport model, which moves this matter from place to place and controls the program’s inputs and output. The ecosystem model is the principal subject of this paper, but the transport model provides the physical environment in which this part of the integrated model exists. The forcing for the transport model is derived from a more detailed hydrodynamic model. The interaction of programmers reduces the efficiency of programming because of the need to coordinate all activities. The degree to which programming is inhibited depends upon the degree of interaction. A factor of 8 has been estimated as the difference in coding efficiency (as lines of code per programmer per year) between cases where there are very few interactions and those where there are many interactions (Brookes, 1982). Fortunately the ecosystem processes have very little effect on the transport model. This means that the transport model can handle ecosystem variables with a minimum of knowledge of their nature. Similarly the ecosystem model need not know the processes that take place within the transport model. Thus both the ecosystem and transport model need only a moderate degree of interaction allowing efficient programming. This separation of the transport and ecosystem models in part reflects their separate development and it allows these models to be paired with different partner models in future. Unlike more complex models, such as the European Regional Seas Model (Blackford and Radford, 1995), only one group has handled all of the Port Phillip Bay ecosystem modelling. Since almost all processes in an ecosystem have effects on all other processes and variables, if the ecosystem model were to be split between different groups coordination of programming between these groups would be expected to greatly reduce efficiency of modelling. However, the model includes work on such diverse subjects as water-column plankton ecology, sediment geochemistry, and population dynamics of seagrasses, among other things (Murray and Parslow, 1999b). Any significant increase in model complexity would probably require that the ecosystem mod-

359

elling should be split between different specialist groups or individuals. Because of the lack of clear natural boundaries between different components and the extensive interaction in ecosystems, such a split would result complex interfaces and a severe loss in programming efficiency. By far the most computationally intensive part of the modelling of Port Phillip Bay is the calculation of the currents that hydrodynamic modelling provides. However, once the hydrodynamic model has been validated for a given period its output need not be changed if the biology is changed. The ecosystem model can be run using a transport model which is based on the outputs of the hydrodynamic model; avoidance of the heavy computation requirements of the hydrodynamic modelling allowed the integrated model computation time to be reduced by over three orders of magnitude. Since there is much more certainty about the processes underlying the physics than ecology, only a few runs are required to generate the hydrodynamic modelling results but many hundreds of runs are required to obtain satisfactory ecosystem model output (Murray and Parslow, 1997). Without this separation into the long-but-few run and the short-but-many run components it would be virtually impossible to carry out meaningful analysis of the ecosystem model’s dynamics and spatial patterns. The integrated transport-ecosystem model is typically run for 12-year simulations using three cycles of the observed forcing (weather and inputs from rivers) for the four-year period of 1991–5. The first eight years of a run are to allow the model independence from initialisation conditions and are discarded. Most variables reach dynamic equilibrium quite quickly and respond in the same manner when the forcing cycle is repeated. However, an earlier eight-year run period (with 4 discarded years) could sometimes fail to bring slow growing seagrasses to dynamic equilibrium. Refractory detritus can also take some considerable time to come to equilibrium and so an initial distribution, based on model output from a standard run, is specified at the start of the run. Because the model is close to equilibrium, steady state analysis has proven an effective tool (Murray and Parslow, 1999a). 2.2. Hydrodynamic models Walker (1999) describes the hydrodynamic models used to predict the currents in Port Phillip Bay. A 3D finite difference method using Port Phillip Bay bathymetry is used to calculates these currents. The grid has a 1000 m horizontal resolution (scales of 500–3000 m have been tested) with a 2 m vertical resolution. The entrance to the bay is only 3 km wide, so there were initially problems with the 1 km resolution in this region and smoothing of the modelled bathymetry was required locally, but elsewhere this grid provides realistic results.

360

A.G. Murray et al. / Environmental Modelling & Software 15 (2000) 357–372

The model is forced using winds, external water elevation and water density, the last of which is derived from temperature and salinity distributions which in turn depend on inputs of seawater, freshwater (from streams and precipitation), evaporation, solar radiation and air temperature. The hydrodynamic model provides a good simulation of the currents and tidal elevation observed in Port Phillip Bay in May to June and September to October 1994 (Walker, 1999). It is, however, too slow to be used to drive the ecosystem model because the degree of resolution produces some 2000 points in the horizontal plane, each with up to 10 depths, at which the model equations are solved at simulation intervals of minutes. As a result, the model takes days to complete a standard six-month simulation. The ecosystem model requires the simulation of several years for each run, so several such six-month periods are required. Clearly, a shorter run-time is needed. The hydrodynamic model is used to drive a transport between 59 model boxes (Fig. 1) in which the ecosystem model’s equations are solved in over 3 orders of magnitude less time than the hydrodynamic model takes to run. The transformation from the slow hydrodynamic model to the fast transport model is achieved using a novel Lagrangian particle tracking method (Walker, 1999). The individual boxes are seeded with large numbers of imaginary particles that are then allowed to drift in the currents generated under the hydrodynamic model on the fine scale hydrodynamic model’s grid. After 24

Fig. 1. Spatial structure of the Port Phillip Bay model showing major point sources for inputs of nutrient and water and Bass Strait boundary. The model is divided into 59 boxes (dashed lines) grouped into 8 regions for analysis purposes (Section 5). The regions are: 1 — Sands I the(the Heads); 2 — the Sands II; 3 — East Coast; 4 — Yarra; 5 — North West; 6 — Werribee; 7 — Corio Bay; and 8 — Bay Centre.

h the particle’s positions are determined and assigned to appropriate transport model boxes. The transport between any two boxes can then be calculated on the basis of the relative number of particles that have moved between those two boxes. In multi-layer versions of the model the transport may be horizontal or vertical, but because Port Phillip Bay was vertically well mixed we used a single water-column layer model, with horizontal transport, as standard. The 24 h period for integration is used to minimise the gross tidal currents, leaving only net transport. As material becomes mixed throughout the fine box to which it is transported, transport must be restricted to net transport or numerically generated diffusion can become a severe problem. Transport model results were verified using fortnightly salinity observations from 8 sites for the period 1991–5. The modelling is detailed in Walker (1999). 2.3. Transport model The transport model (Walker, 1999) acts as a link between the hydrodynamic model and the ecosystem model. For the transport (and ecosystem) model the bay is divided into 59 polygonal boxes (Fig. 1). Box boundaries on the onshore-offshore axis were based on the 5 m interval depth contours; on some steep slopes 10 m intervals were used. The alongshore axis boundaries were determined less formally using coastal features such as nutrient sources. These boxes are subdivided into sediment and water-column components (with epibenthic variables such as macroalgae situated between them) that may themselves be subdivided into layers, although this is not standard in the modelling of wellmixed Port Phillip Bay. The transport model takes the net transport between the model’s boxes, which is derived from the hydrodynamic model, to calculate the motion of dissolved or suspended material between those boxes. The transport data is stored in files covering sixmonth periods; pre-generated currents can be assembled from a library to cover a longer period. The resultant spatial distribution of variables generated by the transport model is used to provide initial values to the ecosystem model. The transport model defines the advection of dissolved and particulate matter and the settling out of particles from the water-column; advection includes exchanges with the oceanic water across the Bass Strait boundary. In the sediment, the model describes the processes of resuspension and internal mixing of sediments and solutes (including animal driven bioturbation and bioirrigation). This model calculates stress on the bay’s bottom and this is used by the ecosystem model to drive some of the macroalgae’s mortality (storm damage). The transport model calculates gas flux across the water’s surface; this is of most importance for oxygen. The transport model handles inputs of tracers from point

A.G. Murray et al. / Environmental Modelling & Software 15 (2000) 357–372

sources and from the atmosphere; nutrient inputs from point sources (Fig. 1) handled here are of fundamental importance to the ecosystem model. The program that runs the transport model also handles the output of model results and input of parameters other than those that directly controlling ecological processes. Due to the decoupling of the transport and ecosystem models, numerical problems can be produced if the model’s time-steps are too long. If the time-steps are too short, then unnecessary computational overheads result in reduced performance. A 24 h time step produces good results, although numerical problems may occur in shallow water for fast fast-sinking variables, particularly if there are multiple water-column layers in the model. More details of the transport model can be found in Walker (1999). 2.4. Ecosystem model This part of the integrated model solves the equations describing ecological processes occurring in the integrated model’s 59 boxes. Distributions of variables between these boxes, which are generated by the transport model, act as the initial conditions that drive these equations. The ecosystem model imports parameters for the ecological equations only. The nature of this model is the subject of the next section and is described in more detail by Murray and Parslow (1997, 1999b).

3. The ecosystem model The Port Phillip Bay model exists as an ideal theoretical model per se and as a practical computer program that approximates, allowing solution of, this theoretical model. The model is summarised first and then the methods that have been used to implement this model are described. 3.1. Model overview The ecosystem model describes the transformation of material between a variety of inorganic and organic forms. The model includes 17 variables representing several nutrients, primary producers, grazers, detritus and oxygen as detailed below. Cycles of N, P, Si and, implicitly, of C are included in the model. Organic variables contain a fixed Redfield Ratio of C:N:P, and Si is contained in diatoms (large phytoplankton and microphytobenthos) at a fixed N:Si ratio. Sets of differential equations describe fluxes occurring within the water-column, sediment and epibenthos (the surface of the sediment). These equations are implemented in 59 local boxes that describe the different areas of the bay. The ecosystem model’s local structure is summarised in

361

Fig. 2 for the case of the nitrogen fluxes, which control bay-wide production. Nutrients contained in the model are ammonia, nitrate, phosphate and silica. When calculating nutrient uptake by primary producers the ammonia and nitrate are grouped together as dissolved inorganic nitrogen (DIN). It is DIN supply that limits production over most of the bay and so it is critical that this pool of nitrogen is modelled. This DIN is separated into ammonia and nitrate because there are good data on both of these in the water (Longmore et al., 1996), sediments (Nicholson et al., 1996) and in modelled inputs (Sokolov and Black, 1999). Phosphate never limits production in Port Phillip Bay, but good data are available (references as for DIN) and so P is modelled as a tracer of mixing processes. Si is occasionally limiting to production, particularly near the discharges of the Western Treatment Plant where large inputs of N and P leave a relative shortfall of Si. Due to this Si shortfall, Si must be modelled if local production is to be correctly modelled. Although there are good data on Si within the bay (Longmore et al., 1996), the data on inputs are unfortunately very limited. Primary producers take inorganic materials and, in the presence of light, convert them to organic matter via photosynthesis. The model includes 5 functionally different primary producers, which reflects the high biodiversity of Port Phillip Bay. Small phytoplankton need only low concentrations of nutrient, but grazers restrain their population. They thus have a relatively constant biomass in time and space. Large phytoplankton can take advantage of high nutrient inputs to form blooms, but do require moderate nutrient concentrations (including Si) to grow. They are most abundant around nutrient input sources and at times of highest input, particularly spring. Microphytobenthos (MPB) are single celled algae that grow in the sediments of the bay. Here nutrients are more abundant and so in most parts of the bay they are relatively insensitive to nutrient availability and much more sensitive to light limitation. Their production is equivalent to about 50–60% of that of the bay’s phytoplankton, so they represent a considerable, often ignored, component of this and many other coastal ecosystems. Macroalgae (seaweeds) in Port Phillip Bay contain a biomass much larger than (about 2–3 times) than that of phytoplankton. Since their growth is relatively slow, their contribution to production is approximately 20% of phytoplankton’s, (Murray and Parslow, 1999b); locally, in the north west of the bay, their production may be comparable to that of phytoplankton. Finally, seagrasses are found in oligotrophic waters in the southwestern parts of the bay. They account for a relatively minor component of the bay’s primary production, but they represent a valuable habitat and are a sensitive indicator of water quality and are modelled for these reasons. There are three types of grazers included in the model. Small protozoan zooplankton graze the small phyto-

362

A.G. Murray et al. / Environmental Modelling & Software 15 (2000) 357–372

Fig. 2.

Summary of nitrogen pools and fluxes occurring in a box of the Port Phillip Bay Ecosystem model.

plankton. The rapid growth of these zooplankton enables them to respond to any growth in their prey’s population, thus preventing blooms. Large crustacean zooplankton, which graze large phytoplankton, have a slower response to changes in their prey’s population. This decoupling can allow large phytoplankton to blooms if the system is perturbed. A final category of grazer are consists of benthic filter-feeders, large animals such as scallops or sabellid worms that graze both types of phytoplankton and small zooplankton. They are locally important grazers in enriched shallow waters. Their inclusion in the model has allowed the investigation of potential effects from the recent invasion of an exotic sabellid worm (Murray and Parslow, 1999b). Modelled detritus consists of labile and refractory forms, which tie up considerable amounts of N and P in the sediments. However, these pools only constitute a small fraction of the total observed detrital pool. Most nutrient mass, although little flux, consists of very refractory material which does not turn over on time scales of relevance to the model; this very refractory material is not modelled. Detrital silica is modelled separately from N and P. Dissolved organic matter is the most abundant form of N in the water-column of Port Phillip Bay and thus contributes most of the N exported to Bass Strait. However, the main loss of N from Port Phillip Bay occurs via denitrification, not export (Harris et al., 1996). When organic matter breaks down in the sediments the N is released as ammonia, nitrate or N2 gas; the last is lost to the system. Loss due to denitrification is of critical importance in maintaining water-quality in the bay. Denitrification depends upon locally very rapid processes of biogeochemistry and diffusion/bioirrigation. Modelling such processes requires short time steps and a high degree of spatial resolution (Blackburn and Blackburn, 1993) which could impose very large overheads on execution of the model. To avoid such problems we

developed an empirical model, using observations of fluxes from the bay’s sediment (Nicholson et al., 1996), which relates denitrification to sediment respiration (Murray and Parslow, 1999a,b) and allows us to use a longer time step. The model also contains an oxygen variable; this is currently used only to trace net metabolic processes. The values of the model variables change with time according to a series of ordinary differential equation whose details are discussed by Murray and Parslow (1997). For most variables two sets of these equations exist, one describing the processes in the water column and one for the sediment. Epibenthic variables interact with both the water column and the sediment. The model equations are solved in each of 59 polygonal boxes defined by the transport model (Fig. 1). For purposes of analysis these boxes are often grouped into 8 (or in some cases 9) regions covering the major local environments within the bay. 3.2. The program of the ecosystem model An ecosystem model, unless very simple, cannot be used until it has been implemented as a computer program. Programming is not an automatic translation process; it requires decisions to aid readability and debugging, to maximise coding efficiency and to minimise numerical errors. The ecosystem model summarised above has been coded in C++ and is contained in two files — biology.c and ecomodel.c (Fig. 3). The different files control the operation of logically different processes; i.e. an objectorientated approach has been adopted. The file biology.c is the control file, handling initialisation, input of data from the transport model, preprocessing of these data and integration methods. The integration method calls procedures in ecomodel.c to solve the ecosystem mod-

A.G. Murray et al. / Environmental Modelling & Software 15 (2000) 357–372

Fig. 3. JSP (King and Pardoe, 1985) diagram of the dependence of functions involved in ongoing model operation. The star shows repeated calls,; the square indicates a choice, determined in this case by flag F (arrow from circle).

el’s own equations. The biology file also handles the initialisation and storage of the ecosystem model’s parameters. To aid alteration of the model and to help debugging the program is designed to make updating of the ecosystem model simple. As at the level of the ecosystem/transport model separation, usually only the immediately relevant code, that containing ecosystem model routines and parameter input in the initialisation code, needs to be altered to update the ecosystem model. 3.2.1. Initialisation Model initialisation is handled in a relatively straightforward manner in the biology file. This routine checks the model’s time units, sets up arrays of model variables and inputs the ecosystem model’s parameters. Parameter names in the input file are checked against the program’s internal list, so the order of parameters in the input file is not important. This independence from the inputs file’s structure removes an important potential source of errors. The code also checks that all parameters have values assigned and that there are no excess parameters, thus preventing the use of an inappropriate input file. The input procedure converts time units input from units of day⫺1, which is appropriate for most biological processes, to units of s⫺1, to be consistent with SI time convention that is used in the program code. 3.2.2. Biology The function “biology” controls the post-initialisation behaviour of the ecosystem model. It is called repeatedly by the function main() in the transport model: once for each box in the model on each model time step. The transport model provides the initial concentrations of variables in the model box. The ecosystem equations in each model box are then solved independently of processes occurring in neighbouring boxes. This indepen-

363

dence allows the selection of a local time step (see integration) appropriate to the local ecosystem dynamics. Local areas of rapid variable turnover, and hence stiff differential equations (Burden and Faires, 1988), do not lead to a slowing of the solution of equations throughout the entire model. The biology routine first calls subroutines to calculate the average light intensity for each layer in the watercolumn and at the surface of the sediment. Temperature values are then calculated as a function of time of year; parameters that describe rates of processes are modified to reflect the effect of this temperature. After this preprocessing, which avoids the need to recalculate the parameters each time they are used, the model equations themselves are called for each model layer. The model equations are called via two intermediate routines. The first routine determines the nature of the model layer (water, sediment, epibenthic boundary), this is straightforward and we will not discuss it further. The second routine handles the integration of the model’s solution; the integration routine in turn calls the code holding the model equations for the selected environment. 3.2.3. Light and temperature effects Primary production in the model is forced by the availability of light, while metabolically driven rates are sensitive to temperature. Light and temperature vary from day to day and therefore the rates of the processes these drive vary. However, we can use a single calculation to determine the light field and the rates of processes, given the temperature for a particular day in a model box. We precede the execution of the model proper with these calculations, thus saving computation required to implement these environmental effects as they are used. Primary production requires light and so the model calculates the average light intensity within each layer of the water-column, and that at the surface of the sediments. Surface light is input using a file of observed irradiance intensity. Light then attenuates with depth on the basis of an attenuation coefficient, itself a function of water-column content (chlorophyll, detritus, DOM and the water itself). Light reaching the bottom of one layer of the water-column is used as the surface intensity for the subsequent layers. The average light intensity throughout each layer is found by integrating and then averaging the light intensities throughout that layer. Light at the surface of the sediment is found from the light intensity at the bottom of the deepest layer of the water-column. Implementation of the attenuation model presents a problem in that water-column contents change throughout the model time step and hence, strictly, attenuation varies within the time-step. In the one water-column one sediment layer model all the ecosystem model processes

364

A.G. Murray et al. / Environmental Modelling & Software 15 (2000) 357–372

are coordinated, hence it would be possible to recalculate attenuation for each sub-time step. However in multilayer versions of the model, equations in each layer are independently solved with their own sub-time steps (see integration), hence it is not possible to coordinate attenuation coefficients in each layer. For this reason, and because it saves on computation, average light intensities are calculated on the basis of the initial contents of the water-column as they are passed from the transport model. This light intensity is then passed as a parameter when the model implementation code is called. Rates of biological processes often depend on temperature. At the start of the full model time step the model calculates modified values for all rate parameters based on the current temperature. The names of temperature sensitive parameters contain the suffix FT15, to indicate that these basic values apply at 15°C. This is the average temperature of Port Phillip Bay; it varies from 10°C in winter to 20°C in summer. Temperature is calculated using a simple sinusoidal seasonal signal and is the same for all parts of the water-column; hence the calculations need be made just once per time step to cover all layers in the model. The parameters are modified by the function: t−15

Pt⫽P⫻Q1010

Where where P is the parameter, Pt the temperature modified parameter, t is temperature and Q10 is the factor by which the parameter changes for a 10°C temperature change. Currently all variables use a Q10 of 2 (Murray and Parslow, 1997), but potentially individual parameters could have different Q10, although validation of the use of multiple Q10 values would be difficult. The calculations of light and temperature are carried out in the different files of the code (Fig. 3). Light is simply treated as a single extra variable, and can therefore easily be passed to the ecosystem model code along with all the other variables. Temperature, however, affects a large number of parameters. It is easier and more secure to produce modified versions of these parameters exclusively within the “ecomodel” file where they are required. 3.2.4. Integration Programs that describe continuous processes can usually only ever be numerical approximations of the models they implement. One of the most significant numerical approximations in determining the model’s structure is the selection of the integration method. In the program the model equations are solved using a simple linear Eulearian approximation. A basic model time step length over which the equations are solved is set as a model parameter. However, net changes may be very rapid (equations stiff) and so, for a fixed time step, overshoot of zero may create negative values in the variables. If a

very short time step is used as the default then the model will carry out many unnecessary calculations when solving the equations under more stable conditions, thus reducing the efficiency of the implementation. To avoid both these problems the full time step may be divided into sub-time steps. The sub-time steps are invoked if net change of the most rapidly altering variable over a time step is of more than a fixed percent; this maximum acceptable level of change is set as a model parameter. The sub-time step is then adjusted such that this level of variation is exactly reached by the most volatile variable. This ease of adjustment is the reason for using a simple Eulerian method: the linear relationship of the solution with time allows for linear adjustment of the time step. The model solutions are applied over the subtime step and variables updated. Sub-time steps are called repeatedly as required, until the full time step is used up. If the length of the last sub-time step exceeds the remaining time in the full time step, the sub time step is rounded down to equal this this available time. Exceptions to the rules for adjustment of time steps exist at very low values of some variables whose change is independent of their value. This prevents small absolute changes from delaying model execution. For example, if initial detrital values are very low then the production of detritus due to zooplankton grazing may be large relative to the existing detritus pool. Since this production of the detritus is independent of the existing pool of detritus the rate of production will not change as the pool size changes and so it makes no sense to use a small time step in this case. A similar method, a Eulerian integration routine with a variable time step, has been used in ERSEM, the European Regional Seas Ecosystem Model (Ruardij et al., 1995). In that model the accuracy of the Eulearian method has been compared with a 5th order Runga-Kutta method, with very little effect upon the solution. The separation of physical and ecological processes in the Port Phillip Bay model, which greatly eased model development and testing, means that numerical benefits of adopting a more complex integration scheme are likely to be even less than those under the ERSEM. Recently the model has been revised to incorporate the more sophisticated integration routines dopri5 and dopri8 (Hairer et al., 1990). These methods produce nearly identical results to those obtained under the simple integration routines for standard runs. Specialised stiff equation solving methods were found not to be required as the ecosystem equations exhibit only moderate stiffness. 3.2.5. The ecology/biogeochemistry equations The model’s actual equations will not be described here; they are better described elsewhere (see Murray and Parslow, 1997). As described above, model equations are contained in three routines: water-column, sedi-

A.G. Murray et al. / Environmental Modelling & Software 15 (2000) 357–372

ment and epibenthos. The appropriate routine (as determined by flag F) is called by the integration routine and calculates the rates of change of variables. At the interface between the bottom of the water-column and the surface of the sediment the equations of these two layers and of the epibenthos variables are solved together. This is required because epibenthic organisms continuously transfer material between the water-column and sediment. The standard version of the model has only two layers (1 sediment and 1 watercolumn) and so the epibenthos routine is the only form that is directly called by the integration routine. When the epibenthos routine is called, it in turn calls the routines to solve the model in the neighbouring water-column and the sediment layers. The model then adds to the solutions from these routines the effects of changes due to macroalgae, seagrasses and filter-feeders. These total solutions are then passed back to the integration method. The integration method, as just described, selects a time step in which the most rapidly changing variable alters within acceptable limits. This means that the equations are solved at a frequency dependent upon this variable. It is for this reason that a simplified sediment biogeochemistry model was adopted, since the rapid fluxes and reaction rates found in surface sediments could seriously slow execution of the entire model. To implement most modifications to the ecosystem model it is only necessary to change the code in these routines, and to update the parameter initialisation routine if new parameters are added. Even the addition of new variables, apart from specification of the changes to the variables in the file headers (the .h files of the C++ program), is essentially restricted to these pieces of code. This modularity makes for relatively easy model modification.

365

3.3. Model parameter naming conventions The ecosystem model contains a very large number of parameters and it is easy to lose track of their specific function. In order to increase program readability a series of formal naming conventions has been adopted (Blackford and Radford, 1995). These conventions require a degree of flexibility if names were not to become excessively long (thus undermining readability again) and so a few exceptions to the rules exist, as discussed below. For most parameters the name used is a two or three field structure of the format: Parameter typeFVariable nameFTemperature Sensitivity The parameter type describes the function of the parameter; the variable name is a (usually) 2 letter description of the variables the parameter is associated with; when the temperature sensitivity field is present this indicates that the parameter is temperature sensitive. For example large zooplankton’s clearance rate when feeding at 15°C is: CFZLFT15 while nitrogen uptake half saturation (which is not temperature sensitive) for large phytoplankton is: KNFPL. The parameter name rules are modified in the case of attenuation coefficients to allow the variable name field to include references to factors which absorb light: water, total phytoplankton and suspended inorganic matter. These substances are not among the ecosystem model’s variables per se. Thus, attenuation due to water is: kFw.

3.2.6. Conservation of matter Since input, output and transport are handled by the transport model, the total amount of nutrient nitrogen present in the cell during the ecosystem model’s execution is decreased only by the biogeochemical process of denitrification. Other ecosystem processes just alter the form of the N. This conservation principle is a powerful tool for ensuring the correct operation of the model. Nitrogen level at the start and finish of the execution of the ecosystem model are compared, together with denitrification losses in sediment cells, to ensure that there is no unaccounted for loss. During development of the model this test was applied more widely, but to avoid unnecessary overheads, only a comparison of the N at the start and end of the model’s integration time step is carried out under normal running. The routine that calculates this sum of N is not shown in Fig. 3.

Modification of the use of variable name field is even more extreme for parameters defining the ratios of components in material. These components are elements (C, N, P, Si or O) or chlorophyll and the parameter describes the fixed ratio between two of them. The variable name field includes these two components, for example the Redfield C:N (the carbon:nitrogen ratio of planktonic organisms) ratio is: XFCN. Some parameter names load extra meanings into the first (parameter function) field. For example the fraction of large zooplankton’s mortality that ends up as detritus is FDMFZL.

366

A.G. Murray et al. / Environmental Modelling & Software 15 (2000) 357–372

The three letters in the first field hold three meanings: F for fraction, D for (labile) detritus and M for mortality. A more extreme example is provided by: FDGDLFBF. This parameter name describes a fraction (F) of detritus (D) that is produced by grazing (G) on labile detritus (DL) by benthic filter-feeders (BF). This fraction is different from the fraction of detritus produced when benthic filter-feeders feed on other food sources (FDGFBF), hence the need for a specific name. These extra meanings do not change the nature of the first field:, they simply increase the amount of information that it carries and hence its usefulness. An exception to the usual name structure is provided for the parameters giving maximum biomasses of macroalgae or seagrasses and the maximum fraction of denitrification. These parameters use an unhyphenated name that begins with the variable (or process D for denitrification) name and is followed by max (e.g. MAmax for maximum macroalgal biomass), in the reverse order to that used for most parameter names. This usage was adopted to clearly distinguish these parameters from maximum rate parameters, (e.g. mumFMA, the maximum growth rate of macroalgae). Another exception to the rules is that the middle field of the names of the sediment geochemistry parameters RF0FT15 and RFDFT15 do not refer to model variables. These fields refer instead to the status of geochemical processes. The field 0 refers to the level of sediment respiration (R) at which nitrification reaches zero through lack of oxygen. D refers to the level of sediment respiration at which denitrification consumes all nitrate produced by nitrification (Murray and Parslow, 1999b). This use of formal rules, with some flexibility, allows the meaning of parameters to be quickly established, a great asset for debugging or code modification. It is not necessarily intuitively obvious what the meaning of an individual field is because names are short and sometime arbitrary (e.g. X for material composition ratios), however longer names can lead to loss in readability of the program code. Experience soon gives a familiarity with these short names that was not easily obtained (but was easily lost) using an earlier, less systematic, notation.

4. Input and output of the integrated Port Phillip Bay model 4.1. Introduction A complex model such as the Port Phillip Bay model requires a large amount of data of a range of types to operate. It also produces a great deal of output, which

requires post-processing using a number of purpose build and commercially available packages. Separation of the data into different files makes these processes manageable. Most often, the ecosystem modeller wishes to alter the ecosystem model’s own parameters, contained in one of the input files. However, other files contain changeable data of direct importance to the ecosystem modeller. Most input and output are actually handled in the physical transport model’s code. The ecosystem modeller may need to change the data in the files but should not need to alter the code in the input routines. The ecosystem model does handle the input of its own parameters, and certain specialised outputs such as screen error messages and an operations log. For these, small coding changes may sometimes be required. A Unix shell program contains the names and addresses of input files accessed and the program version executed. The preservation of this shell allows output files to be directly connected to their specific run environments and, if necessary, recreated. Output is into a file named on running the shell, by convention this output file has the same name as the shell program, but version numbers may be added if re-runs are required. The principal model inputs are: Ecosystem parameters Physics parameters Run control parameter Model and variable properties External forcing values Bass Strait Boundary conditions And outputs are: run time log main output file The input files, together with the version of the model to be used, are specified in a Unix shell program. The output file’s name is specified as a parameter of the shell: by convention the same name as that of the shell program is used. When any input file is changed a new version, under a new name, is created; the old version is preserved under its existing name and a description is saved in a text file. Thus the shell program can be rerun at any time, allowing output from old runs to be restored if they should become lost or corrupted. The program uses SI units to describe variables and fluxes. However the ecological model’s parameter file uses units of days, and mg (usually of N), since these are more familiar from the biological literature. Time is converted to seconds on input to allow efficient interaction with times obtained from the transport model. The issue of units of time represents a major potential problem, since it is difficult to debug input files in unfamiliar

A.G. Murray et al. / Environmental Modelling & Software 15 (2000) 357–372

units, while it is also difficult to debug the program code if different unit systems are used. Variables in the ecosystem model are, however, defined in the non-SI terms of mg N m⫺3 (or mg P m⫺3 or mg Si m⫺3) because the transport model does not need to know the units used, only the relative concentrations in the different boxes. The transport model also possesses the ability to convert units. This is particularly useful for specifying the lengths of model runs, which are usually 12 years (consisting of 3 cycles of 4 years, 8 years of initialisation run being discarded), in intuitive days rather than seconds. 4.2. The ecosystem model’s parameters This file consists simply of a list of all the ecosystem model’s parameters, comprising of the parameter name, its value, a short description, units and a standard value or range of values. The parameters present are those listed in Murray and Parslow (1997). As well as model parameters, this file includes the operational parameter. Parameter RelFTol determines how much the most sensitive variable may change before the time-step is changed. FlagSeason specifies whether seasonal cycles of light and temperature are included in this model run, or constant values are to be used. The parameter ecotest determines the level of detail that is recorded in the run’s log. The ecosystem model’s parameters file are read in and stored locally by the ecosystem model when this is initialised. The order of data in this file is not important as the file reading routine in the program checks the names in the file against an internal name-list. This check also allows the model to reject inappropriate versions of this file since missing or unexpected parameters are detected. In the file, units of time are defied in terms of days since this is most familiar for most literature values on biological process rates. On input, time units are converted to SI units of seconds, in line with usage in physical modelling. It can be very difficult to detect errors in this file if process rates are expressed in unfamiliar units. 4.3. Transport model’s parameters This parameter file contains a series of parameters concerned with mixing of water-column and sediment and erosion of sediment. Alteration of this file is rarely undertaken during ecosystem modelling because it only affects ecosystem processes indirectly. 4.4. Run control parameters This file consists simply of parameters that control the model time step size, the number of time steps in the run, the time step at which output is started and the interval at

367

which this output occurs. The values in this file are generally specified in days for convenience of interpretation; they are converted to seconds on input. 4.5. Model and variable properties The structure of the model, including the physical properties of the model’s variables, is described in this file which is read in by the program when the transport model is initialised. This file contains several component data types. First the file lists a variety of data on aspects of the physical model and its structure and variables used in the physical model. There then follows a list of the ecosystem model’s variables and their properties, including diagnostic variables. The diagnostic variables contain the value of major fluxes, which are stored in the output file as useful tracers of the model’s behaviour. The file finishes with the initial values, by fine box, of selected variables including refractory detritus. These distributions of initial values reduce the time required for the model to come to equilibrium. If a new variable is added to this file, the transport model is automatically altered to initialise, store, transport, and output their values. Of course, the ecosystem model still has to be altered to include the equations of these new variables, but coding alterations are restricted to the immediate area of concern. The two properties of these variables of most immediate relevance to the ecosystem model are contained within the fields called “svel” and “FFillFval”. These control the variables settling velocity and initial value (and this includes default boundary conditions not specifically set in the boundary condition file, since initial values in the Bass Strait box do not change). Note that this field is expressed in SI units, which tend to be very small for sinking rates given in m s⫺1. 4.6. Input forcing parameters The external environment, such as weather and input from rivers, drives the internal environment of Port Phillip Bay. To replicate this the integrated model is driven by predefined forcing inputs, such as nutrients and light, from both point and diffuse sources. Other forcing factors (such as winds) are used by the hydrodynamic model to generate the current fields that are used to drive the transport model. A coordinating file handles all of these various input forcing functions. Some of these input files are generated, and may be modified, using further programs. The overall forcing file accesses a large number of different components contained in other files. Firstly there is a list of water-current files, calculated from the particle tracer modelling for six-month periods each, which allow the pre-calculated water-current regimes to

368

A.G. Murray et al. / Environmental Modelling & Software 15 (2000) 357–372

be included in the model. This transport forcing data controls the years that the ecosystem model can replicate, longer runs must repeat years. This is followed by a file of bottom stress values and then two sets of files of inputs from freshwater sources: inputs of freshwater for the transport model and inputs of nutrients for ecosystem model. The number of input sources used may vary between runs. There are then files of precipitation, evaporation and solar radiation. The ecosystem model makes direct use of the inputs of solar radiation for photosynthesis and of bottom stress, which removes macroalgae. Inputs of nutrients have most effect on the ecosystem modelling, as these nutrients drive the productivity and structure of the ecosystem and are also most easily influenced by management changes in the Port Phillip Bay catchment. The nutrient point sources included in the model are the Western Treatment Plant, the Yarra River, the Patterson River, and Mordialloc Creek, Kororoit Creek and the Werribee River (Fig. 1). The Western Treatment Plant is the largest single source of nutrients into Port Phillip Bay. Data on its inputs have been obtained from weekly measurements of concentrations multiplied by flow. There is only a limited amount of data on inputs of nutrients from the Yarra River and from the linked Patterson River and Mordialloc Creek system. Inputs were derived using a model developed by Sokolov and Black (1999). These streams account for the great majority of non-Western Treatment Plant nutrient inputs. Silica inputs from the Patterson-Mordialloc system, in the absence of good data, were added later using a simple fixed concentration and observed flow. Inputs from the Werribee River and Kororoit Creek, for which there were little data, were derived using constant concentration and observed flows. These minor creeks are only significant nutrient sources during brief flood events. Sokolov and Black’s (1999) model of nutrient inputs from the major rivers calculated this input for a given river using a function of the calculated concentration in the catchment and the measured flow of water in the river. Nutrients accumulate in the catchment with time and are removed by the river’s flow. Thus a flood occurring after a long period of low flow, during which nutrients could accumulate in the catchment, would input more nutrient than a similar flood occurring after a period of high flow. Parameters were obtained from the observed concentration and flow time series. Inputs from the Patterson-Mordialloc system are complicated because the Patterson River receives overflow from the Mordialloc Creek and so the model had to include this exchange of waters with different properties. The nutrient input files have been modified in order to allow the execution of model nutrient load scenarios (Murray and Parslow, 1998). When other files are modified only a few variables are changed between versions and so the changes can be made by hand. When the input

forcing files are modified thousands of individual values of the variables have to be changed. A series of simple shell tools have been developed to modify all of the time series or just some of them, such as the N inputs. For most loading scenarios the basic time series are simply multiplied by some constant. 4.7. Bass Strait boundary Port Phillip Bay exchanges water, containing suspended/dissolved substances, with the adjacent ocean (Fig. 1). A file of these boundary conditions in the neighbouring Bass Strait has been derived (Murray and Parslow, 1997) based on observed concentrations at the bay’s entrance (Longmore et al., 1996). This file gives constant values of concentration of major variables that apply in Bass Strait. Exchange across the boundary is determined by the concentration difference between these values and those in the adjacent part of the bay, as well as the rate of exchange of water between Port Phillip Bay and Bass Strait. The transport model carries out this exchange. 4.8. Output log The log file stores the run time messages derived from a model run and so it shows the model’s history of execution. This file is mainly useful for debugging. The level of detail of the recorded messages is controlled by the user and is defined by the parameter ecotest found in the ecosystem model parameter file. 4.9. Main output file The model’s main output file is, by convention, saved under the name of the Unix shell file that spawned it. The output file is stored in NetCDF (Net Common Data Form) format. This is a machine-independent selfdescribing data format that allows ancillary information, such as names, units or descriptive text, to be associated with the data (see http://www.unidata.ucar.edu/ for details). The format is a robust means of exchanging data between different users. The data can be accessed by the purpose built GUI Dview to enable visualisation of maps and time series on screen and as hardcopies. The data stored consists of a list of model variable values for each model box and depth that are output at a time interval and from an initial time specified in the model run input file. Outputs also include the model diagnostic variables, which contain the values of major fluxes such as primary production. Units of both model variables and diagnostic variables are specified in the input file that contains the properties of variables. An example of the files involved is shown in Fig. 4, which shows the set of files used for a run called sirun142 (see Murray and Parslow, 1997). The shell code

A.G. Murray et al. / Environmental Modelling & Software 15 (2000) 357–372

369

Fig. 4. The files involved in an example model run (sirun142) of the Port Phillip Bay model. Sirun142 contains the Unix shell program, bm45 contains the C++ program of the full integrated model, while the other boxes are data files.

in file sirun142 causes test19.cdl (model and variable properties) and BassSt (boundary) data to combine forming test19.nc. The file then calls the model execution code, in this case bm45. The model, on execution, inputs from the files: test19.nc, physical.prm (physics), run12y.prm (run control) forci91956F1.prm (forcing) and newpara62.prm (ecosystem model). Output is stored in log2 and in sirun142.nc. The output file’s name is not specified in the shell, but must be input when this is run. Thus to run this program all that is required, if all the files are available, is to type: “sirun142 sirun142”.

5. Analysis of the model’s output 5.1. Introduction The Port Phillip Bay ecosystem model’s principal output consists of long time series for each of the individual model boxes of each model variable. It is possible to use these raw time series, particularly for comparison of model outputs with observations at fixed stations. However, for many purposes the raw data is an information overload and it must be refined into some more useful format. A variety of tools have been developed, from a sophisticated GUI to simple EXCEL spreadsheets, to allow efficient, and easily reproducible, analysis. The use of simple tools with only a few functions allows the appropriate analysis to be selected without the overhead of unnecessary computation or data storage. Analysis also leads to the automatic generation of standard records that document the major feature of the run’s output. The model output is generated by the 59 local boxes of the model (Fig. 1). For many purposes of analysis

purposes the totality of this bay-wide variation is excessive and so the boxes are grouped into 8 (or 9) regions, whose boundaries are defined on the basis of their environmental characteristics (see Murray and Parslow, 1999b). In some cases the results from the fine boxes are grouped by these regions (see Fig. 5a and b) in other cases the properties of the boxes are combined to determine regional properties (see Fig. 5c and d). Some of the more frequently used types of data analysis are described in this following section (Fig. 5). These aim to present the model’s results as clearly and succinctly as possible to allow the detection of patterns in the model’s behaviour using rapidly comparison between different runs and to evaluate this behaviour with respect to the observations. The analysis approach is flexible, some runs being rapidly rejected as unrealistic, and the full range of model — data comparisons are carried out only for selected runs. As a range of software tools is used for analysis, a range of data formats is required. The raw output is in NetCDF format which can be directly accessed by some visualisation routines, but use of MATLAB and EXCEL requires that the selected data be converted to appropriate formats. This conversion does impose some computation time and storage overheads, but the time overhead is small relative to the program run time and space used is small relative to the model’s full output file. As conversion can be done concurrently with further model runs, it does not significantly impact model experimentation time: frequently-used data are stored in the appropriate format and so the overhead is minimised. The range of tools used is appropriate for an experimental model for which the types of model outputs and analyses required, are frequently updated. However, in

370

A.G. Murray et al. / Environmental Modelling & Software 15 (2000) 357–372

Fig. 5. Examples of model output for chlorophyll from run sirun142. (a) Model box time series output. O, only 3 out of 59 of the model boxes’ output is shown in this figure and already difficulties are apparent. (b) Average concentrations by model box. Spatial patterns are apparent. (c) Cruise track model output and observations for 1 (region 4= Yarra) of 8 regions of Port Phillip Bay. These integrated time series do allow model temporal behaviour to be compared between runs particularly in key regions. (d) Percentile variation by regions. This figure allows the range, but not timing, of variation to be rapidly evaluated for all the regions.

the longer-term a fully standardised set of analyses in a single package is a priority development goal. Such a standardised package is a particularly useful tool for future users who were not involved in the model’s development phase. The package must be able to plot the spatial and temporal variation in variables and fluxes, which an existing GUI Dview already does. It must also be able to integrate these variables and fluxes over space and/or time and to allow the results to be recorded in accessible formats. The analyses presented allow a wide range of both mean values and spatial and temporal variation of nutrient pools and fluxes to be visualised and compared with observations. This visualisation allowed us to search out an optimal fit between the model’s output and the observations. Two approaches to model-data fitting were used. For bay-wide long-term mean values, a formal analysis of simplified models at steady state was used to obtain a sophisticated understanding of the model’s dependence on parameter values (Murray and Parslow, 1999a). This was supplemented by a sensitivity analysis of the full model, including an analysis by region of the parameter sensitivity of chlorophyll (Murray and Parslow, 1997). To fit the model’s output to observed spatial and temporal variation about this mean (Murray and Parslow, 1999b), a guided heuristic search was used. Given the large number of model parameters and the significant model run time, exhaustive systematic formal analysis was impossible.

5.2. Local time series

The raw time series of model results stored for each model box can be accessed and plotted using a variety of packages. This is very simple in principle since these time series are the raw output of the model — but appropriate presentation packages are required or the user becomes lost in the sea of data. Spread sheets have been set up to plot the time series of chlorophyll and nutrients for each of the model’s fine boxes (Fig. 5a). Since there are too many time series to plot on one graph, they are plotted as 9 graphs and arranged in groups that correspond to the environmental regions of the bay (the bay centre region is split in two for this analysis). These spreadsheets can also plot observed data on chlorophyll or nutrients on the same graph as the model results are plotted. The observations that are plotted are the mean value observed in each box on a specific date. These data comes from a series of 25 monthly cruises conducted in 1993–5 (Longmore et al., 1996). Similar comparisons of fine box results and observations can be made for a smaller number of boxes in which there are sampling stations. These stations were sampled fortnightly (instead of monthly), were sampled for more variables, and (in some cases) pre-date the study. This allowing allows time series to be accessed back to before mid 1991, the time at which our model simulations begin.

A.G. Murray et al. / Environmental Modelling & Software 15 (2000) 357–372

5.3. Simple means The output that allows the most rapid comparison of the model’s results with observations, and between results obtained under different model runs, is the baywide annual mean value for a major flux or variable. The bay-wide mean can be found by multiplying local concentrations by the local box’s volume, summing the product over all of the bay’s boxes, and dividing by the bay’s total volume. This procedure can also be carried out as a standard analysis at a regional level too; as discussed, eight regions cover the principal environments of Port Phillip Bay (Fig. 1). A MATLAB program has been written to calculate these bay-wide and regional mean values of major pools and fluxes. This analysis is carried out on the results of all model runs immediately following their completion and the selected mean values and fluxes are stored automatically to give an easily accessible record of the model’s development. Spatial variation can be obtained from means of individual model boxes and can be visualised by plotting these values either as a map or simply against box number (Fig. 5b). The value of mean values versus box number is that it is possible to plot more than one data type in the graph, a difficult task with a map. Plotting both model output and observations of chlorophyll and nutrients in those boxes that were sampled on at least 21 of the 25 cruises allows the model’s spatial performance to be evaluated (Murray and Parslow, 1999b). 5.4. Cruise track weighted means The problem with using bay-wide mean values for model-data comparison comes not from the model’s output but from errors that arise in the estimation of the real bay-wide value by interpolating patchy or biased observations to cover the whole bay. The 25 underway surveys conducted in Port Phillip Bay (Longmore et al., 1996) used continuous sampling from moving vessels to obtain very large data sets from many regions of the bay. However, even such large data sets are biased as a result of the sampling being concentrated in regions with most environmental variability. One simple method of dealing with this bias in the observations is to apply the same bias to the model results. Model results can only be biased to follow the cruise track for those days on which a cruise took place. Model results for a box are multiplied by the number of samples that were taken in that box on the cruise, this product is summed over all of the model’s boxes, and then divided by the total number of samples taken (in all boxes) on that cruise. This operation obtains the average model output along the cruise track, which can be directly compared with the average observed value along the cruise track (Fig. 5c). Biases caused by gradients at scales smaller than the model’s boxes can still occur, particularly in the vicinity of nutri-

371

ent input sources. Plots of observations and cruise-track weighted model results are created using a MATLAB program.

5.5. Variability

Water quality managers (and the public) are particularly concerned about large phytoplankton blooms. Such blooms can be environmentally damaging even if these they are quite brief and so are not apparent from averaged chlorophyll values. Mean values are a convenient means of comparing observations with model results, but the model should also capture the observed variation in variable values. It is possible to use time series for this analysis; the regional cruise-track-biased results are compared with observations in this way, as are time series for individual boxes. However, a more condensed form of output is more practical for analysis of bay-wide patterns and in particular for viewing the changes in these patterns that occur between different runs. Such analysis also allows more objective comparisons of modelled and observed variability than can be obtained from raw time-series. One simple method of comparison of the model’s calculated variation and the observed variation for a given variable is to find the mean variance of that variable that was observed and modelled in each of the model’s boxes. When these two sets of data are plotted against box number, this approach gives a spatially-detailed picture of the model’s replication of variability. This approach is similar to the plotting of spatial mean values. A more detailed picture of the variation in variables can be obtained using box and whisker plots of percentile values (showing 10, 25, median, 75 and 90%, see Fig. 5d). This approach is particularly useful since many water quality regulations and guidelines are expressed in term of percentiles (EPA, 1995). It has been used to show the sensitivity by region of variability in the major model variables to changing input scenarios (Murray and Parslow, 1998). A MATLAB program is used to combine as a single list the model results for all of the boxes in a region, sort this list and then find the values at the locations in the array that correspond to the given percentiles. These percentiles are then plotted in box-andwhisker format using excel EXCEL spreadsheets. Similar plots can be obtained using data from observations. The percentiles obtained by the above methods can be used to compare details of variation in model results and between the results and the observations by calculating the relative values of percentiles obtained in different runs, or between observed and modelled data. If model results are to be compared with observations then those periods for which observations are lacking must be removed from the comparison.

372

A.G. Murray et al. / Environmental Modelling & Software 15 (2000) 357–372

6. Conclusions Ecosystems are complex structures whose modelling presents a formidable series of problems. Ecosystems exist in a larger physical environment and the behaviour of this must be included in the model. This is particularly true of marine ecosystems in which physical processes play a direct role in most aspects of the ecosystem’s behaviour and structure. The program itself exists in a wider programming environment of data files and software tools. The result is a large and potentially unwieldy set of software, whose development may require the collaboration of many different researchers. In this paper we have looked at practical approaches that have been used to aid the development of an applied ecosystem model of Port Phillip Bay. These methods approaches include 앫 Use of a transport model to allow application of detailed hydrodynamic model results in a minimum run time. 앫 Minimising the interaction of transport and ecosystem modelling, allowing efficient independent development of both models. 앫 Using separate files for the book-keeping routines and the model’s own routines in the ecosystem modelling code. 앫 A Eulerian integration method which tailors time step size to minimise run time. 앫 A formal naming convention for variables. 앫 SI units are default, but the model is flexible, accepting, and converting on input files in other units. 앫 A Unix shell structure that coordinates input files and serves as a full record of specific runs. 앫 A selection of software tools that allows data to be quickly and consistently analysed. The approaches adopted here have proven to be flexible, efficient, comprehensible and reliable. They have allowed the development and testing of an applied model of a complex ecosystem by a small group of people in a fairly short time. The modelling environment proved itself to be adaptable during the last stages of the Port Phillip Bay project, allowing the model to take advantage of observations immediately after their their final reporting was completed.

Acknowledgements This model was developed as a part of the Port Phillip Bay Project, funded by Melbourne Water and managed by CSIRO. We thank Drs Ken Ouyang and Paval Sakov

for their contributions to the coding of the program and Drs David MacDonald and Tony Smith and two anonymous referees for reviewing the manuscript. References Blackburn, N.D., Blackburn, H.T., 1993. A reaction diffusion model of C-N-S-O species in a stratified sediment. FEMS Microbiology Ecology 102, 207–215. Blackford, J.C., Radford, P.J., 1995. A structure and methodology for marine ecosystem modelling. Netherlands Journal of Sea Research 33, 247–260. Brookes, F.P. Jr, 1982. The Mythical Man-Month. Addison Wesley, Reading, Mass. Burden, R.L., Faires, J.D., 1988. Numerical analysis, 4th ed. PWSKent, Boston. EPA, 1995. Protecting water quality in Port Phillip Bay. Publication 434, Environment Protection Authority (Victoria), Melbourne, Australia. Firestone, D., Henderson-Sellers, B., Graham, I., 1998. OPEN Modeling Language (OML) Reference Manual. Cambridge University Press, Cambridge. Hairer, E., Norsett, S.P., Wanner, G., 1990. Solving Ordinary Differential Equations I. Nonstiff Problems. Springer, Berlin. Harris, G., Batley, G., Fox, D., Hall, D., Jernakoff, P., Molloy, R., Murray, A., Newell, B., Parslow, J., Skyring, G., Walker, S., 1996. Port Phillip Bay Environmental Study Final Report, CSIRO Press, Canberra, Australia. [Port Phillip Bay reports are collectively available on CD-ROM: Molloy, R. (Coordinator) 1999. The Port Phillip Bay Environmental Study. Technical Reports. CSIRO Publishing, Collingwood, Australia (www.publish.csiro.au)] Longmore, A.R., Cowdell, R.A., Flint, R., 1996. Nutrient status of the water in Port Phillip Bay. Port Phillip Bay Environment Study Technical Report no. 24, Canberra. King, M.J., Pardoe, J.P., 1985. Program Design Using JSP. MacMillan, London. Murray, A.G., 1990. Ada Tasking as a Tool for Ecological Modelling. Ada Letters X, 85–90. Murray, A.G., Parslow, J.S., 1997. Port Phillip Bay Integrated Model: Final Report. Port Phillip Bay Environment Study Technical Report no. 44, Canberra. Murray, A.G., Parslow, J.S., 1998. Port Phillip Bay Integrated Model: Input Scenarios. Port Phillip Bay Environment Study Technical Report no. 46, Canberra. Murray, A.G., Parslow, J.S., 1999a. The analysis of alternative formulations in a simple model of a coastal ecosystem. Ecological Modelling 119, 149–166. Murray, A.G., Parslow, J.S., 1999b. Modelling nutrient cycles and impacts in Port Phillip Bay — a semi-enclosed marine Australian ecosystem. Marine and Freshwater Research 50, 597–611. Nicholson, G.J., Longmore, A.R., Cowdell, A.R. 1996. Nutrient status of the sediments of Port Phillip Bay. Port Phillip Bay Environment Study Technical Report no. 26, Canberra. Ruardij, P., Baretta, J.W., Baretta-Bekker, J.G., 1995. SESAME, a software environment for simulation and analysis of marine ecosystems. Netherlands Journal of Sea Research 33, 261–270. Sokolov, S., Black, K.P., 1999. Long-term prediction of water quality for three types of catchment. Marine and Freshwater Research 50, 493–501. Walker, S.J., 1999. Coupled hydrodynamic and transport models of Port Phillip Bay, a semi-enclosed bay in south-eastern Australia. Marine and Freshwater Research 50, 469–481.