environmental science & policy 11 (2008) 115–124
available at www.sciencedirect.com
journal homepage: www.elsevier.com/locate/envsci
Uncertainty in water quality data and its implications for trend detection: lessons from Swedish environmental data
Karl Wahlin, Anders Grimvall *
Department of Computer and Information Science, Linköping University, SE-58183 Linköping, Sweden
article info Published on line 7 February 2008 Keywords: Environmental monitoring Water quality Trend detection Data quality Statistical analysis
abstract The demands on monitoring systems have gradually increased, and interpretation of the data is often a matter of controversy. As an example of this, we investigated water quality monitoring and the eutrophication issue in Sweden. Our results demonstrate that powerful statistical tools for trend analysis can reveal flaws in the data and lead to new and revised interpretations of environmental data. In particular, we found strong evidence that long-term trends in measured nutrient concentrations can be more extensively influenced by changes in sampling and laboratory practices than by actual changes in the state of the environment. On a more general level, our findings raise important questions regarding the need for new paradigms for environmental monitoring and assessment. Introduction of a system in which conventional quality assurance is complemented with thorough statistical follow-up of reported values would represent a first step towards recognizing that environmental monitoring and assessment should be transformed from being a system for sampling and laboratory analyses into a system for interpreting information to support policy development. © 2008 Elsevier Ltd. All rights reserved.
1. Introduction
Regular measurements of the state of the environment have long provided important decision support. However, the demands on monitoring systems have gradually increased. When existing monitoring programmes were devised, it was not unusual that important questions could be answered by simple surveys and very basic data analyses. The environmental issues now receiving the most attention, such as diffuse water pollution, are more complex. Understanding long-term changes in the state of the environment represents a priority, but the amount of data required for representative and meaningful assessments can be substantial, and the interpretation of the data is often a matter of controversy. In this article, we address data quality issues and the need for more efficient statistical analyses of water quality data as a means of assessing the achievement, or
otherwise, of key environmental objectives. Taking the Swedish water quality monitoring and the eutrophication issue as an example, we examine to what extent powerful statistical tools for trend analysis can reveal flawed data and lead to new revised interpretations of environmental data used to provide diffuse pollution policy support. In particular, we discuss the interpretation of data that can be used to assess progress towards water-related Swedish environmental objectives, such as Zero Eutrophication, and two specific interim targets regarding waterborne anthropogenic emissions of nitrogen and phosphorus compounds into sea areas (SEPA, 2007). The implementation of the EU Water Framework Directive (WFD) has created a river basin or water district structure within which environmental objectives are adopted. This calls for joint evaluation of data collected at a number of sites, during different seasons, and over a period of several years
* Corresponding author. Tel.: +46 13 281482; fax: +46 13 142231. E-mail address:
[email protected] (A. Grimvall). 1462-9011/$ – see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.envsci.2007.12.001
(Clement et al., 2006). Furthermore, we must be able to separate anthropogenic trends from weather-driven fluctuations and systematic or random errors in the sampling, sample handling and laboratory analysis of the collected samples. The latter task requires statistical methods that suppress spurious trends caused by large-scale or cyclic components in the weather conditions. Also, it must be taken into consideration that changes in sampling, sample handling, or analytical methods in the laboratory can give rise to misleading synchronous increases and decreases in the observed concentration levels. The classical statistical techniques for extracting anthropogenic trends from observational data do not meet these demands. In particular, there is a considerable risk that moderately large errors impairing data from a large number of sites over long periods of time can be overlooked if data exhibit a significant natural variability and they are analysed station by station. Therefore, we propose that a class of semi-parametric regression techniques (Grimvall et al., 2007) be used to elucidate complex space–time phenomena characterising longer-term environmental monitoring data associated with water quality assessment. In the following sections, we examine a set of water quality data from a river mouth monitoring programme (http:info1.ma.slu.se/ma/www_ma.acgi$Project?ID=Intro) in Sweden. This dataset was selected because it was believed to be of especially good quality. In particular, it can be noted that the same laboratory has been responsible for the sampling and chemical analyses since the onset of the programme, and that this laboratory has long practiced standard quality assurance. Two questions were posed at the outset of our investigation:
Are there significant spatio-temporal patterns that would have been overlooked if data from each site had been analysed separately? Are the statistical trends that can be derived from the monitoring data contaminated by errors in the sampling, sample handling, and laboratory analysis? A brief discussion of the need for a paradigm shift in environmental monitoring for water quality concludes the paper.
2. Database
Our study comprised data from 34 rivers in Sweden (Fig. 1), with drainage areas ranging from 100 to 47,000 km². To be more precise, we procured almost monthly concentration records (Table 1) and almost daily water discharge records from these rivers. In addition, we analysed time series of water quality data from one site (Jungfrun) in Lake Vättern and two sites (Megrundet and Dagskärsgrund) in Lake Vänern.
3. Statistical methods
The assembled data were analysed in three steps. Firstly, we used concentration and discharge data to compute monthly riverine loads. Secondly, we adjusted (or normalized) these data to remove as much as possible any weather-driven fluctuations, and computed adjusted annual flow-weighted concentrations for each site. Finally, we examined spatiotemporal trends in the annual summaries for all rivers with the same recipient.
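The first step and the annual summary can be illustrated with a minimal sketch. It assumes that each monthly concentration (mg/L) is representative of the whole month and that daily mean discharge is given in m³/s; the function names are hypothetical, and the sketch omits the adjustment step, so it yields an unadjusted flow-weighted concentration rather than the adjusted values used in this study.

```python
SECONDS_PER_DAY = 86_400


def monthly_load_and_volume(conc_mg_per_l, daily_discharge_m3_per_s):
    """Return (load in kg, water volume in m^3) for one month.

    Assumption: one concentration value represents the whole month.
    mg/L equals g/m^3, so concentration * volume gives grams.
    """
    volume_m3 = sum(q * SECONDS_PER_DAY for q in daily_discharge_m3_per_s)
    load_kg = conc_mg_per_l * volume_m3 / 1000.0
    return load_kg, volume_m3


def annual_flow_weighted_conc(monthly_concs, monthly_daily_flows):
    """Annual flow-weighted concentration (mg/L) = total load / total volume."""
    total_load_kg = 0.0
    total_volume_m3 = 0.0
    for conc, flows in zip(monthly_concs, monthly_daily_flows):
        load_kg, volume_m3 = monthly_load_and_volume(conc, flows)
        total_load_kg += load_kg
        total_volume_m3 += volume_m3
    # kg -> g, then g/m^3 is numerically equal to mg/L
    return total_load_kg * 1000.0 / total_volume_m3
```

Because the months are weighted by water volume, a high-flow month with a high concentration dominates the annual value, which is why weather-driven variation in discharge must be dealt with before trends are interpreted.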
Fig. 1 – Map of Sweden with sampling sites for the investigated rivers and lakes and their recipients (KS: Kattegat and Skagerrak, SB: Southern Baltic, GB: Gulf of Bothnia, VÄT: Lake Vättern, VÄN: Lake Vänern).
Table 1 – Water quality parameters and time span of the measurements

| Water quality parameter | Time span | Number of observations (a) |
|---|---|---|
| Total nitrogen using persulphate digestion | 1988–2005 | 200–222 |
| Kjeldahl nitrogen | 1980–2005 | 282–312 |
| Sum of nitrite and nitrate nitrogen | 1980–2005 | 283–312 |
| Ammonium nitrogen | 1980–2005 | 283–312 |
| Total phosphorus | 1980–2005 | 283–312 |
| Phosphate phosphorus | 1980–2005 | 283–312 |
| Total organic carbon | 1987–2005 | 205–228 |
| Chemical oxygen demand (permanganate consumption) | 1980–2005 | 283–323 |
| pH | 1980–2005 | 283–323 |
| Absorbance (420 nm, 25 °C, filtered and unfiltered samples) | 1980–2005 | 283–323 |

(a) Sampling was undertaken more frequently in the Skivarpsån River and less frequently in the Gide and Alsterån Rivers than in the other rivers.
3.1. A general semi-parametric regression model for water quality trend analysis
All statistical analyses in the present study were carried out using a general semi-parametric regression model for multiple time series of data. The input to that model comprised an m-dimensional vector time series:

$$ y_t = \left(y_t^{(1)}, \ldots, y_t^{(m)}\right)', \qquad t = 1, \ldots, n $$

representing the observed state of the environment at n equidistant time points, and a matrix

$$ x_t = \begin{pmatrix} x_{1t}^{(1)} & \cdots & x_{pt}^{(1)} \\ \vdots & & \vdots \\ x_{1t}^{(m)} & \cdots & x_{pt}^{(m)} \end{pmatrix}, \qquad t = 1, \ldots, n $$

that included contemporaneous values of p explanatory vectors representing natural fluctuations in $y_t$. The general model structure was defined by the equation system:

$$ y_t^{(j)} = \alpha_t^{(j)} + \sum_{i=1}^{p} \beta_i^{(j)} x_{it}^{(j)} + \varepsilon_t^{(j)}, \qquad j = 1, \ldots, m; \quad t = 1, \ldots, n $$

where the sequence of vectors $\alpha_t = \left(\alpha_t^{(1)}, \ldots, \alpha_t^{(m)}\right)'$, $t = 1, \ldots, n$, represents a deterministic temporal trend,

$$ \beta = \begin{pmatrix} \beta_1^{(1)} & \cdots & \beta_p^{(1)} \\ \vdots & & \vdots \\ \beta_1^{(m)} & \cdots & \beta_p^{(m)} \end{pmatrix} $$

is a matrix of time-independent regression coefficients, and the error terms $\varepsilon_t^{(j)}$, $j = 1, \ldots, m$, $t = 1, \ldots, n$, are independent of each other and of the explanatory variables. The model parameters were estimated by imposing smoothness conditions on the intercepts, and the degree of smoothing was determined by cross-validation. Uncertainty (bootstrap) intervals were computed by resampling the model residuals. Further details have been reported by Grimvall et al. (2007). Compared to other methods for semi- or non-parametric regression, such as Gaussian smoothers and thin plate splines (Härdle, 1997; Hastie et al., 2001), an advantage of our approach is that the smoothing pattern can easily be tailored to different types of relationships between the vector components. In particular, we give explicit smoothing expressions for data collected over several seasons or representing several sampling sites along a gradient or ordered on a linear scale.

3.2. Meteorological adjustment
Weather-driven fluctuations in the measured concentrations were suppressed by adjusting (normalizing) monthly riverine loads with respect to contemporaneous runoff values and then computing annual flow-adjusted concentrations for each river. Phosphorus levels were adjusted with respect to runoff and a proxy (the difference in absorbance between filtered and unfiltered samples) of the amount of particulate matter in the analysed sample. Concentrations measured in lakes were adjusted for temperature. Several investigators have used regression methods, or related statistical learning techniques such as neural networks, to normalize air quality data (Thompson et al., 2001; Libiseller and Grimvall, 2003). A few authors have considered flow-normalization of riverine loads of substances (Stålnacke and Grimvall, 2001; Uhlig and Kuhbier, 2001; Hussian et al., 2004), and concentrations of pollutants in sediments are often normalized with respect to grain size or amount of organic matter in the analysed samples (e.g., Clark et al., 2000). Additional examples of statistical methods that are used to reduce the variability of observed data include normalization regarding the fat content in biota or the salinity of water samples collected in estuaries. A first outline of a theoretical framework for normalization of environmental quality data was published by Grimvall et al. (2001).
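The idea of normalization can be sketched with an ordinary linear regression. This is a deliberately simplified stand-in, assuming a constant intercept and a single covariate (runoff), whereas the model used in this study lets the intercept vary smoothly over time. The function name is hypothetical.

```python
def flow_normalize(loads, runoff):
    """Remove the linear runoff effect from a series of loads.

    Fits load = a + b * runoff by least squares and returns
    load - b * (runoff - mean runoff), i.e. the load that would have
    been expected under average runoff conditions.
    """
    n = len(loads)
    mean_x = sum(runoff) / n
    mean_y = sum(loads) / n
    sxx = sum((x - mean_x) ** 2 for x in runoff)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(runoff, loads))
    slope = sxy / sxx
    return [y - slope * (x - mean_x) for x, y in zip(runoff, loads)]
```

If the loads depend on runoff alone, the normalized series is flat; any variation that remains after normalization must come from other sources, such as human interventions or measurement problems.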
3.3. Trend assessment
Trend surfaces for multiple time series of adjusted annual data were computed by running the semi-parametric regression model and using annual adjusted values as inputs. The rivers were ordered with respect to the average concentration of the substance under consideration. A review paper by Esterby (1996) provides an overview of techniques for the detection and estimation of water quality trends with emphasis on univariate methods for unadjusted data. Some more recent contributions focus on non-parametric tests that can accommodate covariates (Libiseller and Grimvall, 2002), the interpretation of observations below detection limits (Helsel, 2005), and the use of additive models for trend detection (Giannitrapani et al., 2005).
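To make the role of the smoothness condition concrete, the following sketch fits a single-series version of the model in Section 3.1 (m = 1, p = 1) by penalized least squares. It is an illustration under stated assumptions, not the estimation procedure of Grimvall et al. (2007): the roughness penalty on second differences of the intercepts, the fixed smoothing parameter `lam`, and the function name are all assumptions, and the cross-validation and smoothing across sites used in the study are omitted.

```python
import numpy as np


def fit_smooth_trend(y, x, lam):
    """Fit y_t = alpha_t + beta * x_t + e_t with a smooth intercept.

    Minimizes sum((y - alpha - beta*x)^2) + lam * sum(second
    differences of alpha)^2. Returns (alpha, beta).
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = y.size
    # Design matrix: one intercept per time point plus the covariate.
    A = np.hstack([np.eye(n), x[:, None]])
    # Second-difference operator acting on the intercepts only.
    D = np.diff(np.eye(n), n=2, axis=0)
    P = np.zeros((n + 1, n + 1))
    P[:n, :n] = D.T @ D
    theta = np.linalg.solve(A.T @ A + lam * P, A.T @ y)
    return theta[:n], theta[n]
```

A large `lam` forces the intercepts towards a straight line, whereas a small value lets them follow the data closely; choosing the value by cross-validation, as in the study, balances the two.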
Fig. 2 – Scatter chart of Tot-N(Kj) plotted against Tot-N(ps) for a total of 7540 water samples collected in 34 rivers in Sweden. The slope of the dashed line is exactly half of that of the solid line.
4. Patterns and trends in the reported longer-term water quality data
4.1. Total nitrogen
Total nitrogen levels can be determined by a direct method involving persulphate digestion of organic nitrogen, and by computing the sum of Kjeldahl nitrogen and nitrite and nitrate nitrogen (Patton and Kryskalla, 2003). In principle, the results obtained with these two methods (Tot-N(ps) and Tot-N(Kj), respectively) should be strongly correlated with each other. However, the scatter chart in Fig. 2 shows that there were substantial measurement errors or other causes of deviating concentrations. For example, there seemed to be a set of water samples for which the computed Tot-N(Kj) value was only about half the Tot-N(ps) value. After examining time series plots from the sites responsible for these anomalies, we concluded that some of the Tot-N(ps) values were twice as large as they should be due to calculation or dilution errors in the chemical analysis.

Fig. 3 – Trend surface fitted to flow-normalized concentrations of total nitrogen (Tot-N(ps)) in riverine input to the Kattegat and Skagerrak from seven rivers (see Fig. 1).

Fig. 4 – Trend surface fitted to flow-normalized concentrations of total nitrogen (Tot-N(Kj)) in riverine input to the Kattegat and Skagerrak from seven rivers.

Closer examination of the majority of the data points in Fig. 2 revealed another interesting pattern. The Tot-N(Kj)-to-Tot-N(ps) ratios started to increase in the mid-1990s. When trend surfaces were fitted to the annual Tot-N(Kj) and Tot-N(ps) summaries this pattern emerged more clearly. There was a pronounced downward trend in the Tot-N(ps) levels while the average Tot-N(Kj) levels remained practically constant. Figs. 3 and 4 show the results obtained for rivers in south-western Sweden, and similar results were obtained for rivers discharging into the southern Baltic. In northern Sweden, where the average nitrogen concentration is lower, the downward trend in Tot-N(ps) was weaker. Fig. 5 provides yet another illustration of the deviating temporal trends in the two measures of total nitrogen, Tot-N(ps) and Tot-N(Kj), and the uncertainty in these trend estimates. Apparently the choice of measure of the nitrogen content has a dramatic effect on the conclusions that can be drawn about recent temporal trends in water quality datasets.
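A simple screening rule for the doubled Tot-N(ps) values described above can be sketched as follows. The function name and the 10% relative tolerance are hypothetical choices for illustration; in this study the anomalies were identified from the scatter chart and from time series plots, not by an automated rule.

```python
def flag_ratio_anomalies(tot_n_kj, tot_n_ps, target_ratio=0.5, rel_tol=0.1):
    """Return indices of samples whose Tot-N(Kj)/Tot-N(ps) ratio is
    close to target_ratio (0.5 suggests a doubled Tot-N(ps) value).

    rel_tol is an assumed relative tolerance around target_ratio.
    """
    flagged = []
    for i, (kj, ps) in enumerate(zip(tot_n_kj, tot_n_ps)):
        if ps <= 0:
            continue  # a ratio is undefined for non-positive values
        ratio = kj / ps
        if abs(ratio - target_ratio) <= rel_tol * target_ratio:
            flagged.append(i)
    return flagged
```

Flagged samples are candidates for manual review only; a ratio near one half can also arise by chance when both values are small and noisy.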
Fig. 5 – Trend lines and associated 95% confidence bands for the arithmetic mean of flow-normalized concentrations of Tot-N(ps) and Tot-N(Kj) in seven rivers discharging into the Kattegat and Skagerrak.
Fig. 6 – Observed and flow-normalized total phosphorus concentrations in the Lagan River flowing into the Kattegat.
4.2. Total phosphorus
When phosphorus concentrations measured at Swedish river mouths were plotted as time series, we noted that in most of the rivers the observed levels decreased from 1980 to 1983 and then increased. Furthermore, there seemed to be a drop around 1996 (Figs. 6 and 7), although the conclusions for individual rivers are uncertain. This temporal pattern emerged more clearly when semi-parametric regression models were used to normalize the observed concentrations and trend surfaces were subsequently fitted to matrices of annual normalized concentrations for selected groups of rivers. Fig. 8 shows the results obtained for 15 rivers flowing into the Gulf of Bothnia when the observed levels of total phosphorus were normalized with respect to water discharge and to the amount of particulate matter measured as the difference in absorbance between unfiltered and filtered water samples. It can be seen that the temporal changes in the early 1980s and around 1996 were remarkably synchronous in all the investigated rivers. Fig. 9 shows the arithmetic mean concentrations obtained for rivers discharging into the Gulf of Bothnia and rivers discharging into Kattegat and Skagerrak. It appears that in
Fig. 7 – Observed and flow-normalized total phosphorus concentrations in the Ångermanälven River flowing into the Gulf of Bothnia.
Fig. 8 – Trend surface fitted to normalized annual summaries of total phosphorus concentrations in 15 Swedish rivers flowing into the Gulf of Bothnia. The normalization was based on water discharge and the amount of particulate matter, measured as the difference in absorbance between unfiltered and filtered samples.
both regions there was a trough in the normalized concentrations around 1983. In search of further evidence of synchronous changes in phosphorus concentrations, we also analysed data from one site in Lake Vättern and two sites in Lake Vänern. Fig. 10 shows the results for Lake Vättern, which has a hydraulic residence time of about 80 years. First, it can be noted that the temporal changes in phosphorus concentrations were almost synchronous at all depths in Lake Vättern. Second, both the trough in the early 1980s and the drop around 1996 could be discerned in data from that lake. Similar results were obtained for the two sites in Lake Vänern, which has a water residence time of about 9 years. The techniques used to estimate the trend lines in Fig. 9 and the trend surfaces in Figs. 8 and 10 may, under certain conditions, smooth out step changes to produce gradual changes. To ascertain whether such smoothing effects did occur in our study, we plotted the estimated trends along with
Fig. 9 – Trend curves for the arithmetic mean of the normalized concentrations of total phosphorus in riverine input to the Kattegat/Skagerrak and the Gulf of Bothnia. Normalization based on water discharge and the amount of particulate matter, measured as the difference in absorbance between unfiltered and filtered samples.
Fig. 10 – Temporal trends in temperature-normalized concentrations of total phosphorus at Jungfrun in Lake Vättern. Trend surface for samples collected at different depths (0.5–75 m).
the originally measured concentration values for sampling sites at which the natural variability was particularly small. The results revealed that the decrease around 1996 was actually a step change, whereas the decrease in the early 1980s seemed to include minor changes spread out over several years (see Fig. 11). Together, our findings regarding phosphorus concentrations showed that major changes in the reported levels occurred simultaneously in a large number of rivers and lakes representing a very wide range of hydraulic residence times and anthropogenic pressures. Possible causes of these surprisingly synchronous temporal trends are scrutinized in the ensuing discussion.
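Whether a decrease is a step or a gradual change was judged here by plotting measured values against the fitted trend. As a complementary illustration, the sketch below locates the single most likely step in a series by least squares. The function name and the minimum segment length are hypothetical, and the sketch provides no significance test.

```python
def best_step_change(values, min_segment=2):
    """Find the split that best describes the series as two constant levels.

    Returns (k, shift): k is the index of the first observation after
    the step, and shift is the mean after minus the mean before.
    """
    n = len(values)
    best_k, best_sse = None, float("inf")
    for k in range(min_segment, n - min_segment + 1):
        left, right = values[:k], values[k:]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        sse = sum((v - mean_left) ** 2 for v in left)
        sse += sum((v - mean_right) ** 2 for v in right)
        if sse < best_sse:
            best_k, best_sse = k, sse
    left, right = values[:best_k], values[best_k:]
    shift = sum(right) / len(right) - sum(left) / len(left)
    return best_k, shift
```

Applying such a procedure separately to several low-noise sites and comparing the estimated step positions gives a quick indication of whether a level shift occurred at the same time everywhere.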
4.3. Phosphate
Synchronous temporal changes were also apparent in the phosphate data. This is illustrated in Fig. 12, which shows a trend surface fitted to flow-normalized annual concentrations for 15 rivers discharging into the Gulf of Bothnia. Further studies of phosphate concentrations revealed similar temporal patterns in rivers located in other parts of Sweden. More precisely, almost all rivers with low or moderately high average phosphate levels exhibited troughs in 1983–1984 and 1999–2000, as well as peaks in 1980, 1994 and 2005. Fig. 13 shows the trend surface for such rivers in south-eastern Sweden. The temporal patterns were more individual for the rivers with fairly high phosphate concentrations and the results obtained for the sampling sites in Lakes Vänern and Vättern were inconclusive.

Fig. 11 – Trend curve for temperature-normalized concentrations of phosphorus shown along with measured concentrations of that element in samples collected at different depths (0.5–70 m) at Dagskärsgrundet in Lake Vänern.

Fig. 12 – Trend surface fitted to flow-normalized concentrations of phosphate in riverine input to the Gulf of Bothnia.

Fig. 13 – Trend surface fitted to flow-normalized concentrations of phosphate in five rivers (the Alsterån, Mörrumsån, Botorpströmmen, Emån, and Lyckebyån) discharging into the southern part of the Baltic Sea.

5. Possible causes of the detected patterns and trends
Our study of water quality data provided numerous examples of temporal changes that were remarkably synchronous in rivers representing a wide range of hydrogeological conditions
and anthropogenic pressures. There are several possible explanations for such coinciding fluctuations: (i) large-scale human interventions; (ii) large-scale variation in weather conditions; (iii) advertent or inadvertent alterations in sampling and laboratory practices; (iv) artefacts in the statistical procedures used to analyze the collected data. The following discussion first deals with the possibility of artefacts associated with the statistical techniques and then considers the other three explanations separately for each water quality parameter. It has been advocated that adjustments of raw monitoring data can create spurious trends, and that it is even possible to construct data sets for which such effects can appear. However, in general, adjustments reduce the noise in the measured data and, consequently, such methods usually decrease the risk for spurious trends. In addition, the major trends we found in the adjusted data were also present in unadjusted data from water bodies for which the noise was comparatively small and adjustment thus had a negligible effect. Hence, we rejected the hypothesis that the synchronous trends were merely an artefact of the adjustment. Like any other trend analysis, our method of fitting trend surfaces can distort true trends. Above all, we noticed that a step change could be smoothed out when a trend surface was fitted to annual data from a set of stations. However, apart from that effect, the risk of undesired smoothing was generally small because the smoothing factors were selected to optimize the predictive power of the underlying regression model. In particular, we found that the smoothing across rivers was negligible when the trends in the investigated rivers were clearly dissimilar. Furthermore, it should be noted that we analysed data from different regions separately. 
Hence, we had very good arguments to rule out the possibility that the remarkably synchronous changes in water quality were merely an artefact of the statistical procedures used.
5.1.
Total nitrogen
We have already concluded that some of the total nitrogen levels shown in Fig. 2 are erroneous and this has been confirmed by the laboratory responsible for the chemical analyses. This is annoying, albeit not detrimental, for the detection of water quality trends. Many statistical software packages contain procedures for filtering outliers, and some trend tests are relatively robust to the presence of outliers. The deviating trends in Tot-N(ps) and Tot-N(Kj) are a more serious problem because they involve thousands of observations. Theoretically, such deviating trends could arise if the organic matter in the analysed water samples became more resistant to persulphate digestion. However, simple mass balance calculations showed that, if both the Tot-N(ps) levels and the determinations of inorganic nitrogen were correct, there must have been an unprecedented change in the composition of organic matter in Swedish watercourses. More precisely, an almost 50% decrease in the amount of organic nitrogen that can be digested by persulphate would have coincided in time with a general increase in the amount of organic matter that can be oxidized with permanganate. We regarded this as very improbable.
In our search for possible explanations for the reported decreases in the Tot-N(ps) levels we were informed that the laboratory responsible for the chemical analyses had used persulphate of poor quality in 2005. After reanalysing the trends in different time periods we concluded that this problem started much earlier and that it became apparent in 2001. Consequently, it is reasonable to suggest that the downward trends shown in Figs. 3–5 are mainly due to measurement errors.
5.2. Total phosphorus
Our analysis of total phosphorus levels showed that there was a significant drop in 1996. Closer inspection of that decline revealed the following: (i) the levels decreased simultaneously in water bodies representing a wide range of anthropogenic pressures and hydraulic residence times ranging from less than a year to about 80 years; (ii) the decreases in Lakes Vänern and Vättern were step changes that were too large to be caused by sudden fluctuations in either point emissions or riverine inputs to the lakes. Accordingly, it is reasonable to assume that the direct effects of anthropogenic interventions cannot be a major explanation of the observed shift in the level of total phosphorus. Internal loading triggered by specific weather conditions can occasionally cause relatively rapid changes in concentrations of phosphorus in a body of water (Pettersson, 1998). However, in our analysis, the weather effects were suppressed by normalizing the river water data with respect to flow and particulate matter, and the lake water data with regard to temperature. Moreover, inspection of water discharge and temperature data did not reveal any events that could explain why the decrease in total phosphorus was so pronounced both in northern and southern Sweden in 1996, whereas the interannual variation was subsequently much smaller. Because the drop in total phosphorus in 1996 coincided in time with the introduction of a new instrument for measuring phosphorus in the laboratory conducting the chemical analyses, and this change revealed that baseline drift had long influenced determination of low concentrations (Sonesten and Engberg, 2001), we concluded that the decrease in 1996 was due to altered laboratory procedures. Furthermore, we noted that these measurement errors had influenced the concentration levels reported for a majority of the investigated rivers.
The trough in the early 1980s was observed from southwestern to north-western Sweden in rivers containing low or moderately high levels of total phosphorus. Based on the same argument as given above, we rejected the idea that these temporal changes had been induced by human interventions in the drainage area. We also dismissed the idea that they were due to internal loading triggered by weather events for the following reasons: (i) the major weather-driven effects should have been removed by our statistical adjustments (normalizations); (ii) visual inspection of water discharge and water temperature data did not indicate that there were any pronounced nation-wide temporal patterns in these parameters in 1980–1985; (iii) the trough was apparent in water bodies representing a very wide range of hydraulic residence times. Consequently, it was deemed reasonable to assume
that the observed trends in total phosphorus that occurred in the early 1980s were strongly influenced by changes in sampling or laboratory practices.
5.3. Phosphate
The extraction of trend surfaces showed that all the study rivers with a low or moderately high average phosphate concentration had practically identical peaks and troughs. It is difficult to imagine any large-scale interventions that would cause such a sequence of synchronous increases and decreases in the phosphate levels. Therefore, we rejected the hypothesis that the detected trends were due to human activities. Because phosphate is readily taken up by aquatic organisms, and weather effects can have a marked impact on such processes, it is not impossible that large-scale components of the weather fluctuations can cause substantial variation in phosphate levels. However, we found that the temporal changes in the concentrations were much more synchronous than the fluctuations in precipitation and temperature. In addition, there was no obvious correlation between the levels of phosphate and water quality parameters (e.g., sum of cations, sum of anions, or conductivity) that might indicate major changes in water pathways. Therefore, we deem it less likely that the remarkably concurrent changes were due to specific weather effects. Metadata about sampling, sample handling and analytical methods did not provide any clues about changes in laboratory practices that might have influenced the levels of phosphate that were reported during the study period. However, it is generally recognized that the low concentrations that prevail in most Swedish rivers are difficult to quantify. Thus, in the absence of other plausible explanations, we concluded that altered laboratory or sampling practices constituted the major cause of the synchronous temporal changes in phosphate shown in Figs. 12 and 13.
6. Discussion and conclusions
Our study showed that powerful statistical methods can reveal remarkable spatio-temporal patterns in measured water quality data and that this may lead to new interpretations regarding the human impact on aquatic environments. In particular, we found strong evidence that long-term trends in measured nutrient concentrations can be more extensively influenced by changes in sampling and laboratory practices than by actual changes in the state of the environment. Similar studies of the measured concentrations of organic matter have demonstrated that the levels of such substances can be strongly influenced by measurement errors or inadvertent changes in sampling or sample handling (Wahlin and Grimvall, 2007). Extensive investigations of water quality data reported by other laboratories were outside the scope of the present study. However, simple time series plots of water quality data from other Swedish laboratories and international organizations, such as the OSPAR Commission for the protection of the marine environment of the North-East Atlantic, provided numerous examples of sudden shifts in
flow-weighted concentrations, thus indicating that the quality of reported data deserves increased attention and scrutiny. Some of the data quality problems exemplified here could and should have been detected if the measured concentration levels had been properly visualized before they were entered into the corresponding database. Furthermore, it would be desirable to implement efficient statistical techniques for the identification of outliers, i.e. occasional strongly deviating concentration levels (Clement et al., 2006). However, from a point of view of trend detection, it is more severe that moderately large level shifts can affect a large number of observations before the data quality is questioned. In addition, experience has taught us that observations that are considered correct at the time of the sampling can be deemed erratic when more data have been collected and new analytical procedures have been implemented. To tackle the latter type of data quality problems we propose regular retrospective analyses involving: (i) Meteorological/hydrological adjustment of measured concentrations; (ii) Joint analysis of several time series of data. The adjustment step can reduce the noise in the time series of monitoring data and thereby clarify other sources of variation. Joint analysis of data from multiple sites can reveal true trends as well as data quality problems that would have been overlooked if each time series of data had been scrutinized separately. Calculating ratios or differences between interrelated water quality parameters can further facilitate detection of data quality problems. Our semiparametric regression model (Grimvall et al., 2007) proved to be a flexible and efficient tool for trend analysis of water quality data. If interaction effects are small, generalized additive models may provide viable alternatives (Giannitrapani et al., 2005). 
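The benefit of joint analysis can be illustrated with a small screening sketch that looks for years in which nearly all sites moved in the same direction. It assumes adjusted annual values of equal length for every site; the function name and the 90% agreement threshold are hypothetical, and the sketch is far cruder than the trend surfaces used in this study.

```python
from statistics import median


def synchronous_shifts(series_by_site, min_share=0.9):
    """Flag time steps where most sites change in the same direction.

    series_by_site maps a site name to a list of annual values.
    Returns a list of (time index, median change across sites).
    """
    sites = list(series_by_site.values())
    n_years = len(sites[0])
    flagged = []
    for t in range(1, n_years):
        diffs = [s[t] - s[t - 1] for s in sites]
        share_down = sum(d < 0 for d in diffs) / len(diffs)
        share_up = sum(d > 0 for d in diffs) / len(diffs)
        if max(share_down, share_up) >= min_share:
            flagged.append((t, median(diffs)))
    return flagged
```

A year flagged in this way does not identify the cause of the change, but a simultaneous shift across water bodies with very different residence times and pressures is a reason to review sampling and laboratory records for that period.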
We concluded that changes in the sampling, sample handling and laboratory analyses were responsible for major trends in the reported data. However, regardless of why the measured data exhibit synchronous increases and decreases, such patterns create problems in the analysis of temporal trends. If, for example, large-scale weather phenomena play a more important role than previously assumed, the monitoring strategies that are currently in use need to be drastically revised. The impacts of human interventions are relatively easy to separate from statistically independent, purely random errors in the data, whereas it is an extremely difficult task to distinguish between seemingly persistent weather effects and the influence of human interventions. The same holds true for assessing progress towards environmental objectives like Zero eutrophication and interim targets for reducing the input of nitrogen and phosphorus to the Baltic Sea (SEPA, 2007), and this calls for a more widespread use of statistical methods appropriate for interpreting a broad range of spatio-temporal phenomena comprising apparent longer-term trends. On a more general level, our findings raise important questions regarding the need for new paradigms for environmental monitoring and assessment. Monitoring programmes have long been regarded as systems for collecting samples,
undertaking laboratory analyses, and storing data in a database. This has created a situation where it is unclear who is responsible for developing the database into a system for information management, where raw data are integrated with assessment tools, and the information content is constantly updated with new findings about data quality and various sources of spatial and temporal variability in the assembled data. Neither has it been sufficiently debated how the priorities in environmental monitoring can and should be altered when science is increasingly about information, its collection, organization and transformation. In a recently published issue of the journal Nature, it was emphasized that important discoveries are made by scientists and teams who combine different skill sets, not just biologists, physicists and chemists, but also computer scientists, statisticians and data-visualization experts (Szalay and Gray, 2006). Introduction of a system in which conventional quality assurance is complemented with thorough statistical follow-up of reported values would represent a first step towards recognizing that environmental monitoring and assessment should be transformed from being a system for sampling and laboratory analyses into a system for interpreting information to support policy development. Since monitoring data are used to inform water quality policy, it is essential that adequate statistical analyses are performed during assessments of longer-term catchment response to environmental change, e.g., diffuse pollution mitigation programmes. Perhaps a system adopted by the National Aeronautics and Space Administration (NASA) in the United States (NASA, 2007) could serve as a model, since it accepts and encourages peer evaluation of data quality by qualified external reviewers or committees composed of such experts (sti.nasa.gov/STI-public-homepage.html).
Assembling reliable information for testing positive catchment response to increased diffuse pollution mitigation efforts represents a priority for many stakeholders seeking justification for the increasing expenditure on environmental management targeting water quality protection. Accordingly, analyses of longer-term water quality datasets used to assess the environmental benefits of policy programmes and improved land management must be robust. The procedure outlined in this contribution affords a useful means of tackling this important issue.
Acknowledgements
The authors are grateful for financial support from the Swedish Environmental Protection Agency and for the constructive comments provided by two anonymous reviewers.
references
Clark, M.W., Davies, F., McConchie, M.D., Birch, G.F., 2000. Selective chemical extraction and grainsize normalisation for environmental assessment of anoxic sediments. Sci. Total Environ. 258, 149–170.
Clement, L., Thas, O., Vanrolleghem, P.A., Ottoy, J.P., 2006. Spatio-temporal statistical models for river monitoring networks. Water Sci. Technol. 53, 9–15.
Esterby, S.R., 1996. Review of methods for the detection and estimation of trends with emphasis on water quality applications. Hydrol. Process. 10, 127–149.
Giannitrapani, M., Bowman, A.W., Scott, E.M., 2005. Additive models for correlated data with applications to air pollution monitoring, submitted for publication.
Grimvall, A., Wackernagel, H., Lajaunie, C., 2001. Normalisation of environmental quality data. In: Hilty, L.M., Gilgen, P.W. (Eds.), Sustainability in the Information Society. Metropolis-Verlag, Marburg, pp. 581–590.
Grimvall, A., Hussian, M., Libiseller, C., 2007. Semiparametric smoothers for trend assessment of multiple time series of environmental quality data, submitted for publication.
Härdle, W., 1997. Applied Non-parametric Regression. Cambridge University, Cambridge.
Hastie, T., Tibshirani, R., Friedman, J., 2001. The Elements of Statistical Learning. Springer, New York.
Helsel, D.R., 2005. More than obvious: better methods for interpreting nondetect data. Environ. Sci. Technol. 39, 419A–423A.
Hussian, M., Grimvall, A., Petersen, W., 2004. Estimation of the human impact on nutrient loads carried by the Elbe River. Environ. Monit. Assess. 96, 15–33.
Libiseller, C., Grimvall, A., 2002. Performance of partial Mann–Kendall tests for trend detection in the presence of covariates. Environmetrics 13, 71–84.
Libiseller, C., Grimvall, A., 2003. Model selection for local and regional meteorological normalisation of background concentrations of tropospheric ozone. Atmos. Environ. 37, 3923–3931.
NASA (National Aeronautics and Space Administration), 2007. http://www.sti.nasa.gov/STI-public-homepage.html.
Patton, C.J., Kryskalla, J.R., 2003. Methods of Analysis by the U.S. Geological Survey National Water Quality Laboratory – Evaluation of Alkaline Persulfate Digestion as an Alternative to Kjeldahl Digestion for Determination of Total and Dissolved Nitrogen and Phosphorus in Water. U.S. Geological Survey. Water-Resources Investigations Report 03-4174.
Pettersson, K., 1998. Mechanisms for internal loading of phosphorus in lakes. Hydrobiologia 373/374, 21–25.
SEPA, 2007. Sweden’s Environmental Objectives in an Interdependent World – de Facto 2007. Swedish Environmental Protection Agency, Stockholm.
Sonesten, L., Engberg, S., 2001. Totalfosforanalyser vid Institutionen för miljöanalys 1965–2000. Department of Environmental Assessment, Swedish University of Agricultural Sciences, Uppsala (in Swedish).
Stålnacke, P., Grimvall, A., 2001. Semiparametric approaches to flow-normalisation and source apportionment of substance transport in rivers. Environmetrics 12, 233–250.
Szalay, A., Gray, J., 2006. Science in an exponential world. Nature 440, 413–414.
Thompson, M.L., Reynolds, J., Cok, L.H., Guttorp, P., Sampson, P.D., 2001. A review of statistical methods for the meteorological adjustment of ozone. Atmos. Environ. 35, 617–630.
Uhlig, S., Kuhbier, P., 2001. Trend methods for the assessment of the effectiveness of reduction measures in the water system. Umweltforschungsplan 298 22 244, Bundesministerium für Umwelt, Naturschutz und Reaktorsicherheit, Berlin.
Wahlin, K., Grimvall, A., 2007. Uncertainty in water quality data and its implications for trend detection. Technical Report LiU-IDA-Stat-02/07. Linköping University.
further reading
Swedish University of Agricultural Sciences, Department of Environmental Assessment. http://www.info1.ma.slu.se/ ma/www_ma.acgi$Project?ID=Intro.
Anders Grimvall is Professor of Statistics at Linköping University with a long career as environmental scientist. He has undertaken research on the large-scale turnover of nutrients and toxic organic contaminants in the environment, and he has a special interest in data mining and statistical analysis of environmental monitoring data.
Karl Wahlin is a statistician focusing on applications of statistics in environmental science and management. His PhD studies at Linköping University concern trend detection and data quality issues in environmental monitoring.