Uncertainties in annual riverine phosphorus load estimation: Impact of load estimation methodology, sampling frequency, baseflow index and catchment population density


Journal of Hydrology (2007) 332, 241– 258

available at www.sciencedirect.com

journal homepage: www.elsevier.com/locate/jhydrol

P.J. Johnes *

Aquatic Environments Research Centre, School of Human and Environmental Sciences, University of Reading, Whiteknights, Reading BERKS RG6 6AB, UK

Received 15 September 2005; received in revised form 21 March 2006; accepted 3 July 2006

KEYWORDS Uncertainty; Load estimation; Phosphorus; Model calibration; Nutrient budgets

Summary

Models developed to identify the rates and origins of nutrient export from land to stream require an accurate assessment of the nutrient load present in the water body in order to calibrate model parameters and structure. These data are rarely available at a representative scale and in an appropriate chemical form except in research catchments. Observational errors associated with nutrient load estimates based on these data lead to a high degree of uncertainty in modelling and nutrient budgeting studies. Here, daily paired instantaneous P and flow data for 17 UK research catchments covering a total of 39 water years (WY) have been used to explore the nature and extent of the observational error associated with nutrient flux estimates based on partial fractions and infrequent sampling. The daily records were artificially decimated to create 7 stratified sampling records, 7 weekly records, and 30 monthly records from each WY and catchment. These were used to evaluate the impact of sampling frequency on load estimate uncertainty. The analysis underlines the high uncertainty of load estimates based on monthly data and individual P fractions rather than total P. Catchments with a high baseflow index and/or low population density were found to return a lower RMSE on load estimates when sampled infrequently than those with a low baseflow index and high population density. Catchment size was not shown to be important, though a limitation of this study is that daily records may fail to capture the full range of P export behaviour in smaller catchments with flashy hydrographs, leading to an underestimate of uncertainty in load estimates for such catchments. Further analysis of sub-daily records is needed to investigate this fully. Here, recommendations are given on load estimation methodologies for different catchment types sampled at different frequencies, and the ways in which this analysis can be used to identify observational error and uncertainty for model calibration and nutrient budgeting studies. © 2006 Elsevier B.V. All rights reserved.

* Tel.: +44 118 378 6197; fax: +44 118 975 5865. E-mail address: [email protected]. 0022-1694/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.jhydrol.2006.07.006

Introduction

In the development of any modelling approach it is important that model output is calibrated against observed data in order to test the validity of the parameter values selected for the study site. McIntyre and Wheater (2004) argue that this need is paramount in modelling at all scales, both to test the conceptual model in terms of model structural errors, and to minimise the uncertainty associated with the spatial and temporal lumping of model parameters, including the lumping of discrete chemical species. In diffuse pollution modelling this means that model estimates of nutrient or sediment loading delivered to a water course must be compared with observed rates of nutrient or sediment loading in the water body at a representative scale and in an appropriate chemical form. However, in many modelling studies the observed load estimates are based on infrequent monitoring, with samples analysed to determine only one fraction of the total nutrient load. Uncertainty is rarely expressed or acknowledged for these estimates, and the concept of fitting a model or parameter set to fuzzy observational data with the potential for multiple acceptable models that might fit these observations, as argued by Beven in a number of papers, is largely ignored (see Beven and Freer, 2001; Beven, 2006). It is not cogent to argue that these observed load estimates are certain, that there is little or no uncertainty associated with them, and that a model or set of model parameters which fits well with these estimates is therefore in some way ‘optimal’ or a unique solution. Indeed, as Beven argues (2005), optimal parameter sets are dependent on the period of calibration data used. It follows logically that multiple optimal parameter sets may therefore be identified in any model calibration procedure depending on the observational data points selected from or available in the record.
Observational uncertainty is critically important in defining and contributing to parametric uncertainty and errors in the conceptual models underpinning diffuse pollution research and management. The need to both acknowledge this observational uncertainty and, where possible, amend existing monitoring programmes to reduce this uncertainty is clear. There are a number of examples of more frequent routine water quality monitoring programmes, including the Danish National Monitoring and Assessment Programme (NOVA), launched in 1988 (Kronvang et al., 2005), and targeted monitoring on key river basins in the USA (see for example Passell et al., 2005). In these cases existing routine water quality monitoring programmes have been modified to take account of evidence of temporal fluctuations in the rate and speciation of contaminant transport published in the scientific literature.

In the UK, routine water quality monitoring has been conducted on all major rivers under the Harmonised Monitoring Scheme (HMS) since 1974. This was established to provide an archive for UK water quality assessment to meet international obligations and estimate contaminant transport to coastal waters. It is focused on 230 sites located at the freshwater:tidal interface and at the confluence of major river tributaries. In addition, as part of the General Quality Assessment (GQA) of river water quality in England and Wales, 7000 river stations are sampled monthly, representing over 40,000 km of waterway, with separate, comparable sampling schemes in Northern Ireland and Scotland. However, both the HMS and GQA programmes focus only on the inorganic and instantaneously bioavailable nutrient fractions (nitrate, NO3-N; total ammonium, NH4-N; orthophosphate, PO4-P). In recognition of emerging evidence of the importance of other nutrient fractions in terms of the total nutrient load delivered to rivers from their catchments (see for example Johnes and Burt, 1991), and the need for long-term integrated environmental monitoring for multi-stressors on UK terrestrial and freshwater ecosystems, the UK Environmental Change Network was launched in 1992. This involves, for freshwater systems, monthly sampling of 28 river stations together with quarterly sampling of 16 lakes from 1994 to date. Amongst other variables, total P (TP), particulate P (PP) and soluble reactive P (SRP), NO3-N, NH4-N and total N (TN) are determined for these sites. The combined UK routine water quality monitoring programmes provide very good spatial representation, and there are a few sites where full nutrient speciation information is available on a monthly basis. However, there has been little change since the inception of the HMS programme in terms of the frequency of sampling. For the vast majority of British freshwater systems there is no information on total nutrient flux from land to water. This pattern of sampling is common to many other national monitoring programmes in Europe and elsewhere.
Resource constraints are the main reason why there has been little change in sampling frequency or inclusion of representative variables in such monitoring programmes to date. It is acknowledged that these constraints are likely to remain and that these partial representations of nutrient flux will therefore be the only data available for many sites where nutrient enrichment is problematic and in need of control. This paper therefore explores the nature and extent of the observational error associated with nutrient flux estimates based on partial fractions and infrequent sampling, and examines the approaches which may be adopted to both quantify and reduce the uncertainty associated with these estimates. The analysis focuses on phosphorus, for which 39 water years of daily paired flow and P concentration data are available within the Aquatic Environments Research Centre (AERC) data archives for 17 UK research catchments. These data were used to:

• Determine the nature and extent of uncertainty in P load estimates using a variety of methodologies.
• Place the imprecision and bias of load estimates associated with sampling frequency in context with the imprecision of load estimates resulting from the selection of different load estimation methodologies.
• Assess whether there are underlying catchment characteristics which might make load estimation inherently more or less uncertain in different systems.
• Advise those relying on infrequent P concentration and flow observations on the least biased methodologies for estimating load in different catchment types.

[Figure 1 panels: River Lambourn at Boxford and River Enborne at Brimpton. Daily series, April 1998 to February 2000, of suspended sediment (mg/l), mean discharge (m3 s-1, plotted x 10 for the Lambourn and x 20 for the Enborne), and SRP, SUP, TDP, PP and TP (mg/l).]

Sources of uncertainty in load estimates

P fractionation trends in rivers

Trend analysis of the water quality records available for rivers provides a valuable insight into the impact of climatic variability, past and ongoing intensification of agricultural practice, and nutrient control programmes on nutrient concentrations in the receiving waters (see for example Heathwaite et al., 1996; Passell et al., 2005; Bennion et al., 2005; Kronvang et al., 2005; Littlewood and Marsh, 2005). For those basins with a relatively complete representation of nutrient flux behaviour, a comparison of trends for basins with dissimilar hydrogeology can give an indication of the relative importance of the various controls on nutrient hydrochemical function under different environmental conditions. This in turn can provide valuable information on the nature and origins of uncertainty associated with nutrient load estimates based on infrequent sampling or fractional analysis of the total nutrient load. Examples of the type of environmental behaviour evident for phosphorus in UK catchments when sampled at daily frequency are shown in Fig. 1 (after Evans and Johnes, 2004) for two tributaries of the River Kennet, itself a tributary of the Thames, draining eastwards to join the Thames at Reading. The rivers were sampled daily over a two year period, 1998–2000, with the sampling stations only 10 miles apart. Both catchments have a similar population density (0.53 and 0.62 people per hectare, respectively), similar agriculture (mixed arable and sheep production), catchment size (198 km2 and 145 km2 respectively) and similar rainfall rate and intensity (average annual rainfall, 1961– 1990, of 743 mm and 717 mm respectively). However, the Lambourn catchment is entirely underlain by Chalk, and it has a high baseflow index (0.98), with a high proportion of the flow in the river delivered from groundwater sources (National River Flow Archive (NRFA)). The flow regime is therefore moderate and peak flow events are uncommon. 
The P hydrochemistry reflects this, with a high proportion of the total P load present in the form of SRP (65–75%). By contrast the Enborne catchment is predominantly underlain by Tertiary beds with a high proportion of clay soils in the catchment. As a result the catchment has a lower baseflow index (0.53, NRFA) and a more extreme flow regime. A high proportion of the flow in the river is delivered along overland and near-surface lateral quickflow pathways, particularly during storm events, and the total P load is dominated by particulate P (50–60%). It is apparent from these records that monthly determination of SRP concentrations in the Lambourn catchment is likely to return a more accurate representation of total P loading in the River Lambourn than is the case in the River Enborne, where SRP is a minority fraction of the total P load and the majority of P is transported in relatively few storm events.

Figure 1 Difference in hydrochemical response characteristics of rivers with high (River Lambourn) and low (River Enborne) baseflow index (after Evans and Johnes, 2004).

Reliance on routine determination of only one fraction of the load also generates uncertainty. This is illustrated in Fig. 2, where the Total Dissolved P (TDP) to TP ratio for both concentration and load is presented for 5 UK basins for which paired daily observations of TDP and TP concentrations and instantaneous flow were available for multiple water years (Johnes and Burt, 1991; Heathwaite and Johnes, 1996; Prior and Johnes, 2002; Evans and Johnes, 2004). Sites are ordered along the X axis according to baseflow index and human population density. TDP, the sum of SRP and SUP, constitutes anywhere between 32% and 85% of the TP load (35–87% of the TP concentration) in these rivers. The greatest inter-annual differences in TDP:TP ratio are observed in the Lambourn, and the smallest in the Winterbourne, despite both having a similar baseflow index (0.98, 0.96, NRFA) and predominantly agricultural character. The proportion of the TP load transported as TDP in the Lambourn is higher in wetter years (1998–2000) than in drier years. This trend is reversed in basins with a lower baseflow index. In the Enborne, where the flow regime is flashier, the proportion of TDP contributing to the TP load falls in wetter years as a result of greater PP delivery from the catchment along near-surface and surface quickflow pathways. Similar findings are reported by Vanni et al. (2001) in a study of dissolved and particulate nutrient flux in three adjacent agricultural catchments in Ohio, USA, where baseflow index varied from 0.29 to 0.44. Over a 5 year period of study particulate P dominated the TP load in all three systems and baseflow index could not be used to explain differences in the fractionation of P flux in the three basins studied. This was consistent with findings published for 27 Chesapeake Bay watersheds by Jordan et al. (1997). Reliance on partial fractions in any monitoring programme is therefore likely to lead to highly uncertain estimates of P transport within catchments. Baseflow index may be important in describing the likely frequency distribution of P concentrations and load within a catchment, but cannot be used to indicate the likely fractionation of the P load, at least in the few basins for which detailed daily observations are available for multiple water years.
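The distinction drawn above between the TDP:TP ratio for concentration and for load can be illustrated with a small sketch. It assumes that the concentration ratio is the ratio of mean concentrations and that the load ratio is the ratio of summed instantaneous loads (flow x concentration); the paper does not state the exact formulae, so both definitions, the function name `tdp_tp_ratios` and the five-day record are hypothetical.

```python
def tdp_tp_ratios(flow, tdp, tp):
    """Return (concentration ratio, load ratio) of TDP to TP.

    flow, tdp and tp are equal-length sequences of paired daily values.
    Both ratio definitions are assumptions made for illustration.
    """
    if not flow or not (len(flow) == len(tdp) == len(tp)):
        raise ValueError("flow, tdp and tp must be non-empty and equal in length")
    # Ratio of summed concentrations, equivalent to the ratio of mean concentrations.
    conc_ratio = sum(tdp) / sum(tp)
    # Ratio of summed instantaneous loads (flow x concentration).
    load_ratio = (sum(q * c for q, c in zip(flow, tdp))
                  / sum(q * c for q, c in zip(flow, tp)))
    return conc_ratio, load_ratio


# Hypothetical five-day record: four baseflow days and one storm day on which
# flow rises tenfold and most of the additional P is particulate.
flow = [1.0, 1.0, 1.0, 1.0, 10.0]      # m3 s-1
tdp = [0.08, 0.08, 0.08, 0.08, 0.10]   # mg/l
tp = [0.10, 0.10, 0.10, 0.10, 0.50]    # mg/l

conc_ratio, load_ratio = tdp_tp_ratios(flow, tdp, tp)
print(f"TDP:TP by concentration = {conc_ratio:.2f}")
print(f"TDP:TP by load          = {load_ratio:.2f}")
```

In this made-up record the load ratio is lower than the concentration ratio because the single storm day carries most of the flow-weighted P in particulate form. This is the kind of behaviour described for flashier catchments such as the Enborne.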
Load estimates based on regression of P fraction concentrations against observed flow for intensively monitored periods and then extrapolated for years with less frequent observed concentrations are likely to yield highly uncertain load estimates. Thus a major source of uncertainty in P load estimates derives from the fact that P transport and fractionation vary both between catchments and between water years. Simple rules, based on relationships between P fractionation and flow records and derived from partial observations of system behaviour, cannot be used to generate robust indices of P loading and fractionation outside the period of observation.

[Figure 2 plot: % of TP load as TDP, shown as TDP:TP (instantaneous load) and TDP:TP (concentration) for each site and water year: Winterbourne Stream, Lambourn Down, Lambourn Up, Windrush at Rissington, Windrush at Worsham, Enborne Down, Enborne Up, and River Cole Sites A, B and D. Sites are ordered from high to moderate baseflow index and from low to high population density.]

Figure 2 TDP:TP ratios in rivers with daily observations of concentrations and flow.

Frequency distribution analysis

A second major source of uncertainty in P load estimates derives from the frequency of sampling undertaken in the catchment. The role of flow and sediment transport in driving P transport in rivers is well known and clearly illustrated in the data presented in Fig. 1. Thus an infrequent monitoring programme which collects samples under a representative range of sediment transport and flow conditions may yield less uncertain estimates of P fractionation and loading than a routine water quality monitoring programme which samples at a fixed time interval, regardless of flow conditions. This is illustrated in Fig. 3, where observed instantaneous flow, suspended sediment, and total P data are presented, taken from records for the River Windrush at Worsham, 1987–1988 (Johnes and Burt, 1991, 1993; Heathwaite and Johnes, 1996). The Windrush is a Cotswold river, draining south-east to join the Thames above Oxford. Its catchment is largely underlain by Jurassic Limestone, with a moderately high baseflow index (0.80, NRFA). The graphs in Fig. 3 represent the frequency distribution of data collected in the Windrush programme relative to sampling frequency. The upper part of each graph shows the distribution of the raw data for the River Windrush at Worsham for water year 1987/1988 and the impact on this distribution of sub-sampling the daily data set at a weekly and then a monthly sampling frequency for (a) discharge, (b) suspended sediment concentrations and (c) total phosphorus concentrations. These distributions are described in the lower part of each graph, comprising a Gamma Distribution Model fitted to these frequency distributions. In these, α is a curvilinear function describing the rising limb and β is an exponential function representing the falling limb of the distribution.

[Figure 3 panels: (a) Discharge, (b) Suspended sediment, (c) Total phosphorus. Each panel shows a frequency histogram (% of total count) for daily, weekly and monthly sampling, with a fitted distribution beneath. Fitted gamma parameters for discharge: monthly (α = 4.11, β = 3.46), weekly (α = 1.88, β = 1.22), daily (α = 2.22, β = 1.56). Fitted exponential curves for suspended sediment: monthly exp(-0.118x), weekly exp(-0.079x), daily exp(-0.072x). Fitted gamma parameters for total phosphorus: monthly (α = 30.2, β = 113.7), weekly (α = 11.4, β = 44.1), daily (α = 6.05, β = 17.6).]

Figure 3 Impacts of the frequency distribution of flow and suspended sediment on the hydrochemical response characteristics of UK rivers.

The trends for discharge and suspended sediment concentrations are telling. Discharge shows a skewed distribution where baseflow is reflected in the near-normally distributed section of the distribution, and stormflow contributes the highly skewed tail of the distribution. Suspended sediment concentrations could not be fitted to the Gamma Distribution Model, since the data distribution is characteristically exponential. This is because suspended sediment transport will be equal to zero when flow fails to exceed the critical mobility threshold for the finest sediment fraction. For both discharge and suspended sediment the tail of the distribution contracts as sampling frequency is reduced.

The impact of these trends in discharge and suspended sediment behaviour on phosphorus concentrations is clear. For Total P, where 40% of the load is transported as particulate P, the hydrochemical response characteristic shows a skewed distribution comparable to that for discharge. As sampling frequency is reduced from daily to monthly the tail of the distribution is lost and the data distribution seems to trend towards normality. This reflects the fact that less frequent sampling has a significantly reduced chance of incorporating short-term, extreme flow conditions within the data set. Thus nutrient species concentrations determined from a low-frequency sampling programme will tend to reflect only baseflow conditions, which will tend to show a normal distribution across the range of baseflow conditions in the absence of the exponentially distributed suspended sediment fraction from the water column. This presents a major problem where an accurate estimate of total nutrient loading is required in a sampling programme, since low-frequency sampling has a high probability of misrepresenting the environmental behaviour of all non-normally distributed variables. At low sampling frequencies the co-incidence of high flow volume with high nutrient or sediment concentrations will have a high probability of being missed. Since these high flow events will account for a substantial proportion of the total nutrient load transported in the river in any water year, loading estimates based on low sampling frequencies will be likely to significantly underestimate the total annual nutrient load. It is also possible that inclusion of one extreme high flow event in 12 monthly records could lead to a significant overestimate of nutrient loading in some systems.

When this analysis was repeated for those UK catchments with paired daily observations of discharge, suspended sediment and P fraction concentrations, it was apparent that each variable could be assigned to one of 3 basic hydrochemical response types, as shown in Fig. 4. In the systems studied suspended sediment showed a uniformly exponential distribution. All individual P fractions showed a skewed distribution, including total P in predominantly agricultural catchments and those with a baseflow dominated hydrograph. Total P only ever showed a quasi-normal distribution in catchments with a high proportion of the total P load delivered from daily discharges of poorly treated sewage effluent, maintaining a relatively constant rate of injection into the river and overwhelming the diffuse source signal. However, this varied between years, from wet to dry, resulting from the variable significance of the diffuse P loading to the total P load transported in these rivers.
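The loss of the distribution tail under sub-sampling can be explored with a minimal sketch. It uses synthetic data, not the Windrush record, and it estimates gamma parameters by the method of moments, which is a stand-in chosen for simplicity and not necessarily the fitting procedure used for Fig. 3. The function names, the storm probability and all distribution parameters are hypothetical.

```python
import random
import statistics


def gamma_moments(values):
    """Method-of-moments gamma estimates: (shape, scale).

    shape = mean^2 / variance, scale = variance / mean.
    """
    mean = statistics.fmean(values)
    var = statistics.pvariance(values)
    return mean * mean / var, var / mean


def synthetic_daily_tp(days=365, seed=1):
    """Hypothetical daily TP series (mg/l): baseflow signal plus rare storm peaks."""
    rng = random.Random(seed)
    series = []
    for _ in range(days):
        baseflow = rng.gammavariate(20.0, 0.005)  # clustered around 0.1 mg/l
        # Roughly 1 day in 20 carries an additional storm-driven particulate peak.
        storm = rng.expovariate(2.0) if rng.random() < 0.05 else 0.0
        series.append(baseflow + storm)
    return series


daily = synthetic_daily_tp()
for label, step in (("daily", 1), ("weekly", 7), ("monthly", 30)):
    sample = daily[::step]  # fixed-interval sub-sample of the daily record
    shape, scale = gamma_moments(sample)
    print(f"{label:8s} n={len(sample):3d} max={max(sample):.2f} "
          f"shape={shape:.2f} scale={scale:.3f}")
```

A sub-sample can never contain a larger value than the daily record it is drawn from, and with few samples it will often miss the storm peaks altogether. Changing `seed` or the starting offset of the slice shows how strongly the fitted parameters depend on which days happen to be sampled.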

[Figure 4 schematic: number of samples plotted against concentration, with three response curves labelled Suspended sediment; Q, SRP, SUP, PP, (TP); and (TP).]

Figure 4 Typical hydrochemical response characteristics of flow, suspended sediment and P fractions in UK rivers.

Determining load estimate uncertainty

There are thus a range of sources of uncertainty associated with P load estimation. These are due to the inherent variability in the drivers controlling P transport and transformations in catchments, the proportion of the total P load derived from diffuse (flow dependent) and point (flow independent) sources, and the proportion of the total P load transported in each P fractional form (SRP, SUP, PP), as these vary both between basins and between water years within a single basin. This problem has received attention in a number of earlier studies, with new methodologies derived to compensate in some way for the failure to acquire a representative sample of the true environmental behaviour of contaminants in routine water quality monitoring programmes (see for example Dolan et al., 1981; Kronvang and Bruhn, 1996; Webb et al., 2000). Many of these rely on the availability of continuous flow records, and relationships derived between flow and contaminant transport to ‘repair’ the observed P record. However, the extent to which a particular methodology can compensate for the inadequacies of the sampling programme will vary between basins according to the degree of skewness in the frequency distribution of the ‘true’ population of data, and the complexity of catchment sources contributing to the contaminant signal. A wide range of studies report the impact of these methodologies on sediment, and occasionally P, load estimation in different catchments (see for example Walling and Webb, 1981, 1982, 1988; Rekolainen et al., 1991; Cohn et al., 1992; Walling et al., 1992; Cohn, 1995; Kronvang and Bruhn, 1996; Webb et al., 1997; Phillips et al., 1999; Horowitz, 2003; Littlewood and Marsh, 2005). In the most recent papers, new statistical techniques are presented to reduce the uncertainty associated with contaminant transport based on discharge indices (Vogel et al., 2003) and in the case of records censored because of analytical detection limits (Cohn, 2005). These provide a useful step forward, but there remains a need for new and improved procedures for P load estimation, not least because in many catchments P transport may not be strongly controlled by stream discharge, so that continuous flow records cannot be used to fill the gaps in the observed P database.

Here the impact of load estimation methodology on P load estimation at different sampling frequencies is investigated in relation to the sources of uncertainty identified in section 2. A suite of analyses has been undertaken in 17 UK catchments with differing hydrology, population density and catchment size for which paired daily TP and flow data are available. The environmental characteristics of these catchments are summarised in Table 1. For 11 catchments multiple years of daily observations are available, allowing investigation of inter-annual variability in P load estimate uncertainties under a variety of environmental conditions. The results of this analysis are presented in three parts.
First an analysis of the inherent uncertainties associated with different load estimation techniques is presented, based only on the daily data sets, and the least uncertain methodologies for different catchment types are discussed. Second the impact of sampling frequency on the degree of uncertainty associated with each load estimation technique is presented, based on four different sampling frequencies, and the degree of uncertainty associated with load estimation at each frequency is discussed. Finally, recommendations are given for the least uncertain load estimation procedure for use in infrequently sampled rivers in lowland landscapes, and indications are given for areas requiring further research. To assist in the interpretation of the results, some basic information on key catchment characteristics previously reported as important in controlling P fractionation and transport is provided for each of the catchments (Table 2). This includes baseflow index, population density and catchment size and illustrates the wide range of conditions represented in the daily paired Q and TP records for these essentially lowland rural catchments. Baseflow index ranges from 0.41 to 0.98, population density ranges from 0.2 to nearly 3 people per hectare, and catchment size varies from 25 km2 to 1282 km2. Mean annual runoff varies from 114 mm to 913 mm, and annual average rainfall varies from 631 mm to 1387 mm. The relative impact of these controls on the hydrochemical response characteristics for P and flow in the different catchment types is drawn out in the analysis.
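The second part of the analysis, decimating a daily record and comparing the resulting load estimates with the load from the full record, can be sketched as follows. This is a minimal illustration under stated assumptions. The ‘true’ load is taken as the sum of daily flow x concentration products, and the decimated estimate is the mean sampled instantaneous load scaled to the length of the record; neither is claimed to be one of the paper's numbered methods. Bias, standard deviation and RMSE are expressed as a percentage of the ‘true’ load, with bias taken as the mean estimate minus the true value (the sign convention is an assumption). All function names and the 28-day record are hypothetical.

```python
import math
import statistics


def annual_load(flow, conc):
    """Load from the full record as the sum of daily flow x concentration."""
    return sum(q * c for q, c in zip(flow, conc))


def decimated_load_estimates(flow, conc, step):
    """One load estimate per possible start day for fixed-interval sampling.

    Each estimate is the mean of the sampled instantaneous loads multiplied
    by the number of days in the record (a simple averaging estimator).
    """
    n_days = len(flow)
    estimates = []
    for offset in range(step):
        sampled = [q * c for q, c in zip(flow[offset::step], conc[offset::step])]
        estimates.append(statistics.fmean(sampled) * n_days)
    return estimates


def error_statistics(estimates, true_load):
    """Return (bias, standard deviation, RMSE), each as % of the true load."""
    pct = [100.0 * e / true_load for e in estimates]
    bias = statistics.fmean(pct) - 100.0
    sd = statistics.pstdev(pct)
    return bias, sd, math.sqrt(bias ** 2 + sd ** 2)


# Hypothetical 28-day record with a single storm on day 4.
flow = [1.0] * 28
conc = [0.1] * 28
flow[3], conc[3] = 10.0, 1.0

true_load = annual_load(flow, conc)
weekly = decimated_load_estimates(flow, conc, step=7)
bias, sd, rmse = error_statistics(weekly, true_load)
print(f"bias={bias:.1f}%  sd={sd:.1f}%  rmse={rmse:.1f}%")
```

In this toy record only one of the seven weekly sub-records includes the storm day, so six estimates fall well below the true load and one lies far above it. The large spread between sub-records is what the standard deviation and RMSE terms capture.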

Table 1 Catchments with daily observations of Q, TP and suspended sediment

Catchment | Major geology | Land use (dominant) | Years of record | Daily variables | Rainfall (mm) | Q (mm)
Windrush at Rissington | Jurassic Limestone, Lias Clay | Mixed arable, sheep | 87/88, 88/89 | TP, TDP, Q, SS, N species (and TDP, PP weekly) | 800 | 237
Windrush at Worsham | Jurassic Limestone, Lias Clay | Arable, sheep, cattle | 87/88, 88/89 | TP, TDP, Q, SS, N species (and TDP, PP weekly) | 763 | 233
River Cole Site A | Oxford Clay, some chalk | Mixed arable | 95/96 | TP, TDP, Q, SS (and SRP, DHP, PP weekly) | 682 | 275
River Cole Site B | Oxford Clay, some chalk | Mixed arable | 95/96 | TP, TDP, Q, SS (and SRP, DHP, PP weekly) | 682 | 275
River Cole Site D | Oxford Clay, some chalk | Mixed arable | 95/96 | TP, TDP, Q, SS (and SRP, DHP, PP weekly) | 682 | 275
Lambourn Up | Upper Chalk | Grass, arable, sheep, pigs | 95/96, 96/97, 98/99, 99/00 | TP, TDP, Q, SS, N species (and SRP, DHP, PP weekly) | 745 | 183
Lambourn Down | Upper Chalk | Grass, arable, sheep, pigs | 95/96, 96/97, 98/99, 99/00 | TP, TDP, Q, SS, N species (and SRP, DHP, PP weekly) | 745 | 183
Enborne Up | Tertiary Clay, some chalk | Arable, cattle, sheep | 98/99, 99/00 | TP, TDP, Q, SS (and SRP, DHP, PP weekly) | 790 | 288
Enborne Down | Tertiary Clay, some chalk | Arable, cattle, sheep | 98/99, 99/00 | TP, TDP, Q, SS (and SRP, DHP, PP weekly) | 790 | 288
Winterbourne Stream | Upper Chalk | Grass & arable, sheep | 95/96, 96/97 | TP, TDP, Q, SS, N species (and SRP, DHP, PP weekly) | 716 | 114
Chitterne | Chalk | Grass & arable, sheep | 02/03 | TP, Q, SS (and TDP, PP weekly) | 921 | 220
Ebble | Chalk | Grass & arable, sheep | 03/04 | TP, Q, SS (and TDP, PP weekly) | 866 | 220
Nadder | Chalk with Clay in headwaters | Arable, grass, cattle | 02/03, 03/04 | TP, Q, SS (and TDP, PP weekly) | 875 | 421
Sem | Clay | Arable, cattle | 03/04 | TP, Q, SS (and TDP, PP weekly) | 880 | 475
East Avon | Chalk, Greensand, some Gault Clay | Arable, sheep, cattle | 02/03 | TP, Q, SS (and TDP, PP weekly) | 760 | 304
West Avon | Greensand, Chalk, Gault Clay | Arable, cattle, sheep | 02/03, 03/04 | TP, Q, SS (and TDP, PP weekly) | 744 | 260
Wye at Erwood | Igneous rock, Palaeozoic sediments | Moorland, sheep grazing | 02/03 | TP, Q, SS (and TDP, PP weekly) | 1387 | 913
Frome | Old Red Sandstone | Arable, livestock on hills | 02/03, 03/04 | TP, Q, SS (and TDP, PP weekly) | 721 | 287
Garron Brook | Old Red Sandstone | Arable, livestock on hills | 02/03, 03/04 | TP, Q, SS (and TDP, PP weekly) | 887 | 346
Stretford Brook | Old Red Sandstone | Arable, livestock on hills | 03/04 | TP, Q, SS (and TDP, PP weekly) | 955 | 351
Worm Brook | Old Red Sandstone | Arable, livestock on hills | 03/04 | TP, Q, SS (and TDP, PP weekly) | 955 | 351
Ant at Swafield Bridge | 50% sand and gravel, 50% loam | Arable, sheep, poultry | 99/00 | TP, Q, SS (and SRP, DHP, PP weekly) | 635 | 218
Ant at Honing Lock | 50% sand and gravel, 50% loam | Arable, poultry, pigs | 99/00 | TP, Q, SS (and SRP, DHP, PP weekly) | 631 | 206
Ant at Hunsett Mill | 50% sand and gravel, 50% loam | Arable, poultry, pigs | 99/00 | TP, Q, SS (and SRP, DHP, PP weekly) | 631 | 206

Notes: Rainfall and Q are average annual rates, 1961–1990, extracted from the National River Flow Archive, CEH Wallingford. For ungauged stations, numbers in italics have been estimated from neighbouring gauging station and rainfall isohyet maps, extracted from the NRFA records.

Table 2 Baseflow index, catchment area and population density in the study catchments

Name | Baseflow index (dimensionless) | Catchment area (ha) | Population size (n) | Population density (people/ha)
Worm Brook | 0.48 | 6900 | 1358 | 0.20
Winterbourne at Honeybottom | 0.96 | 4730 | 1100 | 0.23
Garron Brook | 0.48 | 9100 | 2838 | 0.31
Wye at Erwood | 0.41 | 128110 | 42165 | 0.33
Lambourn at Boxford | 0.98 | 19800 | 10443 | 0.53
Windrush at Rissington | 0.88 | 17500 | 9500 | 0.54
Nadder | 0.82 | 22060 | 13061 | 0.59
Chitterne | 0.90 | 6970 | 4155 | 0.60
Windrush at Worsham | 0.80 | 29600 | 17650 | 0.60
Enborne at Brimpton | 0.53 | 14500 | 9000 | 0.62
Ebble | 0.90 | 10900 | 7980 | 0.73
Ant at Swafield | 0.87 | 2559 | 1913 | 0.75
Stretford Brook | 0.48 | 5500 | 4216 | 0.77
Ant at Honing | 0.87 | 4449 | 3435 | 0.77
West Avon | 0.72 | 8460 | 7994 | 0.94
Ant at Hunsett | 0.87 | 8679 | 10859 | 1.25
Cole at Coleshill | 0.54 | 9000 | 16000 | 1.78
Sem | 0.45 | 2100 | 4837 | 2.30
Frome | 0.48 | 7770 | 19301 | 2.48
East Avon | 0.89 | 8578 | 23855 | 2.78

Uncertainty associated with selection of load estimation methodology

The load estimation methodologies used are shown in Table 3. Five interpolation methods were tested, including method 5, the standard methodology used for load estimation where daily data sets are available (see for example Kronvang and Bruhn, 1996; Webb et al., 2000). The uncertainty associated with the selection of load estimation methodology was estimated by applying each of the published methodologies to the daily data sets for the 39 water years of paired TP and flow observations. Where there were gaps in a record resulting from equipment failure, these were filled by linear interpolation between observed data (after Kronvang and Bruhn, 1996), so that each data set represented a full water year with 365 pairs of flow and P concentration data. Method 5 was used to estimate the ‘true’ load for each water year for each catchment. For each daily record and load estimation methodology, the load estimate generated was then compared against the ‘true’ load estimate for each site. The bias (B) (the difference between the ‘true’ load and the mean of the distribution of the estimates), the standard deviation (S) (reflecting the precision of the estimate) and the Root Mean Square Error ($\mathrm{RMSE} = \sqrt{B^2 + S^2}$) were then estimated and compared for each water year and catchment.

The loads estimated for each of the 39 water years are presented in alphabetical order for each of the 8 load estimation equations in Fig. 5. In each instance the load is plotted as a % of the ‘true’ load as estimated for that river and water year using method 5. For this reason the results for all catchments using method 5 plot at 100%, and no conclusions can be drawn from this regarding the reliability of Eq. (5) at daily sampling frequency. Importantly, the data presented in Fig. 5 illustrate the fact that some of the uncertainty associated with P load estimation is independent of sampling frequency, and derives from the selection of load estimation methodology. Even where daily observations are available, very biased estimates are returned by some of the methodologies when compared to the ‘true’ load. This is particularly marked for interpolation method 1, which estimates the mean concentration for the site and then multiplies this by the mean of the instantaneous flow observations. For some sites, particularly those with a high baseflow index, this method returns a reasonably close estimate of the ‘true’ load, but for others, particularly those with a more ‘flashy’ flow regime and a low baseflow index, the errors are large, with a maximum of 225% (more than double the ‘true’ load estimate) in catchments with a relatively high human population density, and a minimum of 57% (just over half of the ‘true’ load) in catchments with a similarly low baseflow index but a lower population density. Thus it seems that method 1 is particularly unsuitable for sites with a flashy regime and a high proportion of flow generated along quickflow pathways.

This analysis also indicates a high degree of bias in load estimates generated using Eq. (7), which estimates TP using a log–log rating relationship between instantaneous concentration and flow for each water year and each site. This extrapolation method appears to produce a substantial underestimate of load, even based on daily observations for each site, with the greatest bias observed for sites where there is a poor correlation between flow and TP concentrations. The most biased estimate for this method was recorded for the Wye at Erwood, with an estimate of only 45% of the ‘true’ load returned for this catchment. Indeed, method 7 seems to underestimate load systematically in all catchments and water years, except for one year in the West Avon where an estimate of 115% of the ‘true’ load was generated. Method 8 seeks to address the deficiencies

Table 3 Load estimation (after Dolan et al., 1981; Kronvang and Bruhn, 1996; Webb et al., 2000)

Interpolation methods

(1) $\mathrm{Load} = K \left( \sum_{i=1}^{n} \frac{C_i}{n} \right) \left( \sum_{i=1}^{n} \frac{Q_i}{n} \right)$

(2) $\mathrm{Load} = K \sum_{i=1}^{n} \left( \frac{C_i Q_i}{n} \right)$

(3) $\mathrm{Load} = K \sum_{i=1}^{n} \left( C_i \bar{Q}_{pi} \right)$

(4) $\mathrm{Load} = K \left( \sum_{i=1}^{n} \frac{C_i}{n} \right) \bar{Q}_r$

(5) $\mathrm{Load} = K \, \frac{\sum_{i=1}^{n} (C_i Q_i)}{\sum_{i=1}^{n} Q_i} \, \bar{Q}_r$

Description: K = conversion factor to take account of period of record; $C_i$ = instantaneous concentration associated with individual samples (mg l⁻¹); $Q_i$ = instantaneous discharge at time of sampling (m³ s⁻¹); $\bar{Q}_r$ = mean discharge for period of record (m³ s⁻¹); $\bar{Q}_{pi}$ = mean discharge for interval between samples (m³ s⁻¹); $\bar{C}_m$ = mean monthly concentration (mg l⁻¹); $\bar{Q}_m$ = mean monthly discharge (m³ s⁻¹); n = number of samples.

Beale’s ratio estimator

(6) $CF_3 = \dfrac{1 + \frac{1}{n} \frac{S_{lq}}{\bar{l}\,\bar{q}}}{1 + \frac{1}{n} \frac{S_q^2}{\bar{q}^2}}$, with $S_{lq} = \frac{1}{n-1} \left[ \sum_{i=1}^{n} (Q_i^2 C_i) - n\,\bar{l}\,\bar{q} \right]$ and $S_q^2 = \frac{1}{n-1} \left[ \sum_{i=1}^{n} (Q_i^2) - n\,\bar{q}^2 \right]$

Description: Gives a correction factor CF3 representing the ratio between mean measured loads and mean actual flow. The influence of CF3 decreases as n increases.

Extrapolation methods

(7) Log–log rating: $C_i = a Q_i^{b}$

(8) Smearing estimate: $CF_2 = \frac{1}{n} \sum_{i=1}^{n} 10^{e_i}$, where $e_i = \log(C_i) - \log(C_{ei})$

Description: Log–log estimate of concentration is multiplied by CF2 to give ‘smeared’ estimate of concentration.
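As an illustration only, the interpolation estimators (1)–(5) and the Beale correction factor in Table 3 could be coded along the following lines. This is a minimal sketch, not the implementation used in the study. The function names, the plain-list inputs and the default K = 1 are assumptions. The means in CF3 are taken here to be the sample means of C_i·Q_i and Q_i, which Table 3 does not state explicitly.

```python
def _mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def load_method1(c, q, k=1.0):
    """Eq. (1): mean concentration times mean sampled flow."""
    return k * _mean(c) * _mean(q)


def load_method2(c, q, k=1.0):
    """Eq. (2): mean of the instantaneous loads C_i * Q_i."""
    return k * _mean(ci * qi for ci, qi in zip(c, q))


def load_method3(c, q_interval_mean, k=1.0):
    """Eq. (3): each concentration times the mean flow of its sampling interval."""
    return k * sum(ci * qp for ci, qp in zip(c, q_interval_mean))


def load_method4(c, q_record_mean, k=1.0):
    """Eq. (4): mean concentration times mean flow for the period of record."""
    return k * _mean(c) * q_record_mean


def load_method5(c, q, q_record_mean, k=1.0):
    """Eq. (5): flow-weighted mean concentration times mean flow for the record."""
    return k * sum(ci * qi for ci, qi in zip(c, q)) / sum(q) * q_record_mean


def beale_cf3(c, q):
    """Correction factor CF3 of Eq. (6).

    Assumption: l-bar is the mean of C_i * Q_i and q-bar is the mean of Q_i.
    """
    n = len(c)
    loads = [ci * qi for ci, qi in zip(c, q)]
    l_bar, q_bar = _mean(loads), _mean(q)
    s_lq = (sum(qi * li for qi, li in zip(q, loads)) - n * l_bar * q_bar) / (n - 1)
    s_qq = (sum(qi ** 2 for qi in q) - n * q_bar ** 2) / (n - 1)
    return (1 + s_lq / (n * l_bar * q_bar)) / (1 + s_qq / (n * q_bar ** 2))
```

Under constant flow, methods 1, 2, 4 and 5 give the same load and CF3 equals 1. The estimators diverge only when concentration and flow co-vary, consistent with the sensitivity to flow regime discussed in the text.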

in method 7 by using a correction factor which takes account of the difference between the observed and estimated concentrations generated by method 7. Method 8 is less biased as a load estimation method than method 7 for these catchments, but still tends to return a marked underestimate of the ‘true’ load for more than half of the catchments studied here.

Excepting method 5, the methods which seem to return the least biased estimates of the ‘true’ load, based on a daily sampling frequency, are methods 2, 3 and 6. Methods 2 and 3 relate instantaneous concentration to either instantaneous flow (method 2) or mean flow between sampling periods (method 3), while method 6 applies a correction factor to the estimates, based on instantaneous flow and concentration, to represent the ratio between the mean measured loads and the mean actual flow. Method 6 is Beale’s Ratio Estimator, which has performed well when tested in a number of catchments for the estimation of nutrient and sediment loads (see for example Dolan et al., 1981; Kronvang and Bruhn, 1996; Webb et al., 2000). However, as the authors of this method note, the influence of the correction factor decreases as the number of samples increases. Thus, by testing it against the full daily record, its performance may not be clear. The value of this method is clarified in Section 3.2, where it is tested against estimates of the ‘true’ load as sampling frequency decreases.

This initial analysis shows that different equations return load estimates with different degrees of bias for the various catchments, and these differences may be explained, in part, by the proportion of the signal delivered from point sources (as indicated by the population density in the catchment) and the flashiness of the flow regime (as indicated by the baseflow index in a catchment). This is evident in the data presented in Fig. 6. Here the range of load estimates generated from

Figure 5 Load estimates by methodology (% of ‘true’ load) based on daily sampling. [Plot: x-axis, Equations 1–8; y-axis, % deviation of load from ‘true’ mean; one series per catchment and water year.]

Figure 6 Imprecision as an indicator of uncertainty in TP load estimates based on daily data: range of TP load estimates generated using Eqs. (1)–(8) for each site, ranked (a) alphabetically, (b) by population density, (c) by catchment area and (d) by baseflow index. [Each panel: y-axis, TP load estimate (kg/ha); series, Max TP (+1 sd), Min TP (−1 sd) and TP load (TRUE).]

the daily data sets, using the eight load estimation methods, is plotted for each catchment, and ranked in ascending order according to four criteria. In Fig. 6(a) the data are plotted in alphabetical order, indicating the degree of imprecision in load estimates by catchment and water year. From this it can be seen that some catchments, such as the Chitterne, Ant, East Avon, Ebble, Lambourn, Windrush and Winterbourne, are relatively insensitive to the load estimation procedure used when the estimates are generated from daily data sets. All of these systems have a moderate to high baseflow index. Conversely, there are other catchments, such as the Enborne, Cole, Sem, Stretford Brook and the Wye at Erwood, which show a high degree of sensitivity to the load estimation method selected, with some methods producing very imprecise load estimates even though daily observations are used. This plot also shows that the trend is not sensitive to P loading rate, since the most imprecise load estimates span the P loading range from 0.7 kg P ha⁻¹ to 3.2 kg P ha⁻¹.

To isolate the impact of controlling variables on load estimate uncertainty associated with selection of methodology, these data are also presented ranked by population density (Fig. 6(b)), catchment size (Fig. 6(c)), and baseflow index (Fig. 6(d)). The rank order of the catchments according to these criteria is also presented in Table 4.

Table 4 Key criteria which affect the hydrochemical response characteristics of individual catchments

| Rank | Rank by population density (High to Low) | Rank by baseflow index (High to Low) | Rank by catchment size (Large to Small) |
|---|---|---|---|
| 1 | East Avon | Lambourn | Wye at Erwood |
| 2 | Frome | Winterbourne | Windrush at Worsham |
| 3 | Sem | Ebble | Nadder |
| 4 | Cole | Chitterne | Lambourn |
| 5 | Ant at Hunsett Mill | East Avon | Windrush at Rissington |
| 6 | West Avon | Windrush at Rissington | Enborne |
| 7 | Ant at Honing Lock | Ant at Hunsett Mill | Ebble |
| 8 | Stretford Brook | Ant at Honing Lock | Garron Brook |
| 9 | Ant at Swafield | Ant at Swafield Bridge | Cole |
| 10 | Ebble | Nadder | Ant at Hunsett Mill |
| 11 | Enborne | Windrush at Worsham | East Avon |
| 12 | Windrush at Worsham | West Avon | West Avon |
| 13 | Chitterne | Cole | Frome |
| 14 | Nadder | Enborne | Chitterne |
| 15 | Windrush at Rissington | Frome | Worm Brook |
| 16 | Lambourn | Garron Brook | Stretford Brook |
| 17 | Wye at Erwood | Stretford Brook | Winterbourne |
| 18 | Garron Brook | Worm Brook | Ant at Honing Lock |
| 19 | Winterbourne Stream | Sem | Ant at Swafield |
| 20 | Worm Brook | Wye at Erwood | Sem |

In Fig. 6(b) it is apparent that, with the exception of the River Wye at Erwood, all other sites returning imprecise load estimates have high population densities. Thus the greater the proportion of the total P load contributed by point sources, and the greater the complexity of the signal ‘observed’ in the channel, the more imprecise are the estimates generated by some of the load estimation techniques. This is not surprising, since the majority of the methods were originally developed to cope with infrequently sampled rural water bodies with a strong relationship between contaminant transport and flow. None of these criteria would apply to the discharge of variously treated sewage and septic tank effluents to river channels.

There is little evidence that catchment size is a strong determinant of the robustness and precision of load estimates in these catchments (Fig. 6(c)). The 39 water years of data span 17 catchments ranging in size from 25 km² to 1280 km², but imprecision shows little relationship with catchment size. However, there are sites with a high human population density which show a low degree of imprecision in the load estimates generated from the daily data sets, notably the East Avon, sampled in 2003–2004, the West Avon in both 2002–2003 and 2003–2004, the River Ant sampled at three sites in 1999–2000, and the Ebble sampled in 2003–2004 (Fig. 6(b)). All of these systems have a moderate to high baseflow index, and this also appears to be a key factor affecting the degree of imprecision in load estimates generated from daily data sets, as shown in Fig. 6(d). Here the rule seems to apply universally that the higher the baseflow index, the lower the degree of imprecision in the loading estimates.

General conclusions which can be drawn from this analysis are that catchments with a high baseflow index have strongly buffered flow and concentrations, such that the catchments are relatively insensitive to the load estimation method used. Load estimates for river systems in lowland permeable landscapes may therefore be regarded as relatively precise and unbiased with a low degree of uncertainty, at least where daily observations are available. Model estimates should fit closely to these load estimates, and it may thus be possible to constrain the range of models and model parameter ranges which can reasonably fit these narrow bands of observed load. Conversely, catchments with a lower baseflow index return a wide range of load estimates even where daily data sets are


available, and the analysis presented here can be used to fit ranges of uncertainty around the estimated ‘true’ mean, so that models can be calibrated against a fuzzy measure of ‘true’ load (after Beven, 2006) which takes into account the inherent variability in the hydrochemical response characteristics of these systems. Catchments with a high human population density also return a relatively high degree of imprecision in load estimates when based on daily observations. As with the catchments with a low baseflow index, ranges of uncertainty can be fitted around the estimated ‘true’ mean to give an indication of the uncertainty in the estimation of loads for these catchments, and a fuzzy measure of the ‘true’ load. In model calibration procedures, any combination of model parameter values which returns a load estimate within this range of load estimates can be considered a possibly correct description of the ‘true’ origins of the P load in the system. A wider range of optimal models and/or model parameter sets may therefore emerge for these systems, compared to those for lowland permeable catchments.
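The idea of calibrating against a fuzzy measure of the ‘true’ load can be illustrated with a short sketch. It assumes, hypothetically, that the uncertainty band is taken as the minimum and maximum of the load estimates returned by the different methods, and that each candidate model parameter set has already been run to give a simulated load. The function names and the example numbers are illustrative and are not taken from the study.

```python
def load_band(estimates):
    """Return the (lower, upper) bounds spanned by a set of load estimates."""
    estimates = list(estimates)
    return min(estimates), max(estimates)


def acceptable_parameter_sets(simulated_loads, band):
    """Keep the parameter sets whose simulated load falls inside the band.

    simulated_loads maps a parameter-set label to its simulated load,
    in the same units as the band (for example kg P/ha).
    """
    lower, upper = band
    return [name for name, load in simulated_loads.items()
            if lower <= load <= upper]


# Hypothetical load estimates (kg P/ha) from several estimation methods
band = load_band([0.9, 1.1, 1.4, 0.7, 1.0])
# Hypothetical simulated loads from three candidate parameter sets
candidates = {"set_a": 0.65, "set_b": 1.05, "set_c": 1.39}
print(acceptable_parameter_sets(candidates, band))
```

A wider band, as expected for catchments with a low baseflow index or high population density, admits more parameter sets. This is the sense in which a wider range of acceptable models may emerge for such systems.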

Uncertainty associated with sampling frequency

The analysis presented so far has been based on the availability of reliable daily observations of paired instantaneous flow and concentration in the 17 study catchments. However, in the majority of cases, where routine water quality monitoring data are all that are available, greater caution needs to be exercised. In the most detailed of earlier studies on contaminant load estimation, the various methodologies have been tested by artificially decimating the daily data set to generate replicate artificial weekly or monthly data sets for a range of catchments. This is often undertaken using Monte Carlo analysis tools to generate all possible combinations of samples from the observed record. From these the total contaminant load can be estimated, and the difference between these estimates and the ‘true’ load, as estimated from the original daily data set, is used to indicate the bias and imprecision of the loading estimates generated from infrequently sampled data streams. The limitation of these earlier studies is that too few catchment types have been studied for paired TP and flow data, and Monte Carlo analysis fails to preserve the inherent structure of the sampling programme (it is possible, for example, for all 12 ‘monthly’ samples to be selected from one month of observations). In addition, in some instances the analysis has been performed based on comparison of load estimates generated by different load estimation procedures using infrequently sampled data, and examination of the imprecision and bias of these estimates, without reference to load estimates based on more frequently sampled data sets. Conclusions in some of these cases regarding the reliability of load estimates, and the extent to which individual nutrient fractions vary over the annual cycle or between water years, must be treated with great caution, as the data in Figs. 1 and 2 illustrate.

Here, load estimate uncertainty associated with sampling frequency has been estimated by taking each of the 39 water years of daily paired data and artificially decimating the records (after Kronvang and Bruhn, 1996) to generate a series of replicate ‘monthly’ or ‘weekly’ data sets. The load estimates comprise:

• Daily sampling: 39 records × 8 methods.
• Stratified sampling (structured by day of week and Q; top 10% flows sampled daily, remainder sampled weekly): 39 records × 8 methods × 7 replicates.
• Weekly sampling (structured by day of week): 39 records × 8 methods × 7 replicates.
• Monthly sampling (structured by day of month): 39 records × 8 methods × 30 replicates.

The structure of the sampling programme was maintained for each artificial record, thus reducing the likelihood of a high RMSE being generated by concentration of sampling in a limited time window within the year. To generate the stratified sampling data sets, each river record was ‘sampled’ daily for the top 10% of flows and the remainder of the record was sampled weekly, so that for each artificial stratified sampling record the top 10% of flows were always the same records, but the remainder of the observations were varied by the day of the week. The reason for evaluating this sampling programme was that this is now the basis of the stratified sampling protocol adopted for all Danish rivers under the Danish NOVA programme, and was considered by Kronvang and colleagues to provide the least uncertain and most robust estimate of P loading in two large Danish catchments in a similar mathematical evaluation of load estimation procedures (Kronvang and Bruhn, 1996).

The impact of these low sampling frequencies on the bias and imprecision of load estimates has been calculated for each replicate data set for each catchment using each of the 8 methodologies, and compared against the ‘true’ load calculated using daily data and method 5. The uncertainty associated with these estimates is presented in Fig. 7 as the Root Mean Square Error (RMSE) of these load estimates, with the data ordered by method number along the X axis. The least uncertain estimates are those returning the lowest RMSE. For each method the estimates are presented in alphabetical order for each catchment, with Stratified, Weekly and Monthly estimates presented in sequence. Thus for method 1 the first observation is the RMSE associated with Stratified Sampling for the Ant at Honing Lock.
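The structured decimation described in this section can be sketched as follows. This is an illustrative reading of the procedure, not the code used in the study. It assumes a daily record indexed from day 0, takes every 7th day from a chosen start day for ‘weekly’ records, and takes 12 samples at a fixed 30-day step for ‘monthly’ records (an assumption standing in for sampling on a fixed day of each calendar month). The top 10% of flows is defined by rank. All function names are hypothetical.

```python
def weekly_replicate(n_days, start_day):
    """Day indices of a weekly record that starts on start_day (0-6)."""
    return list(range(start_day, n_days, 7))


def monthly_replicate(n_days, start_day, step=30, n_samples=12):
    """Day indices of a 'monthly' record: 12 samples, one every 30 days.

    Assumption: a fixed 30-day step stands in for 'day of month'.
    """
    return list(range(start_day, n_days, step))[:n_samples]


def stratified_replicate(flows, start_day, top_fraction=0.10):
    """Day indices sampled daily for the top 10% of flows, weekly otherwise."""
    n_days = len(flows)
    n_top = max(1, int(top_fraction * n_days))
    ranked = sorted(range(n_days), key=lambda i: flows[i], reverse=True)
    high_flow_days = set(ranked[:n_top])
    weekly_days = set(weekly_replicate(n_days, start_day))
    return sorted(high_flow_days | weekly_days)


def all_replicates(flows):
    """Build the 7 weekly, 7 stratified and 30 monthly index sets."""
    n_days = len(flows)
    return {
        "weekly": [weekly_replicate(n_days, d) for d in range(7)],
        "stratified": [stratified_replicate(flows, d) for d in range(7)],
        "monthly": [monthly_replicate(n_days, d) for d in range(30)],
    }
```

Because each replicate is defined by its start day rather than by random draws, the samples stay evenly spread through the water year, which is the property that Monte Carlo resampling does not guarantee.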

Figure 7 Root Mean Square Error of load estimates, calculated using an artificial stratified, weekly and monthly sampling regime, based on 7 structured replicates for stratified and weekly sampling, and 30 structured replicates for monthly sampling for each site. [Plot: x-axis, Equation (1–8); y-axis, RMSE (as % of ‘true’ load).]

This plot indicates that method 1, as before, returns a high RMSE for the majority of catchments at all sampling frequencies, and can be viewed as unreliable for the purposes of TP load estimation in the catchments studied here. Method 7 is also unreliable, as highlighted in Section 3. However, it is methods 2 and 4, which previously returned a low degree of bias and imprecision when used for the daily data sets, which are clearly unreliable when used to estimate TP loads at lower sampling frequencies. Method 2 contains no mechanism for accounting for concentrations where samples have not been collected, relying wholly on the pairing of instantaneous concentration and flow at the time of sampling. Method 4 attempts to account for the missing concentration data by reference to the mean annual flow for the entire period of the record. Both methods are clearly highly sensitive to sampling frequency and cannot be recommended for the majority of lowland UK river catchments as represented in this study, unless a representative range of paired flow and TP concentrations is available in the sampling programme, as in the case of Stratified Sampling. Methods 3 and 5 continue to return a lower RMSE for the estimates than other methods, but these nevertheless return a moderately high RMSE for a number of catchments.
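The RMSE values plotted in Fig. 7 combine bias and imprecision across the replicate load estimates for each site. A minimal sketch of that calculation, expressed as a percentage of the ‘true’ load, is given below. The function name is hypothetical. The sign convention for bias (mean estimate minus ‘true’ load) and the use of the sample standard deviation are assumptions, since the text does not specify them.

```python
import math
import statistics


def bias_precision_rmse(replicate_loads, true_load):
    """Return (B, S, RMSE), each expressed as % of the 'true' load.

    B is the mean of the replicate estimates minus the 'true' load,
    S is the sample standard deviation of the replicate estimates,
    and RMSE = sqrt(B**2 + S**2).
    """
    pct = [100.0 * load / true_load for load in replicate_loads]
    bias = statistics.mean(pct) - 100.0
    spread = statistics.stdev(pct) if len(pct) > 1 else 0.0
    rmse = math.sqrt(bias ** 2 + spread ** 2)
    return bias, spread, rmse
```

For example, replicate estimates of 8, 10 and 12 against a ‘true’ load of 10 are unbiased on average but scattered. The RMSE is then carried entirely by S, whereas a set of identical but offset estimates would give an RMSE carried entirely by B.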

Figure 8 Deviation of the TP load estimate from the true load, based on 1 daily sampling record, 7 stratified sampling replicates, 7 weekly sampling replicates and 30 monthly sampling replicates for each catchment. Data are presented in each class with the catchments ranked in ascending order by population density. [Panels: Equations 1–8; x-axis, sampling frequency (Daily, Stratified, Weekly, Monthly); y-axis, % of true mean; series, Max, Mean and Min.]

Method 6 performs well, returning lower RMSE for estimates for all catchments other than the Wye at Erwood, where there is little relationship evident in the record between flow and TP concentrations. This is the only upland site for which paired daily TP and flow data were available, and though it is possible that its behaviour may be typical of such environments, it is prudent that no further conclusions are drawn from the output for this site for upland catchments in general until further records for similar sites have been analysed.

As with the analysis of load estimation methodologies based on the daily data, here the deviation of the TP load estimates from the ‘true’ load estimate has been further broken down according to load estimation method, and ranked in ascending order according to population density (Fig. 8), catchment area, and baseflow index (Fig. 9) for each sampling frequency. Catchment area showed no apparent trend when plotted. A number of trends are, however, apparent in Figs. 8 and 9.

Stratified sampling

Where stratified sampling is employed, interpolation method 2 and extrapolation methods 7 and 8 return the least biased and least imprecise estimates of the ‘true’ mean. Compared to weekly sampling, stratified sampling generated a notably closer estimate of the ‘true’ mean in each case. However,

Figure 9 Deviation of the TP load estimate from the true load, based on 1 daily sampling record, 7 stratified sampling replicates, 7 weekly sampling replicates and 30 monthly sampling replicates for each catchment. Data are presented in each class with the catchments ranked in ascending order by baseflow index. [Panels: Equations 1–8; x-axis, sampling frequency (Daily, Stratified, Weekly, Monthly); y-axis, % of true mean; series, Max, Mean and Min.]

it is not suited to the estimation of loads using methods 3 or 4, where there is little discernible benefit over weekly sampling, and it returns a clear overestimate of load compared to weekly sampling if methods 5 or 6 are used. It is a matter of judgement for water quality managers whether the resources can be found to support the development of a stratified sampling programme for UK rivers but, were this to be so, clear benefits would ensue in the form of less uncertain P load estimates. Resource requirements could be greatly constrained if time for space substitution were considered. However, this might not be necessary for all catchment types, particularly for systems with a high baseflow index and/or low population density.

Weekly sampling

At weekly sampling frequency method 3 returns the least biased and imprecise estimates of the ‘true’ load, and this is also true for all sites at monthly sampling frequency. Selection of load estimation method is tied to the selection of sampling frequency, and is also influenced by catchment character, with a high degree of uncertainty associated with catchments with a high population density (Fig. 8) and a lower baseflow index (Fig. 9), although the effect is less marked using method 3 than other methods. The mean of the estimates produced using method 3 indicates a tendency towards a slight underestimate of the ‘true’ load, whereas method 5 consistently returns a mean of the distribution of load estimates close to 100% of the ‘true’ load, but with greater imprecision for catchments with a high population density and low baseflow index. At weekly sampling frequency there is therefore a choice to be made between method 3, with a lower degree of imprecision but a greater bias, and method 5, which tends to be less biased but has a higher degree of imprecision. Both could be used together to indicate the uncertainties associated with load estimates generated at weekly sampling frequency, and to provide a fuzzy estimate of TP load for the purposes of model calibration and nutrient management.
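The trade-off between bias and imprecision described above, and the idea of combining two methods into a single fuzzy range, can be illustrated with a minimal sketch. The replicate values below are hypothetical and are not taken from the study. The working definitions are also assumptions made for illustration: bias is taken as the departure of the replicate mean from 100% of the ‘true’ load, and imprecision as the standard deviation of the replicates.

```python
from statistics import mean, pstdev


def summarise(estimates_pct):
    """Summarise replicate load estimates expressed as % of the 'true' load."""
    centre = mean(estimates_pct)
    return {
        "bias": centre - 100.0,                 # signed departure of the mean from 100%
        "imprecision": pstdev(estimates_pct),   # spread of replicates about their own mean
        "low": min(estimates_pct),
        "high": max(estimates_pct),
    }


def envelope(*summaries):
    """Combine several methods into one range (a crude 'fuzzy' estimate)."""
    return (min(s["low"] for s in summaries), max(s["high"] for s in summaries))


# Hypothetical weekly replicates (% of true load) for one catchment and water year.
method_a = [88.0, 92.0, 90.0, 94.0, 91.0, 89.0, 93.0]      # tightly grouped but low
method_b = [80.0, 120.0, 95.0, 110.0, 100.0, 85.0, 110.0]  # centred on 100% but scattered

a = summarise(method_a)
b = summarise(method_b)
fuzzy_range = envelope(a, b)

print("method A bias and imprecision:", a["bias"], a["imprecision"])
print("method B bias and imprecision:", b["bias"], b["imprecision"])
print("combined range (% of true load):", fuzzy_range)
```

In this illustration method A behaves like a precise but biased estimator and method B like an unbiased but imprecise one, so reporting the combined range conveys more about the uncertainty than either single figure would.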


Monthly sampling

Monthly sampling frequency clearly returns the most imprecise, biased and uncertain of the loading estimates for all catchments, regardless of the method used. Even using method 3, the least imprecise of the methods tested, the load estimates for a number of the catchments studied range from more than double to less than half of the ‘true’ load estimates for these rivers. The range of imprecision and bias is, again, least for catchments with a high baseflow index and low human population density. It might be that a hierarchical programme could be devised with less frequent sampling in these systems, allowing for more intensive sampling in systems with a low baseflow index and/or higher human population densities. Continuing with the current monitoring programme in the UK cannot be recommended. A final outcome from this analysis is that sampling at low frequencies can generate a substantial overestimate of load for sites with a relatively low baseflow index and/or high population density. Estimates were generated in the range 196–452% of ‘true’ load for the rivers Frome, Cole, Wye, Sem, Stretford Brook and Worm Brook. This can be attributed to the inclusion of data for one or more of the highest flow events of the year in the 12 monthly samples in these records. The pattern is illustrated in Fig. 10, with sites ranked by baseflow index. At a relatively low baseflow index a single daily record can account for up to 20% of the total annual P load transported in the river, while the top 5 flow events in the year account for up to 42% of the total annual P load. By contrast, at a higher baseflow index the highest flow event of the year typically accounts for <1% of the total annual P load.
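The share of the annual load carried by the highest flow days can be computed directly from a daily paired flow and concentration record. The following is a minimal sketch of that calculation, assuming that the daily load is simply the product of flow and concentration in consistent units. The two synthetic records are invented for illustration and do not represent any of the study catchments.

```python
def event_shares(flow, conc, top_n=5):
    """Return the % of total load carried on the highest-flow day and on the top_n highest-flow days."""
    if len(flow) != len(conc):
        raise ValueError("flow and conc must have the same length")
    loads = [q * c for q, c in zip(flow, conc)]  # daily load in arbitrary consistent units
    total = sum(loads)
    # Rank days by flow, highest first (ties keep their original order).
    order = sorted(range(len(flow)), key=lambda i: flow[i], reverse=True)
    top1 = loads[order[0]]
    topn = sum(loads[i] for i in order[:top_n])
    return 100.0 * top1 / total, 100.0 * topn / total


# Synthetic flashy record: 360 low-flow days plus 5 storm days with elevated concentration.
flashy_flow = [1.0] * 360 + [20.0, 10.0, 8.0, 6.0, 5.0]
flashy_conc = [0.1] * 360 + [0.5] * 5
flashy_top1, flashy_top5 = event_shares(flashy_flow, flashy_conc)

# Synthetic baseflow-dominated record: almost constant flow and concentration.
steady_flow = [1.0] * 364 + [1.5]
steady_conc = [0.1] * 365
steady_top1, steady_top5 = event_shares(steady_flow, steady_conc)

print("flashy: top day %.1f%%, top 5 days %.1f%%" % (flashy_top1, flashy_top5))
print("steady: top day %.1f%%, top 5 days %.1f%%" % (steady_top1, steady_top5))
```

When a few days dominate the annual load in this way, a monthly record that happens to include one of them will scale that day's load across a whole month, which is consistent with the large overestimates reported above.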

Recommendations for least uncertain P load estimation methodology

A number of recommendations can be made both for sampling programme design and selection of load estimation

Figure 10 % of TP annual load transported in high flow events, ranked by baseflow index. Series shown: % TP load transported in highest flow event; % TP load transported in top 5 events. Vertical axis: % of total annual P load.

Table 5 Recommended procedures for total P load estimation in lowland clay and permeable catchments

Baseflow index   Population density   Stratified       Weekly           Monthly
High             Low                  Method 2, 7, 8   Method 2, 3, 5   Method 3
Moderate         Moderate             Method 2, 7, 8   Method 3         Method 3
Low              High                 Method 2, 7, 8   Method 3         Method 3

methodology, to identify and constrain uncertainty in observed P load estimates for UK rivers.
• All samples must be analysed to determine both total P and SRP concentrations. The environmental behaviour of TP is more robust, and TP loads can be estimated with far greater certainty than those of SRP or TDP alone; this applies at all sampling frequencies.
• Sampling at monthly frequency returns highly uncertain load estimates. Method 3 returns the least imprecise and uncertain estimates for the catchment types included in this analysis and should be the preferred method for lowland clay and permeable catchments sampled at monthly frequency. However, method 5 should be the preferred method if a less biased estimate is required.
• For catchments with a high baseflow index and/or a low population density, methods 2 and 5 both return load estimates with a relatively low degree of bias and imprecision at weekly sampling frequency, but method 2 is unreliable for all catchments at monthly sampling frequency.
• For catchments with multiple water years of daily paired TP and flow records, the analysis has shown that the relationships between flow and both TP and TDP are not stable from year to year, and thus the use of extrapolation methods to reconstruct TP loads based on flow for years with infrequent TP observations cannot be recommended.
• Daily sampling in a routine water quality monitoring programme is unlikely to be considered affordable, but a stratified sampling programme with daily sampling on the 35 highest flow days in the year, combined with weekly sampling during the remainder of the year, returns load estimates with a low bias and imprecision. This should be considered a viable option for one sampling station at the basin outlet in each designated catchment in the UK in order to generate a reliable and robust estimate of nutrient loading to support the development of catchment management plans for the Water Framework Directive.
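The stratified design recommended above, with daily sampling on the 35 highest flow days and weekly sampling otherwise, can be sketched as a retrospective selection from a daily flow record. This is an illustration only: the function name, its parameters and the synthetic flow series are assumptions. In an operational programme the highest flow days would not be known in advance, so sampling would have to be triggered by forecast or observed flow rather than selected after the event.

```python
def stratified_sample_days(daily_flow, n_high=35, weekly_step=7, offset=0):
    """Return sorted day indices: the n_high highest-flow days plus every weekly_step-th day."""
    n = len(daily_flow)
    # Days ranked by flow, highest first; keep the n_high largest.
    high = sorted(range(n), key=lambda i: daily_flow[i], reverse=True)[:n_high]
    # Routine weekly visits starting on day `offset`.
    weekly = range(offset, n, weekly_step)
    # A day that is both a weekly visit and a high-flow day is only sampled once.
    return sorted(set(high) | set(weekly))


# Synthetic water year: constant low flow with one 35-day wet spell (days 100 to 134).
daily_flow = [1.0] * 365
for day in range(100, 135):
    daily_flow[day] = 5.0

sampled = stratified_sample_days(daily_flow)
print("number of sampling days:", len(sampled))
```

Varying `offset` from 0 to 6 yields seven alternative stratified records from the same water year, which is one way of generating replicate records for comparison.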
Recommended methodologies at different sampling frequencies are summarised in Table 5 according to baseflow index and population density. It must be noted, however, that in the absence of sufficient evidence of hydrochemical response characteristics for impermeable and upland catchments, it is not possible to extrapolate these recommendations to catchments other than those in lowland permeable or clay regions.

Conclusions

This analysis has indicated that, for daily data sets, methods which estimate load based on paired instantaneous flow (Q) and concentration (C) data are the most precise and least biased of those evaluated. As sampling frequency decreases, methods which take account of the ratio of observed flows to mean annual flow return the lowest RMSE. Methods which estimate the total annual load from the mean of the observed concentrations and flows return a high RMSE at all sampling frequencies and are not reliable methodologies for TP load estimation in lowland UK rivers. Uncertainty in loading estimates increases at all sampling frequencies as population density increases, as baseflow index decreases, and as river regime becomes more extreme. For baseflow dominated systems, particularly those with a low population density, sampling at less than daily frequency returns a relatively low RMSE. TP loads calculated from infrequent sampling programmes for such systems may be viewed as reasonably reliable indicators of riverine TP loading. For systems with significant quickflow hydrological response, and those with a substantial point source P loading, sampling at less than daily frequency will return highly uncertain estimates of observed load. The impact of catchment size on load estimate uncertainty is not clear from the catchments analysed in this study, and further study is required in smaller catchments with sub-daily peaks in flow and P transport.
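The contrast drawn above between estimators that use the continuous flow record and those based on simple means can be illustrated by decimating a daily record into weekly replicates and comparing the RMSE of the resulting load estimates. The sketch below uses two generic estimators written for this illustration. They are assumed to be representative of the two families of method and are not claimed to reproduce any of the numbered methods in the paper. The daily record is synthetic.

```python
from math import sqrt


def weekly_replicates(n_days, step=7):
    """Index lists for the `step` possible weekly records cut from a daily record."""
    return [list(range(start, n_days, step)) for start in range(step)]


def true_load(flow, conc):
    """Reference load: sum of the daily flow x concentration products."""
    return sum(q * c for q, c in zip(flow, conc))


def mean_product_estimate(flow, conc, idx):
    """Mean sampled concentration x mean sampled flow, scaled to the length of the record."""
    n = len(idx)
    mean_c = sum(conc[i] for i in idx) / n
    mean_q = sum(flow[i] for i in idx) / n
    return len(flow) * mean_c * mean_q


def flow_weighted_estimate(flow, conc, idx):
    """Flow-weighted mean sampled concentration x total flow from the continuous record."""
    fw_conc = sum(conc[i] * flow[i] for i in idx) / sum(flow[i] for i in idx)
    return fw_conc * sum(flow)


def rmse_pct(estimates, truth):
    """Root mean square error of the estimates, expressed as % of the true load."""
    return 100.0 * sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates)) / truth


# Synthetic 364-day record: a storm every 30th day, concentration rising with flow.
flow = [10.0 if day % 30 == 0 else 1.0 for day in range(364)]
conc = [0.05 + 0.02 * q for q in flow]

truth = true_load(flow, conc)
replicates = weekly_replicates(len(flow))
rmse_mp = rmse_pct([mean_product_estimate(flow, conc, r) for r in replicates], truth)
rmse_fw = rmse_pct([flow_weighted_estimate(flow, conc, r) for r in replicates], truth)

print("mean-product RMSE (%% of true load): %.1f" % rmse_mp)
print("flow-weighted RMSE (%% of true load): %.1f" % rmse_fw)
```

For this synthetic record the flow-weighted estimator returns the lower RMSE, because the continuous flow record corrects for storm days that the weekly samples miss. Real records may behave differently, particularly where the relationship between flow and concentration is unstable.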

Acknowledgements

The data used in this analysis have been collected in research programmes funded by a wide range of UK funding agencies. The author would like to acknowledge their support for her work, and the work of her many co-workers and students on these projects. Data for the Windrush, Lambourn, Winterbourne and Enborne were generated under Natural Environment Research Council studentships (GT4/AAPS/87, Johnes; GT4/94/402, Prior; GT4/97/241, Evans). Data for the Cole were collected as part of the EU LIFE funded River Restoration Project, in collaboration with Dr Jeremy Biggs, Oxford Brookes University. The Ant data were collected under an Environment Agency funded programme on the Effectiveness of Eutrophication Control by P Reduction (EA R&D project P2-137). Data for the Hampshire Avon and its tributaries the Ebble, Chitterne, Nadder, Sem, East and West Avon, and for the Herefordshire Wye and its tributaries the Frome, Garron Brook, Stretford Brook and Worm Brook were collected under the PSYCHIC (Phosphorus and Sediment Yield Characterisation in Catchments) project funded by the Department for Environment Food and Rural Affairs (Defra project PE0202), Environment Agency and English Nature. The views expressed are those of the author and do not necessarily represent the views of the funding agencies.

