Effects of sampling strategies and estimation algorithms on total nitrogen load determination in a small agricultural headwater watershed


Journal of Hydrology 579 (2019) 124114

Contents lists available at ScienceDirect

Journal of Hydrology journal homepage: www.elsevier.com/locate/jhydrol

Research papers




Ying Li (a), Haw Yen (b), R. Daren Harmel (c), Qiuliang Lei (a,⁎,1), Jiaogen Zhou (d), Wanli Hu (e), Wenchao Li (a), Huishu Lian (a), A-Xing Zhu (f,g), Limei Zhai (a), Hongyuan Wang (a), Weiwen Qiu (h), Jiafa Luo (i), Shuxia Wu (a), Hongbin Liu (a,⁎,1), Xiaohong Li (a)

a Key Laboratory of Nonpoint Source Pollution Control, Ministry of Agriculture and Rural Affairs of China, Institute of Agricultural Resources and Regional Planning, Chinese Academy of Agricultural Sciences (CAAS), Beijing 10081, China
b Blackland Research and Extension Center, Texas A&M AgriLife Research, Texas A&M University, TX 76502, USA
c Center for Agricultural Resources Research, USDA Agricultural Research Service, United States Department of Agriculture, Fort Collins, CO 80526, USA
d School of Urban and Environmental Sciences, Huaiyin Normal University, Jiangsu Province 223300, China
e Institute of Agricultural Resources & Environment, Yunnan Academy of Agricultural Sciences, Kunming 650205, China
f Key Laboratory of Virtual Geographic Environment (Nanjing Normal University), Ministry of Education, Nanjing 210023, China
g Department of Geography, University of Wisconsin-Madison, Madison, WI 53706, USA
h New Zealand Institute for Plant & Food Research Limited, Private Bag 4704, Christchurch 8140, New Zealand
i AgResearch, Ruakura Research Centre, 10 Bisley Road, Hamilton 3214, New Zealand

ARTICLE INFO

This manuscript was handled by G. Syme, Editor-in-Chief, with the assistance of Li He, Associate Editor

Keywords: Total N monitoring; Measurement uncertainty; Load estimation; Sampling frequency

ABSTRACT

Monitoring data collected from rivers are used to support assessments of the quality of the river environment and to help decision makers formulate appropriate management plans. Therefore, accurate and precise estimates of constituent fluxes in streams and rivers are important. However, seasonal errors have not received enough attention, and published results have rarely been verified by other methods. In this study, the errors associated with six sampling frequencies (3 daily, weekly, 2 weekly, 4 weekly, 6 weekly, 8 weekly) and seven load estimation algorithms were examined. Seasonal and annual total N loads were estimated, and wavelet coherence was used as a reference. To reduce the influence of chance on the results, three years of data were used, and total N (with 17.53% particulate N, 58.70% NO3−-N and 23.77% other dissolved N fractions) was chosen as the water quality parameter. Results indicated that a 2 weekly sampling frequency and algorithm F (linear interpolation) are sufficient for characterizing the annual total N load in the Fengyu River Watershed, with a root-mean-square error (RMSE) of 5%. The RMSE was large in summer (3.97–44.66%), when the coefficient of variation (CV) of total N concentration was high (35.2%). These results can serve as a useful reference for decision makers who need to establish monitoring programmes in watersheds with similar characteristics.

1. Introduction

In recent decades, numerous monitoring projects have been conducted to assess water quality degradation in riverine systems (Hudnell, 2010; Richards et al., 2008; Vörösmarty and Meybeck, 2004; Walling and Webb, 1996). Such monitoring programs have been conducted to: (i) assess current water quality conditions (Roygard et al., 2012); (ii) distinguish between point and non-point pollutant sources (Roygard et al., 2012); (iii) provide data for watershed modeling (Gassman et al., 2007; Ullrich and Volk, 2010); (iv) report field boundary conditions and monitoring time to evaluate long-term trends in river loads (Brauer et al., 2009; King and Harmel, 2003); and (v) evaluate the effectiveness of best management practices to guide policy and management decisions (Brauer et al., 2009; King and Harmel, 2003; Lennartz et al., 2010; Snelder et al., 2017). Ideally, the uncertainty of monitoring data (in terms of accuracy and precision) would be minimized by using appropriate collection and analysis methods, and estimates of the uncertainty would be reported alongside the data to guide data users (Birgand et al., 2010; Harmel et al., 2009; Moatar et al., 2013). However, the error in annual constituent load estimates relative to the “true” reference load may reach ± 100% in some cases, which limits the usefulness of the data (Walling and Webb, 1981; Yen et al., 2016).



⁎ Corresponding authors.
E-mail addresses: [email protected] (Q. Lei), [email protected] (H. Liu).
1 USDA is an equal opportunity employer and provider.
https://doi.org/10.1016/j.jhydrol.2019.124114
Received 12 April 2019; Received in revised form 25 August 2019; Accepted 4 September 2019; Available online 09 September 2019
0022-1694/ © 2019 Elsevier B.V. All rights reserved.


There are several potential sources of measurement uncertainty in river monitoring: streamflow measurements, sample collection methods, how samples are preserved and stored, sample analysis procedures, and the algorithms used to estimate loads (Harmel et al., 2006; Rode and Suhr, 2007; Yanai et al., 2015). Among these sources, sample collection under typical scenarios has the potential to contribute the most uncertainty, accounting for ± 4–50% of cumulative probable uncertainty (Harmel et al., 2006). Flow measurement technology is now mature: streamflow measurement errors between 5 and 10% at seven USGS gages can be classified as “good” (Jiang et al., 2014). In addition, the coefficient of variation (CV) of sample collection was 6% when the pollutant concentration was uniformly distributed across the cross section (Martin et al., 1992); there may be a further loss of 0–5% during sample preservation and storage if the oxidized nitrogen concentration is larger than 0.1 mg/L (Kotlash and Chessman, 1998); and there was a CV of 3.7–4% when an ion chromatography (IC) method was used in lab analysis (Rode and Suhr, 2007; Jiang et al., 2014). In general, river discharge (Q) data are measured continuously or near-continuously by in situ devices (e.g., stage-discharge sensors that convert measured data to discharge data according to the site-specific relationship between stage and discharge) (Horowitz, 2003; Horowitz, 2008). On the other hand, constituent concentration data are usually collected discretely (e.g., weekly to monthly) because of limited manpower and financial resources (Cassidy and Jordan, 2011; Jones et al., 2012). Therefore, several computational methods (e.g., averaging, interpolation, regression/rating curve) have been developed and used to estimate loads from flow data and discrete water quality measurements (Birgand et al., 2010; Johnes, 2007; Kronvang and Bruhn, 1996; Preston et al., 1989; Quilbé et al., 2006; Gitau et al., 2016).

Selection of a load estimation algorithm depends on various characteristics such as: (i) the constituent measured; (ii) stream characteristics (Richards and Holloway, 1987); and (iii) the sampling regime (Aulenbach and Hooper, 2006; Snelder et al., 2017). Many studies have evaluated the error in river load estimation considering sampling frequencies and load calculation algorithms (Table 1). Some methods, such as interpolation (Moatar and Meybeck, 2005; Williams et al., 2015), regression (Guo et al., 2002) and HSPF (Zamyadi et al., 2007), were used to calculate a high temporal resolution database (i.e., sub-daily or daily) as a reference “true load” for error calculation. Even a weekly database was sometimes used as a reference (Guo et al., 2002; Stenback et al., 2011). However, errors in constituent load estimation are sensitive to hydrological characteristics, including concentration-discharge relationships (Snelder et al., 2017), the Base Flow Index (BFI) and the flashiness index (FI) (Jones et al., 2012; Kerr et al., 2016; Moatar et al., 2013; Stelzer and Likens, 2006), and to watershed characteristics such as population density and size (Horowitz et al., 2015; Johnes, 2007). Available results are difficult to generalize because many factors affect them, such as the unique characteristics and stream hydrological regimes of the study areas, the algorithms used, the pollutants considered, and the study durations (Birgand et al., 2010; Cassidy and Jordan, 2011; Defew et al., 2013; Johnes, 2007; Moatar and Meybeck, 2005). The degree of bias is greater in smaller streams or upland watersheds because of their more responsive and variable hydrologic conditions (Coynel et al., 2004; Duvert et al., 2011; Johnes, 2007; Jones et al., 2012). At these sites, for example, low-frequency sampling cannot capture rapid hydrologic responses (Coynel et al., 2004; Moatar et al., 2006; Stelzer and Likens, 2006), and brief monitoring durations (e.g., a single season or year) also increase error in annual load estimation (Aulenbach and Hooper, 2006; Guo et al., 2002; Robertson and Roerish, 1999). In addition, estimates of seasonal loads are often less accurate than those of annual loads in forested streams (upland-draining or wetland-draining) (Kerr et al., 2016). When estimating dissolved reactive P (DRP) and NO3−-N loads in a tile-drained watershed, the uncertainty associated with the summer estimates was generally greater than that for the other three seasons (Williams et al., 2015). Furthermore, load estimation for agricultural watersheds is more challenging than for forested watersheds because of higher levels of temporal and spatial variability (Kelly et al., 2018). For instance, the timing of fertilizer applications relative to high streamflow periods in late spring or early summer contributes to large seasonal differences (Carpenter et al., 2015; Renwick et al., 2008; Richards and Baker, 2002; Royer et al., 2006).

The primary goal of this work was to quantify the error in seasonal and annual total N loads as affected by sample collection frequencies and load estimation algorithms in a small agricultural headwater watershed. Since constituent flux data are critical in environmental assessments such as watershed modeling (Johnson et al., 2015; Yen et al., 2014; Yuan et al., 2018; Chen et al., 2019), it is also important to examine the effects of the hydrologic monitoring period (e.g., annual or seasonal) when evaluating errors in load determination. Total N was selected as the contaminant of interest for this study because NO3−-N and NH4+-N are easily affected by pH and temperature. Specifically, four objectives were defined: (i) to quantify the error in annual total N load estimation as affected by sampling frequency; (ii) to evaluate the performance of seven load estimation algorithms; (iii) to evaluate the impact on error of capturing high-flow events; and (iv) to assess the impact of different sampling frequencies on seasonal total N load estimation, with wavelet coherence used as a reference.

2. Methodology

2.1. Site description

In this study, the effects of sampling strategies and algorithms on stream nutrient loads were investigated using stream total N and discharge data. As shown in Fig. 1, all data were collected from a small agricultural headwater watershed (Table 2), the Fengyu River Watershed (99.85861–100.0294°E, 25.88000–26.09778°N, elevation range 2082–3615 m asl). This region has distinct wet and dry seasons because of its subtropical plateau monsoon climate. The average annual air temperature is 13.9 °C, and average rainfall over the last five years was 1328 mm (data from Xialongmen Monitoring Station, 99.949894°E, 26.069082°N, elevation 2080 m asl), with a standard deviation in average rainfall over the catchment of 231 mm. The wet season includes July, August, and September, in which more than 70% of the annual precipitation occurs. The watershed is located in the mountainous areas of southwest China and has an average slope of 19°. The main soil types are brown soil (26.1%) and fuscous soil (22.3%) at the higher elevations, while paddy soil (14.3%) dominates valley bottoms and plains. The basin is the headwater of Erhai Lake and is dominated by natural forest and grassland in the mountainous region and by agriculture in the valley plain. There are no industrial zones, and scattered farming is prevalent. The basin has a dense artificial channel network for irrigation. The Base Flow Index (BFI), which represents the proportion of base flow to total runoff, is 0.85 as determined by a digital filtering method (Baseflow Model based on the Lyne-Hollick algorithm). In addition, the stream flashiness index (FI), an indicator of responsiveness determined as the ratio of absolute day-to-day fluctuations of streamflow to total annual flow (Baker et al., 2004), is 0.18 for the Fengyu River Watershed. This is a moderate flashiness index with high interannual variation (the unevenness coefficient for the FI is 39%).
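The two hydrological indices described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the flashiness index follows the verbal definition given in the text (sum of absolute day-to-day changes divided by total flow), and the base flow separation uses a single forward pass of the Lyne-Hollick filter with an assumed filter parameter `alpha = 0.925`, which the source does not report.

```python
def flashiness_index(daily_q):
    """Sum of absolute day-to-day changes in flow divided by total flow."""
    changes = sum(abs(b - a) for a, b in zip(daily_q, daily_q[1:]))
    return changes / sum(daily_q)


def lyne_hollick_bfi(daily_q, alpha=0.925):
    """Base Flow Index from one forward pass of the Lyne-Hollick filter.

    alpha is an assumed filter parameter; the source does not report one.
    """
    quick_prev = 0.0
    base_total = daily_q[0]  # first day treated as all base flow (assumption)
    for q_prev, q_now in zip(daily_q, daily_q[1:]):
        quick = alpha * quick_prev + 0.5 * (1 + alpha) * (q_now - q_prev)
        quick = min(max(quick, 0.0), q_now)  # keep 0 <= quickflow <= total flow
        base_total += q_now - quick
        quick_prev = quick
    return base_total / sum(daily_q)


if __name__ == "__main__":
    flow = [1.0, 3.0, 1.0, 3.0]  # made-up daily discharge (m3/s)
    print(flashiness_index(flow), lyne_hollick_bfi(flow))
```

A perfectly steady stream gives FI = 0 and BFI = 1 under this sketch. Published implementations often run the filter in several forward and backward passes, so values from this single-pass version would not be expected to reproduce the 0.85 reported above.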
Stream flashiness declines as catchment size increases and has been linked to higher solute load error (Baker et al., 2004; Stelzer and Likens, 2006). Strong relationships can be found between FI and watershed characteristics (e.g., catchment area, mean catchment elevation, land use, slope, BFI) (Holko et al., 2011).

2.2. Data collection

At the watershed outlet (Xialongmen Monitoring Station), a Stalker II SVR (American Applied Concept Inc., measurement accuracy ± 0.03 m/s) was used to measure stream flow from June 2010 to June

Table 1 Studies quantifying errors in load estimates, 1987–2018.
Columns: Reference; Watershed description; Watershed size (km2); Water quality parameters; Sampling frequency investigated; Monitoring frequency; Estimation methods evaluated (#)*.
Studies listed: Richards and Holloway (1987); Rekolainen et al. (1991); Tonderski et al. (1995); Kronvang and Bruhn (1996); Phillips et al. (1999); Guo et al. (2002); Moatar and Meybeck (2005); Johnes (2007); Moatar and Meybeck (2007); Bowes et al. (2009); Birgand et al. (2010); Tiemeyer et al. (2010); Birgand et al. (2011); Cassidy and Jordan (2011); Stenback et al. (2011); Jones et al. (2012); Defew et al. (2013); Jiang et al. (2014); Williams et al. (2015); Sergeant and Nagorski (2015); Horowitz et al. (2015); Reynolds et al. (2016); Kerr et al. (2016); Lloyd et al. (2016); Elwan et al. (2018); Kelly et al. (2018).
* The number of estimation methods evaluated in the listed study.


Fig. 1. Location of the Fengyu River Watershed in Southwest China, a small headwater watershed of the Erhai Lake Basin. The five-pointed star marks the monitoring station at the watershed outlet.

2012 and a Waterlog H-3553 Bubbler/Pressure Sensor (American WaterLOG, Inc., Yellow Springs, Ohio, USA, measurement accuracy ± 0.02%) was installed in June 2012, which provided stage height data at 30 min resolution. A WE300 meteorological sensor (American Global Water, Inc., California, USA, measurement accuracy ± 1%) was installed to collect weather data (e.g., solar radiation, air temperature, rainfall) every 30 min, paired with the bubbler gage. Water quality samples were collected manually at a daily frequency, and total N concentrations were determined by the acid persulfate digestion colorimetric method GB11894-89 (National Environmental Protection Agency, 1996). Since 2014, water samples have been collected manually at a 5 daily frequency from November through April. Stream discharge was calculated from the stage-discharge relationship equation. To ensure that the frequencies of the water quality and stream flow data were consistent, three years of daily total N concentration data (2011–2013) were chosen, and average daily discharge data were calculated (Fig. 2a). Discharge ranged from 0.62 to 17.66 m3/s; the 90th percentile of discharge was 4.82 m3/s and the 95th percentile was 6.27 m3/s. During 2011–2013, flow reached the 90th percentile on 112 days and the 95th percentile on 56 days. Total N concentration ranged from 0.30 to 3.73 mg/L, with the highest concentration occurring in summer. The high values of both discharge and total N concentration occurred from June to September (Fig. 2a). However, there was only a weak relationship between discharge and total N concentration (R2 < 0.1) (Fig. 2c). In addition, particulate N and dissolved N accounted for 17.53% and 82.47% of total N, respectively (58.70% NO3−-N and 23.77% other dissolved N fractions) (Fig. 2b). The particulate N content across the four seasons followed the order summer (26.09%) > fall (20.74%) > spring (12.55%) > winter (10.64%), while that for NO3−-N was fall (62.57%) > winter (61.78%) > spring (58.37%) > summer (52.17%) (Fig. 2b).

Table 2 Summary of catchment characteristics.
Area (km2): 219
Average Rainfall (mm): 1328
BFI: 0.85
FI: 0.18
Average Slope: 19
Monitoring Elevation: 2100 m
Cropland (%): 20.8
Orchard (%): 2.3
Residential (%): 1.3
Forest (%): 29.6
Grass (%): 45.9

2.3. Estimation of reference loads

Theoretically, high resolution data are required to compare the effect of sampling frequencies and estimation algorithms on load estimations (Harmel and King, 2005); however, it is extremely difficult to collect detailed nutrient load data because of limited personnel hours, financial resources, and other factors. Thus, in most studies, reference or “true” load datasets are based on sub-daily to weekly data. In this study, daily total N concentrations (mg/L) and mean daily stream flow (m3/s) were considered sufficient to represent the reference “true” river loads (t) by using Eq. (1):

$$L_t = k \sum_{i=1}^{n} C_i Q_i \qquad (1)$$

where: Lt is the reference “true” annual river load (t); k is a conversion factor that accounts for the period of load estimation and the measurement units; Ci and Qi are the constituent concentration (mg/L) and mean daily river discharge (m3/s) measured on the ith day.

2.4. Load estimation algorithms

In this study, three types of common load estimation algorithms were evaluated (averaging, interpolation, regression/rating curve). Averaging algorithms use average data to represent the time period. Interpolation algorithms assume that the data (concentration and discharge) are represented by instantaneous sampling. Regression (extrapolation) procedures depend on an empirical relationship between concentration and discharge (Moatar and Meybeck, 2005). Seven algorithms (Table 3) were used to estimate total N loads as described previously (Moatar and Meybeck, 2005; Walling and Webb, 1981, 1985). Algorithms A and D used time-weighted mean concentration values with dilution effects at high flows. Algorithms B and C used discharge-weighted concentration data. Algorithms E, F, and G applied the basic definition of flux. The reliability of algorithms A and D reflected the performance difference between the time-weighted and the discharge-weighted mean concentrations, while the


Fig. 2. Raw monitoring data (discharge and total N concentration) for 2011–2013 at the watershed outlet, proportions of particulate nitrogen (PN), dissolved nitrogen (DN) and NO3−-N, and the linear and quadratic relationships between total N (TN) concentration and discharge (cyan and red lines, respectively). The black line is the 90th percentile of stream discharge, and the red line is the 95th percentile. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Table 3 Load estimation algorithms.
Method; Algorithm; Description (reference)
A; $L = K \left(\sum_{i=1}^{n} \frac{C_i}{n}\right)\left(\sum_{i=1}^{n} \frac{Q_i}{n}\right)$; Means of sampled concentration multiplied by discharge (Preston et al., 1989)
B; $L = K \sum_{i=1}^{n} \frac{C_i Q_i}{n}$; Average of sampled loads (Preston et al., 1989)
C; $L = K \sum_{i=1}^{n} \left(C_i \bar{Q}_{i,i-1}\right)$; Assumed constant concentration around sample (Birgand et al., 2010)
D; $L = K \bar{Q} \left(\sum_{i=1}^{n} \frac{C_i}{n}\right)$; Annual flow multiplied by mean concentration of samples (Shih et al., 1994)
E; $L = K \frac{\sum_{i=1}^{n} C_i Q_i}{\sum_{i=1}^{n} Q_i} \bar{Q}$; Annual discharge multiplied by flow-weighted mean concentration (Littlewood, 1992)
F; $L = K \sum_{i=1}^{365} \left(C_i^{int} Q_i\right)$; Linear interpolation of concentrations multiplied by flow (Moatar and Meybeck, 2005)
G; $L = K \sum_{i=1}^{365} \left(C_i^{ext} Q_i\right)$; Rating curve (Littlewood et al., 1998; Phillips et al., 1999)

where:
K: conversion factor that accounts for the period of load estimation and converts the calculated values into a specific unit;
Ci: concentration of constituent associated with individual samples (mg/L);
Qi: daily discharge (m3/s);
Q̄: mean discharge for the period of record (m3/s);
Q̄i,i−1: mean discharge for Qi and Qi−1 (m3/s);
Cint: daily concentration linearly interpolated between two measured samples (mg/L);
Cext: daily concentration extrapolated by a rating curve, if a significant relationship exists between concentration and discharge;
n: number of chemical analyses.


Fig. 3. Concatenated box plots displaying the range of the total N (TN) flux estimates generated by seven (A-G) algorithms and six sampling scenarios (3 daily, weekly, 2 weekly, 4 weekly, 6 weekly, 8 weekly). The horizontal lines inside each box represent mean-value, bottom and top edges of the boxes represent 10th and 90th percentile, whiskers represent the range of the data. The black long line represents the “true” load from daily data, dark green area encompasses the ± 5% range of “true” data and light green area encompasses the ± 11% range of “true” data. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

performance of algorithms B, C and E relied on the representativeness of the discharge-weighted mean concentration derived from river sampling. Algorithms C, D and E employed the mean flow values during the period, and daily discharge values were used to approximate the period mean in algorithms A and B (Webb et al., 1997). These methods may cause loads to be over- or under-estimated. The algorithms estimate loads over a given time period using different procedures to relate concentration and discharge data.
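The reference load of Eq. (1) and several of the Table 3 algorithms can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the function names are hypothetical, the conversion factor `K_DAILY = 0.0864` assumes concentration in mg/L and discharge in m3/s with the load expressed in tonnes per day, and the interpolation holds the nearest sampled value before the first and after the last sample. Algorithms C and G are omitted because their interval averaging and rating-curve fitting are not specified in enough detail here.

```python
from bisect import bisect_left

K_DAILY = 0.0864  # assumed conversion: (mg/L) x (m3/s) -> t/day


def mean(values):
    return sum(values) / len(values)


def reference_load(conc, flow):
    """Eq. (1): sum of daily concentration x daily mean discharge."""
    return K_DAILY * sum(c * q for c, q in zip(conc, flow))


def interpolate_daily(conc, sample_days):
    """Linearly interpolate sampled concentrations onto every day.

    Days before the first or after the last sample take the nearest
    sampled value (an assumption; the source does not state edge handling).
    """
    first, last = sample_days[0], sample_days[-1]
    daily = []
    for day in range(len(conc)):
        if day <= first:
            daily.append(conc[first])
        elif day >= last:
            daily.append(conc[last])
        else:
            hi = bisect_left(sample_days, day)  # first sample day >= day
            d0, d1 = sample_days[hi - 1], sample_days[hi]
            weight = (day - d0) / (d1 - d0)
            daily.append(conc[d0] + weight * (conc[d1] - conc[d0]))
    return daily


def estimate_loads(conc, flow, sample_days):
    """Loads from algorithms A, B, D, E and F for one set of sampling days."""
    k_period = K_DAILY * len(flow)  # scales a mean daily load to the period
    cs = [conc[d] for d in sample_days]
    qs = [flow[d] for d in sample_days]
    q_bar = mean(flow)
    c_int = interpolate_daily(conc, sample_days)
    return {
        "A": k_period * mean(cs) * mean(qs),
        "B": k_period * mean([c * q for c, q in zip(cs, qs)]),
        "D": k_period * q_bar * mean(cs),
        "E": k_period * q_bar * sum(c * q for c, q in zip(cs, qs)) / sum(qs),
        "F": K_DAILY * sum(c * q for c, q in zip(c_int, flow)),
    }
```

Here `conc` and `flow` are the full daily series and `sample_days` is a sorted list of day indices that a given sampling scenario would have visited. When every day is sampled, algorithms B and F reduce to the reference load, which is a convenient sanity check.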

2.5. Sampling frequency scenarios

Sampling frequency scenarios were developed to determine how frequency affects the bias and precision of load estimations for each of the seven algorithms. In this study, six sampling frequencies (3 daily, weekly, 2 weekly, 4 weekly, 6 weekly and 8 weekly) were adopted, which produced 3, 7, 14, 30, 42, and 56 scenarios. In short, given a sampling interval of N days, there were N scenarios because the initial sampling date can vary. Constituent concentrations may be greatest on the rising limb of the hydrograph during high-flow events, with the potential for dilution after the first flush (Jiang et al., 2010; Vanni et al., 2001), often because of the strong association between flow rate and concentration (Carpenter et al., 2015; Littlewood, 1995). On the other hand, substantial suspended sediment transport may only occur during high-flow periods (Gao and Josefson, 2012). The complex relationships between concentration and discharge rate certainly affect load estimation error (Aulenbach et al., 2016; Kelly et al., 2018). There is reliable evidence that the annual fluxes of river constituents are dominated by high-flow events (Horowitz, 2008; Horowitz, 2013). Therefore, we added sampling at high flows (> 90th percentile) to represent the influence of high discharge events on load estimation.

2.6. Statistical measures of sampling error

In nutrient load estimates, error is defined as the difference between the reference load and the estimated load (Birgand et al., 2010). To compare the errors associated with sampling frequency and load estimation algorithms, the reference load was calculated from actual daily measurements, and the accuracy (percent Bias), precision (standard deviation, SD), and performance (root-mean-square error, RMSE) were determined and normalized by the measured “true” load (Preston et al., 1989). The SD of load estimation was calculated as follows:

$$SD\,(\%) = \frac{\sqrt{\dfrac{\sum_{j=1}^{N} (L_j - \bar{L})^2}{N}}}{L_t} \times 100 \qquad (2)$$

where Lj is the estimated load (kg); L̄ is the average estimated load (kg); Lt is the measured load (kg) calculated from the daily database; and N is the number of datasets. The Bias of load estimation was calculated as follows (Stelzer and Likens, 2006; Toor et al., 2008):

$$Bias\,(\%) = \frac{(L_e - L_t)}{L_t} \times 100 \qquad (3)$$

where Le is the estimated load (kg) and Lt is the measured load (kg) calculated from the daily database. When evaluating load estimation, accuracy and precision have been found to be inversely related (Walling and Webb, 1981). To account for the tradeoff between accuracy (i.e., Bias) and precision (i.e., SD), the RMSE was proposed as follows (Dolan et al., 1981; Haggard et al., 2003; Preston et al., 1989):

$$RMSE = \sqrt{Bias^2 + SD^2} \qquad (4)$$

3. Results

3.1. Loads estimation with different sampling frequencies and algorithms

In the Fengyu River Watershed, the reference “true” total N load was 272.7 t for the 3 yr period (2011–2013), with 45.2 t, 87.7 t, and 139.8 t for years 1, 2, and 3, respectively. As can be seen from Fig. 3, the variation of the estimates increased substantially as sampling frequency decreased. The results estimated by algorithm G were far from the line (“true value”), indicating relatively poor performance. To obtain an error within ± 11%, based on Harmel et al. (2006), the 2 weekly sampling frequency was acceptable for all methods except G. To obtain an error within the ± 5% uncertainty range of the “true” load, based on Shih et al. (2016), the weekly sampling frequency with algorithms E and F or the 3 daily sampling frequency with algorithms B and C is necessary. The error of the different algorithms for total N load estimation was influenced by the sampling frequency, and the algorithms had an obvious influence on the error distribution (Fig. 4). The relationships between the bias indices (B90, B50, and B10) for total N load estimation and the six sampling frequencies (3 daily, weekly, 2 weekly, 4 weekly, 6 weekly, 8 weekly) were analyzed and are presented graphically (Fig. 4). Algorithm A, with the 80% probability, had a Bias


Fig. 4. Uncertainty descriptors (B90, B50, B10 represent the 90th percentile, median, and 10th percentile of the error distribution) for seven methods used to estimate total N loads for six sampling frequencies.

within −10 to 0% when the sampling frequency was higher than 2 weekly and a Bias within ± 15% when the frequency was higher than 6 weekly. The B50 was about −5% for all studied sampling frequencies. For algorithms B and C, the 80% probability range of Bias exceeded 20% when the sampling frequency was lower than 2 weekly. Algorithm D was the best of all these algorithms, with a B90–B10 range of −10 to 5% for most frequencies, and its underestimation mostly occurred when the sampling frequency was lower than 4 weekly. Algorithm E did not show evident over- or under-estimation, but its B50 decreased slightly as the sampling interval increased. Algorithm F, with a B90–B10 range between −15 and 10%, gave the second-best estimation. Algorithm G mostly underestimated the total N load in this study area for all studied sampling frequencies.
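The scenario construction of Section 2.5 and the B10, B50 and B90 descriptors discussed above can be sketched as follows. This is a minimal illustration under assumptions, not the authors' procedure: all function names are hypothetical, the percentiles use the "inclusive" method of Python's `statistics.quantiles`, and a simple mean-load estimator (in the spirit of algorithm B, with an assumed unit conversion of 0.0864) stands in for the seven algorithms.

```python
import statistics


def sampling_scenarios(n_days, interval, extra_days=()):
    """One scenario per possible start day: sample every `interval` days.

    extra_days (e.g. high-flow days) are added to every scenario.
    """
    return [
        sorted(set(range(start, n_days, interval)) | set(extra_days))
        for start in range(interval)
    ]


def high_flow_days(flow):
    """Days whose discharge exceeds the 90th percentile of daily flows."""
    threshold = statistics.quantiles(flow, n=10, method="inclusive")[-1]
    return [day for day, q in enumerate(flow) if q > threshold]


def mean_load_estimate(conc, flow, sample_days, k_daily=0.0864):
    """Stand-in estimator: mean sampled daily load scaled to the period."""
    loads = [conc[d] * flow[d] for d in sample_days]
    return k_daily * len(flow) * sum(loads) / len(loads)


def bias_percentiles(conc, flow, interval, extra_days=()):
    """B10, B50 and B90 of percent Bias across the scenarios of one interval."""
    true_load = mean_load_estimate(conc, flow, range(len(flow)))
    biases = [
        (mean_load_estimate(conc, flow, days) - true_load) / true_load * 100
        for days in sampling_scenarios(len(flow), interval, extra_days)
    ]
    deciles = statistics.quantiles(biases, n=10, method="inclusive")
    return deciles[0], statistics.median(biases), deciles[-1]
```

Passing `extra_days=high_flow_days(flow)` to `bias_percentiles` mimics the added high-flow sampling. `statistics.quantiles` needs at least two scenarios, which holds for every interval of two days or more.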

3.2. Accuracy, precision and performance for sampling frequencies and estimation algorithms

Seven load estimation algorithms (A, B, C, D, E, F, and G) were used to evaluate the effect of several conventional sampling frequencies (3 daily, weekly, 2 weekly, 4 weekly, 6 weekly, 8 weekly) on total N flux over three years at Xialongmen Station. The accuracy (i.e., Bias), precision (i.e., SD) and combined performance (RMSE) were quantified (Fig. 5). In general, the precision of total N load estimation decreased (i.e., SD increased) as the sampling interval increased, and the same trends were found for |Bias| and RMSE (Fig. 5a–c). The differences among the seven algorithms grew as the sampling interval increased. Algorithms D and G provided the most precise estimates (SD < 10%) across all sampling frequencies. Algorithms A and E had moderate precision, with SD < 10% when the sampling frequency was higher than 4 weekly. Algorithms B and C had high SD values (SD > 10%) when the sampling frequency was lower than 2 weekly. As to accuracy, the |Bias| of algorithms B and C increased sharply when the sampling interval was more than two weeks. Therefore, it can be concluded that algorithms B and C may not be suitable for estimating constituent loads in such a small headwater watershed when the sampling frequency is lower than 2 weekly. Algorithm G had a consistent but large |Bias| (about 14%) for all sampling frequencies, whereas algorithm D had a small and stable |Bias| (4.6–6.7%). Algorithms E and F had the lowest |Bias| (< 4%) when the sampling frequency was higher than 2 weekly.

The RMSE was used as an indicator of combined performance. There was only a small gap in RMSE among the other six algorithms when sampling intervals were shorter than two weeks. Algorithm G, with an RMSE range of 13.1–17.1%, was not suitable for this small agricultural watershed. For the 3 daily frequency, there was no significant difference in the performance of algorithms A, B, C, D, E, and F (Fig. 5c). When the sampling frequency was higher than 2 weekly, algorithm F was the best (RMSE ≤ 5.2%). When the sampling frequency was lower than 2 weekly, the performance gap among the seven load estimation algorithms widened, and algorithm D had the best performance (RMSE ≤ 10%). Meanwhile, algorithms B and C had poor combined performance when the frequency was lower than weekly, and their RMSE increased sharply (RMSE > 10%) when the frequency was lower than 2 weekly. Algorithms F and D provided the two best estimations when the frequency was lower than 2 weekly, and algorithms A and E showed mediocre performance.
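The three statistics compared in this section, as defined in Eqs. (2)–(4), can be computed from the set of loads estimated across the scenarios of one sampling frequency. The sketch below is illustrative only: the function name is hypothetical, and taking Le in Eq. (3) as the mean of the scenario estimates is an assumption about how Bias is aggregated.

```python
import math


def load_error_stats(estimates, true_load):
    """Return (Bias %, SD %, RMSE %) for a set of estimated loads.

    Bias uses the mean of the estimates as Le (an assumption);
    SD is the population standard deviation normalized by the true load.
    """
    n = len(estimates)
    mean_estimate = sum(estimates) / n
    variance = sum((load - mean_estimate) ** 2 for load in estimates) / n
    sd = math.sqrt(variance) / true_load * 100
    bias = (mean_estimate - true_load) / true_load * 100
    rmse = math.sqrt(bias ** 2 + sd ** 2)
    return bias, sd, rmse


if __name__ == "__main__":
    # Made-up scenario loads (t) against a made-up reference load of 100 t.
    print(load_error_stats([95.0, 105.0, 115.0], 100.0))
```

With these definitions the RMSE equals the root-mean-square of the individual errors (Lj − Lt) expressed as a percentage of Lt, so Eq. (4) combines a systematic offset (Bias) and scatter between scenarios (SD) into a single score.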

3.3. Evaluation of high flow sampling

To study the influence of targeted high flow event sampling on load estimation, sampling at high flows (> 90th percentile of daily flows) was added to the conventional sampling frequencies (Fig. 5d–f). Adding the targeted high flow sampling improved the precision for all algorithms and sampling frequencies. In addition, the variation of SD values across sampling frequencies decreased (Fig. 5a, d). Algorithms B and C showed large improvements in precision, with average SD values decreasing by 9.4% and 8.5%, respectively. Algorithm E became the most precise (average SD of 0.7%), followed by algorithm D (average SD of 1.0%) and algorithm F (average SD of 1.5%). However, the |Bias| values of all algorithms except F (whose average |Bias| decreased by 3.5%) increased with the high flow sampling, and algorithms A, B and C had the worst accuracy, with |Bias| > 20% (Fig. 5b, e). In addition, when the high-flow sampling was conducted, the statistical performance (RMSE) of algorithms A, B and C was poor (Fig. 5c, f). On the other hand, algorithm D showed improvements at the 3 daily and weekly frequencies, whereas algorithm F showed improvements at all frequencies. As shown above, algorithm F was the best algorithm when the high-flow sampling was conducted, with an RMSE range of 1.4–5.1%, followed by algorithm D (RMSE range of 1.5–11.1%) (Fig. 5c, f).

Fig. 5. The precision (SD), accuracy (|Bias|), and performance (RMSE) of seven algorithms (A–G) used to estimate total N load as a function of sampling frequency, for conventional sampling and for conventional sampling with added high discharge sampling.

3.4. Seasonal error

The largest measured total N load occurred in the summer, with 42.16 t of total N accounting for 46% of the annual total load, followed by fall with 26% of the total load (Fig. 6b). The average discharge was highest in fall at 3.58 m³/s, followed by 3.33 m³/s in summer (Fig. 6a). The flow in these two seasons accounted for 35.7% and 33.2% of the annual flow, respectively. In the Fengyu River Watershed, fertilizers were usually applied in summer. The discharge was also at a relatively high level in this season, so summer was expected to have the highest total N load. Based on the results described in Section 3.2, algorithm F was used to quantify seasonal errors associated with infrequent sampling (3 daily, weekly, 2 weekly, 4 weekly, 6 weekly and 8 weekly). When the conventional sampling frequencies were used, load estimates for summer showed the poorest performance at all sampling frequencies (Fig. 6c). The RMSE ranges for the four seasons were 1.61–20.51% (winter), 0.87–25.29% (spring), 3.97–44.66% (summer) and 1.04–32.89% (fall).

The average RMSE in summer was 20.4%, while the average RMSE was 14.2% in fall, 12.0% in spring and 8.5% in winter. Therefore, the RMSE in summer was 8.8% higher than the average of the other three seasons. If the sampling frequency was less than 4 weekly, the RMSE values for all four seasons were more than 11%. The added high flow sampling clearly improved the performance of load estimation in summer and fall, giving an average RMSE of 6.2% in summer and 7.1% in fall. Compared to the conventional sampling frequencies, the added high flow sampling thus reduced the RMSE by 14.2% in summer and 7.1% in fall.

Wavelet coherence was used as a reference for setting the sampling frequency (Fig. 7). Wavelet analysis can further disclose detailed signals in both the frequency and time domains by the Fourier transform and can appropriately analyze detailed temporal patterns of nonstationary hydrological and water quality signals over different temporal scales (Kang and Lin, 2007). The wavelet coherence analysis of annual flow and total N concentration in the Fengyu River Watershed (Fig. 7) showed strong coherence between flow and total N at 48 days (Fig. 7c). If monthly loads were of interest, the strong coherence occurred at 4 days (Fig. 7a), and for three-month loads it occurred at about 14 days (Fig. 7b).
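The targeted high flow sampling evaluated in Section 3.3 can be illustrated with a short Python sketch. It assumes a list of daily flows, a fixed sampling interval in days, and a percentile threshold (the 90th percentile in this study). The helper names and the linear-interpolation percentile rule are illustrative assumptions, not the procedure actually used by the authors.

```python
def percentile(values, q):
    """Percentile (q in 0..100) of a non-empty list, by linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def sampling_days(daily_flow, interval, start=0, high_flow_q=None):
    """Day indices sampled at a fixed interval, optionally augmented with
    every day whose flow exceeds the high_flow_q-th percentile."""
    days = set(range(start, len(daily_flow), interval))
    if high_flow_q is not None:
        threshold = percentile(daily_flow, high_flow_q)
        days.update(i for i, flow in enumerate(daily_flow) if flow > threshold)
    return sorted(days)


# Hypothetical 20-day record with steadily rising flow
flow = [float(v) for v in range(1, 21)]
conventional = sampling_days(flow, interval=7)
augmented = sampling_days(flow, interval=7, high_flow_q=90)
```

In this toy record the weekly schedule samples days 0, 7 and 14, and the augmented schedule adds the two highest-flow days. Because the extra samples come only from high flows, an averaging algorithm gives them more weight, which is consistent with the larger |Bias| reported for some algorithms.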

Fig. 6. The “true” average discharge, total N (TN) load, and RMSE of six sampling frequencies using algorithm F in four seasons under two scenarios (conventional sampling frequency and added high discharge sampling). The horizontal line inside each box represents the mean value, the bottom and top edges of the boxes represent the 10th and 90th percentiles, and the whiskers represent 1.5 SD.

4. Discussion

4.1. Uncertainty of total N loads estimation from this small agricultural headwater watershed

It has been found that accuracy decreases with decreasing sampling frequency for nitrate load estimation (Birgand et al., 2010). The load estimation of total N showed the same trend in this small agricultural watershed, with average RMSE values of 5%, 6.7%, 7.9% and 13.7% for 3 daily, weekly, 2 weekly and monthly sampling, respectively. In addition, with monthly sampling, the errors (RMSE) for total N (6.6–20.4%) had a smaller range than for nitrate (4–26%), orthophosphate (5–33%), and particulate P (6–18%), but a larger range than for total P (13–42%) (Moatar and Meybeck, 2005). Furthermore, the precision for the monthly sampling frequency was 8.8% for total N, while it was 13% for NO₃⁻-N, 20% for total P, 26% for orthophosphate, and 34% for particulate P (Moatar and Meybeck, 2005). Runoff in the Fengyu River Watershed was dominated by groundwater, with a high baseflow index (BFI = 0.85), and the watershed had a low population density (162 persons per square kilometer). Such characteristics tend to give catchments a relatively low RMSE in load estimation (Johnes, 2007). However, annual pollutant load estimates would be inadequate for some monitoring purposes, such as assessing current water quality conditions (Roygard et al., 2012), providing data for watershed modeling (Gassman et al., 2007; Ullrich and Volk, 2010; Guo et al., 2018; Wang et al., 2019) and evaluating the effectiveness of best management practices to guide policy and management decisions. Moreover, estimating seasonal solute loads is often less accurate than estimating annual loads in forested streams (upland-draining or wetland-draining) (Kerr et al., 2016).

A previous study found that intense storm events in the summer produced higher variability in nutrient concentrations in a tile-drained watershed and led to greater uncertainty in DRP and NO₃⁻-N load estimation, but, compared to other seasons, only a small fraction (< 10%) of annual nutrient loading occurred during the summer (Williams et al., 2015). In the Fengyu River Watershed, the largest measured total N load occurred in the summer, accounting for 46% of the annual total load and 33.2% of the annual flow (Fig. 6a, b). In this study, the seasons ranked by average RMSE as follows: summer (20.4%) > fall (14.2%) > spring (12.0%) > winter (8.5%). In the Fengyu River Watershed, there was a weak C-Q relationship (R² < 0.1) (Fig. 2c). The C-Q correlation was strongest in the fall, followed by the summer and winter (Fig. 8). The C-Q relationship cannot explain the high error in such a small, agricultural, groundwater-dominated watershed. However, the three-year average CV values of the total N concentrations for the four seasons were 35.2% in summer, 30.8% in fall, 27.4% in spring and 23.9% in winter, the same seasonal order as for the RMSE values (Fig. 9). Therefore, the variation in total N concentration may be the main cause of the different seasonal errors.

4.2. Selecting an algorithm for total N loads estimation

To inform watershed management, intensive monitoring programs of contaminant fluxes are commonly conducted. However, the influence of sampling strategy and load estimation algorithms should be considered in designing and implementing such projects. Previous studies have recommended algorithm E to estimate nitrate flux, although it requires a relationship between constituent concentration and flow and a large amount of data (Birgand et al., 2010; Williams et al., 2015). Also, algorithm F performed better than algorithm E for total N and total P in small agricultural catchments, but its performance was not as good in larger basins (Kronvang and Bruhn, 1996). Algorithm D was the best for total P (especially for particulate P) (Moatar and Meybeck, 2005), although overestimation due to dilution at high flows was likely (Elwan et al., 2018; Kronvang and Bruhn, 1996). Jiang et al. (2014) studied three load calculation methods, i.e., LOADEST (LOAD Estimator), AD-CI (all days-concentration interpolated; algorithm F in this paper) and MD-FC (measured days-flow-weighted concentration), and found CV values of 39.7%, 15.7% and 24.5%, respectively. Similarly, a linear interpolation of concentrations (algorithm F in this paper) was the best among seven studied algorithms (Moatar and Meybeck, 2005; Williams et al., 2015). Previous research typically concluded that algorithms which estimate concentrations based on flow (algorithm E) and on interpolation (algorithm F) were preferred for estimating N loads in rivers (Elwan et al., 2018; Jiang et al., 2014; Moatar and Meybeck, 2005; Williams et al., 2015). The present study of a small agricultural headwater watershed produced similar results. At the conventional 2 weekly sampling frequency, algorithms E and F had the lowest RMSE values, with 6.0% for algorithm E and 5.2% for algorithm F. After the high discharge samples (> 90th percentile) were added, the precision of all algorithms improved along with the number of samples. However, the |Bias| for algorithms A, B, C and E became larger because of the increased weight of high flow samples with high total N concentrations. Overall, algorithm F had the best performance.

4.3. Selecting a sampling strategy for total N loads estimation

Relative to load estimation algorithms (A–F), sampling frequency has a larger impact on producing accurate and precise annual load estimates. A case study in the Netherlands by De Vries and Klavers (1994) indicated that, for riverine fluxes of pollutants, the monitoring strategy comes first and the calculation method second. Moreover, one study found that decreasing the sampling frequency for total P and total N from daily to 3 daily would save approximately 50% of the cost (Kovács et al., 2012). Thus, the sampling frequency should be thoroughly considered before determining the reliability of estimates. It was suggested that sampling intervals of 15 days for nitrate, 10 days for orthophosphate P and total P, and about 5 days for particulate P can achieve a precision of 10% in a large mixed land use watershed (36,970 km²) (Moatar and Meybeck, 2005). To obtain ± 10% accuracy in tile-drained fields or watersheds, sampling intervals of 13–26 h for DRP and 2.7–17.5 days for NO₃⁻-N were required (Williams et al., 2015).

In the present study, we concluded that a conventional 2 weekly sampling frequency could be sufficient for estimating annual constituent loads, with an RMSE of about 5% for algorithm F. However, this frequency would not be adequate for analyzing hydrologic periods or seasonal characteristics, for which the average RMSE was about 17% in summer. From the wavelet coherence analysis of annual flow and total N concentrations in the Fengyu River Watershed (Fig. 7), the strong coherence of flow and total N was at 48 days. This indicated that a 48 daily sampling frequency could be used if only the annual load was of interest; however, variations within the year could be lost. When the sampling frequency was set at 42 daily, the best performance (RMSE) of annual load estimation was 10.2%, and the performance at 48 daily would presumably be worse. If monthly loads were of interest, the strong coherence occurred at 4 days (Fig. 7a), but sampling at such a frequency substantially increases costs. Based on these results, the sampling frequency should be set between 4 and 48 daily to balance cost and performance considerations. If three-month loads are of interest, the strong coherence occurred at about 14 days. Therefore, the conventional 2 weekly sampling frequency could be a good choice. Previous research suggested that the required sampling frequency depends on the characteristics of the watershed (Johnes, 2007), especially the hydrological conditions (Jones et al., 2012) and concentration behavior (Littlewood, 1995). Hydrology-based sampling schemes were the most accurate for load estimation (Horowitz et al., 2015). River monitoring needs to capture as many high-flow events as possible and cover at least 80–85% of the range of local annual water discharge (Hooper et al., 2001; Horowitz, 2008).

Hydrological variation is strongly related to weather conditions, and high discharge events often occur after rainfall. Hence, the sampling strategy can be informed by the weather forecast.

Fig. 7. The wavelet coherence analysis of flow and total N concentration in the Fengyu River Watershed.
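As a rough illustration of the linear-interpolation estimator referred to in this paper as algorithm F, the Python sketch below fills in daily concentrations between sampling days and sums the daily products of concentration and flow. The function names, the constant concentration assumed before the first and after the last sample, and the unit choices (flow in m³/s, concentration in mg/L, load in tonnes) are assumptions made for the example and are not taken from the study.

```python
SECONDS_PER_DAY = 86400


def interpolate_concentration(n_days, sample_days, sample_conc):
    """Daily concentrations by linear interpolation between samples.

    sample_days must be sorted, unique day indices; values before the first
    and after the last sample are held constant (an assumption).
    """
    daily = []
    j = 0
    for day in range(n_days):
        if day <= sample_days[0]:
            daily.append(sample_conc[0])
        elif day >= sample_days[-1]:
            daily.append(sample_conc[-1])
        else:
            # Advance to the pair of samples that brackets this day
            while sample_days[j + 1] < day:
                j += 1
            d0, d1 = sample_days[j], sample_days[j + 1]
            c0, c1 = sample_conc[j], sample_conc[j + 1]
            daily.append(c0 + (c1 - c0) * (day - d0) / (d1 - d0))
    return daily


def interpolated_load_tonnes(daily_flow_m3s, sample_days, sample_conc_mgl):
    """Total load in tonnes: mg/L equals g/m3, so C * Q is in g/s."""
    conc = interpolate_concentration(len(daily_flow_m3s), sample_days, sample_conc_mgl)
    grams = sum(c * q * SECONDS_PER_DAY for c, q in zip(conc, daily_flow_m3s))
    return grams / 1e6


# Hypothetical 5-day record: constant flow, samples on the first and last day
load = interpolated_load_tonnes([1.0] * 5, [0, 4], [2.0, 4.0])
```

The estimator uses every day of the flow record but only the sampled concentrations, so its error depends mainly on how much the concentration varies between sampling days.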

Fig. 8. The linear and quadratic relationships between total N (TN) concentration and discharge (cyan and red lines, respectively) for four seasons. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

5. Conclusion

In the present study, the analysis based on a daily database for three years (2011–2013) showed that substantial error arises from using infrequent sampling to estimate the load of total N. Sampling at a lower frequency may reduce the reliability of the results. In addition, the performance of the different load estimation algorithms varied. The best combination of algorithm and sampling frequency in this small agricultural headwater watershed was algorithm F with a 2 weekly sampling frequency. The summer season had the highest error when conventional sampling frequencies were used. After high discharge sampling was added to the sampling events, the performance of load estimates in summer and fall clearly improved. In addition, the precision (SD) improved for all the adopted algorithms, and algorithm F was the best option for load estimates. Engineers and scientists can use this work to enhance the quality of future projects, especially in infrastructure development and modeling design. Valuable financial resources can also be saved and concentrated on more relevant sampling approaches to avoid unnecessary waste.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This research was supported by the National Natural Science Foundation of China (Grant No.: 31572208), the Special Fund for Agroscientific Research in the Public Interest from the Ministry of Agriculture and Rural Affairs of China (Grant No.: 201303089 and No.: 201003014), the Newton Fund (Grant Ref: BB/N013484/1), and the National Key Research and Development Program of China (2016YFD0800500). USDA is an equal opportunity employer and provider.

Fig. 9. The series of total N (TN) concentration for four seasons in 2011–2013.

References

Guo, T., Gitau, M., Merwade, V., Arnold, J., Srinivasan, R., Hirschi, M., Engel, B., 2018. Comparison of performance of tile drainage routines in SWAT 2009 and 2012 in an extensively tile-drained watershed in the Midwest. Hydrol. Earth Syst. Sci. 22 (1), 89–110.
Guo, Y., Markus, M., Demissie, M., 2002. Uncertainty of nitrate-N load computations for agricultural watersheds. Water Resour. Res. 38, 1185.
Haggard, B.E., Soerens, T.S., Green, W.R., Richards, R.P., 2003. Using regression methods to estimate stream phosphorus loads at the Illinois River, Arkansas. Appl. Eng. Agric. 19, 187–194.
Harmel, R.D., Cooper, R.J., Slade, R.M., Haney, R.L., Arnold, J.G., 2006. Cumulative uncertainty in measured streamflow and water quality data for small watersheds. Trans. ASABE 49, 689–701.
Harmel, R.D., King, K.W., 2005. Uncertainty in measured sediment and nutrient flux from small agricultural watersheds. Trans. ASAE 48, 1713–1721.
Harmel, R.D., Smith, D.R., King, K.W., Slade, R.M., 2009. Estimating storm discharge and water quality data uncertainty: a software tool for monitoring and modeling applications. Environ. Modell. Softw. 24, 832–842.
Holko, L., Parajka, J., Kostka, Z., Škoda, P., Blöschl, G., 2011. Flashiness of mountain streams in Slovakia and Austria. J. Hydrol. 405, 392–401.
Hooper, R.P., Aulenbach, B.T., Kelly, V.J., 2001. The national stream quality accounting network: a flux-based approach to monitoring the water quality of large rivers. Hydrol. Process. 15, 1089–1106.
Horowitz, A.J., 2003. An evaluation of sediment rating curves for estimating suspended sediment concentrations for subsequent flux calculations. Hydrol. Process. 17, 3387–3409.
Horowitz, A.J., 2008. Determining annual suspended sediment and sediment-associated trace element and nutrient fluxes. Sci. Total Environ. 400, 315–343.
Horowitz, A.J., 2013. A review of selected inorganic surface water quality-monitoring practices: are we really measuring what we think, and if so, are we doing it right? Environ. Sci. Technol. 47, 2471–2486.
Horowitz, A.J., Clarke, R.T., Merten, G.H., 2015. The effects of sample scheduling and sample numbers on estimates of the annual fluxes of suspended sediment in fluvial systems. Hydrol. Process. 29, 531–543.
Hudnell, H.K., 2010. The state of U.S. freshwater harmful algal blooms assessments, policy and legislation. Toxicon 55, 1024–1034.
Jiang, R., Woli, K.P., Kuramochi, K., Hayakawa, A., Shimizu, M., Hatano, R., 2010. Hydrological process controls on nitrogen export during storm events in an agricultural watershed. Soil Sci. Plant Nutr. 56, 72–85.
Jiang, Y., Frankenberger, J.R., Bowling, L.C., Sun, Z., 2014. Quantification of uncertainty in estimated nitrate-N loads in agricultural watersheds. J. Hydrol. 519, 106–116.
Johnes, P.J., 2007. Uncertainties in annual riverine phosphorus load estimation: impact of load estimation methodology, sampling frequency, baseflow index and catchment population density. J. Hydrol. 332, 241–258.
Johnson, M.V., Norfleet, M.L., Atwood, J.D., Behrman, K.D., Kiniry, J.R., Arnold, J.G., White, M.J., Williams, J., 2015. The Conservation Effects Assessment Project (CEAP): a national scale natural resources and conservation needs assessment and decision support tool, IOP Conference Series: Earth & Environmental Science, p. 012012.
Jones, A.S., Horsburgh, J.S., Mesner, N.O., Ryel, R.J., Stevens, D.K., 2012. Influence of sampling frequency on estimation of annual total phosphorus and total suspended solids loads. J. Am. Water Resour. Assoc. 48, 1258–1275.
Kang, S., Lin, H., 2007. Wavelet analysis of hydrological and water quality signals in an agricultural watershed. J. Hydrol. 338, 1–14.
Kelly, P.T., Vanni, M.J., Renwick, W.H., 2018. Assessing uncertainty in annual nitrogen, phosphorus, and suspended sediment load estimates in three agricultural streams using a 21-year dataset. Environ. Monit. Assess. 190, 91.
Kerr, J.G., Eimers, M.C., Yao, H., 2016. Estimating stream solute loads from fixed

Aulenbach, B.T., Burns, D.A., Shanley, J.B., Yanai, R.D., Bae, K., Wild, A.D., Yang, Y., Yi, D., 2016. Approaches to stream solute load estimation for solutes with varying dynamics from five diverse small watersheds. Ecosphere 7 (6), e01298. https://doi.org/10.1002/ecs2.1298.
Aulenbach, B.T., Hooper, R.P., 2006. The composite method: an improved method for stream-water solute load estimation. Hydrol. Process. 20, 3029–3047.
Baker, D.B., Richards, R.P., Loftus, T.T., Kramer, J.W., 2004. A new flashiness index: characteristics and applications to midwestern rivers and streams. J. Am. Water Resour. Assoc. 40 (2), 503–522.
Birgand, F., Appelboom, T.W., Chescheir, G.M., Skaggs, R.W., 2011. Estimating nitrogen, phosphorus, and carbon fluxes in forested and mixed-use watersheds of the lower coastal plain of North Carolina: uncertainties associated with infrequent sampling. Trans. ASABE 54, 2099–2110.
Birgand, F., Faucheux, C., Gruau, G., Augeard, B., Moatar, F., Bordenave, P., 2010. Uncertainties in assessing annual nitrate loads and concentration indicators: Part 1. Impact of sampling frequency and load estimation algorithms. Trans. ASABE 53, 437–446.
Bowes, M.J., Smith, J.T., Neal, C., 2009. The value of high-resolution nutrient monitoring: a case study of the River Frome, Dorset, UK. J. Hydrol. 378, 82–96.
Brauer, N., O’Geen, A.T., Dahlgren, R.A., 2009. Temporal variability in water quality of agricultural tailwaters: implications for water quality monitoring. Agric. Water Manage. 96, 1001–1009.
Carpenter, S.R., Booth, E.G., Kucharik, C.J., Lathrop, R.C., 2015. Extreme daily loads: role in annual phosphorus input to a north temperate lake. Aquat. Sci. 77, 71–79.
Cassidy, R., Jordan, P., 2011. Limitations of instantaneous water quality sampling in surface-water catchments: comparison with near-continuous phosphorus time-series data. J. Hydrol. 405, 182–193.
Chen, J., Liu, Y., Gitau, M.W., Engel, B.A., Flanagan, D.C., Harbor, J.M., 2019. Evaluation of the effectiveness of green infrastructure on hydrology and water quality in a combined sewer overflow community. Sci. Total Environ. 665, 69–79.
Coynel, A., Schafer, J., Hurtrez, J.E., Dumas, J., Etcheber, H., Blanc, G., 2004. Sampling frequency and accuracy of SPM flux estimates in two contrasted drainage basins. Sci. Total Environ. 330, 233–247.
De Vries, A., Klavers, H.C., 1994. Riverine fluxes of pollutants: monitoring strategy first, calculation methods second. Eur. Water Pollut. Control 4, 12–17.
Defew, L.H., May, L., Heal, K.V., 2013. Uncertainties in estimated phosphorus loads as a function of different sampling frequencies and common calculation methods. Mar. Freshw. Res. 64, 373–386.
Dolan, D.M., Yui, A.K., Geist, R.D., 1981. Evaluation of river load estimation methods for total phosphorus. J. Great Lakes Res. 7, 207–214.
Duvert, C., Gratiot, N., Némery, J., Burgos, A., Navratil, O., 2011. Sub-daily variability of suspended sediment fluxes in small mountainous catchments - implications for community-based river monitoring. Hydrol. Earth Syst. Sci. 15, 703–713.
Elwan, A., Singh, R., Patterson, M., Roygard, J., Horne, D., Clothier, B., Jones, G., 2018. Influence of sampling frequency and load calculation methods on quantification of annual river nutrient and suspended solids loads. Environ. Monit. Assess. 190, 78.
Gao, P., Josefson, M., 2012. Temporal variations of suspended sediment transport in Oneida Creek watershed, central New York. J. Hydrol. 426–427, 17–27.
Gassman, P.W., Reyes, M.R., Green, C.H., Arnold, J.G., 2007. The soil and water assessment tool: historical development, applications, and future research directions. Trans. ASABE 50, 1211–1250.
Gitau, M.W., Chen, J., Ma, Z., 2016. Water quality indices as tools for decision making and management. Water Resour. Manag. 30 (8), 2591–2610.

Royer, T.V., David, M.B., Gentry, L.E., 2006. Timing of riverine export of nitrate and phosphorus from agricultural watersheds in Illinois: implications for reducing nutrient loading to the Mississippi River. Environ. Sci. Technol. 40, 4126–4131.
Roygard, J.K.F., McArthur, K.J., Clark, M.E., 2012. Diffuse contributions dominate over point sources of soluble nutrients in two sub-catchments of the Manawatu River, New Zealand. N. Z. J. Mar. Freshwater Res. 46, 219–241.
Sergeant, C.J., Nagorski, S., 2015. The implications of monitoring frequency for describing riverine water quality regimes. River Res. Appl. 31, 602–610.
Shih, G., Abtew, W., Obeysekera, J., 1994. Accuracy of nutrient runoff load calculations using time-composite sampling. Trans. ASAE 37, 419–429.
Shih, Y.T., Lee, T.Y., Huang, J.C., Kao, S.J., Chang, F.K., 2016. Apportioning riverine DIN load to export coefficients of land uses in an urbanized watershed. Sci. Total Environ. 560–561, 1–11.
Snelder, T.H., McDowell, R.W., Fraser, C.E., 2017. Estimation of catchment nutrient loads in New Zealand using monthly water quality monitoring data. J. Am. Water Resour. Assoc. 53, 158–178.
Stelzer, R.S., Likens, G.E., 2006. Effects of sampling frequency on estimates of dissolved silica export by streams: the role of hydrological variability and concentration-discharge relationships. Water Resour. Res. 42.
Stenback, G.A., Crumpton, W.G., Schilling, K.E., Helmers, M.J., 2011. Rating curve estimation of nutrient loads in Iowa rivers. J. Hydrol. 396, 158–169.
Tiemeyer, B., Kahle, P., Lennartz, B., 2010. Designing monitoring programs for artificially drained catchments. Vadose Zone J. 9, 14–24.
Tonderski, A., Grimvall, A., Dojlido, J.R., Dijk, G.M.V., 1995. Monitoring nutrient transport in large rivers. Environ. Monit. Assess. 34, 245–269.
Toor, G.S., Harmel, R.D., Haggard, B.E., Schmidt, G., 2008. Evaluation of regression methodology with low-frequency water quality sampling to estimate constituent loads for ephemeral watersheds in Texas. J. Environ. Qual. 37, 1847–1854.
Ullrich, A., Volk, M., 2010. Influence of different nitrate-N monitoring strategies on load estimation as a base for model calibration and evaluation. Environ. Monit. Assess. 171, 513–527.
Vörösmarty, C.J., Meybeck, M., 2004. Responses of Continental Aquatic Systems at the Global Scale: New Paradigms, New Methods. Vegetation, Water, Humans and the Climate, 375–413.
Vanni, M.J., Renwick, W.H., Headworth, J.L., Auch, J.D., Schaus, M.H., 2001. Dissolved and particulate nutrient flux from three adjacent agricultural watersheds: a five-year study. Biogeochemistry 54, 85–114.
Walling, D.E., Webb, B.W., 1981. The reliability of suspended sediment load data. In: Erosion and Sediment Transport Measurement. IAHS Press, Wallingford, pp. 177–194 No 133.
Walling, D.E., Webb, B.W., 1985. Estimating the discharge of contaminants to coastal waters by rivers: some cautionary comments. Mar. Pollut. Bull. 16, 488–492.
Walling, D.E., Webb, B.W., 1996. Erosion and sediment yield: a global overview. IAHS Publications-Series of Proceedings and Reports-Intern Assoc Hydrological Sciences 236, 3–20.
Wang, R., Yuan, Y., Yen, H., Grieneisen, M., Arnold, J., Wang, D., Wang, C., Zhang, M., 2019. A review of pesticide fate and transport simulation at watershed level using SWAT: Current status and research concerns. Sci. Total Environ. 669, 512–526.
Webb, B.W., Phillips, J.M., Walling, D.E., Littlewood, I.G., Watts, C.D., Leeks, G.J.L., 1997. Load estimation methodologies for British rivers and their relevance to the LOIS RACS(R) programme. Sci. Total Environ. 194–195, 379–389.
Williams, M.R., King, K.W., Macrae, M.L., Ford, W., Esbroeck, C.V., Brunke, R.I., English, M.C., Schiff, S.L., 2015. Uncertainty in nutrient loads from tile-drained landscapes: effect of sampling frequency, calculation algorithm, and compositing strategy. J. Hydrol. 530, 306–316.
Yanai, R.D., Tokuchi, N., Campbell, J.L., Green, M.B., Matsuzaki, E., Laseter, S.N., Brown, C.L., Bailey, A.S., Lyons, P., Levine, C.R., Buso, D.C., Likens, G.E., Knoepp, J.D., Keitaro, F., 2015. Sources of uncertainty in estimating stream solute export from headwater catchments at three sites. Hydrol. Process. 29, 1793–1805.
Yen, H., Hoque, Y.M., Wang, X., Harmel, R.D., 2016. Applications of explicitly incorporated/post-processing measurement uncertainty in watershed modeling. JAWRA J. Am. Water Resour. Assoc. 52, 523–540.
Yen, H., Wang, X., Fontane, D.G., Harmel, R.D., Arabi, M., 2014. A framework for propagation of uncertainty contributed by parameterization, input data, model structure, and calibration/validation data in watershed modeling. Environ. Modell. Softw. 54, 211–221.
Yuan, Y., Wang, R., Cooter, E., Ran, L., Daggupati, P., Yang, D., Srinivasan, R., Jalowska, A., 2018. Integrating multimedia models to assess nitrogen losses from the Mississippi River basin to the Gulf of Mexico. Biogeosciences 15 (23), 7059–7076.
Zamyadi, A., Gallichand, J., Duchemin, M., 2007. Comparison of methods for estimating sediment and nitrogen loads from a small agricultural watershed. Can. Biosyst. Eng. 49, 1.27-21.36.

frequency sampling regimes: the importance of considering multiple solutes and seasonal fluxes in the design of long-term stream monitoring networks. Hydrol. Process. 30, 1521–1535.
King, K.W., Harmel, R.D., 2003. Considerations in selecting a water quality sampling strategy. Trans. ASAE 46, 63–73.
Kotlash, A.R., Chessman, B.C., 1998. Effects of water sample preservation and storage on nitrogen and phosphorus determinations: implications for the use of automated sampling equipment. Water Res. 32, 3731–3737.
Kovács, J., Korponai, J., Kovács, I.S., Hatvani, I.G., 2012. Introducing sampling frequency estimation using variograms in water research with the example of nutrient loads in the Kis-Balaton Water Protection System (W Hungary). Ecol. Eng. 42, 237–243.
Kronvang, B., Bruhn, A.J., 1996. Choice of sampling strategy and estimation method for calculating nitrogen and phosphorus transport in small lowland streams. Hydrol. Process. 10, 1483–1501.
Lennartz, B., Tiemeyer, B., de Rooij, G., Doležal, F., 2010. Artificially drained catchments – from monitoring studies towards management approaches. Vadose Zone J. 9, 1–3.
Littlewood, I.G., 1992. Estimating Contaminant Loads in Rivers: A Review. Institute of Hydrology, Wallingford, U.K.
Littlewood, I.G., 1995. Hydrological regimes, sampling strategies, and assessment of errors in mass load estimates for United Kingdom rivers. Environ. Int. 21, 211–220.
Littlewood, I.G., Watts, C.D., Custance, J.M., 1998. Systematic application of United Kingdom river flow and quality databases for estimating annual river mass loads (1975–1994). Sci. Total Environ. 210, 21–40.
Lloyd, C.E., Freer, J.E., Johnes, P.J., Collins, A.L., 2016. Using hysteresis analysis of high-resolution water quality monitoring data, including uncertainty, to infer controls on nutrient and sediment transfer in catchments. Sci. Total Environ. 543, 388–404.
Martin, G.R., Smoot, J.L., White, K.D., 1992. A comparison of surface-grab and cross sectionally integrated stream-water-quality sampling methods. Water Environ. Res. 64, 866–876.
Moatar, F., Meybeck, M., 2005. Compared performances of different algorithms for estimating annual nutrient loads discharged by the eutrophic River Loire. Hydrol. Process. 19, 429–444.
Moatar, F., Meybeck, M., 2007. Riverine fluxes of pollutants: towards predictions of uncertainties by flux duration indicators. C.R. Geosci. 339, 367–382.
Moatar, F., Meybeck, M., Raymond, S., Birgand, F., Curie, F., 2013. River flux uncertainties predicted by hydrological variability and riverine material behaviour. Hydrol. Process. 27, 3535–3546.
Moatar, F., Person, G., Meybeck, M., Coynel, A., Etcheber, H., Crouzet, P., 2006. The influence of contrasting suspended particulate matter transport regimes on the bias and precision of flux estimates. Sci. Total Environ. 370, 515–531.
National Environmental Protection Agency, 1996. Compilation of National Standards for Water Quality Analysis Methods. Standards Press of China, Beijing.
Phillips, J.M., Webb, B.W., Walling, D.E., Leeks, G.J.L., 1999. Estimating the suspended sediment loads of rivers in the LOIS study area using infrequent samples. Hydrol. Process. 13, 1035–1050.
Preston, S.D., Bierman Jr, V.J., Silliman, S.E., 1989. An evaluation of methods for the estimation of tributary mass loads. Water Resour. Res. 25, 1379–1389.
Quilbé, R., Rousseau, A.N., Duchemin, M., Poulin, A., Gangbazo, G., Villeneuve, J.P., 2006. Selecting a calculation method to estimate sediment and nutrient loads in streams: application to the Beaurivage River (Québec, Canada). J. Hydrol. 326, 295–310.
Rekolainen, S., Posch, M., Kämäri, J., Ekholm, P., 1991. Evaluation of the accuracy and precision of annual phosphorus load estimates from two agricultural basins in Finland. J. Hydrol. 128, 237–255.
Renwick, W.H., Vanni, M.J., Zhang, Q., Patton, J., 2008. Water quality trends and changing agricultural practices in a midwest U.S. watershed, 1994–2006. J. Environ. Qual. 37, 1862–1874.
Reynolds, K.N., Loecke, T.D., Burgin, A.J., Davis, C.A., Riverosiregui, D., Thomas, S.A., Clair, M.A.S., Ward, A.S., 2016. Optimizing sampling strategies for riverine nitrate using high-frequency data in agricultural watersheds. Environ. Sci. Technol. 50, 6406–6414.
Richards, R.P., Baker, D.B., 2002. Trends in water quality in LEASEQ rivers and streams (northwestern Ohio), 1975–1995. Lake Erie Agricultural Systems for Environmental Quality. J. Environ. Qual. 31, 90–96.
Richards, R.P., Baker, D.B., Crumrine, J.P., Kramer, J.W., Ewing, D.E., Merryfield, B.J., 2008. Thirty-year trends in suspended sediment in seven Lake Erie tributaries. J. Environ. Qual. 37, 1894–1908.
Richards, R.P., Holloway, J., 1987. Monte Carlo studies of sampling strategies for estimating tributary loads. Water Resour. Res. 23, 1939–1948.
Robertson, D.M., Roerish, E.D., 1999. Influence of various water quality sampling strategies on load estimates for small streams. Water Resour. Res. 35, 3747–3759.
Rode, M., Suhr, U., 2007. Uncertainties in selected river water quality data. Hydrol. Earth Syst. Sci. Discuss. 11, 863–874.
