Assessment of SWE data assimilation for ensemble streamflow predictions

Kristie J. Franz a,*, Terri S. Hogue b,c, Muhammad Barik c, Minxue He d

a Department of Geological and Atmospheric Sciences, Iowa State University, Ames, IA 50011, USA
b Civil and Environmental Engineering, Colorado School of Mines, Golden, CO, USA
c Civil and Environmental Engineering, University of California, Los Angeles, CA, USA
d Hydrology Branch, California Department of Water Resources, Sacramento, CA, USA

Article history: Available online xxxx

Keywords: Ensemble streamflow prediction; Data assimilation; Snow water equivalent; Verification

Summary

An assessment of data assimilation (DA) for Ensemble Streamflow Prediction (ESP) is undertaken using seasonal water supply hindcasting in the North Fork of the American River Basin (NFARB) and the National Weather Service (NWS) hydrologic forecast models. Two parameter sets, one from the California Nevada River Forecast Center (RFC) and one from the Differential Evolution Adaptive Metropolis (DREAM) algorithm, are tested. For each parameter set, hindcasts are generated using initial conditions derived with and without the inclusion of a DA scheme that integrates snow water equivalent (SWE) observations. The DREAM-DA scenario uses an Integrated unCertainty and Ensemble-based data Assimilation (ICEA) framework that also considers model and parameter uncertainty. Hindcasts are evaluated using deterministic and probabilistic forecast verification metrics. In general, the impact of DA on the skill of the seasonal water supply predictions is mixed. For deterministic (ensemble mean) predictions, the Percent Bias (PBias) improves with integration of the DA: DREAM-DA and RFC-DA have the lowest biases, and RFC-DA has the lowest Root Mean Squared Error (RMSE). However, the RFC and DREAM-DA have similar RMSE scores. For the probabilistic predictions, the RFC and DREAM have the highest Continuous Ranked Probability Skill Scores (CRPSS), and the RFC has the best discrimination for low flows. Reliability results are similar between the non-DA and DA tests, and the DREAM and DREAM-DA have better reliability than the RFC and RFC-DA for forecast dates February 1 and later. Despite producing improved streamflow simulations in previous studies, the hindcast analysis suggests that the DA method tested may not result in obvious improvements in streamflow forecasts. We advocate that integration of hindcasting and probabilistic metrics, as in this study, provides more rigorous insight into model performance for forecasting applications.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

Properly representing the various sources of uncertainty in the forecast process is a key concern of modern hydrologic forecasting (Welles et al., 2007; Demeritt et al., 2013); this includes uncertainty inherent in model parameters and model states (Beven and Binley, 1992; Beven, 1993; Gupta et al., 1998; Kuczera and Parent, 1998; Moradkhani et al., 2005; Liu and Gupta, 2007; Brigode et al., 2013; Demirel et al., 2013), in the observational data used to calibrate and drive models (Kavetski et al., 2002; Vrugt et al., 2008b; Renard et al., 2010; McMillan et al., 2011; Montanari and Di Baldassarre, 2012; Li et al., 2012), and in model structure (Gupta et al., 1998; Wagener and Gupta, 2005; Ajami et al., 2006; Duan et al., 2007; Franz et al., 2010; Renard et al., 2010; Najafi et al., 2011).

* Corresponding author. Tel.: +1 515 294 7454. E-mail address: [email protected] (K.J. Franz).

The National Weather Service (NWS) Ensemble Streamflow Prediction (ESP) method considers uncertainty in future climate by using multiple historical precipitation and temperature time series as input scenarios to hydrologic forecast models (Day, 1985; Franz et al., 2003; Bradley et al., 2004; Jeong and Kim, 2005; McEnery et al., 2005). Each scenario is conditioned on the state of the watershed at the start of the forecast by initializing each model run (or ensemble member) from the same initial model states. Wood and Schaake (2008) note that not considering uncertainty in initial states results in overconfident forecasts. Ensemble data assimilation (DA) has the potential to refine ESP by incorporating the uncertainty associated with initial basin conditions (Liu and Gupta, 2007; DeChant and Moradkhani, 2011). In snow dominated areas, streamflow predictions are significantly impacted by the initial snow water equivalent (SWE) states (Franz et al., 2003, 2008b; Clark and Hay, 2004).





Several studies have tested updating hydrologic forecast models with SWE observations (Slater and Clark, 2006; Leisenring and Moradkhani, 2011; He et al., 2012); however, most have investigated the impact of the DA framework on historical simulations only. The few studies that have evaluated hydrologic forecasts focus on assimilation of streamflow data (Seo et al., 2009; Li et al., 2011; McMillan et al., 2013). While forecast improvement was observed as a result of the integration of streamflow data in these studies, the impact of SWE assimilation on hydrologic forecasts remains unclear (He et al., 2012).

The objective of the current study is to assess the impact of assimilating SWE information into the NWS hydrologic forecast models for water supply ESP using hindcasting and forecast verification techniques (Franz et al., 2003; Bradley et al., 2004; Renner et al., 2009; Franz and Hogue, 2011). Hindcasting is the process of retroactively applying forecast models and methods to produce large samples of retrospective forecasts that can be used to verify the forecasting approach (Demargne et al., 2009). The quality of the hindcasts gives insight into how the method may perform for real-time forecasting. A framework for evaluating hydrologic modeling methods for operational applications using ESP hindcasting was set forth by Franz et al. (2008b). Hindcasting can be accomplished using the data and tools that are commonly utilized for hydrologic model evaluation and has been undertaken in various studies (Hamlet and Lettenmaier, 1999; Wood et al., 2002; Westrick et al., 2002; Bradley et al., 2004; Franz et al., 2003, 2008b). Our study applies the NWS operational streamflow forecast system to the North Fork of the American River Basin (NFARB) in California and explores established parameter estimation and data assimilation techniques through hindcasting.

2. Methods

2.1. Study basin and data

The study site is the North Fork of the American River Basin (NFARB) (Fig. 1), located on the western side of the Sierra Nevada Mountains in northern California. The NFARB has an area of 868 km2, with elevation ranging from 215 m to 2735 m. The basin receives 1514 mm of precipitation in the long-term annual average, with much of this falling as snow throughout the winter season. There is no significant regulation or land use/land cover change in the basin (Shamir and Georgakakos, 2007). Following the NWS forecasting framework, the basin is delineated into upper and lower sub-basins at 1515 m, which divides the basin into snow dominated and non-snow dominated regions (He et al., 2012; Franz and Karsten, 2013).

Spatially averaged precipitation and temperature time series at 6-h intervals were obtained from the California Nevada River Forecast Center (CNRFC) for both the upper and lower sub-basins. Climatological values of mid-month, basin average potential evapotranspiration (PET) for both sub-basins were also obtained from the CNRFC. Daily values were linearly interpolated from the mid-month values and evenly divided across four timesteps to get 6-h values. Daily discharge data at the basin outlet were obtained from the archived USGS gage #11427000 record. SWE observations were obtained from three snow stations in or close to the study basin (Fig. 1). CSS Lab (CL) is a SNOw TELemetry (SNOTEL) station operated by the Natural Resources Conservation Service (NRCS). Blue Canyon (BL) and Huysink (HY) are snow stations maintained by the California Data Exchange Center (CDEC).

2.2. Models

The NWS forecast models applied in the NFARB are the SNOW17 snow accumulation and ablation model (Anderson, 1973) and the Sacramento Soil Moisture Accounting (SAC-SMA) rainfall-runoff model (Burnash et al., 1973). The SNOW17 uses a temperature index approach in which air temperature controls snow accumulation, snow-atmosphere energy exchange, and melt processes in the model. The model has 11 parameters and an areal depletion curve, and requires air temperature and precipitation as inputs. Output from the SNOW17 is a rain-plus-snowmelt time series, which is input into the SAC-SMA (NWS, 2004).

The SAC-SMA (Burnash et al., 1973) uses a conceptual approach to distribute the rain-plus-melt input through a series of soil-moisture stores and to overland flow. The model is divided into a faster, upper soil zone and a slower, lower soil zone. Each zone consists of free water and tension water storages. Free water storages represent the water that can be drained by gravitational forces. Tension water storages represent water that is held by suction inside the soil pores and can only be removed by evaporation and transpiration. The SAC-SMA has 16 parameters and requires potential evapotranspiration (PET) as input in addition to the rain-plus-melt input. Output from the model is channel inflow.

Channel inflow from the upper and lower sub-basins of the NFARB is computed separately using the forecast models. Total channel inflow is a weighted average of the channel inflow from the upper sub-basin (37%) and the lower sub-basin (63%). The CNRFC unit hydrograph for the NFARB is applied to route the total channel inflow to the outlet to compute total watershed discharge.
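To make this aggregation and routing step concrete, a minimal Python sketch is given below. The unit hydrograph ordinates shown are placeholders, since the actual CNRFC ordinates for the NFARB are not reported here; only the sub-basin weights come from the text.

```python
import numpy as np

def route_to_outlet(upper_inflow, lower_inflow, uh_ordinates,
                    w_upper=0.37, w_lower=0.63):
    """Weight the sub-basin channel inflows and route the total to the
    outlet by discrete convolution with unit hydrograph ordinates."""
    total_inflow = (w_upper * np.asarray(upper_inflow)
                    + w_lower * np.asarray(lower_inflow))
    # np.convolve spreads each inflow pulse over the UH time base;
    # truncate the result to the simulation length.
    return np.convolve(total_inflow, uh_ordinates)[:total_inflow.size]

# Placeholder UH (fraction of a pulse released per 6-h step, summing to 1)
uh = np.array([0.10, 0.30, 0.35, 0.15, 0.07, 0.03])
```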

2.3. Model calibration

The SAC-SMA and SNOW17 parameters for the upper and lower sub-basins were obtained from the CNRFC. NWS hydrologists commonly use manual parameter estimation based on regional experience and model expertise. A second set of parameters is developed using the Differential Evolution Adaptive Metropolis (DREAM) automatic calibration algorithm (Vrugt et al., 2008a), which estimates the optimal parameter values and the associated uncertainty. DREAM (Vrugt et al., 2008a) is designed to handle complex and multidimensional sampling problems using a Markov Chain Monte Carlo (MCMC) sampler, which efficiently samples from the high probability region of the parameter space with the aid of a likelihood function. In DREAM, a global search is undertaken by generating multiple Markov chains from a predefined parameter distribution and running them in parallel until convergence. For more details, the reader is referred to Vrugt et al. (2008a,b) and He et al. (2011b, 2012).

The DREAM calibration is conducted using the 1980–1986 period and a least squares likelihood function against streamflow observations. A total of 10,000 iterations of DREAM are defined. The 1980–1986 period covers a range of years with contrasting wetness and is representative of the whole study period. Our previous study showed that a seven-year calibration period generally yields reasonable estimates of model parameters (He et al., 2012). Four SNOW17 parameters and 12 SAC-SMA parameters are calibrated based on previous studies (Hogue et al., 2000, 2006; Franz et al., 2008a; He et al., 2011, 2012; Franz and Karsten, 2013). The optimization is performed in two steps. The 16 SNOW17/SAC-SMA parameters are first optimized for the upper sub-basin, holding the lower sub-basin parameters at the pre-established CNRFC values. Next, the optimized parameters from the first step are fixed for the upper sub-basin and the lower sub-basin parameters are optimized.
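For intuition, the sketch below implements a single-chain random-walk Metropolis sampler with the least-squares likelihood described above. DREAM itself runs multiple chains with differential-evolution proposals and is considerably more efficient; this is only a minimal stand-in for the MCMC idea. The `simulate` argument is a hypothetical placeholder for a coupled SNOW17/SAC-SMA run.

```python
import numpy as np

rng = np.random.default_rng(0)

def metropolis(simulate, q_obs, p0, step, n_iter=10_000):
    """Random-walk Metropolis with a Gaussian (least-squares) likelihood.
    `simulate(params)` must return a discharge series matching q_obs."""
    def log_like(p):
        return -0.5 * np.sum((simulate(p) - q_obs) ** 2)

    p = np.asarray(p0, float)
    logp = log_like(p)
    chain = np.empty((n_iter, p.size))
    for i in range(n_iter):
        cand = p + step * rng.standard_normal(p.size)   # propose a move
        logp_c = log_like(cand)
        if np.log(rng.uniform()) < logp_c - logp:       # Metropolis accept
            p, logp = cand, logp_c
        chain[i] = p
    return chain  # the samples characterize parameter uncertainty
```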

2.4. Data assimilation

The Ensemble Kalman Filter (EnKF) (Evensen, 1994) uses a Monte Carlo approach to determine the nonlinear propagation of ensemble model states between observations and to update these states at observation times.




Fig. 1. North Fork of the American River Basin (NFARB) located on the western side of the Sierra Nevada Mountains in northern California. The basin is delineated into upper and lower sub-basins as indicated by the "Snowline".

Updating is done at given intervals using the prior distribution of ensemble states from the previous update step. The state updating process can be expressed with the following equation:

$$Z_t^u = Z_t^p + K\left(A_t - g Z_t^p\right) \qquad (1)$$

where $Z^u$ and $Z^p$ represent the vectors of state estimates after and before the update, respectively; t is the time step at which the updating is done; $A_t$ is the set of observations perturbed with measurement error; g is the operator that maps the model states to the observations; and K is the Kalman gain estimated from the error covariance.

Defining error characteristics for forcing data and observations is a critical component of any hydrologic data assimilation study. It is common practice to treat these variables (i.e., precipitation and air temperature observations) as random variables with predefined error distributions and characteristics (e.g., Margulis et al., 2002; Moradkhani et al., 2005; Clark et al., 2008; Leisenring and Moradkhani, 2011; He et al., 2012). The associated uncertainty is then considered by perturbing these predefined distributions. However, defining error distributions and characteristics is often subjective and mostly based on sensitivity analysis (Liu et al., 2012). Most recently, Leisenring and Moradkhani (2012) proposed a procedure to dynamically adjust error variance based on model prediction error from previous time steps. This procedure was further expanded and implemented to improve the reliability and applicability of particle filters for use in complex hydrologic models (Moradkhani et al., 2012).
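A minimal sketch of the analysis step in Eq. (1) is given below, assuming a single (scalar) weighted SWE observation and a linear observation operator g; the scalar Kalman gain is computed from the ensemble covariances. Variable names follow the equation; the function itself is an illustration, not the study's implementation.

```python
import numpy as np

def enkf_update(Z_prior, obs, obs_std, g, rng):
    """One EnKF analysis step (Eq. (1)) for a scalar SWE observation.

    Z_prior : (n_states, n_ens) prior state ensemble Z^p_t
    obs     : observed (weighted) SWE value
    obs_std : observation error std. dev. (25 mm in this study)
    g       : (n_states,) linear operator mapping states to the observation
    """
    n_ens = Z_prior.shape[1]
    A = obs + obs_std * rng.standard_normal(n_ens)       # perturbed obs A_t
    HZ = g @ Z_prior                                     # g Z^p_t per member
    Za = Z_prior - Z_prior.mean(axis=1, keepdims=True)   # state anomalies
    Ha = HZ - HZ.mean()                                  # predicted-obs anomalies
    # Scalar Kalman gain K = cov(Z, gZ) / (var(gZ) + R)
    K = (Za @ Ha) / (Ha @ Ha + (n_ens - 1) * obs_std**2)
    return Z_prior + np.outer(K, A - HZ)                 # updated ensemble Z^u_t
```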

The current study adopts an alternative, but simpler, method to consider uncertainty in precipitation. This method draws upon how precipitation is currently utilized within the SNOW17 model to determine snowfall input. Specifically, the SNOW17 uses the air temperature and the PXTEMP parameter to determine whether precipitation is snowfall or rainfall. The snowfall input to the model is then adjusted by a snow correction factor (SCF). The actual snowfall forcing for the SNOW17 is therefore determined by the precipitation and air temperature forcing, as well as the parameters SCF and PXTEMP. Hence, instead of perturbing the precipitation time series, we perturb the parameters SCF and PXTEMP and assume that the uncertainty identified for these two parameters implicitly represents the uncertainty in precipitation. Probability distributions for SCF and PXTEMP used in this study were determined through our previous work (He et al., 2012).

Following our previous studies (He, 2010; He et al., 2012), SWE observations and temperature data are assumed to be corrupted with additive, normally distributed white noise with standard deviations of 25 mm and 0.5 °C, respectively. The uncertainty structure of the SWE observations is thus represented by a normal distribution (Liu and Gupta, 2007) with zero mean (no systematic error) and a standard deviation of 25 mm. Sensitivity tests in multiple years with contrasting wetness indicate that the EnKF configured with these error definitions produces satisfactory ensemble SWE and streamflow estimates (He, 2010).
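The sampling this implies can be sketched as follows. The normal distributions used here for SCF and PXTEMP are placeholders: in the study these draws come from the posteriors identified by DREAM (He et al., 2012), not from fixed normals.

```python
import numpy as np

rng = np.random.default_rng(1)
N_ENS = 100  # replicate count used for the DA tests in this study

# Placeholder parameter distributions; the study draws SCF and PXTEMP
# from DREAM-derived posteriors rather than these assumed normals.
scf_draws    = rng.normal(loc=1.1, scale=0.1, size=N_ENS)
pxtemp_draws = rng.normal(loc=1.0, scale=0.5, size=N_ENS)

def perturb_obs(temp_6h, swe_obs):
    """Additive white noise per the stated error model:
    temperature sigma = 0.5 degC, SWE sigma = 25 mm."""
    temp_ens = temp_6h + 0.5 * rng.standard_normal((N_ENS, temp_6h.size))
    swe_ens = swe_obs + 25.0 * rng.standard_normal(N_ENS)
    return temp_ens, np.clip(swe_ens, 0.0, None)  # SWE cannot go negative
```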




He et al. (2012) combine the parameter uncertainty estimates from DREAM with the EnKF assimilation framework in a method coined the Integrated unCertainty and Ensemble-based data Assimilation (ICEA) to update SWE states in the SNOW17. ICEA is a stepwise process which starts with a Generalized Sensitivity Analysis (GSA) (Hornberger and Spear, 1981) to identify behavioral model parameters. DREAM is then applied to determine the uncertainty structure of the sensitive parameters identified in the GSA. Next, data assimilation with the EnKF is conducted using the uncertainty characteristics of the parameters and observed SWE data.

SWE states in the SNOW17 are updated using daily SWE observations available within the basin (Fig. 1). Updating is only performed for the upper sub-basin, where the three snow sites are present. Following He et al. (2012), SWE values for use in the EnKF are derived by applying a regression-based method to the daily SWE observations. The regression method computes a weighted average SWE from the SWE observations that is assumed to be representative of the modeled SWE. The weight for each snow site is calculated via a least-squares method:

$$\min \sum_{d=1}^{D}\left(\sum_{l=1}^{3} q_l W_l^d - W_f^d\right)^2, \quad \text{for all } W_l^d \ge 0 \qquad (2)$$

where $W_l^d$ is the SWE observation from the l-th snow station (here, l = 1, 2, 3) at the d-th timestep (d = 1, 2, ..., D), with D the total number of days in consideration; $W_f^d$ is the SNOW17 modeled SWE at the d-th timestep; and $q_l$ is the weight being solved for the l-th snow station. The calculated weights for Blue Canyon (BL), Huysink (HY) and CSS Lab (CL) are 0.31, 0.54 and 0.44, respectively (He et al., 2012).
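A sketch of solving for the weights with nonnegative least squares follows. Treating nonnegativity as a constraint on the fit is one reading of Eq. (2) (all reported weights are indeed positive); the station order and array shapes are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import nnls

def station_weights(W_obs, W_model):
    """Solve Eq. (2) for weights q_l, constrained nonnegative.

    W_obs   : (D, 3) daily SWE observations at BL, HY and CL
    W_model : (D,)   SNOW17 modeled SWE for the upper sub-basin
    """
    q, _residual = nnls(np.asarray(W_obs, float), np.asarray(W_model, float))
    return q  # He et al. (2012) report 0.31, 0.54 and 0.44

# The weighted basin SWE fed to the EnKF is then W_obs @ q.
```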

2.5. Hindcasting

Using the methods described above, four hindcasting tests are formulated:

(1) "RFC" – CNRFC parameters.
(2) "DREAM" – DREAM parameters using the maximum likelihood parameter set.
(3) "RFC-DA" – CNRFC parameters with the EnKF.
(4) "DREAM-DA" – the full ICEA framework.

In test 4, the assimilation takes the uncertainty of sensitive parameters into account by perturbing the parameter space using DREAM (He et al., 2012).

Water supply (total discharge) ESP hindcasts with a forecast window of April 1 to July 31 were developed for each test following Franz et al. (2003, 2008b). The first step in the hindcast process is to generate model states (i.e., initial conditions) for the forecast dates of interest by running the models for the entire study period. For tests 1 and 2, a simple historical simulation is conducted and states are saved during the model run. For tests 3 and 4, the EnKF with SWE DA is included in the historical simulation and, likewise, states are saved during the model run. In this study, initial conditions are saved for seven typical water supply forecast dates (January 1, January 15, February 1, February 15, March 1, March 15 and April 1) for both the upper and lower sub-basins.

Using the saved states, hindcasts are then generated for each of the forecast dates. To generate a hindcast for one forecast date, the models are initialized with the associated saved states and then forced with historical temperature and precipitation data for each year in the historical record. Only data that span the forecast period, which extends from the forecast date to July 31, are used from the historical forcing. Each year of historical data creates an individual streamflow scenario for the ensemble forecast. SWE data from 1982 to 2007 are available to run the DA, allowing 26 years of hindcasts to be generated. Fifty-eight years of temperature and precipitation data are available from the CNRFC for the period 1950 to 2007, resulting in 58-member ensembles for the non-DA hindcasts. For the tests that include DA, 100 replicate model states were generated by the EnKF during the updating and saved for each forecast date. Hindcasts are then generated for each set of saved states in the same manner described above, resulting in 5800-member ensembles.
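The hindcast generation loop can be summarized in a short Python sketch; `run_models` is a hypothetical stand-in that runs SNOW17/SAC-SMA from a saved state over the forecast window and returns the April–July water supply volume.

```python
import numpy as np

def esp_hindcast(saved_states, forcing_by_year, run_models):
    """Build one ESP ensemble for a single forecast date.

    saved_states    : list of model-state sets saved on the forecast date
                      (one set for non-DA tests; 100 EnKF replicates for DA)
    forcing_by_year : {year: (precip, temp)} restricted to the forecast window
    run_models      : callable(states, precip, temp) -> seasonal volume
    """
    members = [run_models(states, precip, temp)
               for states in saved_states
               for year, (precip, temp) in sorted(forcing_by_year.items())]
    return np.asarray(members)  # 58 members (non-DA) or 5800 (DA)
```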

2.6. Hindcast verification

For a given forecast date, i, the ensemble forecast consists of a set of seasonal water supply values {x(1), x(2), x(3), ..., x(z)}, sorted in ascending order, from an ensemble of size z. Each ensemble member is treated as a discrete variable and a uniform distribution is assumed. Two types of forecasts are evaluated: the ensemble mean forecast and the full ensemble forecast. Standard deterministic metrics (Franz et al., 2003; Bradley et al., 2004; Wilks, 2006; Franz and Hogue, 2011) used to evaluate the ensemble mean forecasts are the Root Mean Squared Error (RMSE), Percent Bias (PBias), and correlation (R):

$$\mathrm{RMSE} = \sqrt{\frac{1}{T}\sum_{i=1}^{T}\left(m_i - o_i\right)^2} \qquad (3)$$

$$\mathrm{PBias} = \frac{\sum_{i=1}^{T}\left(m_i - o_i\right)}{\sum_{i=1}^{T} o_i} \times 100\% \qquad (4)$$

$$R = \frac{\sum_{i=1}^{T}\left(m_i - \bar{m}\right)\left(o_i - \bar{o}\right)}{\sqrt{\sum_{i=1}^{T}\left(m_i - \bar{m}\right)^2 \sum_{i=1}^{T}\left(o_i - \bar{o}\right)^2}} \qquad (5)$$

where $m_i$ is the ensemble mean, $o_i$ is the observation that occurred, and T is the total number of forecasts.

Probabilistic forecast verification metrics used to evaluate the ensemble forecasts are the Containing Ratio (CR), the Continuous Ranked Probability Skill Score (CRPSS), the calibration–refinement factorization (reliability), and the likelihood-base rate factorization (discrimination). The CR (Xiong and O'Connor, 2008) is a measure of overall accuracy and describes how often the observation falls within the ensemble bounds:

$$\mathrm{CR} = \frac{1}{T}\sum_{i=1}^{T} I_i \qquad (6)$$

where

$$I_i = \begin{cases} 1, & x_{(1),i} \le o_i \le x_{(z),i} \\ 0, & \text{otherwise} \end{cases} \qquad (7)$$
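A direct implementation of Eqs. (6) and (7) (a sketch, assuming one row of z members per forecast):

```python
import numpy as np

def containing_ratio(ensembles, obs):
    """Eqs. (6)-(7): fraction of forecasts whose ensemble range
    [x_(1),i, x_(z),i] contains the observation o_i.

    ensembles : (T, z) ensemble members per forecast
    obs       : (T,) observed seasonal water supply
    """
    lo, hi = ensembles.min(axis=1), ensembles.max(axis=1)
    return np.mean((lo <= obs) & (obs <= hi))
```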

The continuous ranked probability score (CRPS; Hersbach, 2000; Wilks, 2006) is the squared difference between the cumulative distribution function for the ith ensemble forecast, $P_i(x)$, and that of the observation, $P_{o_i}(x)$, averaged over a set of forecasts:

$$\mathrm{CRPS} = \frac{1}{T}\sum_{i=1}^{T}\int_{-\infty}^{\infty}\left[P_i(x) - P_{o_i}(x)\right]^2 \mathrm{d}x \qquad (8)$$

where

$$P_{o_i}(x) = \begin{cases} 0, & x < \text{observed} \\ 1, & x \ge \text{observed} \end{cases} \qquad (9)$$

is a cumulative probability step function that jumps from 0 to 1 at the point where the forecast variable x equals the observation (Hersbach, 2000; Wilks, 2006). Similar to the ranked probability score, the CRPS rewards forecasts that concentrate probability near the observed value, and smaller values are better; unlike the ranked probability score, it does not require specification of forecast categories (Wilks, 2006). The CRPS is difficult to interpret from its value alone; therefore, the CRPS Skill Score (CRPSS) is normally applied:



$$\mathrm{CRPSS} = 1 - \frac{\mathrm{CRPS}_f}{\mathrm{CRPS}_{cl}} \qquad (10)$$

where $\mathrm{CRPS}_f$ is the average CRPS of the ensemble forecasts and $\mathrm{CRPS}_{cl}$ is the average CRPS of climatological forecasts generated from the historical observations.
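For an empirical ensemble, the integral in Eq. (8) reduces to the well-known identity CRPS = E|X − o| − ½ E|X − X′|, which gives a compact implementation (a sketch; inputs are per-forecast member arrays):

```python
import numpy as np

def crps_ensemble(members, obs):
    """CRPS (Eq. (8)) of one ensemble forecast against one observation,
    via the identity CRPS = E|X - o| - 0.5 E|X - X'|."""
    x = np.asarray(members, float)
    # The pairwise term is O(z^2); fine for moderate ensemble sizes.
    return (np.mean(np.abs(x - obs))
            - 0.5 * np.mean(np.abs(x[:, None] - x[None, :])))

def crpss(forecasts, climatology, obs):
    """Eq. (10): skill of the forecasts relative to climatological
    ensembles built from the historical observations."""
    crps_f = np.mean([crps_ensemble(f, o) for f, o in zip(forecasts, obs)])
    crps_cl = np.mean([crps_ensemble(c, o) for c, o in zip(climatology, obs)])
    return 1.0 - crps_f / crps_cl
```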


To evaluate reliability and discrimination, five forecast categories are defined: the ≤20th percentile, ≤40th percentile, ≤60th percentile, ≤80th percentile, and >80th percentile of the historical observations (climatology). The frequency of the ensemble members in these categories determines the forecast probability, f. The forecast probability is rounded up to the nearest tenth probability interval; therefore, ten probability bins are used (0–10%, >10–20%, etc.). The observed probability, y, is equal to 1 for the forecast category in which the observation occurred, and equal to 0 for the flow categories in which it did not occur.

Murphy and Winkler (1987) propose a verification framework based on factorization of the joint distribution of observations and forecasts into the calibration–refinement factorization, p(f, y) = p(y|f) p(f), and the likelihood-base rate factorization, p(f, y) = p(f|y) p(y). The conditional distribution p(y|f) describes how often an observation occurs given a particular forecast and is often plotted on a reliability diagram as a function of forecast probability (Murphy and Winkler, 1987, 1992; Franz et al., 2003; Bradley et al., 2004; Wilks, 2006; Franz and Hogue, 2011). A set of forecasts is perfectly reliable when the relative frequency of the observations equals the probability of prediction (i.e., p(y|f) = f), in which case it plots along the 1:1 line on the reliability diagram. Bias in the forecasts can be assessed by how they diverge from the 1:1 line: forecasts that over-predict fall to the right of the line, and forecasts that under-predict fall to the left. Bar plots are often shown with the reliability diagrams to display the marginal distribution p(f), the frequency of the forecasts.

The conditional distribution p(f|y) is plotted versus forecast probability using discrimination diagrams (Murphy and Winkler, 1987; Wilks, 2006). The discrimination diagrams can be used to determine how confidently the forecasts predicted the event that was observed versus an event that did not occur. A discrimination diagram is developed for a specific event or observation, and the probabilities assigned by the forecasts for all possible events are plotted. We evaluate the forecast discrimination for two possible events: seasonal flows occurring in the lowest 40% of the historical record and seasonal flows occurring in the upper 40% of the historical record. Discrimination is evaluated by observing the separation between the line that plots the forecast probability distribution for the lowest 40% of flows and the line that plots the forecast probability distribution for the highest 40% of flows. Ideally, the forecasts frequently give the highest probabilities to the event that occurred and minimal probability to the opposite event(s). If, for example, the forecasts have good discrimination for high flows, there will be significant separation between the two lines and high flows will often have been predicted with the highest probability values. When the forecast probability distribution for the event that occurred is equal to or less than that for the event(s) that did not occur, the forecasts are not discriminatory for that observation and have low skill for predicting that event (Murphy and Winkler, 1987).
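The binning behind the reliability diagrams can be sketched as follows. The sketch assumes each forecast has been reduced to the ensemble fraction f falling in a given flow category and an indicator y of whether the observation fell in that category, which is one reasonable reading of the procedure above.

```python
import numpy as np

def reliability_points(f, y, n_bins=10):
    """Points for a reliability diagram: observed relative frequency
    p(y|f) and forecast count per tenth-wide probability bin, with
    forecast probabilities rounded up as described above."""
    f, y = np.asarray(f, float), np.asarray(y, float)
    bins = np.ceil(f * n_bins).astype(int)   # 0, (0,0.1] -> 1, ..., 1.0 -> 10
    out = []
    for b in np.unique(bins):
        sel = bins == b
        out.append((b / n_bins, y[sel].mean(), int(sel.sum())))
    return out  # (bin upper probability, p(y|f), forecast frequency p(f))
```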

Fig. 2. The (a) Root Mean Squared Error (RMSE), (b) Percent Bias (PBias), and (c) correlation of the ensemble mean forecasts for forecast dates from January 1 to April 1.

Fig. 3. Continuous Ranked Probability Skill Score (CRPSS) for ensemble forecasts for forecast dates from January 1 to April 1.




3. Results

The ensemble means from the RFC-DA forecasts have the lowest RMSE values for all forecast dates (Fig. 2a). The DREAM-DA ensemble means have lower RMSE values than DREAM, but the DREAM-DA and RFC ensemble mean forecasts have similar RMSE values for most forecast dates. Both the DREAM-DA and RFC-DA ensemble means have lower PBias than the non-DA forecasts, indicating that the DA reduced the average bias in the ensemble mean forecasts (Fig. 2b), which is consistent with our observations in a previous study (He et al., 2012). The ensemble means underpredict the seasonal discharge for all tests and forecast dates. Statistics from deterministic runs in a previous study showed negative bias in model simulations for the same basin (He et al., 2012), indicating that to some degree the standard model analysis was a good indicator of forecast performance for the ensemble mean. Correlation is highest for the RFC ensemble means, with the RFC-DA having the second highest correlations (Fig. 2c). The DREAM ensemble means have higher correlation than the DREAM-DA ensemble means for all forecast dates.

RMSE and correlation continually improve throughout the forecast season, with the exception of March 1 (Fig. 2). PBias values remain relatively constant throughout the season. We note that the ensemble median forecasts (not shown) produce similar results to the mean forecasts and could also be evaluated as a sample deterministic forecast.

With the exception of the RFC-DA January 1 forecast, the CRPSS values are positive for all hindcasting tests, indicating that the probabilistic ensemble forecasts perform better than climatology (Fig. 3). The ensembles without DA (RFC and DREAM) have the highest CRPSS scores, and the DREAM-DA has higher CRPSS values than the RFC-DA.

Fig. 4. CDFs for forecasts issued January 1 to April 1 for the 1984 forecast season.



Fig. 5. The containing ratio for ensemble forecasts for forecast dates from January 1 to April 1.

The poorer performance of the DREAM-DA and RFC-DA is due to the tendency of the DA to shift the forecast CDF away from the observed discharge. The probability mass of the DA forecasts then "misses" the observation, leading to poorer CRPSS values. In the example illustrated, the forecast probability is increasingly applied to higher discharge values by the RFC-DA (Fig. 4, left column) and DREAM-DA (Fig. 4, right column) as the forecast season progresses, moving the forecast probability away from the observation relative to the non-DA tests.

The CR results further show that inclusion of DA tends to cause the forecast probability to move away from the discharge observations as the forecast season progresses.


The DREAM-DA and RFC-DA CR values start out nearly equal to the RFC and DREAM values early in the forecast season, but drop below them by March 15 (Fig. 5). Based on visual inspection of all the CDF plots, the non-DA CDFs are a closer match to the observations than the DA CDFs in 15 of the 26 years, particularly later in the season. The DA results in a more accurate probability distribution in only six years, and five years show no noticeable difference in the CDFs. The range of the ensembles does not differ much between the DA and non-DA cases (Fig. 4). Note that the narrow nature of the ensembles relative to climatology leads to positive skill scores because more of the probability is centered near the observation relative to the climatology forecast, which always has a larger spread.

As with the RMSE, PBias, and correlation, the CRPSS increases throughout the season and drops for the March 1 forecast date (Fig. 3). This change in skill is due to the difficulty of predicting the discharge for the year 1991. In 1991, the melt occurred very near March 1, resulting in large errors and uncertainty in the initial model states and a decline in the average forecast skill for that forecast date.

For early season forecasts, the DREAM and DREAM-DA have good reliability for forecast probabilities less than 0.50 (Fig. 6b) and the RFC and RFC-DA have good reliability for forecast probabilities less than 0.60 (Fig. 6a). The DREAM and DREAM-DA tend to have better reliability than the RFC and RFC-DA for forecast dates February 1 and later. For all tests conducted, there are few forecasts with probability greater than 50% (Fig. 6, bar graphs), indicating that the forecasts display low confidence.

Fig. 6. Reliability diagrams for (first column) RFC and RFC-DA and (second column) DREAM and DREAM-DA for (a and b) January 1, (d and e) February 1, (g and h) March 1, and (j and k) April 1 forecast dates. The frequency of the forecast probability averaged over all the methods is shown in the bar graphs in the third column.




The low forecast frequency is partly why none of the tests have good reliability between the 0.50 and 1.00 probability levels, and why there can be little confidence in results in that range. There is a slight increase in the frequency of probabilities above 0.50 in the late season (Fig. 6).

Discrimination diagrams depicting the frequency of the forecast probability given to low flows (lowest 40% climatology flow category) versus high flows (upper 40% climatology flow category) when the observations are low flows are given in Fig. 7. Both the RFC (Fig. 7, first column) and RFC-DA (Fig. 7, second column) forecasts show some discrimination for low flows at each forecast date. Discrimination improves throughout the season, and by April 1 there is good separation between the lines depicting high and low flows. There is also an increase in the frequency of larger probability values for low flows (Fig. 7m and n). DREAM (Fig. 7c and g) and DREAM-DA (Fig. 7d and h) have poor discrimination for low flows early in the season, meaning they are unable to predict whether high flows or low flows are more likely. However, discrimination does improve later in the season (Fig. 7o and p). The RFC on average has the best discrimination, followed by the RFC-DA. The DREAM-DA has poor discrimination for low flows throughout the forecast season.

All of the tests produce hindcasts with poor discrimination for high flows (not shown). There is some improvement towards the end of the season, but discrimination for high flows remains less than for low flows. The DREAM-DA is the only test that performs better for discrimination of high flows than low flows, but only for the March 1 and April 1 forecast dates.

4. Concluding remarks

The current study is one of the few to evaluate snow data assimilation within a hindcasting framework and with a number of probabilistic verification metrics.

While the DA improves some aspects of the seasonal water supply predictions, the overall skill of the forecasts is not significantly improved. For deterministic (ensemble mean) predictions, the PBias is improved by the DA application (DREAM-DA and RFC-DA have the lowest biases), and the RFC-DA has the lowest RMSE. However, the RFC ensemble mean forecasts have lower RMSE scores than the DREAM-DA. Also, the RFC and DREAM probabilistic predictions have the highest CRPSS. Although the ICEA was shown to improve streamflow generation when applied to the coupled NWS forecasting models (He et al., 2011a,b, 2012), we find that ESP forecasts from that method (DREAM-DA) are less skillful than the non-DA forecasts for some measures. There is some indication that treatment of parametric uncertainty does improve both the deterministic and probabilistic forecasts, as the bias, CRPSS, and reliability (for low forecast probabilities) are better for the DREAM-DA than for the RFC-DA.

Similar to the findings here, Franz et al. (2003) found better discrimination for low flows than for high flows for hindcasts from sites in the Colorado River basin. In their study, there is no discrimination for low flows early in the season, but by the April 1 forecast date the discrimination for low flows is almost perfect for some locations. Franz et al. (2003) also found that sites with good discrimination for high flows were the wetter, cooler sites located in Colorado, and the sites with poor discrimination for high flows were the drier, warmer sites in Arizona (see Fig. 11 in Franz et al., 2003). Based on their findings, one would have expected better discrimination for high flows in the NFARB, which has a significant snowpack that persists late into the forecast season. A late melt tends to lead to improved forecasts later in the season as snowmelt dominates the hydrograph (Franz et al., 2003, 2008b).

Poor forecast performance for high flows could be related to the data used to drive the model; we did not analyze potential biases in the data obtained from the CNRFC. Further, the DA approaches rely on a small number of ground-based, spatially limited snow sites to estimate the snow distribution in the basin.

Fig. 7. Discrimination of forecasts when low flows are observed for (column 1) RFC, (column 2) RFC with DA, (column 3) DREAM, and (column 4) DREAM with DA for (a–d) January 1, (e–h) February 1, (i–l) March 1, and (m–p) April 1 forecast dates.



The regression method assumes that the weighted average SWE computed from the snow sites is representative of the modeled basin average SWE; however, these point-scale SWE observations may not be adequate to resolve the variability of SWE at the basin scale (Slater and Clark, 2006; Bales et al., 2006). Additional uncertainty regarding basin average SWE arises because updating in the lower basin is not considered, due to a lack of observations at lower elevations. Updating, and the subsequent streamflow forecasts, may be improved with access to better estimates of basin-wide SWE. Dual assimilation of streamflow and SWE may also help overcome the limitations associated with the snow observations and give a better estimate of basin-wide conditions.

Modifying the internal SWE states with real-world data may also mask model biases or model errors, leading to a degradation of the forecasts in the DA examples. A number of studies show that calibration to several variables can improve simulations of multiple watershed processes, but often leads to poorer simulations of some processes (e.g., discharge) (Udnæs et al., 2007; Parajka and Blöschl, 2008; Koren et al., 2008; Franz and Karsten, 2013). This suggests that model structural errors can lead to biases in internal watershed processes when the model is calibrated to a single variable. The SNOW17 and SAC-SMA model parameters are calibrated only to discharge in this study, and there has been no evaluation to determine whether internal model states (e.g., basin average SWE) correlate well with field observations.

We note that the ensemble spread does not vary much between the four tests and there is no improvement in the CR for the large DA ensembles. Therefore, an approach to refine the number of ensemble members retained in the DA applications is needed. A more comprehensive evaluation of the uncertainty estimates resulting from ICEA and other EnKF applications that includes probabilistic metrics (e.g., Franz and Hogue, 2011) would aid in determining the utility of various ensemble sizes. Only a small sample of 26 hindcasts was analyzed due to the limited SWE data record; therefore, uncertainty in the verification statistics may be quite large. Although we find no obvious overall improvement in the hindcasts when using the DA, future studies should include an assessment of this uncertainty to assure that apparent improvements are interpreted correctly.

As noted, the present study advances the analysis of a DA approach by evaluating it within a hindcasting framework, providing additional information on model performance in a forecast mode. The results show that inclusion of the DA using the three snow sites does not clearly or notably improve predictions of streamflow in the study basin, even though the method produced improved streamflow simulations in previous studies. Hence, when evaluating hydrologic modeling and forecasting methods, we advocate for an evaluation approach that includes assessing the ensemble of simulations and forecasts through hindcasting and probabilistic metrics.

Acknowledgements This research was supported by grants from the National Oceanic and Atmospheric Administration (NOAA) National Weather Service (NA07NWS4620013) and the National Aeronautics and Space Administration (NNX10AQ77G).

References

Ajami, N.K., Duan, Q., Gao, X., Sorooshian, S., 2006. Multimodel combination techniques for analysis of hydrological simulations: application to distributed model intercomparison project results. J. Hydrometeorol. 7 (4), 755–768. http://dx.doi.org/10.1175/JHM519.1.


Anderson, E.A., 1973. National Weather Service River Forecast System – Snow Accumulation and Ablation Model. NOAA Technical Memorandum NWS HYDRO-17. U.S. National Weather Service, Silver Spring, MD, pp. 217.
Bales, R.C., Molotch, N.P., Painter, T.H., Dettinger, M.D., Rice, R., Dozier, J., 2006. Mountain hydrology of the western United States. Water Resour. Res. 42, W08432. http://dx.doi.org/10.1029/2005WR004387.
Beven, K.J., 1993. Prophecy, reality and uncertainty in distributed hydrological modeling. Adv. Water Resour. 16, 41–51. http://dx.doi.org/10.1016/0309-1708(93)90028-E.
Beven, K.J., Binley, A., 1992. Future of distributed models: model calibration and uncertainty prediction. Hydrol. Process. 6, 279–298. http://dx.doi.org/10.1002/hyp.3360060305.
Bradley, A., Schwartz, S.S., Hashino, T., 2004. Distributions-oriented verification of ensemble streamflow predictions. J. Hydrometeorol. 5, 532–545. http://dx.doi.org/10.1175/1525-7541(2004)005<0532:DVOESP>2.0.CO;2.
Brigode, P., Oudin, L., Perrin, C., 2013. Hydrological model parameter instability: a source of additional uncertainty in estimating the hydrological impacts of climate change? J. Hydrol. 476, 410–425. http://dx.doi.org/10.1016/j.jhydrol.2012.11.012.
Burnash, R.J.C., Ferral, R.L., McGuire, R.A., 1973. A generalized streamflow simulation system: conceptual models for digital computers. US National Weather Service, NOAA, and the State of California Department of Water Resources Technical Report, Joint Federal–State River Forecast Center.
Clark, M.P., Hay, L.E., 2004. Use of medium-range numerical weather prediction model output to produce forecasts of streamflow. J. Hydrometeorol. 5, 15–32. http://dx.doi.org/10.1175/1525-7541(2004)005<0015:UOMNWP>2.0.CO;2.
Clark, M.P., Rupp, D.E., Woods, R.A., Zheng, X., Ibbitt, R.P., Slater, A.G., Schmidt, J., Uddstrom, M.J., 2008. Hydrological data assimilation with the ensemble Kalman filter: use of streamflow observations to update states in a distributed hydrological model. Adv. Water Resour. 31, 1309–1324. http://dx.doi.org/10.1016/j.advwatres.2008.06.005.
Day, G.N., 1985. Extended streamflow forecasting using NWSRFS. J. Water Res. Pl.-ASCE 111 (2), 157–170.
DeChant, C., Moradkhani, H., 2011. Improving the characterization of initial condition for ensemble streamflow prediction using data assimilation. Hydrol. Earth Syst. Sci. 15, 3399–3410. http://dx.doi.org/10.5194/hess-15-3399-2011.
Demargne, J., Mullusky, M., Werner, K., Adams, T., Lindsey, S., Schwein, N., Marosi, W., Welles, E., 2009. Application of forecast verification science to operational river forecasting in the U.S. National Weather Service. Bull. Am. Meteorol. Soc. 90 (6), 779–784. http://dx.doi.org/10.1175/2008BAMS2619.1.
Demeritt, D., Nobert, S., Cloke, H.L., Pappenberger, F., 2013. The European flood alert system and the communication, perception, and use of ensemble predictions for operational flood risk management. Hydrol. Process. 27, 147–157. http://dx.doi.org/10.1002/hyp.9419.
Demirel, M.C., Booij, M.J., Hoekstra, A.Y., 2013. Effect of different uncertainty sources on the skill of 10 day ensemble low flow forecasts for two hydrological models. Water Resour. Res. 49 (7), 4035–4053. http://dx.doi.org/10.1002/wrcr.20294.
Duan, Q., Ajami, N.K., Gao, X., Sorooshian, S., 2007. Multi-model ensemble hydrologic prediction using Bayesian model averaging. Adv. Water Resour. 30, 1371–1386. http://dx.doi.org/10.1016/j.advwatres.2006.11.014.
Evensen, G., 1994. Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res. 99, 10143–10162.
Franz, K.J., Hogue, T.S., 2011. Evaluating uncertainty estimates in hydrologic models: borrowing measures from the forecast verification community. Hydrol. Earth Syst. Sci. 15 (11), 3367. http://dx.doi.org/10.5194/hess-15-3367-2011.
Franz, K.J., Karsten, L.R., 2013. Calibration of a distributed snow model using MODIS snow covered area data. J. Hydrol. 494, 160–175. http://dx.doi.org/10.1016/j.jhydrol.2013.04.026.
Franz, K.J., Hartmann, H.C., Sorooshian, S., Bales, R., 2003. Verification of National Weather Service ensemble streamflow predictions for water supply forecasting in the Colorado River basin. J. Hydrometeorol. 4, 1105–1118. http://dx.doi.org/10.1175/1525-7541(2003)004<1105:VONWSE>2.0.CO;2.
Franz, K.J., Hogue, T.S., Sorooshian, S., 2008a. Operational snow modeling: addressing the challenges of an energy balance model for National Weather Service forecasts. J. Hydrol. 360 (1–4), 48–66. http://dx.doi.org/10.1016/j.jhydrol.2008.07.013.
Franz, K.J., Hogue, T.S., Sorooshian, S., 2008b. Snow model verification using ensemble prediction and operational benchmarks. J. Hydrometeorol. 9, 1402–1415. http://dx.doi.org/10.1175/2008JHM995.1.
Franz, K.J., Butcher, P., Ajami, N.K., 2010. Addressing snow model uncertainty for hydrologic prediction. Adv. Water Resour. 33, 820–832. http://dx.doi.org/10.1016/j.advwatres.2010.05.004.
Gupta, H.V., Sorooshian, S., Yapo, P.O., 1998. Toward improved calibration of hydrologic models: multiple and noncommensurable measures of information. Water Resour. Res. 34 (4), 751–763. http://dx.doi.org/10.1029/97WR03495.
Hamlet, A., Lettenmaier, D., 1999. Columbia River streamflow forecasting based on ENSO and PDO climate signals. J. Water Res. Pl.-ASCE 125 (6), 333–341. http://dx.doi.org/10.1061/(ASCE)0733-9496.
He, M., 2010. Data Assimilation in Watershed Models for Improved Hydrologic Forecasting. Ph.D. Dissertation, Civil and Environmental Engineering, University of California, Los Angeles, pp. 173.
He, M., Hogue, T.S., Franz, K.J., Margulis, S.A., Vrugt, J.A., 2011a. Characterizing parameter sensitivity and uncertainty for an operational snow model across hydroclimatic regimes. Adv. Water Resour. 34 (1), 114–127. http://dx.doi.org/10.1016/j.advwatres.2010.10.002.




He, M., Hogue, T.S., Franz, K.J., Margulis, S.A., Vrugt, J.A., 2011b. Corruption of parameter behavior and regionalization by model and forcing data errors: a Bayesian example using the SNOW17 model. Water Resour. Res. 47 (W07546), 1–17. http://dx.doi.org/10.1029/2010WR009753.
He, M., Hogue, T.S., Franz, K.J., Margulis, S.A., 2012. An integrated uncertainty and ensemble-based data assimilation framework for improved operational streamflow predictions. Hydrol. Earth Syst. Sci. 16, 815–831. http://dx.doi.org/10.5194/hess-16-815-2012.
Hersbach, H., 2000. Decomposition of the continuous ranked probability score for ensemble prediction systems. Weather Forecast. 15, 559–570. http://dx.doi.org/10.1175/1520-0434(2000)015<0559:DOTCRP>2.0.CO;2.
Hogue, T.S., Sorooshian, S., Gupta, H., Holz, A., Braatz, D., 2000. A multistep automatic calibration scheme for river forecasting models. J. Hydrometeorol. 1 (6), 524–542. http://dx.doi.org/10.1175/1525-7541(2000)001<0524:AMACSF>2.0.CO;2.
Hogue, T.S., Gupta, H., Sorooshian, S., 2006. A 'user-friendly' approach to parameter estimation in hydrologic models. J. Hydrol. 320 (1), 202–217. http://dx.doi.org/10.1016/j.jhydrol.2005.07.009.
Hornberger, G.M., Spear, R.C., 1981. An approach to the preliminary analysis of environmental systems. J. Environ. Manage. 12, 7–18.
Jeong, D.I., Kim, Y.O., 2005. Rainfall-runoff models using artificial neural networks for ensemble streamflow prediction. Hydrol. Process. 19 (19), 3819–3835. http://dx.doi.org/10.1002/hyp.5983.
Kavetski, D., Franks, S.W., Kuczera, G., 2002. Confronting input uncertainty in environmental modeling. In: Duan, Q., Gupta, H.V., Sorooshian, S., Rousseau, A.N., Turcotte, R. (Eds.), Calibration of Watershed Models. American Geophysical Union, Washington, D.C., pp. 49–68.
Koren, V., Moreda, F., Smith, M., 2008. Use of soil moisture observations to improve parameter consistency in watershed calibration. Phys. Chem. Earth 22, 1068–1080. http://dx.doi.org/10.1016/j.pce.2008.01.003.
Kuczera, G., Parent, E., 1998. Monte Carlo assessment of parameter uncertainty in conceptual catchment models: the Metropolis algorithm. J. Hydrol. 211, 69–85. http://dx.doi.org/10.1016/S0022-1694(98)00198-X.
Leisenring, M., Moradkhani, H., 2011. Snow water equivalent prediction using Bayesian data assimilation methods. Stoch. Environ. Res. Risk Assess. 25 (2), 253–270. http://dx.doi.org/10.1007/s00477-010-0445-5.
Leisenring, M., Moradkhani, H., 2012. Analyzing the uncertainty of suspended sediment load prediction using sequential Monte Carlo methods. J. Hydrol., 268–282. http://dx.doi.org/10.1016/j.jhydrol.2012.08.049.
Li, Y., Ryu, D., Wang, Q.J., Pagano, T., Western, A., Hapuarachchi, P., Toscas, P., 2011. Assimilation of streamflow discharge into a continuous flood forecasting model. In: Bloschl, G., Takeuchi, K., Jain, S., Farnleitner, A., Schumann, A. (Eds.), Risk in Water Resources Management. IAHS Publ., pp. 107–113.
Li, M., Yang, D., Chen, J., Hubbard, S.S., 2012. Calibration of a distributed flood forecasting model with input uncertainty using a Bayesian framework. Water Resour. Res. 48 (8). http://dx.doi.org/10.1029/2010WR010062.
Liu, Y., Gupta, H., 2007. Uncertainty in hydrologic modeling: toward an integrated data assimilation framework. Water Resour. Res. 43, W07401, 1–18. http://dx.doi.org/10.1029/2006WR005756.
Liu, Y., Weerts, A.H., Clark, M., Hendricks Franssen, H.-J., Kumar, S., Moradkhani, H., Seo, D.-J., Schwanenberg, D., Smith, P., van Dijk, A.I.J.M., van Velzen, N., He, M., Lee, H., Noh, S.J., Rakovec, O., Restrepo, P., 2012. Advancing data assimilation in operational hydrologic forecasting: progresses, challenges, and emerging opportunities. Hydrol. Earth Syst. Sci. 16, 3863–3887. http://dx.doi.org/10.5194/hess-16-3863-2012.
Margulis, S., McLaughlin, D., Entekhabi, D., Dunne, S., 2002. Land data assimilation and soil moisture estimation using measurements from the Southern Great Plains 1997 field experiment. Water Resour. Res. 38 (12), 1299. http://dx.doi.org/10.1029/2001WR001114.
McEnery, J.O., Ingram, J., Duan, Q., Adams, T., Anderson, L., 2005. NOAA's advanced hydrologic prediction service. Bull. Am. Meteorol. Soc. 86, 375–385. http://dx.doi.org/10.1175/BAMS-86-3-375.
McMillan, H., Jackson, B., Clark, M., Kavetski, D., Woods, R., 2011. Rainfall uncertainty in hydrological modeling: an evaluation of multiplicative error models. J. Hydrol. 400 (1), 83–94. http://dx.doi.org/10.1016/j.jhydrol.2011.01.026.
McMillan, H.K., Hreinsson, E.Ö., Clark, M.P., Singh, S.K., Zammit, C., Uddstrom, M.J., 2013. Operational hydrological data assimilation with the recursive ensemble Kalman filter. Hydrol. Earth Syst. Sci. 17 (1), 21–38. http://dx.doi.org/10.5194/hess-17-21-2013.

Montanari, A., Di Baldassarre, G., 2012. Data errors and hydrological modeling: the role of model structure to propagate observation uncertainty. Adv. Water Resour. 51, 498–504. http://dx.doi.org/10.1016/j.advwatres.2012.09.007.
Moradkhani, H., Hsu, K.L., Gupta, H., Sorooshian, S., 2005. Uncertainty assessment of hydrologic model states and parameters: sequential data assimilation using the particle filter. Water Resour. Res. 41, W05012.
Moradkhani, H., DeChant, C.M., Sorooshian, S., 2012. Evolution of ensemble data assimilation for uncertainty quantification using the particle filter-Markov chain Monte Carlo method. Water Resour. Res. 48, W12520. http://dx.doi.org/10.1029/2012WR012144.
Murphy, A.H., Winkler, R.L., 1987. A general framework for forecast verification. Mon. Weather Rev. 115, 1330–1338.
Murphy, A.H., Winkler, R.L., 1992. Diagnostic verification of probability forecasts. Int. J. Forecast. 7, 435–455. http://dx.doi.org/10.1016/0169-2070(92)90028-8.
Najafi, M.R., Moradkhani, H., Jung, I.W., 2011. Assessing the uncertainties of hydrologic model selection in climate change impact studies. Hydrol. Process. 25 (18), 2814–2826. http://dx.doi.org/10.1002/hyp.8043.
NWS, 2004. National Weather Service River Forecast System (NWSRFS) User's Manual. Available from NOAA/National Weather Service, Office of Hydrology, 1325 East–West Hwy, Silver Spring, MD 20910.
Parajka, J., Blöschl, G., 2008. The value of MODIS snow cover data in validating and calibrating conceptual hydrologic models. J. Hydrol. 358 (3–4), 240–258. http://dx.doi.org/10.1016/j.jhydrol.2008.06.006.
Renard, B., Kavetski, D., Kuczera, G., Thyer, M., Franks, S.W., 2010. Understanding predictive uncertainty in hydrologic modeling: the challenge of identifying input and structural errors. Water Resour. Res. 46, W05521. http://dx.doi.org/10.1029/2009WR008328.
Renner, M., Werner, M.G.F., Rademacher, S., Sprokkereef, E., 2009. Verification of ensemble flow forecasts for the River Rhine. J. Hydrol. 376, 463–475. http://dx.doi.org/10.1016/j.jhydrol.2009.07.059.
Seo, D.J., Cajina, L., Corby, R., Howieson, T., 2009. Automatic state updating for operational streamflow forecasting via variational data assimilation. J. Hydrol. 367 (3–4), 255–275. http://dx.doi.org/10.1016/j.jhydrol.2009.01.019.
Shamir, E., Georgakakos, K.P., 2007. Estimating snow depletion curves for American River basins using distributed snow modeling. J. Hydrol. 334 (1–2), 162–173. http://dx.doi.org/10.1016/j.jhydrol.2006.10.007.
Slater, A.G., Clark, M.P., 2006. Snow data assimilation via an ensemble Kalman filter. J. Hydrometeorol. 7, 478–493. http://dx.doi.org/10.1175/JHM505.1.
Udnæs, H.-C., Alfnes, E., Andreassen, L.M., 2007. Improving runoff modeling using satellite-derived snow covered area? Nord. Hydrol. 38 (1), 21–32. http://dx.doi.org/10.2166/nh.2007.032.
Vrugt, J.A., ter Braak, C., Gupta, H., Robertson, D., 2008a. Equifinality of formal (DREAM) and informal (GLUE) Bayesian approaches in hydrologic modeling? Stoch. Environ. Res. Risk Assess. 23 (7), 1011–1026. http://dx.doi.org/10.1007/s00477-008-0274-y.
Vrugt, J.A., ter Braak, C., Clark, M.P., Hyman, J.M., Robinson, B.A., 2008b. Treatment of input uncertainty in hydrologic modeling: doing hydrology backwards with Markov chain Monte Carlo simulation. Water Resour. Res. 44 (2), W00B09. http://dx.doi.org/10.1029/2007WR006720.
Wagener, T., Gupta, H.V., 2005. Model identification for hydrological forecasting under uncertainty. Stoch. Environ. Res. Risk Assess. 19 (6), 378–387. http://dx.doi.org/10.1007/s00477-005-0006-5.
Welles, E., Sorooshian, S., Carter, G., Olsen, B., 2007. Hydrologic verification: a call for action and collaboration. Bull. Am. Meteorol. Soc. 88 (4), 503–511. http://dx.doi.org/10.1175/BAMS-88-4-503.
Westrick, K.J., Storck, P., Mass, C.F., 2002. Description and evaluation of a hydrometeorological forecast system for mountainous watersheds. Weather Forecast. 17, 250–262.
Wilks, D.S., 2006. Statistical Methods in the Atmospheric Sciences, second ed. Academic Press, pp. 627.
Wood, A.W., Schaake, J.C., 2008. Correcting errors in streamflow forecast ensemble mean and spread. J. Hydrometeorol. 9, 132–148. http://dx.doi.org/10.1175/2007JHM862.1.
Wood, A.W., Maurer, E.P., Kumar, A., Lettenmaier, D., 2002. Long-range experimental hydrologic forecasting for the eastern United States. J. Geophys. Res. 107 (D20), 4429. http://dx.doi.org/10.1029/2001JD000659.
Xiong, L., O'Connor, K.M., 2008. An empirical method to improve the prediction limits of the GLUE methodology in rainfall-runoff modeling. J. Hydrol. 349, 115–124. http://dx.doi.org/10.1016/j.jhydrol.2007.10.029.
