Incorporation of soft data to describe uncertainty of data in model calibration

I.M. Khadam and J.J. Kaluarachchi

Department of Civil and Environmental Engineering and Utah Water Research Laboratory, Utah State University, Logan, Utah 84322-8200

In this work we present an intuitive and simple framework for incorporating "soft data" about the accuracy of different parts of the calibration data into the calibration process. The framework makes use of a ranking method developed in operations research that requires only the specification of the importance order of the criteria used for ranking. The framework is demonstrated for the Fishtrap Creek catchment, where a short streamflow record with significant gaps was reconstructed using Support Vector Machines (SVM). The reconstructed streamflow is inherently less accurate than the observed streamflow. Incorporating this educated judgment about the relative accuracy of the calibration data in the proposed framework resulted in the identification of faulty model calibration as well as errors in the SVM prediction of extreme streamflow events, which would otherwise have gone undetected.

1. INTRODUCTION
Availability of continuous streamflow records is important for understanding and modeling the hydrological behavior of any catchment. Poor management of streamflow gauging networks is not an uncommon problem; it may stem from a lack of understanding of, or appreciation for, the need for continuous flow recording, but more often it is a result of financial restrictions. Common problems associated with streamflow data collection include: (1) gaps in the streamflow record, (2) gauges that do not remain at the same location over time, preventing useful analysis of historical trends in the streamflow record, and (3) gauges placed at locations with no hydrological significance, e.g., away from a catchment outlet. As a result, understanding and modeling the hydrological behavior of a catchment becomes more challenging. Understanding hydrological behavior is key to understanding the hydrological controls on the fate and transport of pollutants in the catchment, including nutrients and pesticides. Therefore, poor hydrologic data collection affects the assessment and management of watershed health, which requires modeling of watershed hydrological behavior. The implication of these problems for hydrological modeling is that model development and calibration become increasingly difficult as the availability and quality of calibration data deteriorate. Classically, calibration of a conceptual model is an exercise of fitting predicted streamflow to observed streamflow by adjusting model parameters. More generally, model calibration
is the reduction of the discrepancy between observed and predicted system response. In the case of watershed models, system response refers to streamflow as well as to soil moisture, groundwater level, base flow, and other hydrological measurements. Model calibration requires that the observed system response be accurate, so that any discrepancy between predicted and observed responses is due only to flaws in the calibration or in the model structure. In this work we develop a framework to handle the problem of uncertainty in the calibration data, specifically the problem of calibration data that come from sources with varying degrees of reliability. This problem is further aggravated because the reliability of the data can only be qualified subjectively, i.e., quantification of the uncertainty is not possible. The developed calibration framework is applied to calibrate a rainfall-runoff model for the Fishtrap Creek catchment in Washington State. Uncertainties associated with the observed system response can force the calibration to an erroneous solution. Of particular concern is the case where the observations contain data from different sources with different levels of uncertainty. Occasionally, certain portions of the observed record suffer from higher uncertainties than the rest, e.g., missing streamflow measurements that have been filled by interpolation or extrapolation. Moreover, different system responses (e.g., streamflow and soil moisture) are typically collected using different measurement techniques, and are analyzed and averaged at different temporal and spatial scales. Hence, different system responses are expected to contain different levels and types of noise. If the fact that observed responses contain different levels of uncertainty is not explicitly handled in the calibration process, it can potentially result in a questionable calibration outcome. However, incorporating information about the relative accuracy of data in the calibration process is difficult because this information is typically qualitative, i.e., "soft data" in the form of professional judgment. Therefore, there is a need for a framework that can include "soft data" about the relative accuracy of observed data in the calibration process. In the next section, a calibration scheme for handling calibration data with varying levels of uncertainty is developed.

2. METHODOLOGY
An intuitive answer to the problem of calibration data with different levels of uncertainty is to give different parts of the data different weights in the calibration process based on their relative uncertainties. By definition, model calibration is the search for the optimal solution set in the space of all feasible solutions. The search implies a ranking of all feasible solution sets based on an objective function (Λ). The optimal set is the one ranked best. Grouping the calibration data by their relative uncertainty into m groups, an index Λi can be defined for each group (Λ1, Λ2, ..., Λm). Using the accuracy order of the calibration data, a set of m weights (w1, w2, ..., wm) can be defined such that w1 ≥ w2 ≥ ... ≥ wm. The objective function can then be defined as a weighted sum of the m indices corresponding to each group of calibration data:

\[
\Lambda = \sum_{i=1}^{m} w_i \Lambda_i \qquad (1)
\]
Solution sets are then ranked based on Λ. The best solution set is the one that maximizes or minimizes Λ, depending on the definition of the objective function.
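As a small illustration (the numbers are hypothetical, not from the paper): with m = 2 groups, where the more reliable group yields Λ1 = 0.8 and the less reliable group yields Λ2 = 0.5, the admissible weight vector (w1, w2) = (0.7, 0.3) gives Λ = 0.7 × 0.8 + 0.3 × 0.5 = 0.71, whereas (0.5, 0.5) gives 0.65. Different admissible weight vectors can therefore rank candidate solutions differently, which is the difficulty addressed next.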
To define the weights that describe the importance order of the calibration data, some assumptions about the relative size of the uncertainties need to be made. However, since the relative uncertainties in the data are only defined subjectively, direct numerical quantification of the weights is not possible. It is only possible to use professional judgment to define a reliability or importance order for the different parts of the calibration data. Subjective assignment of weights can result in biased weights, which could erroneously influence model calibration. To overcome the bias associated with subjective weight assignment, we use a method proposed by Yakowitz et al. [6] for ranking based on an order of importance defined over the criteria. The method calculates best and worst scores for each option using the predefined importance order, without requiring the specification of the weights beforehand. The ranking is then carried out based on the average of the best and worst scores. The best score (Λmax) for any alternative is obtained by solving the following problem:
\[
\text{Maximize} \quad \Lambda = \sum_{i=1}^{m} w_i \Lambda_i
\qquad \text{subject to:} \quad
w_1 \ge w_2 \ge \dots \ge w_m, \quad \sum_{i=1}^{m} w_i = 1, \quad w_i \ge 0 \qquad (2)
\]
Similarly, the worst score (Λmin) is obtained by minimizing Λ subject to the same constraints. The solutions to the above maximization and minimization problems can be expressed in closed form as:

\[
\Lambda_{\max} = \max_k\{\Omega_k\} \quad \text{and} \quad \Lambda_{\min} = \min_k\{\Omega_k\},
\qquad \text{where} \quad \Omega_k = \frac{1}{k}\sum_{i=1}^{k}\Lambda_i, \quad k = 1, \dots, m \qquad (3)
\]
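As an illustrative sketch (not the authors' code), the closed form in Equation (3) amounts to taking cumulative averages of the criterion values listed in importance order; the ranking score is then the average of the best and worst scores. The numerical values in the example are hypothetical.

```python
from typing import Sequence


def importance_order_score(criteria: Sequence[float]) -> float:
    """Ranking score (average of best and worst scores, Eqs. 2-3) for one candidate
    solution, where `criteria` holds Lambda_1..Lambda_m ordered from most to least
    reliable/important."""
    cumulative, omegas = 0.0, []
    for k, value in enumerate(criteria, start=1):
        cumulative += value
        omegas.append(cumulative / k)            # Omega_k = (1/k) * sum_{i<=k} Lambda_i
    lam_max, lam_min = max(omegas), min(omegas)  # Eq. (3)
    return 0.5 * (lam_max + lam_min)


# Hypothetical example: two candidate parameter sets scored on two data groups
# (index of the observed flow first, index of the reconstructed flow second).
print(importance_order_score([0.80, 0.55]))   # 0.7375
print(importance_order_score([0.70, 0.75]))   # 0.7125 -> the first set ranks higher
```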
Yakowitz et al. [6] proved that, for each alternative, the solutions to these two linear programs determine the maximum and minimum scores possible for any combination of weights that does not violate the given importance order of the criteria. The difference between Λmax and Λmin describes the sensitivity of each alternative to the choice of weights: the greater the difference (Λmax − Λmin), the greater the sensitivity. The score used to rank alternatives is typically the average of Λmax and Λmin. The proposed methodology for model calibration with data of varying levels of uncertainty uses the order-of-importance method to calculate a single score for each solution set. The score is then used to search the space of feasible solutions for the optimal solution to the calibration problem. The advantage of this methodology is that it is able to make use of the "soft data" provided by professional judgment about the relative uncertainty in the calibration data. The "soft data" are incorporated as an essential part of the methodology and are then translated into numerical scores. Another advantage of this framework is that it can handle systems with both single and multiple responses without any modification to the basic methodology. For example, consider a model with two system responses (R_a and R_b) such that the uncertainty of the observed response R̂_a is lower than that of R̂_b; the score (Λ) for the solution of the model calibration problem can then be defined as:
\[
\Lambda = w_a \times \Lambda_a + w_b \times \Lambda_b \qquad (4)
\]
where Λ_a and Λ_b are scaled error indices, e.g., coefficients of efficiency, that are functions of the calibration errors e_a and e_b (the differences between the observed and predicted responses). Given that w_a ≥ w_b, the solution score (Λ) can then be estimated using the order-of-importance method in the manner described in Equations (2) and (3). If each of the observed system responses (R̂_a and R̂_b) is broken into groups with varying levels of uncertainty, then a global ranking of all the groups is required; the rest of the calculation procedure is the same. It should be noted that if all the observed responses turn out to be accurate, then the final solution is not sensitive to the particular choice of the weights, provided that the model structure is correct.
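As an illustrative example (values assumed for the illustration only): if a candidate parameter set yields Λ_a = 0.9 for the more reliable response and Λ_b = 0.4 for the less reliable one, the cumulative averages are Ω1 = 0.9 and Ω2 = (0.9 + 0.4)/2 = 0.65, so Λmax = 0.9, Λmin = 0.65, and the ranking score is (0.9 + 0.65)/2 = 0.775, without ever fixing w_a and w_b numerically.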
3. CASE STUDY

3.1. Problem description

Fishtrap Creek catchment is a small (96 km²), agriculture-dominated watershed in the northwestern part of Washington State. Figure 1 shows the locations of four gauging stations that have been operating at different times along Fishtrap Creek. It is clear that no single gauge has been operating for the whole period of streamflow record, which started in 1948. In addition, there is a gap of 16 years between 1971 and 1987 during which no record of river flow exists at any location. The downstream gauge on Fishtrap Creek (G4) is the closest gauge to the catchment outlet, which is the point of hydrological significance for rainfall-runoff modeling of the catchment. Streamflow gauge G4 has a short time series of less than 3 years, which also contains data gaps, as shown in Figure 2.
Figure 1. Streamflow gauges at Fishtrap Creek catchment. Dates show the period of streamflow recorded at each gauge.
Figure 2. Daily streamflow of Fishtrap Creek at the downstream gauge (G4), with the missing flow record highlighted as a bold line (streamflow is normalized by the catchment area).
Modeling rainfall-runoff requires a reasonably long record of streamflow and precipitation data. For example, Yapo et al. [7] found, for their data set, that eight years of data were needed to calibrate a rainfall-runoff model with 13 parameters. They also concluded that data containing greater hydrologic variability are likely to result in more reliable estimates of model parameters, which further supports the need for longer records. In the case of Fishtrap Creek, the available streamflow record at G4 is not suitable for rainfall-runoff modeling of the catchment because of its short length and many gaps. Since streamflow at gauge G1 has a complete record for the most recent 15 years (June 1987 to December 2001), and there is a common period in the streamflow records at G1 and G4 (June 1996 to March 1999) (see Figure 1), the record at G1 can be used to reconstruct the record at G4. The reconstructed record can then be used to calibrate a rainfall-runoff model for Fishtrap Creek.

3.2. Streamflow reconstruction

The G1 streamflow record was used to reconstruct the record at G4 using Support Vector Machines (SVM). SVM are mathematical algorithms for classification and regression derived from statistical learning theory [4]. SVM are similar to Artificial Neural Networks (ANN) in their ability to learn patterns and relations, but they are more robust as learning algorithms than ANN because their quadratic objective function avoids the problem of local minima. In the process of learning relationships from the training data, SVM keep the training vectors with the highest significance in the model; these are called support vectors.
SVM have high generalization power because of their ability to identify the key vectors of the training data to be used as support vectors; because of the use of kernel functions, only a fraction of the training vectors is retained. Kernels are non-linear functions used to transform the input vectors in order to exploit the information in the data more efficiently. For a complete description and mathematical treatment of SVM, refer to Vapnik [4] and Cristianini and Shawe-Taylor [1]. Developing an SVM model starts by choosing the kernel function and specifying the cost parameter C, which controls the trade-off between the complexity of the model and the size of the errors, and ε, the error tolerance; ε can be replaced by an equivalent parameter ν, the upper bound on the fraction of errors. Calibration and verification carried out on the SVM model used to reconstruct the streamflow record at G4 indicated good performance (Figure 3). The average bias during calibration and verification was about 6 percent, where bias is defined as the ratio of the observation error to the observed flow; the mean absolute error (MAE) is 0.003 and R² is 0.995. The reconstructed streamflow record at G4 is shown in Figure 4. Although the calibration and verification results indicated excellent performance of the SVM model, reconstructing streamflow this way has a major shortcoming: it is not possible to independently verify the reconstructed record. There is always the possibility that the short streamflow record (3 years) used to develop the SVM model does not capture the full pattern of variability in streamflow behavior. As a result, the SVM model used to predict streamflow beyond the 3 years of training data generates streamflow with increased levels of uncertainty. In addition, it is difficult to quantify the magnitude of the uncertainty in the reconstructed record because no independent verification of the reconstructed streamflow is available. Therefore, rainfall-runoff modeling of Fishtrap Creek catchment should take into consideration the higher uncertainty of the reconstructed record compared to that of the observed part of the record. The developed methodology for handling calibration data with varying levels of uncertainty is applied below to model rainfall-runoff in Fishtrap Creek catchment.
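As a rough illustration (not the authors' implementation), the sketch below uses scikit-learn's NuSVR, the ν-formulation mentioned above, with an RBF kernel to map G1 flows onto G4 flows. The lag structure, the parameter values (C, ν, γ), and the array names are assumptions made for the example; the paper does not specify them.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import NuSVR


def lag_matrix(flow: np.ndarray, lags: int = 2) -> np.ndarray:
    """Rows are [q(t), q(t-1), ..., q(t-lags)]; the first `lags` steps are dropped."""
    return np.column_stack([flow[lags - k: len(flow) - k] for k in range(lags + 1)])


def reconstruct_g4(g1_common: np.ndarray, g4_common: np.ndarray,
                   g1_full: np.ndarray, lags: int = 2) -> np.ndarray:
    """Train on the common G1/G4 period, then predict G4 over the full G1 record."""
    model = make_pipeline(StandardScaler(),
                          NuSVR(kernel="rbf", C=10.0, nu=0.2, gamma="scale"))
    model.fit(lag_matrix(g1_common, lags), g4_common[lags:])  # drop the first `lags` targets
    return model.predict(lag_matrix(g1_full, lags))
```

In practice the kernel choice and the (C, ν) pair would be selected by split-sample calibration and verification, as described above for the actual reconstruction.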
3.3. Model structure
The linear transfer function used to model rainfall-runoff generation at monthly time steps is a simple tank model with a single storage and a single hole (SS-SH model). The model has one state variable, S1, the level of water in the storage, and three parameters: S1_max, the maximum level of water in the storage; Z1, the height of the runoff hole; and K1, the coefficient of the runoff hole. Runoff is calculated from the following relations:
\[
ET_t = PET_t \times \frac{S1_{t-1}}{S1_{\max}} \qquad (5)
\]

\[
Q1_t = (S1_{t-1} - Z1) \times K1 \qquad (6)
\]

\[
S1_t = S1_{t-1} + (P_t - ET_t) - Q1_t \qquad (7)
\]
where P_t is the total precipitation at time step t. The SS-SH monthly model has three parameters to be calibrated (K1, Z1, S1_max) in addition to the initial storage S1_{t=0}. Close inspection of the model formulation shows that the parameters Z1 and S1_max are not independent, but are related to the value of the state variable S1 and the initial storage S1_{t=0}. Hence, to simplify calibration, Z1 and S1_max are set to 0 and 25 cm, respectively. In the case of the SS-DH daily model, S1_max is also set to 25 cm, while Z1 is not fixed a priori because the relative values of the parameters Z1 and Z2 are important.

Figure 4. Observed and reconstructed streamflow records at gauges G1 and G4.

Model calibration was carried out using genetic algorithms [5, 2]. The model is calibrated with data consisting of both observed and reconstructed records. The objective function (Λ) used for calibration is a two-criteria function. The two criteria are the coefficient of efficiency [3] of the observed streamflow, C_obs, and the coefficient of efficiency of the reconstructed streamflow record, C_rec. The coefficient of efficiency ranges from minus infinity to 1, with an optimal value of 1; a coefficient of efficiency of zero indicates that the mean of the observations is as good an estimator as the developed model. The objective function is defined as the maximization of the weighted average:

\[
\text{maximize:} \quad \Lambda = w_{obs} \times C_{obs} + w_{rec} \times C_{rec} \qquad (8)
\]
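A minimal sketch (not the authors' code) of Equations (5)-(8) is given below: the first function steps the SS-SH tank model through a monthly forcing series, the second computes the coefficient of efficiency [3], and the third combines the observed and reconstructed portions of the record in the weighted objective of Equation (8). The non-negativity and capacity bounds on the storage and the example weights are assumptions added for the illustration.

```python
import numpy as np


def ss_sh(precip, pet, k1, s1_init, s1_max=25.0, z1=0.0):
    """Single-storage, single-hole tank model (Eqs. 5-7); returns simulated runoff Q1_t."""
    s1, runoff = s1_init, []
    for p, pe in zip(precip, pet):
        et = pe * s1 / s1_max                          # Eq. (5): ET scaled by relative storage
        q = max(s1 - z1, 0.0) * k1                     # Eq. (6); the max() bound is an assumption
        s1 = min(max(s1 + (p - et) - q, 0.0), s1_max)  # Eq. (7), bounded (assumption)
        runoff.append(q)
    return np.asarray(runoff)


def coeff_efficiency(obs, sim):
    """Nash-Sutcliffe coefficient of efficiency [3]: 1 - sum(e^2) / sum((obs - mean)^2)."""
    obs, sim = np.asarray(obs), np.asarray(sim)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def objective(q_data, q_sim, obs_idx, rec_idx, w_obs=0.7, w_rec=0.3):
    """Eq. (8): weighted sum of CE over observed and reconstructed parts (weights assumed)."""
    c_obs = coeff_efficiency(q_data[obs_idx], q_sim[obs_idx])
    c_rec = coeff_efficiency(q_data[rec_idx], q_sim[rec_idx])
    return w_obs * c_obs + w_rec * c_rec
```

Setting w_obs = w_rec reproduces the equal-reliability case discussed below (Figure 5, left), while w_obs > w_rec, or feeding [C_obs, C_rec] to the order-of-importance score of Section 2, corresponds to the case where the observed data are trusted more.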
where w_obs and w_rec are weights that describe the relative influence of the observed and reconstructed flows, respectively, on the objective function.

3.4. Model calibration

To assess the effect of calibrating the model with data containing different levels of uncertainty, a data set consisting of observed and reconstructed data was used to calibrate the SS-SH monthly model. The accuracy of the model prediction is measured by the objective function in Equation (8). Figure 5 presents the surfaces of the objective function over the SS-SH monthly model parameters: the time constant (K1) and the initial storage (S1_0). The height of the storage hole (Z1) was set to zero and S1_max to 25 cm. The two surfaces were developed using calibration data consisting of observed and reconstructed data assuming: (1) both are of equal reliability (Figure 5, left) and (2) the observed data are more reliable than the reconstructed data (Figure 5, right). Figure 5 shows a different response surface for each of the calibration cases. Although in both cases the model was calibrated with the same data set, the two surfaces differ when emphasis is placed on a certain part of the calibration data in the second case. If both the observed and reconstructed flow records were accurate, the two surfaces would not have changed, and the assignment of the weights (w_obs and w_rec) in the objective function of Equation (8) would have no effect on the overall objective function. The objective function is insensitive to the weights when there is no trade-off reflecting a conflict. Conflict between different parts of the system response appears if an inappropriate model structure is used, or if varying levels of uncertainty are present in the different system responses. Therefore, the conclusion drawn from the plots of the objective function surfaces is that a conflict does exist between the observed and reconstructed streamflow records. Assuming that the chosen model structure is the true structure and that no significant errors are present in the data set, the only explanation for the conflict between the observed and reconstructed flows is that they were generated by two different models. Two different models for the same catchment could exist if structural changes in the catchment took place, changing the rainfall-runoff dynamics. Structural changes in Fishtrap Creek catchment are unlikely to have occurred during the period of record because there is no documentation of such changes and because of the short period separating the observed and reconstructed parts of the record (less than a year).
Figure 5. Contours of the objective function (Λ = w_obs × C_obs + w_rec × C_rec) for the SS-SH model calibrated with observed and reconstructed data assuming: (1) both of equal reliability (left), and (2) observed data more reliable than reconstructed data (right). Model parameters Z1 = 0 and S1_max = 25 cm.
Therefore, the most plausible explanation for the conflict between the observed and reconstructed streamflow is the presence of significant errors in one of the time series, and for the reasons discussed above the reconstructed streamflow is the most likely to contain such inaccuracies.

4. CONCLUSION
We presented a simple framework for incorporating "soft data" about the accuracy of different parts of the calibration data into the calibration process. The framework makes use of a ranking method developed in operations research that requires only the specification of the importance order of the criteria used for ranking. The framework was demonstrated for the Fishtrap Creek catchment, where a short streamflow record with significant gaps was reconstructed using SVM. The reconstructed streamflow is inherently less accurate than the observed streamflow. Incorporating this educated judgment about the relative accuracy of the calibration data in the proposed framework resulted in the identification of faulty model calibration as well as errors in the SVM prediction of extreme streamflow events, which would otherwise have gone undetected. The developed framework, although demonstrated for a case of a single system response (runoff) with varying levels of uncertainty, can easily be implemented for cases where multiple system responses are used for calibration. In addition, multi-objective optimization can also be implemented within the proposed framework, where different objective functions can be used. The advantage of this framework in these applications
is that it avoids bias and subjectivity in the assignment of weights to the calibration criteria.

REFERENCES
1. Cristianini, N., and J. Shawe-Taylor. 2000. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press. 189 p.
2. Holland, J. H. 1975. Adaptation in Natural and Artificial Systems. The University of Michigan Press, Ann Arbor, MI.
3. Nash, J. E., and J. V. Sutcliffe. 1970. River flow forecasting through conceptual models. Part I: A discussion of principles. Journal of Hydrology, 10: 282-290.
4. Vapnik, V. N. 1995. The Nature of Statistical Learning Theory. Springer-Verlag, New York.
5. Wang, Q. J. 1991. The genetic algorithm and its application to calibrating conceptual rainfall-runoff models. Water Resources Research, 27: 2467-2471.
6. Yakowitz, D. S., L. J. Lane, and F. Szidarovszky. 1993. Multi-attribute decision making: dominance with respect to an importance order of the attributes. Applied Mathematics and Computation, 54: 167-181.
7. Yapo, P. O., H. V. Gupta, and S. Sorooshian. 1996. Automatic calibration of rainfall-runoff models: sensitivity to calibration data. Journal of Hydrology, 181: 23-48.