Cold Regions Science and Technology 109 (2015) 33–42
Calibration of nearest neighbors model for avalanche forecasting

Amreek Singh a,*, Bhanu Damir b, Kusum Deep b, Ashwagosha Ganju a

a Snow & Avalanche Study Establishment, Chandigarh, India
b Indian Institute of Technology, Roorkee, India
Article history: Received 27 February 2014; Accepted 21 September 2014; Available online 28 September 2014

Keywords: Nearest neighbors method; Model calibration; Uniform random sampling; Artificial bee colony algorithm; Heidke skill score; Avalanche forecasting
Abstract

The nearest neighbors (NN) method is very popular among avalanche forecasters the world over for data exploration and forecast guidance. NN method based models employ snow-meteorological variables as input to search for past days with similar conditions, and then analyze the events associated with those similar days to generate a forecast. In order to achieve high forecast accuracy, a NN model needs to be calibrated by assigning distinct weights to the various input variables. Model calibration may thus be treated as an optimization problem whose objective is to maximize forecast accuracy. To investigate the structural characteristics of the problem, the uniform random sampling method was applied to eNN10 (a NN model developed in India for avalanche forecasting). The problem is characterized by multiple optima. Population based metaheuristics are suggested to handle such optimization problems better than classical analytical methods. Thus the artificial bee colony algorithm, a metaheuristic inspired by the foraging behavior of honey bees, was explored to calibrate eNN10. The study was conducted for two climatologically diverse avalanche prone regions of the Indian Himalaya. The results were verified for operational forecasting, with significant gains in forecast accuracy in terms of the Heidke skill score. © 2014 Elsevier B.V. All rights reserved.
1. Introduction

Avalanche forecasting is primarily a decision making process taking into account various contributory factors; McClung (2002a,b) describes the process in detail. Avalanche forecasters often rely upon NN method based models for assistance in decision making, since the operation of these models is intuitive and similar to the processes of conventional forecasting (Purves et al., 2003). NN models search for past days with similar snow-meteorological conditions, and the information about the events associated with these past days (called nearest neighbors) then helps in generating a forecast for the current day. However, in order to achieve high forecast accuracy from NN models, certain model parameters need to be pre-determined, as emphasized by many avalanche workers (Buser et al., 1987; Gassner et al., 2000; Purves et al., 2003). Mathematically, this is an optimization problem. In the field of hydrology, Duan et al. (1992) carried out a somewhat similar exercise of pre-determining proper parameter values for a conceptual rainfall-runoff (CRR) model. The procedure of determining optimal values of model parameters is termed model calibration (Duan et al., 1992). To our knowledge, the study by Purves et al. (2003) is the only previously published work on automatic calibration of a NN method based model for avalanche forecasting. Purves et al. (2003) applied a genetic algorithm (Beasley et al., 1993a,b) to calibrate the Cornice model and discussed in detail the effect of various statistical measures of forecast accuracy on model calibration.
However, the subject of NN model calibration needs to be investigated in detail in order to ensure a globally optimal solution and the operational effectiveness of the calibrated model. The present work investigates the structural characteristics of the optimization problem of NN model calibration, suggests an appropriate optimization method, and verifies the operational effectiveness of the calibrated model. In this regard a popular metaheuristic (Blum and Roli, 2003) known as the artificial bee colony (ABC) algorithm (Karaboga, 2005) was deployed to calibrate eNN10, a NN model for avalanche forecasting developed at the Snow & Avalanche Study Establishment (SASE) of India. The study was conducted for two climatologically diverse avalanche prone regions of the Indian Himalaya.

The paper consists of 7 main sections. Section 2 describes the eNN10 model in detail. Section 3 explains the scope of the study. Section 4 describes the study areas and data characteristics. Section 5 discusses model calibration as an optimization problem, highlights the issue of multiple optima and introduces the ABC algorithm for global optimization. Results and discussion are presented in Section 6, illustrating that despite some challenges NN models can be efficiently calibrated and implemented for operational forecasting with higher accuracy. Conclusions are summarized in Section 7.

2. eNN10: a nearest-neighbors model for avalanche forecasting

The NN method for avalanche forecasting was introduced by Obled and Good (1980). Buser (1983) packaged it in the form of an interactive model named NXD. Later many other feature-rich variants of NN models were proposed for avalanche forecasting (Brabec and Meister,
2001; Buser et al., 1987; Gassner and Brabec, 2002; Gassner et al., 2000; Heierli et al., 2004; Kristensen and Larsson, 1994; McClung and Tweedy, 1994; McCollister et al., 2002; Mérindol et al., 2002; Purves et al., 2003; Singh and Ganju, 2008; Singh et al., 2005). NN models may employ direct and derived snow-meteorological variables as input. Table 1 describes the 10 snow-meteorological variables of the eNN10 model.

While other researchers, including the eNN10 developers, chose the Euclidean distance metric to determine nearest neighbors, McClung and Tweedy (1994) used the Mahalanobis distance metric for avalanche prediction at Kootenay Pass, British Columbia, Canada. The use of the Euclidean distance metric assumes no correlation between variables, whereas the Mahalanobis distance metric has implicit features to deal with correlated variables. Invariably, NN models with the Euclidean distance metric apply distinct weights to the input variables. In eNN10, for a D-dimensional vector space, the weighted Euclidean distance d(i, j) between two data vectors x^i = (x_1^i, x_2^i, ..., x_D^i) and x^j = (x_1^j, x_2^j, ..., x_D^j) is calculated as

d(i, j) = \sqrt{\sum_{k=1}^{D} w_k \left( \mathrm{norm}(x_k^i) - \mathrm{norm}(x_k^j) \right)^2}     (1)
where w_k is the weight corresponding to variable x_k, k ∈ {1, …, D}, and norm(x_k) is the value of variable x_k normalized by its range in the database.

The following steps are implemented in eNN10 to generate a forecast for any day i:

1. Calculate the distance d(i, j) between the current day i (under consideration for forecast) and each past reference record j ∈ {1, …, n} using Eq. (1), where n is the number of past reference records in the database.
2. Sort all d(i, j) calculated in step 1 in ascending order.
3. Segregate from the database the records corresponding to the K (generally 10) shortest distances d(i, j) (the 'K nearest neighbors').
4. Note the events (yes/no) corresponding to the K nearest neighbors.
5. Calculate the relative frequency of avalanche occurrence with respect to the current day i, P_a(i) = K_a / K, where K_a is the number of 'yes' (avalanche) days among the K nearest neighbors.
6. If P_a(i) ≥ P_threshold, forecast the current day as an avalanche day; otherwise forecast it as a no-avalanche day.

In the above procedure, a commonly adopted value of P_threshold is 0.3 (Buser et al., 1987). It is also obvious that with K = 10, P_a ∈ {0, 0.1, 0.2, …, 1.0}.
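For concreteness, the six steps above can be sketched in a few lines of Python. This is an illustrative reconstruction, not SASE's operational code; the data layout (NumPy arrays), the function name and the handling of constant columns are our assumptions.

import numpy as np

def forecast_day(x_new, X_past, events_past, w, K=10, p_threshold=0.3):
    # x_new:       (D,) variable values for the current day
    # X_past:      (n, D) past reference records
    # events_past: (n,) booleans, True on avalanche days
    # w:           (D,) variable weights, 0 <= w_k <= 1

    # Normalize every variable by its range in the database, norm(x_k) in Eq. (1).
    lo, hi = X_past.min(axis=0), X_past.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)           # guard against constant columns
    Xn, xn = (X_past - lo) / span, (x_new - lo) / span

    # Step 1: weighted Euclidean distances d(i, j) of Eq. (1).
    d = np.sqrt((w * (xn - Xn) ** 2).sum(axis=1))

    # Steps 2-3: records with the K shortest distances (the K nearest neighbors).
    nn = np.argsort(d)[:K]

    # Steps 4-5: relative frequency of avalanche occurrence, P_a = K_a / K.
    p_a = events_past[nn].mean()

    # Step 6: threshold P_a to obtain a yes/no forecast.
    return p_a >= p_threshold, p_a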
Table 1. Input snow-meteorological variables of the eNN10 model.

No. (k)   Variable (x_k)                                    Unit
1         Snow surface temperature                          °C
2         Air temperature                                   °C
3         Air temperature change from previous day          °C
4         Wind speed (averaged over last 24 h)              m/s
5         New snow in 24 h                                  m
6         New snow in 48 h                                  m
7         New snow in 72 h                                  m
8         Snowpack depth                                    m
9         Snowpack water equivalent                         m
10        Free penetration from surface                    m

3. Scope of study

In Eq. (1), the introduction of the weight parameters w_k, k ∈ {1, …, D} is supposed to improve the model accuracy, provided proper values are assigned to them. However, the values of w_k are generally assigned by experts subjectively, on the basis of their experience of using the model. The values of K (the number of nearest neighbors considered for calculating P_a) and of P_threshold are likewise decided by experts out of their experience of using the model. Hence, the optimal values of these parameters are not assured, leading to uncertainty about the model forecast accuracy. These facts call for the application of an automated objective method to calibrate the model, i.e. to determine appropriate values of w = (w_1, w_2, …, w_D), K and P_threshold for maximized forecast accuracy. Since the three parameters w, K and P_threshold play their roles at different stages of the process, treating all of them simultaneously is a complex and computationally intensive task. To avoid these circumstances, the scope of the present study is limited to the determination of globally optimal values of w only, while K and P_threshold carry the fixed values determined by the experts (10 and 0.3 respectively).

4. Study area and data characteristics

In order to validate the versatility of our study and develop a wider perspective on the subject of NN model calibration, we conducted our study for two climatologically diverse areas of the Indian western Himalaya: (1) the Chowkibal–Tangdhar (CT) sector in the Pir Panjal range and (2) the Drass–Kargil (DK) sector in the Great Himalayan range. The rough geographical extents of these regions are shown in Fig. 1. Snow-meteorological data from the Stage-II (2650 m a.s.l.) and Drass (3230 m a.s.l.) observatories have been used to represent the CT sector and DK sector respectively. While the CT sector may be classified as having a maritime climate, the DK sector has the characteristics of a continental climate (McClung and Schaerer, 2006). A total of 17 avalanche sites affect the part of the CT sector (approx. 25 km² area coverage) considered for the study, while the DK sector (approx. 400 km² area coverage) is affected by 109 major avalanche sites along about 126 km of roads and treks.

The past records corresponding to the 10 snow-meteorological variables (Table 1) for the period from November 1999 to April 2013 (covering the months from November to April each winter) were archived in a database, with the respective dates of observation, for each study area. There are thus 1762 and 1714 past records available in the database for the CT sector and DK sector respectively (only complete records with values for all the variables were considered). The reference time of these observations is 8:30 a.m. IST (GMT+05:30) on the date of observation. Corresponding to each date/record, the associated avalanche information (occurrence yes/no, size, distance traveled, aspect, natural/artificial, etc.) is also stored in the database. According to the database, all the avalanche occurrences considered in this study (for both study areas) were naturally triggered; artificial avalanche release is not a common practice in India.

Fig. 1. Study areas: the CT sector and the DK sector.
5. Model calibration as an optimization problem

The purpose of model calibration is to maximize forecast accuracy. The accuracy of any binary forecast system (as in the case of avalanche forecasts, with two possible outcomes, 'yes' or 'no') can be assessed in terms of various statistical skill scores. A skill score primarily quantifies the skill of a systematic forecast against a reference forecast, such as a constant forecast or a random forecast, in terms of an accuracy measure such as 'Percent Correct' or 'Critical Success Index'. Especially in cases of a skewed distribution of events, as with avalanches where 'yes' events are rare compared to 'no' events across the season, skill scores such as the Heidke skill score (Heierli et al., 2004; Wilks, 1995) are better representatives of forecast accuracy than simple scores. Corresponding to the distribution of forecasts and observations (Table 2), the Heidke skill score (HSS) can be defined as

HSS = \frac{ad - bc}{\left[ (a+c)(c+d) + (a+b)(b+d) \right] / 2}     (2)

HSS as defined above is based upon the 'Percent Correct' accuracy measure, (a+d)/N, with the 'random forecast', [(a+b)/N][(a+c)/N] + [(d+c)/N][(d+b)/N], as the reference forecast. HSS has the following main properties:

HSS = 1: perfect forecast
HSS > 0: skillful, better than the reference
HSS < 0: less skillful than the reference.

Thus, NN model calibration can be treated as an optimization problem with the objective of determining the vector w such that HSS is maximized. In optimization parlance, HSS and w are called the objective function and the decision variable vector respectively. The following constraints apply to the decision variables w in the context of eNN10 model calibration:

(i) Inequality constraints: none
(ii) Equality constraints: none
(iii) Variable limits: 0 ≤ w_k ≤ 1, k ∈ {1, …, D}, D = 10.

5.1. Experimental plan

In order to assess the performance of the calibrated model in operational forecasting, the study was conducted as follows:

(a) Divide the available dataset into two parts: (1) a reference dataset, to calibrate the model, and (2) a verification dataset, to verify the effectiveness of the calibrated model. Out of the total data archived for each study area (Section 4), data from 1999 to 2012 were earmarked as the reference dataset and the remaining data, from November 2012 to April 2013, were treated as the verification dataset.

(b) Calibration phase: calibrate the model using an appropriate optimization method (discussed in subsequent sections) and the reference dataset. In this phase, for a given decision variable vector w, the objective function HSS was calculated by leave-one-out cross-validation (Kohavi, 1995) of the model forecasts, as described in Fig. 2. As the name suggests, leave-one-out cross-validation uses one observation at a time as the test set and the remaining observations as the training set, and the procedure is repeated for all observations.
For i = 1 to n {
    For j = 1 to n; j ≠ i {
        Calculate d(i, j) using Eq. (1)
    }
    Find K nearest neighbours
    Calculate P_a
    Generate forecast(i) using P_threshold
}
Generate contingency table (Table 2)
Calculate HSS

Fig. 2. HSS evaluation by leave-one-out cross-validation of forecasts (n corresponds to the number of records in the reference dataset).
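The procedure of Fig. 2 maps directly onto code. The sketch below is illustrative rather than the authors' implementation: it assumes the forecast_day helper from the earlier sketch and NumPy arrays for the data, and it computes HSS from the contingency counts of Table 2 via Eq. (2).

import numpy as np

def hss(a, b, c, d):
    # Heidke skill score, Eq. (2), from the contingency counts of Table 2.
    return (a * d - b * c) / (((a + c) * (c + d) + (a + b) * (b + d)) / 2.0)

def loocv_hss(X, events, w, K=10, p_threshold=0.3):
    # Leave-one-out evaluation of HSS over the reference dataset (Fig. 2):
    # each record is forecast once from all the remaining records.
    n = len(X)
    a = b = c = d = 0
    for i in range(n):
        mask = np.arange(n) != i                          # leave record i out
        yes, _ = forecast_day(X[i], X[mask], events[mask], w, K, p_threshold)
        if yes and events[i]:
            a += 1                                        # hit
        elif yes and not events[i]:
            b += 1                                        # false alarm
        elif not yes and events[i]:
            c += 1                                        # miss
        else:
            d += 1                                        # correct negative
    return hss(a, b, c, d)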
(c) Verification phase: batch forecasting of the events of the verification dataset, in order to assess the performance of the calibrated model on an independent dataset (not used for calibration). The verification phase is therefore similar to implementing the calibrated model for operational forecasting. For the verification phase, HSS is evaluated with slight modifications to the procedure described in Fig. 2: the outer for loop is modified to i = 1 to n′, where n′ is the number of records in the verification dataset; the inner for loop corresponds to the reference data (1999–2012) and remains unchanged; and the exclusion condition (j ≠ i) does not apply.

5.2. The issue of multiple optima

The optimization problem of NN model calibration, as described above, draws a parallel with the calibration of the conceptual rainfall-runoff (CRR) model by Duan et al. (1992). Similar to the work of Duan et al. (1992), the uniform random sampling (URS) method was applied to establish the distinct features of this problem, particularly to (1) detect the locations and number of optima, and (2) obtain information about parameter sensitivity. URS is a primitive probabilistic approach to global optimization and can also be used to construct objective function response surfaces or contours. 20,000 points were sampled at random from the search space (0 ≤ w_k ≤ 1, k ∈ {1, …, D}, D = 10) using a uniform probability distribution. The objective function (i.e. HSS) value was computed at each point as per the procedure given in Fig. 2, using the reference data. The point with the best objective function value (i.e. the maximum HSS) was taken as an estimate of the global optimum. Further, two types of graphical projections were constructed: (1) a scatter plot of HSS versus the distance of each of the 20,000 points from the optimum, normalized by its range; and (2) contour plots of HSS versus the decision variables, using a subset of 2,000 points for easy computation (see Figs. 3 and 4). HSS contours have been plotted against two variables at a time owing to the dimensional limits of plotting.
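A minimal sketch of this URS probing is given below, assuming the loocv_hss helper from the earlier sketch and hypothetical arrays X_ref and events_ref holding the reference records and avalanche events. (Note that 20,000 full leave-one-out passes are computationally heavy.)

import numpy as np

rng = np.random.default_rng(0)
D, n_samples = 10, 20_000

# Sample decision-variable vectors uniformly from the search space [0, 1]^D
# and evaluate the objective function (HSS) at each point.
W = rng.uniform(0.0, 1.0, size=(n_samples, D))
scores = np.array([loocv_hss(X_ref, events_ref, w) for w in W])

# The best sampled point serves as the estimate of the global optimum;
# distances from it, normalized by their range, give the x-axis of Fig. 3a.
best = W[scores.argmax()]
dist = np.linalg.norm(W - best, axis=1)
dist_norm = dist / dist.max()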
Table 2. Contingency table depicting the distribution of forecast versus observed events.

                        Observation
Forecast       Yes                      No
Yes            Hits (a)                 False alarms (b)         Forecast 'yes' (a + b)
No             Misses (c)               Correct negatives (d)    Forecast 'no' (c + d)
               Observed 'yes' (a + c)   Observed 'no' (b + d)    Total (N = a + b + c + d)
Fig. 3. Summary of uniform random sampling analysis for CT sector. a) Scatter plot of HSS versus distance of random point from the optimum normalized by its range. b to f) Contour plots of HSS versus decision variables.
The general scattering pattern in Figs. 3a and 4a is similar. The negative regression of HSS on normalized distance in both cases is easily understood. The important observation, however, is the broad spread of normalized distance at almost all levels of HSS. This suggests the existence of multiple maxima, which is also supported by the contour plots (Figs. 3b–f and 4b–f) of HSS against pairs of decision variables, showing multiple distinct regions of maxima.

The nature of parameter sensitivity is specific to the study area. For the CT sector, HSS seems to be most sensitive to w1, w2, w6 and w10. In contrast, for the DK sector w2, w4, w8 and w9 seem to govern HSS more than the others. Thus, the decision variables respond differently for the two study areas, i.e. the potential regions of the optimal solution are different for the two study areas. Further, the correlation coefficients between HSS and the decision variables were determined using all 20,000 points (Table 3). The relative values of the correlation coefficients support the visual observations described above, though their absolute values suggest only weak linear relationships between HSS and the individual decision variables. The only exceptions may be w10 for the CT sector and w4 for the DK sector, with a moderate degree of linear relationship. Besides, these are only univariate observations, and the explicit nature of the sensitivity may be complex in multivariate space. Nevertheless, the above observations should serve in making a good initial guess of the solution.
5.3. Selecting an appropriate optimization method

For complex high-dimensional problems such as the present one, where the explicit nature of the problem (derivatives, convexity, continuity, etc.) is not well understood and multiple optima exist, classical analytical optimization methods cannot be applied. In such cases population based non-deterministic optimization methods, popularly known as metaheuristics (Blum and Roli, 2003), hold promise. Metaheuristics efficiently explore the search space in order to find a (near-)global optimum (Blum and Roli, 2003). Population based metaheuristics make iterative progress in a population through a guided random search and incorporate mechanisms to avoid getting trapped in confined areas of the search space. Considering the reported advantages of the ABC algorithm over other metaheuristics for a wide set of benchmark problems (Abu-Mouti and El-Hawary, 2012), we decided to use it for the problem of NN model calibration. However, the ABC algorithm is not central to this work, and there is no surety of it being the best for handling the present problem; other metaheuristics such as the Genetic Algorithm (as used by Purves et al., 2003) or Particle Swarm Optimization (Kennedy and Eberhart, 1995) could also have been deployed for this work.
Fig. 4. Same as Fig. 3 but for DK sector.
In this regard, the 'no free lunch' theorem (Wolpert and Macready, 2005), stating that "any two algorithms are equivalent when their performance is averaged across all possible problems", is very relevant. The important factor when deploying an optimization algorithm is the exploitation of problem-specific knowledge to achieve better-than-random performance (Wolpert and Macready, 2005). In the context of the present study, problem-specific knowledge has been applied in the selection of an appropriate accuracy measure, HSS, as explained earlier.
5.4. ABC algorithm for global optimization

The ABC algorithm is inspired by swarm intelligence, particularly the foraging behavior of honey bees. The position of a candidate solution (decision variable vector) in the search space of a given optimization problem is referred to as a food source, and the fitness of that candidate solution corresponds to the quality of the food. A population of food sources (candidate solutions) is iteratively processed through the characteristic actions of three types of artificial bees: employed bees, onlookers and scouts. The numbers of employed bees, onlooker bees and food sources are the same. Initially, food sources are generated randomly within the search space of the given problem and allocated to the employed bees.
Table 3. Linear correlation coefficients between HSS and each decision variable for the two study areas.

Study area    w1        w2        w3        w4        w5       w6       w7       w8        w9        w10
CT sector     0.2559    0.2422    0.0348    −0.1100   0.0271   0.2378   0.0495   −0.1328   0.0751    −0.5590
DK sector     −0.0189   −0.2179   −0.0302   0.3824    0.0990   0.0461   0.1442   0.1463    −0.2139   0.0074
Assuming u^i = (u_1^i, u_2^i, …, u_D^i) as the ith food source, this operation can be defined as follows:

u_j^i = u_{j,\min} + \mathrm{rand}(0,1)\,(u_{j,\max} - u_{j,\min})     (3)
where i ∈ {1, 2, …, SN}; j ∈ {1, 2, …, D}; SN is the number of food sources (equal to half the population size, the population being the employed and onlooker bees taken together); D is the number of decision variables (problem dimensionality); rand(0, 1) is a random number in the range (0, 1); and u_{j,max} and u_{j,min} are the upper and lower bounds of the jth decision variable respectively.

Each employed bee evaluates the quality of the food at the food source allocated to it and then searches around for a better food source. A bee is allowed to search for a better food source only once per iteration, and the search direction is chosen randomly. To find a new candidate food position v^i = (v_1^i, v_2^i, …, v_D^i) from the old one u^i = (u_1^i, u_2^i, …, u_D^i), the following expression is used:

v_j^i = u_j^i + \mathrm{rand}(-1,1)\,(u_j^i - u_j^k), \quad k \neq i     (4)
where k ∈ {1, 2, …, SN} and j ∈ {1, 2, …, D} are randomly chosen indexes, with k different from i, and rand(−1, 1) is a random number in the range (−1, 1). If the new food source is better, it replaces the old one in the memory of the corresponding bee; otherwise, the old one is retained. This marks the end of the employed bees phase. Onlooker bees are then deployed to the various food sources identified by the employed bees, in numbers proportionate to the probability P_i associated with each food source:

P_i = \frac{\mathrm{fit}_i}{\sum_{i=1}^{SN} \mathrm{fit}_i}     (5)
where fit_i is the fitness value of candidate solution i. Onlooker bees then search further for better food sources around the food sources allocated to them (Eq. (4)). Again, if a better food source is found, the old one is abandoned. This ends the onlooker bees phase and also the first iteration of the algorithm. The best solution (the position of the best food source) is memorized. Successive iterations start with the employed bees taking over the food positions updated by the onlooker bees of the previous iteration. If any food source remains unchanged for a predetermined number of iterations (called the limit, a control parameter of the ABC algorithm), the scout bee activates: the particular food source is abandoned and a new food source is selected randomly within the search space as per Eq. (3). The next iteration then continues with the one new food source selected by the scout bee. Only one scout bee is allowed to activate per iteration.

It should be noted that the components of the new candidate food position v^i may violate the predefined boundary constraints of the search space. Among the various possible ways to tackle this problem, such as resetting schemes and penalty schemes, a simple and popular method is to set the violating components to the violated bounds, i.e.,

v_j^i = u_{j,\min} \ \text{if} \ v_j^i < u_{j,\min}; \qquad v_j^i = u_{j,\max} \ \text{if} \ v_j^i > u_{j,\max}.     (6)
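Eqs. (3) to (6) assemble into a compact optimization loop. The following is a minimal sketch of the ABC algorithm for maximization, not the code used in this study; in our setting the fitness argument would be the leave-one-out HSS evaluation sketched earlier, and the default control parameter values anticipate those of Table 4. Since HSS may be negative, the selection probabilities of Eq. (5) are computed on shifted fitness values, a small guard of our own.

import numpy as np

def abc_optimize(fitness, D=10, SN=30, limit=300, MCN=1000, lo=0.0, hi=1.0, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.uniform(lo, hi, size=(SN, D))            # Eq. (3): initial food sources
    fit = np.array([fitness(u) for u in U])
    trials = np.zeros(SN, dtype=int)                 # stagnation counters

    def try_neighbour(i):
        # Eq. (4): perturb one randomly chosen component towards a random peer k.
        k = rng.choice([s for s in range(SN) if s != i])
        j = rng.integers(D)
        v = U[i].copy()
        v[j] += rng.uniform(-1.0, 1.0) * (U[i, j] - U[k, j])
        v[j] = np.clip(v[j], lo, hi)                 # Eq. (6): reset to violated bound
        f = fitness(v)
        if f > fit[i]:                               # greedy selection
            U[i], fit[i], trials[i] = v, f, 0
        else:
            trials[i] += 1

    for _ in range(MCN):                             # termination: maximum cycle number
        for i in range(SN):                          # employed bee phase
            try_neighbour(i)
        p = fit - fit.min() + 1e-9                   # Eq. (5), shifted to stay positive
        p /= p.sum()
        for i in rng.choice(SN, size=SN, p=p):       # onlooker bee phase
            try_neighbour(i)
        worst = trials.argmax()                      # scout bee phase (at most one/cycle)
        if trials[worst] > limit:
            U[worst] = rng.uniform(lo, hi, size=D)   # Eq. (3) again
            fit[worst] = fitness(U[worst])
            trials[worst] = 0

    return U[fit.argmax()].copy(), float(fit.max())

A calibration run then amounts to something like abc_optimize(lambda w: loocv_hss(X_ref, events_ref, w)), with X_ref and events_ref again hypothetical names for the reference dataset.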
The algorithm terminates according to pre-defined termination criteria. The following termination criteria are the most commonly used among the many reported in the literature:

• maximum number of generations or function evaluations,
• reaching very close to the known global optimum, and
• maximum number of consecutive generations without improvement in the fitness function value.

Jain et al. (2001) studied 8 termination criteria, including the three mentioned above. There is no general rule for adopting a termination criterion; adopting and defining the terms of a termination criterion are matters of experience and are problem specific. Jain et al. (2001) offer some practical guidelines on the application of termination criteria. In the current study, the first of the three criteria mentioned above, i.e. maximum number of generations, has been adopted in the form of a maximum cycle number (MCN). Section 6 discusses the appropriateness of adopting MCN in the context of the current problem.

Two major characteristics of any metaheuristic algorithm are intensification (or exploitation) and diversification (or exploration) (Blum and Roli, 2003). Diversification means generating diverse solutions so as to explore the search space on a global scale, while intensification means intensive search around an already identified good solution. While diversification via randomization allows the search to escape from local optima, intensification ensures the best solution. A good balance of these two components will usually ensure the global optimality of the solution. In the ABC algorithm, the employed bee phase explores the entire search space for the optimal solution, the onlooker bee phase allows intense search around the good food sources identified by the employed bees, and the scout bee prevents the solutions from getting trapped in local optima. The ABC algorithm is, therefore, a balanced algorithm and has been applied in a wide range of applications (Abu-Mouti and El-Hawary, 2012). For further details on the ABC algorithm and related aspects, readers are advised to refer to Karaboga (2005) and Karaboga and Basturk (2007, 2008, 2009).

The ABC algorithm was applied to calibrate the eNN10 model in the calibration phase as per the control parameters summarized in Table 4. These control parameters were decided following the general guidelines given in the literature (Karaboga, 2005; Karaboga and Basturk, 2007, 2008, 2009). According to these control parameters, one calibration experiment involves 60,000 fitness function evaluations.

Table 4. ABC algorithm control parameters applied for the model calibration experiments. D (=10) is the number of decision variables (problem dimension) and SN (=30) is the number of food sources.

Control parameter               Criterion           Value
Population size                 D * 6               60
Limit                           SN * D              300
Maximum cycle number (MCN)      Problem specific    1000

Since our problem is characterized by multiple optima and the ABC algorithm is a random search method, it is very important to confirm the stability of the algorithm for such a problem. We conducted 24 experiments of model calibration using the reference dataset and a different initial population each time, to analyze the pattern of calibration results over repeated experiments. The number of experiments is 24 simply because our machine has 2 hexa-core processors and the model was calibrated twice through each core in parallel processing mode. The results of the above experiments, and the discussion thereon, are presented in the succeeding section.

6. Results and discussion

The HSS and corresponding optimized w determined under the multiple experiments are summarized in Tables 5 and 6 for the CT sector and DK sector respectively. Out of the 24 experiments, the results of the experiment with the highest HSS are marked in each table. The mean (μ), standard deviation (σ) and coefficient of variation (CoV) of the HSS obtained over the multiple experiments are also shown. Low CoV values of 0.64% and 1.98% for the CT sector and DK sector respectively confirm the stability of the ABC algorithm for model calibration. The box plots of the calibrated values of the decision variables (as summarized in Tables 5 and 6) and the progression of the best HSS through successive iterations under the above-mentioned 24 experiments are shown in Figs. 5 and 6 for the two study areas.
Table 5. Results of the calibration experiments corresponding to the CT sector. The experiment with the highest HSS (no. 19) is marked with an asterisk.

Exp  HSS        w1       w2       w3       w4       w5       w6       w7       w8       w9       w10
1    0.556244   0.28518  0.48532  0.12701  0.22378  0.19543  0.53471  0.23769  0.52904  0.75214  0.07562
2    0.556244   0.67297  1        0.34832  0.72832  0.37615  1        0.82282  1        0.99278  0.10837
3    0.560080   0.30214  0.54475  0.71109  0.27851  0.22522  0.73334  0.06339  0.79027  0.71866  0
4    0.557651   0.51953  0.75438  0.25173  0.42948  0.36995  0.70577  0.61547  0.7343   0.72438  0.06689
5    0.555196   0.76614  0.76767  0.40729  0.61551  0.32145  1        0.16359  0.86486  0.79743  0.08236
6    0.554832   0.33779  0.44295  0        0.10175  0.04068  0.76276  0.72302  0.46335  1        0.06063
7    0.561897   0.24017  0.34596  0.38218  0.34972  0.13899  0.53867  0.01481  0.62084  0.54266  0
8    0.556244   0.4775   0.57102  1        0.95811  0.03219  0.8419   0.16136  0.26196  0.90654  0
9    0.556244   0.36988  1        0.41766  0.14393  0.38833  1        0        1        0.61968  0
10   0.558418   0.69706  0.66499  0.3598   0.65307  0.25627  0.59547  0.19257  0.519    0.8888   0.06392
11   0.550364   0.43552  0.33359  0.17465  0.4934   0.4026   0.46506  0.09747  0.55081  0.52334  0.01864
12   0.559883   0.35117  0.45055  0.70446  0.37975  0.22555  0.84575  0.02606  0.93937  0.71666  0
13   0.549418   1        0.57366  0.07389  0        0.33145  0.36687  0.65427  0.16419  0.32719  0
14   0.556062   0.20329  0.90425  0.15531  0.03368  0.57902  0.70573  0.8646   0.42952  1        0.0614
15   0.556798   0.37329  0.55932  0.49921  0.73373  0.07706  0.64298  0.01347  0.70276  1        0
16   0.557651   0.7356   0.68028  0.32114  0.55656  0.2584   0.96339  0.20221  0.89274  0.87271  0.08884
17   0.550203   0.42848  0.99294  0.59332  0.98899  0.50248  0.70846  0.43054  0.27987  0.35674  0.15353
18   0.554133   0.69529  0.61002  0.43117  0.68131  0.07271  0.68234  0.05098  0.66333  0.90423  0
19*  0.564644   0.14704  0.70509  0.0887   0.06011  0.10235  0.95173  0.75554  0.4351   0.97495  0.05054
20   0.558224   0.38637  0.61071  0.49945  0.65713  0.18917  0.54407  0        0.51664  1        0
21   0.557175   0.08431  0.41799  0.21943  0.4996   0        0.35884  0        0.45004  0.98275  0.00489
22   0.560884   0.13122  0.66745  0.2847   0.06888  0.12325  0.79531  0.82198  0.43782  1        0.05642
23   0.556427   0.56572  0.87831  0.30092  0.52279  0.42108  0.65625  0.73293  0.80217  0.88819  0.05399
24   0.553808   0.31509  0.77738  0.11055  0        0.42265  0.66163  0.29527  0.60232  0.96028  0.02598

Mean (μ)      0.556614
Std dev (σ)   0.003574
CoV           0.64%
The progression of the best HSS over successive iterations (Figs. 5b and 6b) follows similar trajectories under all the experiments and converges closely, substantiating the stability of the ABC algorithm for NN model calibration. Further, the medians and narrow inter-quartile ranges of the optimized values of w10 for the CT sector (Fig. 5a) and of w4 for the DK sector (Fig. 6a) are in good agreement with the corresponding correlation coefficient values shown in Table 3. This fact endorses the ability of the ABC algorithm to explore the search space efficiently.

At this stage, it is interesting to ask how much gain has been achieved in forecast accuracy by the automatic model calibration. Hence, a comparison was made between the HSS value obtained after model calibration and the HSS obtained by setting all decision variables equal to 1, i.e. w_k = 1, k ∈ {1, …, 10}. In this regard, the 17th and 12th experiments were chosen randomly for the CT sector and DK sector respectively to represent the HSS after model calibration of the respective areas. Table 7 illustrates the comparisons. Substantial gains of 20.1% and 25.8% have been achieved in HSS for the CT sector and DK sector respectively in the calibration phase. Further, the verification phase results corresponding to the solutions obtained from the 17th and 12th experiments for the CT sector and DK sector respectively, and those corresponding to w_k = 1, k ∈ {1, …, 10}, are also included in Table 7.
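The relative gain in Table 7 is simply the fractional improvement over the uncalibrated score. For the CT sector calibration phase, for instance, (0.550203 − 0.458221) / 0.458221 ≈ 0.201, i.e. the 20.1% quoted in Table 7.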
Table 6. Results of the calibration experiments corresponding to the DK sector. The experiment with the highest HSS (no. 6) is marked with an asterisk.

Exp  HSS        w1       w2       w3       w4       w5       w6       w7       w8       w9       w10
1    0.436921   0        0.13708  0        0.74187  0.17697  0.02795  0.45089  0.46181  0.00676  1
2    0.433004   0        0.09569  0.0434   0.85412  0.00275  0.04771  0.40971  0.83029  0.13127  0.93076
3    0.427337   0        0.53064  0        0.89014  0.14847  0.48632  0.21173  0.55076  0.02329  0.13554
4    0.419474   0.76332  0        0.30653  0.96344  0.29877  0.28124  0.34821  0.81318  0.06827  0.28336
5    0.435266   0.82603  0        0.3244   0.97913  0.09071  0.09064  0.51342  0.63787  0.05172  0.57226
6*   0.449380   0.00154  0.13195  0.24052  0.86492  0.26294  0.29547  0.32607  0.70141  0        0.21942
7    0.427662   0.75625  0        0.26363  0.78994  0.00114  0.49726  0.22627  0.79663  0        0.3669
8    0.422064   0        0.59471  0        0.95371  0.24121  0.71448  0.12548  0.60769  0.05069  0.49204
9    0.443946   0        0.14174  0        0.67946  0.27078  0.14954  0.33645  0.24941  0.00427  1
10   0.440200   0        0.07015  0        0.43135  0.2398   0.0309   0.21758  0.32764  0.00281  0.4876
11   0.439430   0.13004  0        0.25922  0.75942  0.34741  0.00688  0.27776  0.89725  0        0.71438
12   0.427021   1        0        0.44891  0.9801   0.02435  0.32328  0.54484  0.84832  0        0.7312
13   0.420278   0        0.18104  0.01974  0.86242  0.73913  0.15829  0.603    0.94808  0.05002  0.33123
14   0.418424   0.84205  0        0.45695  0.93963  0        0.13002  0.34524  0.68382  0.17619  0.14016
15   0.433407   0        0.11006  0.5062   0.78338  0.6095   0.46506  0.29223  0.62506  0        0.1235
16   0.423042   0.08075  0.28123  0        0.75661  0.98087  0.43728  0.7528   0.64728  0.20405  0.48608
17   0.420381   0        0.06671  0.54238  1        0.38309  0.50461  0.27195  0.733    0        0.57743
18   0.429871   0.86201  0        0.27529  1        0.45545  0.09888  0.77097  0.99871  0.06321  0.12862
19   0.435266   0        0.06118  0.39916  0.76337  1        0.17586  0.32976  0.47396  0.03605  0.04578
20   0.420833   0.12618  0.17822  0.51942  1        0.57884  0        0.15888  0.86726  0.04575  0.67648
21   0.420610   0        0.50895  0        0.6833   0.73446  0.53356  0.57669  0.52193  0.18157  0.3779
22   0.428574   0.27129  0        0.3607   1        0.65165  0        0.49058  1        0.05793  1
23   0.427650   0.91431  0.21778  0.19457  0.84197  0.35176  0.38864  0.38536  0.21576  0.045    0.08353
24   0.423622   0.98784  0        0.52463  0.85113  0        0.63235  1        0.74128  0.05488  0.53601

Mean (μ)      0.429319
Std dev (σ)   0.008498
CoV           1.98%
Fig. 5. Summary of the calibration experiments for the CT sector. a) Box plot of the calibrated values of the decision variables (w_k) obtained through the 24 experiments; boxes represent the interquartile range, whiskers extend up to 1.5 times the interquartile range, and open circles indicate outliers. b) Progression of the best HSS through successive iterations in the 24 experiments.
Relative gains of 112% and 46.3% for the CT sector and DK sector respectively are very promising, as the verification phase is similar to operational forecasting. Hence, substantial improvement in forecast accuracy may be expected for other areas as well if the model is calibrated for those areas.

In order to study the response of the ABC algorithm to varying the control parameters (Table 4), in a special experiment MCN was increased from 1000 to 5000. However, only a marginal further gain in HSS was found for either study area. Thus, MCN = 1000 can be adopted as a suitable termination criterion for the calibration of eNN10 by the ABC algorithm in its current configuration, and with MCN = 1000 it can reliably be assumed that the HSS achieved at the termination of the algorithm is sufficiently close to the true global optimum. The solutions summarized in Tables 5 and 6 may therefore all be considered acceptable and used for operational forecasting in the respective study areas. However, a dynamic termination criterion such as ClusTerm (Jain et al., 2001) should further add to the reliability of the solution.

Despite the noticeable improvement in forecast accuracy after calibration, the mean HSS values across the 24 experiments for both study areas (0.556614 for the CT sector and 0.429319 for the DK sector) are still far from the ideal HSS score (i.e. 1.0). One factor responsible is the quality of the input data (especially the avalanche occurrence observations) and the representation of the snow-meteorological conditions and their trends in the model in terms of the input variables (Table 1). These issues are well recognized and are a matter of concern for avalanche practitioners and model developers (Buser, 1983; Buser et al., 1987; Gassner et al., 2000; Obled and Good, 1980). Another factor restricting the HSS from reaching the ideal value could be the subjectively assigned values of K and P_threshold (10 and 0.3 respectively). Determining optimized values of K and P_threshold along with w could be a subject of future study.

Earlier, Purves et al. (2003) attempted to objectively calibrate Cornice, a nearest neighbors method based model. They deployed a genetic algorithm, another population based metaheuristic, to maximize the accuracy of Cornice for the forecasting of avalanches in the Lochaber area of Scotland. Experiments were carried out with three accuracy measures: the unweighted average accuracy, the bias, and the Hanssen and Kuipers discriminant. A total of 30 iterations were carried out with a population size of 20 (i.e. 600 fitness function evaluations in total). Results with the unweighted average accuracy and the Hanssen and Kuipers discriminant were reported to be better than those with the bias, showing that the selection of a good fitness metric is important for obtaining good results from an optimization algorithm. In view of the observations by Purves et al. (2003), our choice of HSS as the fitness metric (objective function) is well justified, for the reasons explained in Section 5.

7. Summary and conclusions

NN method based avalanche forecasting models are very popular for their exploratory features as well as their probabilistic output. The accuracy of a model, though, depends on how well the model is calibrated.
Fig. 6. Same as Fig. 5 but for DK sector.
In general, the model is calibrated subjectively by experts based on their experience of using the model and their understanding of the influence of the various variables on avalanche occurrences. In such circumstances, the highest forecast accuracy is not assured. The present work deals with the subject of automatic model calibration through an objective method and treats it as an optimization problem. In this regard, calibration experiments were conducted with eNN10, a NN model for avalanche prediction developed in India. For this purpose two climatologically diverse regions of the western Himalaya were selected as study areas, and HSS was adopted as the fitness metric (the accuracy measure of the forecast). URS analysis was carried out to investigate the structural characteristics of the problem and the sensitivity of the various decision variables. It demonstrated that NN model calibration is characterized by multiple optima, and it also identified the potential regions of optimal solutions within the search space. Population based metaheuristics are known to be better suited to handling such problems than classical analytical methods. Thus the ABC algorithm, a population based metaheuristic, was deployed to calibrate the model. The results obtained indicate a substantial gain in forecast accuracy. Extensive numerical experimentation endorsed the stability and efficiency of population based metaheuristics in general, and of the ABC algorithm in particular, for handling the problem of NN model calibration. However, a few issues still need to be addressed: data quality, selection of variables representative of the conditions leading to avalanche events, and objective determination of the other model parameter values (K and P_threshold).
Table 7. Comparison of forecast accuracy in terms of HSS before and after model calibration, in the calibration phase and the verification phase. Before calibration, all decision variables carry the same value of 1, i.e. w_k = 1, k ∈ {1, …, 10}. The decision variable vectors w from the 17th calibration experiment (CT sector) and the 12th experiment (DK sector) were selected randomly for the comparative analysis.

                                          CT sector                                DK sector
                                          Before              After                Before              After
                                          (w_k = 1)           (w, 17th exp.)       (w_k = 1)           (w, 12th exp.)
Calibration phase (leave-one-out mode)
  Heidke skill score                      0.458221            0.550203             0.339483            0.427021
  Relative gain                           –                   20.1%                –                   25.8%
Verification phase (batch forecast mode)
  Heidke skill score                      0.147260            0.312306             0.186937            0.273465
  Relative gain                           –                   112%                 –                   46.3%
Adoption of a dynamic termination criterion such as ClusTerm (Jain et al., 2001) is also expected to enhance the reliability of the solution.

The present work extends and generalizes the work by Purves et al. (2003) on NN model calibration. It substantiates their observations through the use of a new optimization algorithm (ABC) and a different fitness metric (HSS). The work is supported by elaborate numerical experimentation (two study areas with 24 experiments each and 60,000 fitness function evaluations per experiment), compared to the single experiment with 600 fitness function evaluations by Purves et al. (2003) for any one fitness metric. The study also demonstrated that the optimization results are unique to the input dataset; hence results obtained for one location/area cannot be applied to any other. Another distinct feature of the present work is the verification of the results for operational forecasting, which adds confidence that the work can be applied to real-time forecasting with good success.

Acknowledgments

The authors sincerely acknowledge the contribution made by various SASE personnel in observing the snow-meteorological and avalanche occurrence data used in this study. This work was supported by the Ministry of Defence, Govt of India vide Project No. ST-10/SAS-40. We would also like to thank the editors and three anonymous reviewers for their critical review of the manuscript and valuable suggestions, which greatly helped us bring this manuscript to its present shape.

References

Abu-Mouti, F.S., El-Hawary, M.E., 2012. Overview of Artificial Bee Colony (ABC) algorithm and its applications. IEEE International Systems Conference (SysCon), Vancouver, BC, pp. 1–6.
Beasley, D., Bull, D.R., Martin, R.R., 1993a. An overview of genetic algorithms: part 1, fundamentals. Univ. Comput. 15 (2), 58–69.
Beasley, D., Bull, D.R., Martin, R.R., 1993b. An overview of genetic algorithms: part 2, research topics. Univ. Comput. 15 (4), 170–181.
Blum, C., Roli, A., 2003. Metaheuristics in combinatorial optimization: overview and conceptual comparison. ACM Comput. Surv. 35 (3), 268–308.
Brabec, B., Meister, R., 2001. A nearest-neighbour model for regional avalanche forecasting. Ann. Glaciol. 32, 130–134.
Buser, O., 1983. Avalanche forecasting with the method of nearest neighbours: an interactive approach. Cold Reg. Sci. Technol. 8, 155–163.
Buser, O., Butler, M., Good, W., 1987. Avalanche forecast by nearest-neighbour model. IAHS Publ. 162, 557–569.
Duan, Q., Sorooshian, S., Gupta, V., 1992. Effective and efficient global optimization for conceptual rainfall–runoff models. Water Resour. Res. 28 (4), 1015–1031.
Gassner, M., Brabec, B., 2002. Nearest neighbour models for local and regional avalanche forecasting. Nat. Hazards Earth Syst. Sci. 2, 247–253.
Gassner, M., Etter, H.J., Birkeland, K., Leonard, T., 2000. NXD2000 — an improved avalanche forecasting program based on the nearest neighbor method. Proceedings of the International Snow Science Workshop, Big Sky, Montana, USA, pp. 52–59.
Heierli, J., Purves, R.S., Felber, A., Kowalski, J., 2004. Verification of nearest-neighbours interpretations in avalanche forecasting. Ann. Glaciol. 38, 84–88.
Jain, B.J., Pohlheim, H., Wegener, J., 2001. On termination criteria of evolutionary algorithms. In: Spector, L. (Ed.), Proceedings of the Genetic and Evolutionary Computation Conference. Morgan Kaufmann Publishers, p. 768.
Karaboga, D., 2005. An idea based on honeybee swarm for numerical optimization. Technical Report TR06, Erciyes University, Engineering Faculty, Computer Engineering Department.
Karaboga, D., Basturk, B., 2007. A powerful and efficient algorithm for numerical function optimization: artificial bee colony (ABC) algorithm. J. Glob. Optim. 39, 459–471.
Karaboga, D., Basturk, B., 2008. On the performance of artificial bee colony (ABC) algorithm. Appl. Soft Comput. 8, 687–697.
Karaboga, D., Basturk, B., 2009. A comparative study of artificial bee colony algorithm. Appl. Math. Comput. 214, 108–132.
Kennedy, J., Eberhart, R., 1995. Particle swarm optimization. Proceedings of the IEEE International Conference on Neural Networks, vol. IV, pp. 1942–1948.
Kohavi, R., 1995. A study of cross-validation and bootstrap for accuracy estimation and model selection. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, vol. 2. Morgan Kaufmann, San Mateo, CA, USA, pp. 1137–1143.
Kristensen, K., Larsson, C., 1994. An avalanche forecasting program based on a modified nearest neighbor method. Proceedings of the International Snow Science Workshop, Snowbird, Utah, USA, pp. 22–30.
McClung, D.M., 2002a. The elements of applied avalanche forecasting, part I: the human issues. Nat. Hazards 25, 111–129.
McClung, D.M., 2002b. The elements of applied avalanche forecasting, part II: the physical issues and the rules of applied avalanche forecasting. Nat. Hazards 26, 131–146.
McClung, D., Schaerer, P.A., 2006. The Avalanche Handbook, 3rd ed. Mountaineers Books.
McClung, D., Tweedy, M., 1994. Numerical avalanche prediction: Kootenay Pass, British Columbia, Canada. J. Glaciol. 40 (135), 350–358.
McCollister, C., Birkeland, K., Hansen, K., Aspinall, R., Comey, R., 2002. A probabilistic technique for exploring multi-scale spatial patterns in historical avalanche data by combining GIS and meteorological nearest neighbours, with an example from the Jackson Hole Ski Area, Wyoming. In: Stevens, J.R. (Ed.), Proceedings of the International Snow Science Workshop, Penticton, BC, Canada, pp. 109–116.
Mérindol, L., Guyomarc'h, G., Giraud, G., 2002. A French local tool for avalanche hazard forecasting: Astral, current state and new developments. In: Stevens, J.R. (Ed.), Proceedings of the International Snow Science Workshop, Penticton, BC, Canada, pp. 105–108.
Obled, C., Good, W., 1980. Recent developments of avalanche forecasting by discriminant analysis techniques: a methodological review and some applications to the Parsenn area (Davos, Switzerland). J. Glaciol. 25 (92), 315–346.
Purves, R.S., Morrison, K.W., Moss, G., Wright, D.S.B., 2003. Nearest neighbours for avalanche forecasting in Scotland — development, verification and optimisation of a model. Cold Reg. Sci. Technol. 37, 343–355.
Singh, A., Ganju, A., 2008. Artificial neural networks for snow avalanche forecasting in Indian Himalaya. Proceedings of the 12th International Conference of the International Association for Computer Methods and Advances in Geomechanics (IACMAG), 1–6 October 2008, Goa, India.
Singh, A., Srinivasan, K., Ganju, A., 2005. Avalanche forecast using numerical weather prediction in Indian Himalaya. Cold Reg. Sci. Technol. 43, 83–92.
Wilks, D.S., 1995. Statistical Methods in the Atmospheric Sciences: An Introduction. Academic Press, San Diego.
Wolpert, D.H., Macready, W.G., 2005. Coevolutionary free lunches. IEEE Trans. Evol. Comput. 9 (6), 721–735.