Appropriate model selection methods for nonstationary generalized extreme value models




Accepted Manuscript

Research papers

Appropriate Model Selection Methods for Nonstationary Generalized Extreme Value Models

Hanbeen Kim, Sooyoung Kim, Hongjoon Shin, Jun-Haeng Heo

PII: S0022-1694(17)30075-6
DOI: http://dx.doi.org/10.1016/j.jhydrol.2017.02.005
Reference: HYDROL 21804

To appear in: Journal of Hydrology

Received Date: 3 March 2016
Revised Date: 11 November 2016
Accepted Date: 6 February 2017

Please cite this article as: Kim, H., Kim, S., Shin, H., Heo, J-H., Appropriate Model Selection Methods for Nonstationary Generalized Extreme Value Models, Journal of Hydrology (2017), doi: http://dx.doi.org/10.1016/j.jhydrol.2017.02.005

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Appropriate Model Selection Methods for Nonstationary Generalized Extreme Value Models

Hanbeen Kim Ph.D. Candidate, School of Civil and Environmental Engineering, Yonsei University, Seoul, Korea

Sooyoung Kim Ph.D. and researcher, Hydrometeorological Cooperation Center, Korea

Hongjoon Shin Ph.D., School of Civil and Environmental Engineering, Yonsei University, Seoul, Korea

Jun-Haeng Heo1 Ph.D. and Professor, School of Civil and Environmental Engineering, Yonsei University, Seoul, Korea

1 Corresponding author (e-mail: [email protected])

Abstract

Considerable evidence that hydrologic data series are nonstationary has been found to date, and this has prompted many studies on nonstationary frequency analysis. Nonstationary probability distribution models involve parameters that vary over time. Therefore, it is not straightforward to apply conventional goodness-of-fit tests to the selection of an appropriate nonstationary probability distribution model. Methods that are generally recommended for such a selection include Akaike’s information criterion (AIC), the corrected Akaike’s information criterion (AICc), the Bayesian information criterion (BIC), and the likelihood ratio test (LRT). In this study, Monte Carlo simulation was performed to compare the performances of these four methods with regard to nonstationary as well as stationary generalized extreme value (GEV) distributions. Proper model selection ratios and sample sizes were taken into account to evaluate the performances of all four methods. The BIC demonstrated the best performance with regard to stationary GEV models. In the case of nonstationary GEV models, the AIC proved to be better than the other three methods when relatively small sample sizes were considered. With larger sample sizes, the BIC and the LRT performed best for GEV models with a nonstationary scale parameter and a nonstationary location parameter, respectively, while the AIC performed best for the model in which both parameters are nonstationary. Simulation results were then evaluated by applying all four methods to annual maximum rainfall data of selected sites, as observed by the Korea Meteorological Administration.

Keywords Model selection method; Nonstationary GEV model; Akaike’s information criterion; Bayesian information criterion; Likelihood ratio test; Proper model selection ratio


1. Introduction

Frequency analysis plays an important role in the design of hydraulic structures as well as in the management of water resources. It utilizes appropriate probability distribution models to estimate hydrologic quantiles. Frequency analysis assumes that data are both independent and stationary, i.e., that the data and their statistical characteristics do not vary over time. Industrialization and urbanization have influenced climatic conditions, and this has caused hydrologic and meteorological data to become nonstationary (Jain and Lall, 2000, 2001; Katz et al., 2002; Milly et al., 2008; Olsen et al., 1999). For example, statistics such as quantiles of hydrologic data and parameters of probability models may change over time.

However, there has been much controversy over the concept of nonstationarity in water resources management and planning. Milly et al. (2008) asserted that nonstationary probabilistic models should be identified and applied because anthropogenic climate change is affecting the extremes of hydrological variables (e.g., precipitation, streamflow and evapotranspiration). In contrast, several scholars have emphasized that the careless application of nonstationarity could lead to the underestimation of variability, uncertainty and risk (Koutsoyiannis, 2011; Lins and Cohn, 2011; Montanari and Koutsoyiannis, 2014; Serinaldi and Kilsby, 2015). Nonetheless, various studies on nonstationarity for hydrological modeling are still being conducted to predict future events under changing environmental conditions.

Many studies have focused on nonstationary frequency analysis that primarily takes into account covariates such as time, temperature, and climate indices. Examples of climate indices are the Pacific Decadal Oscillation (PDO), Southern Oscillation Index (SOI), Mediterranean Oscillation Index (MOI), North Atlantic Oscillation (NAO), Sea Level Pressure (SLP), and Sea Surface Temperature (SST). These covariates are used to determine parameters of probability distribution models (Brown et al., 2008; Coles, 2001; Griffis and Stedinger, 2007; Katz et al., 2002; Sugahara et al., 2009; Tramblay et al., 2013; Vasiliades et al., 2015; Wang et al., 2004; Wi et al., 2015).

Extreme value theory is a branch of statistics that focuses on extreme events and the tail behavior of a distribution. The theory uses the block maxima approach to derive Extreme Value (EV) distributions, including the Fréchet, Weibull, and Gumbel distributions. The GEV distribution unifies these three EV distributions. In nonstationary frequency analysis, nonstationary GEV distributions have been proposed and widely used (Cannon, 2010; Coles, 2001; El Adlouni et al., 2007; Kharin and Zwiers, 2005; Leadbetter et al., 1983; Mailhot et al., 2010; Nadarajah, 2005; Vasiliades et al., 2015; Wang et al., 2004; Wi et al., 2015). The nonstationary GEV models proposed by Nadarajah (2005), Vasiliades et al. (2015), and Wi et al. (2015) have been used to conduct nonstationary frequency analysis of annual maximum rainfall series, using time as a covariate.

In conventional frequency analysis, the $\chi^2$ test, Kolmogorov–Smirnov (KS) test, Cramér–von Mises (CVM) test, probability plot correlation coefficient (PPCC), Anderson-Darling test, and modified Anderson-Darling test have been used to examine the goodness-of-fit (GOF) of probability models (Heo et al., 2013). In addition to these GOF tests, model prediction error measured by the bootstrap or cross-validation has been used to select an appropriate probability model (Laio et al., 2009; Smyth, 2000). Burnham and Anderson (2002) and Zucchini (2000) introduced and expounded these techniques for model selection. In nonstationary frequency analysis, however, it is not simple to apply goodness-of-fit tests to nonstationary probability distribution models involving parameters that vary with time, since these tests would have to be performed at each time step.
Therefore, many studies alternatively recommend Akaike’s information criterion (AIC), the corrected Akaike’s information criterion (AICc), and the Bayesian information criterion (BIC) for the selection of appropriate nonstationary models (Cannon, 2010; Strupczewski et al., 2001a, 2001b; Sugahara et al., 2009; Villarini et al., 2009, 2010). These criteria are straightforward and allow an appropriate model to be selected once the maximized likelihood has been calculated. Strupczewski et al. (2001a, 2001b) used the AIC to select the most efficient model out of several nonstationary flood frequency models. Sugahara et al. (2009) applied the AICc (Hurvich and Tsai, 1995) and the rAICc (Burnham and Anderson, 2004) to select the most efficient model out of four nonstationary generalized Pareto distributions. Villarini et al. (2009, 2010) employed the AIC and the BIC to find the degrees of freedom for the Generalized Additive Models of Location, Scale, and Shape parameters (GAMLSS) in a nonstationary framework. Cannon (2010) and Vasiliades et al. (2015) each identified an appropriate nonstationary GEV model using the AICc and the BIC.

Alternatively, the Likelihood Ratio Test (LRT) has been used in several studies (Clarke, 2002; El Adlouni et al., 2007; Garcia et al., 2007; Katz, 2013; Kharin and Zwiers, 2005; Mailhot et al., 2010; Nadarajah, 2005; Tramblay et al., 2013; Wang et al., 2013), and has been recommended for the selection of an appropriate nonstationary extreme value model (Coles, 2001). Clarke (2002) proposed the Gumbel distribution, involving time as a covariate, and used Generalized Linear Models (GLMs) to fit trend parameters. The LRT was applied to evaluate the goodness-of-fit of the GLMs. Kharin and Zwiers (2005) also evaluated nonstationary GEV models by performing the LRT. Nadarajah (2005), El Adlouni et al. (2007), and Wang et al. (2013) proposed several nonstationary GEV models, and determined the most efficient one by using the LRT. Garcia et al. (2007) conducted the LRT to draw a comparison between stationary and nonstationary GEV models. Mailhot et al. (2010) employed the LRT to compare the nonstationary Ensemble Members (EM) and Annual Maximum (AM) models. Katz (2013) used the AIC, the BIC, and the LRT to select appropriate nonstationary models. Tramblay et al. (2013) selected an appropriate nonstationary Peaks-Over-Threshold (POT) model with the help of the LRT.
Few studies, however, have compared various model selection criteria for determining an appropriate nonstationary GEV model. Although Stone (1979) described the fundamental characteristics and comparative performance of the AIC and BIC, no specific standards have been set to determine the best criterion for such a model. Therefore, it is likely that an inappropriate model may be selected under nonstationary conditions, and this makes it necessary to determine the most appropriate criterion. For this purpose, this study compares the performances of the AIC, the AICc, the BIC, and the LRT, using Monte Carlo simulation for various sample sizes as well as location, scale, and shape parameters based on stationary and nonstationary GEV distributions. To evaluate the simulation results, the AIC, AICc, BIC, and LRT were applied to the stationary and nonstationary GEV models fitted to the observed annual maximum rainfall data.

2. Model Selection Criteria

A number of methods can be applied to select appropriate nonstationary models. Of these, the AIC, the AICc, the BIC, and the LRT have been recommended most often. In this study, these methods were applied to various stationary and nonstationary GEV models.

2.1. Nonstationary GEV Models

The GEV distribution is widely used for extreme values and includes location, scale, and shape parameters (Lettenmaier and Burges, 1982). The Probability Density Function (PDF) and the Cumulative Distribution Function (CDF) of the GEV distribution are represented by Eq. (1) and Eq. (2), respectively.

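$$f(x) = \frac{1}{\sigma}\left[1 + \xi\left(\frac{x-\mu}{\sigma}\right)\right]^{(-1/\xi)-1} \exp\left\{-\left[1 + \xi\left(\frac{x-\mu}{\sigma}\right)\right]^{-1/\xi}\right\} \qquad (1)$$

$$F(x) = \exp\left\{-\left[1 + \xi\left(\frac{x-\mu}{\sigma}\right)\right]^{-1/\xi}\right\} \qquad (2)$$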

where $\mu$, $\sigma$ ($\sigma > 0$), and $\xi$ are the location, scale, and shape parameters, respectively. In a nonstationary GEV distribution, the GEV parameters can be expressed as various forms of time-dependent function. In this study, the location parameter and the logarithm of the scale parameter are expressed as linear functions of time ($t$), as represented by Eq. (3) and Eq. (4), respectively. This form gives a simple representation of an increasing or decreasing trend in the location and scale parameters, which are related to the mean and variance of the observed data.

$$\mu(t) = \mu_0 + \mu_1 t \qquad (3)$$

$$\sigma(t) = \exp(\sigma_0 + \sigma_1 t) \qquad (4)$$

The location parameter varies linearly with time, whereas the scale parameter varies exponentially with time so that it remains greater than zero (Coles, 2001). For the GEV model, it is difficult to estimate the shape parameter precisely (Coles, 2001). Therefore, this study assumes the shape parameter to be constant over time. Table 1 lists the four GEV models that were used in this study: GEV(0,0,0) assumes all parameters to be stationary ($\mu_1 = 0$, $\sigma_1 = 0$); GEV(0,1,0) assumes that the scale parameter varies exponentially with time ($\mu_1 = 0$, $\sigma_1 \neq 0$); GEV(1,0,0) assumes that the location parameter varies linearly with time ($\mu_1 \neq 0$, $\sigma_1 = 0$); and GEV(1,1,0) assumes both the location and scale parameters to be nonstationary ($\mu_1 \neq 0$, $\sigma_1 \neq 0$).

Table 1. Applied stationary and nonstationary GEV models.
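As an illustration of Eqs. (1) to (4), the log-likelihood of the GEV(1,1,0) model can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in this study: the function names, the time index t = 1, ..., n, and the Gumbel limit used when the shape parameter is numerically zero are all assumptions that do not appear in the text.

```python
import math


def gev_logpdf(x, mu, sigma, xi):
    """Log of the GEV density in Eq. (1); -inf outside the support."""
    if sigma <= 0:
        return -math.inf
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        # Gumbel limit as xi -> 0 (assumption: not written out in the text)
        return -math.log(sigma) - z - math.exp(-z)
    base = 1.0 + xi * z
    if base <= 0:
        return -math.inf
    return -math.log(sigma) - (1.0 / xi + 1.0) * math.log(base) - base ** (-1.0 / xi)


def nonstationary_loglik(data, mu0, mu1, s0, s1, xi):
    """Log-likelihood with mu(t) = mu0 + mu1*t (Eq. 3) and
    sigma(t) = exp(s0 + s1*t) (Eq. 4); t = 1..n is an assumed time index."""
    total = 0.0
    for t, x in enumerate(data, start=1):
        mu_t = mu0 + mu1 * t
        sigma_t = math.exp(s0 + s1 * t)
        total += gev_logpdf(x, mu_t, sigma_t, xi)
    return total
```

Setting `mu1 = 0` and/or `s1 = 0` reduces the same function to the GEV(0,1,0), GEV(1,0,0), and GEV(0,0,0) models, so one routine can serve all four candidates when the likelihood is maximized numerically.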

2.2. Akaike’s Information Criterion (AIC) and corrected AIC (AICc)

Akaike (1973) derived a simple equation by establishing a relationship between the Kullback-Leibler information (Kullback and Leibler, 1951) and Fisher’s log-likelihood function. The AIC is represented by Eq. (5).

$$\mathrm{AIC} = -2\log(ML) + 2k \qquad (5)$$

where $\log(ML)$ is the maximized log-likelihood function under the proposed model and $k$ is the number of parameters in a given model. The efficiency of various models can be determined by comparing their AIC values. The model with the lowest AIC value is considered to be the most efficient. However, when the dimension of the model (the number of parameters) is large relative to the sample size $n$, the bias of overfitting increases (Hurvich and Tsai, 1989). The corrected AIC (AICc) was developed for small samples to mitigate this bias (Hurvich and Tsai, 1989, 1995). The AICc is represented by Eq. (6).

$$\mathrm{AICc} = -2\log(ML) + 2k + \frac{2k(k+1)}{n-k-1} \qquad (6)$$

In this equation, a bias correction term, which includes the sample size $n$, is added to Eq. (5). If the value of $n/k$ is less than or equal to 40, then the AICc is recommended for selecting an appropriate model (Burnham and Anderson, 2004).

2.3. Bayesian Information Criterion (BIC)

The BIC, a widely used information criterion, was derived by Schwarz (1978). It is represented by Eq. (7).

$$\mathrm{BIC} = -2\log(ML) + k\log(n) \qquad (7)$$

The BIC equation differs from the AIC equation in its second term, which depends on the sample size $n$. The derivations of both equations appear to be similar. However, the BIC equation is derived within a Bayesian framework, which is different from the way the AIC equation is derived. The model with the lowest BIC value is considered to be the most efficient, as is the case with the AIC.

2.4. Likelihood Ratio Test (LRT)

The LRT determines the preferred model of two competing nested models. A nested pair consists of one model, the reduced model, that is derived from another model, the full model. The full model includes all the parameters of the reduced model, and the number of parameters in the full model must be greater than that in the reduced model (Cahill, 2003). Let $M_f$ and $M_r$ be a full model and a reduced model, respectively. The deviance statistic ($D$) is represented by Eq. (8).

$$D = 2\{\log(ML_f) - \log(ML_r)\} \qquad (8)$$

where $\log(ML_f)$ and $\log(ML_r)$ are the maximized log-likelihood functions of $M_f$ and $M_r$, respectively (Coles, 2001).

Let $k_f$ and $k_r$ be the numbers of parameters of $M_f$ and $M_r$, respectively. Then, $c_\alpha$ is defined as the $(1-\alpha)$ quantile of the $\chi^2$ distribution with $(k_f - k_r)$ degrees of freedom, where $\alpha$ is the level of significance. If $D > c_\alpha$, $M_r$ is rejected and $M_f$ is preferred. Otherwise, $M_r$ is retained.
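The four criteria of Eqs. (5) to (8) can be sketched in Python as follows. This is an illustrative sketch rather than the code used in this study: the function names and the log-likelihood values are hypothetical, and the parameter counts (k = 3 for GEV(0,0,0), k = 4 for GEV(1,0,0)) are an assumption based on the model definitions in Section 2.1. The critical value is computed in closed form only for 1 or 2 degrees of freedom, which covers every nested pair among the four models under that assumption.

```python
import math
from statistics import NormalDist


def aic(loglik, k):
    """Eq. (5): AIC = -2 log(ML) + 2k."""
    return -2.0 * loglik + 2.0 * k


def aicc(loglik, k, n):
    """Eq. (6): AIC plus the small-sample correction 2k(k+1)/(n-k-1)."""
    return aic(loglik, k) + 2.0 * k * (k + 1) / (n - k - 1)


def bic(loglik, k, n):
    """Eq. (7): BIC = -2 log(ML) + k log(n)."""
    return -2.0 * loglik + k * math.log(n)


def chi2_quantile(alpha, df):
    """(1 - alpha) quantile of the chi-square distribution, df = 1 or 2 only."""
    if df == 1:
        # A chi-square(1) variable is the square of a standard normal variable.
        return NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2
    if df == 2:
        # A chi-square(2) variable is exponential with mean 2.
        return -2.0 * math.log(alpha)
    raise ValueError("only df = 1 or 2 is handled in this sketch")


def lrt_prefers_full(loglik_full, loglik_reduced, k_full, k_reduced, alpha=0.05):
    """Eq. (8): reject the reduced model when D exceeds the critical value."""
    d = 2.0 * (loglik_full - loglik_reduced)
    return d > chi2_quantile(alpha, k_full - k_reduced)


# Hypothetical maximized log-likelihoods for a sample of size n = 30:
# reduced model GEV(0,0,0) with k = 3, full model GEV(1,0,0) with k = 4.
n = 30
ll_reduced, ll_full = -50.0, -47.0
print("AIC :", aic(ll_reduced, 3), aic(ll_full, 4))
print("AICc:", aicc(ll_reduced, 3, n), aicc(ll_full, 4, n))
print("BIC :", bic(ll_reduced, 3, n), bic(ll_full, 4, n))
print("LRT prefers full model:", lrt_prefers_full(ll_full, ll_reduced, 4, 3))
```

With these hypothetical values every criterion favors the full model. The penalty for the extra parameter is 2 under the AIC and log(30), about 3.4, under the BIC, which shows why the BIC is the more conservative of the two at this sample size.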

3. Simulation Experiments

3.1. General Description

Monte Carlo simulations were performed to evaluate the feasibility of using the AIC, AICc, BIC, and LRT in the selection of the GEV(0,0,0), GEV(0,1,0), GEV(1,0,0), and GEV(1,1,0) models. In order to determine the magnitude of trends in the location and scale parameters for the simulation conditions, the observed annual maximum rainfall data sets were used in this study. The observed data sets were standardized on the basis of the mean value of each data set, and then the slopes of the location and scale parameters were estimated. Based on these results, the simulation conditions considered for stationary and nonstationary models are listed in Table 2. When the location or scale parameters were assumed to be stationary, their values were set to 0 and 1, respectively. Sample data sets with various sample sizes ($N$ = 30, 40, 50, 60, 70, 80, 90, 100, 120, and 160) were generated for each GEV model. For each sample size, the model parameters were estimated using the maximum likelihood method, and the AIC, AICc, BIC, and $D$ (deviance statistic) values were calculated for each GEV model. For each information criterion, the model with the lowest value was considered to be the most efficient among all the GEV models that were examined. In the case of the LRT, the four GEV models were compared to one another at the 5% significance level. However, when the results suggested the GEV(0,1,0) and GEV(1,0,0) models to be better than the others, the LRT failed to select the better model of the two, since neither is nested within the other. To evaluate the performances of the AIC, AICc, BIC, and LRT, the proper model selection ratio ($R$) based on the Monte Carlo simulation was proposed, as represented by Eq. (9).

$$R(\%) = \frac{N_{S=A}}{N_T} \times 100 \qquad (9)$$

where $N_T$ is the total number of simulations, and $N_{S=A}$ is the number of times the selected model was the same as the assumed model that generated the samples. In this study, the simulation was repeated 5,000 times. The simulation procedure is illustrated in Fig. 1.

Table 2. Simulation conditions for stationary and nonstationary GEV models.

Fig. 1. Simulation procedure performed in this study.
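The sample generation and the ratio of Eq. (9) can be sketched in Python as follows. This is a hedged sketch, not the procedure of Fig. 1 itself: samples are drawn by inverting the CDF of Eq. (2), which is one standard way to simulate GEV variates, and `select_model` is a hypothetical callable standing in for the fit-and-compare step (maximum likelihood fitting of the four models followed by one of the criteria), which is not implemented here. The function names, the model labels, and the time index t = 1, ..., N are assumptions.

```python
import math
import random


def gev_sample(n, mu0, mu1, s0, s1, xi, rng):
    """Draw x_1..x_n by inverting the GEV CDF of Eq. (2) at each time step."""
    xs = []
    for t in range(1, n + 1):
        mu_t = mu0 + mu1 * t              # Eq. (3)
        sigma_t = math.exp(s0 + s1 * t)   # Eq. (4)
        u = rng.random()
        while u == 0.0:                   # avoid log(0)
            u = rng.random()
        w = -math.log(u)
        if abs(xi) < 1e-12:
            x = mu_t - sigma_t * math.log(w)                  # Gumbel limit
        else:
            x = mu_t + sigma_t * (w ** (-xi) - 1.0) / xi
        xs.append(x)
    return xs


def proper_selection_ratio(assumed, n, params, select_model, n_sim=5000, seed=0):
    """Eq. (9): percentage of simulations in which the selected model
    equals the assumed model that generated the sample."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        sample = gev_sample(n, *params, rng=rng)
        if select_model(sample) == assumed:
            hits += 1
    return 100.0 * hits / n_sim
```

A call such as `proper_selection_ratio("GEV(0,1,0)", 50, (0.0, 0.0, 0.0, -0.005, 0.2), select_model)` would then return R for one simulation condition and one sample size; repeating it over the sample sizes listed above would give one curve of the kind plotted against sample size.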

3.2. Simulation Results

The proper model selection ratios ($R$) of the AIC, AICc, BIC, and LRT are plotted as a function of the sample size, and are compared in Figs. 2, 4, 7, and 10, respectively, for the assumed GEV models. Figs. 3, 5, 8, and 11, respectively, show the model selection ratios ($R$) for all candidate GEV models (GEV(0,0,0), GEV(0,1,0), GEV(1,0,0), and GEV(1,1,0)) according to sample size.

1) GEV(0,0,0) conditions

Fig. 2 illustrates the results of simulations of the GEV(0,0,0) model. Figs. 2(a) and 2(b) represent the simulation results for the best and worst performances, respectively. Under the GEV(0,0,0) conditions, the performances were similar in the best and worst cases. As shown in Fig. 2, for all sample sizes, the proper model selection ratios of the BIC and the LRT were higher than those of the AIC and the AICc. The BIC demonstrated the best performance when the sample size was more than 40. The proper model selection ratios of the AIC, BIC, and LRT increased with increasing sample size. In contrast, the proper model selection ratio of the AICc gradually decreased and converged to that of the AIC. Fig. 3 illustrates the model selection ratios of the four model selection methods for all candidate GEV models according to sample size. The AIC and AICc selected the more complex GEV models (GEV(0,1,0), GEV(1,0,0), and GEV(1,1,0)) more often than the BIC and the LRT did. Accordingly, the performances of the BIC and LRT were better than those of the AIC and AICc under the GEV(0,0,0) conditions.

Fig. 2. Proper model selection ratios of GEV(0,0,0) model simulations.

Fig. 3. Model selection ratios of AIC, AICc, BIC, and LRT for all candidate GEV models according to sample size (in the case of the assumed GEV(0,0,0) model:  = ,  = ,  = . ).

2) GEV(0,1,0) conditions

Fig. 4 illustrates simulation results of the proper model selection ratios for the GEV(0,1,0) model. Under the GEV(0,1,0) conditions, the proper model selection ratios of the AIC and the AICc were higher than those of the BIC and LRT when a relatively small sample size was considered. However, as the sample size increased, the proper model selection ratios of the BIC and LRT increased to approximately 90%, whereas those of the AIC and AICc increased to approximately 80%, for most cases. Therefore, a reversed point was observed at a certain sample size for each simulation condition. The BIC demonstrated the best performance when the sample size was larger than the reversed point. For small sample sizes, the performance of the AICc was not better than that of the AIC, as opposed to simulations under stationary conditions. For all sample sizes, the difference between the proper model selection ratio of the AIC and that of the AICc was less than 5%. This indicated that the AICc had no advantage over the AIC in the case of small sample sizes.

The performances of the model selection methods were influenced by two parameters: the slope of the scale parameter ($\sigma_1$) and the shape parameter ($\xi$). The effect of $\sigma_1$ on the performance can be seen in Figs. 4(a) and 4(c). The proper model selection ratios of all the tests decreased when $\sigma_1$ was close to zero. Moreover, the proper model selection ratios increased more slowly with an increase in sample size. Therefore, the sample size of the reversed point increased. Fig. 4(a) represents the simulation result for the best performance under GEV(0,1,0) simulation conditions. The AIC and AICc performed better than the BIC and LRT, while the proper model selection ratios of all the tests increased rapidly at $N = 50$. However, the proper model selection ratios of the BIC and LRT increased to over 90%, whereas those of the AIC and the AICc increased to approximately 80%. In Fig. 4(c), the proper model selection ratios decreased significantly, and the reversed point was observed at $N = 135$. Figs. 4(c) and 4(d) illustrate the changes in the performance of the model selection methods on the basis of the shape parameter, assuming the simulation conditions to be $\mu = 0$, $\sigma = \exp(-0.005t)$, and $\xi = \pm 0.2$. The results indicated that as the shape parameter increased, the performance of all tests generally decreased; however, the differences were small. When $\sigma_1$ was positive, all the results were nearly identical (not shown in the figures).

Fig. 5 illustrates the model selection ratios for all the candidate models considered under the four model selection tests. Figs. 5(a), 5(b), 5(c), and 5(d) show the best performances under the simulation conditions, whereas Figs. 5(e), 5(f), 5(g), and 5(h) show the worst performances. These results occurred because the ratios of selecting the GEV(0,0,0) model increased as $\sigma_1$ approached zero. When the sample size was 30, all four tests generally selected the GEV(0,0,0) model as the appropriate model, even though the sample data were generated based on the GEV(0,1,0) model. The AIC performed better than the other tests for relatively small sample sizes because it showed a tendency to select more complex models than the other tests did. As the sample size increased, the ratios of selecting the GEV(0,1,0) and GEV(1,1,0) models increased, whereas those of selecting the other candidate models decreased to almost zero. In particular, the BIC showed lower ratios of selecting the GEV(1,1,0) model than the other tests. Therefore, the BIC showed the best performance for relatively large sample sizes. Similar results were observed under other simulation conditions (not shown in the figures).

Using the simulation results, regression analysis was performed for the reversed points to recommend the best model selection method for given $\sigma_1$ and sample sizes. To determine the regression equation, two conditions were assumed on the basis of the simulation results: 1) the sample size at the reversed point under the stationary condition ($\sigma_1 = 0$) was infinity, and 2) the sample size at the reversed point depended only on the absolute value of $\sigma_1$, regardless of its sign. Based on these assumptions, the power regression model represented in Eq. (10) was selected for the reversed points.

$$y = a \cdot x^{b} \qquad (10)$$

where $x$ is the slope of the scale parameter ($\sigma_1$) and $y$ is the sample size ($N$). Table 3 presents the estimated regression equations for the reversed points and the coefficient of determination ($R^2$). Fig. 6 illustrates the reversed points and the regression line. The AIC is recommended in the shaded region under the regression line, whereas the BIC is recommended in the region over the regression line. For example, the AIC is more suitable than the BIC when $\sigma_1 = -0.01$, $\xi = -0.2$, and $N = 50$, as shown in Fig. 6(a).
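The power regression of Eq. (10) and the resulting decision rule can be sketched in Python as follows. This sketch rests on several assumptions: x is taken as the absolute value of the slope (following condition 2 above), the fit is done by ordinary least squares on the log-log scale (the text does not say how the coefficients were estimated), and the three points are synthetic values placed exactly on y = 0.5 x^(-1). They are not the reversed points or coefficients of Table 3. Note that condition 1 above implies b < 0 whenever a > 0, since only then does y grow without bound as x approaches zero.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of log(y) = log(a) + b*log(x); returns (a, b)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = math.exp(my - b * mx)
    return a, b


def recommended_method(abs_slope, n, a, b, large_sample_method="BIC"):
    """AIC below the regression line, the large-sample method above it."""
    reversed_point = a * abs_slope ** b
    return "AIC" if n < reversed_point else large_sample_method


# Synthetic illustration only: points lying exactly on y = 0.5 * x**(-1).
abs_slopes = [0.005, 0.01, 0.02]
reversed_n = [0.5 / s for s in abs_slopes]
a, b = fit_power_law(abs_slopes, reversed_n)
print(a, b)
print(recommended_method(0.01, 30, a, b))
print(recommended_method(0.01, 80, a, b))
```

The `large_sample_method` argument is a hypothetical convenience so that the same rule can be reused wherever a different method is preferred above the regression line.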

Fig. 4. Proper model selection ratios of GEV(0,1,0) model simulations.

Fig. 5. Model selection ratios of AIC, AICc, BIC, and LRT for all candidate GEV models according to sample size, in the case of assumed GEV(0,1,0) model.

Table 3. Estimated regression equations of reversed points in GEV(0,1,0) model.

Fig. 6. Regression line of reversed points and recommended model selection methods in GEV(0,1,0) model.

3) GEV(1,0,0) conditions

Fig. 7 illustrates the simulation results of the proper model selection ratios for the GEV(1,0,0) model. The performance patterns of the selection methods under the GEV(1,0,0) conditions were similar to those under the GEV(0,1,0) conditions, with the only difference being that the LRT demonstrated the best performance for sample sizes larger than the reversed point. The slope of the location parameter ($\mu_1$) and the shape parameter ($\xi$) of the GEV(1,0,0) model influenced the performances of the model selection methods. Fig. 7 shows the change in performance with the change in $\mu_1$. As $\mu_1$ came closer to zero, the proper model selection ratios of all the tests decreased. Further, the ratios increased more slowly with an increase in sample size. Therefore, the sample size of the reversed point increased (Figs. 7(a) through 7(d)). Figs. 7(a) and 7(b) show the performances when the simulation conditions are $\mu = 0.01t$, $\sigma = 1$, and $\xi = \pm 0.2$. With a decrease in the shape parameter, all tests demonstrated slightly weaker performances.

Fig. 8 illustrates the model selection ratios for all candidate models under the four model selection tests. Figs. 8(a), 8(b), 8(c), and 8(d) show the best performances of the model selection methods under the simulation conditions, whereas Figs. 8(e), 8(f), 8(g), and 8(h) show the worst performances. Similar to the simulation results of the GEV(0,1,0) conditions, the AIC performed better than the other tests for relatively small sample sizes. As the sample size increased, the LRT showed lower ratios of selecting the GEV(1,1,0) model than the other tests. Thus, the LRT showed the best performance for relatively large sample sizes. Similar results were observed under other GEV(1,0,0) simulation conditions (not shown in the figures).

Regression analysis was performed for the reversed points using Eq. (10). In the case of the GEV(1,0,0) condition, $x$ is the slope of the location parameter ($\mu_1$) and $y$ is the sample size ($N$). The results are presented in Table 4 and illustrated in Fig. 9. To improve the accuracy of the regression equations, $\mu_1$ was assumed to be $\pm 0.02$ for the additional simulation experiments. Fig. 9 shows the reversed points and the regression line. The AIC is recommended in the shaded region under the regression line, whereas the LRT is recommended in the region over the regression line.

Fig. 7. Proper model selection ratios of GEV(1,0,0) model simulations.

Fig. 8. Model selection ratios of AIC, AICc, BIC, and LRT for all candidate GEV models according to the sample size, in the case of assumed GEV(1,0,0) model.

Table 4. Estimated regression equations of reversed points in GEV(1,0,0) model.

Fig. 9. Regression line of reversed points and recommended model selection methods in GEV(1,0,0) model.

(4) GEV(1,1,0) conditions Time-dependent location and scale parameters were considered in the simulation of the GEV(1,1,0) model. Fig. 10 illustrates the simulation results of the proper model selection ratios. Unlike the simulation results of the other nonstationary GEV models, the noticeable difference was that the performances of the AIC were either similar to or better than those of the other methods for all sample sizes and parameters considered in this study. This is because the AIC showed a tendency to select more complex models than the other tests. Therefore, as shown in Fig. 10, there were no reversed points. The performances of the model selection tests were influenced by the slope of location parameter ( µ 1 ), the slope of the scale parameter ( σ1), and the value of the shape parameter ( ξ ) of the GEV(1,1,0) model. The proper model selection ratios of all the tests decreased with decrease in the absolute value of µ1 (Figs. 10(a) through 10 (h)). When µ 1 was positive, the results were nearly the same (not shown in Figures). When

σ1 increased from

-0.02 to -0.005, the performances of all the methods decreased (Figs. 10(a) through 10(d)). However, when

σ1 increased from 0.005 to 0.02, the proper model selection ratios of all the

methods decreased (Figs. 10(e) through 10(h)). The performance of all the model selection methods was slightly reduced with the increase in the shape parameter (Figs. 10(i) and 10(j)). Fig. 11 illustrates the selection ratios for all the candidate models under the AIC. At a sample size of 30, the AIC mostly selected the GEV(0,0,0) model as the best one. However, with increase in sample size, the proper model selection ratios increased, and the chances of the - 16 -

GEV(0,0,0) model getting selected decreased. In case of the GEV(1,1,0) simulation conditions, the results were largely influenced by both the location and scale parameters. Fig. 11(a) shows the model selection ratios under the condition of best performance, and Figs. 11(b) through 11(d) show the changes in the ratios with increase in that as

σ1. Fig. 11(b) illustrates

σ1 came closer to 0, the ratios of selecting the GEV(1,0,0) model increased. As the

sample size increased to 70, the ratios of selecting the GEV(1,0,0) and GEV(1,1,0) models increased. Furthermore, when the sample sizes were 60, 70, and 80, the ratios of selecting the GEV(1,0,0) model were higher than those of the other models. For a sample size of more than 80, the ratios of selecting the GEV(1,0,0) model decreased, while those of selecting the GEV(1,1,0) model (the proper model) increased. Fig. 11(c) shows the results when

σ1 increased to 0.01. As the sample size increased, similar variations in the ratios of

selecting the GEV(0,1,0) model were observed as in the ratios of selecting the GEV(1,0,0) model, as shown in Fig. 11(b). Meanwhile, the ratios of selecting the GEV(1,1,0) model increased. In Fig. 11(d), greater change was observed in case of the scale parameter compared with the location parameter, as

σ1 was the highest under the simulation

conditions. Consequently, for sample sizes of more than 40, the ratios of selecting the GEV(0,1,0) model were higher than those of the other models. As the sample size increased, the ratios of selecting the GEV(1,1,0) model (the proper model) increased gradually, while those of selecting the GEV(0,1,0) model increased rapidly to about 80%. Figs. 11(e) and 11(f) show the results under the simulation conditions, which are the same as illustrated in Figs. 10(g) and 10(h), respectively. Due to the abovementioned reason, the GEV(0,1,0) model was mostly selected as the appropriate model, as shown in Fig. 11 (d), 11(e) and 11(f). It was concluded from these results that when the change in one parameter is larger than that in other parameters, the model selection methods primarily select the GEV(0,1,0) or GEV(1,0,0) model as the appropriate one.


The simulation experiments on the various models yielded the following results. The BIC performed best for the GEV(0,0,0) model when the sample size was more than 40, and for the GEV(0,1,0) model at large sample sizes. The LRT performed best for the GEV(1,0,0) model at large sample sizes. The AIC performed best for the GEV(0,1,0) and GEV(1,0,0) models at small sample sizes, and for the GEV(1,1,0) model at all sample sizes.

Fig. 10. Proper model selection ratios of GEV(1,1,0) model simulations.

Fig. 11. Model selection ratios of AIC for all candidate GEV models according to the sample size, in the case of assumed GEV(1,1,0) model.

4. Application

4.1. Actual Evaluation

In order to evaluate the results of the simulation experiments, the observed annual maximum rainfall data at four selected sites (Jinju, Gunsan, Yeongju, and Namwon) were used as examples of stationary and nonstationary cases. The data were standardized on the basis of the mean value of each data set. The location, scale, and shape parameters of these four data sets were estimated by the maximum likelihood method using 20-year moving windows. Fig. 12 shows the estimated location, scale, and shape parameters of the four selected sites. Linear regression was then performed on the three parameters of the four data sets to determine the assumed true GEV model. When the p-value of the slope coefficient was below the 1% significance level, the parameter was assumed to have a trend. Table 5 presents the results of the significance test for the slope coefficient. Both the location and scale parameters showed increasing trends at the Namwon site; however, they demonstrated no trend at the Jinju site. At the Gunsan site, only the scale parameter exhibited a decreasing trend, while at the Yeongju site, only the location parameter showed an increasing trend. Therefore, the Jinju, Gunsan, Yeongju, and Namwon sites were assumed to follow the GEV(0,0,0), GEV(0,1,0), GEV(1,0,0), and GEV(1,1,0) models, respectively (Table 5). The four stationary and nonstationary GEV models, listed in Table 1, were applied to the annual maximum rainfall data, and the AIC, AICc, BIC, and LRT methods were employed to select the best model. Table 6 shows the selected GEV model for each site.
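The trend screening described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the synthetic data, and the reading of "standardized on the basis of the mean" as division by the mean are assumptions, and the fitting and regression rely on SciPy's `genextreme.fit` and `linregress`.

```python
import numpy as np
from scipy.stats import genextreme, linregress

def moving_window_gev(series, window=20):
    """Fit a stationary GEV by maximum likelihood to each moving window.

    Returns one row (location, scale, shape) per window. SciPy's shape
    parameter c has the opposite sign to the usual xi, so it is negated.
    """
    series = np.asarray(series, dtype=float)
    rows = []
    for start in range(len(series) - window + 1):
        c, loc, scale = genextreme.fit(series[start:start + window])
        rows.append((loc, scale, -c))
    return np.array(rows)

def has_trend(values, alpha=0.01):
    """Regress values on the window index and test the slope at level alpha."""
    fit = linregress(np.arange(len(values)), values)
    return fit.pvalue < alpha, fit.slope

# Hypothetical data: 45 synthetic annual maxima, divided by their mean (assumed standardization).
rng = np.random.default_rng(0)
annual_max = genextreme.rvs(c=-0.1, loc=1.0, scale=0.3, size=45, random_state=rng)
annual_max = annual_max / annual_max.mean()

params = moving_window_gev(annual_max)
for name, column in zip(("location", "scale", "shape"), params.T):
    significant, slope = has_trend(column)
    print(f"{name}: slope={slope:.4f}, trend={significant}")
```

A record of 45 years with 20-year windows gives 26 overlapping windows, so each regression in this sketch uses 26 points.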

Fig. 12. Estimated location, scale, and shape parameters for the sample using 20-year moving windows.

Table 5. Results of the hypothesis test for the slope coefficient on selected sites.

Table 6. Selected stationary and nonstationary GEV models for selected sites.

At the Jinju site, for which the best fitting model was the GEV(0,0,0) model, all tests selected the GEV(0,0,0) model as the best. The simulation results for the GEV(0,0,0) model indicated that all criteria could be applied to stationary GEV models for all sample sizes, because the proper model selection ratios of all methods were high.

At the Gunsan site, for which the best fitting model was the GEV(0,1,0) model, the AIC and the AICc selected the proper model, whereas the other criteria selected the GEV(0,0,0) model. The estimated slope of the scale parameter (σ1) and the shape parameter (ξ) for the GEV(0,1,0) model were 0.0212 and 0.285, respectively, and they were similar to the parameters in Fig. 4(b) (σ1 = 0.02 and ξ = 0.2). As shown in Figs. 4(b) and 6(d), the AIC and the AICc showed better performance than the other model selection methods for the sample size of N = 46.

At the Yeongju site, for which the best fitting model was the GEV(1,0,0) model, the AIC and the AICc selected the proper model, while the other criteria selected the GEV(0,0,0) model. The estimated slope of the location parameter (µ1) and the shape parameter (ξ) for the GEV(1,0,0) model were 0.0058 and 0.1491, respectively, and they were similar to the parameters in Fig. 7(c) (µ1 = 0.005 and ξ = 0.1). As illustrated in Figs. 7(c) and 9(c), the AIC and the AICc demonstrated better performance than the other model selection methods for the sample size of N = 42.

At the Namwon site, for which the best fitting model was the GEV(1,1,0) model, only the AIC selected the proper model, whereas the AICc selected the GEV(1,0,0) model and the other criteria selected the GEV(0,0,0) model. The estimated slope of the location parameter (µ1), slope of the scale parameter (σ1), and the shape parameter (ξ) of the GEV(1,1,0) model were 0.0054, 0.0191, and -0.1385, respectively. The parameters of the GEV(1,1,0) simulation shown in Fig. 10(j) (µ1 = 0.005, σ1 = 0.02, and ξ = -0.1) were similar to those of the Namwon site. As shown in Fig. 10(j), the AIC showed the best performance for the sample size of N = 42.

This application indicates that all model selection methods can be used for stationary GEV models, while the AIC is the best selection method for nonstationary GEV models with small sample sizes.
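The criteria compared in this section can all be computed from the maximized log-likelihood of each candidate model. The following Python sketch uses the standard definitions of the AIC, AICc, and BIC and a chi-square likelihood ratio test against the stationary model. The log-likelihood values are invented for illustration and are not taken from any of the four sites, and the pairwise form of the LRT is an assumption about how the test would be applied.

```python
import math

def aic(loglik, k):
    """Akaike information criterion."""
    return -2.0 * loglik + 2.0 * k

def aicc(loglik, k, n):
    """AIC with the small-sample correction term."""
    return aic(loglik, k) + 2.0 * k * (k + 1) / (n - k - 1)

def bic(loglik, k, n):
    """Bayesian information criterion."""
    return -2.0 * loglik + k * math.log(n)

def lrt_pvalue(loglik_null, loglik_alt, df):
    """Chi-square tail probability of the deviance, for df = 1 or 2 only."""
    deviance = max(0.0, 2.0 * (loglik_alt - loglik_null))
    if df == 1:
        return math.erfc(math.sqrt(deviance / 2.0))
    if df == 2:
        return math.exp(-deviance / 2.0)
    raise ValueError("only df = 1 or 2 is handled in this sketch")

n = 42  # sample size
fits = {  # model: (hypothetical maximized log-likelihood, number of parameters)
    "GEV(0,0,0)": (-19.8, 3),
    "GEV(1,0,0)": (-18.2, 4),
    "GEV(0,1,0)": (-18.0, 4),
    "GEV(1,1,0)": (-16.8, 5),
}
best_aic = min(fits, key=lambda m: aic(*fits[m]))
best_aicc = min(fits, key=lambda m: aicc(*fits[m], n))
best_bic = min(fits, key=lambda m: bic(*fits[m], n))
p_value = lrt_pvalue(fits["GEV(0,0,0)"][0], fits["GEV(1,1,0)"][0], df=2)
print(best_aic, best_aicc, best_bic, round(p_value, 4))
```

With these made-up values the criteria disagree: the AIC picks GEV(1,1,0), the AICc picks GEV(0,1,0), and the BIC picks GEV(0,0,0). The disagreement comes only from the different penalties on the number of parameters, which grow from AIC to AICc to BIC at this sample size.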

4.2. Practical Application

As the AICc has no advantage over the AIC for small sample sizes in nonstationary GEV models (Figs. 4, 7, and 10), the AICc does not need to be considered. When the model selection methods (AIC, BIC, and LRT) select the same GEV model, we simply apply the selected model in the frequency analysis. If the selected GEV models differ, however, we can choose the best method as follows:

- When the model selection methods select the GEV(0,1,0) or GEV(1,0,0) model, we can calculate the reversed point using the regression equations in Tables 3 and 4. By comparing the reversed point with the sample size of the application data, we can identify the best method for the given sample size using Figs. 6 and 9.

- If the sample size is not large enough, the AIC sometimes selects the GEV(1,1,0) model while the other methods select the GEV(0,0,0) model. When the true model is the GEV(0,0,0) model, the probability that the AIC selects the GEV(1,1,0) model is less than 5% (Fig. 3(a)). On the other hand, when the true model is the GEV(1,1,0) model, the probability that the BIC or LRT selects the GEV(0,0,0) model is even greater than the probability that the AIC selects the GEV(0,0,0) model (Fig. 11). Therefore, we recommend the use of the AIC in that case.
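The first rule of this guide can be sketched as a small calculation. The sketch below assumes that, in the regression equations of Tables 3 and 4, x is the absolute slope of the nonstationary parameter and y is the sample size at the reversed point. This reading is an assumption made here for illustration, as are the function names. The coefficients are those of the ξ = 0.1 row of Table 4, applied to a Yeongju-like case (µ1 = 0.0058, N = 42).

```python
def reversed_point(a, b, slope):
    """Sample size at the reversed point, y = a * x**b with x = |slope| (assumed)."""
    return a * abs(slope) ** b

def recommend(model, n, a, b, slope):
    """Return the AIC below the reversed point, otherwise BIC or LRT by model."""
    large_sample_method = {"GEV(0,1,0)": "BIC", "GEV(1,0,0)": "LRT"}[model]
    return "AIC" if n < reversed_point(a, b, slope) else large_sample_method

# Yeongju-like case: GEV(1,0,0), mu1 = 0.0058, N = 42, Table 4 row for xi = 0.1.
n_rev = reversed_point(4.6657, -0.665, 0.0058)
method = recommend("GEV(1,0,0)", 42, 4.6657, -0.665, 0.0058)
print(round(n_rev), method)
```

Under this reading the reversed point is far above N = 42, so the sketch returns the AIC, which agrees with the method found best for the Yeongju site in Section 4.1.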

5. Conclusions

This simulation study was conducted to recommend appropriate model selection methods for use in nonstationary frequency analysis. Monte Carlo simulation experiments were conducted for various stationary and nonstationary GEV models, and the performances of the AIC, AICc, BIC, and LRT model selection methods were evaluated. For stationary GEV models, the proper model selection ratios of the BIC and LRT were above 90%, whereas those of the AIC and AICc were more than 70%. This is because the AIC and the AICc showed higher ratios of selecting more complex GEV models than the BIC and LRT. The BIC demonstrated the best performance when the sample size was more than 40.


In the case of nonstationary GEV models with four parameters (the GEV(0,1,0) and GEV(1,0,0) models), the AIC performed better than the other criteria when the sample size was relatively small. However, as the sample size increased, the proper model selection ratios of the BIC and LRT increased to approximately 90%, whereas those of the AIC and AICc increased to around 80%, in most cases. This is because the AIC showed a tendency to select more complex models than the other tests. Therefore, a reversed point was found for some sample sizes. For sample sizes larger than the reversed point, the BIC and LRT showed the best performance for the GEV(0,1,0) and GEV(1,0,0) models, respectively. In addition, as the shape parameter increased, the performance of all the tests either slightly decreased (as in the case of the GEV(0,1,0) model) or increased (as in the case of the GEV(1,0,0) model). Based on these simulation results, regression analysis was performed for the reversed point to recommend the best model selection method for the given sample size and parameter slope.

For the simulation of the GEV(1,1,0) model, the proper model selection ratios of the AIC were found to be either equal to or greater than those of the other model selection methods for all sample sizes and parameters considered in this study. This is because the AIC is apt to select more complex models than the other tests. However, when a certain parameter was significantly more nonstationary than the other parameters, the model selection methods primarily selected some other nonstationary model as the appropriate one. The extent to which a parameter is nonstationary can be measured by observing the changes it undergoes with changing sample sizes.

To evaluate the simulation experiments, the four stationary and nonstationary GEV models were applied to the annual maximum rainfall data of four selected sites, as observed by the Korea Meteorological Administration. The AIC, AICc, BIC, and LRT methods were then applied to select the best models. Similar to the simulation results, the AIC demonstrated the best performance for the selected sites, which have relatively small sample sizes. In addition, a simple guideline for practical applications was provided. From the abovementioned results, an appropriate model selection method can be selected according to the sample size and GEV model parameters. Using the selected method, the most suitable GEV model can be determined and applied to estimate quantiles in nonstationary frequency analysis.

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (grant number: 2014006671).


Table 1. Applied stationary and nonstationary GEV models.

| Model | Location parameter | Scale parameter | Shape parameter |
|---|---|---|---|
| GEV(0,0,0) | µ | σ | ξ |
| GEV(0,1,0) | µ | exp(σ0 + σ1t) | ξ |
| GEV(1,0,0) | µ0 + µ1t | σ | ξ |
| GEV(1,1,0) | µ0 + µ1t | exp(σ0 + σ1t) | ξ |
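The models in Table 1 can be fitted by minimizing a negative log-likelihood. The sketch below writes that function for the GEV(1,1,0) model in plain Python. It assumes the common sign convention in which ξ > 0 gives a heavy upper tail and a time index t = 1, ..., n. Both are assumptions made for illustration, as is the function name.

```python
import math

def gev_nll(x, mu0, mu1, s0, s1, xi):
    """Negative log-likelihood with mu(t) = mu0 + mu1*t and sigma(t) = exp(s0 + s1*t).

    Setting mu1 = 0 and/or s1 = 0 gives the other Table 1 models, with the
    constant scale written as sigma = exp(s0). Time runs t = 1, ..., n.
    """
    total = 0.0
    for t, xt in enumerate(x, start=1):
        sigma = math.exp(s0 + s1 * t)
        z = (xt - (mu0 + mu1 * t)) / sigma
        if abs(xi) < 1e-8:
            # Gumbel limit as xi -> 0
            total += math.log(sigma) + z + math.exp(-z)
        else:
            w = 1.0 + xi * z
            if w <= 0.0:
                # observation outside the support of the distribution
                return math.inf
            total += math.log(sigma) + (1.0 + 1.0 / xi) * math.log(w) + w ** (-1.0 / xi)
    return total

# One observation at the location parameter with unit scale contributes exactly 1.
print(gev_nll([0.0], 0.0, 0.0, 0.0, 0.0, 0.1))
```

Passing this function to a numerical optimizer over (mu0, mu1, s0, s1, xi) would give maximum likelihood estimates. The log link on the scale keeps σ(t) positive for any parameter values.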

Table 2. Simulation conditions for stationary and nonstationary GEV models.

| Model | Location parameter (µ) | Scale parameter (σ) | Shape parameter (ξ) |
|---|---|---|---|
| GEV(0,0,0) | 0 | 1 | ±0.1, ±0.2 |
| GEV(0,1,0) | 0 | exp(±0.005t), exp(±0.01t), exp(±0.015t), exp(±0.02t) | ±0.1, ±0.2 |
| GEV(1,0,0) | ±0.005t, ±0.01t | 1 | ±0.1, ±0.2 |
| GEV(1,1,0) | ±0.005t, ±0.01t | exp(±0.005t), exp(±0.01t), exp(±0.015t), exp(±0.02t) | ±0.1, ±0.2 |
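A single synthetic sample under one of the Table 2 conditions can be drawn by inverting the GEV distribution function. The following Python sketch does this for the GEV(1,1,0) case with µ(t) = µ1t and σ(t) = exp(σ1t). The function names, the time index t = 1, ..., n, and the sign convention for ξ (ξ > 0 for a heavy upper tail) are assumptions made for illustration.

```python
import math
import random

def gev_quantile(p, mu, sigma, xi):
    """Inverse GEV distribution function for a probability 0 < p < 1."""
    y = -math.log(p)
    if abs(xi) < 1e-8:
        # Gumbel limit as xi -> 0
        return mu - sigma * math.log(y)
    return mu + sigma * (y ** (-xi) - 1.0) / xi

def simulate_gev110(n, mu1, s1, xi, seed=None):
    """Draw one sample of length n with mu(t) = mu1*t and sigma(t) = exp(s1*t)."""
    rng = random.Random(seed)
    sample = []
    for t in range(1, n + 1):
        u = rng.random()
        while u == 0.0:
            # redraw to avoid log(0)
            u = rng.random()
        sample.append(gev_quantile(u, mu1 * t, math.exp(s1 * t), xi))
    return sample

# One condition from Table 2: mu1 = 0.005, sigma1 = 0.02, xi = -0.1, sample size 50.
sample = simulate_gev110(50, 0.005, 0.02, -0.1, seed=1)
print(len(sample), round(min(sample), 3), round(max(sample), 3))
```

Repeating such draws many times, fitting the four candidate models to each sample, and counting how often each criterion picks each model would give selection ratios of the kind reported in the simulation figures.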

Table 3. Estimated regression equations of reversed points in GEV(0,1,0) model.

| Shape parameter | Regression equation | Coefficient of determination (R²) |
|---|---|---|
| -0.2 | y = 3.1995 x^(-0.708) | 0.9997 |
| -0.1 | y = 3.5037 x^(-0.697) | 0.9992 |
| 0.1 | y = 3.6328 x^(-0.695) | 0.9997 |
| 0.2 | y = 3.6859 x^(-0.694) | 0.9997 |

Table 4. Estimated regression equations of reversed points in GEV(1,0,0) model.

| Shape parameter | Regression equation | Coefficient of determination (R²) |
|---|---|---|
| -0.2 | y = 4.6963 x^(-0.66) | 0.999 |
| -0.1 | y = 4.7982 x^(-0.656) | 0.9997 |
| 0.1 | y = 4.6657 x^(-0.665) | 0.9996 |
| 0.2 | y = 4.5868 x^(-0.671) | 0.9994 |

Table 5. Results of the hypothesis test for the slope coefficient on selected sites.

| Site name | Observation period | Sample size | Trend in location parameter | Trend in scale parameter | Trend in shape parameter | True model |
|---|---|---|---|---|---|---|
| Jinju | 1969~2013 | 45 | X | X | X | GEV(0,0,0) |
| Gunsan | 1968~2013 | 46 | X | O | X | GEV(0,1,0) |
| Yeongju | 1972~2013 | 42 | O | X | X | GEV(1,0,0) |
| Namwon | 1972~2013 | 42 | O | O | X | GEV(1,1,0) |

※ O: trend present; X: no trend. Significance level: 0.01

Table 6. Selected stationary and nonstationary GEV models for selected sites.

| Site | True model | AIC | AICc | BIC | LRT |
|---|---|---|---|---|---|
| Jinju | GEV(0,0,0) | GEV(0,0,0) | GEV(0,0,0) | GEV(0,0,0) | GEV(0,0,0) |
| Gunsan | GEV(0,1,0) | GEV(0,1,0) | GEV(0,1,0) | GEV(0,0,0) | GEV(0,0,0) |
| Yeongju | GEV(1,0,0) | GEV(1,0,0) | GEV(1,0,0) | GEV(0,0,0) | GEV(0,0,0) |
| Namwon | GEV(1,1,0) | GEV(1,1,0) | GEV(0,1,0) | GEV(0,0,0) | GEV(0,0,0) |


Fig. 1. Simulation procedure performed in this study.

Fig. 2. Proper model selection ratios of GEV(0,0,0) model simulations. Panels: (a) best performance; (b) worst performance.

Fig. 3. Model selection ratios of AIC, AICc, BIC, and LRT for all candidate GEV models according to sample size, in the case of the assumed GEV(0,0,0) model. Panels: (a) AIC; (b) AICc; (c) BIC; (d) LRT.

Fig. 4. Proper model selection ratios of GEV(0,1,0) model simulations. Panels (a)-(d): (a) best performance; (b) similar to the Gunsan site (σ1 = 0.02, ξ = 0.2); (d) worst performance.

Fig. 5. Model selection ratios of AIC, AICc, BIC, and LRT for all candidate GEV models according to sample size, in the case of assumed GEV(0,1,0) model. Panels: (a) AIC, (b) AICc, (c) BIC, and (d) LRT for one simulation condition; (e) AIC, (f) AICc, (g) BIC, and (h) LRT for a second simulation condition.

Fig. 6. Regression line of reversed points and recommended model selection methods in GEV(0,1,0) model. Panels: (a) ξ = -0.2; (b) ξ = -0.1; (c) ξ = 0.1; (d) ξ = 0.2.

Fig. 7. Proper model selection ratios of GEV(1,0,0) model simulations. Panels (a)-(d): (a) best performance; (c) similar to the Yeongju site (µ1 = 0.005, ξ = 0.1); (d) worst performance.

Fig. 8. Model selection ratios of AIC, AICc, BIC, and LRT for all candidate GEV models according to the sample size, in the case of assumed GEV(1,0,0) model. Panels: (a) AIC, (b) AICc, (c) BIC, and (d) LRT for one simulation condition; (e) AIC, (f) AICc, (g) BIC, and (h) LRT for a second simulation condition.

Fig. 9. Regression line of reversed points and recommended model selection methods in GEV(1,0,0) model. Panels: (a) ξ = -0.2; (b) ξ = -0.1; (c) ξ = 0.1; (d) ξ = 0.2.

Fig. 10. Proper model selection ratios of GEV(1,1,0) model simulations. Panels (a)-(j): (a) best performance; (h) worst performance; (j) similar to the Namwon site (µ1 = 0.005, σ1 = 0.02, ξ = -0.1).

Fig. 11. Model selection ratios of AIC for all candidate GEV models according to the sample size, in the case of assumed GEV(1,1,0) model. Panels (a)-(f): (a) best performance; (f) worst performance.

Fig. 12. Estimated location, scale, and shape parameters for the sample using 20-year moving windows. Panels: (a) location parameters; (b) scale parameters; (c) shape parameters.

- We compared the AIC, AICc, BIC, and LRT for nonstationary GEV models.

- Monte Carlo simulation was conducted for evaluating the performances of all tests.

- Under stationary conditions, the BIC shows the best performance (N>40).

- Under nonstationary conditions, regression lines for model selection were proposed.

- The results of simulations were verified through the application of observed data.