International Review of Financial Analysis 20 (2011) 258–268
Market risk model selection and medium-term risk with limited data: Application to ocean tanker freight markets
Manolis G. Kavussanos ⁎, Dimitris N. Dimitrakopoulos
Athens University of Economics and Business, 76 Patission St, Athens 104 34, Greece
Article info
Available online 16 June 2011
JEL classification: G1; L9
Keywords: Freight rate risk; Shipping; Tankers; Value at Risk; Expected tail loss
Abstract
The estimation of medium-term market risk under limited data availability is a challenging issue of concern amongst academics and practitioners. This paper addresses the issue by exploiting the concepts of volatility and quantile scaling in order to determine the best method for extrapolating medium-term risk forecasts from high frequency data. Additionally, market risk model selection is investigated for a new dataset on ocean tanker freight rates, which represent the income of the capital good, the tanker vessel. Certain idiosyncrasies inherent in the very competitive shipping freight rate markets, such as excessive volatility, cyclicality of returns and medium-term investment horizons (found in few other markets), make these issues challenging. Findings indicate that medium-term risk exposures can be estimated accurately by using an empirical scaling law, which outperforms the conventional scaling laws of the square root of time and the tail index root of time. Regarding market risk model selection for short-term investment horizons, findings contradict most studies on conventional financial assets: freight rate market risk quantification favors simpler specifications, such as the GARCH and the historical simulation models.
© 2011 Elsevier Inc. All rights reserved.
1. Introduction
In the last few years Value at Risk (VaR) and expected tail loss (ETL) have gained prominence, establishing themselves in the financial community as simple and intelligible risk metrics and management techniques for financial assets and portfolios. According to Jorion (1997) “VaR summarizes the worst loss over a target horizon with a given level of confidence”. Although VaR has become a standard for measuring market risk, it has been criticized mainly for its inability to quantify the loss beyond the VaR level (tail risk) and for not being a coherent¹ risk measure (Artzner, Delbaen, Eber, & Heath, 1997, 1999). To remedy VaR's pitfalls, Artzner et al. (1999) introduced the expected tail loss (ETL), or expected shortfall, which expresses the loss beyond the VaR level and fulfills the desirable coherency conditions.

¹ A risk metric is coherent if it fulfills the conditions of monotonicity, sub-additivity, positive homogeneity and translation invariance. VaR fails to fulfill the sub-additivity condition of coherency, which requires the risk of the total position to be less than or equal to the sum of the risks of the individual positions.
⁎ Corresponding author. Tel.: +30 210 8203167; fax: +30 210 8203196. E-mail address: [email protected] (M.G. Kavussanos).
1057-5219/$ – see front matter © 2011 Elsevier Inc. All rights reserved. doi:10.1016/j.irfa.2011.05.007

Another problem with these risk metrics is the estimation of medium-term risk exposures when data availability is limited, which makes the accurate estimation of VaR a difficult task. Specifically, according to Diebold, Hickman, Inoue, and Schuermann (1997) scaling volatility with the square root of time is inappropriate, misleading and results in overestimation of the variability of long horizon volatility. Christoffersen, Diebold, and Schuermann (1998) argue that “if the horizon of interest is more than ten or fifteen trading days, depending on the asset class, then volatility is effectively not forecastable”. Danielsson and Zigrand (2006) argue that the square root of time rule leads to a systematic underestimation of risk. Saadi and Rahman (2008), in their study of the Canadian stock market, echo Danielsson and Zigrand's (2006) findings. On the other hand, Cotter (2007) utilizes data on European stock index futures and finds evidence of overestimation of the conditional estimates² in the case of the square root of time rule. To remedy the problem he proposes an extreme value scaling law applied to the GARCH filtered returns, which yields the most efficient risk forecasts.

² Conditional estimates refer to the VaR estimates obtained by employing an AR-GARCH model to filter the return series.

The estimation of medium-term horizon risk is relevant in many risk management applications, as the investment horizon of some types of investments may be considerably longer than the daily horizon typically used for conventional financial assets. This could be due either to factual reasons or to regulatory demands. For instance, the investment horizons of investments in shipping freight markets are typically longer³ than those of conventional financial assets. Additionally, the investment horizon for reporting the VaR of a bank for internal model validation purposes is typically medium-term according to the latest Basel accord guidelines (see Basel Committee on Banking Supervision, 2009, p.13). Considering the necessity and the difficulty of extrapolating accurate medium-term risk forecasts from limited data, it is critical for risk taking agents to have an accurate estimate of the medium-term risk undertaken.

³ In shipping, long and variable investment horizons are often part of the risk manager's concerns, as freight rates for the next contract are negotiated at the market rate towards the end of the fixture, for a period which ranges from the length of one journey to several years.

Another issue related to market risk estimation is that there is no consensus on a preferred method for estimating market risk; see Kuester, Mittnik, and Paolella (2006) and references therein. Although a plethora of alternative models has been developed for the estimation of market risk, which model is appropriate in each market is a matter of empirical investigation.

The aim of this paper is to address the critical issues of medium-term market risk estimation and market risk model selection by examining market risk measurement in price series that exhibit some special properties, such as medium-term investment horizons, cyclicality and excessive volatility. These may arise in certain markets where there is a high degree of competition. Such markets include ocean bulk vessel freight markets, electricity prices and oil and agricultural product markets, amongst others; see, for instance, Kavussanos and Visvikis (2006) for ocean going ship freight markets. The paper utilizes a set of ocean tanker freight rate data to investigate the above issues. The turbulent and idiosyncratic nature of these markets and the relative lack of empirical research, combined with the fact that medium-term risk estimation remains challenging and under-researched, make the topic investigated in this paper particularly interesting and motivating.

Typically, owners of tanker vessels are remunerated for providing transportation services (the carriage, say, of crude oil cargo) to a charterer through voyage freight rates.⁴ Ocean going tanker vessels account for approximately half of the world's seaborne trade and relate to the transportation cost of products that are strategic for the world economy, such as crude oil and its by-products.⁵
In the shipping business, freight rate market risk constitutes the most important portion of the business risks faced by shipowners. It also constitutes a significant cost for the charterers of the vessels (see Kavussanos & Visvikis, 2006, for a systematic analysis). Efficient and timely measurement and management of freight market risks are therefore vital for firms striving to maintain a competitive advantage through the optimization of their investment decisions and the mitigation of losses in difficult periods.

Regarding the issue of medium-term risk estimation, this paper proposes an empirical scaling law which extrapolates medium-term risk forecasts from high frequency data. A comparison with the results obtained from conventional scaling laws, such as the square root of time or the tail index root of time, indicates that the proposed scaling law yields the most accurate and reliable risk forecasts.

Concerning the issue of market risk model selection, a systematic view for the liquid bulk freight rate markets is provided here for the first time. Despite the plethora of studies on VaR model selection for conventional financial assets and commodities, to our knowledge the issue of freight rate risk measurement for the liquid bulk shipping sector has not been studied in a systematic way. The exceptions, for shipping markets in general, are Tsolakis (2005) and Angelidis and Skiadopoulos (2008).
Tsolakis (2005) illustrates how VaR and ETL can be used for quantifying freight rate risk exposures for shipping companies owning a portfolio of tanker vessels, by providing an example of ETL estimation and examples of VaR estimation with the Cornish–Fisher and the historical simulation methods. Angelidis and Skiadopoulos (2008) investigate four Baltic Exchange indices (albeit only one tanker route)⁶ and find that non-parametric models perform best in estimating VaR.

⁴ There are several types of freight contracts, such as: voyage (spot) charter contracts, to move, say, crude oil between two ports; time charter contracts (to hire the vessel for a period of time, say one year, paid in $/day); contracts of affreightment (to carry several cargoes between two ports, irrespective of the vessel used, in $/ton); and bareboat charters (to lease the vessel for, say, 10 years, paid in $/month). Economic agents choose different types of freight contracts to satisfy their risk-return profiles. For instance, tanker voyage charters are riskier than time charters or contracts of affreightment; see, for example, Kavussanos (2003).
⁵ Cargoes shipped by tanker vessels fall into three main categories: crude oil and products; liquefied gas; and liquid chemicals, such as ammonia, phosphoric acid, etc.

This paper contributes to the international literature in a number of ways. First, an improved method for the estimation of medium-term risk by the VaR method is proposed and compared with existing methods. The proposed methodology yields fast and accurate risk forecasts and alleviates the pitfalls associated with existing scaling laws, which are used for extrapolating medium-term risk forecasts from limited high frequency data. Second, it investigates systematically, for the first time, the critical issue of market risk quantification for freight rate market risk exposures of the liquid bulk sector of the shipping industry, an idiosyncratic and crucial sector of the world economy. Third, a large number of alternative models are compared in order to assess their validity and select the most appropriate risk measurement models that can best account for the idiosyncrasies of cyclicality and volatility clustering found in this class of assets. For the sake of coherency and completeness, the evaluation is not limited to VaR levels but also examines the critical issue of ETL risk forecasting for losses beyond the VaR level. Fourth, it characterizes for the first time the tails of the empirical distributions of freight rate markets of the liquid bulk sector of shipping by examining their distributional properties and tail behavior. Finally, the empirical findings of this paper may assist risk taking agents in selecting the most appropriate market risk measurement models for assets whose returns exhibit characteristics similar to those of freight rate returns, such as energy and agricultural commodity markets.
The remainder of this paper is organized as follows: Section 2 outlines the models employed and the testing methodology; Section 3 discusses the data; Section 4 presents and discusses the results while Section 5 concludes the paper.
2. Methodology
A main concern for the market risk manager is the choice of the most appropriate market risk measurement method amongst the alternative models developed in the literature. Additionally, the estimation of medium-term risk is relevant in many risk management applications and, at the same time, particularly challenging when only a limited historical sample is available. Thus, finding a reliable method for extrapolating medium-term risk forecasts from high frequency data is a critical task. This section outlines the alternative VaR and ETL methodologies compared in this paper for the estimation of market risk. Furthermore, it discusses the methods used for the estimation of medium-term risk, the pitfalls associated with existing scaling laws and how risk taking agents may overcome these pitfalls by exploiting a new, reliable, accurate and fast method for the estimation of medium-term market risk.
2.1. Value at risk
Consider the price of an asset $P_t$ at time $t$ and the continuously compounded return on this asset, $r_t = \ln(P_t / P_{t-1})$, over a holding period from $t-1$ to $t$. Further, assume that $r_t$ is a stochastic process with mean $\mu_t$ and variance $\sigma_t^2$, that is, $r_t = \mu_t + \varepsilon_t$, $\varepsilon_t \sim \text{i.i.d.}(0, \sigma_t^2)$, taking values from some unknown probability distribution function $f$. VaR expresses the expected maximum loss that can be sustained over a given investment horizon (e.g. one day) at a certain confidence level $(1 - \alpha)$.
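As a small illustration of the return definition above, the following Python sketch computes continuously compounded returns from a price series and checks that they add up over a longer holding period. The function name `log_returns` and the sample index levels are hypothetical and are not taken from the paper's dataset.

```python
import math

def log_returns(prices):
    """Continuously compounded returns r_t = ln(P_t / P_{t-1})."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

# Hypothetical index levels, used only for illustration.
prices = [1000.0, 1020.0, 990.0, 1005.0]
daily = log_returns(prices)

# Log returns are additive over time: the 3-period return equals
# the sum of the three one-period returns.
three_period = math.log(prices[-1] / prices[0])
```

This additivity is what makes it straightforward to build multi-period returns from daily data when longer holding periods are of interest.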
6 Namely, the Baltic Dry Index, the Time-Charter Average Baltic Panamax Index, the Time-Charter Average Baltic Capesize Index and the Dirty Tanker TD3 Index.
Fig. 1. Average Baltic Freight Indices for Clean Tankers (BCTI) and Dirty Tankers (BDTI) sector of shipping (8/3/1998-12/9/2006). [Line chart: x-axis, Date; y-axis, Average Baltic Indices, 0 to 3500; series, BCTI and BDTI.]
If the density of $r_t$ belongs to the class of elliptical distributions,⁷ such as the normal distribution, $VaR_{t+1}^{1-\alpha}$ takes the following form:

$$VaR_{t+1}^{1-\alpha} = \hat{\mu}_{t+1} - F^{-1}(\alpha) \cdot \hat{\sigma}_{t+1} \qquad (1)$$

where $F^{-1}$ denotes the standardized quantile of the assumed distribution and $\hat{\mu}_{t+1}$, $\hat{\sigma}_{t+1}$ are the estimated/forecasted location and scale parameters, respectively. The alternative models for estimating/forecasting $\sigma_{t+1}$, and thus $VaR_{t+1}$, are presented next.

⁷ The class of elliptical distributions (also known as location-scale distributions) includes the Student's-t distribution, the exponential distribution and symmetric stable (or Pareto–Lévy) distributions, such as the normal and the Cauchy distributions (see Johnson, Kotz, & Balakrishnan, 1995 for a more comprehensive analysis of the aforementioned distributions).

2.2. Value at risk models

2.2.1. Volatility models: Random Walk, GARCH, Riskmetrics
The Random Walk (RW) model is defined by $\sigma_{t+1}^2 = \sigma_t^2 + \varepsilon_t$, $\varepsilon_t \sim \text{i.i.d.}(0, \sigma_t^2)$. Taking the expectation of next period's variance ($\sigma_{t+1}^2$) yields the random walk estimator of the standard deviation used for VaR, which simplifies to $\hat{\sigma}_{t+1} = \hat{\sigma}_t$, where $\hat{\sigma}$ is the sample standard deviation.

The AutoRegressive Conditional Heteroscedasticity (ARCH; Engle, 1982) and Generalized AutoRegressive Conditional Heteroscedasticity (GARCH; Bollerslev, 1986) models can capture the empirical regularity of time varying volatility evident in the returns of most financial time series and also in tanker freight rates; see, for instance, Kavussanos (2003) for the latter. Thus, they can be used for the volatility forecasting required for forecasting VaR. A GARCH(p,q) forecasting model takes the following form:

$$\sigma_{t+1}^2 = \omega + \sum_{j=0}^{p-1} \beta_j \sigma_{t-j}^2 + \sum_{i=0}^{q-1} a_i \varepsilon_{t-i}^2 \qquad (2)$$

where the estimated parameters $\omega$, $\beta_j$ and $a_i$ satisfy the appropriate conditions for the non-negativity of the conditional variance; see Bollerslev (1986) and Nelson and Cao (1992) for details on the non-negativity and stationarity conditions of GARCH processes. Note that when $\sum_{j=1}^{p} \beta_j + \sum_{i=1}^{q} a_i = 1$ the model is Integrated GARCH (IGARCH).

Riskmetrics (RM) exploits a restricted Integrated GARCH (IGARCH) filter for returns, with zero constant ($\omega = 0$) and pre-defined parameters $\alpha$ and $\beta$, where $\alpha = \lambda$ and $\beta = (1 - \lambda)$, summing to unity, which can be written in the following recursive form:

$$\sigma_{t+1} = \sqrt{\lambda \sigma_t^2 + (1 - \lambda) r_t^2} \qquad (3)$$

Riskmetrics thus exploits an exponentially declining weighting scheme that relies on only one parameter, the decay factor $\lambda$. According to Longerstaey (1996), a decay factor of $\lambda = 0.94$ produces on average a very good forecast of one-day volatility, while $\lambda = 0.97$ results in good estimates for one-month volatility.⁸

2.2.4. Simulation methods
The Historical Simulation (HS) method approximates future returns using a historical window of the last n days, and assumes that this window is representative of the distribution of future returns. The 100(1 − α)% VaR can be derived by calculating the empirical α-quantile of the sequence of past returns:

$$VaR_{t+1}^{1-\alpha} = Q^{\alpha}\left(\{r_t\}_{t=1}^{n}\right) \qquad (4)$$
$Q^{\alpha}$ denotes the α-quantile and $\{r_t\}_{t=1}^{n}$ the series of returns from time 1 to n, where n represents the estimation window, which is arbitrarily determined and typically ranges from 6 months to 2 years.

The Hybrid Historical Simulation (HHS) is based on non-parametric inferences made through an exponentially weighted estimator of returns; see Boudoukh et al. (1998). To derive the HHS VaR estimator, a historical data window of K returns is obtained and a weight is assigned to each of the most recent K returns $\{r_{t-n}\}_{n=0}^{K-1}$, according to the following weighting scheme:

$$\left\{ \frac{(1 - \lambda)}{(1 - \lambda^{K})} \, \lambda^{n-1} \right\}_{n=1}^{K} \qquad (5)$$

where λ is a decay parameter, reflecting the importance of the K past observations for the estimation of VaR.⁹ The returns are then sorted in ascending order and the 100 × (1 − α)% VaR is obtained by accumulating the weights until α is reached. Linear interpolation is used in order to obtain the exact quantiles of the distribution.

⁸ Note that if λ = 1 the model reduces to an equally weighted average of squared returns. Optimal decay factors were derived from root mean squared error minimizations of volatility predictions from 480 Riskmetrics time series (see Longerstaey, 1996, chapter 5.3 for details).
⁹ Note that the parameter λ was arbitrarily set equal to 0.98, although ex-post results from applying various values of λ ∈ [0.95, 0.99] did not alter the general performance of the hybrid historical simulation model.

Filtered Historical Simulation (FHS), introduced by Barone-Adesi, Giannopoulos, and Vosper (1999), is considered to be an improvement
Fig. 2. Baltic Dirty Tanker Worldscale Indices (BDTI) for routes TD3, TD5, TD7 and TD9 (8/3/1998-12/9/2006). [Line chart: x-axis, Date; y-axis, Worldscale Indices, 0 to 500; series, TD3, TD5, TD7 and TD9.]
over the traditional HS method, by accounting for time varying volatility without imposing any strict parametric assumptions about the return distribution. This method involves three steps. First, a GARCH model is fitted to the returns so as to generate filtered, close to stationary, i.i.d. residuals for the historical simulation. For instance, assume the following GARCH(1,1) model:

$$r_{t+1} = \sigma_{t+1} z_{t+1}, \quad z_t \sim \text{i.i.d.}(0, 1), \quad \sigma_{t+1}^2 = \omega + \beta \sigma_t^2 + a r_t^2 \qquad (6)$$

In-sample forecasts of the standardized GARCH errors are obtained as:

$$e_t = z_{t+1} / \sigma_{t+1} \qquad (7)$$

The second step consists of scaling bootstrapped standardized residuals by the deterministic volatility forecast, n times, where n is the number of simulations: $\{s_i\}_{i=1}^{n} = \{e_i\}_{i=1}^{n} \cdot \sigma_{t+1}$. The third step involves utilizing the simulated scaled residuals $\{s_i\}_{i=1}^{n}$, in conjunction with the mean specification of step one, to simulate the asset's price paths. VaR with a confidence level of 100 × (1 − α)% is estimated by finding the quantiles of the returns obtained from the simulation. That is,

$$VaR_{t+1}^{1-\alpha} = Q^{\alpha}\left(\{\hat{r}_t\}_{t=1}^{n}\right) \qquad (8)$$
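The three FHS steps can be sketched in Python as follows. This is a minimal illustration under several assumptions that are not part of the paper: the GARCH(1,1) parameters are supplied by the caller rather than estimated, the mean is zero as in Eq. (6), the variance recursion is started at the sample second moment, and standardized residuals are taken as returns divided by their conditional volatility. All function names are hypothetical.

```python
import math
import random

def garch11_variances(returns, omega, beta, a):
    """Variance path from the GARCH(1,1) recursion of Eq. (6).

    Returns n + 1 values: entry i is the variance in force for returns[i],
    and the last entry is the one-step-ahead forecast.
    """
    # Assumption: the recursion is started at the sample second moment.
    var = sum(r * r for r in returns) / len(returns)
    path = [var]
    for r in returns:
        var = omega + beta * var + a * r * r
        path.append(var)
    return path

def empirical_quantile(values, level):
    """Empirical quantile with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = level * (len(xs) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

def fhs_var(returns, level, omega, beta, a, n_sims=5000, seed=0):
    """One-step-ahead FHS VaR: the `level`-quantile of rescaled bootstrap draws."""
    path = garch11_variances(returns, omega, beta, a)
    # Step 1: standardize each return by its own conditional volatility.
    std_resid = [r / math.sqrt(v) for r, v in zip(returns, path)]
    sigma_next = math.sqrt(path[-1])
    # Step 2: bootstrap the standardized residuals and rescale by the forecast.
    rng = random.Random(seed)
    sims = [rng.choice(std_resid) * sigma_next for _ in range(n_sims)]
    # Step 3: with a zero mean, the simulated returns are the scaled residuals.
    return empirical_quantile(sims, level)
```

With this sign convention the VaR is reported as a (negative) return quantile, matching Eq. (8). In practice the GARCH parameters would be estimated, for example by quasi-maximum likelihood, before the bootstrap is run.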
In the case of multiple-day (T) ahead risk forecasts, for each simulation trial the above three-step estimation procedure is applied T times, with the forecasted returns used as inputs for the GARCH model.

Monte Carlo Simulation (MC) utilizes a GARCH model to account for the time varying volatility of returns. Specifically, a GARCH(1,1) model is fitted to the return data and 10,000 standardized pseudorandom normal variables are scaled by the volatility forecast. The scaled residuals are then used in conjunction with the mean specification of Eq. (6) so as to formulate price paths, and VaR is estimated from Eq. (4), where the Monte Carlo output of hypothetical returns is used instead of the realized past $r_t$.

2.2.8. Extreme value method: peaks over threshold
For the estimation of the extreme value VaR the procedure proposed by McNeil and Frey (2000) is followed. Specifically, a GARCH model is fitted to the sample (by means of the quasi-maximum likelihood technique) and the in-sample GARCH residuals (which should be i.i.d.) are used to fit a Generalized Pareto distribution (GPD) by maximum likelihood. VaR is then derived by:

$$VaR_{t+1}^{q} = \hat{\mu}_{t+1} + \hat{\sigma}_{t+1} \times q_q \qquad (9)$$

where $VaR_{t+1}^{q}$ represents the VaR corresponding to the q-th quantile of the loss distribution, $\hat{\mu}_{t+1}$ and $\hat{\sigma}_{t+1}$ are the conditional one-step-ahead forecasts of the mean and the volatility, respectively, and $q_q$ is the tail estimator, obtained from the following tail estimation formula:

$$q_q = u + \frac{\sigma}{\gamma} \left[ \left( \frac{n}{N_u} (1 - q) \right)^{-\gamma} - 1 \right] \qquad (10)$$
where u is a threshold corresponding to the value below which 10% of the estimation sample of returns lies, n is the total number of observations, $N_u$ is the number of observations exceeding the threshold u, and γ, σ are the shape and scale parameters of the fitted GPD, respectively.

2.3. Expected tail loss (ETL)
The expected tail loss¹⁰ (ETL) is a coherent risk metric (see Artzner et al., 1997, 1999) which measures the expected value of the shortfall of portfolio returns with respect to some benchmark, under the condition that a shortfall occurs. Analytically:

$$ETL = E_t\left[ r_t \mid r_t < TH_{t,\alpha} \right] \qquad (11)$$

where $TH_{t,\alpha}$ represents the benchmark of interest. Throughout this paper, the estimated VaR is used as the benchmark.¹¹

¹⁰ Different names have been given to the expected tail loss, such as conditional VaR, expected shortfall and tail conditional VaR, among others.
¹¹ As there is no closed form solution for the estimation of the ETL of every VaR model, a procedure proposed by Dowd (2002, p.42) is followed, according to which the tail of the projected distribution is sliced n times and the ETL is calculated as the equally weighted average of the estimated VaRs for each slice. Throughout this paper n is set to 5000.

2.4. Estimation of VaR for medium-term investment horizons
Although the investment horizon of interest commonly used for conventional financial series such as stocks, currencies and derivative instruments is typically one day, in the case of freight rate market risk exposures in shipping the investment horizons are variable and usually longer. Specifically, the holding period assumed depends on
Table 1
Summary statistics of scaled returns × 100 of spot Baltic Tanker Indices. Sample period: 3/8/1998 - 12/9/2006.

Statistic | BCTI | BDTI | BDTI TD3 | BDTI TD5 | BDTI TD7 | BDTI TD9
Mean | 0.030 [0.260] | 0.013 [0.743] | 0.016 [0.854] | 0.017 [0.835] | 0.013 [0.888] | 0.022 [0.884]
Standard Deviation | 1.229 | 1.908 | 5.247 | 3.868 | 4.231 | 5.623
Min | −5.556 | −16.497 | −50.199 | −20.806 | −22.387 | −41.989
Max | 9.928 | 8.120 | 39.961 | 26.097 | 45.582 | 46.239
Skewness | 1.231 [0.0651] | −0.385 [0.001] | 0.227 [0.001] | 0.613 [0.000] | 1.699 [0.000] | 0.574 [0.000]
Kurtosis | 9.942 [0.000] | 8.109 [0.000] | 17.743 [0.000] | 10.251 [0.000] | 18.771 [0.000] | 17.431 [0.000]
J-B | 4595 [0.000] | 2260 [0.000] | 18,429 [0.000] | 4580 [0.000] | 22,050 [0.000] | 17,753 [0.000]
ADF(lag) 1st diffs | −17.260(1) [0.000] | −19.202(3) [0.034] | −25.528(0) [0.001] | −28.453(0) [0.004] | −20.851(1) [0.000] | −15.234(7) [0.000]
Q(1) | 858 [0.000] | 776 [0.000] | 475 [0.000] | 361 [0.000] | 583 [0.000] | 265 [0.000]
Q(10) | 2960 [0.000] | 1572 [0.000] | 682 [0.000] | 503 [0.000] | 1011 [0.000] | 403 [0.000]
Q(20) | 3155 [0.000] | 1624 [0.000] | 709 [0.000] | 520 [0.000] | 1237 [0.000] | 530 [0.000]
Q²(1) | 226 [0.000] | 69 [0.000] | 68 [0.000] | 43 [0.000] | 47 [0.000] | 89 [0.000]
Q²(10) | 494 [0.000] | 123 [0.000] | 149 [0.000] | 75 [0.000] | 58 [0.000] | 109 [0.000]
Q²(20) | 502 [0.000] | 135 [0.000] | 212 [0.000] | 95 [0.000] | 60 [0.000] | 125 [0.000]
ARCH(1) | 226 [0.000] | 69 [0.000] | 68 [0.000] | 42 [0.000] | 47 [0.000] | 89 [0.000]
ARCH(10) | 290 [0.000] | 88 [0.000] | 105 [0.000] | 60 [0.000] | 53 [0.000] | 100 [0.000]
ARCH(20) | 293 [0.000] | 94 [0.000] | 126 [0.000] | 72 [0.000] | 55 [0.000] | 110 [0.000]
HKKP | 3.8 | 6.2 | 3.8 | 3.2 | 4.5 | 3.0

Notes: Min and Max are the minimum and maximum values of the sample data, respectively. Skewness and kurtosis are the estimated centralized third and fourth moments of the data; their asymptotic distributions under the null are $\sqrt{T}\,\hat{\alpha}_3 \sim N(0, 6)$ and $\sqrt{T}\,(\hat{\alpha}_4 - 3) \sim N(0, 24)$. J–B is the Jarque–Bera (1980) test for normality; the statistic is χ²(2) distributed. ADF is the Augmented Dickey and Fuller (1981) test. The ADF regressions include an intercept term. The lag length of the ADF test (in parentheses) is determined by minimizing Schwarz's Bayesian Information Criterion. Q(i) and Q²(i) are the Ljung and Box (1978) Q statistics on the first i = 1, 10, 20 lags of the sample autocorrelation function of $r_t$ and $r_t^2$, testing for autocorrelation and heteroscedasticity, respectively. These tests are distributed as χ²(i). ARCH(i) is a Lagrange multiplier (LM) test for Autoregressive Conditional Heteroskedasticity (ARCH) of order i = 1, 10, 20 (Engle, 1982). HKKP is the Huisman et al. (2001) tail index estimator, which measures the degree of tail fatness of a distribution as well as the number of existing moments: a relatively high tail index corresponds to a relatively low probability of extreme events. The Huisman et al. (2001) tail index estimator is equal to the estimate of the intercept $\beta_0$ of the regression $\gamma(k) = \beta_0 + \beta_1 k + \varepsilon(k)$, $k = 1, \ldots, \kappa$, where κ is equal to half the sample size, $\gamma(k) = \frac{1}{k} \sum_{j=1}^{k} \ln(x_{n-j+1}) - \ln(x_{n-k})$ is Hill's estimator (see Hill, 1975) of the tail index, x denotes the order statistics of the returns and k the number of order statistics (tail observations) of the sample. Numbers in square brackets [∙] indicate exact significance levels.
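The Hill estimator and the regression described in the notes to Table 1 can be sketched as below. This is an illustrative Python sketch, not the authors' implementation: it fits the regression by ordinary least squares (Huisman et al., 2001, use a weighted variant), it assumes the sample consists of strictly positive magnitudes such as tail losses, and the function names are hypothetical. The sketch returns the intercept on the γ scale; if the tail index is taken as the reciprocal 1/γ, as is common in the extreme value literature, the result would need to be inverted.

```python
import math

def hill_estimator(sample, k):
    """Hill (1975) estimator gamma(k) from the k largest order statistics."""
    xs = sorted(sample)          # ascending order statistics x_1 <= ... <= x_n
    n = len(xs)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    top_logs = [math.log(x) for x in xs[n - k:]]        # ln x_{n-j+1}, j = 1..k
    return sum(top_logs) / k - math.log(xs[n - k - 1])  # minus ln x_{n-k}

def ols_intercept_slope(xs, ys):
    """Ordinary least squares fit of y = b0 + b1 * x."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def hkkp_gamma(sample):
    """Intercept b0 of the regression gamma(k) = b0 + b1 * k, k = 1..n/2."""
    kappa = len(sample) // 2
    ks = list(range(1, kappa + 1))
    gammas = [hill_estimator(sample, k) for k in ks]
    b0, _ = ols_intercept_slope(ks, gammas)
    return b0
```

Regressing γ(k) on k and reading off the intercept is meant to reduce the sensitivity of the estimate to any single choice of k, which is the main practical difficulty with the plain Hill estimator.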
Table 2
Summary of Christoffersen's backtesting results: VaR models satisfying Christoffersen's conditional coverage test.
Rows: markets BCTI, BDTI, BDTI TD3, BDTI TD5, BDTI TD7 and BDTI TD9, each at α = 5% and α = 1%.
Columns: RW(50), RM(250), HS(700), HHS(250), FHS(700), GARCH(700), MC(700), EVT(1000).
[Cell entries (‘X’ marks) are not recoverable from the extracted text.]

Notes: This table summarizes the results of the competing VaR models which satisfy Christoffersen's conditional coverage tests across the liquid bulk freight rate markets. The abbreviations RW, RM, HS, HHS, FHS, GARCH, MC, EVT stand for the random walk, the Riskmetrics, the historical simulation, the hybrid historical simulation, the filtered historical simulation, the GARCH with normal innovations, the Monte Carlo simulation and the extreme value peaks over threshold VaR methods, respectively. Numbers in brackets denote the length of the optimal estimation windows used for the estimation of VaR. An ‘X’ indicates that the respective model has passed Christoffersen's conditional coverage test at the 5% level of significance for the specific market.
the investment examined and could be short-term, medium-term or long-term.¹² As with other time series, due to data availability restrictions it may be difficult to obtain a long history of freight rate data in order to estimate medium-term risk. Thus, the risk manager has to resort either to simulation or to time aggregation by employing a scaling law. Throughout this paper, time scaling of volatility and time scaling of quantiles are used. In a general setting the mθ-period VaR derived by the square root of time law is given by

$$VaR_{t+m\theta}^{1-\alpha} = \hat{\sigma}_{t+1} \sqrt{m\theta} \, F_{\alpha}^{-1} - \hat{\mu}_{t+1} \, m\theta \qquad (12)$$

where mθ denotes the investment horizon, which is the product of a reference data frequency θ (typically set to 1 day) and of m, the multiple needed to produce the investment horizon of interest, and $\hat{\sigma}_{t+1}$, $\hat{\mu}_{t+1}$ are the forecasts of the volatility and the drift of the random variable for the reference data frequency θ, respectively. The square root of time rule, when applied to volatility scaling, requires that returns are conditionally homoscedastic and serially uncorrelated. These assumptions contradict empirical findings on financial market returns (see Engle, 1982).

¹² The case of other fixtures, such as time charters, where the holding period could be rather long (for instance 6 months, one year or longer) is not investigated, since measuring freight risk for such exposures is beyond the scope of this paper.

To investigate the alternative of scaling quantiles, three different scaling laws are used: the square root of time law; a scaling law based on extreme value theory and the estimated tail index (or characteristic exponent); and, finally, an empirically determined scaling law. Scaling quantiles is based on self-similarity theory, which constitutes a necessary condition for time scaling of quantiles of distributions (see Embrechts & Maejima, 2000 for further details on self-similarity). The square root of time rule, when applied to scaling quantiles, requires the assumption of normality in addition to homoscedastic and serially uncorrelated returns.

The extreme value law for scaling quantiles is based on the work of McNeil and Frey (2000), according to which multiple-day VaR is estimated by the following power scaling law:

$$VaR_{t+k}^{1-\alpha} = k^{1/\xi} \, VaR_{t+1}^{1-\alpha} \qquad (13)$$

where 1/ξ is a characteristic exponent or the tail index parameter, obtained by maximum likelihood estimation by fitting a GPD to the return data.

Finally, the empirical scaling law is based on the empirical quantiles of the P/L distribution (see Provizionatou & Markose, 2005): assume

$$VaR_{t+k}^{1-\alpha} = k^{s} \, VaR_{t+1}^{1-\alpha} \qquad (14)$$

where k is the investment horizon and s is a scaling exponent. Solving Eq. (14) for s yields:

$$s = \frac{\log\left[VaR_{t+k}^{1-\alpha}\right] - \log\left[VaR_{t+1}^{1-\alpha}\right]}{\log(k)} \qquad (15)$$

Eq. (15) can be written in terms of the empirically determined VaRs as:

$$s \approx \frac{\log\left[eVaR_{t+k}^{1-\alpha}\right] - \log\left[eVaR_{t+1}^{1-\alpha}\right]}{\log(k)} \qquad (16)$$

where $eVaR_{t+k}^{1-\alpha}$ and $eVaR_{t+1}^{1-\alpha}$ are the empirically determined VaRs, that is, the α-th percentile of the k-day and one-day returns of the available historical estimation sample, respectively. A k-period VaR is obtained by the empirical scaling law as follows: first, the scaling exponent s is derived empirically from Eq. (16) by using the empirically determined VaRs (eVaRs). Second, the k-period VaR ($VaR_{t+k}^{1-\alpha}$) is estimated from Eq. (14) by scaling with $k^{s}$ the one-step-ahead forecast of the daily VaR ($VaR_{t+1}^{1-\alpha}$) obtained from the historical simulation model detailed in Section 2.2.4.
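The two-step empirical scaling procedure can be sketched in Python as follows. This is a minimal illustration under assumptions that the text does not specify: k-day returns are built as overlapping sums of daily log returns, quantiles are negative return quantiles so their magnitudes are used inside the logarithms, and all function names are hypothetical. Setting s = 0.5 in `scale_var` reproduces the square root of time rule for quantiles, and s = 1/ξ reproduces Eq. (13).

```python
import math

def empirical_quantile(values, level):
    """Empirical quantile with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = level * (len(xs) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

def k_period_returns(daily_returns, k):
    """Overlapping k-period log returns (sums of k consecutive daily returns)."""
    return [sum(daily_returns[i:i + k])
            for i in range(len(daily_returns) - k + 1)]

def empirical_scaling_exponent(daily_returns, k, level):
    """Scaling exponent s of Eq. (16) from the empirical 1-day and k-day VaRs."""
    evar_1 = empirical_quantile(daily_returns, level)
    evar_k = empirical_quantile(k_period_returns(daily_returns, k), level)
    # Assumption: both quantiles are losses (negative), so magnitudes are used.
    return (math.log(abs(evar_k)) - math.log(abs(evar_1))) / math.log(k)

def scale_var(var_1, k, s):
    """k-period VaR from the one-day VaR via Eq. (14): VaR_k = k**s * VaR_1."""
    return (k ** s) * var_1
```

A typical use would be to estimate s once from the historical sample with `empirical_scaling_exponent` and then apply `scale_var` to each one-day VaR forecast.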
2.5. Backtesting of VaR and expected tail loss
Backtesting is an integral part of the financial risk management process. It gives invaluable information regarding the quality of the risk measurement models and constitutes a criterion for VaR model selection. For backtesting of the alternative one-step-ahead forecasts of VaR models, Christoffersen's (1998) and Lopez's (1998) tests, presented below, are applied.

2.5.1. Tests of unconditional, independence and conditional coverage
The test of unconditional coverage consists of testing that $E[I_t] = p$ against the alternative $E[I_t] \neq p$, where $I_t$ is an indicator variable taking the value 1 if the loss exceeds the estimated VaR (this is called a violation or ‘hit’) and zero otherwise, and $p$ equals VaR's theoretical coverage rate α. This is equivalent to testing that the sequence $\{I_{t+1}\}_{t=1}^{T}$ follows an i.i.d. Bernoulli process with parameter $p$.
Table 3 Modified Diebold Mariano Statistics for forecasting accuracy of VaR models. Market
PANEL A: 95% VaR
BCTI
MC vs GARCH 35.177 [0.000] FHS vs MC − 48.014 [0.000] FHS vs EVT − 3.375 [0.000] EVT vs GARCH 5.0973 [0.000] HHS vs GARCH 4.655 [0.000] FHS vs GARCH 5.720 [0.000] GARCH vs FHS − 2.886 [0.003]
BDTI
TD3
TD5
TD7
TD9
PANEL B: 99% VaR EVT vs GARCH 10.517 [0.000] GARCH vs MC − 47.584 [0.000] GARCH vs EVT − 6.727 [0.000]
FHS vs GARCH 8.317 [0.000] MC vs GARCH 42.546 [0.000] MC vs FHS 44.990 [0.000]
EVT vs MC − 45.133 [0.000] FHS vs GARCH 6.006 [0.000]
MC vs GARCH 66.501 [0.000] EVT vs GARCH 3.018 [0.002] POT vs FHS 0.790⁎⁎ [0.216]
HS vs EVT − 1.326⁎ [0.056] FHS vs HS 0.837⁎⁎ [0.212]
FHS vs EVT − 2.509 [0.011] EVT vs HS 4.463 [0.001]
GARCH vs EVT − 3.297 [0.002] GARCH vs HS 1.661 [0.058]
MC vs EVT 19.917 [0.000] MC vs HS 7.251 [0.000]
EVT vs HS 0.755⁎⁎ [0.229] HS vs FHS 1.198⁎⁎ [0.125]
EVT vs HS 1.125⁎⁎ [0.136] HHS vs FHS 4.522 [0.000] HS vs HHS 2.072 [0.030] FHS vs HS 0.168⁎⁎ [0.434]
EVT vs FHS 1.988 [0.033] FHS vs HHS 3.056 [0.005] EVT vs HS 0.645⁎⁎ [0.257]
EVT vs HHS 2.474 [0.015]
RM vs HHS 1.455⁎ [0.078] HHS vs HS 1.195⁎⁎ [0.124]
FHS vs GARCH 1.596⁎ [0.069]
Notes: This table reports Modified Diebold Mariano (MDM) test statistics for the null of equal forecasting accuracy within the VaR forecasting models. A benchmark model for each market, which exhibits the lowest quadratic loss function score, is compared vis-à-vis the remaining VaR models satisfying Christoffersen's tests. The MDM test statistic is given by:

$$MDM = \left(\frac{n-1}{n}\right)^{1/2} \frac{\bar{d}}{\sqrt{\mathrm{Var}(\bar{d})}} \sim t(n-1)$$

where $d_t$, $\bar{d}$, $n$ and $\mathrm{Var}(\bar{d})$ are the loss differential ($d_t = g(\varepsilon_{it}) - g(\varepsilon_{jt})$, with $g(\varepsilon_{it})$, $g(\varepsilon_{jt})$ being the squared violations or forecast errors from the estimation of models i and j, respectively); the mean loss differential; the number of observations in the sample of the loss differential; and the variance of $\bar{d}$, respectively. Entries in square brackets are p-values. The abbreviations RM, HS, HHS, FHS, GARCH, MC, EVT stand for the Riskmetrics, the historical simulation, the hybrid historical simulation, the filtered historical simulation, the GARCH with normal innovations, the Monte Carlo simulation and the extreme value peaks over threshold VaR methods, respectively. ⁎⁎ and ⁎ indicate significance at the 5% and 10% significance levels, respectively. VaR models in bold are those selected by the MDM tests for each market.
The likelihood ratio (LR) test statistic for the unconditional coverage test follows a chi-square distribution with one degree of freedom:

$$LR_{uc}(p) = -2\ln\left[\frac{(1-p)^{T_0}\,p^{T_1}}{(1 - T_1/T)^{T_0}\,(T_1/T)^{T_1}}\right] \sim \chi^2(1) \qquad (17)$$
where $T_0$ and $T_1$ are the number of zeros and ones in the ‘hit’ (or violation) sequence. An enhancement of the unconditional backtesting framework is the conditional coverage test, which takes into account not only the correct unconditional coverage but also ensures that the hit sequence is i.i.d. Christoffersen's LR test for independence ($LR_{ind}$), against an explicit first-order Markov alternative, is given by:

$$LR_{ind}(p) = -2\ln\left[\frac{(1 - T_1/T)^{T_0}\,(T_1/T)^{T_1}}{(1-\hat{\pi}_{01})^{T_{00}}\,\hat{\pi}_{01}^{T_{01}}}\right] \sim \chi^2(1) \qquad (18)$$

where $T_{i,j}$, $i,j = 0,1$, is the number of observations with a j following an i in the $I_t$ sequence, and $\hat{\pi}_{01} = T_{01}/(T_{00} + T_{01})$. Finally, a third test proposed by Christoffersen (1998) involves examination of the correct conditional coverage, which requires the ‘hit’ sequence to pass both the independence and the correct unconditional coverage tests. The joint test of unconditional coverage and independence involves estimation of the following LR statistic:

$$LR_{cc} = LR_{uc} + LR_{ind} \sim \chi^2(2) \qquad (19)$$

This statistic follows a chi-square distribution with two degrees of freedom.

The backtesting of the alternative models for multiple-day-ahead forecasts is conducted by comparing the expected number of violations to the realized number of violations, in terms of coverage ratios. This naïve backtesting procedure is used instead of Christoffersen's backtesting in order to avoid spurious statistical significance in the test of independence, stemming from the dependencies induced by the use of overlapping data in the estimation of medium-term risk.

2.5.4. Regulatory evaluation of VaR and ETL
To distinguish between well-specified VaR models, the following regulatory loss function (RLF), which addresses the magnitude of the violations, is utilized in this paper; see Lopez (1998):

$$I^{M}_{VaR,t} = \begin{cases} (r_t - VaR_t)^2 & \text{if } r_t < VaR_t \\ 0 & \text{otherwise} \end{cases} \qquad (20)$$

A similar loss function is used for the backtesting of the ETL forecasts, defined by:

$$I^{M}_{ETL,t} = \begin{cases} (r_t - ETL_t)^2 & \text{if } r_t < VaR_t \\ 0 & \text{otherwise} \end{cases} \qquad (21)$$
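The loss functions of Eqs. (20) and (21) can be sketched in Python as below. This is only an illustration: the function names are hypothetical, and returns, VaR and ETL forecasts are assumed to be aligned sequences expressed as returns (so that losses and their VaR or ETL forecasts are negative numbers).

```python
def var_loss(returns, var_forecasts):
    """Eq. (20): (r_t - VaR_t)^2 on days when r_t < VaR_t, zero otherwise."""
    return [(r - v) ** 2 if r < v else 0.0
            for r, v in zip(returns, var_forecasts)]


def etl_loss(returns, var_forecasts, etl_forecasts):
    """Eq. (21): (r_t - ETL_t)^2 on days when r_t < VaR_t, zero otherwise."""
    return [(r - e) ** 2 if r < v else 0.0
            for r, v, e in zip(returns, var_forecasts, etl_forecasts)]


def mean_score(losses):
    """Mean RLF score of a model; lower values indicate a better model."""
    return sum(losses) / len(losses)
```

Squaring the distance penalizes large violations more heavily than small ones, which is what allows the score to rank models that all pass the coverage tests.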
Both $I^{M}_{VaR,t}$ and $I^{M}_{ETL,t}$ are time series of numerical scores of model M (VaR and ETL, respectively) at time t, penalizing large deviations between negative returns and VaR or ETL forecasts, respectively, provided that these negative returns exceed the respective VaR or ETL forecasts. Models with lower mean values of $I^{M}_{VaR,t}$ and $I^{M}_{ETL,t}$, that is, lower values of the mean RLF scores, are considered to be superior to models with higher RLF scores. Let $g(\varepsilon_{it})$ and $g(\varepsilon_{jt})$ be the squared violations (or forecast errors) resulting from the estimation of models i and j, respectively. Then,
Table 4 Modified Diebold Mariano Statistics for forecasting accuracy of expected tail loss models (95% and 99% VaR threshold). PANEL A: Modified Diebold Mariano Statistics of expected tail loss models (95% VaR threshold) Market BCTI
BDTI TD3 TD5 TD7 TD9
RW vs MC
RM vs MC
HS vs MC
HHS vs MC
FHS vs MC
GARCH vs MC
EVT vs MC
26.125 [0.000] FHS vs GARCH 5.092 [0.000] RW vs HS 1.375⁎ [0.087] RM vs RW − 1.012⁎⁎ [0.158] RW vs GARCH 0.463⁎⁎ [0.323] RM vs RW 2.592 [0.006] RW vs FHS 1.836 [0.036]
17.539 [0.000]
32.670 [0.000]
18.541 [0.000]
− 3.542 [0.000]
− 8.103 [0.000]
29.321 [0.000]
RM vs HS 0.115⁎⁎ [0.454] HS vs RW 2.215 [0.015] RM vs GARCH − 0.060⁎⁎ [0.323] HS vs RW 11.290 [0.000] RM vs FHS 2.151 [0.018]
HHS vs HS 1.926 [0.029] HHS vs RW 0.424⁎⁎ [0.337] HS vs GARCH 8.611 [0.000] HHS vs RW 6.551 [0.000] HS vs FHS 0.606⁎⁎ [0.273]
FHS vs HS 2.188 [0.016] FHS vs RW 2.540 [0.007] HHS vs GARCH 4.748 [0.000] FHS vs RW 13.889 [0.000] HHS vs FHS 2.036 [0.023]
GARCH vs HS 1.395⁎ [0.084] GARCH vs RW 2.439 [0.009] FHS vs GARCH 3.614 [0.000] GARCH vs RW 7.323 [0.000] GARCH vs FHS 1.524⁎ [0.066]
MC vs HS 1.909 [0.031] MC vs RW 1.598⁎ [0.057] MC vs GARCH 1.133⁎⁎ [0.131] MC vs RW 6.005 [0.000] MC vs FHS 2.039 [0.023]
EVT vs HS 18.020 [0.000] EVT vs RW 5.085 [0.000] EVT vs GARCH 12.629 [0.000] EVT vs RW 22.649 [0.000] EVT vs FHS 3.971 [0.000]
GARCH vs FHS − 1.160⁎⁎ [0.129] GARCH vs FHS 0.903⁎⁎ [0.189] GARCH vs HS 5.256 [0.000] GARCH vs HS 8.609 [0.000] GARCH vs HS 11.393 [0.000] GARCH vs HHS 2.860 [0.004]
MC vs FHS 8.143 [0.000] MC vs FHS 1.611⁎[0.061] MC vs HS 5.443 [0.000] MC vs HS 11.233 [0.000] MC vs HS 12.365 [0.000] MC vs HHS 4.239 [0.000]
EVT vs FHS 14.813 [0.000] EVT vs FHS 6.993 [0.000] EVT vs HS 3.734 [0.001] EVT vs HS 9.336 [0.000] EVT vs HS 13.597 [0.000] EVT vs HHS − 1.050⁎⁎ [0.152]
PANEL B: Modified Diebold Mariano Statistics of expected tail loss models (99% VaR threshold) Market BCTI BDTI TD3 TD5 TD7 TD9
RW vs FHS 14.883 [0.000] RW vs FHS 0.800⁎⁎ [0.216] RW vs HS 3.083 [0.003] RW vs HS 8.551 [0.000] RW vs HS 5.774 [0.000] RW vs HHS 3.208 [0.002]
RM vs FHS 9.242 [0.000] RM vs FHS 0.422⁎⁎ [0.339] RM vs HS 2.604 [0.008] RM vs HS 8.371 [0.000] RM vs HS 7.535 [0.000] RM vs HHS 3.508 [0.001]
HS vs FHS 10.383 [0.000] HS vs FHS − 11.075 [0.000] HHS vs HS − 4.299 [0.000] HHS vs HS 5.098 [0.000] HHS vs HS 1.350 [0.000] HS vs HHS − 4.519 [0.000]
HHS vs FHS 7.655 [0.000] HHS vs FHS 0.206⁎⁎[0.420] FHS vs HS 1.404⁎ [0.087] FHS vs HS − 0.532⁎⁎ [0.300] FHS vs HS 4.964 [0.000] FHS vs HHS − 1.353⁎ [0.094]
Notes: See notes in Table 3. This table reports MDM test statistics for the null of equal forecasting accuracy within ETL forecasting models. The abbreviations RW, RM, HS, HHS, FHS, GARCH, MC, EVT stand for random walk, Riskmetrics, historical simulation, hybrid historical simulation, filtered historical simulation, GARCH with normal innovations, Monte Carlo simulation and filtered extreme value peaks over threshold ETL models, respectively. ⁎⁎ and ⁎ indicate significance at the 5% and 10% significance levels, respectively. ETL models in bold are those selected by the MDM tests for each market.
the null hypothesis of equal expected forecasting accuracy between the risk models i and j is equivalent to testing that the expected value of the loss differential $d_t = g(\varepsilon_{it}) - g(\varepsilon_{jt})$ is zero, that is, $E[d_t] = 0$. Harvey, Leybourne, and Newbold (1997) introduced a modified version of the original Diebold and Mariano (1995) statistic for testing the null hypothesis of equal expected forecasting accuracy. This is given by:

$$MDM = \left(\frac{n-1}{n}\right)^{1/2} \frac{\bar{d}}{\sqrt{\mathrm{Var}(\bar{d})}} \sim t(n-1) \qquad (22)$$
where $n$ is the sample size of the violation series and $\mathrm{Var}(\bar{d})$ is the variance of $\bar{d}$. In the case of backtesting VaR models, the modified Diebold and Mariano test (henceforth referred to as MDM) is undertaken only for VaR models which survived Christoffersen's backtesting. In the case of ETL models, all forecasting models are considered.

3. Data
The dataset used in this study includes six freight rate indices: the Average Baltic Clean¹³ Tanker index (BCTI) and the Average Baltic Dirty Tanker index (BDTI), covering the two major sub-sectors of tanker shipping markets; and four popular Worldscale¹⁴ routes of the Baltic Dirty Index: route TD3 (Middle East Gulf to Japan, for Very Large Crude Carriers (VLCC) vessel sizes of 250,000 deadweight tonnes (dwt henceforth)), the TD5 route (West Africa to US Atlantic Coast (USAC), for Suezmax vessel sizes of 130,000 dwt), route TD7 (North Sea to Continent, for Aframax vessel sizes of 80,000 dwt) and route TD9 (Caribbean to US Gulf, for Panamax vessel sizes of 70,000 dwt). The BDTI and BCTI freight rate indices are averages of individual route indices, and can be thought of as imitating portfolios of freight rate positions, covering fleets of vessels. Route indices simulate risk exposures of vessels employed in individual routes of the dirty tanker sector. The individual route indices are more relevant for freight market risk exposures of smaller companies employing vessels in any one of the investigated routes only. It should be noted here that the selected individual route freight rate indices correspond to the most actively utilized routes within the dirty bulk tanker segment. Furthermore, they are underlying assets of freight rate derivatives and as a consequence they can be hedged. The particular series are chosen

¹³ The distinction between ‘clean’ and ‘dirty’ indices is due to the type of cargo that vessels operating in the constituent routes of each index transport. In general, oil products can be classified into two broad categories: clean and dirty oil products. Clean products consist of lighter (sweet) distillates, such as gasoline and kerosene, which are usually shipped via vessels with coated tanks to ensure the cleanliness of the product. Dirty products involve lower distillates and residual oil, which is usually shipped in conventional tankers.

¹⁴ WorldScale is a system of pricing tanker freight as a percentage of expected freight rates. It is published annually by the WorldScale Association, a non-profit group based in London and New York. Worldscale expresses market levels of freight in terms of a direct percentage of scale rates. Scale rates are derived assuming that a “nominal” tanker functions on round voyages between designated ports. For instance, Worldscale 100 (WS100) means 100 points or 100% of the published rate or, in other words, the published rate itself, sometimes referred to as Worldscale flat, while Worldscale 30 means 30 points or 30% of the published rate. Tanker freight rates are freely negotiated percentage adjustments to the scale rates, which determine the actual rate used for the payment of freight.
Table 5
Evaluation of VaR methods in forecasting medium-term risk (number of VaR violations per BDTI route).

Panel  VaR method (estimation sample)         VaR level  TD3   TD5   TD7   TD9
A      Random walk (50)                       95%        128   116   92    121
                                              99%        73    58    46    72
B      Empirically scaled (50, 700)           95%        59    72    67    61
                                              99%        35    16    17    14
C      Riskmetrics (250)                      95%        114   125   74    120
                                              99%        60    65    37    64
D      Historical simulation (700)            95%        110   153   85    115
                                              99%        55    89    26    52
E      Hybrid historical simulation (250)     95%        157   146   86    134
                                              99%        51    57    21    32
F      Filtered historical simulation (700)   95%        122   140   101   136
                                              99%        26    44    25    35
G      GARCH-normal (700)                     95%        34    101   69    103
                                              99%        11    41    38    47
H      Monte Carlo (700)                      95%        76    121   55    109
                                              99%        14    46    26    40
I      Extreme value scaling (1000)           95%        101   238   92    218
                                              99%        37    148   30    87
Notes: The first two columns give the number of violations of the respective VaR model at the 95% and 99% level for each of the markets followed. An exception occurs when $r_{t|t-1} < VaR_{t|t-1}$. The investment horizons are set to 17, 17, 17, 2 and 6 days for the BDTI TD3, TD5, TD7 and TD9 routes, respectively. Optimal exception values are 50 and 10 for the 95% and 99% level VaR, respectively. Bold entries indicate the models which exhibit exception values closer to the optimal exception values. The estimation sample for the empirically scaled VaR is 50 observations for the estimation of the empirical VaRs and 700 observations for the estimation of the historical simulation one-day VaR.
so as to cover the two major segments of liquid bulk shipping (dirty and clean). Also, the specific routes and vessels cover a wide spectrum of vessel sizes in the tanker market, that is, VLCC, Suezmax, Aframax and Panamax vessels carrying crude oil. The time period investigated extends from 3/8/1998 to 12/9/2006, utilizing a total of 2035 daily observations. The dataset is obtained from Clarkson's research services. The evaluation sample covers the period between 12/9/2002 and 12/9/2006, yielding 1000 daily observations. The estimation sample is variable, depending on the VaR model employed.¹⁵

Figs. 1 and 2 show graphically how freight rates have evolved over time for each of the ‘dirty’ and ‘clean’ segments of tanker shipping and for the most popular routes in the ‘dirty’ tanker sector, respectively. In Fig. 1, the BDTI and BCTI series seem to move closely together. The BDTI demonstrates a more volatile profile, particularly for the post-2003 period. As regards the route indices, they exhibit large spikes. The volatile behavior of freight rates, especially when the (competitive) freight market is strong, is attributed mainly to two factors: the low elasticity (steepness) of the supply curve for freight services when the market is strong and vessel utilization is near full employment, and the seasonal behavior documented in Kavussanos and Alizadeh (2002). It is worth mentioning that all of the series display a mean reverting pattern. This is a consequence of the Worldscale cost adjustment system.

Summary statistics on the first difference of the logarithmic series, presented in Table 1, indicate that unconditional means are statistically zero for all series. The individual routes exhibit higher standard deviation values in comparison to those of the two indices, the portfolios of vessels. All return series exhibit significant positive skewness. High kurtosis values of all returns indicate that their distributions are more fat tailed than the normal distribution, implying a higher probability of extreme events. Jarque–Bera tests confirm that all return series deviate from normality. Huisman, Koedijk, Kool, and Palm (2001) tail index algorithm¹⁶ (HKKP)¹⁷ estimates imply finite second moments for all series, while the smallest values of the tail index estimates, attained in the case of the TD3, TD5 and TD9 series, imply heavier tailed distributions compared to the rest of the series. Tail index values are thus in line with the excess kurtosis results. Augmented Dickey-Fuller unit root tests (Dickey & Fuller, 1981) show that all variables are first difference stationary. The Ljung–Box Q-statistics (Ljung & Box, 1978) on the 1st, 10th and 20th lags of the sample autocorrelation function of the return and squared return series indicate significant serial correlation and heteroskedasticity, respectively. Lagrange Multiplier tests illustrate ARCH effects in the residuals for all of the series.

¹⁵ Optimal estimation windows were chosen through experimenting with various estimation windows.

¹⁶ The HKKP is the Huisman et al. (2001) tail index estimator for Pareto-type distributions with tail index γ, which measures the degree of tail fatness of the distribution as well as the number of existing moments: a relatively high tail index corresponds to a relatively low probability of extreme events. It is equal to the estimate of the intercept $\beta_0$ of the regression $\gamma_H(k) = \beta_0 + \beta_1 k + \varepsilon(k)$, $k = 1,\dots,\kappa$, where $\kappa$ is equal to half the sample size and $\gamma_H(k) = \frac{1}{k}\sum_{j=1}^{k} \ln x_{(n-j+1)} - \ln x_{(n-k)}$ is Hill's maximum likelihood estimate (see Hill, 1975) for the inverse of the tail index ($\gamma_H = 1/\gamma$), where $x_{(n)}$ are the nth order statistics of the negative returns (that is, $x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}$) and $k \in \{1,2,\dots,n-1\}$ is a number corresponding to the kth ordered observation, which is essentially a threshold.

¹⁷ The tail index characterizes the rate at which probability mass falls away in the tail of a distribution; to this end, the lower the tail index, the higher the probability of extreme events.

4. Empirical findings
4.1. VaR and ETL estimates for short investment horizons
Table 2 presents a summary of the results for Christoffersen's conditional coverage tests for one-day-ahead VaR forecasts of the eight competing VaR models for the investigated markets. It is worth noting that both daily and medium-term investment horizons are considered in this paper, which correspond to the required market risk estimation horizons for large and small shipping companies, respectively. The daily investment horizon is a convention that seems to be relevant in the case of shipping companies that own large vessel portfolios and thus are more likely to engage in negotiating voyage fixtures on a single route or on the indices' constituent routes on a daily basis. On the other hand, medium-term ‘portfolio’ risk horizons are relevant in the case of small shipping companies that own a single or a small number of vessels. These are more likely to negotiate voyage fixtures on a medium-term basis, due to the absence of vessels to be hired until the end of the fixture of the existing vessel. A VaR model is considered to be well specified if it provides correct conditional coverage. VaR models which exhibit correct conditional coverage are marked with the symbol ‘X’ in Table 2.
Results indicate that the most “ill-suited” models for quantifying VaR are the random walk (RW) and the Riskmetrics (RM) models. This is attributed to their inability to deal with the issue of violation clustering in Christoffersen's backtesting context,¹⁸ resulting in the rejection of the test of independence. This could be due to the failure of the RW and Riskmetrics models to account for dependencies in these series that are driven by high cyclicality and excessive volatility dynamics. Furthermore, the inability of the RM model to be successful for the lower loss quantiles could be due to the highly volatile nature of freight rate returns, which severely penalizes less adaptive VaR approaches with fixed parameters (such as the Riskmetrics model). Models with time varying parameters perform better in capturing these effects.

In general, in terms of Christoffersen's backtesting, the VaR models which yield sufficiently accurate VaR forecasts and are robust with respect to the confidence levels employed depend on the route being investigated and the required level of confidence. At the 95% confidence level, the GARCH with normal innovations (henceforth referred to as GARCH) VaR model performs best for all the investigated series. The filtered peaks over threshold (EVT) and the Monte Carlo (MC) VaR models pass the tests for five out of the six series. It seems that these more sophisticated models are more appropriate to account for the time varying volatilities prevalent in these series, a fact documented in Kavussanos (1996, 2003). At the 99% level of confidence, the filtered peaks over threshold (EVT), the historical simulation (HS) and the filtered historical simulation (FHS) models satisfy the conditional coverage test for all investigated series. The hybrid historical simulation model (HHS) is successful in 5 out of 6 cases. In general, more sophisticated VaR models satisfy Christoffersen's backtesting for both the 95% and 99% levels.
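The coverage tests behind these results can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' code: the function names are hypothetical, the hit sequence is assumed to be a list of integers 0 and 1 with at least two entries, and the independence statistic uses the standard first-order Markov likelihood of Christoffersen (1998), which also involves the estimate $\hat{\pi}_{11} = T_{11}/(T_{10} + T_{11})$. The chi-square tail probabilities use closed forms that hold only for one and two degrees of freedom.

```python
import math


def _xlogy(x, y):
    """x * log(y), with the convention that 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def lr_unconditional(hits, p):
    """LR statistic of the unconditional coverage test for nominal rate p."""
    t1 = sum(hits)
    t0 = len(hits) - t1
    pi_hat = t1 / len(hits)
    log_null = _xlogy(t0, 1 - p) + _xlogy(t1, p)
    log_alt = _xlogy(t0, 1 - pi_hat) + _xlogy(t1, pi_hat)
    return -2.0 * (log_null - log_alt)


def lr_independence(hits):
    """LR statistic of independence against a first-order Markov alternative."""
    t = {(i, j): 0 for i in (0, 1) for j in (0, 1)}
    for prev, cur in zip(hits[:-1], hits[1:]):
        t[(prev, cur)] += 1          # count transitions i -> j
    from_0 = t[(0, 0)] + t[(0, 1)]
    from_1 = t[(1, 0)] + t[(1, 1)]
    pi_01 = t[(0, 1)] / from_0 if from_0 else 0.0
    pi_11 = t[(1, 1)] / from_1 if from_1 else 0.0
    pi = (t[(0, 1)] + t[(1, 1)]) / (from_0 + from_1)
    log_null = (_xlogy(t[(0, 0)] + t[(1, 0)], 1 - pi)
                + _xlogy(t[(0, 1)] + t[(1, 1)], pi))
    log_alt = (_xlogy(t[(0, 0)], 1 - pi_01) + _xlogy(t[(0, 1)], pi_01)
               + _xlogy(t[(1, 0)], 1 - pi_11) + _xlogy(t[(1, 1)], pi_11))
    return -2.0 * (log_null - log_alt)


def lr_conditional(hits, p):
    """Joint conditional coverage statistic: unconditional plus independence."""
    return lr_unconditional(hits, p) + lr_independence(hits)


def chi2_sf(x, dof):
    """Upper-tail chi-square probability (closed forms for dof 1 and 2 only)."""
    x = max(x, 0.0)
    if dof == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if dof == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only 1 or 2 degrees of freedom are supported")
```

A model would then be marked as well specified at a chosen significance level when `chi2_sf(lr_conditional(hits, p), 2)` exceeds that level. A hit sequence with the right number of violations can still fail if the violations arrive in clusters, which is the failure mode described above for the RW and RM models.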
¹⁸ Analytical unconditional, independence and conditional coverage test results are available upon request from the authors.

Business decisions regarding the appropriateness of VaR models are taken at the individual route level. Thus, for the BCTI clean tanker index baskets the more sophisticated GARCH, MC and EVT VaR models seem appropriate at both the 5% and 1% levels. For the BDTI dirty tanker basket of routes, the same models are appropriate, as well as the FHS. For individual routes of the dirty tanker market the EVT and the FHS models are mostly appropriate, with the FHS model performing fairly well amongst these routes.

To distinguish between successful VaR approaches that “survived” Christoffersen's backtesting framework, we utilize the concept of Regulatory Loss Functions (RLFs), in conjunction with the Modified Diebold and Mariano (MDM) test of Harvey et al. (1997). To this end, pairwise comparisons of the RLF scores are made. Specifically, the benchmark for a series is defined as the model that produces the lowest RLF value. Then the RLF of every other model (that passed Christoffersen's tests) is compared with the RLF of the benchmark model, with the statistical significance of the difference being determined through the MDM test statistic. If equal forecasting accuracy is rejected by the MDM test, positive values of the MDM statistic indicate superiority of the benchmark model and vice versa. In case the MDM test results are inconclusive as to which model is best, bilateral tests are employed to select a single dominant model.

Panels A and B of Table 3 display the MDM test statistic and the associated p-values for the 95% and 99% VaR forecasts, respectively, between the various models that survived Christoffersen's conditional coverage test and the ‘benchmark’ model. According to the findings, the GARCH model is selected for the estimation of 95% VaR for all of the markets. For the 99% level VaR the historical simulation methods dominate parametric models, as they are selected for the majority of the markets considered. Specifically, the HS method is selected for the BDTI, TD3, TD5 and TD9 markets, and the HHS for the TD7 market. The GARCH model is selected only for the BCTI market.
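The benchmark-and-compare procedure described above can be sketched as follows. The sketch is illustrative only: the function names and the dictionary layout are hypothetical, and the statistic is the one-step-ahead form of the MDM test, in which the variance of the mean loss differential uses only the lag-zero autocovariance. The loss differential is taken as the competing model's loss minus the benchmark's loss, so that positive values favor the benchmark, in line with the sign convention stated in the text. No p-value is computed; the statistic would be compared with critical values of the t distribution with n−1 degrees of freedom.

```python
import math


def mean_score(losses):
    """Mean RLF score of one model."""
    return sum(losses) / len(losses)


def select_benchmark(losses_by_model):
    """Name of the model with the lowest mean RLF score."""
    return min(losses_by_model, key=lambda name: mean_score(losses_by_model[name]))


def mdm_statistic(loss_model, loss_benchmark):
    """Modified Diebold-Mariano statistic for one-step-ahead forecasts."""
    d = [a - b for a, b in zip(loss_model, loss_benchmark)]
    n = len(d)
    d_bar = sum(d) / n
    gamma_0 = sum((x - d_bar) ** 2 for x in d) / n   # lag-0 autocovariance
    if gamma_0 == 0:
        raise ValueError("loss differential has zero variance")
    dm = d_bar / math.sqrt(gamma_0 / n)              # original DM statistic
    return math.sqrt((n - 1) / n) * dm               # small-sample correction


def compare_with_benchmark(losses_by_model):
    """Benchmark name and MDM statistics of every other model against it."""
    benchmark = select_benchmark(losses_by_model)
    stats = {name: mdm_statistic(losses, losses_by_model[benchmark])
             for name, losses in losses_by_model.items() if name != benchmark}
    return benchmark, stats
```

Only the loss series of models that passed the coverage tests would be passed to `compare_with_benchmark` in the VaR case, whereas all models would be included in the ETL case.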
Panels A and B of Table 4 report MDM test statistics and associated p-values for the 95% and 99% ETL forecasts, respectively, for all pairwise comparisons of each model with the benchmark model. The latter is defined as the model with the lowest quadratic loss function score defined by Eq. (21). As Christoffersen's tests are not applied in the case of ETL forecasts, all ETL models estimated in this paper are compared with the benchmark model. In terms of shortfall risk, at the 95% level VaR (panel A of Table 4), the non-parametric HS and HHS models are clearly superior to any other model for all markets except for the BCTI market. The BCTI market is dominated by the GARCH or the FHS model (the FHS provides equal forecasting accuracy with the GARCH model). At the 99% level VaR (panel B of Table 4), the RW and HS models provide the best forecasting performance for the majority of the markets. Specifically, the RW model is chosen for the BDTI, TD3, TD5 and TD7 markets and the GARCH and HS methods are selected for the BCTI and the TD9 markets, respectively. It is worth noting that selection results for the ETL forecasts presented in Panels A and B of Table 4 are only indicative, as backtesting for tail risk requires a large evaluation sample, which is difficult to obtain in the case of the examined series.

Overall, findings contradict evidence from other financial assets, which indicates that sophisticated VaR models outperform simple specifications (see Kuester et al., 2006 and references therein). Specifically, in the case of the liquid bulk shipping sector, the simplest specifications, such as the RW and the HS, systematically outperform more complex VaR and ETL models in forecasting freight rate risk. This could be due to the complicated structure of freight rate returns, which is captured better (especially the lower loss quantile events of the profit/loss distribution) by non-parametric VaR models.

4.2. VaR estimates for medium-term investment horizons
Next we compare the performance of the VaR approaches in forecasting medium-term risk for each of the Worldscale route indices when the underlying exposure stems from ownership of a single asset/vessel. The investment horizon is set equal to the duration of the trip and ranges from 2 to 17 days depending on the route.¹⁹ Table 5 reports the number of observed violations for each of the employed VaR models, that is, the number of cases in which the realized loss exceeds the VaR forecast.²⁰ Thus, if the realized number of violations is lower than the expected number of violations we have overestimation of the realized VaR, while in the opposite case we have underestimation. Results indicate that VaR models based on simulation or conventional scaling laws, such as the square or the tail index root of time, tend to underestimate the realized VaR and exhibit substantial deviations from the expected number of violations. On the other hand, the proposed empirical scaling law clearly outperforms all other approaches, exhibiting violation values which are closer to the expected numbers. This scaling law seems to be a remedy for the problem of risk underestimation associated with the conventional scaling laws (i.e. the square root of time or the tail index root of time laws), providing more accurate and reliable risk forecasts.
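The violation counting behind this comparison can be sketched as follows. The function names are hypothetical, and returns and VaR forecasts are assumed to be aligned sequences on the same horizon, with a violation recorded when the realized return falls below the VaR forecast.

```python
def count_violations(realized_returns, var_forecasts):
    """Number of periods in which the realized return is below the VaR forecast."""
    return sum(1 for r, v in zip(realized_returns, var_forecasts) if r < v)


def expected_violations(n_forecasts, tail_probability):
    """Expected number of violations, e.g. 1000 * 0.05 = 50 for a 95% VaR."""
    return n_forecasts * tail_probability


def coverage_ratio(n_violations, n_forecasts, tail_probability):
    """Realized over expected violations; above 1 means risk was underestimated."""
    return n_violations / expected_violations(n_forecasts, tail_probability)
```

With the 1000-observation backtesting sample, a count of 72 violations at the 95% level corresponds to a ratio of 1.44, whereas a count of 238 corresponds to 4.76; both counts appear among the 95% entries of Table 5.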
5. Concluding remarks
This paper investigates two critical and challenging issues associated with the concept of market risk measurement: the estimation of medium-term market risk and market risk model selection. Ocean-going tanker vessel freight rates have been used for this purpose, as this class of assets exhibits some special properties, such as excessive volatility, cyclicality of returns and medium-term investment horizons, which render the investigation of the above two issues particularly appropriate and interesting.

Findings provide a solution for the challenging issue of estimating medium-term risk from limited historical data. Specifically, an empirical scaling law is found to be the most reliable method for estimating medium-term market risk. The proposed method constitutes a significant improvement over the widely used square root of time rule and may be used in the practice of risk management (e.g. for estimating medium-term market risk for capital reserve requirements or constructing scenarios for variable investment horizons) to alleviate the problems of the conventional scaling rules.

Another important finding that emerges from this paper is that special properties of return series that are not typical of conventional financial assets, such as excessive volatility and cyclicality, affect the empirical choice of VaR models. To this end, the risk measurement models selected for quantifying tanker freight rate risk differ from those selected for conventional financial assets, such as stocks and exchange rates. Although the latter require sophisticated VaR methods for quantifying market risk, tanker freight rate markets select simpler risk measurement models, such as the GARCH and the random walk models. The results of this paper may be generalized to markets which exhibit characteristics similar to those of freight rate markets, such as agricultural, energy commodity and real estate markets.
Acknowledgments
The authors acknowledge financial support for this work from the Heraclitus research support program, financed jointly by the European Community and the Hellenic Ministry of Education. The paper has benefited from participants' comments at the 17th International Association of Maritime Economists (IAME) Conference, held in Athens, Greece, July 2007, and from the comments of anonymous referees and the editor of the journal. Any remaining errors or omissions are the responsibility of the authors.

¹⁹ The investment horizons are set to 17, 17, 17, 2 and 6 days for the BDTI TD3, TD5, TD7 and TD9 routes, respectively.

²⁰ The expected number of violations is equal to 50 for the 95% level VaR and 10 for the 99% level VaR, as the length of the backtesting sample is equal to 1000 observations.
References
Angelidis, T., & Skiadopoulos, G. (2008). Measuring the market risk of freight rates: A value-at-risk approach. International Journal of Theoretical and Applied Finance, 11(5), 447–469.
Artzner, P., Delbaen, F., Eber, J. M., & Heath, D. (1997). Thinking coherently. Risk, 10(11), 68–71.
Artzner, P., Delbaen, F., Eber, J. M., & Heath, D. (1999). Coherent measures of risk. Mathematical Finance, 9(3), 203–228.
Barone-Adesi, G., Giannopoulos, K., & Vosper, L. (1999). VaR without correlations for non-linear portfolios. Journal of Futures Markets, 19(5), 583–602.
Basel Committee on Banking Supervision (2009). Revisions to the Basel II Market Risk Framework, July. http://www.bis.org/publ/bcbs158.pdf.
Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31(3), 307–327.
Boudoukh, J., Richardson, M., & Whitelaw, R. (1998). The best of both worlds. Risk, 11(5), 64–67.
Christoffersen, P. F. (1998). Evaluating interval forecasts. International Economic Review, 39(4), 841–862.
Christoffersen, P. F., Diebold, F. X., & Schuermann, T. (1998). Horizon problems and extreme events in financial risk management. Economic Policy Review (pp. 109–118). Federal Reserve Bank of New York.
Cotter, J. (2007). Varying the VaR for unconditional and conditional environments. Journal of International Money and Finance, 26, 1338–1354.
Danielsson, J., & Zigrand, J. P. (2006). On time-scaling of risk and the square-root-of-time rule. Journal of Banking & Finance, 30(10), 2701–2713.
Dickey, D., & Fuller, W. (1981). Likelihood ratio statistics for autoregressive time series with a unit root. Econometrica, 49(4), 1057–1072.
Diebold, F. X., & Mariano, R. S. (1995). Comparing predictive accuracy. Journal of Business and Economic Statistics, 13, 253–263.
Diebold, F., Hickman, A., Inoue, A., & Schuermann, T. (1997). Converting 1-day volatility to h-day volatility: Scaling by root-h is worse than you think. Wharton Financial Institutions Center, Working Paper No. 97–34. Published in condensed form as Scale models. Risk, 11, 104–107 (1998).
Dowd, K. (2002). Measuring market risk. New York: John Wiley & Sons Ltd.
Embrechts, P., & Maejima, M. (2000). An introduction to the theory of self-similar processes. International Journal of Modern Physics B, 14(12–13), 1399–1420.
Engle, R. F. (1982). Autoregressive conditional heteroskedasticity with estimates of the variance of United Kingdom inflation. Econometrica, 50(4), 987–1007.
Harvey, D. I., Leybourne, S. J., & Newbold, P. (1997). Testing the equality of prediction mean squared errors. International Journal of Forecasting, 13, 281–291.
Hill, B. (1975). A simple general approach to inference about the tail of a distribution. The Annals of Statistics, 3(5), 1163–1174.
Huisman, R., Koedijk, K. G., Kool, C. J. M., & Palm, F. (2001). Tail-index estimates in small samples. Journal of Business and Economic Statistics, 19(2), 208–216.
Jarque, C. M., & Bera, A. K. (1980). Efficient tests for normality, homoscedasticity and serial independence of regression residuals. Economics Letters, 6, 255–259.
Johnson, N., Kotz, S., & Balakrishnan, N. (1995). Continuous univariate distributions (2nd Edition). Wiley-Interscience.
Jorion, P. (1997). Value at risk: The new benchmark for controlling market risk. Chicago: Irwin Professional Publications.
Kavussanos, M. G., & Alizadeh, A. H. (2002). Seasonality patterns in tanker shipping freight markets. Economic Modelling, 19(5), 747–782.
Kavussanos, M. G. (1996). Comparison of volatility in dry-bulk shipping sector: Spot versus time-charters and small versus large vessels. Journal of Transport Economics and Policy, 30, 67–82.
Kavussanos, M. G. (2003). The time varying risks among segments of the tanker freight markets. Maritime Economics and Logistics, 5(3), 227–250.
Kavussanos, M. G., & Visvikis, I. D. (2006). Derivatives and risk management in shipping. Witherby Seamanship International.
Kuester, K., Mittnik, S., & Paolella, M. (2006). Value-at-risk prediction: A comparison of alternative strategies. Journal of Financial Econometrics, 4, 53–89.
Ljung, G. M., & Box, G. (1978). On a measure of lack of fit in time series models. Biometrika, 65(2), 297–303.
Longerstaey, J. (1996). Riskmetrics technical manual (4th Edition). http://www.riskmetrics.com/rmcovv.html.
Lopez, J. A. (1998, October). Evaluating value-at-risk estimates. Federal Reserve Bank of New York, Economic Policy Review.
McNeil, A., & Frey, R. (2000). Estimation of tail related risk measures for heteroscedastic financial time series: An extreme value approach. Journal of Empirical Finance, 7(3), 271–300.
Nelson, D. B., & Cao, C. Q. (1992). Inequality constraints in the univariate GARCH model. Journal of Business & Economic Statistics, 10(2), 229–235.
Provizionatou, V., & Markose, S. (2005). Empirical scaling rules for value-at-risk. Working Paper. Centre for Computational Finance and Economic Agents, University of Essex.
Saadi, S., & Rahman, A. (2008). Evidence of non-stationary bias in scaling by square root of time: Implications for value-at-risk. Journal of International Financial Markets, Institutions and Money, 18, 272–289.
Tsolakis, S. (2005). Econometrics Analysis of Bulk Shipping Markets Implications for Investment Strategies and Financial Decision-Making. PhD Thesis, Erasmus University, Rotterdam, The Netherlands. (https://ep.eur.nl/bitstream/1765/6717/1/Phd_Stavros_Tsolakis_9June05.pdf).