A stock market risk forecasting model through integration of switching regime, ANFIS and GARCH techniques


Accepted Manuscript

Title: A Stock Market Risk Forecasting model through integration of switching regime, ANFIS and GARCH techniques

Authors: R. Werner Kristjanpolleri, V. Kevin Michell

PII: S1568-4946(18)30114-5
DOI: https://doi.org/10.1016/j.asoc.2018.02.055
Reference: ASOC 4747

To appear in: Applied Soft Computing

Received date: 17-11-2016
Revised date: 20-2-2018
Accepted date: 24-2-2018

Please cite this article as: R. Werner Kristjanpolleri, V. Kevin Michell, A Stock Market Risk Forecasting model through integration of switching regime, ANFIS and GARCH techniques, Applied Soft Computing Journal, https://doi.org/10.1016/j.asoc.2018.02.055

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

A Stock Market Risk Forecasting model through integration of switching regime, ANFIS and GARCH techniques.

Werner Kristjanpoller R.
Departamento de Industrias
Universidad Técnica Federico Santa María
Av. España 1680, Valparaíso, Chile
Telephone: (56 32) 2654571, Fax: (56 32) 2654815

Mr. Kevin Michell V.
Departamento de Industrias
Universidad Técnica Federico Santa María
Av. España 1680, Valparaíso, Chile
email: [email protected]

Graphical abstract


Highlights

• We attempt to improve GARCH forecasting using artificial intelligence methods.
• Stock market volatility is predicted using a hybrid model.
• Fuzzy Neural Network – Artificial Neural Network forecasting comparison is studied.
• The importance of the fuzzy inference in forecasting is concluded.
• Accuracy improvement is found to be around 20% using the Fuzzy Neural Network.
• The inclusion of fuzzy inference improves volatility forecasting.

Abstract

Stock market volatility forecasting is an important and interesting topic of research due to its impact on trading decisions. This behavior is particularly important in emerging economies in Latin America, and moreover, in the larger stock markets of this region (Brazil, Mexico, and Chile). The Latin American region is highly influenced by macroeconomic factors; therefore, it is relevant to discover ways in which the market index forecast accuracy can be improved. Thus, in this study, we present a novel methodology: first, we forecast the volatility of each market using different GARCH models. Then, we use Markov Switching to determine the states of external factors. Subsequently, these states are combined in an ANFIS model to determine the individual impact on each index, and finally, we use an ANN algorithm to improve the forecast accuracy of the best GARCH model forecast with the combined effects of all the external factors. The results indicate that this methodology manages to improve prediction in terms of MAPE and RMSE, thus providing a more accurate volatility estimation.

Keywords: ANFIS; GARCH; Markov Switching; Stock Market Volatility; Volatility Forecasting.

1. Introduction

Forecasting market risk is a widely studied subject that has captured the interest of scholars due to its high non-linearity and volatility. Thus, several approaches use these data for model testing. Generalized autoregressive conditional heteroscedasticity (GARCH) models [1] are among the most commonly used models to study volatility. Although these models generally perform well, they are unable to successfully capture extreme changes in the complete time series. Due to this shortcoming, one focus of research has been to work on alternatives that can better approximate the non-linear part of the series, mainly using Artificial Intelligence, such as artificial neural networks (ANN) [2]. This algorithm has been used extensively in stock markets and for market risk, because it is able to theoretically approximate any non-linear function with minimal error. In practice, the use of an ANN allows us to improve forecasting systems [3] as well as forecasts from econometric models [4] with excellent results. Nonetheless, from an economic point of view, the volatility behavior of an asset depends on several other variables that may have impacts on the particular asset. Therefore, it is key to understand how these new variables (external to the asset) affect it. These variables are not necessarily the same as the explanatory variables of an econometric model. As their influence could be non-linear, the inclusion of these variables may result in a worse forecast. Thus, a variable that could be important in a linear model is not necessarily a good input for an ANN model, and an irrelevant variable in a linear model could be a good input for the ANN. The variables used in this paper were selected because of their importance in the region [5]. We considered 11 factors that may influence the behavior of the stock index return volatility of three different countries: 6 exchange rates, 4 commodity prices, and one interest rate. The exchange rates are Euro to Dollar, Chilean Peso to Dollar, Mexican Peso to Dollar, Brazilian Real to Dollar, Yen to Dollar, and Peruvian Sol to Dollar. We chose to include the interest rate of the United States Federal Reserve, and the four commodities are the prices of WTI, gold, silver, and copper.

It is not straightforward to apply an auto-regressive (AR) model since these variables, and specifically their impact, may be seasonal or depend on time. Regime switching models are often used for such a task, because they dynamically model a variable depending on the probability that, at a given time, it is affected by the selected explanatory variables. The Markov Switching (MS) model can be used for this problem. Accordingly, we propose the use of MS in this paper in order to determine the states of each external factor in terms of high and low volatility. These states can be used to explain different changes in asset volatility in a linear model. However, metaheuristics that can explain non-linear relationships provide better results. To observe the actual relationship between the variable states and the studied asset, several techniques are mentioned in the literature, such as genetic algorithms (GA) [6], support vector machines (SVM) [7], and the adaptive neuro-fuzzy inference system (ANFIS) [8], among others. The ANFIS approach has exhibited solid results when the task involves integrating external variables in the forecast [9], due to its ability to relate different inputs in a “non-crisp” way. In this paper, the use of ANFIS is proposed to study this relationship, acting as a practical link between the external variables that may influence asset behavior and the asset itself.

Some recent papers have addressed the prediction of financial asset volatility with fuzzy systems. In Hung [34], a fuzzy system is used in which the GARCH model is integrated inside the architecture and optimized using a GA. The contribution that makes our study different is that we use the Markov switching technique to integrate the effects of external factors and, with this, enhance the forecast of the best GARCH model in each case. We then use this effect to outperform the GARCH forecast with the use of an ANN. Dash and Dash [35] use a novel methodology in which the neural network of the fuzzy system is replaced by a functional link neural network and combined with the EGARCH model to perform the volatility forecast. The difference between their approach and the present paper is that we integrate Markov Switching to capture the effects of external factors on the GARCH prediction. These effects are then combined in a neural network, thus outperforming the best GARCH prediction.

The remainder of this paper is organized as follows: In Section 2, we review studies related to the presented problem. Section 3 explains the methodologies used in detail. In Section 4, we present the model results and analyses, and Section 5 discusses the sensitivity results for the crucial parameters. Finally, Section 6 presents the discussion and conclusions.

2. Literature Review

There is a great amount of research related to the study of stock market volatility, mainly because of its importance in both economic approaches and modeling. In terms of economics, there is a clear relationship between external variables and stock market assets. Engle, Ghysels & Sohn [10] analyze how macroeconomic variables, such as growth and inflation, can have a strong influence on long-term stock market volatility. Moreover, they found that, in the short term (1-day-ahead), macroeconomic variables have a 10% to 35% influence on volatility prediction. Corradi, Distaso & Mele [11] come to a similar conclusion regarding the positive relationship between macroeconomic variables and fluctuations in stock market volatility. They specifically found that this is explained by the business cycle. Zakaria & Shamsuddin [12] and Attari & Safdar [13] study this relationship in two different stock markets. The first of these studies was performed in the Malaysian stock market and used variables such as the exchange rate, interest rates, and inflation, among others. The authors chose a GARCH model to estimate volatility and implemented a VAR to determine Granger causality. Their results show that, apart from the exchange rate, the remaining macroeconomic variables do not influence the stock market volatility of Malaysia, according to the Granger causality test. The second study, however, discovered a positive relationship in the Pakistan stock market. This study includes macroeconomic variables such as inflation and growth, uses an EGARCH model to estimate volatility, and concludes that these variables have an important impact on the market. The variables we selected for this paper can be found in these studies, especially in the last two mentioned. However, we chose to incorporate the four most important commodities in the world (gold, silver, oil and copper) along with the interest rate, given that they are also regarded as macroeconomic variables; therefore, these commodities may influence the stock market volatility of a country.

Many volatility studies use econometric models to perform forecasts. The GARCH models and their improvements such as EGARCH, PHIGARCH, PGARCH, TGARCH, and several others are the most popular approaches in the literature. The reader can refer to [14] [15] [16] [17] for examples of studies that implement GARCH to forecast stock market volatility. The main advantage of using GARCH models is that they can be used alongside other types of models in order to improve forecasts or as a part of an integrated model. Sadorsky [18] uses VARMA-GARCH and DCC-AGARCH models to calculate volatility and conditional correlations between emerging stock markets and oil, copper, and wheat prices. He finds that negative residuals tend to increase the variance more than positive residuals. Moreover, he concludes that, after an important crisis, the conditional correlation between emerging stock markets and commodity markets increases. Girardin & Joyeux [19] apply a GARCH-MIDAS model to explore the influence of volume and economic fundamentals in the A-share China market. They follow the guidelines of Engle, Ghysels & Sohn [20], but also add speculative factors. This study observes that the A-share market depends considerably on economic fundamentals after China's entry into the WTO. Furthermore, before this occurrence, speculative factors were the main contributor in explaining volatility. The B-share market, on the other hand, always depended on speculative factors. Beirne et al. [21] use a tri-variate VAR-GARCH(1,1)-in-mean model to estimate the volatility of 41 emerging market economies (EMEs). They find that EMEs are highly influenced by spillovers from regional and global markets. However, these effects were not the same for every region; the effects of mean return spillovers were more significant in Asia and Latin America, while variance spillovers had more significance in emerging Europe. More recently, Syriopoulos, Makram & Boubaker [22] use a VAR-GARCH to analyze the effect of BRICS on the US stock market. They extend the understanding of the relationship between these two groups of markets and explore the return and volatility transmission dynamics between them, in the stock market and business sectors.

The GARCH models perform well for daily stock market volatility; nonetheless, this performance could be improved by using models to predict volatility state changes. Regime switching techniques are very promising in these situations, because they can successfully identify the particularities of the time series in the short term. Choi & Hammoudeh [23] use Markov Switching to determine time-dependent high and low volatility states for commodities and the S&P500. These authors establish that there is two-regime volatility in the commodities and stock markets. Walid et al. [24] use a Markov Switching EGARCH model in emerging economies to determine whether there is an effect between stock markets and exchange rates. They find that there is regime dependence in both the conditional mean and the conditional variance, and furthermore, that the relationship is strong, with stock prices depending on foreign exchange rates. Xinyi, Margaritis & Wang [25] also use a Markov Switching model to investigate the relationship of price and volume with return volatility. They discover that price range has significant explanatory and predictive power for return volatility; however, volume does not have a significant effect on either of these. Additionally, these authors conclude that there is a strong asymmetry between the price ranges and return volatility. Moreover, regime switching has been used specifically to outperform other econometric models in forecasting. Reher & Wilfling [26] perform a nesting procedure of Markov Switching and GARCH, extending it to estimate general specifications. This approach outperforms all nested specifications, in both statistical and out-of-sample forecasting performance, and the proposed framework also generalizes the Markov Switching GARCH models. Chuang, Huang & Lin [27] compare the performance of the Markov Switching multifractal model with other classic econometric approaches to calculate historical volatility of the S&P 100 index and equity options. They find that the GARCH and Markov approaches obtain similar results in both cases, but that the regime-dependent model outperforms all other models in a global financial crisis scenario.

The previously mentioned econometric models can successfully model time series; however, their forecasts can be significantly improved if these models are combined with artificial intelligence. The most common algorithm used in this approach is ANN. Dixit, Roy & Uppal [28] use ANN to forecast upward and downward movements in implied volatility for the next day’s trading. They find that ANN can successfully predict the next day’s downward movement in volatility of the Indian stock index. Mantri, Gaham & Nayak [29] provide a basis on which they compare different GARCH class models and ANN in forecasting accuracy. They demonstrate that ANN is better at predicting volatility, even in the long term. However, the ANOVA test indicates that there is no difference in the volatility estimated by the models. Due to this, it would be interesting to see the result of combining an econometric model and an artificial intelligence model. Babu & Reddy [30] use an ARIMA-ANN model to forecast time series data. They first predict the error component of the econometric model with ANN, and then “adjust” the econometric model with this forecast. However, they first use a filter that helps to determine how to apply the ARIMA-ANN model given the analyzed data. With this improvement, the hybrid model has higher prediction accuracy. Hajizadeh et al. [31] use an EGARCH-ANN model to forecast the volatility of the S&P 500 index return. They implement two approaches: the first is to improve the EGARCH forecast by introducing it as an input in the ANN, along with other explanatory variables. The second approach is to introduce the EGARCH forecast in the ANN along with simulated data and the explanatory variables. They conclude that the second model had a better performance. Monfared & Enke [32] use the same principle, but in a different way. They propose three different types of ANN to improve the result of the GJR-GARCH model. The three types of ANN that were tested are: Feed-forward with Backpropagation, Generalized Regression and Radial Basis Function. These authors reach the conclusion that ANNs perform better in comparison with econometric models during a crisis period. Kristjanpoller & Minutolo [33] also use a hybrid model in two steps. The first step was to fit the econometric model according to the selected data. Then, the ANN was used to calculate the error. For various loss functions, the hybrid model outperforms both the econometric model and the ANN used separately. Kristjanpoller et al. [66] analyzed the same markets and forecast volatility with classical models and with ANN-GARCH. Their results show that ANN-GARCH models perform better in forecasting than classical models, thus increasing the accuracy of the forecasts through artificial intelligence. One of the difficulties with this kind of algorithm is selecting the hyperparameters; this topic motivated our investigation in order to attempt to find better ways of carrying out this process [76].

This opens the opportunity to use another kind of algorithm to improve the performance of an econometric model. In this paper, the focus is fuzzy systems. Hung [34] incorporated fuzzy systems to analyze clustering in the GARCH forecast. Due to the highly nonlinear characteristic of the problem, a genetic algorithm was proposed to calculate the parameters of both the GARCH model and the membership functions of the fuzzy system. The GARCH model was significantly improved when the fuzzy system adaptive forecast and the clustering effects were incorporated. Dash & Dash [35] proposed an EGARCH ANN model to explain volatility. They use a fuzzy system with ANN to include the different innovations of the authors. Due to the high number of parameters to be calculated, a differential evolution algorithm was applied to find the solution. They observed that the proposed approach significantly lowers the loss functions. In a study conducted by Atsalakis and Valavanis [36], the use of a fuzzy logic system is proposed together with neural networks (ANFIS models) to improve prediction results relative to other weekly prediction models. The aim of this study was to achieve the best forecast for one-day-ahead prediction with an ANFIS model, using only historical prices of different stocks. They validated their results by obtaining percentages of prediction accuracy, concluding that the ANFIS method was superior to 13 different methods for predicting stock prices in terms of efficient transactions. Hong and Lee [37] used ANFIS-type models to effectively predict the future price of electricity. The study proposed a recursive fuzzy neural network, as the authors found that this kind of ANN was better at predicting LMP (locational marginal pricing) together with Fuzzy Logic. They also determined that fuzzy reasoning is a powerful technique to predict future price, especially when linguistic factors complement the existing information.

Recently, Zameer et al. [77] proposed a hybrid method for wind prediction, combining different types of Neural Networks and GP to perform forecasts over different wind data sets. Specifically, the model consists of five base learners that are trained and tested in a first phase, where features are also selected using tested data. The base predictions and those features are then fed into the GP algorithm as the meta classifier to make the final prediction, splitting the data into two-thirds for training and one-third for testing. The model is able to outperform other approaches in terms of RMSE and SDE. The results are also statistically significant. Qureshi et al. [78] propose an embedded system to predict wind power. The proposed model is separated into two phases: base-regressors and meta-regressors. The first phase consists of several autoencoders that are pre-trained and fine-tuned to generate the base prediction and testing data, which is used for the second phase. In this second phase, two-thirds of the data are used to train a Deep Belief Network and the remaining one-third is used to test the whole framework. The data used are from 5 different wind farms situated in Europe. The results indicate that the model is able to generalize and transfer learning from one farm to another. The generalization is also statistically significant.

3. Methodology

As mentioned in the literature review, economic variables improve forecasting. For this reason, we considered 11 factors in this study that may influence the behavior of stock index return volatility in three different countries. Specifically, we chose 6 exchange rates, 4 commodities, and a global market interest rate as a reference. Exchange rates, commodities, and global macroeconomic variables are widely considered to be influential in the stock market, especially in emerging economies [69] [70] [71]. Therefore, these variables are very relevant in the context of this study. The hypothesis is that these additional variables may play a key role in performing a better forecast of the stock index. The focus is to determine whether there are high and low volatility periods for these external variables; this can be done using a Markov Switching approach. Since the relationship of these factors with the variables of interest is unknown, an ANFIS model is proposed that is capable of capturing their impact. Finally, this impact is added as an input to an ANN, in addition to the best GARCH model forecast, in order to outperform that forecast. We named the proposed methodology the Markov Switching Fuzzy Inference Neural Network GARCH (MS-FNN-GARCH) model.

3.1 GARCH Models.

Given the heteroskedasticity characteristic of economic and financial time series, the AutoRegressive Conditional Heteroskedasticity (ARCH) models and their generalization (GARCH) were established for modeling by Engle [1], Bollerslev [54] and Taylor [55]. The GARCH(p,q) model described in Equations 1a, 1b, and 1c is applied to the return of a financial asset, with an autoregressive model of order I as the mean equation.

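$$r_t = \alpha_0^{G} + \sum_{i=1}^{I} \beta_i^{G} r_{t-i} + \varepsilon_t \tag{1a}$$

$$\varepsilon_t \sim N(0, \widehat{HV}_t^2) \tag{1b}$$

$$\widehat{HV}_t^2 = c^{G} + \sum_{i=1}^{p} \alpha_i^{G} \varepsilon_{t-i}^2 + \sum_{j=1}^{q} \gamma_j^{G} \widehat{HV}_{t-j}^2 \tag{1c}$$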

where G indicates that the coefficient belongs to the GARCH(p,q) model, $\alpha_i^{G}$, $\beta_i^{G}$, $\gamma_j^{G}$ and $c^{G}$ are the parameters of the GARCH model, $r_{t-i}$ is the return in time t lagged i times and $\widehat{HV}_t^2$ is the volatility in time t. Many derivatives stem from the family of GARCH models, such as TARCH (threshold ARCH) [56] or EGARCH (exponential GARCH) [57], which are asymmetric and consequently allow negative shocks to have a different impact upon volatility than positive shocks. The mean equation of each model is specified as an AR(k) process. In the case of TARCH, the model incorporates an asymmetry term in its variance equation, Equation 2a. Meanwhile, in the case of the EGARCH model, there is also an asymmetry term, but the relationship in the variance is associated with the logarithm of variance, Equation 2b.

$$\widehat{HV}_t^2 = c^{TG} + \sum_{i=1}^{p} \alpha_i^{TG} \varepsilon_{t-i}^2 + \sum_{j=1}^{q} \gamma_j^{TG} \widehat{HV}_{t-j}^2 + \sum_{k=1}^{s} \delta_k^{TG} \varepsilon_{t-k}^2 \,(\varepsilon_{t-k} < 0) \tag{2a}$$

$$\log(\widehat{HV}_t^2) = c^{EG} + \sum_{i=1}^{p} \alpha_i^{EG} \frac{\varepsilon_{t-i}^2}{\widehat{HV}_{t-i}^2} + \sum_{k=1}^{s} \delta_k^{EG} \left| \frac{\varepsilon_{t-k}^2}{\widehat{HV}_{t-k}^2} \right| + \sum_{j=1}^{q} \gamma_j^{EG} \log(\widehat{HV}_{t-j}^2) \tag{2b}$$

where TG indicates that the coefficient belongs to the TGARCH(p,q,s) model, $\alpha_i^{TG}$, $\gamma_j^{TG}$, $\delta_k^{TG}$ and $c^{TG}$ are the parameters of the TGARCH model and $\widehat{HV}_{t-j}^2$ is the volatility in time t lagged j times. EG indicates that the coefficient belongs to the EGARCH(p,q,s) model, $\alpha_i^{EG}$, $\delta_k^{EG}$, $\gamma_j^{EG}$ and $c^{EG}$ are the parameters of the EGARCH model and $\widehat{HV}_{t-k}^2$ is the volatility in time t lagged by k (or i or j) times. In this research, we analyze classical models such as GARCH, GARCH(p,q), EGARCH(p,q,s) and TGARCH(p,q,s) with a maximum average lag (r) of 5 lags. To further expand the classical models, we incorporate the Markov Switching GARCH (MS-GARCH) and MIDAS GARCH models.
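To make the variance recursions of Equations 1c and 2a concrete, the following minimal Python sketch iterates them for a given residual series. It is an illustration only and not the estimation procedure used in this paper: the function name `garch_variance`, the initialization at the sample variance, the simulated residuals and all coefficient values are assumptions chosen for the example. Leaving `delta` empty gives the symmetric GARCH(p,q) recursion, while a non-empty `delta` adds the threshold term that is active only after negative residuals.

```python
import numpy as np


def garch_variance(eps, c, alpha, gamma, delta=()):
    """Iterate the conditional variance of Eq. 1c (delta empty) or Eq. 2a.

    eps   : residual series (epsilon_t)
    c     : constant term
    alpha : coefficients on lagged squared residuals (order p)
    gamma : coefficients on lagged variances (order q)
    delta : asymmetry coefficients (order s), applied when eps[t-k] < 0
    """
    eps = np.asarray(eps, dtype=float)
    start = max(len(alpha), len(gamma), len(delta))
    # Assumption: the first `start` variances are set to the sample variance.
    hv2 = np.full(len(eps), eps.var())
    for t in range(start, len(eps)):
        value = c
        for i, a in enumerate(alpha, start=1):
            value += a * eps[t - i] ** 2
        for j, g in enumerate(gamma, start=1):
            value += g * hv2[t - j]
        for k, d in enumerate(delta, start=1):
            if eps[t - k] < 0:  # indicator (eps_{t-k} < 0) of Eq. 2a
                value += d * eps[t - k] ** 2
        hv2[t] = value
    return hv2


# Hypothetical residuals and coefficients, for illustration only.
rng = np.random.default_rng(0)
eps = rng.normal(scale=0.01, size=500)
symmetric = garch_variance(eps, c=1e-6, alpha=[0.08], gamma=[0.90])
asymmetric = garch_variance(eps, c=1e-6, alpha=[0.08], gamma=[0.90], delta=[0.05])
```

With identical coefficients, the asymmetric variance path is never below the symmetric one, because the additional term only contributes non-negative amounts after negative residuals.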


3.1.1. Markov Switching GARCH

The Markov-switching approach was introduced by Hamilton [58] [59] to formalize the statistical identification of changes in regimes. Later, the Markov-switching approach was extended to the ARCH models by Hamilton and Susmel [60] and Cai [61]. Finally, Gray [62] and Klaassen [63] extended the model to MS-GARCH, establishing the model according to Equations 3(a-h).

$$r_{t,s_t} = \varphi_{s_t}^{MSG} + \varepsilon_t \tag{3.a}$$

$$\varepsilon_t \mid s_t \sim N(0, \widehat{HV}_{t,s_t}^2) \tag{3.b}$$

$$\widehat{HV}_{t,s_t}^2 = \alpha_0^{MSG} + \sum_{i=1}^{p} \alpha_{i,s_t}^{MSG} \varepsilon_{t-i}^2 + \sum_{j=1}^{q} \beta_{j,s_t}^{MSG} \widehat{HV}_{t-j}^2 \tag{3.c}$$

where MSG indicates that the coefficient belongs to the MS-GARCH(p,q) model and $\alpha_{i,s_t}^{MSG}$ and $\beta_{j,s_t}^{MSG}$ are the parameters of the MS-GARCH model. It is mandatory that $\alpha_0^{MSG} > 0$, $\alpha_{i,s_t}^{MSG} \geq 0$ and $\beta_{j,s_t}^{MSG} \geq 0$ to ensure a positive conditional variance. The state $s_t$ indicates the state of the world in which the variance is calculated, and evolves according to the following probabilities:

$$P\{s_k = j \mid s_{k-1} = i, s_{k-2} = l, \dots\} = P\{s_k = j \mid s_{k-1} = i\} = p_{ij} \tag{3.d}$$

In this case, the number of states of the world is fixed to 2 (0 represents normal volatility and 1 represents high volatility) and therefore the transition probabilities can be expressed as

$$\Pr[s_k = 0 \mid s_{k-1} = 0] = p_1^{MSG} \tag{3.e}$$

$$\Pr[s_k = 1 \mid s_{k-1} = 0] = 1 - p_1^{MSG} \tag{3.f}$$

$$\Pr[s_k = 1 \mid s_{k-1} = 1] = p_2^{MSG} \tag{3.g}$$

$$\Pr[s_k = 0 \mid s_{k-1} = 1] = 1 - p_2^{MSG} \tag{3.h}$$
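As an illustration of Equations 3.e to 3.h, the sketch below builds the two-state transition matrix, computes its long-run (ergodic) regime probabilities and simulates a regime path. The helper names and the values p1 = 0.95 and p2 = 0.80 are hypothetical and are not estimates from this study.

```python
import numpy as np


def transition_matrix(p1, p2):
    """Rows: state at k-1; columns: state at k (0 = normal, 1 = high volatility)."""
    return np.array([[p1, 1.0 - p1],
                     [1.0 - p2, p2]])


def ergodic_probabilities(p1, p2):
    """Long-run share of time spent in each regime of a two-state chain."""
    pi0 = (1.0 - p2) / (2.0 - p1 - p2)
    return np.array([pi0, 1.0 - pi0])


def simulate_states(P, n, rng, s0=0):
    """Draw a regime path of length n, starting from state s0."""
    states = np.empty(n, dtype=int)
    s = s0
    for k in range(n):
        states[k] = s
        s = rng.choice(2, p=P[s])  # next state depends only on the current one
    return states


# Hypothetical persistence probabilities, for illustration only.
P = transition_matrix(0.95, 0.80)
pi = ergodic_probabilities(0.95, 0.80)
states = simulate_states(P, 10_000, np.random.default_rng(1))
```

For a two-state chain, the expected duration of a regime with persistence probability p is 1/(1 − p), so a high persistence value implies long stretches in the same volatility state.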

3.1.2. GARCH Midas.

The MIDAS model was introduced by Ghysels et al. [64] and extended to the GARCH models by Engle and Rangel [65]. The GARCH-MIDAS model is specified by Equations 4(a-e):

$$r_{i,t} = \mu^{GM} + \sqrt{\tau_t\, g_{i,t}}\; \varepsilon_{i,t}, \qquad \forall i = 1, \dots, N \tag{4.a}$$

$$\varepsilon_{i,t} \mid \Phi_{i-1,t} \sim N(0, 1) \tag{4.b}$$

where $\mu^{GM}$ is the constant of the model, $g_{i,t}$ is the short-term component and $\tau_t$ is the long-term component. The first one can be formulated as

$$g_{i,t} = (1 - \alpha^{GM} - \beta^{GM}) + \alpha^{GM} \frac{(r_{i-1,t} - \mu^{GM})^2}{\tau_t} + \beta^{GM} g_{i-1,t} \tag{4.c}$$

where $\alpha^{GM}$ and $\beta^{GM}$ are the parameters of the GARCH-MIDAS approach. The second, $\tau_t$, can be formulated as

$$\tau_t = m + \theta^{GM} \sum_{k=1}^{K} \varphi_k(c_1, c_2)\, \widehat{HV}_{t-k}^2 \tag{4.d}$$

where GM indicates that the coefficient belongs to the GARCH-MIDAS model. Finally, $\varphi_k$ is a beta-lag polynomial written as

$$\varphi_k(c_1, c_2) = \frac{\left(\frac{k}{K}\right)^{c_1 - 1} \left(1 - \frac{k}{K}\right)^{c_2 - 1}}{\sum_{j=1}^{K} \left(\frac{j}{K}\right)^{c_1 - 1} \left(1 - \frac{j}{K}\right)^{c_2 - 1}} \tag{4.e}$$

where $c_1$ and $c_2$ are the weights of the MIDAS methodology.

3.2 Markov Switching Fuzzy Inference Neural Network GARCH (MS-FNN-GARCH) model

The proposed model is composed of a Fuzzy Inference Neural Network, with pre-processing and post-processing stages.


3.2.1. Preprocessing Data.

As mentioned in the literature review, economic variables improve forecasting. For this reason, we considered 11 factors in this study that may influence the behavior of stock index return volatility in three different countries. It is important to choose variables that are widely considered as references in the stock market, specifically:

1. 6 exchange rates

2. 4 commodities

3. A global market interest rate

These additional variables may play a key role in performing a better forecast of the stock index. First, the main focus is to determine whether there are high and low volatility periods for these external variables; this can be done using a Markov Switching approach. Thus, each of these factors is analyzed to determine the high-volatility and low-volatility states present in every period.

To analyze the impact of an external factor, it is first important to determine whether this factor presents volatility states in the short term. To do this, the MS approach is very helpful. The Markov Switching autoregressive (MS-AR) model, following [42], can be written as follows:

$$x_t = \mu_{S_t}^{MSR} + \gamma_1^{MSR} x_{t-1} + \epsilon_t, \qquad \epsilon_t \mid S_t \sim N(0, \sigma_{S_t}^2) \tag{5}$$

where $S_t$ is a latent variable modeled by a two-state Markov process, $x_t$ is the logarithmic return of the external factor on day t, $\gamma_1^{MSR}$ is a parameter of the MS-AR model and $\mu_{S_t}^{MSR}$ is the intercept. This model was determined empirically to be the best-fitting specification. The transition probabilities of these variables are given by

$$P[S_t = 0 \mid S_{t-1} = 0] = p_1^{MSR} \tag{6.a}$$

$$P[S_t = 1 \mid S_{t-1} = 0] = 1 - p_1^{MSR} \tag{6.b}$$

$$P[S_t = 1 \mid S_{t-1} = 1] = p_2^{MSR} \tag{6.c}$$

$$P[S_t = 0 \mid S_{t-1} = 1] = 1 - p_2^{MSR} \tag{6.d}$$

where $S_t = 0$ and $S_t = 1$ refer to the low volatility and high volatility regimes at time t, and $p_1^{MSR}$ and $p_2^{MSR}$ refer to the probabilities of remaining in the corresponding state, respectively. It is important to notice that both the variance and the constant factor are dependent on the Markov process regime. This allows us to identify high and low volatility periods, supported by the fact that volatility has identifiable clusters of high and low volatility [43].

3.2.2. ANFIS

Due to the unknown relationship of these factors with the variables of interest, an ANFIS model is proposed that can capture their impact. The high and low states of each factor are introduced into an ANFIS to determine the actual effect of every factor on the respective index. Thus, after the ANFIS analysis, we have 11 effects, one for each external factor. Following Jang [44], the ANFIS model is the combination of the fuzzy logic approach and ANN modeling. Fuzzy logic is used inside the ANN through the membership function, so that the entire network works with fuzzy numbers. For more details on how membership functions work, see [45] [46] [47].

The inputs of the proposed ANFIS are 2 (two states for each external factor) and the output is 1 (the impact of the specific factor on the market index). The ANFIS is able to determine how each of these factors impacts the index in a quantitative way.

Figure 1: "ANFIS Architecture". The superscript indicates the layer in the description.
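The layer-by-layer computation of an ANFIS with two inputs and one output can be sketched as follows. This is a minimal forward pass of a first-order Sugeno-type system in the spirit of Jang [44], under several assumptions that are not stated in the text: two generalized bell membership functions per input, four product rules, and hand-picked (untrained) premise and consequent parameters. The inputs `x1` and `x2` stand for the low- and high-volatility state values of one external factor; how these states are encoded numerically is also an assumption here.

```python
import numpy as np


def bell(x, a, b, c):
    """Generalized bell membership function: 1 / (1 + |(x - c) / a|^(2b))."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))


def anfis_forward(x1, x2, premise, consequent):
    """Forward pass of a two-input, one-output first-order Sugeno ANFIS.

    premise    : [[(a, b, c), (a, b, c)], [(a, b, c), (a, b, c)]], one list per input
    consequent : array of shape (4, 3) holding (p, q, r) for each rule
    """
    # Layer 1: membership degrees of each input in its two fuzzy sets.
    mu1 = np.array([bell(x1, *params) for params in premise[0]])
    mu2 = np.array([bell(x2, *params) for params in premise[1]])
    # Layer 2: firing strength of each rule (product of memberships).
    # Rule order: (A1,B1), (A1,B2), (A2,B1), (A2,B2).
    w = np.outer(mu1, mu2).ravel()
    # Layer 3: normalized firing strengths.
    w_bar = w / w.sum()
    # Layer 4: linear consequent of each rule, f = p*x1 + q*x2 + r.
    f = consequent @ np.array([x1, x2, 1.0])
    # Layer 5: weighted sum gives the single output.
    return float(w_bar @ f), w_bar


# Hypothetical, untrained parameters, for illustration only.
premise = [[(0.5, 2.0, 0.0), (0.5, 2.0, 1.0)],
           [(0.5, 2.0, 0.0), (0.5, 2.0, 1.0)]]
consequent = np.array([[0.1, 0.0, 0.0],
                       [0.0, 0.2, 0.0],
                       [0.3, 0.1, 0.0],
                       [0.0, 0.0, 0.5]])
effect, weights = anfis_forward(0.2, 0.8, premise, consequent)
```

In a trained ANFIS the premise and consequent parameters would be fitted to data; here they only show how the five layers combine the two state inputs into one quantitative effect.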

3.2.3. Post-processing by ANN improved forecast.

Finally, these impacts are added as inputs to an ANN to outperform the GARCH prediction. The ANN model is used to improve the GARCH forecast with each of the 11 external effects obtained in the previous step. It is noteworthy that the main difference between these two steps is that, with the ANFIS, the effect of the external factors is individual, while in the ANN, the effect is analyzed as a group of variables. Therefore, the ANN is used for the purpose of combining the impacts of the external factors in a way that is not known in advance and depends on the time series. ANNs can approximate, in theory, any non-linear function [49] and have a learning procedure based on the backpropagation algorithm [50]. The ANN algorithm propagates a signal forward through the network, and a function is activated in each neuron. The most commonly used function is the sigmoid function, defined as

17

𝑠(∙) =

1 1 + 𝑒 −(∙)

(7)

where $(\cdot)$ is the input of the function. The algorithm, working with the sigmoid function, can be summarized as follows:

1. Generate the weights $\theta_{i,j}^{l}$ randomly.

2. Choose an input-output pattern, that is, $\mathrm{Z}(k)$ and $\Upsilon(k)$, where k denotes the k-th set of data.

3. Propagate the k-th signal through the network and obtain each output value as

$$\rho_i^{l}(k) = f\left(\sum_{j=0}^{n_{l-1}} \theta_{i,j}^{l}\, \rho_j^{l-1}\right) \tag{8}$$

where $n_{l-1}$ is the number of neurons in layer l−1.

4. Calculate the error as $E = E(k) + E$ and the signal error $\delta_i^{(L)}$ as

$$\delta_i^{(L)} = \left[\Upsilon_i - \rho_i^{L}\right]\left[f'\left(sg_i^{(L)}\right)\right] \tag{9}$$

where $sg_i^{(L)}$ is the sum of all signals from layer L to neuron i.

5. Update the weights according to

$$\Delta\theta^{(l)} = -\eta\, \frac{\partial E(k)}{\partial \theta^{(l)}} \tag{10}$$

Applying the chain rule and considering the sigmoid activation function,

$$\Delta\theta_{i,j}^{(l)} = -\eta\, \delta_i^{(l)} \rho_j^{(l-1)} \quad \text{for } l = L, L-1, \ldots, 1 \tag{11}$$

using

$$\delta_i^{(L)} = \left[\Upsilon_i - \rho_i^{L}\right]\left[f'\left(sg_i^{(L)}\right)\right] \quad \text{for } l = L \tag{12}$$

$$\delta_i^{(l)} = \rho_i^{(l)}\left(1 - \rho_i^{(l)}\right) \sum_{j=1}^{n_l} \delta_j^{(l+1)}\, \theta_{j,i}^{(l+1)} \quad \text{for } l < L \tag{13}$$

where $\eta$ is the learning rate.

6. Repeat the process from step 2 for the next set (k). After passing through all the sets, one training epoch is completed.

7. Verify whether the accumulated error E in the output layer is less than a desired value or whether the number of epochs has reached the maximum; if so, stop the algorithm. If not, repeat the process from step 2.
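The steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names (`init_weights`, `forward`, `train_step`, `train`), the uniform initialization range, the squared-error form of E(k) and the toy patterns in the usage note are all hypothetical. The signal error is defined with (target − output), as in (9), and the weights are moved by +η·δ·ρ, which is the descent direction for a squared error under that definition; sign conventions for δ vary between texts.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_weights(sizes, seed=0):
    """weights[l][i][j]: weight from input j of layer l (j = 0 is the bias) to neuron i."""
    rng = random.Random(seed)
    return [[[rng.uniform(-0.5, 0.5) for _ in range(sizes[l] + 1)]
             for _ in range(sizes[l + 1])]
            for l in range(len(sizes) - 1)]

def forward(weights, x):
    """Return the activations of every layer, input included."""
    acts = [list(x)]
    for layer in weights:
        prev = [1.0] + acts[-1]                      # constant 1 feeds the bias weight
        acts.append([sigmoid(sum(w * p for w, p in zip(neuron, prev)))
                     for neuron in layer])
    return acts

def train_step(weights, x, y, eta):
    """Backpropagate one pattern, update weights in place, return its squared error."""
    acts = forward(weights, x)
    out = acts[-1]
    # Output layer: (target - output) * f'(.), with f' = out * (1 - out) for the sigmoid
    delta = [(t - o) * o * (1.0 - o) for t, o in zip(y, out)]
    for l in range(len(weights) - 1, -1, -1):
        prev = [1.0] + acts[l]
        # Deltas of the layer below, computed before this layer's weights change
        below = [acts[l][i] * (1.0 - acts[l][i]) *
                 sum(delta[j] * weights[l][j][i + 1] for j in range(len(delta)))
                 for i in range(len(acts[l]))] if l > 0 else []
        for i, neuron in enumerate(weights[l]):
            for j in range(len(neuron)):
                neuron[j] += eta * delta[i] * prev[j]
        delta = below
    return 0.5 * sum((t - o) ** 2 for t, o in zip(y, out))

def train(weights, patterns, eta=0.5, max_epochs=10000, tol=1e-3):
    """Run epochs until the accumulated error drops below tol or max_epochs is reached."""
    for epoch in range(1, max_epochs + 1):
        error = sum(train_step(weights, x, y, eta) for x, y in patterns)
        if error < tol:
            break
    return epoch, error
```

As a usage example, `train(init_weights([1, 3, 1]), [([0.0], [0.1]), ([1.0], [0.9])])` fits a one-input network with three hidden neurons to two toy patterns; the default `max_epochs=10000` mirrors the epoch limit mentioned in the text, while the tolerance is an arbitrary choice.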

The base model had 3 hidden layers, with 15 neurons in each layer, and 1 output layer. The configuration was set following [51]. The maximum number of epochs was set to 10000 and the training set length was set to 252 observations (one stock market year).

We named the proposed methodology the Markov Switching Fuzzy Inference Neural Network GARCH (MS-FNN-GARCH) model.

Considering the above, we propose the following steps in order to obtain an improved forecast for the Latin American market index: (1) perform the volatility measurement and GARCH forecast; (2) identify, using MS, the high and low volatility states of the factors; (3) relate these states using an ANFIS approach; and (4) include these relationships and the best GARCH forecast in an ANN to outperform the best GARCH.

Figure 2: "Proposed MS-FNN-GARCH Model". Diagram labels: External Factors (11), MS_AR, States of Factors (22), ANFIS, Impact of Factors (11), GARCH Forecast, ANN, Improved Forecast.

3.3 Volatility measurement and forecast

There are many proxies for measuring volatility; in this case, historical volatility is chosen because it is the most widely used and the simplest way to analyze dispersion [39, 52]. Thus, following [40, 52-53], historical volatility can be estimated as follows:

$$HV_t = \sqrt{\frac{1}{\tau}\sum_{i=1}^{\tau}\left(r_i - \bar{r}\right)^2} \tag{14}$$

where $\tau$ is the number of days used to calculate the historical volatility, $r_i$ is the logarithmic return of day i and $\bar{r}$ is the average logarithmic return. In this study, the focus was on the 21-day-ahead forecast, and therefore $\tau$ is equal to 21. Twenty-one days is approximately the number of working days in a month, so this is a monthly volatility.

3.4 Rolling Windows

Before explaining the steps of the new methodology, it is important to note that this study is based on a rolling window approach. This procedure can be summarized as follows:

1. Take a fixed window to fit the model (252 days in this case [72] [73]) and then forecast (1 day ahead in this case). That is, use day 1 to day 252 to fit the model and then forecast day 253 for the first forecast.

2. Move 1 day forward and repeat step 1 (i.e., for the second forecast, the data used to fit the model run from day 2 to day 253 in order to forecast day 254, and so on).

3. Repeat this until the whole period is covered.
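The rolling window procedure can be sketched as follows. The helper names are hypothetical, and `naive_mean` is only a stand-in for the actual model fit and forecast, which is not reproduced here.

```python
def rolling_forecasts(series, window, fit_and_forecast):
    """Fit on each fixed-length window and collect the one-step-ahead forecasts."""
    forecasts = []
    for start in range(len(series) - window):
        train = series[start:start + window]        # days 1..252, then 2..253, and so on
        forecasts.append(fit_and_forecast(train))   # forecast for series[start + window]
    return forecasts

def naive_mean(train):
    """Placeholder model: forecast the mean of the training window."""
    return sum(train) / len(train)
```

With a 252-day window and 2415 observations, this loop produces 2415 − 252 = 2163 forecasts, which agrees with the out-of-sample count reported in the text.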

The data run from February 02, 2001 to December 31, 2010, with 2415 observations. In Section 3.5 the length of the fixed window is varied to see whether it influences model behavior. This procedure is performed for all the models presented here; thus, the out-of-sample period used to evaluate the performance of each benchmark and of the MS-FNN-GARCH model runs from February 2, 2002 to December 31, 2010. This yields a total of 2163 observations with which to determine which model is the best at forecasting out-of-sample data.

3.5 Sensitivity Analysis

In the previous methodology, we arbitrarily selected some ANN parameters for the forecast. These parameters are not necessarily the best; therefore, to make the results less dependent on one particular configuration, we proceeded to vary them. The most important parameters that were set arbitrarily were the number of neurons (15), the number of layers (3), the training percentage (and therefore the validation and test percentages) (0.7) and the length of the data that the ANN uses for training (252). The variations were performed as shown in Table 1.

Table 1. Parameter Sensitivity.

| Parameter | Min Value | Increment | Max Value |
|---|---|---|---|
| Number of Neurons | 5 | 5 | 20 |
| Number of Layers | 2 | 1 | 4 |
| Training Percent | 0.5 | 0.1 | 0.8 |
| Time Horizon | 3 months | 3 months | 15 months |
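The parameter ranges of Table 1 can be enumerated as follows. This is a small sketch with hypothetical names; the "layers x neurons" labels follow the naming used in the sensitivity tables (for example "2x5").

```python
from itertools import product

# (min, increment, max) ranges taken from Table 1
NEURONS = range(5, 21, 5)             # 5, 10, 15, 20
LAYERS = range(2, 5)                  # 2, 3, 4
TRAINING = [0.5, 0.6, 0.7, 0.8]       # training fraction
MONTHS = range(3, 16, 3)              # 3, 6, 9, 12, 15

def architecture_grid():
    """All layers x neurons combinations, labelled like '2x5'."""
    return [f"{layers}x{neurons}" for layers, neurons in product(LAYERS, NEURONS)]
```

The grid has 3 × 4 = 12 architectures, from "2x5" to "4x20".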

This analysis is applied in the following order: first, the sensitivity to the number of neurons and hidden layers is evaluated, and the best combination is selected from the results. Using this combination, the second sensitivity analysis is performed, adjusting the training and time parameters. This order is chosen because the primary impact on the model comes from the neurons and layers, as these modify the internal structure of the network.

3.6 Loss Function

To evaluate the results of the different model projections, the forecasts are contrasted with the historical volatility value, using the Mean Square Error (MSE) and the Mean Absolute Percentage Error (MAPE) as indicators. The mean square error measures the average of the squared deviations of the forecasts with respect to the historical volatility. Because the deviations are squared, it penalizes larger deviations more heavily. It is worth noting that the neural network is programmed to minimize the same function, but on the training values. This indicator is defined as follows:

$$MSE = \frac{1}{n}\sum_{t=1}^{n}\left(\widehat{HV}_t^{2} - HV_t\right)^2 \tag{15}$$

where $\widehat{HV}_t^{2}$ corresponds to the volatility forecast for time t, $HV_t$ is the actual volatility for time t and n is the number of forecasted periods.

The mean absolute percentage error corresponds to the deviation measured relative to the real value sought. It is determined as follows:

$$MAPE = \frac{1}{n}\sum_{t=1}^{n}\frac{\left|\widehat{HV}_t^{2} - HV_t\right|}{HV_t} \tag{16}$$

This indicator expresses the size of the absolute error relative to the target value, the historical volatility. It is particularly interesting for the objective of this study, since out-of-sample estimation calls for a forecast that is precise in relative magnitude rather than only in absolute magnitude.
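The two loss functions can be sketched as follows. The function names are hypothetical, and the inputs are simply a list of forecasts and a list of actual values of equal length.

```python
def mse(forecast, actual):
    """Mean of the squared deviations between forecast and actual values."""
    return sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual)

def mape(forecast, actual):
    """Mean of the absolute deviations, each divided by the actual value."""
    return sum(abs(f - a) / a for f, a in zip(forecast, actual)) / len(actual)
```

Multiplying both series by 10 multiplies the MSE by 100 but leaves the MAPE unchanged, which illustrates why the MAPE is the relative measure of the two.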

The MSE was selected because it is defined as a robust loss function [74], while the MAPE was selected because it is defined as the loss function adjusted for heteroskedasticity according to Fuertes et al. [75], who call it the heteroskedasticity-adjusted mean absolute error (HMAE).

3.7. Model Confidence Set.

To test the performance, the Model Confidence Set (MCS) was applied [67-68]. The MCS is constructed from a set of models and an evaluation criterion; the procedure is carried out with an equivalence test and an elimination rule. First, the models pass through the equivalence test to determine which models are "equally good". Afterwards, the elimination rule is used to remove models with poor performance in comparison with the others. Thus, the MCS is defined as the set of surviving models. Furthermore, the MCS gives the p-value of the test for each model. The main difference from other tests is that the MCS can select more than one model, and the result therefore depends on the characteristics of the data and how well they can be modeled using a specific methodology.

Specifically, following the general theory in [67] and [68], consider a set $M_0$ containing the models to be tested. The models must first be evaluated with a loss function so that they can be compared correctly. If we seek to compare the performance of different forecasts $\hat{Y}_{i,t}$ against the real data $Y_t$ with a loss function L, we have $L_{i,t} = L(Y_t, \hat{Y}_{i,t})$. The relative performance is then

$$d_{ij,t} = L_{i,t} - L_{j,t}, \quad \text{for all } i, j \in M_0 \tag{17}$$

where $M_0$ is the set of models. Therefore, the objective of the MCS is to determine $M^*$, which satisfies

$$M^* = \{\, i \in M_0 : \mu_{ij} \le 0 \ \text{for all } j \in M_0 \,\} \tag{18}$$

where $\mu_{ij} = E(d_{ij,t})$. The null hypothesis being tested is $H_{0,M}: E[d_{ij,t}] = 0 \ \forall i, j \in M$. If this is rejected, at least one model is considered inferior and is therefore eliminated from the set M. These steps are repeated until the models remaining in M form $\hat{M}^*_{1-\alpha}$ (the $1-\alpha$ model confidence set). The test statistic for the null hypothesis was

$$T_D = \sum_{i \in M} t_i^2 \quad \text{with} \quad t_i = \frac{\bar{d}_i}{\sqrt{\widehat{\mathrm{var}}(\bar{d}_i)}}, \qquad \bar{d}_i = m^{-1} \sum_{j \in M} \bar{d}_{ij}, \qquad \bar{d}_{ij} = T^{-1} \sum_{t=1}^{T} d_{ij,t},$$

where m is the number of models in M. To estimate the distribution, a bootstrap methodology is used, since there is no information about the distribution parameters. Finally, the test results are grouped by MCS p-value. In this study, i = 1 is fixed as the loss function of the best GARCH model, which serves as the benchmark. The test is then used to identify the MS-FNN-GARCH configurations with statistically significantly better performance.
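The averaging step behind the MCS statistic can be sketched as follows. This covers only the mean loss differentials; the variance estimate, the bootstrap and the elimination loop are deliberately left out, and the function name and the input layout (`losses[i][t]` for model i at time t) are assumptions made for illustration.

```python
def mean_loss_differentials(losses):
    """Return (d_bar_ij, d_bar_i) from losses[i][t], the loss of model i at time t."""
    m = len(losses)
    T = len(losses[0])
    # d_bar_ij: time average of L_i,t - L_j,t for every pair of models
    d_bar_ij = [[sum(losses[i][t] - losses[j][t] for t in range(T)) / T
                 for j in range(m)]
                for i in range(m)]
    # d_bar_i: average of d_bar_ij over all models j in the set
    d_bar_i = [sum(row) / m for row in d_bar_ij]
    return d_bar_ij, d_bar_i
```

A model with a large positive average differential loses more than the others on average and is a natural candidate for elimination, while a negative value indicates below-average loss.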

4. Data and Results

4.1 Factor identification and data collection

For this study, the data used for the ANN methodology were the GARCH-type model predictions for the IPC, IPSA and IBOV, the main stock market indexes of Mexico, Chile and Brazil, respectively. Specifically, we use 21 inputs related to the previous predictions (t−a, with a = 1, …, 21). The data source is the Economatica database. For the MS-FNN-GARCH methodology, apart from the basic data, eleven volatility factors were used as additional data, selected according to their relevance in the Latin American market. These factors were six exchange rates (Euro to Dollar, Chilean Peso to Dollar, Mexican Peso to Dollar, Brazilian Real to Dollar, Yen to Dollar, and Peruvian Sol to Dollar), the interest rate of the United States Federal Reserve of St. Louis, and four commodity prices (gold, WTI, copper, and silver). For all of these, the period from 20-02-2001 to 31-12-2010 was selected.

4.2 Results for the Base Parameters

We compared the forecasting ability of the different GARCH-type models over the whole period. The forecasts were compared in terms of MAPE and MSE, as discussed in the previous section. Table 2 summarizes the results for each market.

Table 2. MAPE and MSE for each forecasting GARCH-type model.

| Market | GARCH MSE | GARCH MAPE | TGARCH MSE | TGARCH MAPE | EGARCH MSE | EGARCH MAPE | GARCH-MIDAS MSE | GARCH-MIDAS MAPE | MS-GARCH MSE | MS-GARCH MAPE |
|---|---|---|---|---|---|---|---|---|---|---|
| IPC | 4.15E-07 | 1.186 | 4.11E-07 | 1.144 | 3.50E-07 | 0.936 | 4.27E-07 | 0.975 | 4.27E-07 | 0.959 |
| IPSA | 6.77E-08 | 0.666 | 7.37E-08 | 0.650 | 7.07E-08 | 0.592 | 9.18E-08 | 0.982 | 7.37E-08 | 0.650 |
| IBOV | 1.47E-06 | 0.932 | 1.50E-06 | 0.872 | 1.44E-06 | 0.798 | 1.78E-06 | 0.975 | 1.79E-06 | 0.985 |

The best forecasting model by MSE is shown in bold.

It can be observed that for the Mexican and Brazilian stock markets the best GARCH model for forecasting volatility is the EGARCH, while for the Chilean market it is the basic GARCH. Once the best GARCH model was obtained for each market, its forecast was entered as an input to an ANN with 3 layers and 15 neurons, as well as to the proposed model with the same configuration of layers and neurons. The results can be observed in Table 3.

Table 3. MAPE and MSE for each ANN-GARCH and MS-FNN-GARCH model.

| Market | Best GARCH MSE | Best GARCH MAPE | ANN-GARCH MSE | ANN-GARCH MAPE | MS-FNN-GARCH MSE | MS-FNN-GARCH MAPE |
|---|---|---|---|---|---|---|
| IPC | 3.50E-07 | 0.936 | 3.17E-07 | 0.723 | 3.06E-07 | 0.723 |
| IPSA | 6.77E-08 | 0.666 | 6.13E-08 | 0.599 | 4.96E-08 | 0.498 |
| IBOV | 1.44E-06 | 0.798 | 1.37E-06 | 0.666 | 1.33E-06 | 0.704 |

The best forecasting model by MSE is shown in bold.

Comparing the results of the base case, it can be observed that the basic ANN model performs better than the best GARCH model for each market and that the proposed MS-FNN-GARCH model improves on this further. This implies that the information obtained from the factors analyzed in the ANFIS generates an input that adds value to the ANN. Relative to the best GARCH model, the MSE is reduced by 26.7% in the case of the Chilean stock market, by 12.6% in the case of the IPC, and by 7.3% in the Brazilian case. The accuracy of the forecasts measured by the MAPE also improves, decreasing by 25.1%, 22.8% and 11.8% for the IPSA, IPC and IBOV, respectively, with respect to the best GARCH model. It is noteworthy that in the case of the IBOV, the ANN-GARCH model has a better MAPE than the MS-FNN-GARCH.
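The percentage reductions quoted above are relative changes of a loss with respect to the best GARCH benchmark. A minimal sketch of that calculation, using a hypothetical helper name, is:

```python
def pct_change(model_loss, benchmark_loss):
    """Percentage variation of a loss relative to the benchmark (negative means improvement)."""
    return 100.0 * (model_loss - benchmark_loss) / benchmark_loss

ipsa_mse = pct_change(4.96e-08, 6.77e-08)   # MS-FNN-GARCH vs best GARCH, IPSA
ipc_mse = pct_change(3.06e-07, 3.50e-07)    # MS-FNN-GARCH vs best GARCH, IPC
ipc_mape = pct_change(0.723, 0.936)         # MAPE, IPC
```

Applied to the rounded values in Table 3, this gives about −26.7% and −12.6% for the IPSA and IPC MSE and about −22.8% for the IPC MAPE. Because the table entries are rounded to three significant figures, a recomputed percentage can differ slightly in the last digit from the one reported.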

Because the ANFIS model has two inputs and one output, it is possible to obtain a surface determined by the rules (see Figure 3). A plot of the rules is therefore included in the manuscript. In this figure, it is clear that when GARCH volatility is low and the effect of the external factors is also low, the resulting historical volatility effect is low as well. The same interpretation holds for the high volatility case. In between, when some of the characteristics are high and others are low, the total effect on the historical volatility is neutral, because the combination of the features is inconclusive.

Figure 3: "Surface determined by rules"

5. Sensitivity Analysis.

Even though the proposed methodology provided good results for the different markets, it is still interesting to study whether these results are affected when the magnitudes of the arbitrarily chosen parameters, such as the numbers of neurons and layers, are modified. Therefore, we change several parameters of the base model (3 hidden layers and 15 neurons). The Model Confidence Set (MCS) test is also applied to determine the best forecasting model(s) in each market and for each loss function. Sensitivity results are shown by stock market in Tables 4, 5 and 6. In the case of the Mexican stock market, the best configuration of the MS-FNN-GARCH is obtained with 2 layers and 15 neurons, decreasing the MSE by 18.9% and the MAPE by 25.1%. Applying the MCS test, it is observed that for the MSE no model is as good as the MS-FNN-GARCH 2Lx15N; meanwhile, for the MAPE it is also the best configuration, but the 2x20 and 3x20 models are also statistically significant and superior ($\widehat{\mathcal{M}}_{90}$).

Table 4. Model Sensitivity by Neurons and Layers for IPC volatility.

| Model | MSE | Var. | p-value | MAPE | Var. | p-value |
|---|---|---|---|---|---|---|
| 2x5 | 2.97E-07 | -15.2% | 0.504 | 0.735 | -21.6% | 0.000 |
| 2x10 | 3.02E-07 | -13.9% | 0.504 | 0.735 | -21.6% | 0.000 |
| 2x15 | 2.84E-07 | -18.9% | 1.000 | 0.702 | -25.1% | 1.000 |
| 2x20 | 2.86E-07 | -18.5% | 0.841 | 0.706 | -24.6% | 0.987 |
| 3x5 | 3.05E-07 | -13.1% | 0.434 | 0.780 | -16.7% | 0.000 |
| 3x10 | 3.02E-07 | -13.8% | 0.504 | 0.721 | -23.0% | 0.328 |
| 3x15 | 3.06E-07 | -12.6% | 0.434 | 0.723 | -22.8% | 0.000 |
| 3x20 | 2.93E-07 | -16.4% | 0.504 | 0.714 | -23.7% | 0.979 |
| 4x5 | 3.17E-07 | -9.4% | 0.434 | 0.781 | -16.6% | 0.000 |
| 4x10 | 3.09E-07 | -11.7% | 0.434 | 0.747 | -20.2% | 0.000 |
| 4x15 | 3.10E-07 | -11.6% | 0.434 | 0.785 | -16.2% | 0.000 |
| 4x20 | 3.09E-07 | -11.8% | 0.434 | 0.761 | -18.7% | 0.000 |
| ANN-GARCH | 3.17E-07 | -9.5% | 0.063 | 0.723 | -22.8% | 0.000 |
| GARCH | 3.50E-07 | | 0.434 | 0.936 | | 0.000 |

GARCH is the best GARCH-type model. Var. corresponds to the percentage variation of the loss function with respect to the best GARCH model. P-value is the value obtained by applying the MCS test. The bold font indicates the best models by MCS.

In the case of the IPSA, the best configuration of the MS-FNN-GARCH is obtained with 2 layers and 15 neurons, just as in the case of the IPC. This configuration decreases the MSE by 33.8%. According to the MCS test, the 2Lx20N configuration is equally good, since its p-value is greater than 0.90. For the MAPE, the 2x15 configuration remains the best, decreasing it by 25.9%. Applying the MCS test, it is observed that the 2x10, 2x20 and 3x15 configurations are also statistically significant and superior ($\widehat{\mathcal{M}}_{90}$).

Table 5. Model Sensitivity by Neurons and Layers for IPSA volatility.

| Model | MSE | Var. | p-value | MAPE | Var. | p-value |
|---|---|---|---|---|---|---|
| 2x5 | 4.87E-08 | -28.0% | 0.832 | 0.519 | -22.0% | 0.423 |
| 2x10 | 4.80E-08 | -29.1% | 0.832 | 0.500 | -24.9% | 0.942 |
| 2x15 | 4.48E-08 | -33.8% | 1.000 | 0.493 | -25.9% | 1.000 |
| 2x20 | 4.54E-08 | -33.0% | 0.993 | 0.495 | -25.7% | 0.943 |
| 3x5 | 5.53E-08 | -18.4% | 0.656 | 0.530 | -20.3% | 0.188 |
| 3x10 | 5.00E-08 | -26.2% | 0.832 | 0.532 | -20.1% | 0.188 |
| 3x15 | 4.96E-08 | -26.7% | 0.832 | 0.498 | -25.1% | 0.943 |
| 3x20 | 5.33E-08 | -21.3% | 0.689 | 0.528 | -20.6% | 0.236 |
| 4x5 | 5.55E-08 | -18.0% | 0.656 | 0.538 | -19.2% | 0.101 |
| 4x10 | 5.65E-08 | -16.5% | 0.656 | 0.532 | -20.0% | 0.101 |
| 4x15 | 5.47E-08 | -19.3% | 0.656 | 0.532 | -20.0% | 0.101 |
| 4x20 | 5.45E-08 | -19.5% | 0.656 | 0.530 | -20.4% | 0.101 |
| ANN-GARCH | 6.13E-08 | -9.4% | 0.176 | 0.599 | -10.0% | 0.000 |
| GARCH | 6.77E-08 | | 0.539 | 0.666 | | 0.101 |

GARCH is the best GARCH-type model. Var. corresponds to the percentage variation of the loss function with respect to the best GARCH model. P-value is the value obtained by applying the MCS test. The bold font indicates the best models by MCS.

Lastly, for the Brazilian stock market, the MSE decreases by 10% with the 2x15 configuration, which is the best configuration just as in the other two markets. According to the MCS test, the 2x5, 2x10, 3x10, 3x15, 4x5, 4x10 and 4x15 models are equally good, since they are statistically significant and superior ($\widehat{\mathcal{M}}_{90}$). For the MAPE, the ANN-GARCH configuration is the best model and the only good model according to the MCS test, which leads to the conclusion that in this case the fuzzy inference does not provide additional information to the ANN for obtaining a better volatility forecast.

Table 6. Model Sensitivity by Neurons and Layers for IBOV volatility.

| Model | MSE | Var. | p-value | MAPE | Var. | p-value |
|---|---|---|---|---|---|---|
| 2x5 | 1.35E-06 | -6.1% | 0.940 | 0.707 | -11.4% | 0.891 |
| 2x10 | 1.35E-06 | -5.9% | 0.940 | 0.684 | -14.4% | 0.891 |
| 2x15 | 1.29E-06 | -10.0% | 1.000 | 0.680 | -14.9% | 0.891 |
| 2x20 | 1.41E-06 | -1.9% | 0.534 | 0.743 | -6.9% | 0.570 |
| 3x5 | 1.37E-06 | -4.4% | 0.534 | 0.720 | -9.8% | 0.631 |
| 3x10 | 1.33E-06 | -7.1% | 0.940 | 0.695 | -13.0% | 0.891 |
| 3x15 | 1.33E-06 | -7.3% | 0.940 | 0.704 | -11.8% | 0.826 |
| 3x20 | 1.36E-06 | -5.4% | 0.666 | 0.693 | -13.1% | 0.891 |
| 4x5 | 1.33E-06 | -7.5% | 0.940 | 0.754 | -5.5% | 0.098 |
| 4x10 | 1.32E-06 | -7.9% | 0.940 | 0.733 | -8.2% | 0.213 |
| 4x15 | 1.32E-06 | -8.0% | 0.940 | 0.686 | -14.1% | 0.891 |
| 4x20 | 1.38E-06 | -3.7% | 0.666 | 0.720 | -9.8% | 0.570 |
| ANN-GARCH | 1.37E-06 | -4.4% | 0.534 | 0.666 | -16.6% | 1.000 |
| GARCH | 1.44E-06 | | 0.534 | 0.798 | | 0.213 |

GARCH is the best GARCH-type model. Var. corresponds to the percentage variation of the loss function with respect to the best GARCH model. P-value is the value obtained by applying the MCS test. The bold font indicates the best models by MCS.

To find the best configuration of the model, a further sensitivity analysis is performed. This complementary analysis varies the training percentage (and therefore the test and validation percentages) and the time window.

Table 7. Sensitivity of Training and Time Parameters for the best MS-FNN-GARCH model for each stock market.

| Sensitivity Variable | Value | IPC 2x15 MSE | IPC 2x15 MAPE | IPSA 2x15 MSE | IPSA 2x15 MAPE | IBOV 2x15 MSE | IBOV 2x15 MAPE |
|---|---|---|---|---|---|---|---|
| Training | 0.5 | 3.05E-07 | 0.782 | 4.87E-08 | 0.515 | 1.36E-06 | 0.783 |
| Training | 0.6 | 3.05E-07 | 0.742 | 4.42E-08 | 0.522 | 1.30E-06 | 0.731 |
| Training | 0.7 | 2.84E-07 | 0.702 | 4.48E-08 | 0.493 | 1.29E-06 | 0.680 |
| Training | 0.8 | 2.88E-07 | 0.696 | 4.89E-08 | 0.510 | 1.28E-06 | 0.672 |
| Time | 3 month | 2.46E-07 | 0.614 | 5.70E-08 | 0.532 | 1.22E-06 | 0.558 |
| Time | 6 month | 3.29E-07 | 0.669 | 7.21E-08 | 0.545 | 1.48E-06 | 0.695 |
| Time | 9 month | 3.00E-07 | 0.643 | 5.12E-08 | 0.529 | 1.47E-06 | 0.720 |
| Time | 12 month | 2.84E-07 | 0.702 | 4.48E-08 | 0.493 | 1.29E-06 | 0.680 |
| Time | 15 month | 2.91E-07 | 0.831 | 4.89E-08 | 0.554 | 1.20E-06 | 0.735 |

The bold font indicates the optimal models for each market and sensitivity variable according to the MCS test ($\widehat{\mathcal{M}}_{90}$).

Clearly, the parameters directly affect model predictions and functionality. In the case of the IPC, the best training percentage was the base case: 70% training, 15% testing, and 15% validation. In terms of the MAPE, the same configuration is selected, although 80% training is equally good according to the MCS test. In the case of the IPSA, by MSE the two best models are those with 70% and 80% training, while by MAPE it is only the one with 70% training. Therefore, it is concluded that for the IPC and for the IPSA the best forecasting model is the MS-FNN-GARCH 2Lx15N. For the Brazilian market, the best prediction according to MSE is obtained with the 80% training model, while according to MAPE the best are 70% and 80% training. Thus, the best model for Brazil is the one with 80% training.

In the case of the time window, the results were inconclusive in the three markets and showed that analyzing longer periods of time (and therefore more data) is not guaranteed to improve the MS-FNN-GARCH forecast. Both the IPC and the IBOV exhibit this behavior in terms of MAPE, with the best configuration being 3 months. The IPSA has a local minimum at 12 months in terms of both MAPE and MSE. On the other hand, the IBOV shows a minimum at 15 months for MSE, whereas by MAPE the minimum is at 3 months.

6. Conclusions

This study focuses on improving the forecasting power of the well-known GARCH model by combining forecasts and additional external (macroeconomic) variables in a fuzzy intelligent system that allows us to analyze new relationships and obtain more precise results.

The methodology allows us to conclude that an important improvement is obtained by using fuzzy system inference in combination with analytic models, in comparison with the results delivered by each individually. This makes hybrid models, which have the capacity to detect relationships through time series, attractive, since they create an efficient process that uses these relationships to obtain accurate results. Moreover, the method with a combination of techniques delivers better results, demonstrating that the synergy among the models results in an improved forecast.

The results of the MS-FNN-GARCH method are consistent in improving the forecast, in both squared and absolute percentage error, reducing them in all cases compared with the best GARCH model (42.3% for the IPC, 51.1% for the IPSA and 20% for the IBOV in MSE; 52.4% for the IPC, 35.1% for the IPSA and 43% for the IBOV in MAPE) and also compared with the ANN-GARCH approach (28.9% for the IPC, 36.8% for the IPSA and 14.2% for the IBOV in MSE; 17.8% for the IPC, 21.5% for the IPSA and 19.4% for the IBOV in MAPE). Our results support the study hypothesis that the combination of several soft computing techniques provides a better forecast than an analytical model. It is important to note that the forecast obtained with the proposed methodology is better in terms of both dispersion and point error, since it reduces both the MSE and the MAPE. These results were also tested with the MCS approach, which supports the results found. The search for the best configuration of the proposed methodology was also tested in terms of sensitivity to several parameters that were arbitrarily chosen in the base model. This allows us to obtain even better results with a new combination of these parameters, further reducing the errors.

Comparing these results with those obtained by Kristjanpoller et al. [66], it can be observed, using the comparable loss function MAPE, that the MS-FNN-GARCH has greater accuracy than the ANN-GARCH. Therefore, it can be concluded that the fuzzy inference contributes to the ANN forecast by providing valuable information. Thus, in this study, it is verified that the use of soft computing in combination with an analytical econometric model provides a better forecast than the individual techniques. This is because the methodology allows us to relate more of the factors that influence the behavior of a particular index and therefore to identify patterns of high and low impact.

References.

[1] Engle, R. F., & Sheppard, K. (2001). Theoretical and empirical properties of dynamic conditional correlation multivariate GARCH (No. w8554). National Bureau of Economic Research.

[2] Gately, E. (1995). Neural networks for financial forecasting. John Wiley & Sons, Inc.

[3] Kaastra, I., & Boyd, M. (1996). Designing a neural network for forecasting financial and economic time series. Neurocomputing, 10(3), 215-236.

[4] Adhikari, R., & Agrawal, R. K. (2014). A combination of artificial neural network and random walk models for financial time series forecasting. Neural Computing and Applications, 24(6), 1441-1449.

[5] Chkili, W., & Nguyen, D. K. (2014). Exchange rate movements and stock market returns in a regime-switching environment: Evidence for BRICS countries. Research in International Business and Finance, 31, 46-56.

[6] Wei, L. Y. (2013). A GA-weighted ANFIS model based on multiple stock market volatility causality for TAIEX forecasting. Applied Soft Computing, 13(2), 911-920.

[7] Luo, L., & Chen, X. (2013). Integrating piecewise linear representation and weighted support vector machine for stock trading signal prediction. Applied Soft Computing, 13(2), 806-816.

[8] Wei, L. Y., Cheng, C. H., & Wu, H. H. (2014). A hybrid ANFIS based on n-period moving average model to forecast TAIEX stock. Applied Soft Computing, 19, 86-92.

[9] Ahmadifard, M., Sadenejad, F., Mohammadi, I., & Aramesh, K. (2013). Forecasting stock market return using ANFIS: the case of Tehran Stock Exchange. International Journal of Advanced Studies in Humanities and Social Science, 1(5), 452-459.

[10] Engle, R. F., Ghysels, E., & Sohn, B. (2013). Stock market volatility and macroeconomic fundamentals. Review of Economics and Statistics, 95(3), 776-797.

[11] Corradi, V., Distaso, W., & Mele, A. (2013). Macroeconomic determinants of stock volatility and volatility premiums. Journal of Monetary Economics, 60(2), 203-220.

[12] Zakaria, Z., & Shamsuddin, S. (2012). Empirical evidence on the relationship between stock market volatility and macroeconomics volatility in Malaysia. Journal of Business Studies Quarterly, 4(2), 61.

[13] Attari, M. I. J., Safdar, L., & Student, M. B. A. (2013). The relationship between macroeconomic volatility and the stock market volatility: Empirical evidence from Pakistan. Pakistan Journal of Commerce and Social Sciences, 7(2), 309-320.

[14] Wei, Y., Wang, Y., & Huang, D. (2010). Forecasting crude oil market volatility: Further evidence using GARCH-class models. Energy Economics, 32(6), 1477-1484.

[15] Liu, H. C., & Hung, J. C. (2010). Forecasting S&P-100 stock index volatility: The role of volatility asymmetry and distributional assumption in GARCH models. Expert Systems with Applications, 37(7), 4928-4934.

[16] Creti, A., Joëts, M., & Mignon, V. (2013). On the links between stock and commodity markets' volatility. Energy Economics, 37, 16-28.

[17] Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica: Journal of the Econometric Society, 987-1007.

[18] Sadorsky, P. (2014). Modeling volatility and correlations between emerging market stock prices and the prices of copper, oil and wheat. Energy Economics, 43, 72-81.

[19] Girardin, E., & Joyeux, R. (2013). Macro fundamentals as a source of stock market volatility in China: A GARCH-MIDAS approach. Economic Modelling, 34, 59-68.

[20] Engle, R. F., Ghysels, E., & Sohn, B. (2008, August). On the economic sources of stock market volatility. In AFA 2008 New Orleans Meetings Paper.

[21] Beirne, J., Caporale, G. M., Schulze-Ghattas, M., & Spagnolo, N. (2010). Global and regional spillovers in emerging stock markets: A multivariate GARCH-in-mean analysis. Emerging Markets Review, 11(3), 250-260.

[22] Syriopoulos, T., Makram, B., & Boubaker, A. (2015). Stock market volatility spillovers and portfolio hedging: BRICS and the financial crisis. International Review of Financial Analysis, 39, 7-18.

[23] Choi, K., & Hammoudeh, S. (2010). Volatility behavior of oil, industrial commodity and stock markets in a regime-switching environment. Energy Policy, 38(8), 4388-4399.

[24] Walid, C., Chaker, A., Masood, O., & Fry, J. (2011). Stock market volatility and exchange rates in emerging countries: A Markov-state switching approach. Emerging Markets Review, 12(3), 272-292.

[25] Xinyi, L., Margaritis, D., & Wang, P. (2013). Stock Market Volatility and Equity Returns: Evidence from a Two-State Markov-Switching Model with Regressors. Journal of Empirical Finance, Forthcoming.

[26] Reher, G., & Wilfling, B. (2016). A nesting framework for Markov-switching GARCH modelling with an application to the German stock market. Quantitative Finance, 16(3), 411-426.

[27] Chuang, W. I., Huang, T. C., & Lin, B. H. (2013). Predicting volatility using the Markov-switching multifractal model: Evidence from S&P 100 index and equity options. The North American Journal of Economics and Finance, 25, 168-187.

[28] Dixit, G., Roy, D., & Uppal, N. (2013). Predicting India Volatility Index: An Application of Artificial Neural Network. International Journal of Computer Applications, 70(4).

[29] Mantri, J. K., Gahan, P., & Nayak, B. B. (2014). Artificial neural networks: an application to stock market volatility. Soft-Computing in Capital Market: Research and Methods of Computational Finance for Measuring Risk of Financial Instruments, 179.

[30] Babu, C. N., & Reddy, B. E. (2014). A moving-average filter based hybrid ARIMA–ANN model for forecasting time series data. Applied Soft Computing, 23, 27-38.

[31] Hajizadeh, E., Seifi, A., Zarandi, M. F., & Turksen, I. B. (2012). A hybrid modeling approach for forecasting the volatility of S&P 500 index return. Expert Systems with Applications, 39(1), 431-436.

[32] Monfared, S. A., & Enke, D. (2014). Volatility forecasting using a hybrid GJR-GARCH neural network model. Procedia Computer Science, 36, 246-253.

33

[33] Kristjanpoller, W., &Minutolo, M. C. (2015). Gold price volatility: A forecasting approach using the Artificial Neural Network–GARCH model. Expert Systems with Applications, 42(20), 7245-7251. [34] Hung, J. C. (2011). Applying a combined fuzzy systems and GARCH model to adaptively forecast stock market volatility. Applied Soft Computing, 11(5), 3938-3945. [35] Dash, R., & Dash, P. K. (2016). An evolutionary hybrid Fuzzy Computationally Efficient EGARCH model for volatility prediction. Applied Soft Computing, 45, 40-60.

IP T

[36] Atsalakis, G. S., &Valavanis, K. P. (2009). Forecasting stock market short-term trends using a neuro-fuzzy based methodology. Expert Systems with Applications, 36(7), 10696-10707. [37] Hong, Y. Y., & Lee, C. F. (2005). A neuro-fuzzy price forecasting approach in deregulated electricity markets. Electric Power Systems Research, 73(2), 151-157.

[38] Atsalakis, G. S., & Valavanis, K. P. (2009). Forecasting stock market short-term trends using a neuro-fuzzy based methodology. Expert Systems with Applications, 36(7), 10696-10707.

[39] Corsi, F., Mittnik, S., Pigorsch, C., & Pigorsch, U. (2008). The volatility of realized volatility. Econometric Reviews, 27(1-3), 46-78.

[40] Andersen, T. G., & Bollerslev, T. (1998). Answering the skeptics: Yes, standard volatility models do provide accurate forecasts. International economic review, 885-905.

[41] Hansen, P. R., & Lunde, A. (2005). A forecast comparison of volatility models: does anything beat a GARCH (1, 1)? Journal of applied econometrics, 20(7), 873-889.

[42] Hamilton, J. D., & Susmel, R. (1994). Autoregressive conditional heteroskedasticity and changes in regime. Journal of Econometrics, 64(1), 307-333.

[43] Lux, T., & Marchesi, M. (2000). Volatility clustering in financial markets: a microsimulation of interacting agents. International journal of theoretical and applied finance, 3(04), 675-702.

[44] Jang, J. S. (1993). ANFIS: adaptive-network-based fuzzy inference system. IEEE transactions on systems, man, and cybernetics, 23(3), 665-685.

[45] Kim, B., & Bishu, R. R. (1998). Evaluation of fuzzy linear regression models by comparing membership functions. Fuzzy sets and systems, 100(1), 343-352.

[46] Hong, T. P., & Lee, C. Y. (1996). Induction of fuzzy rules and membership functions from training examples. Fuzzy sets and Systems, 84(1), 33-47.

[47] Civanlar, M. R., & Trussell, H. J. (1986). Constructing membership functions using statistical data. Fuzzy sets and Systems, 18(1), 1-13.

[48] Takagi, T., & Sugeno, M. (1983, July). Derivation of fuzzy control rules from human operator’s control actions. In Proceedings of the IFAC symposium on fuzzy information, knowledge representation and decision analysis (Vol. 6, pp. 55-60).

[49] Nelles, O. (2013). Nonlinear system identification: from classical approaches to neural networks and fuzzy models. Springer Science & Business Media.

[50] Hecht-Nielsen, R. (1989, June). Theory of the backpropagation neural network. In Neural Networks, 1989. IJCNN., International Joint Conference on (pp. 593-605). IEEE.

[51] Mailis, T., Stoilos, G., & Stamou, G. (2007, June). Expressive reasoning with horn rules and fuzzy description logics. In International Conference on Web Reasoning and Rule Systems (pp. 43-57). Springer Berlin Heidelberg.

[52] Lahmiri, S. (2017). Modeling and predicting historical volatility in exchange rate markets. Physica A: Statistical Mechanics and its Applications, 471, 387-395.

[53] Amilon, H. (2003). A neural network versus Black-Scholes: A comparison of pricing and hedging performances. Journal of Forecasting, 22(4), 317.

[54] Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of econometrics, 31(3), 307-327.

[55] Taylor, S. J. (1986). Modelling financial time series. Wiley, New York, NY, United States.

[56] Zakoian, J. M. (1994). Threshold heteroskedastic models. Journal of Economic Dynamics and control, 18(5), 931-955.

[57] Nelson, D. B. (1991). Conditional heteroskedasticity in asset returns: A new approach. Econometrica: Journal of the Econometric Society, 347-370.

[58] Hamilton, J. D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica: Journal of the Econometric Society, 357-384.

[59] Hamilton, J. D. (1990). Analysis of time series subject to changes in regime. Journal of econometrics, 45(1-2), 39-70.

[60] Hamilton, J. D., & Susmel, R. (1994). Autoregressive conditional heteroskedasticity and changes in regime. Journal of econometrics, 64(1), 307-333.

[61] Cai, J. (1994). A Markov model of switching-regime ARCH. Journal of Business & Economic Statistics, 12(3), 309-316.

[62] Gray, S. F. (1996). Modeling the conditional distribution of interest rates as a regime-switching process. Journal of Financial Economics, 42(1), 27-62.

[63] Klaassen, F. (2002). Improving GARCH volatility forecasts with regime-switching GARCH. In Advances in Markov-Switching Models (pp. 223-254). Physica-Verlag HD.

[64] Ghysels, E., Sinko, A., & Valkanov, R. (2007). MIDAS regressions: Further results and new directions. Econometric Reviews, 26(1), 53-90.

[65] Engle, R. F., & Rangel, J. G. (2008). The spline-GARCH model for low-frequency volatility and its global macroeconomic causes. Review of Financial Studies, 21(3), 1187-1222.

[66] Kristjanpoller, W., Fadic, A., & Minutolo, M. C. (2014). Volatility forecast using hybrid Neural Network models. Expert Systems with Applications, 41(5), 2437-2442.

[67] Hansen, P. R., Lunde, A., & Nason, J. M. (2003). Choosing the best volatility models: The model confidence set approach. Oxford Bulletin of Economics and Statistics, 65(s1), 839-861.

[68] Hansen, P. R., Lunde, A., & Nason, J. M. (2011). The model confidence set. Econometrica, 79(2), 453-497.

[69] Gospodinov, N., & Jamali, I. (2015). The response of stock market volatility to futures-based measures of monetary policy shocks. International Review of Economics & Finance, 37, 42-54.

[70] Kandil, M. (2004). Exchange rate fluctuations and economic activity in developing countries: Theory and evidence. Journal of Economic Development, 29, 85-108.

[71] Mensi, W., Beljid, M., Boubaker, A., & Managi, S. (2013). Correlations and volatility spillovers across commodity and stock markets: Linking energies, food, and gold. Economic Modelling, 32, 15-22.

[72] Wang, Y., Wei, Y., & Wu, C. (2010). Cross-correlations between Chinese A-share and B-share markets. Physica A: Statistical Mechanics and its Applications, 389(23), 5468-5478.

[73] Chan, F., Marinova, D., & McAleer, M. (2004). Modelling the asymmetric volatility of anti-pollution patents in the USA. Scientometrics, 59(2), 179-197.

[74] Patton, A. J. (2011). Volatility forecast comparison using imperfect volatility proxies. Journal of Econometrics, 160(1), 246-256.

[75] Fuertes, A. M., Izzeldin, M., & Kalotychou, E. (2009). On forecasting daily stock volatility: The role of intraday information and market conditions. International Journal of Forecasting, 25(2), 259-281.

[76] Jovanovic, R., Pomares, L. M., Mohieldeen, Y. E., Perez-Astudillo, D., & Bachour, D. (2017). An evolutionary method for creating ensembles with adaptive size neural networks for predicting hourly solar irradiance. In 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK (pp. 1962-1967). doi: 10.1109/IJCNN.2017.7966091

[77] Zameer, A., Arshad, J., Khan, A., & Raja, M. A. Z. (2017). Intelligent and robust prediction of short term wind power using genetic programming based ensemble of neural networks. Energy conversion and management, 134, 361-372.

[78] Qureshi, A. S., Khan, A., Zameer, A., & Usman, A. (2017). Wind power prediction using deep neural network based meta regression and transfer learning. Applied Soft Computing, 58, 742-755.
