Forecasting tourism demand to Catalonia: Neural networks vs. time series models


Economic Modelling 36 (2014) 220–228


Oscar Claveria a,⁎, Salvador Torra b

a Research Institute of Applied Economics (IREA), University of Barcelona, Barcelona 08034, Spain
b Department of Econometrics and Statistics, University of Barcelona, Barcelona 08034, Spain

Article info

Article history: Accepted 17 September 2013
JEL classification: C53, C42, C45, L83
Keywords: Forecasting; Time series models; Neural networks; Tourism demand; Catalonia

Abstract

The increasing interest aroused by more advanced forecasting techniques, together with the requirement for more accurate forecasts of tourism demand at the destination level due to the constant growth of world tourism, has led us to evaluate the forecasting performance of neural modelling relative to that of time series methods at a regional level. Seasonality and volatility are important features of tourism data, which makes it a particularly favourable context in which to compare the forecasting performance of linear models to that of nonlinear alternative approaches. Pre-processed official statistical data of overnight stays and tourist arrivals from all the different countries of origin to Catalonia from 2001 to 2009 is used in the study. When comparing the forecasting accuracy of the different techniques for different time horizons, autoregressive integrated moving average models outperform self-exciting threshold autoregressions and artificial neural network models, especially for shorter horizons. These results suggest that there is a trade-off between the degree of pre-processing and the accuracy of the forecasts obtained with neural networks, which are more suitable in the presence of nonlinearity in the data. In spite of the significant differences between countries, which can be explained by different patterns of consumer behaviour, we also find that forecasts of tourist arrivals are more accurate than forecasts of overnight stays. © 2013 Elsevier B.V. All rights reserved.

⁎ Corresponding author at: Department of Econometrics and Statistics, University of Barcelona, Diagonal 690, 08034 Barcelona, Spain. Tel.: +34 934021825; fax: +34 934021821. E-mail addresses: [email protected] (O. Claveria), [email protected] (S. Torra).
0264-9993/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.econmod.2013.09.024

1. Introduction

Many stationary phenomena can be approximated by linear time series models. Nevertheless, it is generally believed that nonlinear methods outperform linear methods in modelling economic behaviour. Artificial intelligence techniques have become an essential tool for economic modelling and forecasting, as they are far better able to handle nonlinear behaviour. Neural networks have been applied in many areas, but only recently to tourism demand forecasting. Tourism data is characterised by strong seasonal patterns and volatility; thus the original series requires significant pre-processing before it can be used for forecasting purposes. While eliminating the existing outliers and adjusting the seasonal component of the series, this filtering process ends up conditioning the forecasting performance of the models. Therefore, tourism demand is a particularly interesting field in which to analyse the effects of data pre-processing on forecast accuracy and to compare the forecasting performance of neural networks relative to that of time series models.

There has been a growing interest in tourism research over the past decades. Some of the reasons for this increase in the number of studies of tourism demand modelling and forecasting are: the constant growth of world tourism, the utilisation of more advanced forecasting techniques in tourism research and the requirement for more accurate forecasts of tourism demand at the destination level. The consolidation of tourism planning at a regional level in many countries, such as Spain (Ivars, 2004), is one of the main reasons behind the increasing demand for accurate forecasts of tourist arrivals in a specific region. Despite the consensus on the need to develop accurate forecasts, very few studies have been undertaken at a regional level due to the lack of statistical information. All this has led us to focus on forecasting inbound international tourism demand to Catalonia, which is one of the main tourist destinations in Europe (Gary and Cànoves, 2011).

Catalonia is one of the seventeen autonomous communities in Spain. Barcelona is its capital. Over 14 million foreign visitors come to Catalonia every year, leading to 111 million overnight stays. Tourism makes a major contribution to Catalonia's economic development: it accounts for 12% of GDP and provides employment for around 19% of the working population in the service sector. Therefore, accurate forecasts of tourism volume play a major role in tourism planning as they enable destinations to predict infrastructure development needs.

The forecast of tourism volume in the form of arrivals is especially important because it is an indicator of future demand (Chu, 2009). Despite the fact that tourist arrivals are the most popular measure of tourism demand, some studies have used tourist expenditure in the destination (Li et al., 2006a), tourism revenues (Akal, 2004) or tourism employment


(Witt et al., 2004). To our knowledge, there is only one previous study (Claveria and Datzira, 2010) that has used overnight stays as a proxy measure of tourism demand to compare the resulting forecasts to those of tourist arrivals.

According to Song and Li (2008), who reviewed the literature on tourism demand modelling and forecasting, there is no one model that stands out in terms of forecasting accuracy. Following Coshall and Charlesworth (2010), studies of tourism demand forecasting can be subdivided into causal econometric models and non-causal time series models. On the one hand, the most commonly used causal econometric models found in the literature are: cointegration and error correction (ECM) models (Algieri, 2006; Dritsakis, 2004), time varying parameter (TVP) models (Song and Wong, 2003), structural equation (SEQ) models (Turner and Witt, 2001), vector autoregressive (VAR) models (Song and Witt, 2006) and linear almost ideal system (LAIDS) models (Han et al., 2006). These methods have also been combined (Li et al., 2006b). On the other hand, the most widely used procedures in non-causal time series forecasting are the autoregressive integrated moving average (ARIMA) models (Goh and Law, 2002) and the exponential smoothing (ES) models (Cho, 2003). Less frequently applied are nonlinear methods such as self-exciting threshold autoregressions (SETAR) and Markov-switching regime models (Claveria and Datzira, 2010).

Recently, artificial intelligence (AI) methods have also been implemented in tourism forecasting. The most commonly used AI methods are artificial neural network (ANN) models. ANN models have been applied in many fields, but only recently to tourism demand forecasting (Kon and Turner, 2005; Palmer et al., 2006).

This increasing interest in more advanced forecasting techniques, together with the fact that tourism has become a leading global industry, contributing to a significant proportion of world production, trade, investments and employment, has led us to evaluate the forecasting performance of artificial neural network models relative to that of the most widely used procedures in tourism demand modelling. We use different forecasting horizons and compare the forecasting performance of two different measures of tourism demand (tourist arrivals and overnight stays) for all the different countries of origin to Catalonia.

The main objective of the paper is to evaluate the forecasting performance of artificial neural networks relative to different time series models (ARIMA and SETAR models) at a regional level. We use official statistical data of inbound international tourism demand to Catalonia from 2001 to 2009. Then the root mean squared forecast error (RMSFE) is computed for different forecast horizons (1, 2, 3, 6 and 12 months) and the Diebold–Mariano loss-differential test for predictive accuracy is performed in order to compare the different methods for both tourist arrivals and overnight stays.

The structure of the paper is as follows. Section 2 briefly describes our methodological approach, including both time series models and artificial neural network models. The data set is described in Section 3. In Section 4 results of the forecasting competition are discussed. Last, conclusions are given in Section 5.

2. Methodology

2.1. Time series models

A time series model explains a variable with regard to its own past and a random disturbance term. Time series models have been widely used for tourism demand forecasting in the past four decades, with the dominance of the autoregressive integrated moving average (ARIMA) models proposed by Box and Jenkins (1970). In this work two different time series models are used to obtain forecasts for the quantitative variables expressed as year-on-year growth rates: autoregressive integrated moving average (ARIMA) models and self-exciting threshold autoregression (SETAR) models.


2.1.1. Autoregressive integrated moving average models (ARIMA)

The general expression of an ARIMA model is the following:

\[
x_t^{\lambda} = \frac{\Theta_s(L^s)\,\theta(L)}{\Phi_s(L^s)\,\phi(L)\,\Delta_s^{D}\,\Delta^{d}}\;\varepsilon_t \tag{1}
\]

where $\Theta_s(L^s) = (1 - \Theta_s L^s - \Theta_{2s} L^{2s} - \dots - \Theta_{Qs} L^{Qs})$ is a seasonal moving average polynomial, $\Phi_s(L^s) = (1 - \Phi_s L^s - \Phi_{2s} L^{2s} - \dots - \Phi_{Ps} L^{Ps})$ is a seasonal autoregressive polynomial, $\theta(L) = (1 - \theta_1 L - \theta_2 L^2 - \dots - \theta_q L^q)$ is a regular moving average polynomial, $\phi(L) = (1 - \phi_1 L - \phi_2 L^2 - \dots - \phi_p L^p)$ is a regular autoregressive polynomial, $\lambda$ is the value of the Box and Cox (1964) transformation, $\Delta_s^{D}$ is the seasonal difference operator, $\Delta^{d}$ is the regular difference operator, $s$ is the periodicity of the considered time series, and $\varepsilon_t$ is the innovation, which is assumed to behave as white noise.

In order to use this kind of model for forecasting purposes, we have designed an algorithm that identifies the best suited model, including the necessary number of differences $D$ and $d$. To determine the number of lags that should be included in the model, we have selected the model with the lowest value of the Akaike Information Criterion (AIC), considering models with a minimum of 1 lag up to a maximum of 12 (including all the intermediate lags).

2.1.2. Self-exciting threshold autoregression models (SETAR)

A self-exciting threshold autoregressive (SETAR) model for the time series $x_t$ can be summarised as follows:

\[
B(L)\,x_t + u_t \quad \text{if } x_{t-k} \le x \tag{2}
\]

\[
\zeta(L)\,x_t + v_t \quad \text{if } x_{t-k} > x \tag{3}
\]

where $u_t$ and $v_t$ are white noises, $B(L)$ and $\zeta(L)$ are autoregressive polynomials, the value $k$ is known as the delay and the value $x$ is known as the threshold. This two-regime self-exciting threshold autoregressive process is estimated for each series, and a Monte Carlo procedure is used to generate multi-step forecasts. The selected values of the delay are those minimising the sum of squared errors among values between 1 and 12. The values of the threshold are given by the variation of the analysed variable.

2.2. Artificial neural network (ANN) models

In recent years the study of artificial neural network (ANN) models has aroused great interest, as they are universal function approximators capable of mapping any linear or non-linear function. Neural networks have been applied in many fields, but they are increasingly being used for prediction and classification, the areas where statistical methods have traditionally been used (Adya and Collopy, 1998; Estrella and Mishkin, 1998; Swanson and White, 1997). Only recently have ANN models been used for tourism demand forecasting (Cho, 2003; Kon and Turner, 2005; Law, 2000, 2001; Law and Au, 1999; Palmer et al., 2006; Tsaur et al., 2002).

ANN models have two learning methods: supervised and unsupervised. The neural network model most widely used in time series forecasting is the multi-layer perceptron (MLP). The MLP is a supervised neural network based on the original simple perceptron model, but with additional hidden layers of neurons between the input and output layers that increase its learning power. The number of hidden neurons determines the MLP network's capacity to learn. It is generally recommended to select the network that performs best with the smallest possible number of hidden neurons.

Due to their flexibility, ANN models lack a systematic procedure for model building. Therefore, obtaining a reliable neural model involves selecting a large number of parameters experimentally through trial and error (Palmer et al., 2006). Kock and Teräsvirta (2011) and Zhang et al. (1998) review the main ANN modelling issues: the network


[Fig. 1, panels A to C: line charts of year-on-year growth rates, 2002–2009. Panel A: Tourists; Overnights. Panels B and C, one chart per country of origin: France; United Kingdom; Belgium and Netherlands; Germany; Italy; United States and Japan; Northern countries; Switzerland; Russia; Other countries.]

Fig. 1. A. Evolution of international overnight stays and tourist arrivals in Catalonia.¹ B. Evolution of international tourist arrivals in Catalonia by country of origin.² C. Evolution of international overnight stays in Catalonia by country of origin.³ Source: Compiled by the author.

¹ The black line represents the year-on-year growth rates of the seasonally adjusted series of tourist arrivals and overnight stays. The dotted line represents the year-on-year growth rates of the trend-cycle component.

² The black line represents the year-on-year growth rates of the seasonally adjusted series of tourists who come to Catalonia from each visitor country. The dotted line represents the year-on-year growth rates of the trend-cycle component.

³ The black line represents the year-on-year growth rates of the seasonally adjusted series of international overnight stays in Catalonia by country of origin. The dotted line represents the year-on-year growth rates of the trend-cycle component.
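The year-on-year growth rates plotted in Fig. 1 can be illustrated with a minimal sketch. It assumes the rates are percentage changes relative to the same month of the previous year; the function name `year_on_year_growth` and the input series are hypothetical and are not taken from the paper.

```python
def year_on_year_growth(series, period=12):
    """Percentage change of each observation relative to the value
    `period` steps earlier (12 for monthly data)."""
    if period <= 0:
        raise ValueError("period must be positive")
    return [
        100.0 * (series[t] - series[t - period]) / series[t - period]
        for t in range(period, len(series))
    ]


# Hypothetical monthly levels: the second year is 10% above the first.
first_year = [100.0 + 5.0 * month for month in range(12)]
second_year = [1.10 * value for value in first_year]
growth = year_on_year_growth(first_year + second_year)
```

With monthly data starting in January 2001, the first growth rate that can be computed this way refers to January 2002, which is consistent with the 2002 to 2009 range shown on the horizontal axes.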


architecture (determining the number of input nodes, hidden layers, hidden nodes and output nodes), the activation function, the training algorithm, the training sample and the test sample, and the performance measures. In this work we used the MLP specification suggested by Kuan and White (1994):

\[
x_t = f\Big(\beta_0 + \sum_{j=1}^{q} \beta_j\, g\big(x_{t-1}\,\varphi_{ij} + \varphi_{0j}\big)\Big) \tag{4}
\]

\[
\{\varphi_{ij};\ i = 0, 1, \dots, p;\ j = 1, \dots, q\}, \qquad \{\beta_j;\ j = 0, 1, \dots, q\}
\]

where $f$ is the output function; $g$ is the activation function; $p$ is the number of inputs; $q$ is the number of neurons in the hidden layer; $x_t$ is the output; $x_{t-1}$ is the input; $\beta_j$ are the weights connecting the output with the hidden layer and $\varphi_{ij}$ are the weights connecting the input with the hidden layer. We chose an MLP(1;3) architecture that allows us to represent the possible non-linear relationship between $x_t$ and $x_{t-1}$. See Choudhary and Haider (2012) and Nakamura (2006) for other specifications.

Following Bishop (1995) and Ripley (1996), we divided the collected data into three sets: training, validation and test sets. This division seeks to improve the performance of the network with new cases. To achieve a more reliable and accurate result, a four-year period served as the training set. Based on these considerations, the first fifty observations were selected as the training set, the next twenty-one as the validation set and the last 10% as the testing set. These models were implemented using MATLAB™ and its Neural Networks module. Inputs were normalised in order to facilitate the learning process. We used Levenberg–Marquardt backpropagation to calculate the weights in each iteration based on the minimisation of the mean squared error.

3. Data

Monthly data of tourist arrivals and overnight stays from foreign countries to Catalonia over the period 2001 to 2009 were provided by the Direcció General de Turisme de Catalunya and the Statistical Institute of Catalonia (IDESCAT). As can be seen in Fig. 1A to C, the monthly series of both tourist arrivals and overnight stays show a marked seasonality. In order to eliminate both linear trends and seasonality, we obtained the trend-cycle component of the series using Seats/Tramo and used year-on-year growth rates. Table 1 shows a descriptive analysis of annual growth in tourism demand between January 2002 and December 2009.

During this period, Italy and the Northern countries experienced the highest growth in both tourist arrivals and overnight stays. Russia is the country that presents the highest dispersion in growth rates for both tourist arrivals and overnight stays.

Additionally, we computed some of the most commonly used tests of the unit root hypothesis: the augmented Dickey–Fuller (ADF) test, the Phillips–Perron (PP) test, and the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test. While the ADF and the PP statistics test the null hypothesis of a unit root in $x_t$ (Table 2) and in the first-differenced values of $x_t$ (Table 3), the KPSS statistic tests the null hypothesis of stationarity in both $x_t$ and $\Delta x_t$. As can be seen in Table 2, in most countries we cannot reject the null hypothesis of a unit root at the 5% level. Similar results are obtained for the KPSS test, where the null hypothesis of stationarity is rejected in most cases. When the tests were applied to the first difference of the individual time series (Table 3), the null of nonstationarity is strongly rejected in most cases. In the case of the KPSS test, we cannot reject the null hypothesis of stationarity at the 5% level in any country. These results imply that differencing is required in most cases⁴ and confirm the importance of deseasonalising and detrending tourism demand data before modelling and forecasting. Although we have accounted for the presence of trends and seasonality, tourism data may sometimes require further pre-processing (Zhang and Qi, 2005).

4. Results

In this section we evaluate the forecasting performance of artificial neural network (ANN) models relative to different time series models (ARIMA and SETAR models) at a regional level. We used pre-processed official statistical data of overnight stays and tourist arrivals from all the different countries of origin to Catalonia from 2001 to 2009. All models were estimated from January 2001 to January 2008 and forecasts for 1, 2, 3, 6 and 12 months ahead were computed. The model specifications were based on information up to that date, and the models were then re-estimated each month to compute the forecasts. Given the availability of actual values until December 2009, forecast errors were computed in a recursive way (i.e., for the 1 month forecast horizon, 12 forecast errors were computed).

To summarise this information, we calculated the root mean squared forecast error (RMSFE) to rank the different methods according to their values. The RMSFE is especially useful when working with growth rates. Additionally, due to the squaring process, the RMSFE is more sensitive than other forecasting accuracy measures to occasional large errors. Finally, in order to check whether the reduction in RMSFE was statistically significant, the Diebold–Mariano loss-differential test for predictive accuracy was performed.

For ANN models we divided the collected data into three sets (training, validation and test sets) in order to improve the performance of the network with new cases. To achieve a more reliable and accurate result, a four-year period served as the training set. Based on these considerations, to compare ANN forecasts to those of time series models, the first fifty observations were selected as the training set, the next twenty-one as the validation set and the last 10% of the data as the testing set. The results of the forecasting competition are shown in Tables 4 and 5.
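To make the structure of the ANN used in this comparison concrete, the following is a minimal sketch of the forward pass of an MLP with one input and three hidden neurons, following the form of Eq. (4). The paper does not state the activation and output functions, so the hyperbolic tangent for g and the identity for f are assumptions, as are the function name `mlp_1_3` and all weight values. Training (Levenberg–Marquardt in the paper) is not shown.

```python
import math


def mlp_1_3(x_prev, phi0, phi1, beta0, beta, g=math.tanh, f=lambda z: z):
    """Output of an MLP with a single input x_{t-1} and len(beta) hidden
    neurons: f(beta0 + sum_j beta_j * g(phi1_j * x_prev + phi0_j)).

    g (tanh) and f (identity) are assumed defaults, not the paper's choice.
    """
    if not (len(phi0) == len(phi1) == len(beta)):
        raise ValueError("phi0, phi1 and beta must have the same length")
    hidden = [g(p1 * x_prev + p0) for p0, p1 in zip(phi0, phi1)]
    return f(beta0 + sum(b * h for b, h in zip(beta, hidden)))


# Hypothetical weights for three hidden neurons (illustration only).
phi0 = [0.0, 0.0, 0.0]   # hidden-layer biases
phi1 = [1.0, 1.0, 1.0]   # input-to-hidden weights
beta = [1.0, 1.0, 1.0]   # hidden-to-output weights
beta0 = 0.5              # output bias

forecast_at_zero = mlp_1_3(0.0, phi0, phi1, beta0, beta)
forecast_at_one = mlp_1_3(1.0, phi0, phi1, beta0, beta)
```

Because the only input is the previous growth rate, a network of this shape carries a single lag of memory, which is the limitation discussed later when the results are interpreted.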
These tables present the values of the root mean squared forecast error (RMSFE) obtained from recursive forecasts for 1, 2, 3, 6 and 12 months during the year 2009. Table 4 shows RMSFE values for each country for the number of tourist arrivals, while Table 5 shows RMSFE values for each country for the number of overnight stays.

When analysing the forecast accuracy for tourist arrivals, ARIMA models show lower RMSFE values than ANN models, especially for shorter horizons. For overnight stays, ARIMA models do not always converge. In spite of showing the lowest RMSFE values for tourist arrivals for the United Kingdom and Germany, SETAR models display the highest RMSFE values. In four countries, the lowest RMSFE values are found for the longest horizon. For one month ahead, Switzerland displays the lowest RMSFE values for both tourist arrivals and overnight stays, as opposed to the United States and Japan, which show the highest RMSFE values.

While the out-of-sample forecast performance of ANN models relative to time series models differs between tourist arrivals (ARIMA models clearly outperform SETAR and ANN models) and overnight stays (ANN models display lower RMSFE than ARIMA and SETAR models in most cases, especially for longer horizons), the key issue is testing which model shows significantly lower forecasting errors. As the lowest RMSFE values are usually obtained for the shortest forecasting horizon, we calculated the measure of predictive accuracy proposed by Diebold and Mariano (1995) between each pair of models for one-month-ahead forecasts. In Table 6 we present the results of the Diebold–Mariano test, while in Table 7 we indicate which model shows significantly lower forecasting errors for each country of origin.

4 An algorithm to determine the necessary number of differences has been implemented for model selection.
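The RMSFE used to rank the models can be sketched as follows. The forecast errors below are hypothetical and only illustrate why, because of the squaring, the RMSFE reacts more strongly to an occasional large error than the mean absolute error does; the function names are illustrative and not from the paper.

```python
import math


def rmsfe(actual, forecast):
    """Root mean squared forecast error of paired actual/forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def mean_absolute_error(actual, forecast):
    """Mean absolute forecast error, shown only for comparison."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Two hypothetical sets of four forecasts with the same total absolute error:
actual = [0.0, 0.0, 0.0, 0.0]
evenly_spread = [0.5, 0.5, 0.5, 0.5]   # four errors of 0.5
one_large = [0.0, 0.0, 0.0, 2.0]       # a single error of 2.0

rmsfe_even = rmsfe(actual, evenly_spread)
rmsfe_large = rmsfe(actual, one_large)
mae_even = mean_absolute_error(actual, evenly_spread)
mae_large = mean_absolute_error(actual, one_large)
```

Both error patterns have the same mean absolute error, but the pattern with a single large miss has twice the RMSFE.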


Table 1
Descriptive analysis of the year-on-year rates of the trend-cycle series.

                      Tourist arrivals                  Overnight stays
Country               Mean     SD      Skew.   Kurt.    Mean     SD      Skew.   Kurt.
France                6.54     15.67   1.26    4.19     4.37     14.20   1.34    4.64
United Kingdom        4.55     15.07   0.16    2.50     3.34     21.91   0.65    3.19
Belgium and NL        2.12     9.12    1.00    3.84     −1.45    2.33    −0.13   1.38
Germany               0.62     7.87    −0.09   3.28     1.92     4.92    0.12    2.26
Italy                 7.85     14.01   −0.09   1.71     11.28    23.99   2.05    5.64
US and Japan          4.63     10.61   0.52    4.42     5.90     28.72   0.25    1.62
Northern countries    9.17     16.57   0.03    2.50     9.31     20.19   −0.27   1.89
Switzerland           −0.19    1.41    −1.10   2.85     −3.05    4.32    0.06    1.56
Russia                7.87     23.12   −2.23   6.04     1.76     36.06   −0.65   3.42
Other countries       6.47     9.83    0.06    2.41     7.86     14.98   1.33    4.37
Total                 4.26     8.14    −0.67   2.59     2.94     7.67    1.87    5.20

Note: SD: standard deviation; Skew.: skewness; Kurt.: kurtosis.

Table 2
Unit root tests in xt: test for I(0).

                      Tourist arrivals           Overnight stays
Country               ADF      PP      KPSS      ADF      PP      KPSS
France                −2.56    −2.56   0.75      −0.72    −2.47   0.74
United Kingdom        0.00     −0.64   1.20      −1.48    −2.42   0.44
Belgium and NL        −1.87    −1.88   0.75      −1.29    −0.85   1.12
Germany               −0.36    −1.01   0.78      −2.21    −2.09   0.26
Italy                 −1.28    −0.58   0.60      −2.12    −2.15   0.43
US and Japan          −4.10    −3.16   0.28      −1.47    −1.53   0.55
Northern countries    −0.04    −1.88   0.26      −1.39    −2.07   0.24
Switzerland           −1.65    −0.18   0.92      −1.35    −0.86   0.81
Russia                −2.13    −2.23   0.38      −2.91    −2.91   0.10
Other countries       −0.94    −1.36   0.46      −1.74    −1.94   0.37
Total                 −0.87    −1.01   0.92      −1.96    −2.02   0.55

Note: Estimation period 2002–2009. Tests for unit roots: ADF: augmented Dickey and Fuller (1979) test, the 5% critical value is −2.90; PP: Phillips and Perron (1988) test, the 5% critical value is −2.89. Test of stationarity: KPSS: Kwiatkowski et al. (1992) test, the 5% critical value is 0.46.
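The way the statistics in Table 2 are read against the critical values in the note can be sketched as a simple decision rule. This is only an illustration: the ADF and PP tests reject a unit root when the statistic falls below the critical value, whereas the KPSS test rejects stationarity when the statistic exceeds it. The function name `classify` is hypothetical; the input values are the tourist-arrivals entries for France and for the US and Japan from Table 2.

```python
# 5% critical values quoted in the note to Table 2.
ADF_CRITICAL = -2.90
PP_CRITICAL = -2.89
KPSS_CRITICAL = 0.46


def classify(adf, pp, kpss):
    """Compare each statistic with its 5% critical value.

    ADF/PP: the null is a unit root, rejected for statistics below the
    critical value. KPSS: the null is stationarity, rejected for
    statistics above the critical value.
    """
    return {
        "adf_rejects_unit_root": adf < ADF_CRITICAL,
        "pp_rejects_unit_root": pp < PP_CRITICAL,
        "kpss_rejects_stationarity": kpss > KPSS_CRITICAL,
    }


# Tourist arrivals, levels (values from Table 2).
france = classify(adf=-2.56, pp=-2.56, kpss=0.75)
us_japan = classify(adf=-4.10, pp=-3.16, kpss=0.28)
```

For France all three tests point to non-stationarity in levels, while the US and Japan series is one of the exceptions in which the unit root is rejected and stationarity is not.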

As shown in Tables 6 and 7, for tourist arrivals ARIMA models outperform SETAR and ANN models in most countries. This difference is statistically significant in 40% and 70% of the cases respectively. SETAR models show significantly lower forecasting errors than ANN models in six countries, while ANN models only in two. For overnight stays, ARIMA models outperform SETAR and ANN models in all countries except France, and in almost all cases the difference is statistically significant. SETAR models show significantly lower forecasting errors than ANN models in four countries, while ANN models only in one.
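As a rough cross-check of the 40% and 70% shares quoted above for tourist arrivals, the one-month Diebold–Mariano statistics reported in Table 6 can be counted directly. The sketch assumes that significance at the 5% level corresponds to a statistic below −1.96 (a two-sided normal critical value, with a negative sign favouring ARIMA) and that the shares refer to the ten countries of origin, excluding the Total row, for which no ARIMA statistic is reported. The function name is hypothetical.

```python
# One-month Diebold-Mariano statistics for tourist arrivals (Table 6),
# in the order: France, United Kingdom, Belgium and the Netherlands,
# Germany, Italy, US and Japan, Northern countries, Switzerland, Russia,
# Other countries.
arima_vs_setar = [-0.36, -0.24, -4.38, 1.27, -2.92,
                  -1.34, -1.33, -2.71, -17.77, -1.64]
arima_vs_ann = [-16.49, -3.83, -2.24, 0.97, -8.15,
                -0.32, 0.76, -26.32, -48.55, -3.71]

CRITICAL = 1.96  # assumed two-sided 5% normal critical value


def share_first_model_better(stats, critical=CRITICAL):
    """Share of cases in which the first model has significantly lower
    forecasting errors (statistic below -critical)."""
    return sum(1 for s in stats if s < -critical) / len(stats)


share_vs_setar = share_first_model_better(arima_vs_setar)
share_vs_ann = share_first_model_better(arima_vs_ann)
```

Under these assumptions the counts are 4 of 10 and 7 of 10, in line with the shares stated in the text.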

In summary, the comparison of the out-of-sample forecast performance of artificial neural network models relative to time series models for inbound tourism demand in Catalonia permits us to conclude that ARIMA models show significantly lower RMSFE values than ANN and SETAR models in most cases, therefore showing the best forecasting ability. In a recent work for Taiwan, Lin et al. (2011) also found that ARIMA models outperformed ANN models. Nevertheless, these results contrast with those obtained by Cho (2003), Law (2000) and Law and Au (1999), who found evidence in favour of ANN models when compared to ARIMA models.

Table 3
Unit root tests in Δxt: test for I(1).

                      Tourist arrivals           Overnight stays
Country               ADF      PP      KPSS      ADF      PP      KPSS
France                −7.13    −9.56   0.19      −3.57    −3.05   0.17
United Kingdom        −3.22    −4.00   0.07      −4.72    −3.31   0.05
Belgium and NL        −6.23    −9.81   0.05      −1.01    −2.37   0.18
Germany               −2.81    −2.68   0.18      −4.68    −2.89   0.06
Italy                 −1.18    −2.16   0.27      −9.75    −9.75   0.06
US and Japan          −8.53    −7.01   0.09      −9.55    −9.55   0.08
Northern countries    −5.52    −2.94   0.18      −4.57    −2.70   0.23
Switzerland           −1.78    −1.76   0.22      −2.13    −2.27   0.29
Russia                −6.42    −9.53   0.06      −7.15    −7.27   0.05
Other countries       −2.84    −2.77   0.10      −2.56    −2.77   0.09
Total                 −8.64    −8.69   0.08      −9.71    −9.71   0.06

Note: Estimation period 2002–2009. Tests for unit roots: ADF: augmented Dickey and Fuller (1979) test, the 5% critical value is −2.90; PP: Phillips and Perron (1988) test, the 5% critical value is −2.89. Test of stationarity: KPSS: Kwiatkowski et al. (1992) test, the 5% critical value is 0.46.


Table 4 Average RMSFE (2009) — tourist arrivals.

Table 5 Average RMSFE (2009) — overnight stays.

Tourist arrivals

France ARIMA SETAR ANN

2 months

3 months

6 months

12 months

1.19 1.25 4.86

1.80 1.85 4.68

2.41 2.53 5.07

5.88 5.01 5.38

7.69 12.46 6.43

3.43 4.28 10.66

5.48 6.90 13.05

11.12 10.17 4.78

1.83 2.56 4.44

United Kingdom ARIMA 2.67 SETAR 2.56 ANN 10.95 Belgium ARIMA SETAR ANN

Overnight stays

1 month

and Netherlands 0.40 1.21 1.23 1.92 4.22 6.21

1 month

2 months

3 months

6 months

12 months

France ARIMA 1.07 SETAR 1.49 ANN 4.98

1.92 2.47 3.31

3.33 4.13 7.89

8.35 9.69 4.55

14.34 22.55 4.17

27.47 20.30 17.52

United Kingdom ARIMA – SETAR 8.09 ANN 10.22

– 15.15 8.25

– 20.44 9.33

– 38.31 9.51

– 21.32 4.59

4.40 4.36 7.18

6.02 9.16 21.42

Belgium ARIMA SETAR ANN

0.28 0.46 0.25

1.46 1.15 0.98

0.91 1.88 0.43

and Netherlands 0.10 0.19 0.15 0.27 0.43 0.14

Germany ARIMA 1.20 SETAR 0.89 ANN 0.98

1.21 0.83 3.56

1.15 0.85 6.80

4.84 1.56 7.86

0.77 0.73 11.43

Germany ARIMA 0.14 SETAR 1.04 ANN 1.03

0.32 1.20 1.04

0.62 1.40 0.95

1.95 2.81 0.74

2.55 6.87 0.01a

Italy ARIMA SETAR ANN

0.82 1.14 11.96

0.91 1.10 7.05

4.58 0.94 6.11

2.63 2.11 2.92

Italy ARIMA 0.34 SETAR 4.70 ANN 2.14

0.47 5.97 1.10

0.61 6.52 0.28

2.23 7.50 12.27

1.30 15.28 0.24

United States and Japan ARIMA 14.93 20.65 SETAR 23.17 26.58 ANN 15.11 15.53

21.38 23.84 14.50

26.17 38.23 13.90

45.80 28.19 16.83

United States and Japan ARIMA – – SETAR 24.92 28.19 ANN 17.02 28.07

– 31.82 13.07

– 44.51 14.71

– 69.91 25.85

Northern countries ARIMA 2.36 SETAR 3.06 ANN 1.97

4.09 2.61 2.77

5.92 4.33 5.49

8.99 19.39 6.31

17.21 43.80 2.80

Northern countries ARIMA – SETAR 3.71 ANN 1.89

– 3.54 1.53

– 3.50 2.91

– 16.03 3.22

– 50.12 3.12

Switzerland ARIMA 0.06a SETAR 0.17 ANN 2.55

0.13 0.21 2.68

0.20 0.34 2.68

1.43 1.15 3.18

1.39 5.22 1.68

Switzerland ARIMA 0.06 SETAR 0.19 ANN 0.59

0.17 0.24 2.13

0.37 0.36 0.70

1.31 0.74 4.18

1.57 3.70 0.23

Russia ARIMA SETAR ANN

0.85 5.57 11.09

1.15 5.65 5.95

1.47 5.72 16.48

4.63 5.78 11.65

1.68 4.69 8.04

Russia ARIMA – SETAR 12.64 ANN 7.92

– 19.23 8.33

– 24.27 9.12

– 27.08 8.39

– 44.77 10.65

Other countries ARIMA 0.64 SETAR 1.44 ANN 4.19

1.12 1.63 1.74

2.06 1.98 4.61

7.56 2.82 1.68

9.59 0.39 1.94

Other countries ARIMA – SETAR 3.79 ANN 4.12

– 4.24 3.21

– 4.82 3.16

– 7.04 3.87

– 6.90 1.80

Total ARIMA SETAR ANN

– 3.85 7.60

– 5.48 4.85

– 9.77 4.54

– 1.86 6.55

Total ARIMA – SETAR 1.61 ANN 1.82

– 2.16 2.72

– 2.32 2.10

– 2.32 0.34

– 4.38 0.74

0.80 1.16 3.80

– 2.00 8.27

Italics: best model for each country. (–) Matrix singular or not positive definite. a Best model.

Italics: best model for each country. (–) Matrix singular or not positive definite. a Best model.

The lack of consensus may arise from different sources. The first is related to the structure of the network. In this study we have used an MLP(1;3) specification in order to represent the possible non-linear relationship between each two consecutive growth rates, without incorporating any additional memory values. This structure only introduces one lag when running the model. Therefore, it has to be taken into account that ANN models could be improved through structure optimisation.

The fact that ARIMA models outperformed ANN models for forecasting tourism demand is also related to the linearity of the filtered data set. Because tourism demand data is characterised by strong seasonal patterns and high levels of volatility, the original series requires some pre-processing before it can be used in the models. While eliminating the existing outliers and smoothing the original series, this filtering process ends up conditioning the forecasting performance of the models due to an information loss. Our results suggest that there is a trade-off between the degree of pre-processing and the accuracy of the forecasts obtained with neural networks, which are especially suited to deal with nonlinear data.

5. Conclusion

The fact that tourism has become one of the most rapidly growing global industries has led to the requirement of more accurate forecasts of tourism demand at the destination level. This, in turn, has caused an increasing interest in more advanced forecasting approaches such as artificial intelligence techniques. Both factors have led us to evaluate the performance of neural networks relative to that of time series models. We focused on inbound tourism demand to Catalonia, which is one of the main tourist destinations in Europe.

The main objective of the paper was to analyse the possibility of improving the accuracy of tourism demand forecasts for Catalonia using

O. Claveria, S. Torra / Economic Modelling 36 (2014) 220–228


Table 6
Diebold–Mariano loss-differential test statistic for predictive accuracy (1 month).

| Country | Tourist arrivals: ARIMA vs. SETAR | Tourist arrivals: ARIMA vs. ANN | Tourist arrivals: SETAR vs. ANN | Overnight stays: ARIMA vs. SETAR | Overnight stays: ARIMA vs. ANN | Overnight stays: SETAR vs. ANN |
| --- | --- | --- | --- | --- | --- | --- |
| France | −0.36 | −16.49a | −14.81a | 0.02 | −4.73a | −3.04a |
| United Kingdom | −0.24 | −3.83a | −2.75a | – | – | −1.71 |
| Belgium and the Netherlands | −4.38a | −2.24a | −1.81 | −1.09 | −9.26a | −5.91a |
| Germany | 1.27 | 0.97 | −0.32 | −3.59a | −4.13a | −0.22 |
| Italy | −2.92a | −8.15a | −6.96a | −18.52a | −15.57a | 9.24a |
| US and Japan | −1.34 | −0.32 | 1.25 | – | – | 1.07 |
| Northern countries | −1.33 | 0.76 | 2.75a | – | – | 1.92 |
| Switzerland | −2.71a | −26.32a | −19.17a | −4.07a | −4.75a | −3.01a |
| Russia | −17.77a | −48.55a | −21.00a | – | – | 1.97 |
| Other countries | −1.64 | −3.71a | −2.25a | – | – | −0.02 |
| Total | – | – | 6.89a | – | – | −3.01a |

Note: Diebold–Mariano test statistic with NW estimator. Null hypothesis: the difference between the two competing series is non-significant. A negative sign of the statistic implies that the second model has bigger forecasting errors.
a Significant at the 5% level.
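To make the reading of Table 6 concrete, the following is a minimal sketch of how a Diebold–Mariano statistic with a Newey–West variance estimate could be computed. It is an illustration and not the authors' code: the function name `diebold_mariano`, the squared-error loss, the Bartlett kernel with bandwidth h − 1, and the normal approximation for the p-value are all assumptions, since the paper does not report these details. The loss differential is defined as the loss of the first model minus the loss of the second, so that a negative statistic means the second model has the larger errors, as stated in the note to the table.

```python
import math

import numpy as np


def diebold_mariano(e1, e2, h=1, power=2):
    """Return (statistic, two-sided p-value) for two forecast-error series.

    e1, e2 : forecast errors of the first and second model (same length).
    h      : forecast horizon; h - 1 autocovariances enter the variance
             (assumed bandwidth, not taken from the paper).
    power  : exponent of the loss function (2 = squared error, assumed).
    """
    e1 = np.asarray(e1, dtype=float)
    e2 = np.asarray(e2, dtype=float)
    if e1.ndim != 1 or e1.shape != e2.shape:
        raise ValueError("e1 and e2 must be 1-D arrays of equal length")

    # Loss differential: negative values favour the first model.
    d = np.abs(e1) ** power - np.abs(e2) ** power
    n = d.size
    d_bar = d.mean()
    dc = d - d_bar

    # Newey-West long-run variance with Bartlett weights 1 - k/h.
    lrv = dc @ dc / n
    for k in range(1, h):
        gamma_k = dc[k:] @ dc[:-k] / n
        lrv += 2.0 * (1.0 - k / h) * gamma_k
    if lrv <= 0.0:
        raise ValueError("long-run variance estimate is not positive")

    stat = d_bar / math.sqrt(lrv / n)
    # Two-sided p-value under the standard normal approximation.
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value
```

Calling `diebold_mariano(errors_arima, errors_ann)` on two hypothetical error series would return a negative statistic when the ANN errors are larger on average, which is the pattern found for most countries in the ARIMA vs. ANN columns.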

Table 7 Models with significantly lower forecasting errors between each two competing models (1 month). Country

France United Kingdom Belgium and the Netherlands Germany Italy US and Japan Northern countries Switzerland Russia Other countries Total

Tourist arrivals

Overnight stays

ARIMA vs. SETAR

ARIMA vs. ANN

SETAR vs. ANN SETAR SETAR

ARIMA

ARIMA ARIMA ARIMA

ARIMA

ARIMA

SETAR

ARIMA ARIMA

ARIMA ARIMA ARIMA –

ANN SETAR SETAR SETAR ANN



ARIMA vs. SETAR

ARIMA vs. ANN

SETAR vs. ANN

ARIMA – ARIMA ARIMA ARIMA – – ARIMA – – –

SETAR

– ARIMA ARIMA ARIMA – – ARIMA – – –

SETAR ANN

SETAR

SETAR

Note: Empty spaces imply that neither model has significantly lower forecasting errors at the 5% level. (–) Matrix singular or not positive definite.

neural modelling, extending the results of previous research in other fields. We evaluated the forecasting performance of an artificial neural network (ANN) approach relative to two time series models: autoregressive integrated moving average (ARIMA) models and self-exciting threshold autoregressions (SETAR). We compared different time horizons and used tourist arrivals and overnight stays from all the different countries of origin to Catalonia as proxy measures of tourism demand, obtaining more accurate forecasts for tourist arrivals than for overnight stays.

Although it is generally believed that nonlinear methods outperform linear methods in modelling economic behaviour, when comparing the forecasting accuracy of the different techniques, ARIMA models outperformed SETAR and ANN models, especially for shorter horizons. In spite of the significant differences between countries, these results are related to the required pre-processing of the original data set. Pre-processing accounts for the presence of seasonality and eliminates the existing outliers, but beyond a certain point the information loss caused by the filtering process lowers the accuracy of neural network forecasts compared to those of linear models, as neural networks are better suited to handling nonlinear behaviour.

It also has to be taken into account that ANN models can be improved through structure optimization, incorporating additional memory values. Therefore, a challenging question to be considered in further research is whether the implementation of optimised neural networks and of recent advances in dynamic networks may improve tourism demand forecasting.
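The difference between the MLP (1;3) structure used in this study and a network with additional memory values can be illustrated with a short sketch. It assumes that (1;3) denotes one lagged input and three hidden neurons; the tanh activation, the random initial weights, and the helper names `make_lagged`, `init_mlp`, `mlp_forecast` and `n_parameters` are hypothetical choices made for illustration, not details taken from the paper. No training step is shown; the sketch only makes explicit how the number of lags changes the inputs and the parameter count.

```python
import numpy as np


def make_lagged(series, p):
    """Build inputs with p lags: row t of X holds the p previous values of y[t]."""
    x = np.asarray(series, dtype=float)
    if p < 1 or x.size <= p:
        raise ValueError("need 1 <= p < len(series)")
    # Column j holds the series lagged by j + 1 periods.
    X = np.column_stack([x[p - j - 1 : x.size - j - 1] for j in range(p)])
    y = x[p:]
    return X, y


def init_mlp(p, q, seed=0):
    """Random starting weights for a network with p inputs and q hidden neurons."""
    rng = np.random.default_rng(seed)
    return {
        "W_hidden": rng.normal(scale=0.5, size=(p, q)),
        "b_hidden": np.zeros(q),
        "w_out": rng.normal(scale=0.5, size=q),
        "b_out": 0.0,
    }


def mlp_forecast(X, W_hidden, b_hidden, w_out, b_out):
    """One-step-ahead output: tanh hidden layer (assumed) and linear output."""
    hidden = np.tanh(X @ W_hidden + b_hidden)
    return hidden @ w_out + b_out


def n_parameters(p, q):
    """Weights and biases of a single-hidden-layer network with p inputs."""
    return p * q + q + q + 1
```

Under these assumptions, `make_lagged(growth_rates, 1)` with `init_mlp(1, 3)` corresponds to the one-lag structure described in the text, whereas a larger `p` adds memory values at the cost of `q` extra weights per lag.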

Acknowledgements

We wish to thank Núria Caballé at the Observatori de Turisme de Catalunya for providing us the data used in the study. We also wish to thank the anonymous reviewers and the editor for their helpful comments and suggestions.

References

Adya, M., Collopy, F., 1998. How effective are neural networks at forecasting and prediction? A review and evaluation. J. Forecast. 17, 481–495.
Akal, M., 2004. Forecasting Turkey's tourism revenues by ARMAX model. Tour. Manag. 26, 359–365.
Algieri, B., 2006. An econometric estimation of the demand for tourism: the case of Russia. Tour. Econ. 12, 5–20.
Bishop, C.M., 1995. Neural Networks for Pattern Recognition. Oxford University Press, Oxford.
Box, G., Cox, D., 1964. An analysis of transformation. J. R. Stat. Soc. Ser. B 211–264.
Box, G.E.P., Jenkins, G.M., 1970. Time Series Analysis: Forecasting and Control. Holden Day, San Francisco.
Cho, V., 2003. A comparison of three different approaches to tourist arrival forecasting. Tour. Manag. 24, 323–330.
Choudhary, A., Haider, A., 2012. Neural network models for inflation forecasting: an appraisal. Appl. Econ. 44, 2631–2635.
Chu, F.L., 2009. Forecasting tourism demand with ARMA-based methods. Tour. Manag. 30, 740–751.
Claveria, O., Datzira, J., 2010. Forecasting tourism demand using consumer expectations. Tour. Rev. 65, 18–36.
Coshall, J.T., Charlesworth, R., 2010. A management orientated approach to combination forecasting of tourism demand. Tour. Manag. 32, 759–769.
Dickey, D.A., Fuller, W.A., 1979. Distribution of the estimators for autoregressive time series with a unit root. J. Am. Stat. Assoc. 74, 427–431.
Diebold, F.X., Mariano, R., 1995. Comparing predictive accuracy. J. Bus. Econ. Stat. 13, 253–263.
Dritsakis, N., 2004. Cointegration analysis of German and British tourism demand for Greece. Tour. Manag. 25, 111–119.
Estrella, A., Mishkin, F.S., 1998. Predicting U.S. recessions: financial variables as leading indicators. Rev. Econ. Stat. 80, 45–61.
Gary, L., Cànoves, G., 2011. Life cycles, stages and tourism history. The Catalonia (Spain) experience. Ann. Tour. Res. 38, 651–671.
Goh, C., Law, R., 2002. Modelling and forecasting tourism demand for arrivals with stochastic nonstationarity seasonality and intervention. Tour. Manag. 23, 499–510.
Han, Z., Dubarry, R., Sinclair, M.T., 2006. Modelling US tourism demand for European destinations. Tour. Manag. 27, 1–10.


Ivars, J.A., 2004. Tourism planning in Spain. Evolution and perspectives. Ann. Tour. Res. 31, 313–333.
Kock, A.B., Teräsvirta, T., 2011. Forecasting with nonlinear time series models. In: Clements, M.P., Hendry, D.F. (Eds.), Oxford Handbook of Economic Forecasting. Oxford University Press, Oxford, pp. 61–87.
Kon, S.C., Turner, W.L., 2005. Neural network forecasting of tourism demand. Tour. Econ. 11, 301–328.
Kuan, C., White, H., 1994. Artificial neural networks: an econometric perspective. Econ. Rev. 13, 1–91.
Kwiatkowski, D., Phillips, P.C.B., Schmidt, P., Shin, Y., 1992. Testing the null hypothesis of stationarity against the alternative of a unit root. J. Econ. 54, 159–178.
Law, R., 2000. Back-propagation learning in improving the accuracy of neural network-based tourism demand forecasting. Tour. Manag. 21, 331–340.
Law, R., 2001. The impact of the Asian financial crisis on Japanese demand for travel to Hong Kong: a study of various forecasting techniques. J. Travel Tour. Mark. 10, 47–66.
Law, R., Au, N., 1999. A neural network model to forecast Japanese demand for travel to Hong Kong. Tour. Manag. 20, 89–97.
Li, G., Song, H., Witt, S.F., 2006a. Time varying parameter and fixed parameter linear AIDS: an application to tourism demand forecasting. Int. J. Forecast. 22, 57–71.
Li, G., Wong, K.F., Song, H., Witt, S.F., 2006b. Tourism demand forecasting: a time varying parameter error correction model. J. Travel Res. 45, 175–185.
Lin, C.J., Chen, H.F., Lee, T.S., 2011. Forecasting tourism demand using time series, artificial neural networks and multivariate adaptive regression splines: evidence from Taiwan. Int. J. Bus. Admin. 2, 14–24.
Nakamura, K., 2006. Neural representation of information measure in the primate premotor cortex. J. Neurophysiol. 96, 478–485.

Palmer, A., Montaño, J.J., Sesé, A., 2006. Designing an artificial neural network for forecasting tourism time-series. Tour. Manag. 27, 781–790.
Phillips, P.C.B., Perron, P., 1988. Testing for a unit root in time series regression. Biometrika 75, 335–346.
Ripley, B.D., 1996. Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge.
Song, H., Li, G., 2008. Tourism demand modelling and forecasting: a review of recent research. Tour. Manag. 29, 203–220.
Song, H., Witt, S.F., 2006. Forecasting international tourist flows to Macau. Tour. Manag. 27, 214–224.
Song, H., Wong, K.F., 2003. Tourism demand modelling: a time-varying parameter approach. J. Travel Res. 42, 57–64.
Swanson, N.R., White, H., 1997. Forecasting economic time series using flexible versus fixed specification and linear versus nonlinear econometric models. Int. J. Forecast. 13, 439–461.
Tsaur, S.H., Chiu, Y.C., Huang, C.H., 2002. Determinants of guest loyalty to international tourist hotels: a neural network approach. Tour. Manag. 23, 397–405.
Turner, L.W., Witt, S.F., 2001. Forecasting tourism using univariate and multivariate structural time series models. Tour. Econ. 7, 135–147.
Witt, S.F., Song, H., Wanhill, S.P., 2004. Forecasting tourism-generated employment: the case of Denmark. Tour. Econ. 10, 167–176.
Zhang, G.P., Qi, M., 2005. Neural network forecasting for seasonal and trend time series. Eur. J. Oper. Res. 160, 501–514.
Zhang, G.P., Patuwo, B.E., Hu, M.Y., 1998. Forecasting with artificial neural networks: the state of the art. Int. J. Forecast. 14, 35–62.