Forecasting air passenger demand with a new hybrid ensemble approach


Journal of Air Transport Management 83 (2020) 101744 Contents lists available at ScienceDirect Journal of Air Transport Management journal homepage:...


Journal of Air Transport Management 83 (2020) 101744

Contents lists available at ScienceDirect

Journal of Air Transport Management journal homepage: http://www.elsevier.com/locate/jairtraman

Feng Jin a, Yongwu Li b, Shaolong Sun c, Hongtao Li a,*

a School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou, 730070, China
b Research Base of Beijing Modern Manufacturing Development, College of Economics and Management, Beijing University of Technology, Beijing 100124, China
c School of Management, Xi’an Jiaotong University, Xi’an 710049, China

ARTICLE INFO

Keywords: Air passenger demand forecasting; Variational mode decomposition; Autoregressive moving average model; Kernel extreme learning machine

ABSTRACT

Analyzing and modeling the dynamics of passenger demand, which has important implications for management and operation across the entire aviation industry, is a difficult challenge, because air passenger demand consistently exhibits complex non-linearity and non-stationarity. To capture this behavior more precisely, this paper proposes a hybrid approach, VMD-ARMA/KELM-KELM, for short-term forecasting, which consists of variational mode decomposition (VMD), the autoregressive moving average model (ARMA) and the kernel extreme learning machine (KELM). First, VMD is adopted to decompose the original data into several mode functions so as to reduce their complexity. Then, the unit root test (ADF test) is employed to classify the modes into stationary and non-stationary series. The ARMA and KELM models are used to forecast the stationary and non-stationary components, respectively. Lastly, the final result is produced by another KELM model that integrates the forecasting results of all components. To verify the feasibility and robustness of the proposed approach, the passenger demands of Beijing, Guangzhou and Pudong airports are used to test its performance. The experimental results show that the proposed approach has a clear advantage over the benchmark models in both accuracy and robustness analysis. Therefore, this approach can be used as a convincing tool for air passenger demand forecasting.

* Corresponding author. E-mail address: [email protected] (H. Li).
https://doi.org/10.1016/j.jairtraman.2019.101744
Received 10 May 2019; Received in revised form 29 October 2019; Accepted 29 October 2019; Available online 15 November 2019
0969-6997/© 2019 Elsevier Ltd. All rights reserved.

1. Introduction

Air transport is a complex system that includes aircraft, airports, flight routes and air traffic management systems, and it plays an extremely important role in the transportation industry and in social and economic development. In the airline industry, air passenger demand forecasting is an essential element for airport managers to make timely operation plans (Kim and Shin, 2016). Air transportation requires long-term demand forecasts to develop long-term operational plans, such as the opening of new routes, airport construction and layout, reducing flight costs and adjusting ticket prices. Meanwhile, it is also necessary to pay attention to short-term demand forecasts for the purposes of airport scheduling, quarterly operating plans, and short-term maintenance and inspection plans (Tsui et al., 2014). In view of the huge workload of airport construction and the difficulty of making changes after completion, accurate forecasting of airport demand is extremely important for construction, investment and management (Flyvbjerg et al., 2005). However, forecasting passenger demand is very challenging due to its multiple characteristics, such as irregularity, high volatility and non-stationarity (Xiao et al., 2014). To address this issue, scholars have paid increasing attention to research on air passenger demand forecasting (Sun et al., 2019).

Many kinds of models have been built to forecast passenger demand, and the prediction methods can be classified into three categories: economic models, time series models and artificial intelligence methods (Dantas et al., 2017). The economic methods focus on the correlation between passenger demand and multiple variables, which reflect changes in the economic environment and the traffic system, and the forecasting models are then established via a set of equations. The commonly used economic models include regression analysis (Abed et al., 2001), the causality test (Fernandes and Pacheco, 2010), the logit model (Garrow and Koppelman, 2004) and the gravity model (Grosche et al., 2007), which perform better in a relatively stable application environment (Xiao et al., 2014). The time series methods rely mainly on historical data, predicting by mining the intrinsic relationship between current data and past observations. Various time series models have been used to forecast passenger demand, such as smoothing techniques (Samagaio and Wolters, 2010; Williams et al., 1998), the adapted Markov model (Chin and Tay, 2001), ARIMA/SARIMA (Tsui et al., 2014), gray theory (Hsu and Wen, 2000), the seasonal adjustment method (Aston and Koopman, 2006), and Holt-Winters (Segura and Vercher, 2007). The causal economic model is more suitable for using multiple influencing factors to analyze the long-term or short-term relationship between a specific factor and passenger demand through statistics such as the cointegration test and the Granger causality test (Baker et al., 2015). In addition, although the time series model has obvious advantages in dealing with nonlinear and unstable series, it is susceptible to changes in the internal parameters of the system (Xiao et al., 2014). In fact, the air passenger demand series has been confirmed to be subject to several factors that make it non-stationary in level, for instance, economic growth (GDP), regional resources and revenue (Hakim and Merkert, 2016).

On account of the non-linear characteristics of passenger demand, the economic and time series approaches have been severely criticized for their limited forecasting ability (Tsui et al., 2014). Fortunately, with the development of information technology, academic researchers and business practitioners have explored artificial intelligence forecasting algorithms, which are characterized by self-adaption and non-linearity and have the ability to map arbitrary functions (Zhang et al., 1998). By adjusting the weights online, these methods can approximate an arbitrary nonlinear function to an expected accuracy, and can also capture the inherently complex, dynamic and nonlinear characteristics in the data. Therefore, the artificial intelligence approach usually achieves higher forecasting accuracy than both the econometric models and the time series ones. Research on artificial intelligence models has flourished in many respects, for example multilayer perceptron neural networks (Smith and Demetsky, 1997; Van Arem et al., 1997; Lee et al., 2006), the Kalman filter (Vythoulkas, 1993), time-delay neural networks (Zhang, 2000), radial basis function neural networks (Zheng et al., 2006), the support vector machine for regression (Castro-Neto et al., 2009), the Elman neural network (Hao and Tian, 2019), and the extreme learning machine (ELM) (Li et al., 2018). Currently, such artificial intelligence algorithms have been applied in various forecasting areas, including passenger demand forecasting (Xiao et al., 2014), wind speed forecasting (Zhao et al., 2016) and electricity price forecasting (Wang et al., 2016). They have been identified as valid prediction methods with strong robustness and fault tolerance.

Since most passenger demand forecasting models have their own advantages and disadvantages, none of them can always obtain the desired forecasting results. For instance, the economic models and the time series ones are extremely simple in structure, requiring only endogenous variables without the need for exogenous variables. Their drawbacks are a poor extrapolation effect, a narrow forecasting range and a strong dependence on data, so they are better suited to forecasting linear data than irregular and non-stationary data (Yang et al., 2017). On the other hand, artificial intelligence models can deal with multiple variables and non-linear problems, but the settings of the input-output structure and parameters depend primarily on the experience of the researchers. These limited parts of the non-linear input-output relationship may lead to unsatisfactory prediction accuracy (Li et al., 2014). In addition, other shortcomings of artificial intelligence models are over-fitting, slow learning speed, and a high probability of entrapment in local minima.

In fact, numerous factors, such as economic development, policy adjustments and seasonal cycles, affect the demand of air passengers. Furthermore, with the implementation of the low-altitude opening strategy, uncontrollable and unpredictable emergencies increase the difficulty of accurate passenger demand forecasting. Single econometric approaches and artificial intelligence models cannot meet the requirements of all aviation management participants in terms of error and accuracy. To further enhance the predictive capability of the models, a hybrid approach is an excellent option: it creatively combines forecasting models with different characteristics and is capable of capturing the intrinsic features in the original data and improving the forecasting accuracy. Therefore, many researchers have made efforts to handle the volatility and noise in the data series by adopting different data pre-processing techniques before forecasting. For instance, several decomposition approaches have been incorporated into hybrid models to identify and extract the main characteristics of nonlinear series, including singular spectrum analysis (SSA) (Xiao et al., 2014), ensemble empirical mode decomposition (EEMD) (Shao et al., 2015) and complementary ensemble empirical mode decomposition (CEEMD) (Pei et al., 2017). For example, a hybrid model was developed for short-term passenger demand forecasting, which consists of SSA, an adaptive network-based fuzzy inference system (ANFIS) and an improved particle swarm optimization algorithm (IPSO) (Xiao et al., 2014). The results confirm that the established model is able to satisfactorily estimate changes in future passenger demand. Similarly, Xie et al. proposed hybrid approaches for short-term passenger demand forecasting (Xie et al., 2014). The experimental analysis demonstrates that the proposed hybrid model performs better than other competitive models, and indicates that it is a promising tool for forecasting complex time series with high volatility and nonlinearity. Therefore, it can be concluded that, in previous studies, hybrid models usually outperform single models.

At the same time, common decomposition methods, such as EMD and EEMD, are sensitive to signal noise and sampling. Dragomiretskiy and Zosso proposed the variational mode decomposition method (VMD) in 2014 to address these problems, which is a novel non-recursive time-frequency analysis technique (Dragomiretskiy and Zosso, 2014). The VMD can adaptively decompose the original signal into a series of quasi-orthogonal modes (Wang et al., 2015). So far, the VMD method has been introduced into several prediction fields. For example, Zhu et al. proposed a hybrid forecasting approach incorporating VMD, mode reconstruction (MR) and the optimal combined forecasting model (CFM) to forecast carbon price series (Zhu et al., 2019). The empirical results show that the proposed model is superior to competitive models in several statistical measures and in robustness. Lahmiri presented a new approach for forecasting economic and financial time series that combines VMD with a generalized regression neural network (GRNN) (Lahmiri, 2016). The performance of the model is appraised by comparing the prediction results of the VMD-GRNN model and the EMD-GRNN model, and the analysis illustrates that VMD is an efficient signal analysis technique. In this paper, VMD is utilized as a data pre-processing strategy to obtain the characteristic information of the original series at different scales.

However, real-life data (such as air passenger demand) has a rather complex internal structure, which can hinder artificial intelligence models from capturing the more complex behavior patterns in the original time series. Previous studies have emphasized the significance of data decomposition as a pre-processing technique, and show that decomposition can clearly improve forecasting performance. But these studies may ignore the unique traits of each component, such as the frequency range of the signal and the stationarity of the modes. Based on the analysis of the above literature, this research proposes a novel hybrid approach (VMD-ARMA/KELM-KELM) as follows: (1) The VMD is used to decompose the passenger demand series into several subsequences. (2) By testing the stationarity of each decomposed mode, the autoregressive moving average model is applied to forecast the stationary subseries, and the non-stationary sequences are predicted by the kernel extreme learning machine. (3) The forecasting results of the ARMA model and KELM are integrated into one final result. To illustrate the performance of the proposed approach, the air passenger demand of Beijing and Guangzhou airports is forecast and compared with other competitive approaches. Meanwhile, the study also adopts the air passenger demand data of Shanghai Pudong airport to confirm the robustness of the proposed approach. The empirical analysis shows that the proposed hybrid approach is superior to other competitive approaches; it is an efficient tool for air transport demand forecasting and can be regarded as a reliable alternative.

The proposed approach integrates the advantages of various methods and technologies and overcomes the data complexity caused by different factors, so that it can capture and learn the intrinsic characteristics of the original air passenger demand. It can also greatly improve forecasting performance and obtain more reliable results. Consequently, the more accurate forecasting results can provide strong and effective references and advice for operation and construction in the aviation industry.

The remainder of this research is arranged as follows. Section 2 describes the framework and development process of the hybrid forecasting approach in detail, along with the evaluation criteria. Three case studies and forecasting results are presented in Section 3. The conclusions are given in Section 4.

2. Methodology

2.1. Variational mode decomposition

The variational mode decomposition (VMD) is a new signal processing method that adaptively determines the relevant frequency bands of the signal and simultaneously estimates the corresponding modes, so as to properly balance the error between the modes. The main framework of the VMD is to solve a variational problem, which deals effectively with the envelope estimation error caused by recursive decomposition methods. It is effective in handling signal noise and avoiding mode aliasing. The VMD algorithm is summarized as Algorithm 1.

First, the intrinsic mode is defined as an amplitude-modulated-frequency-modulated (AM-FM) signal in the VMD algorithm (Dragomiretskiy and Zosso, 2014), and written as:

$$r_k(t) = m_k(t)\cos(\phi_k(t)),$$

where $m_k(t)$ stands for the instantaneous amplitude of $r_k(t)$, also known as the envelope, and the phase $\phi_k(t)$ is a non-decreasing function. According to the above equation, $r_k(t)$ can be regarded as a pure harmonic signal with amplitude $m_k(t)$ and phase $\phi_k(t)$, whose amplitude and frequency change relatively slowly.

In order to calculate the bandwidth of each mode, the analytic signal of each intrinsic mode component is obtained by the Hilbert transform. The spectrum of each mode is shifted to the corresponding “baseband”, and the following expression is used to evaluate the bandwidth of the mode:

$$\left[\left(\delta(t) + \frac{j}{\pi t}\right) * r_k(t)\right] e^{-j w_k t}.$$

The VMD method calculates the sum of the bandwidths of the modes by estimating the $H^1$ Gaussian smoothness of the demodulated signal. For the original input signal $y_t$ (also written $y(t)$), the constrained variational problem can therefore be constructed:

$$\min_{\{r_k\},\{w_k\}} \left\{ \sum_k \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * r_k(t) \right] e^{-j w_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_k r_k = y_t,$$

where $r_k$ is the mode function, $w_k$ ($k = 1, 2, \cdots, K$) represents the corresponding central frequency, $\delta$ stands for the Dirac distribution, $t$ is time, $j^2 = -1$ and $*$ is the convolution operator. By introducing the Lagrangian multiplier $\lambda$ to transform the above constrained problem into an unconstrained variational problem, it is expressed as:

$$L(\{r_k\},\{w_k\},\lambda) = \alpha \sum_k \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * r_k(t) \right] e^{-j w_k t} \right\|_2^2 + \left\| y(t) - \sum_k r_k(t) \right\|_2^2 + \left\langle \lambda(t),\ y(t) - \sum_k r_k(t) \right\rangle,$$

where $\alpha$ denotes a quadratic penalty factor, which ensures the reconstruction accuracy of the signal when noise is mixed into the signal, and $\lambda(t)$ is the Lagrange multiplier.

To address the above problem, $r_k^{n+1}$, $w_k^{n+1}$ and $\lambda^{n+1}$ are updated by the alternating direction method of multipliers to find the saddle point of the Lagrangian function. $r_k^{n+1}$ can be obtained by:

$$\hat{r}_k^{\,n+1}(w) = \frac{\hat{y}(w) - \sum_{i \neq k} \hat{r}_i(w) + \dfrac{\hat{\lambda}(w)}{2}}{1 + 2\alpha (w - w_k)^2}.$$

The update formula for the center frequency is:

$$w_k^{n+1} = \frac{\int_0^{\infty} w \left| \hat{r}_k(w) \right|^2 dw}{\int_0^{\infty} \left| \hat{r}_k(w) \right|^2 dw},$$

where $\hat{r}_k^{\,n+1}(w)$ is expressed as a Wiener filtering of the current residual $\hat{y}(w) - \sum_{i \neq k} \hat{r}_i(w)$, and $w_k^{n+1}$ stands for the center of gravity of the power spectrum of the current mode. See Dragomiretskiy and Zosso (2014) for detailed steps.

Algorithm 1. Decomposition process of VMD
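The two frequency-domain updates of Section 2.1 (the Wiener-filter update of a mode and the center-of-gravity update of its center frequency) can be sketched in a few lines. This is only an illustration under stated assumptions: the spectra are given as plain Python lists on a discrete frequency grid, the integrals are replaced by sums over the non-negative frequencies, and the function names `update_mode` and `update_center` are hypothetical, not taken from the paper or from any library.

```python
def update_mode(y_hat, others_hat, lam_hat, freqs, w_k, alpha):
    """One update of mode k in the frequency domain (Wiener-filter form).

    y_hat:      spectrum of the input signal
    others_hat: sum of the spectra of all other modes (i != k)
    lam_hat:    spectrum of the Lagrange multiplier
    """
    return [
        (y - o + lam / 2) / (1 + 2 * alpha * (w - w_k) ** 2)
        for y, o, lam, w in zip(y_hat, others_hat, lam_hat, freqs)
    ]


def update_center(r_hat, freqs):
    """Center of gravity of the mode's power spectrum over w >= 0.

    Assumes the mode has non-zero power on the non-negative frequencies.
    """
    power = [abs(r) ** 2 for r in r_hat]
    total = sum(p for p, w in zip(power, freqs) if w >= 0)
    return sum(w * p for p, w in zip(power, freqs) if w >= 0) / total


# Toy spectrum with energy at w = 0.1 and w = 0.3 (illustrative values only).
freqs = [0.0, 0.1, 0.2, 0.3, 0.4]
y_hat = [0j, 1 + 0j, 0j, 2 + 0j, 0j]
zeros = [0j] * len(freqs)

r_hat = update_mode(y_hat, zeros, zeros, freqs, w_k=0.1, alpha=50.0)
w_new = update_center(r_hat, freqs)
```

In this toy case the filter leaves the component at the current center frequency untouched and attenuates the component at 0.3, so the updated center frequency is pulled toward 0.1. A larger `alpha` narrows the filter, which is the role of the quadratic penalty factor in the Lagrangian.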

2.2. Autoregressive moving average model

The autoregressive moving average model (ARMA) is a crucial method for studying time series, and is also one of the methods for high-resolution spectral analysis (Box and Jenkins, 2010). This “hybrid” model consists of an autoregressive model (AR model) and a moving average model (MA model). The regression equation ARMA (p, q) is expressed as:

$$y_t = \theta_0 + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \cdots - \theta_q \varepsilon_{t-q},$$

where p is the autoregressive order and q is the moving average order. $y_t$ is the time series, and $\varepsilon_t$ stands for an independent and identically distributed sequence of random variables. $\theta_0$, $\phi_i$ ($i = 1, 2, \cdots, p$) and $\theta_j$ ($j = 1, 2, \cdots, q$) are model parameters. The modeling process of ARMA is as follows:

(1) Testing stationarity: the stationarity test is carried out by means of a time series plot.
(2) Identifying the structure of the model: the values of the autocorrelation function (ACF) and the partial autocorrelation function (PACF) of the observed samples are calculated.
(3) Determining the parameters (p, q) of the model and establishing a model: according to the properties of the ACF and PACF, the appropriate ARMA (p, q) model is selected and fitted to the data.
(4) Testing the validity of the model.
(5) Forecasting data: the fitted model is adopted to predict the future trend of the time series.

2.3. Kernel extreme learning machine

The extreme learning machine (ELM) is an effective single hidden layer feedforward neural network (SLFN). This method determines the weights and biases in a single step instead of tuning them iteratively as in traditional algorithms, so it has a great advantage in learning speed and has been widely applied in many fields. The topology structure of ELM is shown in Fig. 1. The basic principle of the ELM is that the parameters of the hidden layer nodes can be randomly assigned, and the weight of the output layer is obtained by a simple generalized inverse operation on the output matrix of the hidden layer (Huang and Chen, 2007; Huang et al., 2006). For N training samples $(x_j, v_j)$, $x_j, v_j \in R^d$, $j = 1, 2, \cdots, N$, the ELM is expressed as follows:

$$f(x) = \sum_{i=1}^{L} \beta_i\, g(x; w_i, b_i) = G(x)\beta^{\top}$$

Fig. 1. The topology structure of ELM.

Fig. 2. The framework of the VMD-ARMA/KELM-KELM approach.
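The ELM expression $f(x) = \sum_i \beta_i g(x; w_i, b_i)$ can be illustrated with a small forward pass. This is a minimal sketch under assumptions that the paper does not state: the activation $g$ is taken to be a sigmoid of $w_i \cdot x + b_i$, the hidden parameters are drawn uniformly from [-1, 1], and the helper names (`random_hidden`, `hidden_output`, `elm_output`) are hypothetical.

```python
import math
import random


def random_hidden(n_hidden, n_inputs, seed=0):
    """Randomly assign hidden-layer weights and biases (never trained afterwards)."""
    rng = random.Random(seed)
    W = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    return W, b


def hidden_output(x, W, b):
    """G(x): one value per hidden node, assuming g(x; w_i, b_i) = sigmoid(w_i . x + b_i)."""
    return [
        1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + bi)))
        for w, bi in zip(W, b)
    ]


def elm_output(x, W, b, beta):
    """f(x) = sum_i beta_i * g(x; w_i, b_i), i.e. G(x) times the output weights."""
    return sum(beta_i * g_i for beta_i, g_i in zip(beta, hidden_output(x, W, b)))


W, b = random_hidden(n_hidden=3, n_inputs=2)
y = elm_output([0.5, -1.0], W, b, beta=[0.2, -0.4, 1.0])
```

Only `beta` would be fitted from data; `W` and `b` stay at their random values, which is what gives the ELM its speed advantage over iteratively trained networks.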

In the ELM expression $f(x) = G(x)\beta^{\top}$, L is the number of hidden layer nodes, and $w_i$ and $b_i$ are the weight and bias of the hidden layer, respectively. $\beta = (\beta_1, \beta_2, \cdots, \beta_L)$ is the weight vector that connects the hidden and output layers. $G(x) = [g(x; w_1, b_1), g(x; w_2, b_2), \cdots, g(x; w_L, b_L)]$ represents the output of the hidden layer activation functions. The goal of ELM is to ensure that the training error $\varepsilon_j$ is as small as possible and that the norm of the output weights is minimized (Hoerl and Kennard, 1970; Rao and Mitra, 1972). The optimization model can be written as follows:

$$\min_{\beta}\ \frac{1}{2}\|\beta\|^2 + \frac{\lambda}{2}\sum_{j=1}^{N}\varepsilon_j^2 \quad \text{s.t.} \quad G(x_j)\beta^{\top} = v_j - \varepsilon_j,\quad j = 1, 2, \cdots, N,$$

where $\lambda$ is the penalty factor. For the case that the training sample is large, i.e., $N \gg L$, we can get

$$\beta = H^{\top}\left(\frac{I}{\lambda} + HH^{\top}\right)^{-1} T,$$

where H is the hidden-layer output matrix, T is the vector of training targets and I is the identity matrix. The output function of the extreme learning machine is

$$f(x) = G(x)\beta^{\top} = G(x)H^{\top}\left(\frac{I}{\lambda} + HH^{\top}\right)^{-1} T.$$

Kernel functions have strong nonlinear mapping capability (Huang et al., 2011). Linearly inseparable problems are mapped to high-dimensional spaces by kernel functions, thereby making them linearly separable (Huang and Chen, 2008). By referring to the inner product theory of kernel functions, we directly adopt kernel functions to replace the mapping of the hidden layer nodes of ELM. An extreme learning machine based on kernel functions is thus obtained (Huang and Chen, 2008). If the feature mapping function $g(x)$ of the hidden layer neurons is unknown, the kernel matrix $HH^{\top}$, with entries $k(x_i, x_j)$, can be defined, and the output function becomes:

$$f(x) = G(x)H^{\top}\left(\frac{I}{\lambda} + HH^{\top}\right)^{-1} T = \begin{bmatrix} k(x, x_1) \\ k(x, x_2) \\ \vdots \\ k(x, x_N) \end{bmatrix}^{\top}\left(\frac{I}{\lambda} + HH^{\top}\right)^{-1} T.$$

In the process of implementing the ELM by the kernel function, the $G(x)$ of the hidden layer remains unknown and is replaced by the corresponding kernel function $k(x, x_j)$. This indicates that the kernel function can take the place of the random mapping of the ELM and make the output weights more stable than before. In this research, the following four kernel functions are used:

1. Linear kernel function: $k(x, x_j) = x^{\top} x_j$
2. Polynomial kernel function: $k(x, x_j) = (\eta x^{\top} x_j + r)^p$, $\eta > 0$
3. RBF kernel function: $k(x, x_j) = \exp(-\eta \|x - x_j\|^2)$, $\eta > 0$
4. Wavelet kernel function: $k(x, x_j) = \cos(\alpha \|x - x_j\|)\exp(-\eta \|x - x_j\|^2)$, $\alpha, \eta > 0$

2.4. The framework of hybrid ensemble approach

The overall framework of the proposed approach is illustrated in Fig. 2. The proposed hybrid approach (VMD-ARMA/KELM-KELM) for air transport demand forecasting consists of the following steps:

1. Data preparation. The air passenger demand dataset is separated into a training set and a testing set. The training data is employed to build the forecasting approach, and the data from the test set is used to assess the established approach.

2. VMD decomposition. One factor significantly affects the prediction results when the VMD is applied in the data preprocessing step, namely the number of modes. On the one hand, too few modes may inadequately extract the feature information hidden in the original series. On the other hand, too many components may produce poor prediction results because of error accumulation, that is, the forecasting error of each mode accumulates in the final ensemble step (Niu et al., 2016, 2017). After several experiments, we have concluded that the modes obtained by VMD are more robust in terms of noise suppression than the modes decomposed by other approaches such as complementary ensemble empirical mode decomposition (CEEMD) and singular spectrum analysis (SSA). Therefore, when decomposing with VMD, the characteristics of the original data can be expressed with a number of modes equal to or smaller than that produced by CEEMD, SSA or other methods.

3. Stationarity test. Both the autocorrelation function and the unit root test are applied to test the stationarity of the subseries. If the sample autocorrelation coefficients gradually approach zero, the time series is regarded as stationary; otherwise it is not. The unit root test checks whether a unit root exists in the sequence, and a stationary time series is considered to have no unit root. The ADF test, the Dickey-Fuller (DF) test and the Said-Dickey test are the most commonly used unit root tests. In this research, the ADF test is selected as the tool to examine stationarity. If the ADF statistic is smaller than the critical values, the null hypothesis that the original sequence has a unit root is rejected at the corresponding significance level (1%, 5% and 10%), that is, the original sequence is stationary.

4. Based on the stationarity of each subseries, a novel “divide and conquer” input strategy is adopted by the ARMA and KELM. The ARMA model, which relies on the time series itself, captures the quantitative relationship between past data and current data, so as to establish a model with the former as the independent variable and the latter as the dependent variable. In particular, it performs very well in forecasting stationary data. Thus, the ARMA model is used to predict the stationary modes.

5. The KELM is chosen as the forecasting model for non-stationary data. The reasons are: (1) the algorithm randomly generates the connection weights between the input layer and the hidden layer, as well as the thresholds of the hidden layer neurons, which need not be adjusted during the training process; (2) the unique optimal solution is obtained by setting the number of neurons in the hidden layer; (3) the algorithm offers fast operation speed, strong generalization ability and resistance to over-fitting.

6. By combining the forecasting results of the ARMA and KELM, the final result is obtained.

2.5. Performance evaluation criteria

There are a variety of error measurement criteria for evaluating forecasting performance. However, the relevant studies have shown that there is no general standard formula to evaluate the effectiveness of forecasting approaches. Thus, several popular metrics are used to assess the approaches from different perspectives, namely the mean absolute error (MAE), the mean absolute percent error (MAPE), the root mean square error (RMSE) and the directional prediction statistic (Dstat) (Huang and Chen, 2008; Xie et al., 2014; Niu et al., 2016; Flyvbjerg et al., 2005). Low MAE, MAPE and RMSE values and a high Dstat value indicate better prediction performance.
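The four criteria of Section 2.5 (MAE, MAPE, RMSE and Dstat) can be sketched as follows. This is a minimal illustration under two assumptions: each forecast is compared with the actual value of the same period, and Dstat is averaged over the n - 1 consecutive pairs available in a test set of n points. The function names and the sample numbers are hypothetical.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percent error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean square error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def dstat(actual, forecast):
    """Share (in percent) of steps where the forecast moves in the same
    direction as the actual series, relative to the last actual value."""
    hits = [
        (actual[i + 1] - actual[i]) * (forecast[i + 1] - actual[i]) >= 0
        for i in range(len(actual) - 1)
    ]
    return 100.0 * sum(hits) / len(hits)


# Illustrative values only, not data from the paper.
actual = [100, 110, 105, 120]
forecast = [98, 112, 111, 115]
scores = {
    "MAE": mae(actual, forecast),
    "MAPE": mape(actual, forecast),
    "RMSE": rmse(actual, forecast),
    "Dstat": dstat(actual, forecast),
}
```

In this example the second step is a directional miss: the actual series falls from 110 to 105 while the forecast of 111 lies above 110, so two of the three steps count as hits.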

F. Jin et al.

Journal of Air Transport Management 83 (2020) 101744

Fig. 3. Monthly air passenger demand series for three airports.

better prediction performance. The most widely used measures are n P

A’iþ1 j

jAi

MAE ¼ i¼1

n

� n � P �Ai �

MAPE ¼ i¼1

where at ¼ 1, if ðAiþ1 Ai ÞðA’iþ1 Ai Þ � 0, and at ¼ 0, otherwise. Ai and A’i stand for the actual and forecasting values of the time series in period i, respectively, and n is the number of testing sample sets. To assess evaluate whether the forecasting accuracy of the proposed hybrid approach is statistically superior to other competitive ap­ proaches, the Diebold Mariano (DM) test method is introduced, which has been used in many research applications(Diebold and Mariano, 1995). The hypothesis of DM test is defined as:

� �

A’iþ1 � � Ai

n

� 100%

vffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi uP un u ðAi A’iþ1 Þ2 t RMSE ¼ i¼1 n Dstat ¼

H0 : Dðfh Þ ¼ 0; 8h

H1 : Dðfh Þ 6¼ 0; 9h;

where fh ¼ Sðetþh Þ Sðetþh Þ, e stands for the forecasting error and S represents a loss function. The test statistic is calculated by: ðAÞ

ðBÞ

n 1X at � 100%; n i¼1

Fig. 4. The decomposition results of air passenger demand series via VMD. 6

F. Jin et al.

Journal of Air Transport Management 83 (2020) 101744

Fig. 5. Autocorrelation function graph of the subseries via Eviews in Beijing.

Pk

.

(1) Determining the referenced series and the compared series. The referenced series: Yi ¼ fyi ðkÞ; k ¼ 1; 2; ⋯; m; i ¼ 1; 2; ⋯; ng, the compared series: Y0 ¼ fy0 ðkÞ; k ¼ 1; 2; ⋯; mg. (2) Processing the referenced series and compared series as nondimensionalize;

k ffiffiffiffiffi ; DM ¼ qffiffiffiffiffi� 2 E k h¼1 fh

where E2 represents an estimation of the variance of fh . The null hy­ pothesis can be rejected when jDMj > Zα=2 , α represents significance level. In the process of the system development, if the two factors change synchronously, it can be said that they have a higher degree of corre­ lation, on the contrary, they are lower. Therefore, the gray correlation analysis (GCA) method is a method that measures the degree of corre­ lation according to the similarity or difference degree of the develop­ ment trend among factors, i.e., “gray correlation degree”. The specific calculation steps of GCA are as follows:

y’j ðkÞ ¼

yj ðkÞ maxk yj ðkÞ

ðj ¼ 0; 1; ⋯; nÞ;

where maxk yj ðkÞ refers to the maximum of jth series. (3) Calculating the gray correlation coefficient between the refer­ enced series and compared series, note ϑi ðkÞ as the gray corre­ lation coefficient:

Fig. 6. Autocorrelation function graph of the subseries via Eviews in Guangzhou. 7

F. Jin et al.

Journal of Air Transport Management 83 (2020) 101744

In this subsection, the monthly data on air passenger demand in Beijing, Guangzhou, and Shanghai Pudong airport are collected from January 2006 to November 2017. As shown in Fig. 3, the air passenger time series of three airports in level show non-stationary. Furthermore, the training set covers the period from January 2006 to July 2015, while the out-of-sample data is from August 2015 to November 2017. The training set of this research accounts for 80% of all data.

Table 1 ADF values of the proposed approach and competitive approaches, Beijing and Guangzhou airport. Model

Beijing airport t-statistic

Mode Mode Mode Mode Mode a

1 2 3 4 5

5.8576 11.3456 7.3721 5.1862 3.9317

Guangzhou airport p-valuea 0.0000 0.0000 0.0000 0.0001 0.0000

t-statistic 4.2375 12.1610 6.0345 3.5307 –

p-valuea 0.0000 0.0000 0.0000 0.0005 –

3.1. Analysis of the case study results As described in Section 2.4, here the VMD-ARMA/KELM-KLEM approach is developed to forecast the passenger demand series. For a start, both of the passenger demand series of Beijing airport and Guangzhou airport are decomposed by the VMD, and the results are displayed in Fig. 4. Since too few patterns will not completely extract the feature information hidden in the original data, it is crucial for the VMD to decompose the original data into how many modes (Niu et al., 2018). It has been experimentally confirmed that the modes gained by VMD is much smoother than that of CEEMD, EEMD and SSA decomposition. Therefore, by comparing with the EEMD and SSA or other methods, when decomposing with VMD, we can completely express the charac­ teristics of the original data by setting the number of components appropriately, which is less than or equal to the number decomposed by EEMD, SSA or other similar methods. In this paper, the passenger de­ mand series of Beijing airport and Guangzhou airport are decomposed into seven modes, respectively. After being decomposed by VMD, a series of modes arranged from low frequency to high frequency are generated. For Beijing airport, the stationarity analysis is first applied to the VMD decomposition subseries. In Fig. 5, the ACF coefficients of Modes 1–5 exhibit a trend of gradual decay. The ACF coefficients of other modes tend to fall outside the confidence interval, which implies that these modes are not stationary. The ADF test values for Modes 1–5 indicate that the t-statistic is lower than the significance level (1%), therefore, the null hypothesis of the

Note to Table 1: a = MacKinnon (1996) one-sided p-values.

$$\vartheta_i(k) = \frac{\min_i \min_k |y_0(k) - y'_i(k)| + \alpha \max_i \max_k |y_0(k) - y'_i(k)|}{|y_0(k) - y'_i(k)| + \alpha \max_i \max_k |y_0(k) - y'_i(k)|}$$

where $\min_i \min_k |y_0(k) - y'_i(k)|$ refers to the minimum of the absolute differences between the reference series and all compared series. By the same token, $\max_i \max_k |y_0(k) - y'_i(k)|$ is the maximum of these differences. $\alpha$ is the distinguishing coefficient, which lies between 0 and 1. In this study, $\alpha$ is equal to 0.5.

(4) Computing the gray correlation degrees:

$$D_i = \frac{1}{m} \sum_{k=1}^{m} \vartheta_i(k), \quad (i = 1, 2, \cdots, n).$$
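The gray correlation computation above can be illustrated with a minimal Python sketch. It assumes the series have already been preprocessed as in the earlier steps, and it uses the standard form of the coefficient with α·max in the denominator. The function name `gray_relational_degrees` and the toy numbers are hypothetical, chosen only for illustration.

```python
def gray_relational_degrees(reference, candidates, alpha=0.5):
    """Return one gray relational degree D_i per candidate series.

    reference  -- the actual series y0(k)
    candidates -- list of forecast series y'_i(k), same length as reference
    alpha      -- distinguishing coefficient in (0, 1); 0.5 as in the text
    """
    # Absolute differences |y0(k) - y'_i(k)| for every series i and point k.
    diffs = [[abs(r - c) for r, c in zip(reference, cand)] for cand in candidates]
    flat = [d for row in diffs for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every candidate coincides with the reference, so every degree is 1.
        return [1.0 for _ in candidates]
    degrees = []
    for row in diffs:
        # Gray relational coefficient at each point k, then its mean over k.
        coeffs = [(d_min + alpha * d_max) / (d + alpha * d_max) for d in row]
        degrees.append(sum(coeffs) / len(coeffs))
    return degrees


# Toy example (hypothetical numbers): the first candidate matches exactly.
actual = [1.0, 2.0, 3.0]
forecasts = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
print(gray_relational_degrees(actual, forecasts))
```

A degree closer to 1 means the forecast series follows the actual series more closely, which is how the values in Table 6 are read.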

3. Empirical study

Three empirical data sets are used to demonstrate the forecasting performance of the proposed approach in this research. The data from Beijing and Guangzhou airports are used to establish and test the performance of the hybrid approach, and the data from Shanghai Pudong airport are adopted to verify the hybrid approach's applicability and robustness.
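The four evaluation criteria reported throughout this section (MAE, MAPE, RMSE and Dstat) can be sketched as follows. The definitions below are the commonly used forms and are an assumption here, since the paper's own formulas appear in an earlier section. In particular, MAPE and Dstat are returned as percentages, and the sample numbers are purely illustrative.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (assumes no zero actuals)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def dstat(actual, forecast):
    """Directional statistic, in percent.

    A hit is counted when the forecast move from the last actual value has
    the same sign as the actual move (assumed definition).
    """
    hits = 0
    for t in range(1, len(actual)):
        actual_move = actual[t] - actual[t - 1]
        forecast_move = forecast[t] - actual[t - 1]
        if actual_move * forecast_move >= 0:
            hits += 1
    return 100.0 * hits / (len(actual) - 1)


# Illustrative numbers only (not taken from the paper).
y = [100.0, 110.0, 105.0, 120.0]
y_hat = [98.0, 112.0, 111.0, 115.0]
print(mae(y, y_hat), mape(y, y_hat), rmse(y, y_hat), dstat(y, y_hat))
```

Lower MAE, MAPE and RMSE indicate higher level accuracy, while a higher Dstat indicates better directional accuracy.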

Fig. 7. One-step ahead forecasting results in Beijing and Guangzhou airport.

Fig. 8. Two-step ahead forecasting results in Beijing and Guangzhou airport.

Table 2 The results of one-step forecasting performances of all approaches.

Models                     Beijing airport                         Guangzhou airport
                           MAE       MAPE      RMSE      Dstat     MAE       MAPE      RMSE      Dstat
ARIMA                      65.3568   155       61.9683   33.3333   43.7252   11.5825   62.5841   33.3333
ELM                        51.3267   0.8743    59.4474   37.0370   30.0243   11.3567   28.4904   37.0370
Elman                      49.1825   0.307     55.4731   40.7407   23.9744   0.1669    27.3458   40.7407
PSO-SVM                    22.1546   0.2038    43.1793   51.8518   22.1546   0.0886    25.1123   44.4444
KELM                       36.1105   0.0658    42.4478   59.2592   19.9245   0.0810    22.7063   47.4785
SSA-ARMA/KELM-KELM         32.9267   0.0634    42.4437   62.9629   19.1341   0.0480    21.5020   55.5555
EEMD-ARMA/KELM-KELM        19.1341   0.0412    16.3654   70.3703   8.9082    0.0422    10.3228   62.9629
CEEMD-ARMA/KELM-KELM       14.5238   0.0369    15.5627   74.0740   8.2214    0.0389    9.8640    66.6666
VMD-KELM-KELM              14.5230   0.0199    15.5598   88.8888   7.3094    0.0239    9.1223    74.0740
VMD-ARMA/ELM-ELM           3.7772    0.0181    6.0689    90.4792   5.8138    0.0220    8.6417    81.4814
VMD-ARMA/ELMAN-ELMAN       3.1688    0.0180    5.6758    92.7783   0.7845    0.0020    1.0017    92.5925
VMD-ARMA/PSO-SVM-PSO-SVM   0.7065    0.0017    1.5314    93.2241   0.5967    0.0015    1.0002    94.3677
VMD-ARMA/KELM-KELM         0.5967    0.0009    0.8174    96.2329   0.3794    0.0010    0.5837    96.2962

Note: (a/b). a: the stationary series forecasting model. b: the non-stationary series forecasting model.

unit root test is rejected. Similarly, as shown in Fig. 6 and Table 1, the ACF values of Modes 1–4 show signs of exponential decay for Guangzhou airport. The ADF values of Modes 1–4 indicate that the t-statistic is lower than the critical value at the 1% significance level. In short, both methods clearly illustrate that Modes 1–5 of Beijing airport and Modes 1–4 of Guangzhou airport are stationary series. Thus, the ARMA model is appropriate for forecasting these modes.

Next, the input structure and order of the ARMA model are determined. The coefficients of the correlograms (ACF) display exponential decay at different lags. When we estimate the ARMA model and its two lag orders p and q, two common information criteria, the Akaike information criterion (AIC) and the Schwarz criterion (SC), are introduced for justification, and the optimal model is the one with minimal AIC and SC values. The ARMA(p, q) model is developed by testing the values of AIC and SC. As we can see from Table 1, Modes 1–5 and Modes 1–4 are stationary for Beijing and Guangzhou airports, respectively. Therefore, the ARMA model is adopted to forecast the stationary modes mentioned above. For brevity, the p, q values of the ARMA model for each stationary mode are not provided here. Then, the same method is applied individually to all subseries obtained via VMD decomposition. After the stationarity test, the ACF coefficients of the remaining subseries, namely Modes 6–7 of Beijing airport and Modes 5–7 of Guangzhou airport, fall outside the confidence interval and do not satisfy the conditions of the stationarity test. At the same time, the ADF test value indicates that the t-statistic is higher than the


Table 3 The results of two-step forecasting performances of all approaches.

Models                     Beijing airport                         Guangzhou airport
                           MAE       MAPE      RMSE      Dstat     MAE       MAPE      RMSE      Dstat
ARIMA                      63.1276   2.0896    66.1046   26.9230   67.7933   0.0523    23.7757   26.9230
ELM                        53.3854   1.6039    57.8460   34.6153   42.9744   2.2148    51.1939   34.6153
Elman                      46.7028   1.0290    55.7392   34.6153   39.1384   0.9800    47.3306   46.1538
PSO-SVM                    44.6158   0.9925    47.8707   34.6153   34.3331   0.8933    36.8156   50
KELM                       40.3781   0.7516    46.1381   42.3076   24.5679   0.6294    30.0244   53.8461
SSA-ARMA/KELM-KELM         37.2209   0.6643    45.7125   50        18.2502   0.5222    23.1122   55.6548
EEMD-ARMA/KELM-KELM        19.3059   0.3177    25.0624   52.4680   14.7574   0.4259    18.0099   57.6923
CEEMD-ARMA/KELM-KELM       18.6266   0.2675    21.6214   53.8461   13.9949   0.3693    16.9562   59.2574
VMD-KELM-KELM              17.5930   0.2302    20.5135   56.1352   9.7665    0.1391    11.7765   60.9768
VMD-ARMA/ELM-ELM           15.3156   0.0690    18.0052   57.6923   8.8369    0.0872    10.5441   63.6362
VMD-ARMA/ELMAN-ELMAN       10.3854   0.0607    13.2839   65.3846   8.3915    0.0796    10.4941   65.3846
VMD-ARMA/PSO-SVM-PSO-SVM   7.8164    0.0507    10.1069   76.9230   4.2753    0.0464    5.3459    76.9230
VMD-ARMA/KELM-KELM         5.1546    0.0472    6.9202    80.7692   3.3159    0.0360    4.3011    79.5769

Note: (a/b). a: the stationary series forecasting model. b: the non-stationary series forecasting model.

Fig. 9. VMD decomposition results of passenger demand series in Pudong airport.

critical value at the 1% significance level, and therefore the null hypothesis of a unit root cannot be rejected. These two results show that Modes 6–7 of Beijing airport and Modes 5–7 of Guangzhou airport are non-stationary sequences. Accordingly, the KELM is used to predict the non-stationary series, exploiting its particular advantages. Finally, another KELM is applied to integrate the predictions of the ARMA model and the KELM into the final result.
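The routing rule described above (stationary modes go to ARMA, non-stationary modes go to KELM) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `route_modes` is hypothetical, the decision is made on the ADF p-value at the 1% level, and the p-values for Modes 6–7 are placeholders because the paper does not report them.

```python
def route_modes(adf_p_values, alpha=0.01):
    """Map each mode index to 'ARMA' (stationary) or 'KELM' (non-stationary)."""
    routing = {}
    for mode, p_value in adf_p_values.items():
        # Reject the unit-root null when p < alpha: treat the mode as stationary.
        routing[mode] = "ARMA" if p_value < alpha else "KELM"
    return routing


# Modes 1-5: p-values reported for Beijing airport in Table 1.
# Modes 6-7: hypothetical placeholder values, not reported in the paper.
beijing_p_values = {1: 0.0000, 2: 0.0000, 3: 0.0000, 4: 0.0001, 5: 0.0000,
                    6: 0.40, 7: 0.60}
print(route_modes(beijing_p_values))
```

The forecasts of all modes would then be passed to the final ensemble KELM, which is outside the scope of this sketch.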

3.2. Comparison and discussion

Several benchmark approaches are employed to demonstrate the superiority of the decomposition-ensemble approach in both one-step and two-step ahead forecasting. The forecasting results of all approaches involved are displayed in Fig. 7 and Fig. 8, and the values of the evaluation criteria are given in Table 2 and Table 3. From the forecasting results and empirical analysis, three conclusions are drawn:

(1) Tables 2 and 3 show that the proposed approach outperforms the twelve other time series approaches (the non-decomposition models, EEMD-ARMA/KELM-KELM, CEEMD-ARMA/KELM-KELM, SSA-ARMA/KELM-KELM, etc.) in terms of MAE, MAPE and RMSE. The results also show that using VMD decomposition as a data preprocessing method helps extract the different fluctuation characteristics hidden in the original passenger demand series and thus improves the forecasting accuracy.

Fig. 10. Autocorrelation function graph of the subseries via Eviews in Pudong.

Fig. 11. One-step ahead forecasting results in Pudong airport.

Fig. 12. Two-step ahead forecasting results in Pudong airport.

(2) The proposed approach also achieves a higher Dstat than the twelve other approaches listed (the non-decomposition approaches, EEMD-ARMA/KELM-KELM, CEEMD-ARMA/KELM-KELM, SSA-ARMA/KELM-KELM, etc.) in Tables 2 and 3. We think the main reasons are as follows: (a) the ARMA model can capture the internal connection between the past and current data of the stationary series; and (b) the KELM has the advantages of fast learning speed and excellent generalization capability, which make it well suited to non-linear series forecasting.


Table 4 Comparison of forecasting performances of different approaches (Pudong).

Models                     One-step forecasting                    Two-step forecasting
                           MAE       MAPE      RMSE      Dstat     MAE       MAPE      RMSE      Dstat
ARIMA                      34.9946   0.3654    62.5859   33.3333   48.8169   6.1479    60.1103   26.9230
ELM                        32.6412   0.2452    46.3306   44.4444   39.3512   4.9702    57.8460   34.6153
Elman                      31.2687   0.2384    37.8679   47.2547   37.4302   1.4811    51.0975   46.1538
PSO-SVM                    30.1827   0.1121    36.3092   59.2592   33.5230   1.0010    41.9020   50
KELM                       26.9515   0.0629    35.5966   60.5911   32.3771   0.6495    39.5814   57.6923
SSA-ARMA/KELM-KELM         12.4620   0.0558    34.7475   62.9629   21.7468   0.6018    37.0406   61.5384
EEMD-ARMA/KELM-KELM        12.3777   0.0534    13.9498   62.9629   18.4620   0.4253    25.7561   61.5384
CEEMD-ARMA/KELM-KELM       11.8387   0.0475    13.3162   66.6666   14.8830   0.2949    20.4078   65.3846
VMD-KELM-KELM              8.8279    0.0231    12.9728   88.8888   10.3873   0.2302    17.5587   66.3790
VMD-ARMA/ELM-ELM           8.5374    0.0216    10.9623   90.1960   10.1671   0.0890    11.9171   66.3846
VMD-ARMA/ELMAN-ELMAN       7.5740    0.0154    10.2062   91.6870   5.1308    0.0720    11.7652   69.2307
VMD-ARMA/PSO-SVM-PSO-SVM   4.7967    0.0146    9.2784    91.6870   3.9025    0.0604    6.3423    76.9230
VMD-ARMA/KELM-KELM         1.9885    0.0089    6.0039    92.3215   1.8363    0.0585    5.1185    80.7692

Note: (a/b). a: the stationary series forecasting model. b: the non-stationary series forecasting model.

Table 5 DM test results of the developed approach and competitive approaches, Beijing airport, Guangzhou airport and Pudong airport.

Models                     Beijing airport       Guangzhou airport     Pudong airport
                           One-step  Two-step    One-step  Two-step    One-step  Two-step
ARIMA                      7.9659    8.2627      4.6835    5.9303      6.1930    8.9661
ELM                        2.2377a   4.8923      3.9093    11.8187     9.7620    17.7273
Elman                      10.1952   4.4425      6.6818    11.2005     5.9017    3.3096
PSO-SVM                    0.1782    2.7255      3.7184    9.4612      3.7385    2.8719
KELM                       2.2566a   0.1430      0.5550    7.1568      7.2059    0.2189
SSA-ARMA/KELM-KELM         7.7020    7.2041      7.0984    3.7832      5.3697    2.8487
EEMD-ARMA/KELM-KELM        6.2992    1.9857a     5.7525    2.1408a     4.4469    6.0148
CEEMD-ARMA/KELM-KELM       14.3389   21.4436     13.0717   12.4960     6.2395    9.7776
VMD-KELM-KELM              14.8996   0.1664      2.1041    0.0436      0.2189    4.5730
VMD-ARMA/ELM-ELM           4.6431    6.0262      10.7797   0.5392      9.4859    0.1578
VMD-ARMA/ELMAN-ELMAN       10.7397   2.2869a     9.8177    0.0173      10.6026   0.1227
VMD-ARMA/PSO-SVM-PSO-SVM   0.6016    0.1028      9.8179    4.6720      10.3086   0.0342
VMD-ARMA/KELM-KELM         –         –           –         –           –         –

Table 6 GCA results of all approaches.

Models                     Beijing airport       Guangzhou airport     Pudong airport
                           One-step  Two-step    One-step  Two-step    One-step  Two-step
ARIMA                      0.92357   0.72262     0.79387   0.71772     0.75046   0.76097
ELM                        0.87874   0.66138     0.64596   0.70668     0.63775   0.60471
Elman                      0.90824   0.65310     0.70219   0.73971     0.70985   0.63902
PSO-SVM                    0.91035   0.66700     0.79355   0.70917     0.77802   0.66697
KELM                       0.90676   0.65080     0.78183   0.68097     0.75684   0.66773
SSA-ARMA/KELM-KELM         0.98827   0.85543     0.90420   0.85192     0.89567   0.84783
EEMD-ARMA/KELM-KELM        0.93008   0.81472     0.92290   0.85265     0.85249   0.80369
CEEMD-ARMA/KELM-KELM       0.97405   0.86646     0.87701   0.83038     0.83862   0.81617
VMD-KELM-KELM              0.98513   0.88714     0.95345   0.91935     0.90559   0.91913
VMD-ARMA/ELM-ELM           0.52270   0.76488     0.68343   0.81763     0.80763   0.75350
VMD-ARMA/ELMAN-ELMAN       0.98608   0.73238     0.92490   0.77355     0.91648   0.80576
VMD-ARMA/PSO-SVM-PSO-SVM   0.98460   0.77881     0.95338   0.83931     0.90691   0.85314
VMD-ARMA/KELM-KELM         0.98814   0.99992     0.99395   0.99944     0.96920   0.99884

Note to Table 5: a denotes the 5% significance level.

(3) Furthermore, the kernel function also plays an essential role in producing the optimal KELM. Tables 2 and 3 clearly show that the developed approach achieves higher accuracy than the ELM-based VMD-ARMA/ELM-ELM approach.

3.3. Robustness of the models

In order to further confirm the validity, applicability and robustness of the novel hybrid approach VMD-ARMA/KELM-KELM, the passenger demand data set (see Fig. 3) collected at Shanghai Pudong airport is used as another exercise in this research. As in the case of


Table 7 The results of improvement percentage of the developed approach compared with the benchmark approaches in one-step forecasting.

Models                     Beijing airport                 Guangzhou airport
                           MAE (%)   MAPE (%)  RMSE (%)    MAE (%)   MAPE (%)  RMSE (%)
ARIMA                      99.087    99.999    98.681      99.132    99.991    99.067
ELM                        98.837    99.897    98.625      98.736    99.991    97.951
Elman                      98.787    99.707    98.526      98.417    99.401    97.865
PSO-SVM                    97.307    99.558    98.107      98.287    98.871    97.579
KELM                       98.348    98.632    98.074      98.096    98.765    97.429
SSA-ARMA/KELM-KELM         98.188    98.580    98.074      98.017    97.917    97.285
EEMD-ARMA/KELM-KELM        96.881    97.816    95.005      95.741    97.630    94.346
CEEMD-ARMA/KELM-KELM       95.892    97.561    94.748      95.385    97.429    94.083
VMD-KELM-KELM              95.891    95.477    94.747      94.809    95.816    93.601
VMD-ARMA/ELM-ELM           84.203    95.028    86.531      93.474    95.455    93.246
VMD-ARMA/ELMAN-ELMAN       81.170    95.000    85.599      51.638    50.000    41.729
VMD-ARMA/PSO-SVM-PSO-SVM   15.541    47.059    46.624      36.417    33.333    41.642

Note: (a/b). a: the stationary series forecasting model. b: the non-stationary series forecasting model.

Table 9 The results of improvement percentage of the developed approach compared with the competitive approaches in Pudong airport.

Models                     One-step forecasting            Two-step forecasting
                           MAE (%)   MAPE (%)  RMSE (%)    MAE (%)   MAPE (%)  RMSE (%)
ARIMA                      94.318    97.564    90.407      96.238    99.048    91.485
ELM                        93.908    96.370    87.041      95.334    98.823    91.152
Elman                      93.641    96.267    84.145      95.094    96.050    89.983
PSO-SVM                    93.412    92.061    83.465      94.522    94.156    87.785
KELM                       92.622    85.851    83.134      94.328    90.993    87.068
SSA-ARMA/KELM-KELM         84.043    84.050    82.721      91.556    90.279    86.181
EEMD-ARMA/KELM-KELM        83.935    83.333    56.961      90.054    86.245    80.127
CEEMD-ARMA/KELM-KELM       83.203    81.263    54.913      87.662    80.163    74.919
VMD-KELM-KELM              77.475    61.472    53.719      82.322    74.587    70.849
VMD-ARMA/ELM-ELM           76.708    58.796    45.231      81.939    34.270    57.049
VMD-ARMA/ELMAN-ELMAN       73.746    42.208    41.174      64.210    18.750    56.495
VMD-ARMA/PSO-SVM-PSO-SVM   58.544    39.041    41.174      52.946    3.146     19.296

Note: (a/b). a: the stationary series forecasting model. b: the non-stationary series forecasting model.

shown in Table 4. According to the experimental results, conclusions analogous to those for Beijing and Guangzhou airports can be drawn. As in Table 3, the hybrid approach VMD-ARMA/KELM-KELM performs excellently compared with the other competitive approaches, including the non-decomposition approaches, EEMD-ARMA/KELM-KELM, CEEMD-ARMA/KELM-KELM, SSA-ARMA/KELM-KELM, etc. In this experiment, once again, the results demonstrate that the developed approach is suitable for both one-step and two-step passenger demand forecasting and has strong applicability. Similarly, among all the VMD-based strategies, the performance of the proposed approach (VMD-ARMA/KELM-KELM) is much superior to that of the others (VMD-ARMA/ELM-ELM, VMD-ARMA/ELMAN-ELMAN, VMD-ARMA/PSO-SVM-PSO-SVM and VMD-KELM-KELM). In sum, the decomposition strategy (VMD, EEMD, CEEMD and SSA) can significantly increase the forecasting ability of the KELM model, and the kernel function has a beneficial impact on the ELM model.
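The improvement percentages reported in Tables 7–9 appear to follow the usual relative-reduction formula, (benchmark error − proposed error) / benchmark error × 100. The short Python sketch below assumes that formula and reproduces two table entries from the MAE values in Tables 2 and 4. The function name `improvement_percentage` is hypothetical.

```python
def improvement_percentage(benchmark_error, proposed_error):
    """Relative error reduction of the proposed approach, in percent."""
    return 100.0 * (benchmark_error - proposed_error) / benchmark_error


# One-step MAE, Beijing airport (Table 2): ARIMA vs. VMD-ARMA/KELM-KELM.
print(round(improvement_percentage(65.3568, 0.5967), 3))  # 99.087, as in Table 7

# One-step MAE, Pudong airport (Table 4): ARIMA vs. VMD-ARMA/KELM-KELM.
print(round(improvement_percentage(34.9946, 1.9885), 3))  # 94.318, as in Table 9
```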

Table 8 The results of improvement percentage of the developed approach compared with the competitive approaches in two-step forecasting.

Models                     Beijing airport                 Guangzhou airport
                           MAE (%)   MAPE (%)  RMSE (%)    MAE (%)   MAPE (%)  RMSE (%)
ARIMA                      91.835    97.741    89.531      95.109    99.166    81.910
ELM                        90.345    97.057    88.037      92.284    98.375    91.598
Elman                      88.963    95.413    87.585      91.528    96.327    90.913
PSO-SVM                    88.447    95.244    85.544      90.342    95.970    88.317
KELM                       87.234    93.720    85.001      86.503    94.280    85.675
SSA-ARMA/KELM-KELM         86.151    92.895    84.861      81.831    93.106    81.390
EEMD-ARMA/KELM-KELM        73.300    85.143    72.388      77.531    91.547    76.118
CEEMD-ARMA/KELM-KELM       72.327    82.355    67.994      76.306    90.252    74.634
VMD-KELM-KELM              70.701    79.496    66.265      66.048    74.119    63.477
VMD-ARMA/ELM-ELM           66.344    31.594    61.566      62.477    58.716    59.208
VMD-ARMA/ELMAN-ELMAN       50.367    22.241    47.905      60.485    54.774    59.014
VMD-ARMA/PSO-SVM-PSO-SVM   34.054    6.903     31.530      36.417    22.414    19.544

Note: (a/b). a: the stationary series forecasting model. b: the non-stationary series forecasting model.

3.4. DM test

To evaluate the forecasting accuracy of the different approaches from a statistical perspective, the DM test is applied to the approaches involved, and the results are shown in Table 5. For Beijing airport, the proposed VMD-ARMA/KELM-KELM approach is significantly better than the other benchmark approaches except PSO-SVM and VMD-ARMA/PSO-SVM-PSO-SVM in one-step forecasting, and except KELM, VMD-KELM-KELM and VMD-ARMA/PSO-SVM-PSO-SVM in two-step forecasting. For the one-step forecasting of Guangzhou airport, the performance of the developed approach is quite close to that of the KELM model, and in two-step forecasting it is also close to VMD-KELM-KELM, VMD-ARMA/ELM-ELM and VMD-ARMA/ELMAN-ELMAN. In addition, for Pudong airport the VMD-ARMA/KELM-KELM approach is significantly better than the other competitive approaches at the 95% confidence level, apart from VMD-KELM-KELM, KELM, VMD-ARMA/ELM-ELM, VMD-ARMA/ELMAN-ELMAN and VMD-ARMA/PSO-SVM-PSO-SVM.
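For readers unfamiliar with the DM test, the statistic can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a squared-error loss, a rectangular-kernel long-run variance with lags up to horizon − 1, and comparison of the statistic with the standard normal critical value (1.96 at the 5% level). The function name `dm_statistic` and the error series are hypothetical.

```python
import math


def dm_statistic(errors_benchmark, errors_proposed, horizon=1):
    """Diebold-Mariano statistic under squared-error loss.

    Positive values mean the benchmark has the larger average loss.
    """
    # Loss differential d_t between the two sets of forecast errors.
    d = [e1 ** 2 - e2 ** 2 for e1, e2 in zip(errors_benchmark, errors_proposed)]
    n = len(d)
    d_bar = sum(d) / n

    def autocov(lag):
        return sum((d[t] - d_bar) * (d[t - lag] - d_bar) for t in range(lag, n)) / n

    # Long-run variance: lag-0 variance plus twice the autocovariances up to
    # horizon - 1. This estimate can be non-positive in small samples, in which
    # case the square root below fails and another estimator is needed.
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, horizon))
    return d_bar / math.sqrt(long_run_var / n)


# Hypothetical forecast errors, for illustration only.
benchmark_errors = [2.0, -3.0, 2.5, -2.0, 3.0, -2.5]
proposed_errors = [0.5, -0.4, 0.6, -0.5, 0.3, -0.6]
print(dm_statistic(benchmark_errors, proposed_errors))
```

Under these assumptions, a statistic above 1.96 in absolute value would indicate a significant difference in accuracy at the 5% level.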

Beijing and Guangzhou airports, the original passenger demand series is decomposed via VMD, as shown in Fig. 9, and the autocorrelation function graph of the subseries is shown in Fig. 10. The forecasting results of all approaches involved in this research, including the developed approach and the competitive approaches, are illustrated in Fig. 11 and Fig. 12 for one-step and two-step forecasting, respectively. The forecasting errors of all approaches, including MAE, MAPE, RMSE and Dstat, are calculated and


Fig. 13. The one-step error results of the developed approach and other benchmark approaches for Beijing airport.

Fig. 14. The two-step error results of developed approach and other benchmark approaches for Beijing airport.

3.5. GCA results

In this subsection, gray correlation analysis is used to assess the performance of the different approaches. The relevance between the actual values and the forecasts is measured by the gray relational degrees. The analysis results are shown in Table 6. The forecast values of the proposed approach display the largest relational degrees with the actual ones.

3.6. Analysis and evaluation of the proposed approach

Due to the noticeable irregularity, volatility and seasonality of air passenger demand, econometric and time series forecasting models are unsatisfactory in capturing such uncertain behavior. Therefore, reducing data complexity is the key to improving forecasting accuracy. The proposed novel hybrid approach, i.e., VMD-ARMA/KELM-KELM, is superior to the other comparative approaches in air passenger demand forecasting. Tables 7–9 report the improvement percentage of the proposed approach over each benchmark. For example, compared with the linear ARIMA model for Beijing airport, the proposed model improves MAE, MAPE and RMSE by 99.087%, 99.999% and 98.681%, respectively, in one-step forecasting. In the two-step forecasting of Guangzhou airport, the established approach is 36.417%, 22.414% and 19.544% better than VMD-ARMA/PSO-SVM-PSO-SVM in MAE, MAPE and RMSE, respectively. At the same time, Figs. 13–18 show that the forecasting error of the developed approach is the smallest among the main benchmark approaches. There are three main reasons for this: (1) The VMD decomposition method can effectively reduce the complexity of air passenger demand data and help to improve the forecasting performance. (2) According to the ADF test, all modes are classified into stable and unstable sequences. (3) Based on the advantages of the different forecasting approaches, the ARMA model and the KELM are used to predict the stable and unstable sequences, respectively. In short, the results show that reducing the complexity of the data and forecasting according to the characteristics of the data yields better forecasting performance, and the developed hybrid approach can also be applied to other problems with similarly complex time series characteristics.

Fig. 15. The one-step error results of developed approach and other benchmark approaches for Guangzhou airport.

Fig. 16. The two-step error results of developed approach and other benchmark approaches for Guangzhou airport.

Fig. 17. The one-step error results of developed approach and other benchmark approaches for Pudong airport.

Fig. 18. The two-step error results of developed approach and other benchmark approaches for Pudong airport.

Table 10 The sensitivity analysis of multiple iterations based on the final ensemble KELM.

Iteration           One-step                        Two-step
                    MAE       MAPE      RMSE        MAE       MAPE      RMSE
Beijing airport
50                  0.6486    0.1077    0.8715      5.2322    0.0606    6.9282
70                  0.6714    0.0018    0.8586      5.1934    0.5294    6.9367
100                 0.5967    0.0009    0.8174      5.2108    0.0569    6.9872
120                 0.6203    0.001     0.8277      5.2349    0.04903   7.1687
150                 0.6146    0.0013    0.8504      5.1565    0.0487    7.0716
200                 0.6003    0.00096   0.83        5.1546    0.0472    6.9202
Guangzhou airport
50                  0.3991    0.0019    0.6103      0.0432    0.0439    4.3264
70                  0.3794    0.001     0.5837      3.3159    0.036     4.3011
100                 0.3849    0.0024    0.5994      0.3489    0.0369    4.4163
120                 0.3956    0.00102   0.5845      0.3264    0.0361    4.3206
150                 0.4193    0.0013    0.6005      0.3233    0.0374    4.3278
200                 0.4293    0.0012    0.6147      0.3206    0.04019   4.3136
Pudong airport
50                  2.1586    0.0174    6.3909      1.9726    0.0616    5.4423
70                  2.136     0.0142    6.1386      1.8985    0.0739    5.9043
100                 2.458     0.0189    5.9284      1.8363    0.0585    5.1185
120                 1.9885    0.0089    6.0039      1.9707    0.0684    5.1793
150                 2.0468    0.0093    6.2212      1.9019    0.0598    5.1877
200                 2.0294    0.00901   6.1249      2.0128    0.082     5.3219

3.7. Sensitivity analysis in the proposed approach

The number of iterations of the final ensemble KELM plays a significant role in the stability and validity of the proposed approach. An excessive number of iterations may cause overfitting or trap the model in a local optimum. Consequently, the number of iterations of the final ensemble KELM model is analyzed using the air passenger demand from the three airports. The experimental results are presented in Table 10, from which we can draw three conclusions: (1) As the number of iterations increases, all three evaluation criteria fluctuate; that is, the prediction accuracy first increases and then decreases. (2) In practice, the specific forecasting model setup depends largely on the actual decision-making context. Here, taking the accuracy and computational cost of the proposed forecasting model into consideration, 100 can be regarded as a suitable number of iterations. (3) Meanwhile, the KELM model is run more than ten times at each iteration setting. The results reveal that the error values vary within a limited range and are basically stable.

4. Conclusions

In this paper, on the basis of variational mode decomposition (VMD), the autoregressive moving average model (ARMA) and the kernel extreme learning machine (KELM), we develop an approach for short-term air passenger demand forecasting. Air passenger demand data from Beijing, Guangzhou and Pudong airports are used for forecasting and testing. The empirical results show that the proposed hybrid approach performs best among the compared models for forecasting aviation passenger demand with complex features: it greatly reduces the forecasting error and is superior to the other competitive approaches in both the accuracy and the directional change of the forecasting results. The main contributions of this paper can be summarized as follows: (1) The internal characteristics of the original air passenger demand series can be extracted more efficiently by adopting VMD. (2) The stationary and non-stationary series are predicted by models suited to each, based on the results of the stationarity test, so that the unique characteristics of each subseries can be captured more completely. (3) By taking advantage of different forecasting models, the proposed approach can obtain more effective and convincing results, as confirmed by different evaluation criteria. (4) The hybrid approach VMD-ARMA/KELM-KELM is newly developed, and its stability is demonstrated from various aspects, including the number of runs and the number of iterations. In general, the proposed approach can provide advice for airlines and airports on strategic development planning (aviation equipment, capital construction, opening new routes, network management) and daily operation management (flight cost, labor and wage planning, production mission planning, personnel training). This method also has wider potential applications, for instance tourist arrival forecasting, railway passenger flow forecasting, and the like.

Conflicts of interest

The authors state that there is no conflict of interest regarding the publication of this paper.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (Project No. 71373263, Project No. 11361031) and the Lanzhou Jiaotong University–Tianjin University Innovation Fund Project (Project No. 2018064). The authors would like to express their heartfelt gratitude to the editors and the two referees for their invaluable comments and suggestions, which have greatly improved the quality of the paper.

References

Abed, S.Y., Ba-Fail, A.O., Jasimuddin, S.M., 2001. An econometric analysis of international air travel demand in Saudi Arabia. J. Air Transp. Manag. 7, 143–148. https://doi.org/10.1016/S0969-6997(00)00043-0.
Aston, J.A.D., Koopman, S.J., 2006. A non-Gaussian generalization of the airline model for robust seasonal adjustment. J. Forecast. 25, 325–349. https://doi.org/10.1002/for.991.
Baker, D., Merkert, R., Kamruzzaman, M., 2015. Regional aviation and economic growth: cointegration and causality analysis in Australia. J. Transp. Geogr. 43, 140–150.
Bermúdez, J.D., Segura, J.V., Vercher, E., 2007. Holt-Winters forecasting: an alternative formulation applied to UK air passenger data. J. Appl. Stat. 34, 1075–1090. https://doi.org/10.1080/02664760701592125.
Box, G.E.P., Jenkins, G.M., 2010. Time series analysis: forecasting and control. J. Time 31, 303.
Castro-Neto, M., Jeong, Y., Jeong, M.K., Han, L.D., 2009. AADT prediction using support vector regression with data-dependent parameters. Expert Syst. Appl. 36, 2979–2986. https://doi.org/10.1016/J.ESWA.2008.01.073.
Chin, A.T.H., Tay, J.H., 2001. Developments in air transport: implications on investment decisions, profitability and survival of Asian airlines. J. Air Transp. Manag. 7, 319–330. https://doi.org/10.1016/S0969-6997(01)00026-6.
Dantas, T.M., Cyrino Oliveira, F.L., Varela Repolho, H.M., 2017. Air transportation demand forecast through Bagging Holt Winters methods. J. Air Transp. Manag. 59, 116–123. https://doi.org/10.1016/j.jairtraman.2016.12.006.
Diebold, F.X., Mariano, R.S., 1995. Comparing predictive accuracy. J. Bus. Econ. Stat. 13, 134–144.
Dragomiretskiy, K., Zosso, D., 2014. Variational mode decomposition. IEEE Trans. Signal Process. 62, 531–544.
Fernandes, E., Pacheco, R.R., 2010. The causal relationship between GDP and domestic air passenger traffic in Brazil. Transp. Plan. Technol. 33, 569–581. https://doi.org/10.1080/03081060.2010.512217.
Flyvbjerg, B., Holm, M.K.S., Buhl, S.L., 2005. How (in)accurate are demand forecasts in public works projects? The case of transportation. J. Am. Plan. Assoc. 71, 131–146.
Garrow, L.A., Koppelman, F.S., 2004. Predicting air travelers' no-show and standby behavior using passenger and directional itinerary information. J. Air Transp. Manag. 10, 401–411. https://doi.org/10.1016/J.JAIRTRAMAN.2004.06.007.
Grosche, T., Rothlauf, F., Heinzl, A., 2007. Gravity models for airline passenger volume estimation. J. Air Transp. Manag. 13, 175–183. https://doi.org/10.1016/J.JAIRTRAMAN.2007.02.001.
Hakim, M.M., Merkert, R., 2016. The causal relationship between air transport and economic growth: empirical evidence from South Asia. J. Transp. Geogr. 56, 120–127. https://doi.org/10.1016/j.jtrangeo.2016.09.006.
Hao, Y., Tian, C., 2019. The study and application of a novel hybrid system for air quality early-warning. Appl. Soft Comput. 74, 729–746. https://doi.org/10.1016/j.asoc.2018.09.005.
Hoerl, A.E., Kennard, R.W., 1970. Ridge regression: biased estimation for nonorthogonal problems. Technometrics 42, 80–86. https://doi.org/10.1080/00401706.1970.10488634.
Hsu, C.I., Wen, Y.H., 2000. Application of Grey theory and multiobjective programming towards airline network design. Eur. J. Oper. Res. 127, 44–68. https://doi.org/10.1016/S0377-2217(99)00320-3.
Huang, G.-B., Chen, L., 2007. Convex incremental extreme learning machine. Neurocomputing 70, 3056–3062. https://doi.org/10.1016/j.neucom.2007.02.009.
Huang, G.-B., Chen, L., 2008. Enhanced random search based incremental extreme learning machine. Neurocomputing 71, 3460–3468. https://doi.org/10.1016/j.neucom.2007.10.008.
Huang, G.-B., Zhu, Q.-Y., Siew, C.-K., 2006. Extreme learning machine: theory and applications. Neurocomputing 70, 489–501. https://doi.org/10.1016/J.NEUCOM.2005.12.126.
Huang, G.-B., Wang, D.H., Lan, Y., 2011. Extreme learning machines: a survey. Int. J. Mach. Learn. Cybern. 2, 107–122. https://doi.org/10.1007/s13042-011-0019-y.
Kim, S., Shin, D.H., 2016. Forecasting short-term air passenger demand using big data from search engine queries. Autom. ConStruct. 70, 98–108. https://doi.org/10.1016/j.autcon.2016.06.009.
Lahmiri, S., 2016. A variational mode decomposition approach for analysis and forecasting of economic and financial time series. Expert Syst. Appl. 55, 268–273. https://doi.org/10.1016/J.ESWA.2016.02.025.
Lee, S., Lee, Y.-I., Cho, B., 2006. Short-term travel speed prediction models in car navigation systems. J. Adv. Transp. 40, 122–139. https://doi.org/10.1002/atr.5670400203.
Li, B., Rong, X., Li, Y., 2014. An improved kernel based extreme learning machine for robot execution failures. Sci. World J. 2014. https://doi.org/10.1155/2014/906546.
Li, C., Xiao, Z., Xia, X., Zou, W., Zhang, C., 2018. A hybrid model based on synchronous optimisation for multi-step short-term wind speed forecasting. Appl. Energy 215, 131–144. https://doi.org/10.1016/j.apenergy.2018.01.094.
Niu, M., Wang, Y., Sun, S., Li, Y., 2016. A novel hybrid decomposition-and-ensemble model based on CEEMD and GWO for short-term PM2.5 concentration forecasting. Atmos. Environ. 134, 168–180. https://doi.org/10.1016/J.ATMOSENV.2016.03.056.
Niu, M., Gan, K., Sun, S., Li, F., 2017. Application of decomposition-ensemble learning paradigm with phase space reconstruction for day-ahead PM2.5 concentration forecasting. J. Environ. Manag. 196, 110–118. https://doi.org/10.1016/j.jenvman.2017.02.071.
Niu, M., Hu, Y., Sun, S., Yu, L., 2018. A novel hybrid decomposition-ensemble model based on VMD and HGWO for container throughput forecasting. Appl. Math. Model. 57, 163–178.
Pei, D., Wang, J., Guo, Z., Yang, W., 2017. Research and application of a novel hybrid forecasting system based on multi-objective optimization for wind speed forecasting. Energy Convers. Manag. 150, 90–107.
Rao, C.R., Mitra, S.K., 1972. Generalized inverse of a matrix and its applications. Berkeley Symp. Math. Stat. Probab. 1, 601–620. https://doi.org/10.2307/1266840.
Samagaio, A., Wolters, M., 2010. Comparative analysis of government forecasts for the Lisbon Airport. J. Air Transp. Manag. 16, 213–217. https://doi.org/10.1016/j.jairtraman.2009.09.002.
Shao, Z., Gao, F., Yang, S.-L., Yu, B., 2015. A new semiparametric and EEMD based framework for mid-term electricity demand forecasting in China: hidden characteristic extraction and probability density prediction. Renew. Sustain. Energy Rev. 52, 876–889. https://doi.org/10.1016/J.RSER.2015.07.159.
Smith, B.L., Demetsky, M.J., 1997. Traffic flow forecasting: comparison of modeling approaches. J. Transport. Eng. 123 (4), 261–266.
Sun, S., Lu, H., Tsui, K., Wang, S., 2019. Nonlinear vector auto-regression neural network for forecasting air passenger flow. J. Air Transp. Manag. 78, 54–62. https://doi.org/10.1016/j.jairtraman.2009.09.002.
Tsui, W.H.K., Ozer Balli, H., Gilbey, A., Gow, H., 2014. Forecasting of Hong Kong airport's passenger throughput. Tour. Manag. 42, 62–76. https://doi.org/10.1016/J.TOURMAN.2013.10.008.
Van Arem, B., Kirby, H.R., Van Der Vlist, M.J.M., Whittaker, J.C., 1997. Recent advances and applications in the field of short-term traffic forecasting. Int. J. Forecast. 13, 1–12. https://doi.org/10.1016/S0169-2070(96)00695-4.
Vythoulkas, P., 1993. Alternative approaches to short term traffic forecasting for use in driver information systems. Transp. Traffic Theory 12, 485–506.
Wang, Y., Markert, R., Xiang, J., Zheng, W., 2015. Research on variational mode decomposition and its application in detecting rub-impact fault of the rotor system. Mech. Syst. Signal Process. 60–61, 243–251.
Wang, J., Liu, F., Song, Y., Zhao, J., 2016. A novel model: dynamic choice artificial neural network (DCANN) for an electricity price forecasting system. Appl. Soft Comput. 48, 281–297.
Williams, B.M., Durvasula, P.K., Brown, D.E., 1998. Urban freeway traffic flow prediction application of seasonal autoregressive integrated. Transp. Res. Rec. 1644, 132–141. https://doi.org/10.3141/1644-14.
Xiao, Y., Liu, J.J., Hu, Y., Wang, Y., Lai, K.K., Wang, S., 2014. A neuro-fuzzy combination model based on singular spectrum analysis for air transport demand forecasting. J. Air Transp. Manag. 39, 1–11. https://doi.org/10.1016/j.jairtraman.2014.03.004.
Xie, G., Wang, S., Lai, K.K., 2014. Short-term forecasting of air passenger by using hybrid seasonal decomposition and least squares support vector regression approaches. J. Air Transp. Manag. 37, 2013–2015.
Yang, W., Wang, J., Wang, R., 2017. Research and application of a novel hybrid model based on data selection and artificial intelligence algorithm for short term load forecasting. Entropy 19. https://doi.org/10.3390/e19020052.
Zhang, H.M., 2000. Recursive prediction of traffic conditions with neural network models. J. Transport. Eng. 126 (6), 472–481.
Zhang, G., Eddy Patuwo, B., Hu, M.Y., 1998. Forecasting with artificial neural networks: the state of the art. Int. J. Forecast. 14, 35–62. https://doi.org/10.1016/S0169-2070(97)00044-7.
Zhao, J., Guo, Z.-H., Su, Z.-Y., Zhao, Z.-Y., Xiao, X., Liu, F., 2016. An improved multi-step forecasting model based on WRF ensembles and creative fuzzy systems for wind speed. Appl. Energy 162, 808–826. https://doi.org/10.1016/j.apenergy.2015.10.145.
Zheng, W., Lee, D.-H., Shi, Q., 2006. Short-term freeway traffic flow prediction: bayesian combined neural network approach. J. Transport. Eng. 132, 114–121. https://doi.org/10.1061/(ASCE)0733-947X(2006)132:2(114).
Zhu, J., Wu, P., Chen, H., Liu, J., Zhou, L., 2019. Carbon price forecasting with variational mode decomposition and optimal combined model. Physica A 519, 140–158. https://doi.org/10.1016/j.physa.2018.12.017.
