A novel wind speed prediction method based on robust local mean decomposition, group method of data handling and conditional kernel density estimation


Energy Conversion and Management 200 (2019) 112099

Contents lists available at ScienceDirect

Energy Conversion and Management journal homepage: www.elsevier.com/locate/enconman


Yan Jiang a, Shuoyu Liu a, Liuliu Peng b,⁎, Ning Zhao c

a College of Engineering and Technology, Southwest University, Chongqing 400715, China
b School of Civil Engineering, Chongqing University, Chongqing 400044, China
c School of Civil Engineering, Sichuan Agricultural University, Chengdu 611830, China

ARTICLE INFO

Keywords: Short-term wind speed prediction; Robust local mean decomposition; Group method of data handling; Traditional conditional kernel density estimation; Unbiased conditional kernel density estimation; Deterministic prediction; Probabilistic prediction

ABSTRACT

Short-term wind speed prediction is of great practical significance for the exploitation and development of wind power. However, owing to the strong nonstationarity and nonlinearity of natural wind, traditional models struggle to produce accurate and reliable predictions. To this end, this paper develops a hybrid model that captures the different characteristics of wind speed data and provides both deterministic and probabilistic predictions. Specifically, robust local mean decomposition is first implemented to address the nonstationarity and nonlinearity of the wind speed time series by separating it into a series of subseries with lower nonstationarity and nonlinearity. Then, a relatively unexplored group method of data handling neural network is developed to extract the nonlinear information in each subseries and perform the corresponding individual prediction. Furthermore, for the resulting modeling residuals, traditional conditional kernel density estimation, with the embedded dimension determined by the partial autocorrelation function, is employed to describe the remaining linear and nonlinear characteristics in the form of a conditional probability, where the bandwidth matrices are optimized by the normal reference criterion. On this basis, the single-valued deterministic prediction is generated through aggregate calculation. To further account for the uncertainty in the data, probabilistic prediction using unbiased conditional kernel density estimation is performed and the result is expressed as prediction intervals. Case studies based on three groups of measured wind speed data are conducted to illustrate the effectiveness of the proposed method. The results indicate that the proposed method achieves greater accuracy and higher reliability than the other models considered.

1. Introduction

1.1. Motivation

As a clean, inexhaustible and inexpensive renewable resource, wind energy has undergone rapid development over the past few decades [1]. For example, in China, the installed wind power capacity reached 62364.2 MW in 2011, with an average annual growth rate of 39.4% [2]. However, owing to the complex features of natural wind (e.g., nonlinearity, nonstationarity and uncertainty), the integration of wind energy into the power system usually confronts many obstacles, which is not conducive to the rational planning, maintenance and scheduling of wind energy [3]. To alleviate the effect of these adverse factors, accurate and reliable short-term wind speed prediction is urgently needed, because wind speed is strongly and positively correlated with wind power [4].

1.2. Literature review

Generally, wind speed forecasting methods can be classified into two main categories, namely single and hybrid models. As the name implies, a single model can only depict one type of characteristic hidden in the data, such as linearity or nonlinearity. For example, the autoregressive integrated moving average (ARIMA) model was employed to reveal the linear relationship in the data, and the results showed that it performed better than the persistence model [5]. After that, fractional ARIMA (FARIMA) was designed to better capture this relationship, and its final forecasting result was superior to those of

Corresponding author. E-mail addresses: [email protected] (Y. Jiang), [email protected] (S. Liu), [email protected] (L. Peng), [email protected] (N. Zhao).

https://doi.org/10.1016/j.enconman.2019.112099 Received 11 June 2019; Received in revised form 9 August 2019; Accepted 22 September 2019 Available online 01 October 2019 0196-8904/ © 2019 Elsevier Ltd. All rights reserved.


Nomenclature

Model abbreviations

ARIMA    Autoregressive integrated moving average
ANN      Artificial neural network
DWT      Discrete wavelet transform
EMD      Empirical mode decomposition
EWT      Empirical wavelet transform
FARIMA   Fractional autoregressive integrated moving average
GMDH     Group method of data handling
KDE      Kernel density estimation
LSSVM    Least square support vector machine
LMD      Local mean decomposition
MKDE     Multivariable kernel density estimation
NRC      Normal reference criterion
PDF      Probability distribution function
PI       Prediction interval
QR       Quantile regression
RLMD     Robust local mean decomposition
TCKDE    Traditional conditional kernel density estimation
UCKDE    Unbiased conditional kernel density estimation
VMD      Variational mode decomposition
WPD      Wavelet packet decomposition

Indexes, parameters and variables

$t$    Index for time period
$n$    The number of data points in the wind speed time series
$n_i$    The i-th extreme point of the signal $x(t)$
$m_i$    The local mean between $n_i$ and $n_{i+1}$
$a_i$    The local amplitude between $n_i$ and $n_{i+1}$
$K$    The number of extreme points
$m_{ij}(t)$    The continuous local mean at the j-th iteration
$a_{ij}(t)$    The continuous local amplitude at the j-th iteration
$h_{11}(t)$    The signal $x(t)$ minus the initial local mean $m_{11}(t)$
$s_{1j}(t)$    The pure frequency modulated signal at the j-th iteration
$N$    The number of iterations
$a_1(t)$    The envelope function
$c_v(t)$    The v-th product function from RLMD
$k$    The number of product functions
$\lambda$    The subset size in the moving average method
$s(i)$    The step length between local mean signals $m_i$ and $m_{i+1}$
$f(i)$    The PDF of $s(i)$
$\mu_s$    The mean of $s(i)$
$\sigma_s$    The standard deviation of $s(i)$
$z(t)$    The zero-baseline envelope signal
$\hat{c}_v(n+t)$    The forecasting result of $c_v(n+t)$
$C$    GMDH model training input matrix
$Z$    GMDH model training output matrix
$C_p$    The p-th row elements in the matrix $C$
$z_p$    The p-th row element in the matrix $Z$
$M$    The number of rows in the matrix $C$
$V$    The number of columns in the matrix $C$
$\hat{z}_p$    The forecasting output for the input $C_p$
$x_l$    The input in TCKDE model training
$y_l$    The output in TCKDE model training
$L$    The forecasting step size
$d$    The embedded dimension in TCKDE
$r$    The number of the independent vectors in $x_l$
$\hat{f}(y|x)$    The predictive PDF by TCKDE
$K_d(\cdot)$    The d-dimensional Gaussian kernel function
$H_x, H_y$    The bandwidth parameter matrices in TCKDE
$\mu_l, \sigma_l^2$    The forecasting mean and variance in TCKDE
$\varepsilon_l$    The error between $y_l$ and $\mu_l$
$\tilde{f}(y|x)$    The predictive PDF by UCKDE
$\tilde{y}_l, \tilde{\sigma}_l^2$    The forecasting mean and variance in UCKDE
$\hat{\sigma}_{n+L}^2$    The predicted L-step ahead variance
$\tilde{n}$    The number of the data for evaluation
$\hat{x}(t)$    The predicted value at time point $t$
$L_t, U_t$    The predicted lower and upper bounds
$x_{Max}, x_{Min}$    The maximum and minimum in the testing set
MAE    Mean absolute error
MRPE    Mean relative percentage error
RMSE    Root mean square error
ACE    Average coverage error
PINAW    Prediction intervals normalized average width
PICP    Prediction intervals coverage probability
PINC    Prediction intervals nominal confidence level
CWC    Coverage width-based criterion
The parameter to depict ACE

the compared models [6]. On the other hand, to describe the nonlinear attributes that widely exist in the physical world, neural network and machine learning models with good generalization capability and adaptability have attracted extensive attention [7]. For example, three typical neural networks, namely adaptive linear element, back propagation and radial basis function, have been applied to wind speed prediction, and the results showed that all of them were effective [8]. Meanwhile, machine learning models such as the least square support vector machine (LSSVM) have also been employed and offered satisfactory forecasting results [9]. However, these single models sometimes cannot provide satisfactory forecasting results because they fail to accurately reveal the valid information in wind speed data with complex features [10].

To capture more useful information, combined models that integrate the strengths of multiple single models have been developed [11]. In recent years, these hybrid models have attracted more and more attention and have become increasingly popular in this field. For instance, to explain the linear and nonlinear relationships simultaneously, a model combining ARIMA with an artificial neural network (ANN) was designed, and the results showed that it performed better than the single models involved [12]. Then, another hybrid of seasonal ARIMA and LSSVM was proposed to describe these characteristics, and a significant superiority over previously reported single models was obtained [13]. Recently, a combination of ARIMA and a nonparametric model, i.e., kernel density estimation (KDE), was developed and generated satisfactory results [14]. It should be noted that the above hybrid methods achieve good prediction under the assumption that the wind speed time series can be regarded as a stationary process [2]. However, actual wind speed data usually present high nonstationarity. In this case, these models may not provide acceptable forecasting results [15].

To mine the traits of the data more meticulously, especially for data with strong nonstationarity, many other hybrid models with better performance have been explored [16]. Among them, the decomposition-based methods have attracted more attention because signal decomposition techniques can effectively separate the original data into a number of subseries with lower nonstationarity and nonlinearity. After this data preprocessing, an appropriate model is established to produce the prediction for each subseries, and the final result is the summation of these individual predictions [17]. For example, original data were divided into a number of different subseries by empirical mode decomposition (EMD), and then recursive ARIMA (RARIMA) was set up for each subseries to conduct wind speed prediction. The results indicated that the proposed method


had better performance than the single RARIMA [18]. Similarly, another commonly used decomposition technique, discrete wavelet transform (DWT), was combined with support vector machine (SVM) to enhance the forecasting capability, and the final result was satisfactory as well [19]. However, these two techniques are not always effective in practical applications due to their inherent disadvantages [20]. For instance, EMD is vulnerable to the end effect and mode aliasing and is sensitive to sampling and noise, whereas DWT needs the mother wavelet and the number of decomposition levels to be specified in advance [21]. To this end, a series of new techniques have been developed. These techniques mainly include ensemble EMD (EEMD) [22], complete EEMD [23], wavelet packet decomposition (WPD) [24], empirical wavelet transform (EWT) [25] and variational mode decomposition (VMD) [26], and some of them have been successfully exploited in wind speed prediction [27]. Nevertheless, these new approaches still have other demerits. Taking VMD as an example, although it can curb the influences of the end effect and mode aliasing well, its number of decomposition levels is specified empirically [26].

It is worth noting that the forecasting models mentioned above are mainly used to capture the deterministic information hidden in the wind speed data and can only provide a single value for each prediction. Evidently, these models cannot explain the uncertainty associated with the data well. Fortunately, probabilistic models can describe this attribute by means of a prediction interval (PI), which is conducive to reducing the risk caused by unreasonable wind power production scheduling and control [28]. For example, as one of the most popular single models, quantile regression (QR) has been adopted in probabilistic prediction. The results showed that this model enhanced the forecasting reliability significantly [29]. Based on relative entropy and mutual information theory, a multivariable KDE (MKDE) was put forward. This method took other meteorological factors into account (e.g., temperature, directionality and humidity) and then produced reliable probabilistic predictions [30]. Recently, signal decomposition techniques (e.g., EWT) have also been successfully introduced in probabilistic prediction [31]. Along with this idea, Zhang et al. [15] developed another forecasting method based on VMD and machine learning models, where both deterministic and probabilistic predictions were generated. However, these probabilistic models rarely analyze how to yield good deterministic predictions [32]. Meanwhile, compared with the relatively mature deterministic prediction, probabilistic prediction still deserves further research.

1.3. Contribution

In this paper, a novel short-term wind speed forecasting method that synthesizes robust local mean decomposition (RLMD), group method of data handling (GMDH), traditional conditional kernel density estimation (TCKDE) and unbiased CKDE (UCKDE) is proposed to describe the different characteristics hidden in the wind speed data, by which both point and interval predictions can be provided. More specifically, RLMD is first adopted to decrease the nonstationarity and nonlinearity of the raw wind speed data by dividing them into a number of different subseries. Then, GMDH is employed to explain the nonlinear information of the data and subsequently perform the prediction for each subseries. For the resulting model training errors, TCKDE with the embedded dimension determined by the partial autocorrelation function (PACF) is established to extract the linear and other nonlinear characteristics that cannot be revealed by the hybrid of RLMD and GMDH. On this basis, the deterministic prediction is produced. To further capture the uncertain information, UCKDE is used to analyze the residuals from the deterministic prediction and then provide the probabilistic PIs. Finally, a comprehensive comparative study based on three case studies indicates that the proposed method outperforms the other models involved. The overview of wind speed prediction in this paper is shown in Fig. 1.

On the whole, the contributions of this study are summarized below:

• Compared with the existing decomposition techniques, RLMD not only alleviates the influences of the end effect and mode aliasing, but also enhances the model robustness significantly. This approach simultaneously optimizes the boundary condition, envelope estimation and sifting stopping criterion of the original local mean decomposition (LMD). Therefore, the nonstationary and nonlinear characteristics of wind speed time series may be well addressed and satisfactory forecasting performance could be obtained;

• Compared with the traditional nonlinear models (e.g., SVM, LSSVM and ANN), GMDH not only accelerates convergence and avoids getting stuck in local minima, but also optimizes the input variables and model structure automatically. It can handle many complex nonlinear problems in a self-organizing manner with active neurons. Therefore, this model may perform better in extracting the nonlinear information in wind speed time series, by which the final forecasting accuracy could be enhanced;

• Compared with the parametric models, TCKDE belongs to the semiparametric models and does not require any distribution assumption. As the data are enriched, the corresponding conditional probability distribution function (PDF) converges gradually to the true distribution. Therefore, the multimodal distribution that often appears in non-Gaussian wind speed data can be addressed easily;

• Compared with the other probabilistic models, UCKDE can explain the uncertain information of the data by transforming the better deterministic prediction into a probabilistic prediction. Meanwhile, this model also provides an unbiased variance estimation based on a deep analysis of the inner dependence of the original data. With this model, a more reliable probabilistic forecasting capability may be provided;

• Three case studies based on measured data show that the proposed method generates both deterministic and probabilistic predictions with greater accuracy and higher reliability than the other models involved. For example, compared with TCKDE, the improvements of the proposed method in terms of the mean absolute error (MAE) in the three case studies are 26.061%, 24.149% and 16.292%, respectively.

1.4. Organization of the paper

The remainder of this paper is organized as follows: Section 2 illustrates the principles of RLMD and the forecasting models involved, including GMDH, TCKDE and UCKDE. Section 3 elaborates the details of the proposed method. Three case studies and the corresponding discussions are provided in Section 4. Some conclusions are offered in Section 5.

Fig. 1. The overview of wind speed prediction in this paper.

2. Methodology

2.1. Robust local mean decomposition

LMD is an adaptive time-frequency representation technique applied in many fields, such as signal processing and structural damage identification [33]. It can demodulate amplitude and frequency modulated signals into a set of product functions, each of which is the product of an instantaneous envelope signal and a pure frequency modulated signal. For a given signal $x(t)$, $t = 1, 2, \ldots, n$, the specific process of LMD is as follows:

(1) Find all local extrema, including maxima and minima, in the original signal $x(t)$. Then, calculate the local mean $m_i$ and amplitude $a_i$ of two successive extrema $n_i$ and $n_{i+1}$, which are expressed by

$$m_i = \frac{n_i + n_{i+1}}{2}; \quad a_i = \frac{|n_i - n_{i+1}|}{2} \tag{1}$$

where $n_i$ $(i = 1, 2, \ldots, K)$ is the i-th extreme point and $K$ denotes the total number of extreme points. Then, a smoothly varying continuous local mean function $m_{ij}(t)$ can be formed by the moving averaging method, where the subscript $j$ $(j = 1, 2, \ldots, N)$ stands for the number of iterations needed to generate a normalized frequency modulated signal. Similarly, the corresponding envelope function $a_{ij}(t)$ can be produced. It should be noted that the performance of the moving averaging method is controlled by the fixed subset size [34].

(2) Subtract the initial local mean $m_{11}(t)$ from the original signal and obtain the remaining series $h_{11}(t)$, i.e.,

$$h_{11}(t) = x(t) - m_{11}(t) \tag{2}$$

(3) Divide $h_{11}(t)$ by $a_{11}(t)$ to conduct amplitude demodulation, which yields

$$s_{11}(t) = \frac{h_{11}(t)}{a_{11}(t)} \tag{3}$$

When $s_{11}(t)$ is a pure frequency modulated signal, its envelope function $a_{12}(t)$ should satisfy the condition $a_{12}(t) = 1$. If $a_{12}(t) \neq 1$, $s_{11}(t)$ is regarded as the original signal and the above procedures are repeated until the target signal $s_{1N}(t)$ is generated ($N$ is the final number of iterations), where the corresponding envelope function $a_{1N}(t)$ is equal to one.

(4) Obtain the envelope $a_1(t)$ by the following formula, i.e.,

$$a_1(t) = a_{11}(t) \cdot a_{12}(t) \cdots a_{1N}(t) = \prod_{j=1}^{N} a_{1j}(t), \quad \text{s.t.} \ \lim_{N \to \infty} a_{1N}(t) = 1 \tag{4}$$

(5) Multiply the pure frequency modulated signal $s_{1N}(t)$ by the envelope function $a_1(t)$ and produce the first product function $c_1(t)$, i.e.,

$$c_1(t) = a_1(t) \cdot s_{1N}(t) \tag{5}$$

where $c_1(t)$ usually contains the highest frequency component of the signal $x(t)$.

(6) Repeat steps (1)–(5) for the remaining component $x(t) - c_1(t)$ until the residual signal can be regarded as a monotonic function. Then, the signal $x(t)$ can be decomposed into $k$ product functions and a monotonic function $c_{k+1}(t)$, i.e.,

$$x(t) = \sum_{v=1}^{k+1} c_v(t) \tag{6}$$

LMD has many promising properties. In particular, it can avoid the existence of negative frequency in the decomposed subseries [35]. However, this approach may suffer from the disturbances of the end effect and mode mixing. To alleviate these two limitations, suitable parameters for the boundary condition, envelope estimation and sifting stopping criterion need to be determined beforehand [34]. As a result, a robust LMD (RLMD) is developed to consider them simultaneously [34]. The specific optimizations are presented below.

(1) Boundary condition: the mirror extension algorithm is adopted to determine symmetry points for the left and the right ends of the signal;

(2) Envelope estimation: an optimal subset size is obtained based on statistics theory and given by

$$\lambda = \mathrm{odd}(\mu_s + 3 \times \sigma_s) \tag{7}$$

$$\mu_s = \sum_{i=1}^{K-1} s(i) \cdot f(i); \quad \sigma_s = \sqrt{\sum_{i=1}^{K-1} [s(i) - \mu_s]^2 \cdot f(i)} \tag{8}$$

where $s(i)$ is the step length between the local mean signals or local amplitude signals; $f(i)$ stands for the corresponding PDF; $\mu_s$ and $\sigma_s$ correspond to the mean and standard deviation of $s(i)$, respectively; $\mathrm{odd}(\cdot)$ is the operation that returns the nearest odd integer larger than or equal to the input.

(3) Sifting stopping criterion: minimize the following function $F$

$$F = \mathrm{RMS}[z(t)] + \mathrm{EK}[z(t)] \tag{9}$$

in which $z(t) = a_{ij}(t) - 1$ corresponds to the zero-baseline envelope signal, and

$$\mathrm{RMS}[z(t)] = \sqrt{\frac{1}{n} \sum_{t=1}^{n} [z(t)]^2}, \quad \mathrm{EK}[z(t)] = \frac{\frac{1}{n} \sum_{t=1}^{n} [z(t) - \bar{z}]^4}{\left( \frac{1}{n} \sum_{t=1}^{n} [z(t) - \bar{z}]^2 \right)^2} - 3; \quad \bar{z} = \frac{1}{n} \sum_{t=1}^{n} z(t) \tag{10}$$

For the sifting process, the proposed stopping criterion relies on the relationship among three successive iterations. More details about RLMD can be found in [34].

2.2. Forecasting models

Generally, wind speed data are mainly composed of deterministic and probabilistic components. In this study, the combination of GMDH and TCKDE is proposed to explain the deterministic component embedded in the original data, while UCKDE is adopted to describe the probabilistic component. The details of these models are given below.

2.2.1. Group method of data handling neural network

As a relatively unexplored neural network, GMDH can handle many nonlinear practical problems (e.g., data mining, prediction and optimization) in a heuristic self-organizing manner [36]. In this method, some important information, including the input variable pairs, the neurons in the embedded layers, the number of layers and the model structure, can be generated automatically [37]. Specifically, it can be represented as a set of neurons in which the combinations of different pairs in each layer are captured by a polynomial or other functions, such as harmonic and logistic ones. By this procedure, the properties of the input that provide little information about the location and shape of the hyper-dimensional space can be filtered out. This method belongs to the mapping of multiple-input to single-output. Taking the above subseries $\{c_v(t), t = 1, 2, \ldots, n\}$ as an example, the specific modeling process of GMDH is as follows:

This subseries is first divided into two parts, the model training input matrix $C$ and the output matrix $Z$, with $M \times V$ and $M \times 1$ elements, respectively, i.e.,

$$C = \begin{bmatrix} C_1 \\ C_2 \\ \vdots \\ C_p \\ \vdots \end{bmatrix} = \begin{bmatrix} c_v(1) & \cdots & c_v(V) \\ c_v(2) & \cdots & c_v(V+1) \\ \vdots & & \vdots \\ c_v(p) & \cdots & c_v(V+p-1) \\ \vdots & & \vdots \end{bmatrix} \tag{11}$$

$$Z = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_p \\ \vdots \end{bmatrix} = \begin{bmatrix} c_v(V+1) \\ c_v(V+2) \\ \vdots \\ c_v(V+p) \\ \vdots \end{bmatrix}, \quad p = 1, 2, \ldots, M \tag{12}$$

For the p-th data pair, the corresponding forecasting result of GMDH can be obtained by

$$\hat{z}_p = \hat{g}(C_p) \tag{13}$$

where $\hat{z}_p$ is the forecasting output for the given input $C_p$; $\hat{g}(\cdot)$ denotes the approximate function that reflects the relationship between the input and output variables. Generally, this connection can be explained by a complicated infinite form of the Volterra-Kolmogorov-Gabor polynomial, i.e.,

$$\hat{g}(C_p) = A_0 + \sum_{p=1}^{V} A_p c_v(p) + \sum_{p=1}^{V} \sum_{q=1}^{V} A_{pq} c_v(p) \cdot c_v(q) + \sum_{p=1}^{V} \sum_{q=1}^{V} \sum_{w=1}^{V} A_{pqw} c_v(p) \cdot c_v(q) \cdot c_v(w) + \cdots \tag{14}$$

in which $A_0$, $A_p$, $A_{pq}$ and $A_{pqw}$ are the undetermined coefficients, which can be estimated by regression techniques so that the minimum mean square error between the output $\hat{z}_p$ and the target $z_p$ is guaranteed, i.e., [37]

$$E = \frac{\sum_{p=1}^{M} [\hat{g}(C_p) - z_p]^2}{M} \to \min \tag{15}$$

Obviously, $V(V-1)/2$ middle candidate models will be generated in the first layer of GMDH. After that, the candidate models with the minimum $E$ values are selected as the inputs of the second layer. Similarly, new middle candidate models are produced in this layer and some superior ones are retained for the next layer. By this procedure, new neurons are produced constantly to serve the next layer until the current layer's best approximation is inferior to that of the previous layer. Apparently, this method essentially establishes polynomials of polynomials and only the optimal model is retained. According to [37], the simplified quadratic polynomial with two variables is enough to address practical problems. With constant iterative optimization, more layers are required to realize better forecasting performance, which inevitably increases the model complexity. In this study, the maximum number of hidden layers is set to 3 due to its better performance in practice [37], and the corresponding parameter settings refer to the results in [38]. The structure of the three-hidden-layer GMDH neural network is presented in Fig. 2. From the above procedures, the forecasting result of each subseries at time point $t + L$ can be obtained and denoted as $\hat{c}_v(t + L)$, where $L$ is the forecasting step size.
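The layer-wise selection described for GMDH can be illustrated with a short sketch. This is not the authors' implementation: the helper names, the 70/30 train/validation split, the number of neurons kept per layer (`keep`) and the use of ordinary least squares are all assumptions made for illustration. Only the lagged input/output construction, the two-variable quadratic neuron, the pairwise candidate generation and the three-layer limit follow the text.

```python
import itertools
import numpy as np

def lagged_matrices(series, V):
    """Row p of C holds V consecutive values; Z holds the value that follows."""
    series = np.asarray(series, dtype=float)
    M = len(series) - V
    C = np.stack([series[p:p + V] for p in range(M)])
    Z = series[V:]
    return C, Z

def quad_features(a, b):
    """Design matrix of the two-variable quadratic polynomial neuron."""
    return np.column_stack([np.ones_like(a), a, b, a * b, a**2, b**2])

def fit_layer(X_tr, z_tr, X_va, z_va, keep):
    """Fit one neuron per input pair; keep the best ones by validation MSE."""
    neurons = []
    for i, j in itertools.combinations(range(X_tr.shape[1]), 2):
        A = quad_features(X_tr[:, i], X_tr[:, j])
        coef, *_ = np.linalg.lstsq(A, z_tr, rcond=None)
        pred = quad_features(X_va[:, i], X_va[:, j]) @ coef
        neurons.append((float(np.mean((pred - z_va) ** 2)), i, j, coef))
    neurons.sort(key=lambda item: item[0])
    return neurons[:keep]

def apply_layer(neurons, X):
    """Outputs of the selected neurons become the inputs of the next layer."""
    return np.column_stack(
        [quad_features(X[:, i], X[:, j]) @ coef for _, i, j, coef in neurons]
    )

def gmdh_fit(C, Z, max_layers=3, keep=4, train_frac=0.7):
    """Grow layers until the best validation error stops improving."""
    split = int(train_frac * len(Z))  # assumed hold-out split
    X_tr, X_va, z_tr, z_va = C[:split], C[split:], Z[:split], Z[split:]
    layers, best = [], np.inf
    for _ in range(max_layers):
        if X_tr.shape[1] < 2:  # a neuron needs a pair of inputs
            break
        neurons = fit_layer(X_tr, z_tr, X_va, z_va, keep)
        if neurons[0][0] >= best:  # current layer is no better: stop
            break
        best = neurons[0][0]
        layers.append(neurons)
        X_tr, X_va = apply_layer(neurons, X_tr), apply_layer(neurons, X_va)
    return layers, best

def gmdh_predict(layers, X):
    """Propagate inputs through the layers; column 0 is the best neuron."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    for neurons in layers:
        X = apply_layer(neurons, X)
    return X[:, 0]
```

With a subseries `c`, one might call `C, Z = lagged_matrices(c, V=4)`, then `layers, err = gmdh_fit(C, Z)` and `gmdh_predict(layers, c[-4:])` for a one-step forecast. The sketch requires `V >= 2`, since each neuron combines a pair of inputs.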

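For the decomposition stage of Section 2.1, a single sifting pass of plain LMD (local means and amplitudes between successive extrema, moving-average smoothing, then subtraction and division) can be sketched as follows. This is an illustrative simplification, not the RLMD of [34]: the fixed window `width`, the number of smoothing `passes` and the reflection padding at the ends are assumptions, whereas RLMD selects the subset size statistically and uses mirror extension and an optimized stopping criterion.

```python
import numpy as np

def local_extrema(x):
    """Indices of interior local maxima and minima of a 1-D signal."""
    dx = np.diff(x)
    return np.where(dx[:-1] * dx[1:] < 0)[0] + 1

def moving_average(v, width):
    """Centred moving average with an odd window; ends are reflected."""
    pad = width // 2
    padded = np.pad(v, pad, mode="reflect")
    return np.convolve(padded, np.ones(width) / width, mode="valid")

def local_mean_and_envelope(x, width=3, passes=5):
    """Local mean and amplitude between extrema, held piecewise-constant
    and then smoothed. Needs at least two extrema in the signal."""
    x = np.asarray(x, dtype=float)
    idx = local_extrema(x)
    n_i = x[idx]
    m_i = (n_i[:-1] + n_i[1:]) / 2.0
    a_i = np.abs(n_i[:-1] - n_i[1:]) / 2.0
    # Segment k spans extremum k to extremum k+1; the two ends of the
    # signal reuse the nearest segment.
    seg = np.searchsorted(idx, np.arange(len(x)), side="right") - 1
    seg = np.clip(seg, 0, len(m_i) - 1)
    m, a = m_i[seg], a_i[seg]
    for _ in range(passes):
        m, a = moving_average(m, width), moving_average(a, width)
    return m, a

def sift_once(x, width=3, passes=5):
    """One sifting pass: h = x - m, then s = h / a."""
    x = np.asarray(x, dtype=float)
    m, a = local_mean_and_envelope(x, width, passes)
    h = x - m
    s = h / a
    return h, s, a
```

Repeating `sift_once` on `s` until its envelope is close to one, and multiplying the envelopes collected along the way, would give the first product function; that outer loop is omitted here.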

Input layer

First layer

Second layer

Therefore, the estimated conditional PDF can be reformulated as

Third layer

r

f (y | x ) =

l (x )· l=1

1 ·K1 [H y 1 (y |Hy |

yl )]

(22)

where r

yl = ul +

e (x ) e

l

(23)

e=1

and then the corresponding estimation of variance is redefined by [39] l

2

=

2 K

µl ]2 f (y |x ) dy = |Hy |2

[y

r l (xl )· l=1

The optimal model Fig. 2. The structure of three-layer GMDH neural network.

f^ (n + L) =

1)]; yl = [x (d + L

2 n+L

l (x )

{

1 1 l (x )· |H | ·K1 [H y (y y

1

= K d [Hx (x

xl )]/

yl )]

r K [Hx 1 (x l=1 d

(17)

l (xl )· yl

=

µl ]2 f (y |x ) dy = |Hy |2

[y

2 K

+

l (xl )·[yl l =1

2 K

=

Rd

xxT K d (x ) dx

µl ]2

(19) (20)

From [39], it can be obtained that the estimation of expectation in TCKDE may have large bias. In order to handle this problem, an unbiased CKDE (UCKDE) is developed. In this method, the distribution of the error item is assumed to be consistent with that of the variable yl and the estimation of the expectation can depend on the other approaches with better performance. At this moment,

yl = µl + where

l

l

e (x r + L ) e

l e=1

(26)

(1) RLMD is used to adaptively decompose the original data into a group of different subseries with weak nonstationary and nonlinear features, as shown in Eq. (6); (see Section 2.1) (2) GMDH is employed to describe partial of nonlinear information hidden in each subseries, by which the corresponding deterministic prediction is performed. Then, the errors from the above model training are summarized to construct a new error series. Obviously, this series may contain some other features that cannot be captured by GMDH; (see Section 2.2.1) (3) TCKDE is designed to analyze the reconstructed error series, i.e., error correction. By this method, the deterministic forecasting result of RLMD-GMDH can be further modified. With the completion of the deterministic prediction by RLMD-GMDH-TCKDE, the final error component can be generated. Obviously, this error component may convey uncertain information, which could be well explained by probabilistic model; (see Section 2.2.2) (4) UCKDE is proposed to explain the uncertainty in the final error

r 2 l

r l (x r + L )·

The workflow of the proposed method is presented in Fig. 3 and the detailed procedures are listed in below:

(18)

l=1

r

+

(25) 2

3.1. The framework of the proposed method

r

yf (y |x ) dy =

2 K

yl )]

2.2.2. Traditional and unbiased conditional kernel density estimations

TCKDE mostly appears in multivariate form [30]. In wind speed prediction, it is usually used to explain the relationship between multiple explanatory variables and the target wind speed. Similar to GMDH, this model can also address the multiple-input, single-output problem. More importantly, it can be reduced to a univariate form, i.e., the prediction of the target wind speed depends only on the historical wind speed data, and corresponding studies have been reported recently [14]. The fundamental principle of univariate TCKDE is elaborated below.

For the given signal $x(t),\ t = 1, 2, \ldots, n$, the multiple-input $\mathbf{x}$ and single-output $y$ in TCKDE are constructed, and the corresponding independent samples are $\{\mathbf{x}_l,\ l = 1, 2, \ldots, r\}$ and $\{y_l,\ l = 1, 2, \ldots, r\}$, respectively, i.e.,

$$\mathbf{x}_l = [\,x(l),\ x(l+1),\ \ldots,\ x(d-1+l)\,], \qquad y_l = x(d+l-1+L)$$

where $r = n - d - L + 1$; $d$ is the embedded dimension, which can be determined by the partial autocorrelation function (PACF) analysis. Then, the conditional PDF of the random variable $y$ on the basis of the random vector $\mathbf{x}$ is given by

$$f(y \mid \mathbf{x}) = \frac{\sum_{l=1}^{r} K_d\!\left[H_x^{-1}(\mathbf{x}-\mathbf{x}_l)\right]\cdot \frac{1}{|H_y|}\,K_1\!\left[H_y^{-1}(y-y_l)\right]}{\sum_{l=1}^{r} K_d\!\left[H_x^{-1}(\mathbf{x}-\mathbf{x}_l)\right]}$$

where $K_d(\cdot)$ is the d-dimensional Gaussian kernel function [39]; $H_x$ and $H_y$ are the bandwidth matrices optimized by the normal reference criterion (NRC) [40]. Writing the normalized kernel weights as $\omega_l(\mathbf{x}) = K_d[H_x^{-1}(\mathbf{x}-\mathbf{x}_l)] \big/ \sum_{l=1}^{r} K_d[H_x^{-1}(\mathbf{x}-\mathbf{x}_l)]$, the forecasting expectation and variance of $y$ are produced by

$$\mu = \sum_{l=1}^{r} \omega_l(\mathbf{x})\, y_l, \qquad \sigma^2 = |H_y|^2 + \sum_{l=1}^{r} \omega_l(\mathbf{x})\,(y_l-\mu)^2$$

With the input of the up-to-date vector $\mathbf{x}_{r+L} = [\,x(r+L),\ x(r+L+1),\ \ldots,\ x(n)\,]$, the L-step ahead forecasting results can be generated by

$$\hat{f}(n+L) = \sum_{l=1}^{r} \omega_l(\mathbf{x}_{r+L})\cdot \frac{1}{|H_y|}\,K_1\!\left[H_y^{-1}(y-y_l)\right], \qquad \sigma^2_{n+L} = |H_y|^2 + \sum_{l=1}^{r} \omega_l(\mathbf{x}_{r+L})\left[\,y_l - \sum_{l=1}^{r}\omega_l(\mathbf{x}_{r+L})\,y_l\right]^2$$

where $\hat{f}(n+L)$ and $\sigma^2_{n+L}$ are the predicted PDF and variance at time point $n+L$, respectively.

From the above illustrations, it can be seen that UCKDE performs unbiased probabilistic prediction, while TCKDE can simultaneously execute both deterministic and probabilistic predictions. In addition to explaining some nonlinear features, TCKDE can also explain partial linear ones because of the utilization of PACF. UCKDE operates on the error between the predicted and actual values. The details of TCKDE and UCKDE can be found in [39].

3. The proposed method

The proposed method adopts the concept of “decomposition and ensemble” and is a hybrid of RLMD, GMDH, TCKDE and UCKDE. In this method, the deterministic prediction is carried out by the combination of RLMD, GMDH and TCKDE. Then, the probabilistic prediction is further performed by UCKDE. The method mainly includes a preprocessing signal decomposition step, which reduces the nonstationarity and nonlinearity of the original data, and a post-processing error correction step, which explains the other useful information hidden in the data.
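As a rough illustration of the univariate TCKDE used for the error-correction stage, the following sketch builds the lagged samples, computes Gaussian-kernel weights for the up-to-date input vector, and returns the forecasting expectation, variance and predictive PDF. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`build_samples`, `nrc_bandwidths`, `tckde_forecast`, `conditional_pdf`) are hypothetical, the bandwidth matrices are assumed diagonal, and the normal reference criterion is assumed to take its common rule-of-thumb form.

```python
import numpy as np

def build_samples(x, d, L):
    """Form x_l = [x(l), ..., x(l+d-1)] and y_l = x(l+d-1+L), l = 1..r."""
    x = np.asarray(x, dtype=float)
    r = x.size - d - L + 1                      # r = n - d - L + 1
    X = np.stack([x[l:l + d] for l in range(r)])  # shape (r, d)
    y = x[d - 1 + L: d - 1 + L + r]               # shape (r,)
    return X, y

def nrc_bandwidths(X, y):
    """Rule-of-thumb normal-reference bandwidths (assumed diagonal H_x)."""
    r, d = X.shape
    hx = X.std(axis=0, ddof=1) * (4.0 / ((d + 2) * r)) ** (1.0 / (d + 4))
    hy = y.std(ddof=1) * (4.0 / (3.0 * r)) ** (1.0 / 5.0)
    return hx, hy

def tckde_forecast(x, d, L):
    """L-step ahead expectation and variance from the latest d observations."""
    x = np.asarray(x, dtype=float)
    X, y = build_samples(x, d, L)
    hx, hy = nrc_bandwidths(X, y)
    x_new = x[-d:]                              # up-to-date vector [x(r+L), ..., x(n)]
    z = (x_new - X) / hx
    log_k = -0.5 * np.sum(z ** 2, axis=1)       # log of the Gaussian kernel, up to a constant
    w = np.exp(log_k - log_k.max())             # subtract the max for numerical stability
    w /= w.sum()                                # normalized weights omega_l
    mu = float(np.sum(w * y))
    var = float(hy ** 2 + np.sum(w * (y - mu) ** 2))
    return mu, var, w, y, hy

def conditional_pdf(y_grid, w, y, hy):
    """Predictive PDF: weighted mixture of Gaussians centred at each y_l."""
    u = (np.asarray(y_grid)[:, None] - y[None, :]) / hy
    return (np.exp(-0.5 * u ** 2) / (np.sqrt(2.0 * np.pi) * hy)) @ w

# Synthetic series, used only to exercise the functions (not wind speed data).
rng = np.random.default_rng(0)
t = np.arange(300)
series = 10 + 2 * np.sin(2 * np.pi * t / 50) + 0.3 * rng.standard_normal(300)
mu, var, w, ys, hy = tckde_forecast(series, d=3, L=1)
```

In this sketch the embedded dimension `d` is passed in directly; in the method described above it would be chosen from the PACF of the series being modeled.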

Fig. 3. The workflow of the proposed method. Flowchart stages: decompose the original data into different subseries by RLMD (reducing the nonstationarity and nonlinearity hidden in the original data); establish GMDH for each subseries and get the corresponding model training error; build TCKDE for the model training error and obtain the final error component (deterministic prediction, with GMDH-TCKDE explaining linear and nonlinear relationships); adopt UCKDE for the final error component to conduct unbiased probabilistic prediction (explaining the uncertain information embedded in the data); update the database with actual data.

component and then carry out probabilistic prediction (see Section 2.2.2);
(5) The original data are constantly updated and the corresponding predictions are conducted by repeating procedures (1)–(4).

It should be emphasized that the decomposition in the proposed method is performed in real time. In order to track the time-varying characteristics of nonstationary signals, which widely exist in the physical world, the required model parameters should be updated in each real-time prediction. In summary, the proposed method uses different models to explain different features hidden in the original data. Concretely, RLMD is used to reduce the nonstationarity and nonlinearity of the data; GMDH is proposed to describe the nonlinear characteristics embedded in the data; TCKDE not only captures the linear relationship of the data, but also reveals partial nonlinear information; UCKDE is developed to depict the uncertain information contained in the raw data.

3.2. Evaluation criteria

Multiple criteria are employed to perform a systematic assessment of the proposed method from two perspectives: deterministic forecasting accuracy and probabilistic forecasting reliability. A brief illustration of these criteria is presented below.

3.2.1. Deterministic prediction criteria

To evaluate the performance of the involved deterministic models, three commonly used statistical metrics are utilized: mean absolute error (MAE), root mean square error (RMSE) and mean relative percentage error (MRPE). Generally, smaller values indicate higher forecasting accuracy of the model examined [41]. Their mathematical definitions are as follows.

• Mean Absolute Error:

$$\mathrm{MAE} = \frac{1}{\bar{n}} \sum_{t=n+1}^{n+\bar{n}} \left| x(t) - \hat{x}(t) \right| \tag{27}$$

• Root Mean Square Error:

$$\mathrm{RMSE} = \sqrt{\frac{1}{\bar{n}} \sum_{t=n+1}^{n+\bar{n}} \left[ x(t) - \hat{x}(t) \right]^2} \tag{28}$$

• Mean Relative Percentage Error:

$$\mathrm{MRPE} = \frac{1}{\bar{n}} \sum_{t=n+1}^{n+\bar{n}} \left| \frac{x(t) - \hat{x}(t)}{x(t)} \right| \tag{29}$$

in which $\bar{n}$ is the number of data points used in the performance assessment; $x(t)$ and $\hat{x}(t)$ are the actual and predicted values at time point $t$, respectively.

3.2.2. Probabilistic prediction criteria

Generally, the results of the probabilistic prediction can be exhibited by the constructed PIs. In order to comprehensively evaluate the quality of these PIs, three indexes covering reliability, sharpness and their combination are used. These indexes are average coverage error (ACE), prediction interval normalized average width (PINAW) and coverage width-based criterion (CWC), respectively [42].

• Reliability

$$\mathrm{ACE} = \mathrm{PICP} - \mathrm{PINC} \tag{30}$$

where PINC is the PI nominal confidence level; PICP denotes the PI coverage probability, which can be defined as

$$\mathrm{PICP} = \frac{1}{\bar{n}} \sum_{t=n+1}^{n+\bar{n}} \kappa_t, \qquad \kappa_t = \begin{cases} 1, & \text{if } x(t) \in [L_t, U_t] \\ 0, & \text{otherwise} \end{cases} \tag{31}$$

in which $[L_t, U_t]$ indicates the predicted PI at time point $t$. ACE ≥ 0 means the constructed PIs are reliable, and vice versa. In this situation, a smaller value indicates better performance of the constructed PIs. Therefore, ACE = 0 corresponds to the optimal PIs.

• Sharpness

$$\mathrm{PINAW} = \frac{1}{\bar{n}\,(x_{\mathrm{Max}} - x_{\mathrm{Min}})} \sum_{t=n+1}^{n+\bar{n}} (U_t - L_t) \tag{32}$$

where $x_{\mathrm{Max}}$ and $x_{\mathrm{Min}}$ are the maximum and minimum in the testing set. Different from ACE, the index PINAW assesses the width of the PIs. Generally, narrow PIs are more informative and competitive than wide ones. In other words, the smaller the value is, the better the constructed PIs are.

• Combination of reliability and sharpness

$$\mathrm{CWC} = \mathrm{PINAW} \cdot \left\{ 1 + \gamma(\mathrm{ACE}) \cdot \exp\left[ -\eta \cdot (\mathrm{ACE}) \right] \right\} \tag{33}$$

$$\gamma(\mathrm{ACE}) = \begin{cases} 1, & \text{if } \mathrm{ACE} < 0 \\ 0, & \text{otherwise} \end{cases} \tag{34}$$

where $\eta$ is the parameter that penalizes the difference between PICP and PINC, and its value is set to 50 in this study [42]. Obviously, a smaller value of CWC means higher quality of the constructed PIs.

4. Case studies

In order to confirm the superiority of the proposed method, three sets of 10-min mean wind speed time series collected by the National Oceanic and Atmospheric Administration (NOAA) are employed as the experimental data, as displayed in Fig. 4. The first two groups of data have 1200 measured observations and both of them are divided into two parts in the experimental implementation, i.e., the 1st–1000th samples are treated as the training set to construct and train the model, while the remaining 200 samples are used as the testing set to conduct the performance evaluation. The third group of data, with a greater amount of data (2000 measured observations), is used to further illustrate the performance of the proposed method. Different from the above treatment, the initial 1600 data points of this dataset are

Fig. 4. Three sets of 10-min averaged wind speed data: (a) Dataset 1; (b) Dataset 2; (c) Dataset 3. Each panel plots wind speed (m/s) against T (10 min), with the training and testing parts marked.

regarded as the training part and the remaining data points are considered to be the testing part. The descriptive summary of these three datasets, including mean, standard deviation (Std.), minimum (Min.), maximum (Max.), skewness, kurtosis and stationarity, is presented in Table 1. Herein, stationarity is measured by the reverse arrangements test, where a result outside the range [162, 272] means the corresponding dataset can be regarded as a nonstationary process, and a larger deviation from this range indicates stronger nonstationarity [43]. In addition, skewness = 0 and kurtosis = 3 mean that the corresponding data follow a Gaussian distribution, and larger departures from these target values indicate stronger non-Gaussianity. From Table 1, it can be seen that the wind speed fluctuates severely and presents apparent nonstationarity and non-Gaussianity in Dataset 1.

Table 1. Different features of three datasets.

| Data Sources | Mean | Std. | Max. | Min. | Skewness | Kurtosis | Stationarity |
|---|---|---|---|---|---|---|---|
| Dataset 1 | 11.53 | 5.09 | 28.75 | 4.69 | 1.56 | 4.81 | 138 |
| Dataset 2 | 14.50 | 3.04 | 22.89 | 7.08 | 0.44 | 2.88 | 163 |
| Dataset 3 | 12.30 | 2.86 | 20.69 | 1.72 | 0.06 | 3.10 | 205 |

4.1. Wind speed prediction

Taking the case study based on dataset 1 as an example, the forecasting process of the proposed method is briefly illustrated below. Fig. 5 displays the RLMD decomposition results of the initial training part (i.e., the 1st–1000th data points). From Fig. 5, eight different subseries are obtained. Based on these decomposed results, GMDH is adopted as the predictor to conduct the corresponding prediction. Then, the GMDH training errors of the above eight subseries are exhibited in Fig. 6(a) and denoted as error component I. The corresponding PACF analysis results are provided in Fig. 6(b). It is obvious that a significant correlation exists in error component I. Then, TCKDE is established to further describe the other linear and nonlinear features contained in error component I. Fig. 7 shows the PACF analysis of the model training errors from TCKDE. From Fig. 7, it can be observed that there is no significant correlation in this error component. On this basis, the deterministic prediction is generated and the corresponding forecasting model is RLMD-GMDH-TCKDE. In order to better illustrate the uncertain information, the probabilistic interval prediction with a 95% confidence interval is carried out by UCKDE. The corresponding one-step ahead results are shown in Fig. 8. Analogously, the above procedures are repeated for the other training data and then the final forecasting results are provided.

4.2. Forecasting result and discussion

In order to verify the effectiveness of the proposed method, seven other methods are employed to make a comparative study. These methods are ARIMA-UCKDE, LSSVM-UCKDE, Models 1–3, GMDH-UCKDE and TCKDE. All of them are presented comprehensively in Table 2. From Table 2, it can be observed that all involved models can produce both deterministic and probabilistic predictions. On this basis, one-step ahead wind speed prediction is implemented and the corresponding forecasting results are given below.

4.2.1. Deterministic prediction

Based on the foregoing three datasets, the corresponding experiments are carried out and denoted as case studies 1–3, respectively. Specifically, Tables 3–5 display the error indexes of all involved models. The improvement percentages between the proposed method and the other compared models are shown in Tables 6–8. In order to intuitively understand the forecasting results, Figs. 9–11 present the deterministic predictions by Model 3, TCKDE and the proposed method. From these forecasting results and the corresponding comparisons, some observations are listed below:

Fig. 5. Decomposition results by RLMD (the 1st–1000th data points).

Fig. 6. The result analysis for GMDH’s model training errors: (a) error component I; (b) PACF analysis of error component I.

Fig. 7. The PACF result of TCKDE’s model training error.

Fig. 8. Predictive value and 95% confidence interval of the 1001st data point (Dataset 1).

Table 2. Eight involved forecasting models.

| Method | Decomposition Technique | Deterministic Model | Probabilistic Model |
|---|---|---|---|
| Proposed | RLMD | GMDH-TCKDE | UCKDE |
| Model 1 | EMD | GMDH-TCKDE | UCKDE |
| Model 2 | DWT | GMDH-TCKDE | UCKDE |
| Model 3 | N.A. | GMDH-TCKDE | UCKDE |
| LSSVM-UCKDE | N.A. | LSSVM | UCKDE |
| ARIMA-UCKDE | N.A. | ARIMA | UCKDE |
| GMDH-UCKDE | N.A. | GMDH | UCKDE |
| TCKDE | N.A. | TCKDE | TCKDE |

Table 3. Result comparisons among deterministic predictions (Dataset 1).

| Indexes | MAE | RMSE | MRPE |
|---|---|---|---|
| Proposed method | 0.334 | 0.428 | 0.023 |
| Model 1 | 0.375 | 0.472 | 0.031 |
| Model 2 | 0.357 | 0.453 | 0.030 |
| Model 3 | 0.393 | 0.517 | 0.034 |
| LSSVM-UCKDE | 0.463 | 0.574 | 0.041 |
| ARIMA-UCKDE | 0.501 | 0.624 | 0.043 |
| TCKDE | 0.452 | 0.568 | 0.039 |
| GMDH-UCKDE | 0.459 | 0.572 | 0.040 |

Table 4. Result comparisons among deterministic predictions (Dataset 2).

| Indexes | MAE | RMSE | MRPE |
|---|---|---|---|
| Proposed method | 0.323 | 0.420 | 0.021 |
| Model 1 | 0.489 | 0.601 | 0.038 |
| Model 2 | 0.456 | 0.534 | 0.029 |
| Model 3 | 0.379 | 0.485 | 0.025 |
| LSSVM-UCKDE | 0.411 | 0.516 | 0.026 |
| ARIMA-UCKDE | 0.481 | 0.571 | 0.031 |
| TCKDE | 0.426 | 0.528 | 0.027 |
| GMDH-UCKDE | 0.392 | 0.492 | 0.025 |

(1) Among the models that do not use a signal decomposition technique, Model 3 is superior to the others. For example, in case study 1, its MAE, RMSE and MRPE are 0.393, 0.517 and 0.034, respectively, while those of TCKDE are 0.452, 0.568 and 0.039, respectively. The reason could be that Model 3 conducts the deterministic prediction based on GMDH and TCKDE, which can well explain the linear and nonlinear characteristics hidden in the data. In contrast, the other models can only describe some characteristics of the data; for instance, LSSVM-UCKDE can only reveal partial nonlinear features in the data.

(2) Compared with ARIMA-UCKDE, the three nonlinear models (LSSVM-UCKDE, TCKDE and GMDH-UCKDE) achieve better prediction. The cause could be that the nonlinear information in the three datasets is more significant than the linear information. On the other hand, the comparison among these nonlinear models shows that GMDH-UCKDE outperforms LSSVM-UCKDE. The reason could be that GMDH has the ability to avoid the problems caused by local minima, which may be troublesome in the implementation of LSSVM. Additionally, the performance of TCKDE is unstable in these three case studies. For example, compared with GMDH-UCKDE, TCKDE performs better in case study 1, but produces poorer performance in the remaining two case studies. The reason could be that TCKDE is generally suitable for capturing the non-Gaussianity embedded in the data, which is remarkable in dataset 1 (see Table 1). Therefore, the forecasting model selection should rely on the characteristics of the wind speed data.

(3) Models 1 and 2 belong to the decomposition-based forecasting methods, where the nonstationarity and nonlinearity of the raw data can be reduced in advance. Evidently, these two models have higher complexity than the models that do not use signal decomposition techniques. However, the experimental results show that these two models only have obvious advantages in case study 1, while in the other case studies they even perform worse. For instance, compared with Model 3, Model 1 achieves an obvious improvement in case study 1, with the MAE decreased by 0.018, while the corresponding index in case study 2 is increased by 0.11. This indicates that the decomposition-based forecasting methods may not always be effective in practice. The reason could be that the defects of the decomposition approaches, such as the introduction of illusive or irrelevant components, may increase the forecasting challenge [20]. Hence, higher model complexity does not necessarily indicate better capability.

(4) Compared with Models 1 and 2, the proposed method exhibits overall improvement. For example, compared with Model 1, the improvements in MAE, RMSE and MRPE (case study 3) are 30.703%, 32.602% and 24.975%, respectively. This reveals that the decomposed results of RLMD may be more appropriate for wind speed prediction than those of EMD and DWT. Generally, these two mainstream signal preprocessing approaches suffer from the disturbances of the end effect and mode aliasing [21]. Fortunately, these problems can be effectively alleviated by the proposed RLMD [34]. This means that, with appropriate modification, the decomposition-based method may be powerful and effective in practice.

(5) The proposed method is a hybrid of RLMD, GMDH, TCKDE and UCKDE, which takes advantage of the strengths of all the single models. In this method, different characteristics hidden in the data are explained by these individual models. Specifically, RLMD is used to address the data’s nonstationarity and nonlinearity; the combination of GMDH and TCKDE not only describes the remaining nonlinear characteristics, but also captures the embedded linear ones; UCKDE is developed to depict the uncertain information contained in the data. The final experiments show that the proposed method has the highest forecasting capability.

(6) It should be emphasized that the proposed method cannot be expected to provide satisfactory forecasting accuracy for every type of wind speed data. However, the experimental results indicate that the proposed method may be more suitable for data with stronger nonstationarity and non-Gaussianity. For example, the improvement of the proposed method over GMDH-UCKDE in terms of MAE is 27.285% in case study 1, while those in the other two case studies are 17.557% and 8.451%, respectively.
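The deterministic criteria and the improvement percentage used in this comparison can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the improvement percentage is assumed to be the absolute difference between the proposed and the compared index, divided by the compared index, expressed in percent.

```python
import numpy as np

def mae(actual, predicted):
    """Mean absolute error over the testing set."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs(a - p)))

def rmse(actual, predicted):
    """Root mean square error over the testing set."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((a - p) ** 2)))

def mrpe(actual, predicted):
    """Mean relative percentage error (errors scaled by the actual values)."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs((a - p) / a)))

def improvement(index_proposed, index_other):
    """Improvement percentage of the proposed method over another model."""
    return abs(index_proposed - index_other) / index_other * 100.0

# Toy values, chosen only to make the arithmetic easy to follow.
actual = [10.0, 12.0, 8.0]
predicted = [9.0, 12.0, 10.0]
print(mae(actual, predicted), rmse(actual, predicted), mrpe(actual, predicted))

# MAE of the proposed method vs. Model 1 on Dataset 1 (rounded table values).
print(improvement(0.334, 0.375))
```

Because the tabulated error indexes are rounded to three decimals, improvement percentages recomputed from them agree with the reported ones only approximately.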

Table 5. Result comparisons among deterministic predictions (Dataset 3).

| Indexes | MAE | RMSE | MRPE |
|---|---|---|---|
| Proposed method | 0.425 | 0.646 | 0.046 |
| Model 1 | 0.613 | 0.958 | 0.061 |
| Model 2 | 0.552 | 0.834 | 0.057 |
| Model 3 | 0.444 | 0.671 | 0.048 |
| LSSVM-UCKDE | 0.486 | 0.723 | 0.051 |
| ARIMA-UCKDE | 0.538 | 0.811 | 0.055 |
| TCKDE | 0.507 | 0.756 | 0.053 |
| GMDH-UCKDE | 0.464 | 0.712 | 0.050 |

Table 6. Improvements by the proposed method (Dataset 1).

| Indexes | P_MAE (%) | P_RMSE (%) | P_MRPE (%) |
|---|---|---|---|
| Proposed vs. Model 1 | 10.924 | 9.348 | 25.627 |
| Proposed vs. Model 2 | 6.331 | 5.568 | 22.641 |
| Proposed vs. Model 3 | 14.950 | 17.262 | 31.510 |
| Proposed vs. LSSVM-UCKDE | 27.946 | 25.480 | 43.142 |
| Proposed vs. ARIMA-UCKDE | 33.376 | 31.359 | 45.902 |
| Proposed vs. TCKDE | 26.061 | 24.650 | 39.479 |
| Proposed vs. GMDH-UCKDE | 27.285 | 25.193 | 40.790 |

$P_I = \dfrac{|I_{\mathrm{proposed}} - I_{\mathrm{others}}|}{I_{\mathrm{others}}} \times 100\%$, $I$ = MAE, RMSE and MRPE.

Table 7. Improvements by the proposed method (Dataset 2).

| Indexes | P_MAE (%) | P_RMSE (%) | P_MRPE (%) |
|---|---|---|---|
| Proposed vs. Model 1 | 33.887 | 30.156 | 44.776 |
| Proposed vs. Model 2 | 29.102 | 21.396 | 27.447 |
| Proposed vs. Model 3 | 14.667 | 13.463 | 14.379 |
| Proposed vs. LSSVM-UCKDE | 21.416 | 18.623 | 19.075 |
| Proposed vs. ARIMA-UCKDE | 32.815 | 26.474 | 32.128 |
| Proposed vs. TCKDE | 24.149 | 20.490 | 22.998 |
| Proposed vs. GMDH-UCKDE | 17.557 | 14.601 | 15.838 |

Table 8. Improvements by the proposed method (Dataset 3).

| Indexes | P_MAE (%) | P_RMSE (%) | P_MRPE (%) |
|---|---|---|---|
| Proposed vs. Model 1 | 30.703 | 32.602 | 24.975 |
| Proposed vs. Model 2 | 23.046 | 22.581 | 19.710 |
| Proposed vs. Model 3 | 4.368 | 3.727 | 4.692 |
| Proposed vs. LSSVM-UCKDE | 12.595 | 10.695 | 10.264 |
| Proposed vs. ARIMA-UCKDE | 21.043 | 20.386 | 16.790 |
| Proposed vs. TCKDE | 16.292 | 14.622 | 12.910 |
| Proposed vs. GMDH-UCKDE | 8.451 | 9.316 | 8.469 |

Fig. 9. Deterministic predictions by Model 3, TCKDE and the proposed method (Dataset 1).

Fig. 10. Deterministic predictions by Model 3, TCKDE and the proposed method (Dataset 2).

Fig. 11. Deterministic predictions by Model 3, TCKDE and the proposed method (Dataset 3).

4.2.2. Probabilistic prediction

In order to further demonstrate the reliability of the proposed method, the probabilistic predictions of all involved models are performed. Taking one-step ahead prediction with 95% PINC as an example, the corresponding result comparisons are enumerated in Tables 9–11. From Tables 9 to 11, the following can be observed:

Table 9. Result comparisons among probabilistic predictions (Dataset 1).

| Indexes | ACE | PINAW | CWC |
|---|---|---|---|
| Proposed method | 0.030 | 0.294 | 0.294 |
| Model 1 | 0.030 | 0.382 | 0.382 |
| Model 2 | 0.050 | 0.348 | 0.348 |
| Model 3 | 0.050 | 0.357 | 0.357 |
| LSSVM-UCKDE | 0.050 | 0.402 | 0.402 |
| ARIMA-UCKDE | 0.050 | 0.484 | 0.484 |
| TCKDE | 0.050 | 0.429 | 0.429 |
| GMDH-UCKDE | 0.040 | 0.373 | 0.373 |

Table 10. Result comparisons among probabilistic predictions (Dataset 2).

| Indexes | ACE | PINAW | CWC |
|---|---|---|---|
| Proposed method | 0.050 | 0.398 | 0.398 |
| Model 1 | −0.010 | 0.435 | 1.152 |
| Model 2 | 0.020 | 0.487 | 0.487 |
| Model 3 | 0.050 | 0.467 | 0.467 |
| LSSVM-UCKDE | 0.050 | 0.491 | 0.491 |
| ARIMA-UCKDE | 0.050 | 0.587 | 0.587 |
| TCKDE | 0.050 | 0.535 | 0.535 |
| GMDH-UCKDE | 0.050 | 0.488 | 0.488 |

Table 11. Result comparisons among probabilistic predictions (Dataset 3).

| Indexes | ACE | PINAW | CWC |
|---|---|---|---|
| Proposed method | 0.048 | 0.377 | 0.377 |
| Model 1 | 0.050 | 0.752 | 0.752 |
| Model 2 | 0.050 | 0.689 | 0.689 |
| Model 3 | 0.050 | 0.441 | 0.441 |
| LSSVM-UCKDE | 0.050 | 0.533 | 0.533 |
| ARIMA-UCKDE | 0.050 | 0.637 | 0.637 |
| TCKDE | 0.050 | 0.588 | 0.588 |
| GMDH-UCKDE | 0.050 | 0.494 | 0.494 |

Fig. 12. PIs by Model 3, TCKDE and the proposed method with 95% PINC (Dataset 1).

Fig. 13. PIs by Model 3, TCKDE and the proposed method with 95% PINC (Dataset 2).

Fig. 14. PIs by Model 3, TCKDE and the proposed method with 95% PINC (Dataset 3).

(1) Compared with TCKDE, three of the models (ARIMA-UCKDE, Model 1 and Model 2) even perform worse. For example, the CWC indexes of TCKDE, ARIMA-UCKDE, Model 1 and Model 2 in case study 3 are 0.588, 0.637, 0.752 and 0.689, respectively. The reason could be their low deterministic forecasting capability, which may deteriorate the probabilistic forecasting performance. This phenomenon can also be seen in the comparison among the methods that do not use signal preprocessing techniques, where ARIMA-UCKDE shows the worst performance.

(2) Apart from the above three models, the other UCKDE-based models have better forecasting reliability than TCKDE. For example, the CWC indexes of Model 3 in the three case studies are 0.357, 0.467 and 0.441, respectively, whereas those of TCKDE are 0.429, 0.535 and 0.588, respectively. This indicates that UCKDE may be more suitable than TCKDE for providing satisfactory results.

(3) Although the decomposition-based prediction methods can effectively reduce nonstationarity in data, they may not always be effective in wind speed prediction due to their own shortcomings. For example, Model 1 outperforms Model 3 in case study 1 (strong nonstationarity), while it performs worse in case studies 2 and 3 (weak nonstationarity). Therefore, the effectiveness of the decomposition methods depends on the balance between their advantages and disadvantages.

(4) The proposed method surpasses the other involved models in each case study. In order to intuitively demonstrate the proposed method, Figs. 12–14 display the PIs constructed by Model 3, TCKDE and the proposed method. From Figs. 12–14, it can be observed that most of the actual observations fall within the PIs constructed at the given PINC. Meanwhile, the PI width of the proposed method is narrower than those of Model 3 and TCKDE. Based on the above methods, Figs. 15–17 further present the predictive PDFs and PIs of the 1039th data point of datasets 1 and 2, and the 1650th data point of dataset 3, which also validates the superiority of the proposed method.
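The probabilistic criteria of Section 3.2.2 can be sketched in a few lines. The sketch below is illustrative, with hypothetical function names; it assumes the CWC takes the form PINAW·{1 + γ·exp(−η·ACE)}, with γ = 1 only when ACE < 0 and η = 50, which reproduces the one penalized entry in the tables (Model 1 on Dataset 2: ACE = −0.010, PINAW = 0.435, CWC = 1.152).

```python
import math
import numpy as np

def picp(actual, lower, upper):
    """PI coverage probability: fraction of observations inside [L_t, U_t]."""
    a, lo, up = (np.asarray(v, float) for v in (actual, lower, upper))
    return float(np.mean((a >= lo) & (a <= up)))

def ace(picp_value, pinc):
    """Average coverage error: coverage minus the nominal confidence level."""
    return picp_value - pinc

def pinaw(actual, lower, upper):
    """Mean PI width, normalized by the range of the testing set."""
    a, lo, up = (np.asarray(v, float) for v in (actual, lower, upper))
    return float(np.mean(up - lo) / (a.max() - a.min()))

def cwc(pinaw_value, ace_value, eta=50.0):
    """Coverage width-based criterion; the penalty applies only if ACE < 0."""
    gamma = 1.0 if ace_value < 0 else 0.0
    return pinaw_value * (1.0 + gamma * math.exp(-eta * ace_value))

# Toy intervals, chosen only to make the arithmetic easy to follow.
actual = [5.0, 6.0, 7.0, 8.0]
lower = [4.0, 5.0, 7.5, 7.0]
upper = [6.0, 7.0, 9.0, 9.0]
coverage = picp(actual, lower, upper)    # three of four observations are covered
print(coverage, ace(coverage, 0.95), pinaw(actual, lower, upper))

# Tabulated ACE and PINAW of Model 1 on Dataset 2.
print(round(cwc(0.435, -0.010), 3))
```

With a non-negative ACE the penalty term vanishes and CWC equals PINAW, which is why the CWC and PINAW columns coincide for every other entry in Tables 9–11.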

In summary, the proposed method not only produces more precise deterministic prediction, but also yields more reliable probabilistic prediction.

Fig. 15. Predictive PDFs and PIs of the 1039th data point from TCKDE, Model 3 and the proposed method (Dataset 1).

Fig. 16. Predictive PDFs and PIs of the 1039th data point from TCKDE, Model 3 and the proposed method (Dataset 2).

Fig. 17. Predictive PDFs and PIs of the 1650th data point from TCKDE, Model 3 and the proposed method (Dataset 3).

5. Conclusions

In order to meticulously mine the attributes of wind speed data and subsequently generate more reliable prediction, this paper proposes a novel hybrid method, which is composed of robust local mean decomposition (RLMD), group method of data handling neural network (GMDH), traditional conditional kernel density estimation (TCKDE) and unbiased CKDE (UCKDE). A systematic assessment based on three case studies with one-step ahead prediction is employed to evaluate the performance of the proposed method, and several main conclusions are summarized as follows:

The proposed self-adaptive RLMD technique can effectively alleviate the disturbances of adverse factors (e.g., the end effect and mode aliasing) through the optimization of the boundary condition, envelope estimation and sifting stopping criterion. By this approach, the original data’s nonstationarity and nonlinearity can be addressed significantly and the decomposed results may be more appropriate for wind speed prediction. For example, the case study based on dataset 3 shows that the MAE and CWC of the proposed method are 0.425 and 0.377, while those of Model 2 (the DWT-based forecasting method) are 0.552 and 0.689, respectively.

The combination of GMDH and TCKDE may be a beneficial attempt at boosting the forecasting accuracy. For example, compared with TCKDE, GMDH-TCKDE has better performance, with the RMSE decreased by 0.043 (case study based on dataset 2). In this method, the potential linear and nonlinear characteristics of the data can be well captured. With this comprehensive feature extraction, satisfactory forecasting accuracy can be provided.

UCKDE may perform better than TCKDE in providing reliable PIs. For instance, the CWC of TCKDE is 0.429, while that of GMDH-UCKDE is 0.373 (case study based on dataset 1). Generally, this model describes the uncertain information by transforming the better deterministic prediction into a probabilistic prediction with an unbiased result. Therefore, it can overcome the inability of the deterministic prediction to handle the uncertainty hidden in the data.

The proposed method incorporates the advantages of the above models to explain the different characteristics of wind speed data. Three case studies based on measured data with different characteristics show that the proposed method may be more suitable for data with stronger nonstationarity and non-Gaussianity; besides more accurate point predictions, the proposed method can also provide more compact PIs in comparison with the other concerned models. Therefore, it may have great potential in practice, especially for smart grid planning.

As mentioned above, the wind speed prediction in this paper depends only on a single historical wind speed time series. However, the actual wind speed is usually affected by many meteorological factors, such as wind directionality, temperature, atmospheric pressure, air density and humidity. Therefore, future work should integrate these factors into the forecasting model in order to enhance the forecasting reliability. Moreover, subsequent research should also focus on the selection of the parameters in some evolution models.

forecasting. Energy Convers Manage 2017;148:554–68. [15] Zhang YC, Liu KP, Qin L, An XL. Deterministic and probabilistic interval prediction for short-term wind power generation based on variational mode decomposition and machine learning methods. Energy Convers Manage 2016;112:208–19. [16] Amini MH, Kargarian A, Karabasoglu O. ARIMA-based decoupled time series forecasting of electric vehicle charging demand for stochastic power system operation. Electr Power Syst Res 2016;140:378–90. [17] Zhang C, Wei HK, Zhao JS, Liu TH, Zhu TT, Zhang KJ. Short-term wind speed forecasting using empirical mode decomposition and feature selection. Renewable Energy 2016;96:727–37. [18] Liu H, Tian HQ, Li YF. An EMD-recursive ARIMA method to predict wind speed for railway strong wind warning system. J Wind Eng Ind Aerodyn 2015;141:27–38. [19] Liu D, Niu DX, Wang H, Fan LL. Short-term wind speed forecasting using wavelet transform and support vector machines optimized by genetic algorithm. Renewable Energy 2014;62:592–7. [20] Wang YM, Wu L. On practical challenges of decomposition-based hybrid forecasting algorithms for wind speed and solar irradiation. Energy 2016;112:208–20. [21] Zheng WQ, Peng XG, Lu D, Zhang D, Liu Y, Lin ZH, et al. Composite quantile regression extreme learning machine with feature selection for short-term wind speed forecasting: a new approach. Energy Convers Manage 2017;151:737–52. [22] Wu ZH, Huang NE. Ensemble empirical mode decomposition: a noise-assisted data analysis method. Adv Adaptive Data Anal 2009;1(01):1–41. [23] Torres ME, Colominas MA, Schlotthauer G, Flandrin P. A complete ensemble empirical mode decomposition with adaptive noise. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2011. p. 4144–7. [24] Ravikumar K, Tamilselvan S. On the use of wavelets packet decomposition for time series prediction. Appl Math Sci 2014;8(58):2847–58. [25] Gilles J. Empirical wavelet transform. 
IEEE Trans Signal Process 2013;61(16):3999–4010. [26] Dragomiretskiy K, Zosso D. Variational mode decomposition. IEEE Trans Signal Process 2014;62(3):531–44. [27] Sun N, Zhou JZ, Chen L, Jia BJ, Tayyab M. An adaptive dynamic short-term wind speed forecasting model using secondary decomposition and an improved regularized extreme learning machine. Energy 2018;165:939–57. [28] Zhang Y, Wang JX, Wang XF. Review on probabilistic forecasting of wind power generation. Renew Sustain Energy Rev 2014;32(5):255–70. [29] Bremnes JB. Probabilistic wind power forecasts using local quantile regression. Wind Energy: Int J Progr Appl Wind Power Convers Technol 2004;7(1):47–54. [30] Juban J, Fugon L, Kariniotakis G. Probabilistic short-term wind power forecasting based on kernel density estimators, European Wind Energy Conference and exhibition. EWEC 2007, 2007.. [31] Hu JM, Wang JZ. Short-term wind speed prediction using empirical wavelet transform and Gaussian process regression. Energy 2015;93:1456–66. [32] Zhang C, Wei HK, Zhao X, Liu TH, Zhang KJ. A Gaussian process regression based hybrid approach for short-term wind speed prediction. Energy Convers Manage 2016;126:1084–92. [33] Smith JS. The local mean decomposition and its application to EEG perception data. J R Soc Interface 2005;2(5):443–54. [34] Liu ZL, Jin YQ, Zuo MJ, Feng ZP. Time-frequency representation based on robust local mean decomposition for multicomponent AM-FM signal analysis. Mech Syst Sig Process 2017;95:468–87. [35] Cheng JS, Yang Y, Yang Y. A rotating machinery fault diagnosis method based on local mean decomposition. Digital Signal Process 2012;22(2):356–66. [36] Srinivasan D. Energy demand prediction using GMDH networks. Neurocomputing 2008;72(1–3):625–9. [37] Madandoust R, Bungey JH, Ghavidel R. Prediction of the concrete compressive strength by means of core testing using GMDH-type neural network and ANFIS models. Comput Mater Sci 2012;51(1):261–72. [38] Khosravi A, Machado L, Nunes RO. 
Time-series prediction of wind speed using machine learning algorithms: a case study Osorio wind farm, Brazil. Appl Energy 2018;224:550–66. [39] Hyndman RJ, Bashtannyk DM, Grunwald GK. Estimating and visualizing conditional densities. J Comput Graph Statist 1996;5(4):315–36. [40] Zambom AZ, Dias R. A review of kernel density estimation with applications to econometrics. arXiv preprint arXiv:1212.2812, 2012. [41] Liu H, Duan Z, Han FZ, Li YF. Big multi-step wind speed forecasting model based on secondary decomposition, ensemble method and error correction algorithm. Energy Convers Manage 2018;156:525–41. [42] Wang Y, Hu QH, Meng DY, Zhu PF. Deterministic and probabilistic wind power forecasting using a variational Bayesian-based adaptive robust multi-kernel regression model. Appl Energy 2017;208:1097–112. [43] Jiang Y, Zhao N, Peng LL, Liu SY. A new hybrid framework for probabilistic wind speed prediction using deep feature selection and multi-error modification. Energy Convers Manage 2019;199. 111981.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The support of the National Natural Science Foundation of China (Grant Nos. 51808078, 51878041), the 111 Project (Grant No. B18062), the China Postdoctoral Science Foundation (Grant No. 2018M640900) and the Doctoral Funding of Southwest University (SWU118067) is gratefully acknowledged.

References

[1] Zhang D, Peng XG, Pan KD, Liu Y. A novel wind speed forecasting based on hybrid decomposition and online sequential outlier robust extreme learning machine. Energy Convers Manage 2019;180:338–57.
[2] Hu JM, Wang JZ, Zeng GW. A hybrid forecasting approach applied to wind speed time series. Renewable Energy 2013;60:185–94.
[3] Sun W, Wang YW. Short-term wind speed forecasting based on fast ensemble empirical mode decomposition, phase space reconstruction, sample entropy and improved back-propagation neural network. Energy Convers Manage 2018;157:1–12.
[4] Zhou JY, Jing S, Gong L. Fine tuning support vector machines for short-term wind speed forecasting. Energy Convers Manage 2011;52(4):1990–8.
[5] Cadenas E, Rivera W. Wind speed forecasting in the south coast of Oaxaca, Mexico. Renewable Energy 2007;32(12):2116–28.
[6] Kavasseri RG, Seetharaman K. Day-ahead wind speed forecasting using f-ARIMA models. Renewable Energy 2009;34(5):1388–93.
[7] Gao MY, Wang Y, Wang YF, Wang P. Experimental investigation of non-linear multistable electromagnetic-induction energy harvesting mechanism by magnetic levitation oscillation. Appl Energy 2018;220:856–75.
[8] Li G, Shi J, Zhou JY. Bayesian adaptive combination of short-term wind speed forecasts from neural network models. Renewable Energy 2011;36(1):352–9.
[9] Yuan XH, Chen C, Yuan YB, Huang YH, Tan QX. Short-term wind power prediction based on LSSVM–GSA model. Energy Convers Manage 2015;101:393–401.
[10] Yu CJ, Li YL, Zhang MJ. An improved wavelet transform using singular spectrum analysis for wind speed forecasting based on Elman neural network. Energy Convers Manage 2017;148:895–904.
[11] Boroojeni KG, Amini MH, Bahrami S, Iyengar SS, Sarwat AI, Karabasoglu O. A novel multi-time-scale modeling for electric power demand forecasting: from short-term to medium-term horizon. Electr Power Syst Res 2017;142:58–73.
[12] Cadenas E, Rivera W. Wind speed forecasting in three different regions of Mexico, using a hybrid ARIMA–ANN model. Renewable Energy 2010;35(12):2732–8.
[13] Guo ZH, Zhao J, Zhang WY, Wang JZ. A corrected hybrid approach for wind speed prediction in Hexi Corridor of China. Energy 2011;36(3):1668–79.
[14] Han QK, Meng FM, Hu T, Chu FL. Non-parametric hybrid models for wind speed
