Nonparametric wind power forecasting under fixed and random censoring

Nonparametric wind power forecasting under fixed and random censoring

Energy Economics 84 (2019) 104520 Contents lists available at ScienceDirect Energy Economics journal homepage: www.elsevier.com/locate/eneeco Nonpa...

3MB Sizes 2 Downloads 51 Views

Energy Economics 84 (2019) 104520

Contents lists available at ScienceDirect

Energy Economics journal homepage: www.elsevier.com/locate/eneeco

Nonparametric wind power forecasting under fixed and random censoring夽 Christian M. Dahl a,∗ , Georgios Effraimidis b , Mikkel H. Pedersen a a b

University of Southern Denmark, Denmark Qualcomm, United States

a r t i c l e

i n f o

Article history: Received 30 April 2018 Received in revised form 6 June 2019 Accepted 14 September 2019 Available online 20 October 2019 JEL classification: C14 C21 C22 C24 C53 Q47

a b s t r a c t We consider nonparametric forecasting of wind power for individual wind turbines, allowing for random right censoring as well as two-sided fixed censoring. We propose a very fast estimation algorithm and show that this estimator of the unknown regression function is uniformly consistent. We argue that the key statistical features of the proposed nonparametric regression framework, such as nonlinearities and fixed and random censoring, are all needed in order to properly capture the main characteristics of wind power production functions. We show by a simulation study that the asymptotic properties of the estimator also holds in finite samples. We provide an empirical illustration comparing the forecasting accuracy of the proposed nonparametric regression model to some of the existing and popular forecasting devices used for predicting short to medium term wind power production at the individual turbine level. The empirical results are generally very encouraging. © 2019 Published by Elsevier B.V.

Keywords: Kaplan-Meier estimator Nonparametric methods Time series

1. Introduction Over a relatively short period of time, renewable energy sources have shifted from being plausible alternatives to becoming realities in national and international energy policies. The main reasons for this are well-known to all: The foreseeable future depletion of petrol reserves and the intensification of ecological policies, such as those promoted by the Kyoto Protocol. According to the International Energy Agency, in 2014 and 2015 almost 23 percent of global electricity was produced from renewable energy sources. Amongst those, wind power is the fastest growing power generating technology in Europe, the United States, and Canada, and the second largest in China. In Denmark, wind energy currently meets more

夽 We thank Niels Haldrup for useful comments and suggestions as well as seminar participants at CREATES-Aarhus University, the Asian Meeting of the Econometric Society in Singapore (2013), and the Annual Meeting of the Midwest Econometrics Group in Bloomington (2013). We thank the anonymous reviewers and the editor for their careful reading of our manuscript and their many insightful comments and suggestions. We bear the sole responsibility for all remaining errors. ∗ Corresponding author at: Department of Business and Economics, University of Southern Denmark, Campusvej 55, 5230 Odense M., Denmark E-mail address: [email protected] (C.M. Dahl). https://doi.org/10.1016/j.eneco.2019.104520 0140-9883/© 2019 Published by Elsevier B.V.

than 40 percent of the country’s electricity demand while the share is nearly 20 percent in Ireland, Portugal, and Spain. According to the European Commission’s communication “Energy Roadmap 2050”, wind power is expected to provide more electricity than any other technology in 2050. Energy market makers require reliable forecasts of the expected wind power production up to 36 hours in advance to manage their energy portfolios and to ensure the daily availability of energy to consumers while providing the best service, the lowest price, and the lowest environmental impact possible. Likewise, local wind power producers operating under “free” market conditions benefit directly from better forecasts, as deviations of expected and actual wind power production are penalized extensively. Hence, better and more efficient forecasts are likely to reduce operating costs, which is especially appealing in the current transition period when government subsidies across European power markets are being reduced or eliminated. Finally, better and more efficient forecasts will reduce total energy supply uncertainty and volatility, which is otherwise induced via increasing wind power penetration rates in the European power markets. Thus, the objective of this paper is to address all these needs by developing a new econometric model that can be used to improve

2

C.M. Dahl, G. Effraimidis and M.H. Pedersen / Energy Economics 84 (2019) 104520

the quality of short term (up to 36 h) forecasts of wind power production, both for individual wind turbines and for wind farms. We believe that existing forecasting models at the turbine specific level are not optimal because they ignore many statistically important features of wind power production, which results in imprecise and possibly inconsistent predictions. We will argue that the most important of these “neglected” features include: (a) Fixed left and right censoring of the wind power production function, which follows naturally because wind turbines are capacity constrained and do not operate frictionlessly/costlessly at low wind speeds; (b) Random right censoring of the wind power production function; for example, caused by precautionary interventions in the presence of unstable/turbulent weather systems; (c) Nonlinearities, where the functional form of the relationship between power output and it determinants is unknown; and finally (d) Heteroskedasticity or time varying volatility of wind power production, primarily caused by the difference in the stability of high and low pressure weather systems. All the above features of the sampling design are familiar and important concepts in statistical analysis. However, they have almost exclusively been applied in very different areas and, to the best of our knowledge, rarely in a unified setting. The ambition of this paper is to build a unified theoretical framework taking all these observed features of the data into account and to analyze and illustrate to what extent this will improve the short term forecasts. In particular, we propose a nonparametric regression model where fixed left or right censoring or random right censoring can occur at any point in time. Using a conditional Kaplan-Meier type estimator for the conditional cumulative distribution function, we propose an easy and fast method to compute an estimator of the unknown regression function and show that the estimator is uniformly consistent under a reasonable set of regularity conditions. The paper is organized as follows: Section 2 defines and describes the general modeling framework that we consider. In Section 3 we present the estimation procedure and in Section 4 we establish the uniform consistency of the estimator under a very general set of assumptions. Section 5 discusses the requirements for forecasting in the proposed modeling framework. In Section 6 we examine the performance of the new estimator on simulated data, while Section 7 illustrates the performance of the estimator on empirical data. Out-of-sample predictions of power output for individual wind power turbines are computed and compared to the predictions from some of the most popular forecasting tools applied in the industry. Finally, Section 8 concludes.

soring, (ii) fixed right censoring, and (iii) random right censoring. In particular, fixed left (right) censoring refers to the situation in which – when Yt∗ ≤ L (Yt∗ ≥ U) – we only observe the corresponding inequality, where the censoring given by L and U is fixed and observed with L < U. Moreover, when Yt∗ > L the output variable is sometimes randomly right censored; that is, for some observations we observe that the variable Yt∗ exceeds a certain value, Ct , with Ct to take values in the interval (L, U) with probability one. It should be emphasized that the main asymptotic results of this paper will still apply if L and U depend on t as long as L and U are both nonstochastic (fixed) and bounded uniformly in t. Dependence on t would in this case not affect the consistency results. However, in order to keep notation simple and because L and U are time invariant in the application we suppress subscript t in what follows. In mathematical notation, we observe n copies (Yt , Xt , rt : 1 ≤ t ≤ n), where Yt = min{max(min(Yt∗ , Ct ), L), U}, rt = 1(Yt∗ < Ct ) with P(L < Ct < U) = 1 and Ct ⊥⊥ Yt∗ |Xt , where ⊥⊥ denotes stochastic independence. Note that rt = 1 when Yt = L, and rt = 0 when Yt = U. Note also that Ct is never observed, only the censoring indicator, rt . The conditional independence assumption, Ct ⊥⊥ Yt∗ |Xt , is critical and plays a crucial role in the proof of Proposition 1.1 In the empirical illustration presented in Section 7 there exist two types of random censoring and they both satisfy the conditional independence assumption. The first type of random censoring typically occurs in relation to maintenance and calibration of the windpower turbines. Maintenance and calibration are normally scheduled weeks in advance, and hence this form of random censoring is not affected by the expected production. The second type of random censoring can be viewed as a production frontier function of the wind power turbine when L < Y* < U. The production frontier function of a wind power turbine is predetermined and only depends on inputs (X). In this case, the conditional independence assumptions hold true. There is a large literature on the nonparametric identification and estimation of (Xt ) in the latent-dependent-variable framework; see, e.g., Chen et al. (2005) and Lewbel and Linton (2002), as well as more recent contributions on nonparametric identification of accelerated failure times in competing risk models; see e.g., Lee and Lewbel (2013). We should point out that our model cannot be nonparametrically identified and estimated by the methods of Chen et al. (2005) and Lewbel and Linton (2002), which focus on cross-sectional data and allow only for one-sided fixed censoring, nor by the competing risk modeling framework of Lee and Lewbel (2013). 3. Estimation of the function 

2. The nonparametric censored regression model Let Yt∗ represent the (latent) output (e.g., wind power) and Xt be a vector of observed covariates that affect the realization of Yt∗ (e.g., wind speed), where the index t refers to the corresponding time period. To keep the notation simple throughout the paper, we assume that all the components of the vector Xt are continuously distributed. The inclusion of discrete covariates is straightforward and thus we omit it for notational convenience. In particular, all the main results in the sequel are valid by just conditioning on the values of the discrete components of Xt . The time series is expressed as follows: Yt∗ = (Xt ) + t ,

(1)

where we do not make any parametric assumptions regarding the function  and the distribution of the disturbance term t . Our main goal is to provide forecasts for the outcome variable Yt∗ as a function of Xt . The main challenge of the above problem is that there are three types of censoring with respect to the variable Yt∗ : (i) fixed left cen-

Our first step is to provide either a point estimate or an interval estimate for the function . Interval estimation in this context implies that for some values of Xt we will obtain an inequality for the estimate of . In the sequel, we will make use of the following zero conditional median assumption concerning the error term: Assumption 1.

P(t ≤ 0|Xt ) = 0.5.

Our estimation strategy for  is based on the fact that the quantity P(Yt∗ ≤ y|Xt ) can be nonparametrically estimated only for y ∈ [L, U]. We first provide the connection between the probability P(Yt∗ ≤ y|Xt ) and the function . Let q(Xt ):= inf {y : P(Yt∗ ≤ y|Xt ) ≥ 0.5}. That is, the quantity q(Xt ) is defined as the conditional median of the distribution of Yt∗ . By Assumption 1, if P(Yt∗ ≤ L|Xt ) > 0.5 ⇒ L − (Xt ) > 0 ⇒ (Xt ) < L,

(2)

if P(Yt∗ ≤ U|Xt ) < 0.5 ⇒ U − (Xt ) < 0 ⇒ (Xt ) > U,

(3)

1

Specifically, the condition C t ⊥ Yt∗ |Xt ensures that Eq. (A.8) reduces to Eq. (A.9).

C.M. Dahl, G. Effraimidis and M.H. Pedersen / Energy Economics 84 (2019) 104520

if

P(Yt∗

≤ L|Xt ) < 0.5 and

⇒ (Xt ) = q(Xt ),

P(Yt∗

˛-mixing property. More precisely, the mixing coefficients are defined as follows:

≤ U|Xt ) > 0.5

with q(Xt ) ∈ [L, U].

(4)

P(Yt∗ ≤ y|Xt , Yt∗ ≤ L)P(Yt∗ ≤ L|Xt ) +P(Yt∗



y|Xt , Yt∗

>

L)P(Yt∗

(5)

> L|Xt )

Consequently, ˆ t∗ ≤ y|Xt ) = P(Y ˆ t∗ ≤ L|Xt ) + P(Y ˆ t∗ ≤ y|Xt , Yt∗ > L)P(Y ˆ t∗ > L|Xt ). (6) P(Y ˆ t∗ ≤ L|Xt ) and P(Y ˆ t∗ > L|Xt ) can be obtained The quantities P(Y by means of a conventional conditional kernel estimator, whereas ˆ t∗ ≤ y|Xt , Yt∗ > L) can be derived by applying the Kaplan-Meier P(Y estimator. To proceed with the detailed description of the estimation method, we need to add some extra notation. Specifically, let ω = (ω1 , . . ., ωd ), where d is equal to the dimension of the vector Xt ,

l=d  ω 

and introduce the quantity Kh (ω) = 1d l=1 K hl . K is a bounded h and integrable kernel of order two, which is either Lipschitz continuous or has a bounded derivative with integrable tail, see, e.g., Hansen (2008). The bandwidth parameter h converges to zero as the sample size n goes to infinity. Moreover, we will use the counting process Nt (y) = 1(Yt ≤ y, rt = 1) and the local weights Kh (Xt − Xi )

n

i=1

Kh (Xt − Xi )

n 

n 

(7)

=1−

1−

i=1

Proposition 1. Suppose that the probability density functions f(Xt ) and f(X0 , X ) are bounded for all t > 0 and all ≥ t for some t . In addition, suppose that the second derivatives (w.r.t. Xt ) of f(Xt ) and P(Yt∗ ≤ y|Xt )f (Xt ) are continuous. Then, ˆ t∗ ≤ y|Xt ) − P(Yt∗ ≤ y|Xt )| = Op |P(Y

n =1

wi (Xt )1(Yi > L) 1(Y ≥ Yi )1(Y > L)w (Xt )

Ni (y) .

(8)

Accordingly, the estimator of q(Xt ) is obtained by using the following formula: ˆ t∗ ≤ y|Xt ) ≥ 0.5}. qˆ (Xt ) = inf {y : P(Y



ln n nhd



+ h2 .

Furthermore, suppose that the probability density function f (Yt∗ |Xt ) is ˆ t∗ ≤ L|Xt ) < 0.5 and P(Y ˆ t∗ ≤ continuous w.r.t. to both arguments, P(Y 3 U|Xt ) > 0.5. Then, ln n nh

d



+ h2

.

5. Forecasting We are primarily interested in the forecast of Yt∗ , and thus we will restrict our attention to the estimation of the quantity E(Yt∗ |Xt ). For this purpose we need to impose an extra condition on the distribution of the error term. More precisely, we will need the following assumption:

and for all y ∈ [L, U] ˆ t∗ ≤ y|Xt , Yt∗ > L) P(Y

For the bandwidth

sequence we assume that (nh ) log n = o(1). Note that the conditions for our problem are simpler than those of Hansen (2008), as we set the parameter s equal to ∞. The reason for having s =∞ is that our goal is to derive the uniform convergence result for a distribution function, which implies that the dependent variable in the nonparametric estimation is an indicator function; that is, for every value of s the expectation of the dependent variable will be finite (in fact, less than one). Let X ⊂ Rd and Y ⊂ [L, U] be compact sets. We assume throughout the paper that P(min(Yt∗ , Ct ) > U − |Xt ) > 0 for any  > 0, Xt ∈ X. In the sequel, we will use the generic symbol f to denote the density of the corresponding random variable (or vector). The following result establishes the uniform convergence rate of the nonparametric estimators. Its proof is presented in Appendix A.2

Xt ∈ X

wi (Xt )1(Yi > L),

ˇ−1−(d/)−d . ˇ+3−d

d −1

sup |ˆq(Xt ) − q(Xt )| = Op

i=1

 n 

log n = o(1), where =



ˆ t∗ > L|Xt ) P(Y

i=1

=

−1

y ∈ Y,Xt ∈ X

wi (Xt )1(Yi ≤ L),

|P(A ∩ B) − P(A)P(B)| = O(m−ˇ ),

where M1 and M∞ +m denote the -field generated by {Wt : 1 ≤ t ≤ } and {Wt :  + m ≤ t}, respectively. To establish the uniform convergence rate of the estimators, we should impose some conditions regarding the parameter ˇ as well as the bandwidth sequence, h. Following Hansen (2008), we require the existence of a parameter, , such that ˇ > 1 + (d/) + d and

sup

.

Using the nonparametric estimator for the conditional probability function proposed by Li and Racine (2006), p. 182, we write ˆ t∗ ≤ L|Xt ) = P(Y

sup

A ∈ M ,B ∈ M∞ +m

(n hd )

= P(Yt∗ ≤ L|Xt ) + P(Yt∗ ≤ y|Xt , Yt∗ > L)P(Yt∗ > L|Xt ).

wi (Xt ) =

˛m :=

1

The “if” statements in the above expressions are mutually exclusive, and consequently only one of these three will hold for some Xt . In view of the above, it is clear that once we estimate the quantity P(Yt∗ ≤ L|Xt ), we can proceed with the (point) estimation of the function . We have for y ∈ [L, U] that P(Yt∗ ≤ y|Xt ) =

3

(9)

By making use of (2)–(4), (6), and (9), we can obtain an (point) estimate for the function . Note that the point estimate for a certain value of Xt is derived in our problem if and only if there exists y¯ ˆ t∗ ≤ y|X ¯ t ) = 0.5. In that case, obviously qˆ (Xt ) = y. ¯ Note such that P(Y that qˆ (Xt ) can be interpreted as a conditional median estimator. 4. Consistency Define Wt := (Yt , Xt , rt ). We first impose the condition that the sequence {Wt : 1 ≤ t ≤ n} is strictly stationary and satisfies the

Assumption 2.

E(t |Xt ) = 0.

A sufficient condition for Assumptions 1 and 2 to be true is the symmetry around zero of the distribution of t . Assumption 2 implies (Xt ) = E(Yt∗ |Xt ). Therefore, by virtue of Assumptions 1 and 2 and the relationships (2)–(4), we get if P(Yt∗ ≤ L|Xt ) > 0.5 ⇒ E(Yt∗ |Xt ) < L,

(10)

2 Lecoutre and Ould-Said (1995) calculate the convergence rate for the KaplanMeier estimator under the assumption that the dependence pattern of the data is alpha-mixing. However, as we explain in Appendix A (before Lemma 1), our convergence rate is faster than the rate derived by Lecoutre and Ould-Said (1995). 3 The conditions of Proposition 1 require continuity and boundness and not uniform continuity and uniform boundness as in the case of Hansen (2008). The reason for using this slightly weaker requirement is that we consider fixed compact sets (and thus uniform continuity and boundness automatically hold) for our asymptotic analysis.

4

C.M. Dahl, G. Effraimidis and M.H. Pedersen / Energy Economics 84 (2019) 104520

if P(Yt∗ ≤ U|Xt ) < 0.5 ⇒ E(Yt∗ |Xt ) > U, if P(Yt∗ ≤ L|Xt ) < 0.5 and ⇒ E(Yt∗ |Xt ) = q(Xt ),

(11)

P(Yt∗ ≤ U|Xt ) > 0.5

with q(Xt ) ∈ [L, U].

(12)

6. Simulation In this section we present a small Monte Carlo study in order to analyze and illustrate the finite sample properties of qˆ . We have designed the simulation study specifically to mimic some of the most salient empirical features of wind power curves as already discussed in the introduction. The importance of these features for empirical wind power curve modeling will be discussed further in the empirical application of Section 7. To be specific, we assume that wind speed is given according to Xt ∼N(8, 9|0 ≤ Xt ≤ 15). The latent power output of a wind turbine is given as Yt∗ = (Xt ) + t , where  is specified as a ninth order



polynomial, and t ∼N(0, 0.15 Xt ). The ninth order polynomial and the degree of heterogeneity in t have been carefully calibrated to match observed data and the shape of the empirical power curves of Section 7. In all the simulations and in the empirical illustration we have specified the Gaussian kernel for Kh ( · ). We consider three types of censoring: (i) Fixed left censoring when Yt∗ ≤ 0, (ii) fixed right censoring when Yt∗ ≥ 3.075, and (iii) random right censoring, which occurs between the fixed censoring points. In this context we view the random censoring as an upper limit on the output of the turbine that exists for all intermediate values of wind speed. In particular, we consider three cases of random censoring. The first case is without random censoring. In the second case we let the random censoring cut-off value be increasing in the wind speed. We will refer to this case as heteroskedastic random censoring. To be precise we let Ct be a function of Xt . It would be interesting to analyze under which conditions this function, which can be interpreted as the efficient production frontier function, could be identified and estimated. This, however, is outside the scope of the present paper. In the third case the random censoring is as in the second case but tighter. In this case we define the boundary such that the turbine can be viewed as operating very close to its maximal output for all intermediate values of wind speed. Consequently, we will refer to this case as tight random censoring. In all cases we sample 10,000 observations, using 80 percent of the sample for estimation and 20 percent for out-of-sample forecasting. The number of independent Monte Carlo replications equals 100. We consider the estimator qˆ as well as a kernel estiˆ which we will refer to as the empirical power curve. mator k, This empirical power curve, computed according to the “method of bins” using 0.5 m/s bins, is the relevant benchmark when computing and comparing power curves according to International Electrotechnical Commission (IEC) Standards, see, e.g., page 21 in IEC Standards (2005), which can be found at http://www.iec.ch for computational details. This method-of-bins approach suggested by the IEC turns out to be very simply transformed into a traditional Nadaraya-Watson kernel regression estimator/forecaster using a uniform kernel with a proper bandwidth selection in order to mimic the 0.5 m/s bins. The estimator qˆ requires that the indicator variable of censorship is always observable. Therefore, a comparison should be made with an estimator that simply ignores those random censored observations. If random censorship only happens very rarely, it is necessary to know if a common practice of discarding these observations introduces some noticeable bias. It might be that random censorship does not have the same impact in the present

applications as, for instance, in medicine studies. The comparison will be made with estimators (counterparts) that are computed based on the same kernel weights (w(Xt )) and/or bins. Consequently, we compare the estimators qˆ , kˆ with their counterparts qˆ NORC and kˆ NORC where random censored observations are simply discarded. In order to evaluate the accuracy of the estimators we will consider five alternative measures of forecasting accuracy. In addition to Root Mean Squared Error (RMSE) and Mean Absolute Deviation (MAD) we will report the Scale-independent Mean Absolute Prediction Error (SMAPE) (Armstrong, 1985), as well as the Geometric Mean Relative Absolute Error (GMRAE) of Armstrong and Collopy (1992) and Fildes (1992) and the Unscaled Mean Bounded Relative Absolute Error (UMBRAE) recently suggested by Chen et al. (2017). We define the measures based on relative absolute errors, i.e., GMRAE and UMBRAE, relative to qˆ . A reported value of, say, GMRAE, where GMRAE is larger than unity indicates that the estimator kˆ is less accurate than qˆ . A reported measure of UMBRAE larger than unity has the interpretation that the estimator kˆ is (UMBRAE-1)*100 percent less accurate than qˆ . The SMAPE, GMRAE, and UMBRAE are all included, as they have been shown to be very robust against outliers; see Chen et al. (2017). Special attention to outliers is very important, as they are commonly observed in wind turbine data. Summary statistics of all the predictive accuracy measures are presented in Table 1 for the case of no ramdom censoring. From ˆ Table 1 it is evident that qˆ is performing marginally better than k. The NORC estimators qˆ NORC and kˆ NORC are included only to illustrate ˆ respectively, when and emphazise that they are identical to qˆ and k, there is no censoring. Table 2 shows that qˆ is uniformly more accurate than kˆ across all the reported wind speed ranges regardless of the predictive accuracy measure under consideration. Furthermore, it is evident that the degree of fixed and random censoring clearly affects the relative forecasting performance of qˆ . Based on the UMBRAE measure we can quantify the relative improvement to about 10 percent when wind speeds are in the m/s ≥10 interval (this is the interval in which fixed censoring most frequently occurs) and about 5 percent when wind speeds are in the 5-10 m/s interval (this is the interval in which (tight) random censoring most frequently occurs). The simulation results are also clear in pointing out that it can be very costly in terms of loss of accuracy to ignore random censoring. More precisely, qˆ is about 8–9 percent more accurate than qˆ NORC and 14–16 percent more accurate than kˆ NORC based on GMRAE and UMBRAE (for all wind directions). Specifically, when wind speeds are in the 5-10 m/s interval qˆ , is 13–14 percent and 19–22 percent more accurate than qˆ NORC and kˆ NORC , respectively. In Fig. 1, for illustrational purposes, we have summarized and plotted the results from one single realization of the sample in each of the three cases as well as the true underlying function, i.e., . Figure 1a depicts the data in the case without random censoring. In this case, kˆ is generally over-smoothing and fails to accurately capture the true curvature of , in particular for wind speeds close to the fixed censoring cut-off points (a feature that is also evident in Fig. 1b and c). On the contrary, the qˆ estimator very precisely fits the underlying shape of the power curve and also very successfully fits the “kinks”; i.e., the points where the latent power curve, Y*, crosses the known censoring points, L and U (since we do observe the latent power curve, we do not observe these crossing points/kinks, not even when L and U are known). As there is no censoring in this case, qˆ and kˆ are identical to qˆ NORC and kˆ NORC , respectively. In Fig. 1b, the case of heteroskedastic random censoring, qˆ again performs very well, while kˆ is marginally worse than in the uncensored case. Although censoring is present in this case there are no apparent visual differences between qˆ and qˆ NORC and kˆ and kˆ NORC . Finally, in Fig. 1c the accuracy of qˆ seems unaffected by the presence of

C.M. Dahl, G. Effraimidis and M.H. Pedersen / Energy Economics 84 (2019) 104520

5

Table 1 ˆ qˆ NORC , and kˆ NORC when there is no random censoring in the population. The reported results are based Without random censoring: simulation results on the accuracy of qˆ , k, on sample sizes partitioned as (n1 , n2 ) = (8000, 2000), where n1 is used for estimation and n2 is used to evaluate predictive accuracy only. The number of replications equals M = 100. The terms RMSE (root mean squared errors), MAD (mean absolute deviation), SMAPE (Scale-independent Mean Absolute Prediction Error), GMRAE (Geometric Mean Relative Absolute Error) and UMBRAE (Unscaled Mean Bounded Relative Absolute Error) are the measures of predictive accuracy. GMRAE and UMBRAE are measured relative to qˆ , and if the reported measures are larger than unity the estimator is less accurate. Without random censoring

All qˆ kˆ qˆ NORC kˆ NORC m/s <5 qˆ kˆ qˆ NORC kˆ NORC 5≤ m/s <10 qˆ kˆ qˆ NORC kˆ NORC m/s ≥ 10 qˆ kˆ qˆ NORC kˆ NORC

RMSE

MAD

SMPAPE

GMRAE

UMBRAE

0.4523333 0.4705601 0.4523333 0.4705601

0.3614208 0.3750885 0.3614208 0.3750885

1.1148366 1.8399758 1.1148366 1.8399758

– 1.035157 1.0000000 1.035157

– 1.033404 1.000000 1.033404

0.4525611 0.4658218 0.4525611 0.4658218

0.3851477 0.3982945 0.3851477 0.3982945

2.0000000 4.4809360 2.0000000 4.4809360

– 1.045752 1.000000 1.045752

– 1.044559 1.000000 1.044559

0.4152378 0.4178494 0.4152378 0.4178494

0.3298512 0.3322265 0.3298512 0.3322265

1.2893317 1.8499373 1.2893317 1.8499373

– 1.007523 1.000000 1.007523

– 1.006948 1.000000 1.006948

0.5311782 0.5807335 0.5311782 0.5807335

0.4226358 0.4638871 0.4226358 0.4638871

0.1348378 0.1485684 0.1348378 0.1485684

– 1.097932 1.000000 1.097932

– 1.092777 1.000000 1.092777

Table 2 ˆ qˆ NORC , and kˆ NORC when there is heteroskedastic and tight random censoring, respectively, in the population. Random censoring: simulation results on the accuracy of qˆ , k, The reported results are based on sample sizes partitioned as (n1 , n2 ) = (8000, 2000), where n1 is used for estimation and n2 is used to evaluate predictive accuracy only. The number of replications equals M = 100. The terms RMSE (root mean squared errors), MAD (mean absolute deviation), SMAPE (Scale-independent Mean Absolute Prediction Error), GMRAE (Geometric Mean Relative Absolute Error) and UMBRAE (Unscaled Mean Bounded Relative Absolute Error) are the measures of predictive accuracy. GMRAE and UMBRAE are measured relative to qˆ , and if the reported measures are larger than unity the estimator is less accurate. Heteroskedastic random censoring

All qˆ kˆ qˆ NORC kˆ NORC m/s <5 qˆ kˆ

Tight random censoring

RMSE

MAD

SMPAPE

GMRAE

UMBRAE

RMSE

MAD

SMPAPE

GMRAE

UMBRAE

0.4440704 0.4660459 0.4447884 0.4670603

0.3529547 0.3689200 0.3536481 0.3698112

0.7121128 1.2156893 0.6196462 1.0303540

– 1.0435784 1.0024604 1.0465425

– 1.039412 1.002586 1.042220

0.4441517 0.4733164 0.4756953 0.5054773

0.3525447 0.3751544 0.3807180 0.4048958

0.7603902 0.8231915 0.6137903 0.7264602

– 1.0658405 1.0912401 1.1638456

– 1.0602840 1.0788974 1.1414947

0.3843992 0.3912779 0.3849103 0.3874539

0.3198871 0.3265114 0.3203541 0.3228222

3.4449672 6.6667594 2.8748188 5.3795910

– 1.0263106 1.0032567 1.0128609

– 1.024152 1.002887 1.011470

0.3868950 0.3868696 0.3951865 0.3914786

0.3213280 0.3213761 0.3291217 0.3257155

3.7694553 3.9022478 2.5313332 3.2034047

– 0.9990476 1.0314406 1.0185212

– 0.9996865 1.0270443 1.0159585

0.3306629 0.3368176 0.3317022 0.3392786

0.2341782 0.2266192 0.2288001 0.2537139

– 1.0178369 1.0032698 1.0262303

– 1.015389 1.003578 1.023326

0.4159362 0.4391475 0.4689036 0.4945784

0.3302763 0.3489807 0.3753886 0.3976304

0.2229602 0.2867228 0.3044382 0.3092977

– 1.0601455 1.1479936 1.2221223

– 1.0545153 1.1277022 1.1882746

0.4278923 0.4735292 0.4278923 0.4735292

0.1339655 0.1490348 0.1339655 0.1490348

– 1.1204043 1.0000000 1.1204043

– 1.110390 1.000000 1.110390

0.5357742 0.5898234 0.5357742 0.5898234

0.4269198 0.4735715 0.4269198 0.4735715

0.1336296 0.1490371 0.1336296 0.1490371

– 1.1257830 1.0000000 1.1257830

– 1.1160408 1.0000000 1.1160408

qˆ NORC qˆ NORC 5 ≤ m/s < 10 qˆ 0.4160417 kˆ 0.4246560 0.4172012 qˆ NORC kˆ NORC 0.4274338 m/s ≥ 10 qˆ 0.5356504 kˆ 0.5891569 qˆ NORC 0.5356504 qˆ NORC 0.5891569

very tight random censoring and still fits  well, while the bias of kˆ becomes very visible. In this case there is a very clear visual difference between qˆ and the remaining estimators confirming the importance of accounting for random censoring. To sum up, the results reported in Table 1 and in Fig. 1a–c all confirm Proposition 1, i.e., consistency of qˆ in the presence of fixed and possibly substantial random censoring. Overall, the simulation results are very encouraging and suggest that the theoretical results are likely to hold in finite samples.

7. Empirical illustration There exists a long tradition in forecasting wind power production, particularly from parametric models, see e.g., Nielsen et al. (2001) and Sanchez (2006). A common feature of these traditional parametric models is that the censored nature of the power output is ignored. Recently, however, Pinson (2012) has proposed a parameterized generalized linear model accounting explicitly for two-sided fixed censoring. Pinson’s model is dynamic and is designed for very short term forecasting, i.e., 10 min ahead. Furthermore, the conditional mean function of the model proposed by Pinson (2012) is assumed to be linear in the non-censored part of

6

C.M. Dahl, G. Effraimidis and M.H. Pedersen / Energy Economics 84 (2019) 104520

Fig. 1. Plots of the three different simulation scenarios (a) Without random censoring, (b) Heteroskedastic random censoring and (c) Tight random censoring.

the power output range, which seems rather restrictive. Jeon and Taylor (2012) consider a bivariate vector autoregressive moving average-generalized autoregressive conditional heteroskedastic model for characterizing the association between wind power and

wind direction. In addition, they use a conditional kernel density estimator for modeling the dependence between wind power and wind speed allowing for the general nonlinear relationship suggested by the power curve in Fig. 2a Our analysis will be based on data from ten (at the beginning of the sampling period) newly installed Vestas v112–3.0 Mw wind turbines all located within the same wind farm on the western shoreline of Jutland, Denmark.4 We have restricted our data in the following ways. Due to a small number of observations, we have excluded observations of extreme wind speeds, defined as wind speeds higher than or equal to 15 m/s. Secondly, we have excluded observations of wind speeds higher than 2.5 m/s but with an associated output exactly equal to 0. We have decided to exclude these observations after conferring with data specialists and receiving confirmation that this is due to turbines being switched off for service checks typically according to a predetermined schedule following the manufacturers recommendations. The estimator qˆ is affected very modestly by the presence of these observations in the sample but they have large impacts on kˆ making the comparison between the two less informative. Lastly, we have truncated the output to 0 Mw and 3.075 Mw for observations with negative outputs and observations with outputs greater than 3.075 Mw, respectively, since these are the maximum and minimum theoretical outputs for the wind turbines. In relation to identifying censoring in the data, we have set U and L to the theoretical upper and lower bounds of wind power outputs of the turbines; i.e., 3.075 Mw and 0 Mw, respectively. Regarding identification of random censoring we have set Ct to be the 99th percentile of outputs observed conditional on wind speed. Further, wind power production is frequently capped in periods with high production, implying either that individual turbines are switched off or that the upper bound of production is set to be less than the theoretical maximum. We have accounted for this by setting rt equal to 0 if an exact output (five significant ciphers) is observed more than four times with a wind speed higher than 11.8 m/s. In order to limit the analysis we will focus on four randomly selected turbines denoted Units 16, 18, 21, and 24. For all turbines we will be using data on power output, wind speed and wind direction sampled with a 10-min frequency from December 11th 2011 to December 23rd 2012. The Panel b of Fig. 2 illustrates a time series plot of power output and wind speed for Unit 16, while the top panel depicts the power curve. The presence of fixed censoring at 0 Mw and 3 Mw is pronounced in both panels. From Panel a the random right censoring is also quite evident in the form of the 3–4 vertical clusters in the 2-3 Mw range. Random right censoring of this nature occurs, for example, in periods when the turbines are calibrated or tested but can also occur when turbines are preventively halted due to unstable and volatile weather conditions with severe turbulence and/or highly gusty winds from abruptly changing directions. From Fig. (2 a) the type of random censoring referred to as heteroskedastic/tight random censoring in Section 6 and depicted in Fig. (1 b) also seems to be present in the sample. The significance of the nonlinear relationship between power output and wind speed, particularly for moderately low to moderately high wind speeds, is unambiguous from Fig. (2 a). In theory, this power curve relation would be cubic polynomial, but outside wind tunnels the shape of this relationship is unknown and should therefore be estimated under a minimal set of shape assumptions. Finally, and perhaps not so evident from Fig. 2, the variance of power output is characterized by being time dependent. This form of heteroskedasticity is primarily caused by differences in the

4

The exact location/identification of the wind turbines is undisclosed.

C.M. Dahl, G. Effraimidis and M.H. Pedersen / Energy Economics 84 (2019) 104520

7

Fig. 2. Power curve (panel a) and power output (panel b) for the Vestas v112–3.0 Mw Turbine labeled Unit 16. In the power curve (panel a) the blue line is the IEC PC (theoretical) power curve, which is supplied by the manufacturer (e.g., Vestas A/S). The data is sampled with a 10-minutes frequency. The sampling period for the time series plot of the power output and wind speed (panel b) is May 15th, 2012–May 29th, 2012. The sample period for the power curve for Unit 16 is December 11th, 2011–December 23rd, 2012.

stability of high and low pressure weather systems and their persistence at the location of the turbines over the time period under consideration. All of the above features of the sampling design, i.e., censoring, nonlinearities, and heteroskedasticity, have to the best of our knowledge never previously been applied in a unified framework when it comes to the modeling and forecasting of wind power output. Importantly, all these characteristics are allowed in the nonparametric censored model approach we are proposing. In order to evaluate the quality of the estimated power curves we will use two alternative benchmark forecasting approaches. The first benchmark forecasting approach is the theoretical power curve for the Vestas v112–3.0 Mw wind turbines. We will denote this power curve by p(Xt ) or simply p. The theoretical power curve according to the IEC standards is in this case supplied by Vestas in their sales prospects.5 Many resources are allocated to pre-

5 Data on the theoretical power curves were supplied by the wind park operator/owner.

cisely uncover and document the theoretical power curve. In this case, Vestas A/S installed and operated a meteorological station at the wind park where all the units used in this study are situated. Although the theoretical power curve only gives a binary relationship between power output and wind speed and does not include, for example, wind direction or other relevant input factors, we consider the theoretical power curve as a potential very strong benchmark model. This is in particular because the wind turbines in our sample are newly installed at the beginning of the sample period and therefore more likely to behave according to their technical specifications. The theoretical power curve is probably also the most widely used wind power output forecasting device in the industry. The second benchmark forecasting approach we will be using, ˆ t ) or simreferred to as the empirical power curve and denoted k(X ˆ has already been introduced in Section 6. ply k, As a general rule, for all subsequent results we will apply 80 percent of the available data for estimation and use the remaining 20 percent for out-of-sample forecasting (validation). For the nonpara-

8

C.M. Dahl, G. Effraimidis and M.H. Pedersen / Energy Economics 84 (2019) 104520

Fig. 3. The unconditional power curves for Units 16, 18, 21, and 24: Actual and Fitted. qˆ represents the fit from the nonparametric censored regression modeling approach, p denotes the fit generated from the power curve supplied by the manufacturer (e.g., Vestas A/S), while kˆ represents the fit from a kernel based (binned) method as dictated by the IEC standards for computing and reporting power curves. The sample period is December 11th, 2011–December 23rd, 2012. Data are sampled on 10-minutes frequency.

metric censored regression we will use a simple grid-search based cross validation approach for empirical bandwidth selection. 7.1. Results In Fig. 3, the fitted power curves are depicted for wind turbine Units 16, 18, 21, and 24. The power curves are unconditional in the sense that all wind directions are considered when the power curves are fitted. For wind speeds up to about 6–7 m/s the three depicted power curves are almost similar. However, for wind speeds exceeding 7–8 m/s the fitted curves given by kˆ tend to lie uniformly below the theoretical power curves and the nonparametric censored regression model based power curves represented by p and qˆ , respectively. For wind speeds in this range, the estimator given by kˆ seems to be biased downwards due to censoring and the presence of relatively many outliers. The presence of outliers is particularly visible for turbine Unit 21. It is also very noticeable

that, except for Unit 18, the fitted power curves given by p and qˆ seem to be very similar, also for wind speeds exceeding 7 m/s. This observation is important because the wind turbines, as mentioned previously, are newly installed and therefore likely to honor the production specifications given by the manufacturer. Hence, the power curve given by p could be interpreted as being very close to the true underlying production function. If one is willing to make this interpretation, the accuracy of the estimator qˆ in the presence of heavy censoring and many large outliers is indeed very encouraging. In Fig. 3, the so-called quality bands are introduced. The quality bands are constructed using the feature that for each given wind speed recorded at the one-digit level there are repeated measurements of power output available. Hence, the high quality bands are estimated by qˆ on the paired set of observations of power output and wind speed, where the observed power output for a given wind speed is in the range from the 75th to the 100th percentiles. Like-

C.M. Dahl, G. Effraimidis and M.H. Pedersen / Energy Economics 84 (2019) 104520

9

Fig. 4. Power curves for Unit 16 conditional on wind directions: Actual and Fitted. qˆ represents the fit from the nonparametric censored regression modeling approach, p denotes the fit generated from the power curve supplied by the manufacturer (e.g., Vestas A/S), while kˆ represents the fit from a kernel based (binned) method as dictated by the IEC standards for computing and reporting power curves. The sample period is December 11th, 2011–December 23rd, 2012. Data are sampled on 10-min frequency.

wise, the low quality band is estimated using the sample of paired sets of observations, where the power output for a given wind speed is in the range from 0 through the 25th percentile. These quality bands can loosely be interpreted as empirical confidence intervals associated with the estimator, qˆ . In Fig. 4, we have zoomed in on Unit 16. It is well documented, for example due to wind direction specific wake effects, that the wind direction can influence the conditional distribution of power output given wind speed. For this reason we divide the sample according to four categories: North, east, south, and west. Fig. 4 illustrates the theoretical and fitted power curves conditional on wind direction for Unit 16. It should be noted that the predominant wind directions at the western shores of Jutland are south and west which is also confirmed in our data. The sample size equals about 18,000 observations for each of these wind directions. For the wind directions north and east, the sample size equals 3000 and 8000 observations, respectively. Again, the power plots illus-

trate that the kˆ estimator seems to be strongly downward biased for wind speeds exceeding 8 m/s. It is also noticeable that across a much broader range of wind speeds there is now visible separation of the theoretical power curve p and the fitted power curves given by qˆ . Surprisingly, for wind speeds in the interval 8–12 m/s, the p curve appears to be noticably upward biased for wind direction north and also somewhat downward biased for wind direction south. In these two cases the fit of the qˆ estimator seems far more convincing and in accordance with the observed data. In Table 3 all the measures of out-of-sample forecasting accuracy, introduced and discussed in Section 6, are reported conditional on four wind directions, i.e., north, east, south and west, and for three wind speed intervals given by m/s < 5, 5 ≤ m/s < 10, m/s ≥ 10 . To condense the results we define Units 4 to include the pooled observations for the turbine Units 16, 18, 21, and 24 and Units All to be the pooled observations

10

C.M. Dahl, G. Effraimidis and M.H. Pedersen / Energy Economics 84 (2019) 104520

Table 3 Out-of-sample forecasting accuracy measured conditional on wind speed and wind directions for Units 4, which are Units 16, 18, 21, and 24 pooled and Units All, which are pooled observations for all ten units for which we have available data. For further details, see caption to Table 1. Windspeed

Units all

Units 4

RMSE

MAD

SMPAPE

GMRAE

UMBRAE

RMSE

MAD

SMPAPE

GMRAE

UMBRAE

0.27857075 0.31978147 0.26695479 0.27677192 0.26745249

0.13370748 0.15094015 0.14276966 0.13396140 0.14361845

0.2252147 0.4270091 0.3890283 0.2253732 0.3898305

– 1.0785592 1.1754076 1.0507074 1.1743886

– 1.2838956 1.3248285 1.0343705 1.3211034

0.31928921 0.34749702 0.30136233 0.31770945 0.30103772

0.13481361 0.14649668 0.15029662 0.13491447 0.14834896

0.2365974 0.4475144 0.4207440 0.2367341 0.3851660

– 1.0518039 1.2276085 1.0379827 1.2151082

– 1.284563 1.382055 1.021699 1.319133

qˆ NORC kˆ NORC

0.28777515 0.34673622 0.27191405 0.27908606 0.27434516

0.17215393 0.19355959 0.17010113 0.16948764 0.17249968

0.1870559 0.3012729 0.2741828 0.1862601 0.2697069

– 0.9502159 1.0362860 0.9871101 1.0391306

– 1.1142220 1.1200505 0.9914627 1.1140691

0.27408563 0.29596128 0.26399264 0.27247846 0.26554800

0.14504113 0.15284401 0.15435570 0.14632349 0.15708928

0.177602 0.2892413 0.2707445 0.1781866 0.2466219

– 1.0005394 1.2104484 1.0728710 1.2342248

– 1.123713 1.246207 1.045212 1.219894

South qˆ p kˆ qˆ NORC kˆ NORC

0.25053895 0.25012914 0.24247682 0.24904346 0.24264879

0.11983606 0.12877111 0.13105992 0.11995688 0.13389944

0.1596650 0.2726401 0.2394944 0.1597871 0.2372018

– 1.1680075 1.3449340 1.0334036 1.3744584

– 1.2849077 1.3514510 1.0398096 1.3586799

0.27899385 0.27857512 0.26834432 0.27666766 0.26960777

0.13288346 0.14115713 0.15107707 0.13346146 0.15645293

0.1756613 0.2892271 0.2596580 0.1759551 0.2580902

– 1.0946202 1.4183818 1.0688653 1.4595750

– 1.216761 1.358642 1.064554 1.366393

0.27355291 0.27231629 0.26564333 0.27232601 0.26650171

0.12525344 0.13229538 0.14052208 0.12544515 0.14632446

0.1336087 0.2183450 0.1972174 0.1337027 0.1938477

– 1.1624073 1.5005088 1.0432735 1.5695960

– 1.2686195 1.4009210 1.0446359 1.4169629

0.35517990 0.35270619 0.33881891 0.35291186 0.34039361

0.15099104 0.15891833 0.17423581 0.15144729 0.18642006

0.1530643 0.2287770 0.2206987 0.1532509 0.2196457

– 1.1665305 1.7044302 1.1129428 1.7987468

– 1.246367 1.444742 1.086928 1.454595

0.03982538 0.04374695 0.03982410 0.03981697 0.03979342

0.02511232 0.02902506 0.02543384 0.02511161 0.02537819

0.3329697 0.7768044 0.6825965 0.3331148 0.6676882

– 1.2634470 0.9808472 1.0001024 0.9863231

– 1.7727119 1.5622078 1.0003476 1.5387411

0.04333598 0.04675944 0.04330585 0.04335720 0.04328851

0.02711763 0.03010611 0.02751710 0.02712217 0.02741740

0.3482088 0.8108048 0.7260409 0.3483912 0.7105141

– 1.1836459 0.9795806 1.0048510 0.9862265

– 1.742705 1.617986 1.004806 1.590164

0.25043717 0.24613598 0.24410838 0.24979432 0.24410140

0.16288920 0.16387099 0.16553522 0.16287798 0.16573555

0.1483506 0.1496418 0.1500409 0.1483731 0.1502556

– 1.0426073 1.0531370 1.0049951 1.0568934

– 1.0367273 1.0465732 1.0043569 1.0498563

0.26326242 0.25808855 0.25780884 0.26269858 0.25777535

0.16295140 0.16619782 0.16655675 0.16296106 0.16631498

0.1491283 0.1516516 0.1512869 0.1491602 0.1512281

– 1.0607585 1.0594133 1.0041207 1.0568587

– 1.050726 1.048612 1.003628 1.047569

0.48439778 0.49043115 0.45410144 0.47925857 0.45531656

0.23626543 0.23799219 0.29350827 0.23730011 0.31032939

0.1004352 0.1009514 0.1208918 0.1008476 0.1269191

– 0.9423356 3.4959810 1.2693746 3.8695376

– 1.1211508 2.2118707 1.2469735 2.3002233

0.56183255 0.56459297 0.52850459 0.55535846 0.53080563

0.24963658 0.25104231 0.31778557 0.25146513 0.34804982

0.1119594 0.1123999 0.1366807 0.1127060 0.1477889

– 0.9926977 4.4465190 1.5283289 5.1200561

– 1.114907 2.313395 1.371175 2.415162

North qˆ p kˆ qˆ NORC kˆ NORC East qˆ p kˆ

West qˆ p kˆ qˆ NORC kˆ NORC m/s < 5 qˆ p kˆ qˆ NORC kˆ NORC 5 ≤ m/s < 10 qˆ p kˆ qˆ NORC kˆ NORC m/s ≥ 10 qˆ p kˆ qˆ NORC kˆ NORC

for all ten turbine units for which we have available data. Similar results for the individual turbine Units 16, 18, 21, and 24 are reported in Appendix B, while the results for the remaining individual turbines can be obtained from the authors upon request. The main findings, based on Table 3, are as follows:



• • Generally (except for two cases), the predictive accuracy of qˆ is superior to both kˆ and p for all four wind directions based on the robust accuracy measures MAD, SMAPE, GMRAE, and UMBRAE. This holds true for both Units 4 and Units All. • Generally and across units (except for one case), the predictive accuracy of qˆ is superior to qˆ NORC for all four wind directions based on the robust accuracy measures GMRAE and UMBRAE. • Based on the UMBRAE measure the relative improvement of qˆ over kˆ and p ranges from about 10 percent when the wind directions is east to no less than 22 percent when the wind direction is north, south, and west. • Based on the UMBRAE measure the relative improvement of qˆ over qˆ NORC ranges from about −1 percent when the wind direction is east to almost 9 percent when the wind direction is west (for Units 4). • Generally, the predictive accuracy of qˆ is superior to both kˆ and p when conditioning on wind speed intervals. This result holds





for both Units 4 and Units All for three out of the four robust accuracy measures; i.e., MAD, SMAPE, and UMBRAE. Generally, the predictive accuracy of qˆ is superior to qˆ NORC when conditioning on wind speed intervals. This result holds for both Units 4 and Units All based on GMREA and UMBRAE. As Fig. 3 illustrates, the difference in predictive accuracy among the estimators is limited when wind speeds are within the 5 ≤ m/s < 10 interval. improvement of qˆ Based on the UMBRAE measure the relative   over kˆ is more than 200 percent when m/s ≥ 10 . In this wind speed range the relative improvement of qˆ over p is about 6–8 percent. Based on the UMBRAE measure the relative improvement of  qˆ over qˆ NORC  is in the range of 25–37 percent when 5 ≤ m/s < 10 .

Again it is important to accentuate that contrary to the forecasting approaches based on kˆ and p, the nonparametric censored regression approach represented by qˆ is multivariate. Hence, the introduction of other important factors in addition to wind direction – for example temperature, humidity, wind stability – might improve the relative forecasting accuracy of qˆ even further. Work

C.M. Dahl, G. Effraimidis and M.H. Pedersen / Energy Economics 84 (2019) 104520

in this direction is ongoing. It would also be interesting to compare relative forecasting accuracy to “older” wind turbines. In this case the performance of p might be “outdated” due to wear-and-tear on the turbines. We believe this would work significantly in favor of improved relative forecasting accuracy of the nonparametric censored regression model approach. Our results also illustrate that ignoring random censoring by discarding censored obervations can be a very costly strategy that leads to significantly less accurate forecasts.

y

and the respective integrated hazard rate (y|Xt ):= 0 (u|Xt )du. The integrated hazard rate can be calculated as follows:



y

dP(Yt∗ ≤ u|Xt , Yt∗ > L) . P(Yt∗ ≥ u|Xt , Yt∗ > L)

(y|Xt ) = 0

Appendix A

Throughout Appendix A, we will use the definition n :=

ln n nhd

+



y

0

Kh (Xt − Xi )1(Yi > L)

vi (Xt ) = n

i=1

Kh (Xt − Xi )1(Yi > L)

Therefore,

(A.2)

Xt ∈ X

By Lemma 1 we have >



> L)| = Op ( n ).(A.3)

Using (6) we can conclude the first part of the proposition.where |q(Xt ) − q˜ (Xt )| < |ˆq(Xt ) − q(Xt )|. Since the density function of f (Yt∗ |Xt ) is continuous (w.r.t. both arguments) and Xt takes on compact set values, the term f (˜q(Xt )|Xt ) is bounded away from zero and infinity, and thus it suffices to consider the difference P(Yt∗ > qˆ (Xt )|Xt ) − P(Yt∗ > q(Xt )|Xt ) for deriving the uniform consistency rate of the difference qˆ (Xt ) − q(Xt ). By triangle inequality, |P(Yt∗ > qˆ (Xt )|Xt ) − P(Yt∗ > q(Xt )|Xt ) ˆ t∗ > qˆ (Xt )|Xt ) − P(Yt∗ > q(Xt )|Xt )| | < |P(Y

(A.5)

ˆ t∗ > qˆ (Xt )|Xt ) − P(Yt∗ > qˆ (Xt )|Xt )|. +|P(Y By definition of the conditional median, the first term on the right hand side equals zero. For the second term we ˆ t∗ > qˆ (Xt )|Xt ) − P(Yt∗ > qˆ (Xt )|Xt )| ≤ sup |P(Y ˆ t∗ > have: sup |P(Y Xt ∈ X y|Xt ) − P(Yt∗

y ∈ Y,Xt ∈ X

> y|Xt )| (since we look at values qˆ (Xt ) ∈ (L, U)). By using the first result of the proposition, we know that ˆ t∗ > y|Xt ) − P(Yt∗ > y|Xt )| = Op ( n ), which leads us to sup |P(Y

y ∈ Y,Xt ∈ X

the desired result. 䊐 Next, we present Lemma 1, which deals with the uniform consistency rate of P(Yt∗ ≤ y|Xt , Yt∗ > L). First, consider the hazard rate P(y ≤ Yt∗ < y + dy|Yt∗ ≥ y, Xt , Yt∗ > L) dy dy→0

(y|Xt ):= lim

n 



y

vi (Xt )1(Yi ≥ y).

ˆ t ≤ u, r = 1|Xt , Yt∗ > L) dP(Y

ˆ (y|X t) =

ˆ t ≥ u|Xt , Yt∗ > L) P(Y

and accordingly y|Xt , Yt∗

vi (Xt )1(Yi ≤ y, ri = 1),

(A.10)

and

0

L) − P(Yt∗

n  i=1

Hence,

y|Xt , Yt∗

.

(A.11)

i=1

and ˆ t∗ ≤ L|Xt ) − P(Yt∗ ≤ L|Xt )| = Op ( n ). sup |P(Y

(A.9)

To derive the estimator for the integrated hazard rate, we should develop the estimators for the probabilities P(Yt ≤ y, r = 1|Xt , Yt∗ > L) and P(Yt ≥ y|Xt , Yt∗ > L). Particularly, introduce the weights

Xt ∈ X



(A.8)

dP(Yt ≤ u, r = 1|Xt , Yt∗ > L) . P(Yt ≥ u|Xt , Yt∗ > L)

(y|Xt ) =

ˆ t ≥ y|Xt , Yt∗ > L) = P(Y

Proof of Proposition 1. From Hansen (2008) we have ˆ t∗ > L|Xt ) − P(Yt∗ > L|Xt )| = Op ( n ) (A.1) sup |P(Y

sup



1 − d(u|Xt ) .

Given that Ct ⊥⊥ Yt∗ |Xt , we can write

ˆ t ≤ y, r = 1|Xt , Yt∗ > L) = P(Y

h2 .

y ∈ Y,Xt ∈ X

 u≤y

In this paper we have proposed a novel nonparametric censored regression model approach for the forecasting of wind power output for individual or collections of wind turbines. The key feature of the model is that it leaves the conditional mean function unspecified while accounting for two-sided fixed censoring as well as random right censoring. A uniform consistency result of the nonparametric estimator is presented. A simulation study confirms that the theoretical results also hold in finite samples. Finally, we provide an empirical illustration and show that the nonparametric censored regression model approach in some relevant cases can improve forecasting accuracy substantially relative to some well known and frequently used forecasting devices.

ˆ t∗ |P(Y

(A.7)

Note that both (y|Xt ) and (y|Xt ) are equal to zero for all y ≤ L. The probability of interest P(Yt∗ ≤ y|Xt , Yt∗ > L) can be computed by applying the product integration operation (Gill, 1994) P(Yt∗ ≤ y|Xt , Yt∗ > L) = 1 −

8. Conclusion

11

(A.6)

ˆ t ≤ y|Xt , Yt∗ > L) = 1 − P(Y



,

(A.12)



ˆ 1 − d(u|X t) ,

(A.13)

u≤y

which gives us the estimator (7). The next lemma provides the uniform convergence rate of the estimator P(Yt∗ ≤ y|Xt , Yt∗ > L). The convergence rate established in Lemma 2 is faster than the convergence rate calculated for a similar quantity by Lecoutre and Ould-Said (1995). More precisely, they provide a discussion after their Theorem 2 and based on this, resulting convergence rate in our context would be equal to the Op

√ln n nhd

.

Lemma 1. Suppose that the second derivatives (w.r.t. Xt ) of f(Xt ), P(Yt∗ ≤ y, r = 1|Xt , Yt∗ > L)f (Xt ),P(Yt∗ ≥ y|Xt , Yt∗ > L)f (Xt ) are continuous and bounded. Then, ˆ t∗ ≤ y|Xt , Yt∗ > L) − P(Yt∗ ≤ y|Xt , Yt∗ > L)| = Op ( n ). sup |P(Y y ∈ Y,Xt ∈ X

Proof. We provide a sketch of the proof. The first step is to ˆ t≤ calculate the uniform convergence rate for the estimators P(Y ˆ t ≥ y|Xt , Yt∗ > L). We define P(Yt ≤ y, r = y, r = 1|Xt , Yt∗ > L) and P(Y 1, Xt , Yt∗ > L):=P(Yt ≤ y, r = 1|Xt , Yt∗ > L)f (Xt , Yt∗ > L) and P(Yt ≥ y, Xt , Yt > L):=P(Yt ≥ y|Xt , Yt∗ > L)f (Xt , Yt∗ > L), where f (Xt , Yt∗ > L) = f (Xt |Yt∗ > L)P(Yt∗ > L).  ˆ t ≤ y, r = 1, Xt , Yt∗ > L) = 1 n 1(Yi ≤ y, ri = We have P(Y n i=1 ˆ t ≥ y, Xt , Yt∗ > L) = 1 n 1(Yi ≥ y, Yi > 1, Yi > L)Kh (Xt − Xi ), P(Y L)Kh (Xt − Xi ) and fˆ (Xt , Yt∗ > L) =

1 n

n

i=1

n

i=1

1(Yi > L)Kh (Xt − Xi ).

12

C.M. Dahl, G. Effraimidis and M.H. Pedersen / Energy Economics 84 (2019) 104520

We notice that 1(Yi ≤ y, ri = 1, Yi > L) ≤ 1 and 1(Yi ≥ y, Yi > L) ≤ 1 with probability one. Consequently, by slight modification of the proof of Theorem 2 of Hansen (2008) and standard kernel calculations for the expectation we obtain ˆ t ≤ y, r = 1, Xt , Yt∗ > L) − P(Yt ≤ y, r = (A.14) sup |P(Y

sup y ∈ Y,Xt ∈ X

= Op ( n )

sup y ∈ Y,Xt ∈ X

sup y ∈ Y,Xt ∈ X

(A.14)

The second and third steps are identical to Shorack and Wellner (2009) (Theorem 1, Chapter 6). In summary, the second step proves that

and y ∈ Y,Xt ∈ X

ˆ t ≥ y|Xt , Yt∗ > L) − P(Yt ≥ y|Xt , Yt∗ > L)| = Op ( n ). |P(Y (A.17)

ˆ t ≤ y, r = 1, Xt , Yt∗ > L) |P(Y

− P(Yt ≤ y, r = 1, Xt , Yt∗ > L)| = Op ( n )

sup

(A.16)

and

y ∈ Y,Xt ∈ X

1, Xt , Yt∗ > L)| = Op ( n )

ˆ t ≤ y, r = 1|Xt , Yt∗ > L) − P(Yt ≤ y, r = 1|Xt , Yt∗ > L)| |P(Y

sup y ∈ Y,Xt ∈ X

ˆ t ≥ y, Xt , Yt∗ > L) − P(Yt ≥ y, Xt , Yt∗ > L)| = Op ( n ). |P(Y (A.15)

Identical convergence rates can be obtained for the estimator fˆ (Xt , Yt∗ > L). Therefore,

ˆ |(y|X t ) − (y|Xt )| = Op ( n ).

(A.18)

Once we obtain the above uniform convergence rate, we proceed with the third step where we can prove the final result. 䊐 Appendix B Table 4 Table 5

Table 4 Out-of-sample forecasting accuracy measures conditional on wind speed and wind directions for Units 16 and 18. For further details, see caption to Table 1. Windspeed

Unit 16

Unit 18

RMSE

MAD

SMPAPE

GMRAE

UMBRAE

RMSE

MAD

SMPAPE

GMRAE

UMBRAE

0.17531773 0.19075826 0.16951838 0.17521426 0.17260383

0.07663181 0.09043928 0.08175268 0.07658783 0.07838760

0.20272781 0.47509553 0.32851924 0.20266529 0.32739636

– 1.2059103 1.0995864 0.9900794 1.0345536

– 1.5311797 1.2357085 0.9947144 1.1969009

0.21619147 0.30297915 0.20854460 0.21603834 0.20908171

0.11362372 0.15413913 0.12058929 0.11403381 0.12211052

0.21180513 0.44559024 0.35305978 0.21204350 0.32224509

– 1.2908040 1.2278825 1.0829872 1.2790306

– 1.485984 1.363507 1.050503 1.345553

0.24553208 0.25736542 0.24361359 0.24434117 0.25258306

0.13221916 0.13440027 0.13972089 0.13428194 0.14641167

0.14029297 0.24005505 0.19210258 0.14098886 0.17861340

– 1.0015230 1.1604554 1.0873502 1.2082413

– 1.1082110 1.1618452 1.0502742 1.1610501

0.23857531 0.30613293 0.23455603 0.23702369 0.23460071

0.14410205 0.17229432 0.14711309 0.14429978 0.14816544

0.15359251 0.25109571 0.18071319 0.15365333 0.18121290

– 0.9778051 1.0699497 1.0337207 1.0752167

– 1.109333 1.082185 1.026849 1.085219

0.22583577 0.23179224 0.21394143 0.22564875 0.23569243

0.11685513 0.13824507 0.12549038 0.12448371 0.13822970

0.16447176 0.29819679 0.21701216 0.16755444 0.21887992

– 1.2321075 1.4280798 1.2473073 1.4907931

– 1.3289244 1.2697977 1.0946945 1.2598608

0.19275146 0.20008050 0.18782703 0.19180716 0.18855828

0.11267838 0.11774017 0.11937661 0.11308697 0.12141268

0.16116205 0.34090264 0.23688840 0.16156449 0.22754632

– 1.0815062 1.2290495 1.0625193 1.2685900

– 1.210292 1.273176 1.055506 1.279939

0.23667062 0.24963511 0.22877738 0.23409028 0.23466313

0.12175144 0.14518434 0.13158717 0.12292800 0.13949087

0.13774931 0.25592538 0.18911713 0.14131502 0.19192666

– 1.3033213 1.4399427 1.2037778 1.5066072

– 1.3821512 1.254723 1.1057271 1.2681748

0.17049067 0.18298201 0.16963714 0.17023908 0.17029335

0.10924334 0.11712664 0.11164375 0.10990346 0.11360206

0.13535019 0.22812039 0.16680654 0.13594239 0.16771482

– 1.0208687 1.1678934 1.0411119 1.2089936

– 1.171354 1.163310 1.039254 1.193060

0.04236636 0.05462224 0.04234902 0.04237032 0.04232254

0.02687164 0.03629718 0.02734230 0.02683162 0.02713872

0.32130598 0.81620422 0.51981973 0.32107527 0.51999558

– 0.9740046 1.0092042 0.9931547 0.9919820

– 1.8390178 1.3140380 0.9941250 1.2949953

0.03875898 0.04439241 0.03862588 0.03881361 0.03866490

0.02431989 0.02898290 0.02454920 0.02435075 0.02451224

0.32482707 0.80866314 0.62094130 0.32526087 0.55028881

– 1.2373916 0.9759504 1.0017999 0.9831161

– 1.788088 1.453015 1.001382 1.336281

qˆ NORC kˆ NORC

0.21254375 0.21952364 0.21231674 0.21231110 0.21231951

0.14533013 0.15910249 0.14686150 0.14570494 0.14731198

0.12837080 0.14278618 0.12910737 0.12861815 0.12931569

– 1.1890976 1.0158367 1.0064275 1.0243866

– 1.1557692 1.0147617 1.0074414 1.0212490

0.22899847 0.24227521 0.22750307 0.22883239 0.22755307

0.15937257 0.16356250 0.16054735 0.15944415 0.16096800

0.15203477 0.15589320 0.15284413 0.15209780 0.15321868

– 1.0152404 1.0196438 1.0011702 1.0240777

– 1.018486 1.018188 1.000675 1.024042

m/s ≥ 10 qˆ p kˆ qˆ NORC kˆ NORC

0.38037301 0.38458811 0.36103468 0.37218159 0.37085861

0.18825298 0.18978303 0.22853087 0.19407553 0.25733746

0.07593231 0.07644980 0.08991970 0.07793249 0.10018925

– 0.9726540 4.0178091 2.2899397 4.7277415

– 0.9798238 1.9599229 1.522112 2.0236575

0.30646976 0.34946900 0.29779588 0.30373749 0.29870251

0.16609189 0.18741219 0.18603428 0.16709583 0.19408408

0.06676197 0.07392407 0.07369228 0.06711316 0.07646867

– 0.9390265 1.9820104 1.1836550 2.2319494

– 1.106551 1.660433 1.180617 1.764476

North qˆ p kˆ qˆ NORC kˆ NORC East qˆ p kˆ qˆ NORC kˆ NORC South qˆ p kˆ qˆ NORC kˆ NORC West qˆ p kˆ qˆ NORC kˆ NORC m/s < 5 qˆ p kˆ qˆ NORC kˆ NORC 5 ≤ m/s < 10 qˆ p kˆ

C.M. Dahl, G. Effraimidis and M.H. Pedersen / Energy Economics 84 (2019) 104520

13

Table 5 Out-of-sample forecasting accuracy measures conditional on wind speed and wind directions for Units 21 and 24. For further details, see caption to Table 1. Windspeed

North qˆ p kˆ qˆ NORC kˆ NORC East qˆ p kˆ qˆ NORC kˆ NORC South qˆ p kˆ qˆ NORC kˆ NORC West qˆ p kˆ qˆ NORC kˆ NORC m/s < 5 qˆ p kˆ qˆ NORC kˆ NORC 5 ≤ m/s < 10 qˆ p kˆ qˆ NORC kˆ NORC m/s ≥ 10 qˆ p kˆ qˆ NORC kˆ NORC

Unit 21

Unit 24

RMSE

MAD

SMPAPE

GMRAE

UMBRAE

RMSE

MAD

SMPAPE

GMRAE

UMBRAE

0.41268393 0.42652799 0.37211098 0.41640410 0.38176157

0.15999055 0.16472200 0.18762131 0.16983576 0.18935767

0.2389635 0.4047984 0.3402357 0.2454137 0.3414286

– 0.9956712 1.3399800 1.1268456 1.3175030

– 1.2017061 1.2800603 1.0562407 1.2568215

0.20954580 0.24843161 0.20653248 0.20886974 0.20725067

0.10500272 0.12459158 0.11023468 0.10511044 0.11233719

0.25873571 0.52189514 0.39149708 0.25071959 0.35782834

– 0.9930878 1.1565274 1.0234061 1.2213064

– 1.4722559 1.3031996 1.0100355 1.2882458

0.33192676 0.33994625 0.31632770 0.32723063 0.31800959

0.15494065 0.15850929 0.16594939 0.15585113 0.16944612

0.1957170 0.3173329 0.3055595 0.1960869 0.2871391

– 1.0073816 1.2774321 1.0810906 1.2828135

– 1.1437869 1.2845067 1.0358408 1.2511069

0.25567321 0.30525303 0.25376536 0.25516888 0.25389184

0.15360084 0.16090418 0.15776035 0.15380815 0.15817024

0.21105384 0.33630172 0.24806920 0.21120793 0.24907457

– 0.9944533 1.0357413 1.0146611 1.0598906

– 1.1542374 1.0795523 1.0084573 1.0946355

0.40680340 0.40382085 0.38161735 0.40451276 0.38669009

0.17240187 0.17989350 0.20008382 0.17555612 0.21464124

0.208111 0.3088516 0.3115913 0.2094064 0.3012105

– 1.0972820 1.5972272 1.1345177 1.6592566

– 1.2016958 1.4240341 1.085879 1.4048516

0.24880140 0.24457799 0.23946025 0.24595926 0.24039763

0.12370878 0.13793621 0.13592629 0.12481968 0.14074237

0.14879538 0.30272689 0.19511261 0.14919107 0.19005869

– 1.1835077 1.3486679 1.0852742 1.4190632

– 1.4273246 1.3677964 1.1008070 1.398742

0.47771257 0.46898421 0.44130478 0.46699443 0.45243813

0.19128024 0.20438265 0.23014720 0.19638444 0.25726807

0.1718770 0.2562706 0.2676241 0.1740792 0.2486424

– 1.2314599 2.0969195 1.3254878 2.2232860

– 1.3166675 1.6029944 1.1801954 1.5567085

0.23503773 0.22985957 0.22410018 0.23396153 0.22433934

0.11287655 0.12314287 0.12864320 0.11290157 0.13212974

0.10178546 0.15342527 0.12419001 0.10180266 0.12544584

– 1.2344224 1.6725371 1.0413911 1.7384601

– 1.3232855 1.5010237 1.0441351 1.5274557

0.04412828 0.04400152 0.04345500 0.04423573 0.04343697

0.02697947 0.02760633 0.02786934 0.02697993 0.02761908

0.3788878 0.8147379 0.8117553 0.3791552 0.7747859

– 1.0480711 1.0621203 0.9923557 1.0568987

– 1.7174330 1.8259355 0.9931751 1.7305202

0.03863529 0.04292250 0.03868899 0.03860482 0.03860443

0.02327081 0.02892449 0.02374235 0.02326427 0.02367666

0.31106997 0.75157189 0.55141741 0.31098216 0.50636998

– 1.4618146 1.0161690 0.9947283 1.0138002

– 1.8945377 1.3990192 0.9956756 1.3253978

0.30159182 0.29489605 0.29399629 0.30091149 0.29437203

0.17323340 0.17916693 0.17717272 0.17324901 0.17572561

0.1574140 0.1615149 0.1598614 0.1574480 0.1590176

– 1.0633588 1.0402270 0.9990263 1.0265059

– 1.0537337 1.0372769 1.0003579 1.0266227

0.25165638 0.24082658 0.24091810 0.25087851 0.24101512

0.15436504 0.16035837 0.16194418 0.15437743 0.16280703

0.13812902 0.14254449 0.14270835 0.13815325 0.14336501

– 1.1257730 1.1346958 1.0026058 1.1473367

– 1.1072007 1.1116621 1.0041464 1.1217802

0.76016617 0.75611474 0.69733878 0.73825232 0.72080138

0.33741940 0.33703095 0.45770508 0.35188533 0.54359369

0.1642199 0.1642967 0.2105473 0.1704734 0.2459770

– 1.0541841 6.8105959 2.5497269 8.4275607

– 1.1420057 2.7174446 1.7267913 2.8274619

0.35630801 0.35173998 0.33671318 0.35233439 0.33836761

0.16556821 0.16728754 0.20314188 0.16704190 0.21449917

0.06658046 0.06727374 0.07972667 0.06710288 0.08371107

– 1.1071740 3.2743618 1.2577602 3.5596295

– 1.3598655 2.3846620 1.2490447 2.4770740

Appendix C. Supplementary data Supplementary data associated with this article can be found, in the online version, at https://doi.org/10.1016/j.eneco.2019.104520. References Armstrong, J., 1985. Measures of accuracy. In: Long-Range Forecasting: From Crystal Ball to Computer., pp. 346–354. Armstrong, J., Collopy, F., 1992. Error measures for generalizing about forecasting methods: empirical comparisons. Int. J. Forecast. 8 (1), 69–80. Chen, C., Twycross, J., Garibaldi, J., 2017. A new accuracy measure based on bounded relative error for time series forecasting. PLoS ONE 12 (3). Chen, S., Dahl, G.B., Khan, S., 2005. Nonparametric identification and estimation of a censored location-scale regression model. J. Am. Stat. Assoc. 100 (469), 212–221. Fildes, R., 1992. The evaluation of extrapolative forecasting methods. Int. J. Forecast. Int. J. Forecast. 8 (1), 81–98. Gill, R.D., 1994. Lectures on Survival Analysis. Springer.

Hansen, B.E., 2008. Uniform convergence rates for kernel estimation with dependent data. Econom. Theory 24 (03), 726–748. Jeon, J., Taylor, J.W., 2012. Using conditional kernel density estimation for wind power density forecasting. J. Am. Stat. Assoc. 107 (497), 66–79. Lecoutre, J.-P., Ould-Said, E., 1995. Convergence of the conditional Kaplan-Meier estimate under strong mixing. J. Stat. Plan. Inference 44 (3), 359–369. Lee, S., Lewbel, A., 2013. Nonparametric identification of accelerated failure time competing risks models. Econom. Theory 29 (5), 905–919. Lewbel, A., Linton, O., 2002. Nonparametric censored and truncated regression. Econometrica 70 (2), 765–779. Li, Q., Racine, J.S., 2006. Nonparametric Econometrics: Theory and Practice, No. 8355 in Economics Books. Princeton University Press. Nielsen, T.S., Madsen, H., Nielsen, H.A., Landberg, L., Giebel, G., 2001. Zephyr-the prediction models. European Wind Energy Conference 2001. Pinson, P., 2012. Very-short-term probabilistic forecasting of wind power with generalized logit-normal distributions. J. R. Stat. Soc.: Ser. C (Appl. Stat.) 61 (4), 555–576. Sanchez, I., 2006. Short-term prediction of wind energy production. Int. J. Forecast. 22 (1), 43–56. Shorack, G.R., Wellner, J.A., 2009. Empirical Processes With Applications to Statistics, vol. 59. SIAM.