Metaheuristic optimization algorithms to estimate statistical distribution parameters for characterizing wind speeds


Journal Pre-proof

Musaed Alrashidi, Manisa Pipattanasomporn, Saifur Rahman

PII: S0960-1481(19)31920-2
DOI: https://doi.org/10.1016/j.renene.2019.12.048
Reference: RENE 12757
To appear in: Renewable Energy
Received Date: 19 March 2019
Revised Date: 16 October 2019
Accepted Date: 9 December 2019

Please cite this article as: Alrashidi M, Pipattanasomporn M, Rahman S, Metaheuristic optimization algorithms to estimate statistical distribution parameters for characterizing wind speeds, Renewable Energy (2020), doi: https://doi.org/10.1016/j.renene.2019.12.048. This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2019 Published by Elsevier Ltd.

Metaheuristic Optimization Algorithms to Estimate Statistical Distribution Parameters for Characterizing Wind Speeds

Musaed Alrashidi (1), Manisa Pipattanasomporn (1,2), Saifur Rahman (1)

(1) Bradley Department of Electrical and Computer Engineering, Advanced Research Institute, Virginia Tech, USA
(2) Smart Grid Research Unit, Department of Electrical Engineering, Chulalongkorn University, Bangkok, Thailand

Corresponding author: Musaed Alrashidi, [email protected].

Abstract:

An accurate analysis of wind speeds is vital to justify wind energy projects. Statistical distributions can be used to characterize wind speeds by accounting for uncertainty in wind resources. However, selecting the most suitable probability density function (PDF) is still a challenging task. Therefore, this study aims to develop a framework to accurately evaluate the performance of different PDFs in fitting wind speeds, and to present a new metaheuristic optimization method, called Social Spider Optimization (SSO), for wind characterization purposes. Seven sites in Saudi Arabia are used as case studies. Results indicate that combined PDFs outperform single PDFs in representing the observed wind speed frequencies at all considered sites. The Weibull distribution appears to be the most prevalent single distribution, while no combined PDF dominates the others. In addition, the proposed SSO method is found to be the most efficient method for estimating PDF parameters in Saudi Arabia. Overall, the proposed framework can be used to evaluate different wind PDFs in other countries.

Keywords: Wind speed, Probability density function, Combined density function, Metaheuristic optimization algorithm, Social Spider Optimization.

1. Introduction

The growing global demand for clean energy resources, such as solar and wind, has been expediting the pace of integrating these resources into electrical power grids. As a result, an increase in renewable energy capacity and a decline in renewable energy costs were reported in 2017 [1]. In that year, the total installed global renewable power capacity was 2,195 GW, with 55% of the newly added capacity coming from solar photovoltaic (PV), followed by wind and hydropower at 29% and 11%, respectively [1].

Wind energy is an eco-friendly, inexhaustible and sustainable source, and many countries have started to use power extracted from wind to cover their domestic load. However, the uncertain nature of wind leads to variation in wind power generation, which poses serious obstacles for power system operators. For instance, mixing the power generated from wind with existing traditional technologies (gas, oil, coal, etc.) has led to additional requirements for ancillary services [2]. Therefore, having reliable wind speed data and understanding the distribution of wind speeds reduce the risk of uncertainty and allow a better evaluation of the wind energy potential at any site [3].

A statistical distribution is typically used to represent wind speed data. Selecting a suitable probability density function (PDF) is the key factor for a successful assessment of wind energy [4]. Several single distribution functions have been used in the literature to describe wind regimes at different sites. The two-parameter Weibull is the most commonly used distribution function for modeling wind speed frequencies [5–7]. Several advantages account for this popularity: the Weibull distribution is flexible, has only two parameters that are easy to estimate, and has a closed-form expression [8,9]. However, the Weibull PDF may not always be the optimal distribution to fit the wind speeds. Therefore, other single PDFs have been investigated by many researchers, including three-parameter Weibull [10,11], Rayleigh [9,12–16], Lognormal [9,10,12–15,17–19], Gamma [9,10,12–15,17,19], Inverse Weibull [20], generalized Gamma [10], Kappa [10], Burr [15], Logistic [14,18], inverse Gaussian [9,15], Beta [9], etc.

However, the literature has not settled on the most appropriate PDF for wind speed data. In Saudi Arabia, most wind-related work relies on the Weibull distribution to study wind behavior without considering other PDFs. Baser et al. [21], for example, used the Weibull distribution to analyze wind characteristics and wind energy potential at seven sites in Jubail city in Saudi Arabia. The study compared different parameter estimation methods to find the Weibull parameters and then calculated the maximum energy carrying capacity, the most probable wind speed, and the energy output from five wind machines with rated power from 1.8 to 3.3 MW. Results showed that the Jubail industrial area (east) is the most promising, and the energy output from a 3 MW wind machine was found to be 11,136 MWh/year with a plant capacity factor of 41.3%. Furthermore, Rehman and Naïf [22] used the Weibull PDF to fit the wind speed frequency in order to carry out a technical assessment at Yanbo city in Saudi Arabia. In their study, a wind turbine with a rated capacity of 2.75 MW was used, and the results stated that this turbine could annually produce 6,681, 6,875 and 7,049 MWh of electricity with average plant capacity factors of 27.7, 28.5 and 29.3% at hub heights of 60, 80 and 100 m, respectively. Despite the fitting accuracy of the Weibull distribution, other distributions need to be investigated. Hence, in this study three other single PDFs are used, and their performance is compared with that of the Weibull distribution based on their fitting capability at seven sites in Saudi Arabia.

In general, single distributions can provide good fitting accuracy for wind speed; however, particularly when wind regimes are complex, their performance and efficiency are somewhat low [19]. Combined distribution models, therefore, have been employed recently to overcome the shortcomings of single distributions. A combined distribution means that at least two independent single distributions are mixed together to form a new distribution [23]. Examples of such combined distributions include the merger of two Weibull distributions [9,10,12,17] and of two Gamma distributions [10,12]. In [12], the authors proposed four combined distribution models, namely the mergers of Weibull, Gamma, Lognormal and Rayleigh, as well as a hierarchical merger of multiple distribution models (HMMD). The results of that study indicated that, in general, combined models outperform single models, that no single or combined PDF dominates the other distributions, and that the HMMD model outperformed all other models under study. Accordingly, ten combined distributions are proposed in this study to describe and simplify the complexity that may exist in wind regimes at the study sites.

Each of the single and combined PDFs is defined by its parameters. Selecting the optimal values of these parameters has a significant effect on how well the PDFs fit the actual wind speed distribution. Several estimation methods have been used in the literature to estimate the parameters of single and combined distributions, such as the conventional numerical methods (CNMs), namely the Maximum Likelihood Method (MLM) [5,9,10,13–17,19,21,24–29], the Method of Moment (MOM) [5,9,10,13,14,19,26,28,29], and the Least Square Method (LSM) [9–11,14,16,19,21,26,29]. Nevertheless, employing such numerical methods may result in unsatisfactory fitting accuracy. In recent years, metaheuristic optimization algorithm methods (MOAMs), such as Particle Swarm Optimization (PSO) [5,13,18,19,25], Genetic Algorithms (GA) [10,13,30], Cuckoo Optimization Algorithm (COA) [14,19], Differential Evolution (DE) [13,18], and Bat Algorithm (BA) [19], have been applied by some studies aiming to improve the parameter estimation process. However, there is still no consistent conclusion on which algorithm should be selected to estimate PDF parameters.

Carneiro et al. [5], for example, used PSO to compute Weibull parameters for wind resources in the Northeast Region of Brazil. PSO was compared with five CNMs, including MOM, MLM, the Empirical Method, the Energy Pattern Factor Method, and the Energy Equivalent Method. According to the statistical tests, the results indicated that PSO offers the best performance, with high correlation ($R^2$) exceeding 99% and low relative bias and error values. Wang et al. [31] investigated six PDFs (Weibull, Logistic, Rayleigh, Normal, Lognormal, and Gamma) when they assessed the wind potential at four stations in central China. For parameter estimation, three CNMs (MLM, MOM, and LSM), the maximum entropy method, and the Cuckoo Search (CS) algorithm were used. Results showed that the proposed CS provides the best estimation results in terms of high $R^2$ and low root mean square errors.

Based on the above discussion, determining the most suitable wind speed PDF models and selecting the optimal parameter values of PDFs are still considered challenging tasks. Therefore, this study takes seven sites in Saudi Arabia as a case study to create a framework for evaluating four single and ten combined PDFs, and proposes a new MOAM, called the Social Spider Optimization Algorithm (SSO), to characterize wind speeds. This framework can be used to evaluate the best wind PDFs in other countries. The main contributions of this study compared to other studies in the area of wind energy can be summarized as follows:

(1) Considering the need for more advanced optimization algorithms, this paper introduces SSO for the first time to estimate PDF parameters for wind speed characterization and compares its performance with three commonly used CNMs, namely MLM, MOM, and LSM, and three popular optimization algorithms, namely PSO, GA, and COA. Results indicate that, unlike the other algorithms, SSO can accurately provide the optimal parameters and converges to them quickly.

(2) Since most studies related to wind in Saudi Arabia use the two-parameter Weibull PDF to describe wind speed regimes, this study shows that other PDFs cannot always be used to model the wind speed frequency distribution, and that Weibull demonstrates superior performance in modeling wind speed in Saudi Arabia. The single PDFs used are Weibull, Rayleigh, Lognormal, and Gamma.

(3) To overcome the shortcomings that may exist in single PDFs while capturing the complexity of wind regimes, this paper proposes the use of ten combined distributions to better characterize the wind regimes at Saudi sites. The combined distributions are: Weibull-Weibull (MWW), Rayleigh-Rayleigh (MRR), Lognormal-Lognormal (MLL), Gamma-Gamma (MGG), Weibull-Rayleigh (MWR), Weibull-Gamma (MWG), Weibull-Lognormal (MWL), Rayleigh-Gamma (MRG), Rayleigh-Lognormal (MRL), and Gamma-Lognormal (MGL). Here 'M' stands for 'mixed', meaning 'combined'.

The rest of the paper is organized as follows. Section 2 explains the framework together with the wind speed dataset used in this study. Section 3 discusses the theoretical background of the PDFs. Section 4 introduces the CNM and MOAM estimation approaches, namely LSM, MOM, MLM, PSO, GA, COA, and SSO. Section 5 describes the statistical indicators used to evaluate the accuracy of the study models. Lastly, Section 6 presents the results of this study, the comparison of the distribution models tested, and the performance of the estimation approaches.

2. Methodology

In this section, the overall study is discussed, together with the wind speed data used and the fundamentals of single and combined PDFs. The CNM and MOAM methods used to estimate the distribution parameters are also described. The Saudi sites selected in this study are Aljouf, Alwajh, Hafer Al Batin, Jeddah, Riyadh, Sharurah, and Turaif. The sites are shown in Fig. 1.

Fig. 1: Saudi Arabia study sites [32]

2.1. Study Framework

The framework of the proposed study is depicted in Fig. 2 and Fig. 3 for analyzing the single and combined distributions, respectively.

As shown in Fig. 2, the histogram of observed wind speed frequencies is built first. After that, the parameters of the single distributions are obtained by employing CNMs and MOAMs. For the combined distributions, as shown in Fig. 3, SSO is selected to obtain the optimal weight and parameter values of the ten combined distributions, since it shows better performance in estimating the parameters of single PDFs (to be shown in Section 6). After the optimal parameters are found, the theoretical PDFs are constructed.

These PDF models are then compared with the observed wind speed frequency using the Root Mean Square Error (RMSE), Coefficient of Determination ($R^2$), and Mean Absolute Error (MAE) tests. The results are then computed and analyzed.

Fig. 2: The framework of the study with four single distributions

Fig. 3: The framework of the study with ten combined distributions

2.2. Wind Data Source

The wind speed data used to conduct this study are obtained from King Abdullah City for Atomic and Renewable Energy (K.A.CARE) [33]. Hourly wind speed data collected at 40 m above ground level between January 2016 and December 2016 are used. Fig. 1 shows the location of the investigated sites and Table 1 contains the geographic features of all sites. Fig. 4 displays the monthly mean of the wind speed data at the seven locations.

Table 1: Selected sites geographic features

City            Region     Latitude (N)  Longitude (E)  Elevation (m)
Aljouf          North      29.891593     39.284135      10
Alwajh          Northwest  26.497667     36.347487      65
Hafer Al Batin  East       28.268806     44.203111      360
Jeddah          West       21.21536      39.221638      16
Riyadh          Middle     24.57642      46.35277       924
Sharurah        South      17.323417     47.073139      764
Turaif          North      31.649976     38.809603      850

Fig. 4: The monthly mean of wind speed

In order to comprehend and analyze the wind speed data, the following properties are used: Mean, Variance, Standard Deviation (SD), Skewness, Kurtosis, and Maximum value of wind speed. These are summarized in Table 2.

Table 2: Statistical characteristics of the wind speed at selected sites in Saudi Arabia

Statistic   Aljouf     Alwajh     Hafer Al Batin  Jeddah     Riyadh     Sharurah   Turaif
Mean        5.631961   5.102861   6.121250        5.672685   5.664841   6.001700   6.348944
Variance    6.606695   6.968867   6.995937        7.946597   6.089107   6.867020   7.192468
SD          2.570349   2.639861   2.644983        2.818971   2.467612   2.620500   2.681878
Skewness    0.584397   0.482017   0.390404        0.714200   0.567976   0.211977   0.372409
Kurtosis    3.582301   2.703814   3.011192        3.596616   2.936494   2.704404   3.103262
Maximum     22.731300  14.930000  17.231600       19.866000  14.907400  18.269400  18.090200

Mean values indicate the central tendency of the wind speed data. Variance and SD describe how far the observed wind speeds deviate from the central value. In addition, Skewness and Kurtosis are used to understand the shape of the observed frequency distribution: Skewness measures the asymmetry of the wind speed data, while Kurtosis describes how steeply peaked the data are [18].
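As an illustration of how the quantities in Table 2 can be obtained, the sketch below computes them from a list of wind speeds. It assumes population moments (divisor n) and non-excess kurtosis, which is consistent with the values near 3 in Table 2 but is not stated in the text; the function name `describe` and the sample values are hypothetical.

```python
import math

def describe(speeds):
    """Return mean, variance, SD, skewness, kurtosis and maximum of a sample."""
    n = len(speeds)
    mean = sum(speeds) / n
    # Central moments about the mean (assumption: population form, divisor n).
    m2 = sum((v - mean) ** 2 for v in speeds) / n
    m3 = sum((v - mean) ** 3 for v in speeds) / n
    m4 = sum((v - mean) ** 4 for v in speeds) / n
    sd = math.sqrt(m2)
    return {
        "mean": mean,
        "variance": m2,
        "sd": sd,
        "skewness": m3 / sd ** 3,   # 0 for a symmetric sample
        "kurtosis": m4 / sd ** 4,   # non-excess: about 3 for normal data
        "maximum": max(speeds),
    }

# Hypothetical hourly wind speeds in m/s, for illustration only.
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
stats = describe(sample)
```

A positive skewness, as found at all seven sites, indicates a longer tail toward high wind speeds.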

3. Wind Speed Distribution Model

In order to characterize and represent the wind speed effectively, this study uses four single and ten combined PDFs. The PDFs and the cumulative distribution functions (CDFs) of the four single and ten combined distributions are introduced in this section.

3.1. Single Probability Density Functions

3.1.1. Weibull Distribution

The Weibull PDF has been used frequently to represent the frequency distribution of wind speeds. The Weibull distribution has proved efficient in representing wind data, as it provides a good fit for wind speed data at the ground surface and in upper layers [34]. It is characterized by its PDF, $f(v)$, and CDF, $F(v)$, as follows [35]:

$$f(v; k_1, c_1) = \frac{k_1}{c_1}\left(\frac{v}{c_1}\right)^{k_1-1}\exp\left[-\left(\frac{v}{c_1}\right)^{k_1}\right] \quad \text{for } v > 0 \text{ and } k_1, c_1 > 0 \tag{1}$$

$$F(v) = \int_0^v f(v)\,dv = 1 - \exp\left[-\left(\frac{v}{c_1}\right)^{k_1}\right] \tag{2}$$

Where $v$ is the wind speed (m/s), $c_1$ is the Weibull scale parameter (m/s) and $k_1$ is the Weibull shape parameter.

3.1.2. Rayleigh Distribution

The Rayleigh distribution is obtained from the Weibull distribution when its shape parameter $k_1 = 2$. The Rayleigh PDF has only one parameter to estimate, and this simplicity makes it popular for representing wind speed regimes. The Rayleigh PDF, $f(v)$, and CDF, $F(v)$, are expressed as follows:

$$f(v; c_2) = \frac{v}{c_2^2}\exp\left(-\frac{v^2}{2c_2^2}\right) \quad \text{for } v > 0 \text{ and } c_2 > 0 \tag{3}$$

$$F(v) = \int_0^v f(v)\,dv = 1 - \exp\left(-\frac{v^2}{2c_2^2}\right) \tag{4}$$

Where $v$ is the wind speed (m/s) and $c_2$ is the Rayleigh scale parameter (m/s). In this form, the Rayleigh scale corresponds to the Weibull scale at $k_1 = 2$ through $c_1 = \sqrt{2}\,c_2$.

3.1.3. Gamma Distribution

The Gamma distribution is defined by two parameters, $k_3$ and $c_3$. It has been employed to fit the wind speed frequency in many locations. The PDF, $f(v)$, and CDF, $F(v)$, of the Gamma distribution can be written as follows:

$$f(v; k_3, c_3) = \frac{v^{k_3-1}}{c_3^{k_3}\,\Gamma(k_3)}\exp\left(-\frac{v}{c_3}\right) \quad \text{for } v > 0 \text{ and } k_3, c_3 > 0 \tag{5}$$

$$F(v) = \frac{\gamma(k_3, v/c_3)}{\Gamma(k_3)} \tag{6}$$

Where $v$ is the wind speed (m/s), $c_3$ is the scale parameter, and $k_3$ is the shape parameter of the Gamma distribution. $\Gamma$ is the Gamma function, defined for a variable $z$ as $\Gamma(z) = \int_0^\infty t^{z-1}\exp(-t)\,dt$, and $\gamma$ is the (lower) incomplete Gamma function.
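The Weibull, Rayleigh and Gamma densities above can be written in a few lines of code. The following is a minimal sketch using only the Python standard library, assuming the standard forms of Eqs. (1) to (5); the function names are illustrative, and the Gamma CDF is omitted because the incomplete Gamma function is not available in the standard library.

```python
import math

def weibull_pdf(v, k, c):
    """Weibull PDF, Eq. (1), with shape k and scale c."""
    if v <= 0:
        return 0.0
    return (k / c) * (v / c) ** (k - 1) * math.exp(-((v / c) ** k))

def weibull_cdf(v, k, c):
    """Weibull CDF, Eq. (2)."""
    if v <= 0:
        return 0.0
    return 1.0 - math.exp(-((v / c) ** k))

def rayleigh_pdf(v, c):
    """Rayleigh PDF, Eq. (3), with scale c."""
    if v <= 0:
        return 0.0
    return (v / c ** 2) * math.exp(-(v ** 2) / (2 * c ** 2))

def rayleigh_cdf(v, c):
    """Rayleigh CDF, Eq. (4)."""
    if v <= 0:
        return 0.0
    return 1.0 - math.exp(-(v ** 2) / (2 * c ** 2))

def gamma_pdf(v, k, c):
    """Gamma PDF, Eq. (5), with shape k and scale c."""
    if v <= 0:
        return 0.0
    return v ** (k - 1) / (c ** k * math.gamma(k)) * math.exp(-v / c)

# Rayleigh as a special case of Weibull: shape 2 and scale sqrt(2) * c.
c_rayleigh = 6.0  # hypothetical scale in m/s
gap = abs(weibull_pdf(5.0, 2.0, math.sqrt(2) * c_rayleigh)
          - rayleigh_pdf(5.0, c_rayleigh))
```

The variable `gap` is zero up to rounding, which is a quick way to confirm that the two parameterizations agree.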

3.1.4. Lognormal Distribution

The Lognormal distribution has been applied by some researchers to express the frequency distribution of wind speeds. Lognormal is a two-parameter distribution, and its PDF, $f(v)$, and CDF, $F(v)$, can be expressed as follows:

$$f(v; \mu, \sigma) = \frac{1}{v\,\sigma\sqrt{2\pi}}\exp\left[-\frac{(\ln v - \mu)^2}{2\sigma^2}\right] \quad \text{for } v > 0,\ -\infty < \mu < \infty,\ \sigma > 0 \tag{7}$$

$$F(v) = \Phi\left(\frac{\ln v - \mu}{\sigma}\right) \tag{8}$$

Where $v$ is the wind speed (m/s), $\mu$ is the scale parameter and $\sigma$ is the shape parameter of the Lognormal distribution. $\Phi$ is the CDF of the standard normal distribution.

3.2. Combined probability density functions

A combined distribution means a combination of at least two distributions to fit the wind speed data. The conventional (single) distributions may not be able to represent wind regimes properly because of their complexity [12]. Hence, to understand the wind characteristics accurately, it is assumed that wind speed can be characterized by combining two or more distributions. When there are $N$ types of distributions, the PDF of this combination is defined using the following formula:

$$M(v) = \sum_{i=1}^{N} p_i\, f_i(v; \theta_i) \tag{9}$$

Where $p_i$ is the weight of the $i$th distribution in the combined model, such that $p_i \geq 0$ $(i = 1, 2, 3, \ldots, N)$ and $\sum_{i=1}^{N} p_i = 1$; $f_i(v)$ is the $i$th PDF in the combined model and $\theta_i$ is the set of parameters to be estimated for the $i$th PDF.
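The weighted sum in Eq. (9) can be sketched as follows. This is an illustrative example with a two-component Weibull mixture; the helper names, weights and parameter values are hypothetical and are not taken from the study.

```python
import math

def weibull_pdf(v, k, c):
    """Weibull PDF with shape k and scale c."""
    if v <= 0:
        return 0.0
    return (k / c) * (v / c) ** (k - 1) * math.exp(-((v / c) ** k))

def mixture_pdf(v, weights, components):
    """Eq. (9): weighted sum of component PDFs.

    weights    -- non-negative numbers that sum to 1
    components -- one-argument callables, each returning a PDF value at v
    """
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * pdf(v) for w, pdf in zip(weights, components))

# Hypothetical two-component Weibull mixture.
components = [
    lambda v: weibull_pdf(v, 2.0, 4.0),
    lambda v: weibull_pdf(v, 3.0, 9.0),
]
weights = [0.6, 0.4]

# Riemann-sum check: a valid mixture still integrates to about 1.
step = 0.01
area = sum(mixture_pdf(i * step, weights, components) * step
           for i in range(1, 5001))
```

Because the weights sum to one and each component is itself a PDF, the mixture is a valid PDF, which the numerical integral in `area` illustrates.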

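Table 3 exhibits the PDFs of the ten combined distributions proposed and tested in this paper. In Table 3, the $M_{WW}$ distribution, for instance, contains two independent Weibull components; similarly, $M_{RR}$, $M_{GG}$, and $M_{LL}$ contain two independent Rayleigh, Gamma, and Lognormal components, respectively. In addition, the $M_{WR}$ distribution is a combination of Weibull and Rayleigh. It is crucial to note that using the combined distributions makes modeling wind speed more complex than using single distributions, because the parameters are harder to estimate. To solve this issue, SSO is used to optimize the weights and parameters of these distributions.

Table 3: The probability density functions of the ten combined distributions

1. $M_{WW}(v; \theta_{WW}) = \sum_{i=1}^{2} p_i \frac{k_i}{c_i}\left(\frac{v}{c_i}\right)^{k_i-1}\exp\left[-\left(\frac{v}{c_i}\right)^{k_i}\right]$

2. $M_{RR}(v; \theta_{RR}) = \sum_{i=1}^{2} p_i \frac{v}{c_i^2}\exp\left(-\frac{v^2}{2c_i^2}\right)$

3. $M_{LL}(v; \theta_{LL}) = \sum_{i=1}^{2} \frac{p_i}{v\,\sigma_i\sqrt{2\pi}}\exp\left[-\frac{(\ln v - \mu_i)^2}{2\sigma_i^2}\right]$

4. $M_{GG}(v; \theta_{GG}) = \sum_{i=1}^{2} p_i \frac{v^{k_i-1}}{c_i^{k_i}\,\Gamma(k_i)}\exp\left(-\frac{v}{c_i}\right)$

5. $M_{WR}(v; \theta_{WR}) = p_1 \frac{k_1}{c_1}\left(\frac{v}{c_1}\right)^{k_1-1}\exp\left[-\left(\frac{v}{c_1}\right)^{k_1}\right] + p_2 \frac{v}{c_2^2}\exp\left(-\frac{v^2}{2c_2^2}\right)$

6. $M_{WG}(v; \theta_{WG}) = p_1 \frac{k_1}{c_1}\left(\frac{v}{c_1}\right)^{k_1-1}\exp\left[-\left(\frac{v}{c_1}\right)^{k_1}\right] + p_2 \frac{v^{k_3-1}}{c_3^{k_3}\,\Gamma(k_3)}\exp\left(-\frac{v}{c_3}\right)$

7. $M_{WL}(v; \theta_{WL}) = p_1 \frac{k_1}{c_1}\left(\frac{v}{c_1}\right)^{k_1-1}\exp\left[-\left(\frac{v}{c_1}\right)^{k_1}\right] + \frac{p_2}{v\,\sigma\sqrt{2\pi}}\exp\left[-\frac{(\ln v - \mu)^2}{2\sigma^2}\right]$

8. $M_{RG}(v; \theta_{RG}) = p_1 \frac{v}{c_2^2}\exp\left(-\frac{v^2}{2c_2^2}\right) + p_2 \frac{v^{k_3-1}}{c_3^{k_3}\,\Gamma(k_3)}\exp\left(-\frac{v}{c_3}\right)$

9. $M_{RL}(v; \theta_{RL}) = p_1 \frac{v}{c_2^2}\exp\left(-\frac{v^2}{2c_2^2}\right) + \frac{p_2}{v\,\sigma\sqrt{2\pi}}\exp\left[-\frac{(\ln v - \mu)^2}{2\sigma^2}\right]$

10. $M_{GL}(v; \theta_{GL}) = p_1 \frac{v^{k_3-1}}{c_3^{k_3}\,\Gamma(k_3)}\exp\left(-\frac{v}{c_3}\right) + \frac{p_2}{v\,\sigma\sqrt{2\pi}}\exp\left[-\frac{(\ln v - \mu)^2}{2\sigma^2}\right]$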

4. Estimation Methods

Estimation of the parameters and weights of the single and combined distributions discussed in Section 3 is crucial in determining an accurate probabilistic model to represent wind regimes at all Saudi sites. These parameters matter because they define the distribution function. Taking the Weibull distribution, for instance, the shape parameter ($k_1$) provides information about the peak of the Weibull PDF curve, while the scale parameter ($c_1$) reflects the wind speed average, which may expand or narrow the curve [19]. In this regard, it is essential that PDF parameters are estimated accurately. Hence, this paper uses three conventional numerical estimation methods, namely LSM, MOM, and MLM, and four metaheuristic optimization algorithms, namely PSO, GA, COA, and SSO. These methods are described in this section.

4.1. Conventional Numerical Estimation Methods

4.1.1. Least Square Method

To use the Least Square Method (LSM), the wind speed data must be represented in a cumulative frequency distribution format [36]. Because LSM relies on a logarithmic transformation, it is applied to the Weibull and Rayleigh distributions, whose CDFs contain an exponential term that can be linearized. Since the Lognormal and Gamma distributions cannot be linearized in this way, no least square estimator is considered for these two distributions.
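The logarithmic transformation can be made concrete for the Weibull case. Taking logarithms of the Weibull CDF twice gives ln(-ln(1 - F(v))) = k ln(v) - k ln(c), a straight line in ln(v). The sketch below fits that line by ordinary least squares; it is an illustration of the idea under this standard linearization, not the authors' implementation, and the function name and synthetic data are hypothetical.

```python
import math

def weibull_lsm(speeds, cum_freq):
    """Fit Weibull (k, c) by regressing y = ln(-ln(1 - F)) on x = ln(v).

    speeds   -- wind speed values v_i > 0
    cum_freq -- cumulative frequencies F(v_i), each strictly between 0 and 1
    """
    xs = [math.log(v) for v in speeds]
    ys = [math.log(-math.log(1.0 - F)) for F in cum_freq]
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    k = (n * sxy - sx * sy) / (n * sxx - sx ** 2)  # slope of the fitted line
    intercept = (sy - k * sx) / n                  # equals -k * ln(c)
    c = math.exp(-intercept / k)
    return k, c

# Synthetic check: exact CDF values from a Weibull with k = 2, c = 6 m/s.
true_k, true_c = 2.0, 6.0
speeds = [float(v) for v in range(1, 13)]
cum = [1.0 - math.exp(-((v / true_c) ** true_k)) for v in speeds]
k_hat, c_hat = weibull_lsm(speeds, cum)
```

With noise-free cumulative frequencies the fit recovers the generating parameters; with measured data, the highest classes (where F approaches 1) tend to dominate the transformed values and may need care.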

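4.1.2. Maximum Likelihood Method

The Maximum Likelihood Method (MLM) estimates the distribution parameters by maximizing the likelihood function of the wind speed data [37]. The MLM equations can be solved by numerical iteration, for example with the Newton-Raphson method.

4.1.3. Method of Moment

The Method of Moment (MOM) uses the corresponding population moments, including the mean of the observed wind speed, $\bar{v}$, and the standard deviation of the wind data, $\sigma_v$, to estimate the parameters of the considered distributions [38].

Table 4 contains the mathematical formulas of the parameters for all four single distributions using LSM, MLM, and MOM [19].

Table 4: Numerical equations to estimate parameters of four single distributions using CNMs

MOM
Weibull: $k_1 = \left(\frac{\sigma_v}{\bar{v}}\right)^{-1.086}$, $c_1 = \frac{\bar{v}}{\Gamma(1 + 1/k_1)}$
Rayleigh: $c_2 = \bar{v}\sqrt{2/\pi}$
Gamma: $k_3 = \frac{\bar{v}^2}{\sigma_v^2}$, $c_3 = \frac{\sigma_v^2}{\bar{v}}$
Lognormal: $\mu = \ln\left(\frac{\bar{v}}{\sqrt{1 + \sigma_v^2/\bar{v}^2}}\right)$, $\sigma = \sqrt{\ln\left(1 + \frac{\sigma_v^2}{\bar{v}^2}\right)}$

MLM
Weibull: $k_1 = \left(\frac{\sum_{i=1}^{n} v_i^{k_1}\ln v_i}{\sum_{i=1}^{n} v_i^{k_1}} - \frac{1}{n}\sum_{i=1}^{n}\ln v_i\right)^{-1}$, $c_1 = \left(\frac{1}{n}\sum_{i=1}^{n} v_i^{k_1}\right)^{1/k_1}$
Rayleigh: $c_2 = \sqrt{\frac{1}{2n}\sum_{i=1}^{n} v_i^2}$
Gamma: $\ln(k_3) - \psi(k_3) = \ln\left(\frac{1}{n}\sum_{i=1}^{n} v_i\right) - \frac{1}{n}\sum_{i=1}^{n}\ln v_i$, $c_3 = \frac{1}{n\,k_3}\sum_{i=1}^{n} v_i$
Lognormal: $\mu = \frac{1}{n}\sum_{i=1}^{n}\ln v_i$, $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(\ln v_i - \mu)^2}$

LSM
Weibull: $k_1 = \frac{N\sum_{i=1}^{N} x_i y_i - \sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N\sum_{i=1}^{N} x_i^2 - \left(\sum_{i=1}^{N} x_i\right)^2}$, $c_1 = \exp\left(\frac{\sum_{i=1}^{N} x_i \sum_{i=1}^{N} x_i y_i - \sum_{i=1}^{N} x_i^2 \sum_{i=1}^{N} y_i}{N\sum_{i=1}^{N} x_i y_i - \sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}\right)$
Rayleigh: $c_2 = \sqrt{\frac{1}{2}\exp\left[-\frac{1}{N}\left(\sum_{i=1}^{N} y_i - 2\sum_{i=1}^{N} x_i\right)\right]}$
Gamma: ---
Lognormal: ---

Here $n$ is the number of wind speed observations $v_i$ and $\psi$ is the digamma function. For LSM, $x_i = \ln v_i$ and $y_i = \ln[-\ln(1 - F(v_i))]$ over the $N$ points of the cumulative frequency distribution.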

4.2. Metaheuristic optimization algorithms

290

The metaheuristic optimization algorithms are nature inspired algorithms. The examined

291

algorithms include PSO, GA, COA, and the proposed SSO. With these algorithms, the attempt in

292

this study is to minimize the difference between the measured frequency distribution of the wind

293

speed and theoretical values generated by the considered PDFs. Hence, the objective function is

294

as follows:

w( Q)

1 sttut ( Q ) = P v 2 k

QT6.|

w( Q)



xyz ( Q , RQ ){

(10)

xyz ( Q , RQ )

297

theoretical values generated by study PDFs, and 2 is the number of classes of wind speed.

298

population size was set in 50 while 1000 as maximum iterations. The upper and lower bounds of

299

search space are set in the range [0,10] for single PDFs and [0,20] for combined PDFs. The

300

proposed SSO is introduced in the next subsection.

295 296

Where

is the measured frequency distribution of wind speed class,

is the

Detailed description of PSO, GA, and COA is shown in Appendix (A). For all algorithms, the
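To make the objective in Eq. (10) concrete, the sketch below evaluates it for a Weibull candidate. Several details are assumptions rather than statements from the text: the speed classes are taken as 1 m/s bins evaluated at their midpoints, the class probability is approximated by the PDF value times the bin width, and the differences are squared. All function names are hypothetical.

```python
import math

def weibull_pdf(v, k, c):
    """Weibull PDF with shape k and scale c."""
    if v <= 0:
        return 0.0
    return (k / c) * (v / c) ** (k - 1) * math.exp(-((v / c) ** k))

def class_frequencies(speeds, n_classes, width=1.0):
    """Relative frequency of observations in bins [0, w), [w, 2w), ..."""
    counts = [0] * n_classes
    for v in speeds:
        idx = min(int(v // width), n_classes - 1)  # clamp the top bin
        counts[idx] += 1
    return [cnt / len(speeds) for cnt in counts]

def fitting_error(params, freqs, width=1.0):
    """Mean squared gap between measured and theoretical class frequencies."""
    k, c = params
    total = 0.0
    for i, f_m in enumerate(freqs):
        centre = (i + 0.5) * width                 # assumed class midpoint
        total += (f_m - weibull_pdf(centre, k, c) * width) ** 2
    return total / len(freqs)

# Hypothetical observations in m/s, binned into four 1 m/s classes.
observed = class_frequencies([0.2, 0.7, 1.5, 3.9], 4)
```

Any of the optimizers discussed here would then search the bounded parameter space for the `params` that make `fitting_error` smallest.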

4.2.1. Social Spider Optimization Algorithm

Social Spider Optimization (SSO) is a swarm intelligence algorithm introduced in 2013 by Cuevas et al. [39]. SSO mimics the cooperative behavior of social spiders, where males and females are the two types of search agents considered in the algorithm. In spider colonies, female spiders usually outnumber male spiders, making up roughly 65-90% of the whole colony population $N$.

According to the SSO algorithm introduced in [39], the mathematical steps are as follows:

Step 1: Determine the numbers of female and male spiders in the search space. The number of female spiders, $N_f$, is generated randomly as 65-90% of the population using the following equation:

$$N_f = \mathrm{floor}\left[(0.9 - rand \times 0.25)\cdot N\right], \qquad N_m = N - N_f \tag{11}$$

Where $rand$ is a random number within the range [0,1] and $N_m$ is the number of male spiders.

As a result, the population $S$ contains $N$ elements and is divided into two sub-groups: female spiders ($F = \{f_1, f_2, \ldots, f_{N_f}\}$) and male spiders ($M = \{m_1, m_2, \ldots, m_{N_m}\}$), where $S = F \cup M$ ($S = \{s_1, s_2, \ldots, s_N\}$), such that $S = \{s_1 = f_1, s_2 = f_2, \ldots, s_{N_f} = f_{N_f}, s_{N_f+1} = m_1, s_{N_f+2} = m_2, \ldots, s_N = m_{N_m}\}$.

Step 2: Assign a weight $w_i$ to each spider, representing the solution quality of spider $i$ in the population $S$. The weight of every spider is calculated from the following expression:

$$w_i = \frac{J(s_i) - worst_S}{best_S - worst_S} \tag{12}$$

Where $J(s_i)$ is the fitness value of the spider position $s_i$, evaluated by the objective function $J(\cdot)$, see Eq. (10). $worst_S$ and $best_S$ correspond to the worst and best individuals in the population, respectively, and are defined as follows:

$$best_S = \max_{k \in \{1, 2, \ldots, N\}} J(s_k) \quad \text{and} \quad worst_S = \min_{k \in \{1, 2, \ldots, N\}} J(s_k) \tag{13}$$

Step 3: Identify the vibration process. If, for example, a spider $i$ perceives a vibration sent from a spider $j$, this vibration can be written as:

$$Vib_{i,j} = w_j \times \exp\left(-d_{i,j}^2\right) \tag{14}$$

Where $d_{i,j}$ is the Euclidean distance between spiders $i$ and $j$, such that $d_{i,j} = \lVert s_i - s_j \rVert$. Each spider $i$ in the population receives one of the following three types of vibration:

I. From the closest spider, $c$, that has a higher fitness value ($Vib_{i,c} = w_c \times \exp(-d_{i,c}^2)$);
II. From the spider, $b$, that has the best fitness value in the entire population ($Vib_{i,b} = w_b \times \exp(-d_{i,b}^2)$);
III. From the closest female spider, $f$, to the male $i$ ($Vib_{i,f} = w_f \times \exp(-d_{i,f}^2)$).

Step 4: Initialize the population $S$ with $N$ spider positions. The position of each spider, $f_i$ or $m_k$, is an $n$-dimensional vector, where $n$ is the number of parameters to be optimized. The values of these parameters are randomly generated within the predefined upper, $p_j^{high}$, and lower, $p_j^{low}$, bounds. This is described by the following equations:

$$f_{i,j}^{0} = p_j^{low} + rand(0,1)\cdot\left(p_j^{high} - p_j^{low}\right), \quad i = 1, 2, \ldots, N_f;\ j = 1, 2, \ldots, n$$

$$m_{k,j}^{0} = p_j^{low} + rand(0,1)\cdot\left(p_j^{high} - p_j^{low}\right), \quad k = 1, 2, \ldots, N_m;\ j = 1, 2, \ldots, n \tag{15}$$

Where $j$ is the parameter index, whereas $i$ and $k$ are the spider indexes. The superscript zero indicates the initial population, $rand(0,1)$ is a random number generated between 0 and 1, and $f_{i,j}$ is the $j$th parameter of the $i$th female individual position.

Step 5: The cooperative interaction behavior within the colony depends on the spider gender. To imitate the cooperative behavior of the female spider, the following mathematical equation is defined that explains the change in position of the female spider, $i$, in each iteration:

$$J(X, C) = \sum_{i=1}^{n} \min\left\{\lVert d_i - c_k \rVert \;\middle|\; k = 1, 2, \ldots, K\right\} \tag{16}$$

Where $X$ is the dataset, and $C$ is the clustering center vector.

Based on the vibrations of other spiders transmitted over the colony web, the movement of attraction or dislike can be modeled as follows:

$$f_i(t+1) = \begin{cases} f_i(t) + \alpha\,Vib_{i,c}\,(s_c - f_i(t)) + \beta\,Vib_{i,b}\,(s_b - f_i(t)) + \delta\left(rand - \tfrac{1}{2}\right), & r_m < PF \\ f_i(t) - \alpha\,Vib_{i,c}\,(s_c - f_i(t)) - \beta\,Vib_{i,b}\,(s_b - f_i(t)) + \delta\left(rand - \tfrac{1}{2}\right), & r_m \geq PF \end{cases} \tag{17}$$

Where $\alpha$, $\beta$, $\delta$, $rand$ and $r_m$ are random numbers in the range of 0 and 1; $t$ is the iteration number, with the maximum number of iterations set to 1000; $PF$ is the predetermined threshold value; and $s_c$ and $s_b$ are, respectively, the nearest better spider to spider $i$ and the best spider in the entire population according to the fitness value.

Step 6: Define the male cooperative behavior. In the spider population, there are dominant and non-dominant male spiders. The dominant ones have high-quality fitness values and better chances to attract the closest female spiders. Non-dominant male spiders, in contrast, tend to gather in the center of the male population to exploit resources left by the dominant ones:

$$m_i(t+1) = \begin{cases} m_i(t) + \alpha\,Vib_{i,f}\,(s_f - m_i(t)) + \delta\left(rand - \tfrac{1}{2}\right), & w_{N_f+i} > w_{N_f+m} \\ m_i(t) + \alpha\left(\dfrac{\sum_{h=1}^{N_m} m_h(t)\,w_{N_f+h}}{\sum_{h=1}^{N_m} w_{N_f+h}} - m_i(t)\right), & w_{N_f+i} \leq w_{N_f+m} \end{cases} \tag{18}$$

Where $s_f$ is the nearest female spider to the male spider $i$, $w_{N_f+m}$ is the weight of the median male spider, and the term $\sum_{h=1}^{N_m} m_h(t)\,w_{N_f+h} / \sum_{h=1}^{N_m} w_{N_f+h}$ represents the weighted mean of the male spiders $M$ in the population $S$.

Step 7: Select the best spiders to represent the next spider generation. Within a certain radius

353

calculated using Eq. (19), the dominant male and female spiders are matings resulting in new

354

spiders. After that, the fitness of newly produced spiders is evaluated and compared with their

355

parents. If new spiders have better quality than the parents, the new spiders continue, and the

356

parents are eliminated. t=

∑kšT ((š•Qž• − (šŸ 2 . 2

358

Where 2 represents the problem dimension and (š

359

combined PDFs.

357

(18)

•Qž•

¡

)

123 (šŸ

(19) ¡

are the upper and lower bounds,

respectively. In this study, they have values in the range [0,10] for single PDFs and [0,20] for

360 361
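The movement operators of Steps 5-7 can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the study: the vibration model `w * exp(-d^2)` follows the original SSO formulation of Cuevas et al. [39], the default `pf = 0.7` is an assumed threshold value, and all function and argument names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def vibration(weight_src, pos_src, pos_dst):
    """Vibration perceived at pos_dst from a source spider: w * exp(-d^2) (assumed model)."""
    d = np.linalg.norm(pos_src - pos_dst)
    return weight_src * np.exp(-d ** 2)

def female_step(f_i, s_c, w_c, s_b, w_b, pf=0.7):
    """One female move, Eq. (17): attraction with probability pf, repulsion otherwise."""
    alpha, beta, delta, rand = rng.random(4)
    pull = (alpha * vibration(w_c, s_c, f_i) * (s_c - f_i)
            + beta * vibration(w_b, s_b, f_i) * (s_b - f_i))
    sign = 1.0 if rng.random() < pf else -1.0
    return f_i + sign * pull + delta * (rand - 0.5)

def male_step(m_i, w_i, w_median, s_f, w_f, males, male_weights):
    """One male move, Eq. (18): dominant males approach the nearest female,
    non-dominant males approach the weighted mean of the male population."""
    alpha, delta, rand = rng.random(3)
    if w_i > w_median:
        return m_i + alpha * vibration(w_f, s_f, m_i) * (s_f - m_i) + delta * (rand - 0.5)
    centre = (male_weights[:, None] * males).sum(axis=0) / male_weights.sum()
    return m_i + alpha * (centre - m_i)

def mating_radius(p_low, p_high):
    """Eq. (19): r = sum_j (p_high_j - p_low_j) / (2 n)."""
    p_low = np.asarray(p_low, dtype=float)
    p_high = np.asarray(p_high, dtype=float)
    return np.sum(p_high - p_low) / (2 * p_low.size)

# Two Weibull parameters searched in [0, 10], as for the single PDFs in this study
radius = mating_radius([0.0, 0.0], [10.0, 10.0])
```

With two parameters bounded in [0,10], Eq. (19) gives a mating radius of 5, so in this setting a dominant male can mate with any female lying within half of the search range.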

5. Goodness of Fit Tests

The accuracy and efficiency of the considered numerical and optimization methods, that is, how close the theoretical frequency distribution is to the empirical frequency distribution, are evaluated using the following statistical indicators: Root Mean Square Error (RMSE), Coefficient of Determination ($R^2$), and Mean Absolute Error (MAE) tests.

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - x_i)^2} \qquad (20)$$

$$R^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2 - \sum_{i=1}^{n} (y_i - x_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \qquad (21)$$

$$MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - x_i| \qquad (22)$$

Where: $y_i$ is the actual wind speed data, $x_i$ is the estimated data generated from the theoretical PDFs, $\bar{y}$ is the mean value of $y_i$, and $n$ is the number of wind bin classes.

6. Results and Discussion

In this section, the performance of SSO is compared with that of other algorithms, and the wind speed frequency distributions at all seven Saudi sites are determined and discussed. Firstly, the evaluation of SSO performance is presented. Secondly, the comparison between the performance of four single and ten combined PDFs in describing wind speeds at the studied sites, as well as the comparison between MOAMs and CNMs in estimating PDF parameters, are discussed.
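Since every comparison in this section relies on the indicators of Eqs. (20)-(22), a minimal sketch of how they could be computed is given here. The function name and the sample frequencies are hypothetical and only serve as an illustration; the inputs are assumed to be the observed and modeled frequencies per wind speed bin.

```python
import numpy as np

def goodness_of_fit(observed, estimated):
    """Return (RMSE, R^2, MAE) following Eqs. (20)-(22)."""
    y = np.asarray(observed, dtype=float)   # observed frequencies per bin
    x = np.asarray(estimated, dtype=float)  # frequencies from the fitted PDF
    rmse = np.sqrt(np.mean((y - x) ** 2))
    ss_tot = np.sum((y - y.mean()) ** 2)
    ss_res = np.sum((y - x) ** 2)
    r2 = (ss_tot - ss_res) / ss_tot
    mae = np.mean(np.abs(y - x))
    return rmse, r2, mae

# Hypothetical observed and modeled frequencies for five wind speed bins
observed = [0.05, 0.20, 0.35, 0.28, 0.12]
modeled = [0.06, 0.19, 0.34, 0.29, 0.12]
rmse, r2, mae = goodness_of_fit(observed, modeled)
```

A perfect fit yields RMSE = 0, $R^2$ = 1 and MAE = 0, which is why lower RMSE and MAE and higher $R^2$ indicate a better model in the tables that follow.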

6.1. Performance Comparison

To assess the performance of SSO, it is used to obtain the parameters of the four single and ten combined PDFs mentioned in Sections 3.1 and 3.2. SSO is compared with three popular algorithms applied in the literature for parameter estimation, namely PSO, GA, and COA. The results of each algorithm consider the output of 50 runs with a stopping criterion of 1000 iterations. The final fitness values are taken as the median values of these 50 executions. Accordingly, the performance experiment has been conducted, and the comparison considers the following five performance indexes: the Best Fitness Value (BFV), Worst Fitness Value (WFV), Average of Best Fitness Values (ABFV), Median of Best Fitness Values (MBFV), and Standard Deviation of Best Fitness Values (SDFV).

Table 5 shows the results of the five indexes obtained by minimizing the objective function, Eq. (10), with Weibull PDFs at Aljouf, Jeddah, Sharurah, and Turaif. Results indicate that SSO outperforms the other algorithms with the best BFV and low WFV, ABFV, MBFV, and SDFV. This is because of the ability of SSO to balance exploitation and exploration [40]. Similar results are found with the Rayleigh, Gamma, and Lognormal distribution functions at all locations. However, due to space and word limitations, the results of these PDFs (except the Weibull PDF) are not presented herein.

Table 5: Results of five indexes for minimizing the objective function with Weibull PDF

| Site | Index | PSO | GA | COA | SSO |
|---|---|---|---|---|---|
| Aljouf | BFV | 1.4261E-04 | 1.4364E-04 | 1.9754E-04 | 1.4261E-04 |
| Aljouf | WFV | 2.7175E-03 | 1.1374E-03 | 4.6333E-02 | 3.4570E-04 |
| Aljouf | ABFV | 1.5804E-04 | 1.5884E-04 | 5.8175E-04 | 1.4448E-04 |
| Aljouf | MBFV | 1.4260E-04 | 1.4481E-04 | 3.0960E-04 | 1.1261E-04 |
| Aljouf | SDFV | 8.4720E-05 | 9.2489E-05 | 1.6710E-03 | 1.6504E-05 |
| Jeddah | BFV | 1.0373E-04 | 1.0274E-04 | 1.5328E-04 | 1.0373E-04 |
| Jeddah | WFV | 3.1002E-03 | 1.4616E-03 | 5.5573E-02 | 1.3458E-03 |
| Jeddah | ABFV | 1.1747E-04 | 1.2181E-04 | 4.8688E-04 | 1.0949E-04 |
| Jeddah | MBFV | 1.0373E-04 | 1.0386E-04 | 2.2422E-04 | 7.1037E-05 |
| Jeddah | SDFV | 9.6191E-05 | 1.0601E-04 | 1.8431E-03 | 7.5791E-05 |
| Sharurah | BFV | 8.6308E-04 | 8.6308E-04 | 9.1181E-04 | 8.6308E-04 |
| Sharurah | WFV | 2.2094E-03 | 1.8021E-03 | 6.6286E-02 | 9.6255E-04 |
| Sharurah | ABFV | 8.7209E-04 | 8.6795E-04 | 1.2183E-03 | 8.0447E-04 |
| Sharurah | MBFV | 8.6308E-04 | 8.6323E-04 | 9.6658E-04 | 6.6310E-04 |
| Sharurah | SDFV | 4.4871E-05 | 3.9543E-05 | 2.2162E-03 | 9.8837E-06 |
| Turaif | BFV | 5.3696E-04 | 5.3896E-04 | 5.8579E-04 | 5.3690E-04 |
| Turaif | WFV | 2.2084E-03 | 1.6556E-03 | 9.6938E-02 | 8.7994E-04 |
| Turaif | ABFV | 5.4721E-04 | 5.4107E-04 | 8.2193E-04 | 5.3909E-04 |
| Turaif | MBFV | 5.3696E-04 | 5.3704E-04 | 6.4819E-04 | 2.3697E-04 |
| Turaif | SDFV | 5.5525E-05 | 4.1021E-05 | 3.0667E-03 | 1.9147E-05 |
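The five indexes reported in Table 5 are summary statistics over the best fitness value returned by each independent run. A minimal sketch of that computation is shown below; it assumes a minimization problem (so the best value is the minimum) and uses the sample standard deviation (`ddof=1`), which is an assumption since the text does not state which estimator was used. The function name and the run values are hypothetical.

```python
import numpy as np

def fitness_indexes(best_per_run):
    """Summarize the best fitness value of each independent run (minimization)."""
    v = np.asarray(best_per_run, dtype=float)
    return {
        "BFV": v.min(),          # best (lowest) fitness over all runs
        "WFV": v.max(),          # worst (highest) fitness over all runs
        "ABFV": v.mean(),        # average of the per-run best values
        "MBFV": np.median(v),    # median of the per-run best values
        "SDFV": v.std(ddof=1),   # sample standard deviation (assumed estimator)
    }

# Hypothetical best fitness values from five runs of one algorithm
summary = fitness_indexes([1.43e-4, 1.44e-4, 1.43e-4, 2.10e-4, 1.50e-4])
```

A small SDFV together with a WFV close to the BFV indicates that an algorithm reaches nearly the same optimum in every run, which is the kind of robustness the comparison is meant to capture.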

Since analyzing the final fitness values cannot always describe the capability of an optimization algorithm, a convergence experiment has been conducted to evaluate how quickly the optimal value can be obtained. Fig. 5 presents the convergence rate plots of PSO, GA, COA, and SSO with selected PDFs that characterize wind speeds at Alwajh, Hafer Al Batin, Jeddah, and Riyadh. This figure shows that SSO converges the fastest compared to the other algorithms and can attain the best parameters in less than 100 iterations.

Fig. 5: The convergence plots of PSO, GA, COA and SSO in obtaining the optimal parameters: (a) Alwajh, (b) Hafer Al Batin, (c) Jeddah, (d) Riyadh.

6.2. Analysis of Single PDF

In this part, the four single distributions are compared according to the goodness of fit tests, RMSE, $R^2$, and MAE, to test their performance in fitting the observed frequency. Results of the best distribution, corresponding estimated parameters, and statistical errors of single distributions are shown in Table 6 using CNMs and Table 7 using MOAMs. Figs. 6-8 display a graphical representation of the goodness of fit tests, RMSE (in Fig. 6), $R^2$ (in Fig. 7), and MAE (in Fig. 8), in the form of a heat map. These figures aim to compare all 26 models (four single distributions for which the parameters are estimated by seven estimation approaches, excluding LSM for the Gamma and Lognormal distributions) considered in this study with the single distributions at each Saudi site.

Table 6: Parameters and goodness of fit results with the best single distributions using numerical methods

| City | Best Distribution | Method | $c_1$ (m/s) | $k_1$ | RMSE | $R^2$ | MAE |
|---|---|---|---|---|---|---|---|
| Aljouf | Weibull | MOM | 5.78708 | 2.09765 | 0.00406 | 0.99459 | 0.00260 |
| Alwajh | Weibull | MOM | 5.16142 | 1.76149 | 0.00925 | 0.97005 | 0.00590 |
| Hafer Al Batin | Weibull | MOM | 6.34887 | 2.24802 | 0.00671 | 0.98383 | 0.00428 |
| Jeddah | Weibull | MOM | 5.83069 | 1.91972 | 0.00347 | 0.99554 | 0.00239 |
| Riyadh | Weibull | MOM | 5.83351 | 2.21117 | 0.00676 | 0.98621 | 0.00429 |
| Sharurah | Weibull | MOM | 6.21513 | 2.22093 | 0.01085 | 0.95735 | 0.00695 |
| Turaif | Weibull | MOM | 6.61003 | 2.31422 | 0.00791 | 0.97710 | 0.00482 |

Table 7: Parameters and goodness of fit results with the best single distribution using metaheuristic optimization methods

| City | Best Distribution | Method | $c_1$ (m/s) | $k_1$ | RMSE | $R^2$ | MAE |
|---|---|---|---|---|---|---|---|
| Aljouf | Weibull | PSO, SSO | 5.86970 | 2.09269 | 0.00378 | 0.99531 | 0.00244 |
| Alwajh | Weibull | PSO, SSO | 5.46465 | 1.70825 | 0.00758 | 0.98989 | 0.00574 |
| Hafer Al Batin | Weibull | PSO, SSO | 6.50326 | 2.25453 | 0.00616 | 0.98640 | 0.00417 |
| Jeddah | Weibull | PSO, SSO | 5.87299 | 1.95255 | 0.00322 | 0.99617 | 0.00227 |
| Riyadh | Weibull | PSO, SSO | 5.75853 | 2.14617 | 0.00627 | 0.98815 | 0.00402 |
| Sharurah | Weibull | PSO, SSO | 6.53534 | 2.22919 | 0.00729 | 0.98771 | 0.00437 |
| Turaif | Weibull | PSO, SSO | 6.77495 | 2.35907 | 0.00733 | 0.98037 | 0.00481 |

Fig. 6: RMSE test results with the 26 models of single distributions at Saudi sites

Fig. 7: $R^2$ test results with the 26 models of single distributions at Saudi sites

Fig. 8: MAE test results with the 26 models of single distributions at Saudi sites

Table 6, Table 7, and Figs. 6-8 show that: (i) Overall, MOAMs outperform the CNMs in obtaining the best distribution parameters, with low RMSE and MAE values and high correlation scores. Regarding model fitting accuracy, the best models with MOAMs show improvements compared with the CNMs' best models, where RMSE and MAE improved between 6.9 to 18% and between 2.5 to 8.3%, respectively. This can also be deduced from Fig. 9(a) through Fig. 9(g), where the PDFs and CDFs modeled by MOAMs and CNMs are plotted against the observed wind speed frequencies. Figs. 9(a)-9(g) show that the PDFs of MOAMs yield a good fit to the observed wind speeds. According to Table 6 and Table 7, the $R^2$ values of the most accurate MOAM are ≥ 0.98771 for all sites, while the $R^2$ values for the best CNM are ≥ 0.95735. In Sharurah city, for example, the value of RMSE with the best CNM model (MOM-Weibull) is 0.01085, $R^2$ is 0.95735, and MAE is 0.00695. On the other hand, the best MOAM model (SSO-Weibull) gives an RMSE value of 0.00729, $R^2$ of 0.98771, and MAE of 0.00437.

(ii) From Table 6, Table 7, and Figs. 6-8, and by comparing the different distribution models including Weibull, Rayleigh, Lognormal, and Gamma, the Weibull distribution can be considered the dominant distribution that best captures the wind speed distributions at the selected Saudi sites. For instance, MOM-Weibull has higher correlation values and lower RMSE and MAE than MOM-Rayleigh, MOM-Gamma, and MOM-Lognormal. Similarly, SSO-Weibull models are better than SSO-Rayleigh, SSO-Gamma, and SSO-Lognormal at all sites. This result is also found with all estimation methods.

(iii) For the numerical approaches, LSM has the worst estimation results, and MOM is the most precise numerical method, followed by MLM. On the other hand, more than one MOAM approach has the same accuracy and performance in estimating the parameters of the considered distributions. The PSO and SSO methods show similar or negligibly different estimated parameter values. Yet, SSO is the fastest in attaining these values, as shown in Section 6.1. Considering the Turaif site, for example, Table 6 shows that MOM is the most accurate CNM to calculate the Weibull distribution parameters, where c and k are found to be 6.61003 and 2.31422, respectively; whereas Table 7 reveals that PSO and SSO are the best MOAMs to tune the Weibull distribution parameters, with c equal to 6.77485 and k equal to 2.35907. In this respect, SSO presents the most accurate and fastest estimation approach to obtain PDF parameters. PSO and GA could be considered the next best algorithms, whereas COA appears to be the poorest MOAM.
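As a worked example of the improvement figures quoted in point (i), the lower and upper ends of the RMSE range can be reproduced from Tables 6 and 7. The sketch below assumes that "improvement" means the relative error reduction with respect to the best CNM model, which the text does not state explicitly; the function name is hypothetical.

```python
def improvement_pct(cnm_error, moam_error):
    """Relative error reduction (in percent) of the MOAM fit over the CNM fit."""
    return 100.0 * (cnm_error - moam_error) / cnm_error

# RMSE of the best CNM (Table 6) and best MOAM (Table 7) models
aljouf_rmse = improvement_pct(0.00406, 0.00378)   # about 6.9 %
alwajh_rmse = improvement_pct(0.00925, 0.00758)   # about 18 %
```

Under this definition, Aljouf gives the 6.9% lower end and Alwajh the 18% upper end of the RMSE range reported above.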

Fig. 9: Measured frequencies with best single distributions and all estimation methods at each Saudi site: (a) Aljouf, (b) Alwajh, (c) Hafer Al Batin, (d) Jeddah, (e) Riyadh, (f) Sharurah, (g) Turaif (Weibull distribution in every panel).

Fig. 10: Measured frequencies and ten combined distributions at (a) Alwajh and (b) Jeddah sites

6.3. Analysis of combined PDF

This part analyzes and compares the properties of the ten combined distributions proposed in this study. With the SSO method used to tune the weights and estimate the parameters of the combined distributions, the PDFs and CDFs of the combined distributions at the Alwajh and Jeddah sites are shown in Fig. 10(a) and Fig. 10(b), respectively. Only these two sites are depicted, for the sake of clarity and due to space limitations.

The goodness of fit test results of the combined distributions are summarized in Table 8. Results indicate that the combined distributions have better performance compared to the single distributions, with $R^2$ values exceeding 0.9971. When comparing the best combined models with the best single-distribution models at all considered sites, the fitting accuracy of the combined models shows noticeable improvements, where RMSE and MAE are improved between 55 to 76% and between 55 to 73%, respectively. This high level of accuracy implies that wind speeds are characterized more accurately when two distributions are mixed to capture the behavior of wind speed. Table 8 includes the best combined distribution at each site, where MWL, MGG, MRG, and MWW are the four combined distributions that prevailed in this study. At the Jeddah, Riyadh, and Turaif sites, the MWW distribution is considered the best to characterize the wind speed regimes, while MGG appears to fit accurately in Alwajh. MRG, on the other hand, shows its ability to represent the wind accurately in Hafer Al Batin and Sharurah. Lastly, the MWL distribution has the best performance at the Aljouf site.

Table 8: Weights and parameters results with the best combined distributions at Saudi sites

| City | Best Distribution | $w_1$ | $w_2$ | Parameter 1 | Parameter 2 | Parameter 3 | Parameter 4 | RMSE | $R^2$ | MAE |
|---|---|---|---|---|---|---|---|---|---|---|
| Aljouf | MWL | 0.9504 | 0.0496 | 5.9652 | 0.4000 | 2.2625 | 0.4972 | 0.0017 | 0.9991 | 0.0010 |
| Alwajh | MGG | 0.7515 | 0.2485 | 5.4692 | 2.7957 | 1.0378 | 0.6360 | 0.0022 | 0.9984 | 0.0015 |
| Hafer Al Batin | MRG | 0.8087 | 0.1913 | 4.1807 | 20.0000 | 2.0000 | 0.3577 | 0.0027 | 0.9975 | 0.0018 |
| Jeddah | MWW | 0.0524 | 0.9476 | 5.4704 | 5.9055 | 5.7823 | 1.8602 | 0.0012 | 0.9994 | 0.0010 |
| Riyadh | MWW | 0.8723 | 0.1277 | 6.1918 | 3.0349 | 2.4362 | 3.4055 | 0.0015 | 0.9993 | 0.0012 |
| Sharurah | MRG | 0.5083 | 0.4917 | 3.0703 | 14.0749 | 2.0000 | 0.5152 | 0.0028 | 0.9971 | 0.0019 |
| Turaif | MWW | 0.8229 | 0.1771 | 6.3570 | 7.5199 | 2.0631 | 5.6880 | 0.0025 | 0.9977 | 0.0017 |
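To make the notion of a combined distribution concrete, the sketch below evaluates a two-component Weibull-Weibull mixture, $f(v) = w_1 f_1(v) + w_2 f_2(v)$ with $w_1 + w_2 = 1$, and checks numerically that it integrates to one. The parameter values are purely illustrative assumptions and are not taken from Table 8; the function names are hypothetical.

```python
import numpy as np

def weibull_pdf(v, c, k):
    """Two-parameter Weibull PDF with scale c (m/s) and shape k."""
    v = np.asarray(v, dtype=float)
    return (k / c) * (v / c) ** (k - 1) * np.exp(-(v / c) ** k)

def mixture_pdf(v, w1, comp1, comp2):
    """Weighted sum of two Weibull PDFs; the second weight is 1 - w1."""
    return w1 * weibull_pdf(v, *comp1) + (1.0 - w1) * weibull_pdf(v, *comp2)

# Illustrative parameters only (weight, (c, k) of each component)
v = np.linspace(1e-6, 40.0, 20001)
f = mixture_pdf(v, 0.8, (6.0, 2.2), (3.0, 3.0))

# Trapezoidal rule written out explicitly: the area should be close to 1
area = float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(v)))
```

Because the weights sum to one and each component is a valid PDF, the mixture is itself a valid PDF, while the second component lets the model capture a secondary mode or a heavier low-speed shoulder that a single Weibull cannot.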

Figs. 11-13 compare the statistical indicator results, RMSE, $R^2$, and MAE, respectively, of the ten combined distributions. In Aljouf, the RMSE value is found to be 0.0017, and the $R^2$ and MAE values are 0.9991 and 0.0010, respectively. Comparing these results with the best single distribution results shown in Table 7, we found a remarkable improvement in RMSE (55% improvement), MAE (59% improvement), and correlation values. Higher improvements in error percentages are also obtained at all sites. In addition, as can be noticed from Figs. 11-13, the performance of the MRR and MRL models is the same, and the Rayleigh distribution dominates the Lognormal if they are combined to represent the wind regimes at all study areas.

Fig. 11: RMSE test results with the ten models of combined distributions at Saudi sites

Fig. 12: $R^2$ test results with the ten models of combined distributions at Saudi sites

Fig. 13: MAE test results with the ten models of combined distributions at Saudi sites

7. Conclusion

In this paper, a statistical study has been conducted to represent the wind distributions at seven locations in Saudi Arabia. Firstly, to understand the probabilistic behavior of the wind regime, four single and ten combined PDFs have been tested. Secondly, since the performance of these PDFs depends mainly on their parameter values, SSO is proposed for the first time in wind energy applications to estimate the single and combined parameters. Furthermore, the efficiency of all the study's models has been evaluated based on the Root Mean Square Error, Coefficient of Determination, and Mean Absolute Error tests. Analyzing the results of the single and combined models and the performance of the estimation approaches, the conclusions can be summarized as follows:

1. According to the goodness of fit tests, the Weibull distribution proves to be the best single model to accurately fit the observed wind speeds, and the combined distributions outperform single distributions in representing the wind regimes at all locations, with $R^2$ values exceeding 0.9971.
2. The Method of Moments is the best CNM to fit the study distributions, followed by MLM, whereas LSM gives the poorest estimation results.
3. Overall, MOAMs provide better and satisfying results in obtaining distribution parameters compared to CNMs, with $R^2$ ≥ 0.98771 and RMSE ≤ 0.00758.
4. SSO performs the best compared to PSO, GA, and COA in terms of having good fitness value metrics and a fast rate of convergence. Therefore, it can be employed to estimate different PDF parameters in wind energy applications.
5. The optimal single distribution model is the SSO-based Weibull distribution. Hence, this model can be used to assess wind energy resources.

Overall, the framework provided in this study is helpful in identifying the best PDFs that characterize wind speeds in different locations and in justifying the economic and technical viability of any wind energy project. As mentioned before, this paper examined four single and ten combined distributions, the parameters of which were obtained by using three commonly used numerical methods, three popular optimization algorithms, and the proposed SSO. Further analyses taking into account additional probability distribution functions and different numerical and recent optimization methods could be explored. Finally, this study was conducted based on hourly wind speed data. Using a finer temporal resolution, such as 10 minutes, may provide a more thorough analysis of, and more insight into, the distributions.

ACKNOWLEDGMENTS

The first author, Musaed Alrashidi, would like to thank Qassim University, Saudi Arabia, for the financial support in the form of funded educational scholarships.

Funding: This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

REFERENCES

[1] Adib REN R, Folkecenter M, Bank AD, Eckhart Mohamed El-Ashry David Hales Kirsty Hamilton Peter Rae M, Bariloche F. Renewables 2018: Global Status Report. n.d.
[2] Symposium AMEI. Managing Large-Scale Penetration of Intermittent Renewables 2011.
[3] Ucar A, Balo F. Evaluation of wind energy potential and electricity generation at six locations in Turkey. Appl Energy 2009;86:1864–72. doi:10.1016/j.apenergy.2008.12.016.
[4] Masseran N. Evaluating wind power density models and their statistical properties. Energy 2015;84:533–41. doi:10.1016/j.energy.2015.03.018.
[5] Carneiro TC, Melo SP, Carvalho PCM, Plínio A, Braga S. Particle Swarm Optimization method for estimation of Weibull parameters: A case study for the Brazilian northeast region. Renew Energy 2016;86:751–9. doi:10.1016/j.renene.2015.08.060.
[6] Ouarda TBMJ, Charron C. On the mixture of wind speed distribution in a Nordic region. Energy Convers Manag 2018;174:33–44. doi:10.1016/j.enconman.2018.08.007.
[7] Usta I. An innovative estimation method regarding Weibull parameters for wind energy applications. Energy 2016;106:301–14. doi:10.1016/j.energy.2016.03.068.
[8] Ramírez P, Carta JA. Influence of the data sampling interval in the estimation of the parameters of the Weibull wind speed probability density distribution: a case study. Energy Convers Manag 2005;46:2419–38. doi:10.1016/J.ENCONMAN.2004.11.004.
[9] Carta JA, Ramírez P, Velázquez S. A review of wind speed probability distributions used in wind energy analysis: Case studies in the Canary Islands. Renew Sustain Energy Rev 2009;13:933–55. doi:10.1016/j.rser.2008.05.005.
[10] Ouarda TBMJ, Charron C, Shin JY, Marpu PR, Al-Mandoos AH, Al-Tamimi MH, et al. Probability distributions of wind speed in the UAE. Energy Convers Manag 2015;93:414–34. doi:10.1016/j.enconman.2015.01.036.
[11] Wais P. Two and three-parameter Weibull distribution in available wind power analysis. Renew Energy 2017;103:15–29. doi:10.1016/j.renene.2016.10.041.
[12] Hu Q, Wang Y, Xie Z, Zhu P, Yu D. On estimating uncertainty of wind energy with mixture of distributions. Energy 2016;112:935–62. doi:10.1016/j.energy.2016.06.112.
[13] Dong Y, Wang J, Jiang H, Shi X. Intelligent optimized wind resource assessment and wind turbines selection in Huitengxile of Inner Mongolia, China. Appl Energy 2013;109:239–53. doi:10.1016/j.apenergy.2013.04.028.
[14] Wang J, Hu J, Ma K. Wind speed probability distribution estimation and wind energy assessment. Renew Sustain Energy Rev 2016;60:881–99. doi:10.1016/j.rser.2016.01.057.
[15] Brano V Lo, Orioli A, Ciulla G, Culotta S. Quality of wind speed fitting distributions for the urban area of Palermo, Italy. Renew Energy 2011;36:1026–39. doi:10.1016/j.renene.2010.09.009.
[16] Hussain Hulio Z, Jiang W, Rehman S. Technical and economic assessment of wind power potential of Nooriabad, Pakistan. Energy Sustain Soc 2017;7:35. doi:10.1186/s13705-017-0137-9.
[17] Kollu R, Rayapudi SR, Narasimham S, Pakkurthi KM. Mixture probability distribution functions to model wind speed distributions. Int J Energy Environ Eng 2012;3:27. doi:10.1186/2251-6832-3-27.
[18] Wu J, Wang J, Chi D. Wind energy potential assessment for the site of Inner Mongolia in China. Renew Sustain Energy Rev 2013;21:215–28. doi:10.1016/j.rser.2012.12.060.
[19] Jiang H, Wang J, Wu J, Geng W. Comparison of numerical methods and metaheuristic optimization algorithms for estimating parameters for wind energy potential assessment in low wind regions. Renew Sustain Energy Rev 2017;69:1199–217. doi:10.1016/j.rser.2016.11.241.
[20] Akgül FG, Şenoğlu B, Arslan T. An alternative distribution to Weibull for modeling the wind speed data: Inverse Weibull distribution. Energy Convers Manag 2016;114:234–40. doi:10.1016/j.enconman.2016.02.026.
[21] Baseer MA, Meyer JP, Rehman S, Alam MM. Wind power characteristics of seven data collection sites in Jubail, Saudi Arabia using Weibull parameters. Renew Energy 2017;102:35–49. doi:10.1016/j.renene.2016.10.040.
[22] Rehman S, Al-Abbadi NM. Wind power characteristics on the north west coast of Saudi Arabia. Energy Environ 2009;2021:1257–70.
[23] Soukissian TH, Karathanasi FE. On the selection of bivariate parametric models for wind data. Appl Energy 2017;188:280–304. doi:10.1016/j.apenergy.2016.11.097.
[24] Rehman S, Mahbub Alam AM, Meyer JP, Al-Hadhrami LM. Wind speed characteristics and resource assessment using Weibull parameters. Int J Green Energy 2012;9:800–14. doi:10.1080/15435075.2011.641700.
[25] Chang TP. Wind energy assessment incorporating particle swarm optimization method. Energy Convers Manag 2011;52:1630–7. doi:10.1016/j.enconman.2010.10.024.
[26] Kantar YM, Usta I. Analysis of the upper-truncated Weibull distribution for wind speed. Energy Convers Manag 2015;96:81–8. doi:10.1016/j.enconman.2015.02.063.
[27] Chang TP. Estimation of wind energy potential using different probability density functions. Appl Energy 2011;88:1848–56. doi:10.1016/j.apenergy.2010.11.010.
[28] Arslan T, Bulut YM, Yavuz AA. Comparative study of numerical methods for determining Weibull parameters for wind energy potential. Renew Sustain Energy Rev 2014;40:820–5. doi:10.1016/j.rser.2014.08.009.
[29] Bagiorgas HS, Giouli M, Rehman S, Al-Hadhrami LM. Weibull parameters estimation using four different methods and most energy-carrying wind speed analysis. Int J Green Energy 2011;8:529–54. doi:10.1080/15435075.2011.588767.
[30] Shin J-Y, Heo J-H, Jeong C, Lee T. Meta-heuristic maximum likelihood parameter estimation of the mixture normal distribution for hydro-meteorological variables. Stoch Environ Res Risk Assess 2014;28:347–58. doi:10.1007/s00477-013-0753-7.
[31] Wang J, Hu J, Ma K. Wind speed probability distribution estimation and wind energy assessment. Renew Sustain Energy Rev 2016;60:881–99. doi:10.1016/j.rser.2016.01.057.
[32] Renewable Resource Atlas. Saudi Arab n.d. https://rratlas.kacare.gov.sa/RRMMPublicPortal/ (accessed October 10, 2018).
[33] [dataset] King Abdullah City for Atomic and Renewable Energy (K.A.CARE). Renewable Resource Atlas n.d. https://rratlas.kacare.gov.sa/RRMMPublicPortal/?q=en/Home (accessed September 25, 2019).
[34] Chaurasiya PK, Ahmed S, Warudkar V. Comparative analysis of Weibull parameters for wind data measured from met-mast and remote sensing techniques. Renew Energy 2018;115:1153–65. doi:10.1016/j.renene.2017.08.014.
[35] Bataineh KM, Dalalah D. Assessment of wind energy potential for selected areas in Jordan. Renew Energy 2013;59:75–81. doi:10.1016/j.renene.2013.03.034.
[36] Werapun W, Tirawanichakul Y, Waewsak J. Comparative Study of Five Methods to Estimate Weibull Parameters for Wind Speed on Phangan Island, Thailand. Energy Procedia, vol. 79, Elsevier B.V.; 2015, p. 976–81. doi:10.1016/j.egypro.2015.11.596.
[37] Indhumathy D, Seshaiah C V, Sukkiramathi K. Estimation of Weibull Parameters for Wind speed calculation at Kanyakumari in India. Int J Innov Res Sci 2014;3:8340–5.
[38] Azad AK, Rasul MG, Yusaf T. Statistical diagnosis of the best Weibull methods for wind power assessment for agricultural applications. Energies 2014;7:3056–85. doi:10.3390/en7053056.
[39] Cuevas E, Cienfuegos M, Zaldívar D, Pérez-Cisneros M. A swarm optimization algorithm inspired in the behavior of the social-spider. Expert Syst Appl 2013;40:6374–84. doi:10.1016/j.eswa.2013.05.041.
[40] Luque-Chang A, Cuevas E, Fausto F, Zaldívar D, Pérez M. Social Spider Optimization Algorithm: Modifications, Applications, and Perspectives. Math Probl Eng 2018;2018:1–29. doi:10.1155/2018/6843923.
[41] Kennedy J, Eberhart R. Particle swarm optimization. Neural Networks, 1995 Proceedings, IEEE Int Conf 1995;4:1942–8. doi:10.1109/ICNN.1995.488968.
[42] Pai P-F, Hong W-C. Forecasting regional electricity load based on recurrent support vector machines with genetic algorithms. Electr Power Syst Res 2005;74:417–25. doi:10.1016/j.epsr.2005.01.006.
[43] K. Sastry, D. Goldberg and G. Kendall (2006). Genetic algorithms. In: E.K. Burke and G. Kendall (eds.) (2005). Introductory Tutorials in Optimization, Decision Support and Search Methodology. ISBN: 0387234608, Springer. Chapter 4, 97-125.
[44] Rajabioun R. Cuckoo Optimization Algorithm. Appl Soft Comput 2011;11:5508–18. doi:10.1016/j.asoc.2011.05.008.
[45] Mousavirad SJ, Ebrahimpour-Komleh H. Entropy based optimal multilevel thresholding using cuckoo optimization algorithm. Proc - 2015 11th Int Conf Innov Inf Technol IIT 2015 2016:302–7. doi:10.1109/INNOVATIONS.2015.7381558.

Appendix (A)

A.1. Particle Swarm Optimization (PSO)

Particle Swarm Optimization (PSO) was proposed by Kennedy and Eberhart in 1995 by observing the movement behavior of species such as bird flocks and fish schools [41]. In PSO, a group of particles evolves in the search space aiming to obtain the optimal solution. In a $D$-dimensional search space, each of these particles is assigned a position vector $X_i = [x_{i1}, x_{i2}, \ldots, x_{iD}]$ and a velocity vector $V_i = [v_{i1}, v_{i2}, \ldots, v_{iD}]$. In each iteration of the algorithm, the fitness value of each particle is evaluated based on the objective function, see Eq. (10), and the best position $P_i = [p_{i1}, p_{i2}, \ldots, p_{iD}]$ is recorded. The coordinate of the best particle fitness of the swarm is assigned as the global best position $P_g = [p_{g1}, p_{g2}, \ldots, p_{gD}]$. Until the stopping criteria are satisfied, the position and the velocity of the $i$-th particle are updated in each iteration based on the following equations:

$$V_i^{k+1} = \omega \times V_i^{k} + c_1 \times rand() \times (P_i^{k} - X_i^{k}) + c_2 \times rand() \times (P_g^{k} - X_i^{k}) \qquad (A1)$$

$$X_i^{k+1} = X_i^{k} + V_i^{k+1} \qquad (A2)$$

Where: $\omega$ is the inertia weight, $c_1$ and $c_2$ are the social and cognitive parameters, respectively, and $rand()$ is a random number selected in the range [0, 1].

In this study, the algorithm was initiated with a maximum of 1000 iterations ($iter_{max}$), upper and lower values of the search space in the range [0,10], and 50 particles as a random population. The inertia weight ($\omega$), the social parameter ($c_1$), and the cognitive parameter ($c_2$) are updated nonlinearly at each iteration ($j$) using Eqs. (A3)-(A5) [5]:

$$\omega(j) = \left(1 - \frac{j}{iter_{max}}\right)^{\alpha} (\omega_{max} - \omega_{min}) + \omega_{min} \qquad (A3)$$

$$c_1(j) = \left(1 - \frac{j}{iter_{max}}\right)^{\beta} (c_{1,max} - c_{1,min}) + c_{1,min} \qquad (A4)$$

$$c_2(j) = \left(1 - \frac{j}{iter_{max}}\right)^{\gamma} (c_{2,min} - c_{2,max}) + c_{2,max} \qquad (A5)$$

Where $\omega_{max}$ and $\omega_{min}$ are the maximum and minimum values of the inertia weight, which are 0.9 and 0.4, respectively. $c_{1,max}$ and $c_{1,min}$ are the maximum and minimum values of the social parameter, which are 2.5 and 0, respectively. The maximum and minimum values of the cognitive parameter, $c_{2,max}$ and $c_{2,min}$, are set to 2.5 and 0, respectively. The power coefficients $\alpha$, $\beta$ and $\gamma$ are set to 0.5, 1.5 and 1, respectively.
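The update rules (A1)-(A5) can be sketched compactly in Python. This is an illustrative sketch under stated assumptions rather than the authors' implementation: positions are clipped to the search bounds, velocities start at zero, and the helper names `schedule` and `pso` as well as the test objective are hypothetical.

```python
import numpy as np

def schedule(j, iter_max, start, end, power):
    """Nonlinear schedule of Eqs. (A3)-(A5): equals `start` at j = 0 and `end` at j = iter_max."""
    return (1.0 - j / iter_max) ** power * (start - end) + end

def pso(objective, dim, n_particles=50, iter_max=1000, lo=0.0, hi=10.0, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, (n_particles, dim))   # positions X_i
    v = np.zeros_like(x)                          # velocities V_i (assumed zero start)
    p_best = x.copy()                             # personal best positions P_i
    p_val = np.array([objective(p) for p in x])
    g_best = p_best[p_val.argmin()].copy()        # global best position P_g
    for j in range(iter_max):
        w = schedule(j, iter_max, 0.9, 0.4, 0.5)    # inertia weight, Eq. (A3)
        c1 = schedule(j, iter_max, 2.5, 0.0, 1.5)   # c1 decreases, Eq. (A4)
        c2 = schedule(j, iter_max, 0.0, 2.5, 1.0)   # c2 increases, Eq. (A5)
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)   # Eq. (A1)
        x = np.clip(x + v, lo, hi)                                    # Eq. (A2), kept in bounds
        val = np.array([objective(p) for p in x])
        better = val < p_val
        p_best[better], p_val[better] = x[better], val[better]
        g_best = p_best[p_val.argmin()].copy()
    return g_best, float(p_val.min())

# Hypothetical smooth test objective with its minimum at (3, 3)
best_x, best_f = pso(lambda p: float(np.sum((p - 3.0) ** 2)), dim=2, iter_max=300)
```

With these schedules, the early iterations are dominated by the personal-best term (exploration), while the late iterations are dominated by the global-best term with a small inertia weight (exploitation around the best solution found).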

A.2. Genetic Algorithm

Genetic algorithm (GA) is an evolutionary algorithm driven by natural selection and genetics [42,43]. Three main components describe GA: chromosome, population and generation [30]. The main idea of GA depends on the survival of the fittest individuals (chromosomes) in the search space. The GA process can be described as follows [42]:

Step 1: The initial chromosome populations are generated randomly, and the considered distribution parameters represent the problem chromosome. The bounds of the search space are between 0 and 10.
Step 2: The fitness of each chromosome in the population is evaluated using the objective function shown in Eq. (10) in the main manuscript.
Step 3: The chromosomes that have the highest fitness values are selected since they have a higher chance to reproduce the algorithm's next generation.
Step 4: Add new offspring to the population using the crossover process. For each pair of parents, the '1' bit is replaced by the '0' bit and vice versa randomly within the genes. After that, diversify the population by applying the mutation process. Both the mutation and crossover processes are defined probabilistically.
Step 5: The best fitness values of the new population continue to the next generation.
Step 6: Check if the termination conditions are satisfied; otherwise, the algorithm should return to Step 2.
Step 7: The optimum chromosome represents the optimal solution to the problem (best distribution parameters).

A.3. Cuckoo Optimization Algorithm

Cuckoo Optimization Algorithm (COA) is an evolutionary algorithm established by mimicking the behavior of cuckoo birds in their survival strategies. COA was introduced by R. Rajabioun in 2011 [44]. Cuckoos lay their eggs in the nests of other birds, called host birds. If these host birds discover the cuckoo eggs, they throw them out of the nest. The surviving eggs create the next cuckoo generation and follow the same egg laying style in other habitats.

The detailed steps of COA are described as follows [44,45]:

Step 1: Initialize the parameters of ª ¾: number of initial cuckoo in the habitat (N5Øà ), the upper (á() and lower (Ju•) bounds of number of eggs for each cuckoo, maximum number of cuckoos that live at the same time (NwØÙ ), and maximum number of iterations (U’&twØÙ ). In this study, these variables are set as follows: N5Øà = 5, á( and Ju• bounds are, 0 and 10, respectively. NwØÙ were set to 20 and U’&twØÙ is 1000. Step 2: Randomly generate the number of eggs (Nâžž“ ) for each cuckoo using the following formula:

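$N_{\mathrm{eggs}} = \lfloor (hi - low) \cdot rand + low \rfloor$ (A6)

where $rand$ is a random number, and $hi$ and $low$ are the upper and lower bounds on the number of eggs.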

Step 3: Determine the "Egg Laying Radius (ELR)", which represents the maximum distance from their habitat within which cuckoos can lay eggs. ELR is defined by the following equation:

$ELR = \alpha \times \dfrac{\text{Number of current cuckoo's eggs}}{\text{Total number of eggs}} \times (hi - low)$ (A7)

where $\alpha$ is an integer that controls the maximum value of ELR and is set to 5 in this study, and $hi$ and $low$ are the upper and lower bounds defined in Step 1.


Step 4: Randomly lay each cuckoo's eggs in host birds' nests within the predetermined ELR. If the host birds discover the cuckoo eggs, they throw them out of the nest. A percentage of the laid eggs is detected in this way and has no chance of surviving.
Step 5: The newly grown cuckoos live around their hatching areas. When it is time to lay their eggs, they migrate to new habitats that offer a higher chance of survival. To distinguish between communities, the K-means clustering algorithm is applied (a K of 3 to 5 is sufficient). The area with the best profit value then becomes the new best habitat and is the goal toward which the other cuckoos migrate.

The cuckoos' movements toward the new best habitat are constrained. Each cuckoo moves $\lambda$% of the total distance toward the target point, with a deviation of $\varphi$ radians. These two values are drawn from uniform distributions defined as follows:

$\lambda \sim U(0, 1)$ (A8)

$\varphi \sim U(-\omega, \omega)$ (A9)

where $\lambda$ is a uniformly generated number in [0, 1] and $\omega$ is a number that controls the deviation from the target habitat.

Step 6: Eliminate the cuckoos that belong to the worst habitats and keep the $N_{\max}$ cuckoos that have the best profit values.
Step 7: Check whether the stopping criteria are fulfilled; otherwise, the algorithm returns to Step 2.
Step 8: The optimal solution of the problem (best distribution parameters) is represented by the optimum nest position.
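To make the arithmetic in Eqs. (A6) to (A9) and the selection in Step 6 concrete, the following Python fragment is a minimal sketch and not the implementation used in this study. All function names are hypothetical. It assumes that the random number in Eq. (A6) is uniform on [0, 1), that habitats are two-dimensional points (as for a two-parameter distribution), and that the deviation bound defaults to π/6, which is only an illustrative value. The function toy_profit is a made-up stand-in for the objective function of Eq. (10). Egg laying within the ELR (Step 4) and K-means clustering (Step 5) are omitted.

```python
import math
import random

def eggs_per_cuckoo(n_cuckoos, low=0, hi=10, rng=random):
    """Eq. (A6): egg count per cuckoo; rand is assumed uniform on [0, 1)."""
    return [math.floor((hi - low) * rng.random() + low) for _ in range(n_cuckoos)]

def egg_laying_radius(eggs, low=0, hi=10, alpha=5):
    """Eq. (A7): ELR per cuckoo, proportional to its share of all eggs."""
    total = sum(eggs)
    if total == 0:  # guard added for this sketch; not discussed in the text
        return [0.0] * len(eggs)
    return [alpha * (e / total) * (hi - low) for e in eggs]

def migrate(position, goal, omega=math.pi / 6, rng=random):
    """Eqs. (A8)-(A9): move a 2-D point part of the way to goal, deviating by phi."""
    lam = rng.uniform(0.0, 1.0)       # fraction of the distance covered
    phi = rng.uniform(-omega, omega)  # angular deviation in radians
    sx = lam * (goal[0] - position[0])
    sy = lam * (goal[1] - position[1])
    # rotate the scaled displacement by phi
    rx = sx * math.cos(phi) - sy * math.sin(phi)
    ry = sx * math.sin(phi) + sy * math.cos(phi)
    return (position[0] + rx, position[1] + ry)

def cull(habitats, profit, n_max=20):
    """Step 6: keep at most n_max habitats with the highest profit values."""
    return sorted(habitats, key=profit, reverse=True)[:n_max]

def toy_profit(h):
    # hypothetical stand-in for the objective; its peak is at (2.0, 8.0)
    return -((h[0] - 2.0) ** 2 + (h[1] - 8.0) ** 2)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
eggs = eggs_per_cuckoo(5, rng=rng)
radii = egg_laying_radius(eggs)
moved = migrate((0.0, 0.0), (4.0, 3.0), rng=rng)
survivors = cull([(1.0, 7.0), (2.0, 8.0), (5.0, 5.0), (2.1, 7.9)], toy_profit, n_max=2)
```

Because the random draw is strictly below 1 under this assumption, the floor in Eq. (A6) yields integers from the lower bound up to one less than the upper bound. The migrated point never travels farther than the full distance to the goal, since the rotation by the deviation angle preserves the length of the scaled step.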


Highlights:

• Wind characteristics are analyzed for wind energy potentials at seven sites in Saudi Arabia
• Combined distributions outperform single-parameter distributions to fit the observed wind data
• Metaheuristic optimization algorithms are investigated to obtain the optimal distribution parameters
• Social Spider Optimization algorithm has the fastest convergence rate to obtain optimal parameters

Declaration of interests ☒ The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. ☐The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: