A support vector machine approach to estimate global solar radiation with the influence of fog and haze


Renewable Energy 128 (2018) 155-162


Renewable Energy journal homepage: www.elsevier.com/locate/renene

Wanxiang Yao a,b, Chunxiao Zhang c,*, Haodong Hao d, Xiao Wang e, Xianli Li c

a Tianjin Key Laboratory of Civil Structure Protection and Reinforcement, Tianjin Chengjian University, Tianjin, 300384, PR China
b State Key Lab of Subtropical Building Science, South China University of Technology, Guangzhou, 510640, PR China
c School of Energy and Safety Engineering, Tianjin Chengjian University, Tianjin, 300384, PR China
d School of Control and Mechanical Engineering, Tianjin Chengjian University, Tianjin, 300384, PR China
e School of Economics and Management, Tianjin Chengjian University, Tianjin, 300384, PR China

Article info

Article history: Received 7 February 2018; received in revised form 25 April 2018; accepted 19 May 2018; available online 21 May 2018.

Keywords: Support vector machines; Global solar radiation; Fog and haze; Air quality index

Abstract

In recent years, fog and haze have occurred frequently as a result of the energy crisis and environmental pollution. Fog and haze scatter and attenuate solar radiation, which severely weakens the solar radiation received on a horizontal surface. In this paper, the air quality index (AQI) is taken as an additional input parameter, and new models for estimating global solar radiation on a horizontal surface are proposed based on a support vector machine (SVM). The accuracy of the SVM-1 models (without AQI) and the SVM-2 models (with AQI) is compared and analyzed. The results show that the SVM-2 models with the extra input parameter AQI generally perform better: the R value rises from 0.848 to 0.876, the NSE value rises from 0.682 to 0.740, the RMSE value falls from 0.114 to 0.102, and the MAPE value falls from 9.257 to 8.214. Compared with existing models, the SVM models proposed in this paper improve the accuracy of global solar radiation estimates, and using AQI as an additional input parameter improves the accuracy further. © 2018 Elsevier Ltd. All rights reserved.

* Corresponding author. E-mail address: [email protected] (C. Zhang).
https://doi.org/10.1016/j.renene.2018.05.069
0960-1481/© 2018 Elsevier Ltd. All rights reserved.

Nomenclature
Tmax: Daily maximum air temperature (°C)
Tmin: Daily minimum air temperature (°C)
ΔTair: Daily air temperature difference (°C)
ΔTsur: Daily surface temperature difference (°C)
N: Number of the day of the year, from 1 (January 1st) to 365 or 366 (December 31st)
S: Sunshine duration (h)
S0: Daily maximum possible sunshine duration (h)
WS: Wind speed (m/s)
RH: Relative humidity (%)
PM2.5: Particulate matter with a diameter of 2.5 μm or less
PM10: Particulate matter with a diameter of 10 μm or less
AQI: Air quality index (dimensionless)
SVM: Support vector machine
ANN: Artificial neural network
GP: Genetic programming
R: Correlation coefficient (dimensionless)
NSE: Nash-Sutcliffe Equation (dimensionless)
RMSE: Root mean square error (dimensionless)
MAPE: Mean absolute percentage error (%)

1. Introduction

In recent years, population growth has intensified the energy crisis and environmental pollution, and countries around the world have focused their research on renewable energy. As a renewable energy source with no pollution, large reserves and wide distribution, solar energy has been widely used [1-3]. Solar radiation data play an important role in solar energy conversion, building energy conservation, and regional resource assessment. However, in some developing countries, solar radiation data are seriously lacking because of missing observation equipment, equipment maintenance or other reasons. It is therefore particularly necessary to establish daily global solar radiation models.

Before reaching the Earth's surface, solar radiation is absorbed and scattered by the atmosphere. Because of weather conditions, atmospheric conditions and other factors, the solar radiation received at the Earth's surface varies randomly. Some scholars have studied the influence of conventional meteorological parameters (e.g. latitude, sunshine duration, temperature and relative humidity) on solar radiation on a horizontal surface, and many empirical models have been proposed based on large amounts of historical data [4-6]. In addition, some researchers have estimated global solar radiation on a horizontal surface by coupling satellite images with ground observation data [7-9], an approach that is better suited to large scales (larger than 10 km), especially under special weather conditions (such as overcast, cloudy, rainy and snowy days).

Solar radiation is influenced by many factors at the same time. Because their functional forms are relatively simple, empirical models can reflect the influence of only one or a few parameters on solar radiation, and the interaction of multiple parameters is rarely considered. With the development of artificial intelligence and bio-inspired technology, many researchers have applied neural networks to solar energy, especially to the study of solar radiation [10-12]. Because a neural network is not restricted to conventional functional forms and can consider the interaction of multiple parameters at the same time, it has unique advantages for estimating solar radiation. Compared with other neural networks, the support vector machine (SVM) has a wide range of applicability: it can not only deal with linear problems but also map nonlinear problems into a high-dimensional linear space through a kernel function, which keeps the algorithm adaptable. In addition, based on statistical learning theory, SVM uses the principle of structural risk minimization to learn from finite samples, and seeks a trade-off between approximation accuracy and the complexity of the approximating function for the given data, which gives the algorithm strong generalization ability. Many solar radiation models have therefore been established with the SVM algorithm in the past decades [13-18].

Chen et al. [13] proposed a series of models that use seven combinations of temperature inputs as input parameters, with linear, polynomial and radial basis functions as kernel functions. These models use a support vector machine to estimate monthly average solar radiation. The results show that the model with a polynomial kernel function and Tmax and Tmin as input parameters is the best. Based on historical data (solar radiation, cloud cover, relative humidity, wind speed, etc.), Zeng et al. [14] used the SVM algorithm to predict atmospheric transmissivity and then converted it into solar energy according to latitude and the time of day. Verified against the National Solar Radiation Database, this model shows good consistency, and its accuracy improves further if additional meteorological variables, especially cloud cover, are used. Chen et al. [16] established global solar radiation models based on sunshine duration using support vector machine learning, and compared SVM models with seven different sets of input parameters against five empirical models. The results demonstrate that the SVM models perform better than the empirical models, and that combining sunshine duration with other meteorological parameters can improve model accuracy. Z. Ramedani et al. [15] proposed a support vector regression method to predict global solar radiation.
Two SVR models were studied, one with a radial basis function kernel (SVR-rbf) and the other with a polynomial kernel (SVR-poly). The SVR-rbf model performs better than the SVR-poly model, improving accuracy and shortening calculation time. Olatomiwa et al. [17] used sunshine duration and maximum and minimum temperature as input parameters, and combined a support vector machine with the firefly algorithm (SVM-FFA) to predict monthly average solar radiation on a horizontal surface. Compared with artificial neural network (ANN) and genetic programming (GP) models, the SVM-FFA models have higher prediction accuracy. S. Shamshirband et al. [18] combined a support vector machine with wavelet analysis to establish a coupled model for estimating solar radiation, and tested it with data from Iran (cloud cover and clearness index). The model agrees well with measured data.

In addition, environmental pressure is increasing with industrialization. Many regions of China experience fog and haze to different degrees, and the Beijing-Tianjin-Hebei region is particularly affected [19,20]. Beijing had 143 air pollution days in 2017, accounting for 39.2% of the year, including 27 days of heavy or severe pollution. On sunny days, solar radiation is absorbed and scattered by the upper atmosphere. On overcast or cloudy days, and especially in heavy fog and haze, solar radiation passes not only through the atmosphere but also through low-level clouds and airborne particles near the ground. It is scattered, reflected and refracted, and is severely weakened by the time it reaches the ground. The near-surface atmosphere therefore strongly affects the solar radiation received on a horizontal surface. Near-surface atmospheric conditions can be characterized by the air quality index (AQI), which is almost directly proportional to the concentration of particulate matter (PM) in the air [21,22]. Some researchers have proposed AQI as an additional input parameter for solar radiation models. For example, Yao et al. [23] established an empirical model of daily diffuse solar radiation based on nearly 55 years of data from Beijing, China, and modified it with AQI. The results demonstrate that the accuracy of the model is improved. Based on one year of meteorological data from Tehran, Iran, Masoud Vakili et al.
[24] established daily global solar radiation models using an artificial neural network whose input parameters include temperature, relative humidity, wind speed and air particulate matter (PM).

The main purpose of this paper is to use the SVM algorithm, which learns from finite samples under the principle of structural risk minimization, to investigate the influence of different input parameters on the output performance of the models, especially the contribution of AQI. Taking the influence of fog and haze into account, SVM is applied to establish daily global solar radiation models for Beijing. The influence of different input parameters (surface temperature difference, air temperature difference, relative humidity, sunshine duration and AQI) on the accuracy of the SVM models is compared and analyzed. In addition, the new SVM models are compared with existing models on the basis of statistical parameters. The results can provide a reference for estimating solar radiation in regions that lack measurement equipment, and a basis for the evaluation, conversion and utilization of solar energy resources, especially in regions where environmental pollution is more serious.

2. Method and data

2.1. Support vector machine

Support vector machines (SVM) were originally proposed by V. Vapnik [25,26] for classification, non-linear regression and related fields. The theoretical basis of SVM is statistical learning theory; more precisely, SVM is an approximate realization of structural risk minimization. According to statistical learning theory, the actual risk of a regression estimate depends on the empirical risk and the confidence range. The empirical risk depends on the training samples, and the confidence range depends on the Vapnik-Chervonenkis dimension (VC dimension) of the learning machine and the number of training samples. When both are small, that is, when the empirical risk is small and the VC dimension is minimal (so that the confidence range is minimized), the actual risk is minimized and the model generalizes well to unknown data [27,28].

The training dataset is assumed to be $\{(x_i, y_i),\ i = 1, 2, \ldots, l\}$, where $x_i \in \mathbb{R}^N$ is the input value, $y_i \in \mathbb{R}$ is the corresponding target value, and $l$ is the number of samples. To prevent over-fitting, a tolerance value $\varepsilon > 0$ is introduced, where $\varepsilon$ sets the accuracy with which the regression function must fit the sample dataset. The ε-insensitive loss function is given by Eq. (1).

$$
|y - f(x)|_{\varepsilon} =
\begin{cases}
0, & |y - f(x)| \le \varepsilon \\
|y - f(x)| - \varepsilon, & |y - f(x)| > \varepsilon
\end{cases}
\tag{1}
$$
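As a small illustration of Eq. (1), the sketch below evaluates the ε-insensitive loss for a few residuals. The function name, the sample values and the choice ε = 0.05 are assumptions made for this example only; they do not come from the paper.

```python
def epsilon_insensitive_loss(y, f_x, eps):
    """Eq. (1): zero loss inside the eps-tube, linear loss outside it."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    residual = abs(y - f_x)
    return 0.0 if residual <= eps else residual - eps


# Hypothetical clearness-index values, for illustration only.
measured = [0.62, 0.35, 0.48]
estimated = [0.60, 0.45, 0.40]

# Loss per sample with an assumed tolerance of 0.05.
losses = [
    epsilon_insensitive_loss(m, e, eps=0.05)
    for m, e in zip(measured, estimated)
]
print(losses)
```

Residuals smaller than ε contribute nothing to the loss, which is why only samples on or outside the ε-tube end up influencing the fitted regression function.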

In Eq. (1), $f(x)$ is the regression estimation function constructed by learning from the sample dataset, and $y$ is the target value corresponding to $x$. The purpose of SVM training is to establish a regression function $f(x)$ such that the distance between its estimated value and the expected value is less than $\varepsilon$. When the VC dimension of the regression function is smallest, the function has the best generalization ability, which addresses the problem of over-fitting the samples. Because regression estimation of the sample dataset cannot be achieved by linear functions in a low-dimensional space, the sample dataset is mapped to a high-dimensional linear feature space in which linear regression can be implemented. If the mapping function of the sample dataset is a nonlinear function $\varphi(x)$, the regression estimation function $f(x)$ can be expressed as Eq. (2).

$$
f(x) = w \cdot \varphi(x) + b
\tag{2}
$$

where the dimension of $w$ is the dimension of the high-dimensional linear feature space into which the sample dataset is mapped. To allow for samples that fall outside the tolerance $\varepsilon$, slack variables $\xi_i$ and $\xi_i^*$ are introduced, representing the relaxation at the upper and lower bounds, respectively. The penalty factor $C$ is an empirical value that determines how strongly the algorithm penalizes such deviations. The regression problem for Eq. (2) is then written as the optimization problem in Eq. (3).

$$
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^*}\quad & \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{l}\left(\xi_i + \xi_i^*\right) \\
\text{s.t.}\quad & y_i - w\cdot\varphi(x_i) - b \le \varepsilon + \xi_i \\
& w\cdot\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^* \\
& \xi_i \ge 0,\ \xi_i^* \ge 0,\quad i = 1, 2, \ldots, l
\end{aligned}
\tag{3}
$$

The Lagrange multiplier method is used to solve the optimization problem, and the regression estimation function is given by Eq. (4).

$$
f(x) = \sum_{x_i \in SV}\left(\alpha_i - \alpha_i^*\right) K(x_i, x) + b
\tag{4}
$$

where $\alpha_i$ and $\alpha_i^*$ are Lagrange multipliers, $SV$ is the set of support vectors, and $K(x_i, x_j) = \varphi(x_i)\cdot\varphi(x_j)$ is the kernel function. Although the SVM algorithm maps the sample set to the high-dimensional feature space through a nonlinear function, only the kernel function needs to be calculated when evaluating the regression function. This avoids the curse of dimensionality caused by the high-dimensional feature space, provided that the kernel function satisfies Mercer's condition.

2.2. Data

Considering the present situation of fog and haze in China, Beijing, where fog and haze are more serious, is selected as the research object. The annual mean daily global solar radiation in Beijing is 14.71 MJ/m2 and the average daily sunshine duration is 6.83 h. The data of Beijing (39°48′N, 116°28′E) [29] are used to establish the SVM models and to discuss the coupling effects of fog and haze with other meteorological parameters on solar radiation. The data from December 2013 to August 2016, a total of 1000 groups, are used to train the SVM models. The data from September 2016 to March 2017, a total of 210 groups, are used to verify the accuracy of the SVM models. The data used in this paper include daily global solar radiation, sunshine duration, temperature, relative humidity and air quality index (AQI).

2.3. Statistical parameters

The correlation coefficient (R), Nash-Sutcliffe Equation (NSE), root mean square error (RMSE) and mean absolute percentage error (MAPE) are used to characterize the error between the measured and calculated values of the different models, and so reflect their performance [23,30-32]. These four statistical parameters are calculated as follows:

$$
R = \frac{\sum_{i=1}^{n}(c_i - c_a)(m_i - m_a)}{\sqrt{\left[\sum_{i=1}^{n}(c_i - c_a)^2\right]\left[\sum_{i=1}^{n}(m_i - m_a)^2\right]}}
\tag{5}
$$

$$
NSE = 1 - \frac{\sum_{i=1}^{n}(m_i - c_i)^2}{\sum_{i=1}^{n}(m_i - m_a)^2}
\tag{6}
$$

$$
RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(c_i - m_i)^2}
\tag{7}
$$

$$
MAPE = \frac{\sum_{i=1}^{n}\left|\dfrac{c_i - m_i}{m_i}\right|}{n}\cdot 100
\tag{8}
$$

where $c_i$ is the calculated value, $c_a$ is the average calculated value, $m_i$ is the measured value, and $m_a$ is the average measured value. The closer R and NSE are to 1, and the closer MAPE and RMSE are to 0, the more accurate the model.
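The four statistics in Eqs. (5)-(8) can be computed directly from paired lists of calculated and measured values. The following is a minimal sketch in plain Python; the function name `evaluate` and the sample values are assumptions for illustration and are not taken from the paper's dataset.

```python
import math


def evaluate(calculated, measured):
    """Return R, NSE, RMSE and MAPE as defined in Eqs. (5)-(8)."""
    n = len(measured)
    if n == 0 or n != len(calculated):
        raise ValueError("inputs must be non-empty and of equal length")
    c_a = sum(calculated) / n  # average calculated value
    m_a = sum(measured) / n    # average measured value
    cov = sum((c - c_a) * (m - m_a) for c, m in zip(calculated, measured))
    var_c = sum((c - c_a) ** 2 for c in calculated)
    var_m = sum((m - m_a) ** 2 for m in measured)
    sse = sum((m - c) ** 2 for c, m in zip(calculated, measured))
    return {
        "R": cov / math.sqrt(var_c * var_m),
        "NSE": 1.0 - sse / var_m,
        "RMSE": math.sqrt(sse / n),
        # MAPE requires every measured value to be non-zero.
        "MAPE": 100.0 / n * sum(abs((c - m) / m)
                                for c, m in zip(calculated, measured)),
    }


# Hypothetical clearness-index values, for illustration only.
measured = [0.2, 0.4, 0.6, 0.8]
calculated = [0.25, 0.35, 0.65, 0.75]
scores = evaluate(calculated, measured)
print(scores)
```

Because the inputs here are normalized quantities such as the clearness index, RMSE comes out dimensionless, which matches the Nomenclature entry.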

3. Results and discussion

In this paper, the SVM algorithm is used to model and evaluate global solar radiation in MATLAB. The detailed steps are as follows:

(1) All data are preprocessed, and unreasonable data are removed. Global solar radiation on a horizontal surface and sunshine duration are then normalized to obtain the clearness index and the sunshine percentage.
(2) The data are imported into MATLAB and divided into a training dataset (1000 groups of data) and a verification dataset (210 groups of data).
(3) The SVM learns from the training dataset, and a linear kernel function is used to establish the global solar radiation models (SVM-1 models). The calculated and measured values are compared to evaluate the accuracy of the different models.
(4) To analyze the internal relationship between the different influencing factors and explore its impact on solar radiation, AQI is added to the SVM-1 global solar radiation models to establish the SVM-2 models.
(5) The SVM-1 models, the SVM-2 models and the existing models are compared to evaluate the accuracy of the SVM models.

According to the influence of different input parameters on solar radiation, the SVM models are divided into two categories (SVM-1 and SVM-2 models), as shown in Fig. 1. The first category is the SVM-1 models, whose input parameters include surface temperature difference, air temperature difference, relative humidity and sunshine duration. The air quality index (AQI) is then added to the SVM-1 models to establish the SVM-2 models, whose input parameters include surface temperature difference, air temperature difference, relative humidity, sunshine duration and AQI.

Table 1 and Fig. 2 show that the accuracy of the SVM-based models improves to different degrees. For the training dataset, the SVM-2 models with the added AQI parameter are more accurate than the SVM-1 models. In terms of the average values of the statistical parameters, R rises from 0.864 to 0.894, NSE rises from 0.712 to 0.764, RMSE falls from 0.091 to 0.088, and MAPE falls from 8.203 to 7.692. Similarly, for the validation dataset, R rises from 0.848 to 0.876, NSE rises from 0.682 to 0.740, RMSE falls from 0.114 to 0.102, and MAPE falls from 9.257 to 8.214.

Among the first type of models (SVM-1 models), the SVM-1-8 model based on four input parameters is the most accurate, because it takes into account the coupling effect among the different factors, and the meteorological parameters can compensate for one another to improve the accuracy of the global solar radiation estimate.
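Step (1) of the procedure normalizes global solar radiation to a clearness index and sunshine duration to a sunshine percentage. The paper does not state which formulas it uses for the reference quantities, so the sketch below assumes commonly used ones: Cooper's approximation for the solar declination, a solar constant of 1367 W/m2, and the standard daily extraterrestrial radiation expression. All function names are hypothetical.

```python
import math

SOLAR_CONSTANT = 1367.0  # W/m^2; assumed value, not stated in the paper


def declination_rad(day_of_year):
    """Solar declination from Cooper's approximation (assumed formula)."""
    return math.radians(23.45) * math.sin(
        2.0 * math.pi * (284 + day_of_year) / 365.0)


def sunset_hour_angle_rad(lat_deg, day_of_year):
    """Sunset hour angle; clamping covers polar day and polar night."""
    x = -math.tan(math.radians(lat_deg)) * math.tan(declination_rad(day_of_year))
    return math.acos(max(-1.0, min(1.0, x)))


def max_sunshine_hours(lat_deg, day_of_year):
    """S0: daily maximum possible sunshine duration, in hours."""
    return 24.0 / math.pi * sunset_hour_angle_rad(lat_deg, day_of_year)


def extraterrestrial_radiation(lat_deg, day_of_year):
    """Daily extraterrestrial radiation on a horizontal surface, in MJ/m^2."""
    lat = math.radians(lat_deg)
    dec = declination_rad(day_of_year)
    ws = sunset_hour_angle_rad(lat_deg, day_of_year)
    # Eccentricity correction for the Earth-Sun distance.
    e0 = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)
    joules = (86400.0 / math.pi) * SOLAR_CONSTANT * e0 * (
        math.cos(lat) * math.cos(dec) * math.sin(ws)
        + ws * math.sin(lat) * math.sin(dec))
    return joules / 1e6


def normalize(h_mj, s_hours, lat_deg, day_of_year):
    """Return (clearness index, sunshine fraction) for one day."""
    kt = h_mj / extraterrestrial_radiation(lat_deg, day_of_year)
    sunshine_fraction = s_hours / max_sunshine_hours(lat_deg, day_of_year)
    return kt, sunshine_fraction


# Beijing latitude from Section 2.2; the day number and the pairing of the
# annual mean values with that day are hypothetical, for illustration only.
kt, sunshine_fraction = normalize(14.71, 6.83, lat_deg=39.8, day_of_year=81)
print(kt, sunshine_fraction)
```

Both normalized quantities are dimensionless, which removes most of the seasonal variation that is due purely to astronomy and leaves the atmospheric effects for the SVM models to explain.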
Moreover, sunshine duration represents the time during which solar irradiance reaches a certain intensity (greater than or equal to 120 W/m2), and it has the most significant impact on solar radiation. Therefore, among the single-parameter models, the SVM-1-4 model based on sunshine duration is the most accurate, as can be verified from Table 1. When sunshine duration is combined with one conventional meteorological parameter, the SVM-1-5, SVM-1-6 and SVM-1-7 models are obtained, of which the SVM-1-5 model (with sunshine duration and surface temperature difference as input parameters) is the best. Its R value is 0.940, NSE value is 0.851, and RMSE value is 0.083. The second type of models (SVM-2 models) behaves similarly to the first type (SVM-1 models): the SVM-2-8 model based on five input parameters is the most accurate. When AQI is added to the input parameters, the accuracy of the SVM-2 models improves to a certain extent compared with the SVM-1 models. The improvement is most significant for the SVM-1-2 model (see Fig. 2): its R increases by 18.41%, NSE increases by 55.58%, and RMSE decreases by 13.32%.

In Fig. 3, when the clearness index is large (greater than 0.7), the day is sunny and the air quality is good. The effect of fog and haze on solar radiation is then small and the corrective effect of AQI is not significant, so the difference between the values calculated by the SVM-1-8 and SVM-2-8 models is small. When the clearness index is small (less than 0.2), the day is overcast or cloudy. The solar radiation is obscured by clouds in the upper air, which leaves little room for further attenuation by fog and haze near the surface. As a result, the corrective effect of AQI is poor, and the accuracy may even decrease. Based on the sample data, the SVM algorithm seeks an optimal solution between the best fitting ability and the best generalization ability, and it has strong generality. In this paper, although the performance of SVM-1-8 is slightly worse than that of SVM-2-8, its estimates show no large fluctuations (Fig. 4).

The SVM models proposed in this paper are compared with existing models (including empirical models and neural network models), as shown in Table 2. Among the existing models, the RMSE value varies from 0.1944 MJ·m⁻²·day⁻¹ for the Said model (based on relative sunshine duration) to 3.8 MJ·m⁻²·day⁻¹ for a Ramedani model (temperature based). MAPE ranges from 0.8 to 78.28 with a mean value of 10.301, and R ranges from 0.7727 to 0.9835 with a mean value of 0.907.
The accuracy of the SVM-1-8 and SVM-2-8 models recommended in this paper is improved relative to the existing models. Relative to the mean values of the existing models, the RMSE of SVM-2-8 decreases from 2.028 MJ·m⁻²·day⁻¹ to 1.18 MJ·m⁻²·day⁻¹, the R value increases from 0.907 to 0.952, and MAPE decreases from 10.301 to 3.992. In general, the SVM-2-8 model, which is based on the SVM algorithm and adds AQI as an input parameter, has good applicability in Beijing.

In terms of time scale, daily values reflect the day-to-day variation of solar radiation and are more random than monthly average daily values, so monthly average daily solar radiation models are generally more accurate than daily models. In terms of input parameters, models that use sunshine hours as the main input parameter perform better than models that use other input parameters such as temperature or relative humidity. From Table 1, the SVM-1-4 model based on sunshine duration performs well, and adding other parameters (surface temperature difference, air temperature difference and relative humidity) can improve the accuracy of these models. If the effect of fog and haze on solar radiation is considered and AQI is used as an extra input parameter, the accuracy improves further.

Fig. 1. The flow chart of solar radiation modeling and evaluation based on support vector machine. (Flow: pretreatment to remove unreasonable data; normalization processing; training dataset and validation dataset; SVM-1 and SVM-2 models; output data; calculation of validation error; optimal models. Inputs: SVM-1-1: ΔTsur; SVM-1-2: ΔTair; SVM-1-3: RH; SVM-1-4: S; SVM-1-5: S, ΔTsur; SVM-1-6: S, ΔTair; SVM-1-7: S, RH; SVM-1-8: all parameters. Each SVM-2 model adds AQI to the inputs of the corresponding SVM-1 model.)

Table 1. Statistical parameters of different models in training and testing section.

| Model | Input parameters | Train R | Train NSE | Train RMSE | Train MAPE | Test R | Test NSE | Test RMSE | Test MAPE |
|---|---|---|---|---|---|---|---|---|---|
| SVM-1-1 | Surface temperature difference | 0.797 | 0.492 | 0.131 | 9.543 | 0.760 | 0.390 | 0.168 | 10.928 |
| SVM-1-2 | Air temperature difference | 0.652 | 0.351 | 0.148 | 11.896 | 0.611 | 0.309 | 0.178 | 14.501 |
| SVM-1-3 | Relative humidity | 0.770 | 0.463 | 0.134 | 10.899 | 0.749 | 0.530 | 0.147 | 14.710 |
| SVM-1-4 | Sunshine duration | 0.932 | 0.871 | 0.066 | 7.096 | 0.923 | 0.835 | 0.087 | 7.647 |
| SVM-1-5 | Sunshine duration, surface temperature difference | 0.942 | 0.891 | 0.061 | 6.361 | 0.940 | 0.851 | 0.083 | 6.725 |
| SVM-1-6 | Sunshine duration, air temperature difference | 0.933 | 0.873 | 0.065 | 6.412 | 0.924 | 0.836 | 0.087 | 7.598 |
| SVM-1-7 | Sunshine duration, relative humidity | 0.935 | 0.876 | 0.065 | 7.217 | 0.933 | 0.850 | 0.083 | 7.654 |
| SVM-1-8 | Surface temperature difference, air temperature difference, relative humidity, sunshine duration | 0.949 | 0.875 | 0.059 | 6.196 | 0.945 | 0.858 | 0.081 | 4.296 |
| SVM-2-1 | Surface temperature difference, AQI | 0.845 | 0.555 | 0.122 | 7.134 | 0.789 | 0.530 | 0.147 | 9.102 |
| SVM-2-2 | Air temperature difference, AQI | 0.782 | 0.458 | 0.135 | 9.699 | 0.723 | 0.481 | 0.155 | 11.951 |
| SVM-2-3 | Relative humidity, AQI | 0.776 | 0.655 | 0.135 | 10.086 | 0.783 | 0.584 | 0.138 | 14.086 |
| SVM-2-4 | Sunshine duration, AQI | 0.948 | 0.893 | 0.058 | 7.558 | 0.936 | 0.852 | 0.064 | 6.103 |
| SVM-2-5 | Sunshine duration, surface temperature difference, AQI | 0.942 | 0.891 | 0.061 | 6.406 | 0.940 | 0.853 | 0.082 | 6.957 |
| SVM-2-6 | Sunshine duration, air temperature difference, AQI | 0.952 | 0.899 | 0.062 | 7.409 | 0.941 | 0.871 | 0.077 | 6.297 |
| SVM-2-7 | Sunshine duration, relative humidity, AQI | 0.943 | 0.871 | 0.066 | 7.282 | 0.942 | 0.870 | 0.077 | 7.226 |
| SVM-2-8 | Surface temperature difference, air temperature difference, relative humidity, sunshine duration, AQI | 0.962 | 0.889 | 0.061 | 5.960 | 0.952 | 0.875 | 0.076 | 3.992 |

Fig. 2. The performance of different SVM models. (Four panels plot the correlation coefficient (R), Nash-Sutcliffe Equation (NSE), root mean square error (RMSE) and mean absolute percentage error (MAPE) against model number for the SVM-1 and SVM-2 models.)
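The averaged validation statistics quoted in the text (for example, R rising from 0.848 to 0.876) follow from the per-model test values in Table 1. The short sketch below recomputes those averages; the numbers are transcribed from Table 1, while the dictionary layout and the helper `mean` are only illustrative choices.

```python
# Test-set statistics transcribed from Table 1 (models 1 to 8, in order).
TEST = {
    "SVM-1": {
        "R": [0.760, 0.611, 0.749, 0.923, 0.940, 0.924, 0.933, 0.945],
        "NSE": [0.390, 0.309, 0.530, 0.835, 0.851, 0.836, 0.850, 0.858],
        "RMSE": [0.168, 0.178, 0.147, 0.087, 0.083, 0.087, 0.083, 0.081],
        "MAPE": [10.928, 14.501, 14.710, 7.647, 6.725, 7.598, 7.654, 4.296],
    },
    "SVM-2": {
        "R": [0.789, 0.723, 0.783, 0.936, 0.940, 0.941, 0.942, 0.952],
        "NSE": [0.530, 0.481, 0.584, 0.852, 0.853, 0.871, 0.870, 0.875],
        "RMSE": [0.147, 0.155, 0.138, 0.064, 0.082, 0.077, 0.077, 0.076],
        "MAPE": [9.102, 11.951, 14.086, 6.103, 6.957, 6.297, 7.226, 3.992],
    },
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Average each statistic over the eight models of each family.
averages = {
    family: {metric: mean(vals) for metric, vals in metrics.items()}
    for family, metrics in TEST.items()
}

for metric in ("R", "NSE", "RMSE", "MAPE"):
    before = averages["SVM-1"][metric]
    after = averages["SVM-2"][metric]
    print(f"{metric}: SVM-1 mean {before:.4f}, SVM-2 mean {after:.4f}")
```

The same approach applied to the training columns of Table 1 gives the training averages reported in the text.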

Fig. 3. Comparison between estimated value of SVM-1-2 and SVM-2-2 models with measured value. (Clearness index plotted against days of the validation period; series: measured, SVM-1-2, SVM-2-2.)

Fig. 4. Comparison between estimated value of SVM-1-8 and SVM-2-8 models with measured value. (Clearness index plotted against days of the validation period; series: measured, SVM-1-8, SVM-2-8.)

Table 2. Statistical analysis of proposed models in contrast with existing models.

| Authors | Method | Station (location) | Time scale | RMSE (MJ/m2) | MAPE (%) | R | MPE (%) | Input parameters |
|---|---|---|---|---|---|---|---|---|
| R. Said et al. [33] | Empirical model | Tripoli, Libya | Monthly | 0.1944 | 0.8 | - | - | S, S0 |
| Muyiwa S. Adaramola [34] | Empirical model | Akure, Nigeria | Monthly average | 0.9252 | 3.619 | - | 0.260 | S, S0 |
| | Empirical model | Akure, Nigeria | Monthly average | 1.3248 | 6.626 | - | 0.510 | ΔTair |
| | Empirical model | Akure, Nigeria | Monthly average | 1.6092 | 8.253 | - | 0.986 | RH |
| | Empirical model | Akure, Nigeria | Monthly average | 1.1628 | 4.798 | - | 0.459 | Tair,ave |
| Yao et al. [35] | Empirical model | Shanghai, China | Daily | 3.704 | 78.28 | - | 55.221 | S, S0 |
| Chen et al. [36] | SVM | Chaoyang, China | Daily | 2.302 | - | - | - | S, S0 |
| | SVM | Dalian, China | Daily | 1.801 | - | - | - | S, S0 |
| | SVM | Shenyang, China | Daily | 2.003 | - | - | - | S, S0 |
| Ramedani et al. [37] | SVR_rbf | Tehran province, Iran | Daily | 3.3 | - | 0.889 | - | Tmax, Tmin, N, S, S0 |
| | SVR_poly | Tehran province, Iran | Daily | 3.4 | - | 0.887 | - | Tmax, Tmin, N, S, S0 |
| | ANFIS | Tehran province, Iran | Daily | 3.8 | - | 0.899 | - | Tmax, Tmin, N, S, S0 |
| | ANN | Tehran province, Iran | Daily | 3.7 | - | 0.894 | - | Tmax, Tmin, N, S, S0 |
| R. Meenal et al. [38] | SVM | India | Monthly average | 1.3002 | - | 0.9443 | - | S, S0 |
| | SVM | India | Monthly average | 1.3846 | - | 0.9472 | - | Tmax, Tmin |
| | SVM | India | Monthly average | 1.1617 | - | 0.9706 | - | Tmax, Tmin, S, S0 |
| | ANN | India | Monthly average | 1.4468 | 2.325 | 0.9739 | 2.325 | S, S0 |
| | ANN | India | Monthly average | 2.2197 | 2.5079 | 0.7727 | 2.5079 | Tmax, Tmin |
| | ANN | India | Monthly average | 1.1848 | 1.5918 | 0.9835 | 1.5918 | Tmax, Tmin, S, S0 |
| Victor H. Quej et al. [39] | SVM | Yucatán Peninsula, México | Daily | 2.786 | - | 0.8142 | - | Tmax, Tmin, RF |
| Masoud Vakili et al. [40] | ANN-1 | Tehran, Iran | Daily | - | 1.5 | - | - | Tmax, Tmin, RH, WS, PM2.5 and PM10 |
| | ANN-2 | Tehran, Iran | Daily | - | 3.04 | - | - | Tmax, Tmin, RH, WS |
| Current paper | SVM-1-8 | Beijing, China | Daily | 1.19 | 4.296 | 0.945 | - | ΔTsur, ΔTair, RH, S, S0 |
| | SVM-2-8 | Beijing, China | Daily | 1.18 | 3.992 | 0.952 | - | ΔTsur, ΔTair, RH, S, S0, AQI |

4. Conclusion

Solar radiation on a horizontal surface plays a crucial role in the utilization of solar energy. In this paper, the effect of near-surface fog and haze on solar radiation is considered. Based on the support vector machine, the air quality index (AQI) is employed as an additional input parameter to establish daily global solar radiation models, which are compared with existing models. The conclusions are as follows:

(1) To study the influence of different meteorological parameters on solar radiation, the SVM algorithm is introduced to establish the SVM-1 models. Among the SVM-1 models, the SVM-1-8 model, which considers sunshine duration, surface temperature difference, air temperature difference and relative humidity, is the most accurate.
(2) To study the impact of fog and haze on solar radiation, the SVM-2 models are established by adding AQI to the SVM-1 models. The comparison of these models shows that the accuracy of the SVM-2 models is improved: the R value rises from 0.848 to 0.876, the NSE value rises from 0.682 to 0.740, the RMSE value falls from 0.114 to 0.102, and the MAPE value falls from 9.257 to 8.214. Compared with existing models, the RMSE value of SVM-2-8 decreases from 2.028 MJ·m⁻²·day⁻¹ to 1.18 MJ·m⁻²·day⁻¹, the R value increases from 0.907 to 0.952, and the MAPE value decreases from 10.301 to 3.992.
(3) Prospects: To study the influence of fog and haze on solar radiation, especially on overcast or cloudy days, different neural network algorithms should be considered to explore the coupled influence of fog and different meteorological parameters on solar radiation.

Acknowledgements

The authors gratefully acknowledge the support of the National Natural Science Foundation of China (Grant Nos. 51508372 and 51506141), the Natural Science Foundation of Tianjin (Grant Nos. 17JCYBJC22100 and 17JCTPJC52700), the State Key Lab of Subtropical Building Science, South China University of Technology (Grant No. 2017ZB16), and the Tianjin Urban & Rural Construction Commission Science and Technology Project (Grant No. 2017-9).

References

[1] A. Ferreira, S.S. Kunh, K.C. Fagnani, T.A. De Souza, C. Tonezer, G.R. Dos Santos, C.H. Coimbra-Araújo, Economic overview of the use and production of photovoltaic solar energy in Brazil, Renew. Sustain. Energy Rev. 81 (2018) 181-191.
[2] M. Ouria, H. Sevinc, Evaluation of the potential of solar energy utilization in Famagusta, Cyprus, Sustain. Cities and Soc. 37 (2018) 189-202.
[3] E. Kabir, P. Kumar, S. Kumar, A.A. Adelodun, K.-H. Kim, Solar energy: potential and future prospects, Renew. Sustain. Energy Rev. 82 (2018) 894-900.
[4] G. Lewis, Irradiance estimates for Zambia, Sol. Energy 26 (1) (1981) 81-85.
[5] A. Will, J. Bustos, M. Bocco, J. Gotay, C. Lamelas, On the use of niching genetic algorithms for variable selection in solar radiation estimation, Renew. Energy 50 (2013) 168-176.
[6] A.A. Trabea, M.A.M. Shaltout, Correlation of global solar radiation with meteorological parameters over Egypt, Renew. Energy 21 (2) (2000) 297-308.
[7] S.E. Ruşen, The performance evaluation of a new model which based on bright sunshine hours and satellite imagery, for the estimation of daily global solar radiation for two locations in Turkey, in: ICERE 2015 International Conference on Environment and Renewable Energy, 2015.
[8] S.E. Rusen, A. Hammer, B.G. Akinoglu, Coupling satellite images with surface measurements of bright sunshine hours to estimate daily solar irradiation on horizontal surface, Renew. Energy 55 (4) (2013) 212-219.
[9] T.C. Krisna, M. Wendisch, A. Ehrlich, E. Jäkel, F. Werner, R. Weigel, S. Borrmann, C. Mahnke, U. Pöschl, M.O. Andreae, Comparing airborne and satellite retrievals of optical and microphysical properties of cirrus and deep convective clouds using a radiance ratio technique, Atmos. Chem. Phys. (2017) 1-37.
[10] S. Shamshirband, K. Mohammadi, H.-L. Chen, G. Narayana Samy, D. Petković, C. Ma, Daily global solar radiation prediction from air temperatures using kernel extreme learning machine: a case study for Iran, J. Atmos. Sol. Terr. Phys. 134 (2015) 109-117.
[11] C. Renno, F. Petito, A. Gatto, ANN model for predicting the direct normal irradiance and the global radiation for a solar application to a residential building, J. Clean. Prod. 135 (2016) 1298-1316.
[12] S. Salcedo-Sanz, S. Jiménez-Fernández, A. Aybar-Ruiz, C. Casanova-Mateo, J. Sanz-Justo, R. García-Herrera, A CRO-species optimization scheme for robust global solar radiation statistical downscaling, Renew. Energy 111 (2017) 63-76.
[13] J.-L. Chen, H.-B. Liu, W. Wu, D.-T. Xie, Estimation of monthly solar radiation from measured temperatures using support vector machines - a case study, Renew. Energy 36 (1) (2011) 413-420.
[14] J. Zeng, W. Qiao, Short-term solar power prediction using a support vector machine, Renew. Energy 52 (2013) 118-127.
[15] Z. Ramedani, M. Omid, A. Keyhani, S. Shamshirband, B. Khoshnevisan, Potential of radial basis function based support vector regression for global solar radiation prediction, Renew. Sustain. Energy Rev. 39 (2014) 1005-1011.
[16] J.-L. Chen, G.-S. Li, S.-J. 
Wu, Assessing the potential of support vector machine for estimating daily solar radiation using sunshine duration, Energy Convers. Manag. 75 (2013) 311e318. L. Olatomiwa, S. Mekhilef, S. Shamshirband, K. Mohammadi, D. Petkovi c, C. Sudheer, A support vector machineefirefly algorithm-based model for global solar radiation prediction,, Sol. Energy 115 (2015) 632e644. S. Shamshirband, K. Mohammadi, H. Khorasanizadeh, P.L. Yee, M. Lee, D. Petkovi c, E. Zalnezhad, Estimating the diffuse solar radiation using a coupled support vector machineewavelet transform model, Renew. Sustain. Energy Rev. 56 (2016) 428e435. P. Du, R. Du, W. Ren, Z. Lu, P. Fu, Seasonal variation characteristic of inhalable microbial communities in PM2.5 in Beijing city, China, Sci. Total Environ. (2018) 308e315, 610-611. Y. Zhang, X. Li, T. Nie, J. Qi, J. Chen, Q. Wu, Source apportionment of PM2.5 pollution in the central six districts of Beijing, China, J. Clean. Prod. 174 (2018) 661e669. C.-M. Liu, Effect of PM2.5 on AQI in Taiwan, Environ. Model. Software 17 (1) (2002) 29e37. M. Ruggieri, A. Plaia, An aggregate AQI: comparing different standardizations and introducing a variability index, Sci. Total Environ. 420 (2012) 263e272. W. Yao, C. Zhang, X. Wang, J. Sheng, Y. Zhu, S. Zhang, The research of new

162

[24]

[25] [26]

[27] [28] [29] [30]

[31]

[32]

W. Yao et al. / Renewable Energy 128 (2018) 155e162 daily diffuse solar radiation models modified by air quality index (AQI) in the region with heavy fog and haze, Energy Convers. Manag. 139 (2017) 140e150. M. Vakili, S.R. Sabbagh-Yazdi, S. Khosrojerdi, K. Kalhor, Evaluating the effect of particulate matter pollution on estimation of daily global solar radiation using artificial neural network modeling based on meteorological data, J. Clean. Prod. 141 (2017) 1275e1285. V.N. Vapnik, The Nature of Statistic Learning Theory, 2000. V. Vapnik, S.E. Golowich, A. Smola, Support vector method for function approximation, regression estimation, and signal processing, Adv. Neural Inf. Process. Syst. 9 (1996) 281e287. C. Cortes, V. Vapnik, Support-vector networks, Mach. Learn. 20 (3) (1995) 273e297. H. Drucker, C.J.C. Burges, L. Kaufman, A. Smola, V. Vapnik, Support vector regression machines, Adv. Neural Inf. Process. Syst. (1996) 779e784. M.o.E. Protection, . N.a. Elagib, S.H. Alvi, M.G. Mansell, Correlationships between clearness index and relative sunshine duration for Sudan, Renew. Energy 17 (4) (1999) 473e498. J. Soares, A.P. Oliveira, M.Z. Bo znar, P. Mlakar, J.F. Escobedo, A.J. Machado, ~o Paulo using a neuralModeling hourly diffuse solar-radiation in the city of Sa network technique, Appl. Energy 79 (2) (2004) 201e214. C.K. Pandey, A.K. Katiyar, A comparative study to estimate daily diffuse solar radiation over India, Energy 34 (11) (2009) 1792e1796.

[33] R. Said, M. Mansor, T. Abuain, Estimation of global and diffuse radiation at Tripoli, Renew. Energy 14 (1) (1998) 221e227. [34] M.S. Adaramola, Estimating global solar radiation using common meteorological data in Akure, Nigeria, Renew. Energy 47 (2012) 38e44. [35] W. Yao, Z. Li, Y. Wang, F. Jiang, L. Hu, Evaluation of global solar radiation models for Shanghai, China, Energy Convers. Manag. 84 (2014) 597e612. [36] J.-L. Chen, G.-S. Li, S.-J. Wu, Assessing the potential of support vector machine for estimating daily solar radiation using sunshine duration, Energy Convers. Manag. 75 (2013) 311e318. [37] Z. Ramedani, M. Omid, A. Keyhani, S. Shamshirband, B. Khoshnevisan, Potential of radial basis function based support vector regression for global solar radiation prediction, Renew. Sustain. Energy Rev. 39 (2014) 1005e1011. [38] R. Meenal, A.I. Selvakumar, Assessment of SVM, Empirical and ANN based solar radiation prediction models with most influencing input parameters, Renew. Energy 121 (2017) 324e343. [39] V.H. Quej, J. Almorox, J.A. Arnaldo, L. Saito, ANFIS, SVM and ANN softcomputing techniques to estimate daily global solar radiation in a warm sub-humid environment, J. Atmos. Sol. Terr. Phys. 155 (2017) 62e70. [40] M. Vakili, S.R. Sabbagh-Yazdi, S. Khosrojerdi, K. Kalhor, Evaluating the effect of particulate matter pollution on estimation of daily global solar radiation using artificial neural network modeling based on meteorological data, J. Clean. Prod. 141 (2017) 1275e1285.