Journal Pre-proof

Application of functional deep belief network for estimating daily global solar radiation: A case study in China

Haixiang Zang, Lilin Cheng, Tao Ding, Kwok W. Cheung, Miaomiao Wang, Zhinong Wei, Guoqiang Sun

PII: S0360-5442(19)32197-8
DOI: https://doi.org/10.1016/j.energy.2019.116502
Reference: EGY 116502
To appear in: Energy
Received Date: 31 March 2019
Revised Date: 5 September 2019
Accepted Date: 6 November 2019
Please cite this article as: Zang H, Cheng L, Ding T, Cheung KW, Wang M, Wei Z, Sun G, Application of functional deep belief network for estimating daily global solar radiation: A case study in China, Energy (2019), doi: https://doi.org/10.1016/j.energy.2019.116502.

© 2019 Published by Elsevier Ltd.
Application of functional deep belief network for estimating daily global solar radiation: A case study in China
Haixiang Zang a,*, Lilin Cheng a, Tao Ding b, Kwok W. Cheung c, Miaomiao Wang a, Zhinong Wei a, Guoqiang Sun a

a College of Energy and Electrical Engineering, Hohai University, Nanjing 210098, China
b Department of Electrical Engineering, Xi’an Jiaotong University, Xi’an 710049, China
c GE Grid Solutions, Redmond, WA 98052, USA
Abstract

Solar energy plays an essential role in environmental governance and resource protection, as it is pollution-free and widely accessible. Accurate knowledge of solar radiation benefits the deployment of solar energy installations such as photovoltaic and solar thermal systems. In this study, a deep learning method is proposed for estimating daily global solar radiation, which combines embedding clustering (EC) with a functional deep belief network (DBN). Based on the curve shapes of daily solar radiation, EC divides the overall dataset into subsets that can be modeled separately. Knowledge from empirical radiation models is also merged into the input of the functional DBN. Owing to its strong nonlinear representation, the model can be applied directly to solar radiation estimation at various stations. A case study in China involving radiation data from 30 stations is used to validate the practicability and accuracy of the proposed method. The results show that the method obtains better estimation precision with empirical knowledge, achieving a mean absolute error (MAE) of 1.706 MJ/m2, a root mean square error (RMSE) of 2.352 MJ/m2 and a mean absolute percentage error (MAPE) of 13.71%, averaged over the 30 stations.

Keywords: Daily global solar radiation; deep belief network; empirical knowledge; solar radiation model
* Corresponding author. Tel.: +86 13770719919; fax: +86 25 58099087. E-mail address: [email protected].

1. Introduction

With growing awareness of fossil fuel substitution and environmental protection, it is increasingly necessary to develop and promote renewable energy sources. Among those, solar energy is a significant resource whose exploitation is growing rapidly, as it is abundant, stable and widely distributed [1, 2]. For urban planning, photovoltaic array sizing and solar thermal system design [3, 4], proper knowledge of daily global solar radiation is required in various regions [5-7]. However, daily solar radiation data are often missing or even unavailable at many meteorological stations, owing to the complex equipment and high maintenance cost of measurements [8, 9]. Therefore, numerous studies on solar radiation estimation have been carried out recently.

Three major methods are commonly applied in estimating daily global solar radiation, i.e. satellite-derived, stochastic and relationship methods [10]. Satellite-derived methods can analyze the earth-atmospheric reflectivity of irradiation [11], gather daily radiation values accurately [12], and estimate solar radiation over large regions [13, 14]. Nevertheless, the equipment cost of satellites is also quite high, which is unacceptable at some meteorological stations. In stochastic algorithms, daily global solar radiation data are generated from the mean values of historical weather observations [15]. The method is therefore infeasible if a station possesses no recent historical solar radiation records. Relationship methods aim to establish relations between solar radiation and other meteorological elements [16], commonly using empirical equations or soft-computing models. These methods have been studied most broadly, as they can readily achieve high estimation precision with correct weather inputs. Another advantage is that the historical solar radiation data used for training those models are sometimes not needed as model inputs at the forecasting phase.

Relationship models that apply empirical equations are also called empirical models. They use different weather elements, e.g. temperature, sunshine duration and cloud cover, to form different empirical models [17]. Temperature-based (TB) models are the most commonly used, as ambient temperature records are more accessible than any other meteorological data [18]. The difference between the daily maximum and minimum temperature values is usually considered in TB models, because the temperature difference indicates the fluctuation of daily global solar radiation [19, 20]. In [21], a new TB model also involves daily mean temperature and demonstrates good estimation performance. Similarly, sunshine duration is an important factor that influences the level of radiation. Hence, sunshine-duration-based (SDB) models are proposed to analyze this factor [22-24]. A remaining problem is that sunshine duration data are unavailable at some stations. In this case, cloud-based models can be adopted instead of SDB models [25, 26], as cloud cover can be calculated according to daily sunshine duration.
However, cloud-based models are sometimes not appropriate in regions with heavy haze, due to the poor air condition [27]. Moreover, the day-of-the-year-based (DYB) model is another kind of empirical model, also called an independent model [28]. It has a concise formula that uses no meteorological inputs [29-32], but its precision is usually worse than that of TB and SDB models.

In contrast to empirical models, soft-computing methods use historical datasets to train a machine learning model, in which the relations between solar radiation and meteorological elements can be modeled. They can merge various weather inputs efficiently and adapt to different climatic regions, and have therefore received more attention from researchers than conventional empirical equations. In [33-35], the artificial neural network (ANN) has been validated for estimating and predicting solar radiation. It has great potential for self-learning, but often suffers from local optimum problems. Support vector regression (SVR) and the adaptive neuro-fuzzy inference system (ANFIS) are two machine learning models suitable for high-dimensional inputs, which are frequently applied and studied in the field of solar radiation estimation [36, 37]. Nevertheless, they can hardly deal with big datasets. Besides, as parameters in Gaussian process regression (GPR) and the extreme learning machine (ELM) can be optimized with adaptive searching approaches, hybrid models based on the two methods are proposed in [38, 39]. Among those, the GPR model can usually obtain promising results with less computational cost [40, 41]. It is noted that the accuracy of these models may decrease when a large number of inputs raises the difficulty of optimization. The above machine learning models have all been proven feasible and practicable in the field of solar radiation estimation. However, they share a common weakness: their training efficiency is often low on a large number of samples.
For example, the computational complexity of kernel functions in SVR and GPR rises sharply with the number of samples, so that the training process can hardly be completed. Hence, those models are normally site-dependent, which means that they are trained and applied at different sites individually. In contrast, deep neural networks use a mini-batch gradient descent training strategy, which is suitable for very large sample sets. Additionally, the estimation accuracy can be further improved with deep models, as they usually possess better nonlinear representation abilities than conventional machine learning models. Thus, deep-learning-based models have recently been proposed and applied in the field of renewable energy assessment. In [42], a long short-term memory (LSTM) network is proposed to forecast global solar radiation. Owing to the recurrent use of hidden layer units, the LSTM network is suitable for processing series inputs. Another common kind of deep model is built by stacking many hidden layers. Such models can adapt to point inputs of multiple elements, and have been established for solar radiation prediction and estimation [43, 44]. However, a deep model with a great number of layers may also face gradient vanishing problems [45], leading to non-convergence issues. Motivated by this, a deep learning model named deep belief network (DBN) is
introduced to estimate daily global solar radiation in this study. It contains a pre-training operation when stacking hidden layers, which alleviates the non-convergence problem [46]. A single DBN model contains a great number of parameters and can be directly used at different stations. The knowledge of empirical equations is also merged into the proposed DBN using a functional network (FN). As few studies of solar radiation estimation have so far focused on deep learning methods [43], the functional DBN is proposed here and validated on a case study in China.

The main contributions of this study can be summarized as follows:

(1) A deep learning method is introduced to solar radiation estimation. Owing to its strong capabilities of nonlinear representation and big data analysis, separate modeling of solar radiation at different meteorological stations can be avoided.
(2) The DBN model is used instead of an ordinary deep neural network. Its parameters can be pre-trained with restricted Boltzmann machines (RBMs), which reduces the training difficulty of deep networks.
(3) The knowledge from empirical equations is incorporated into the estimation model by means of FN. This supports the reliability of the proposed model and further decreases its training difficulty.
(4) A novel clustering method called embedding clustering (EC) is proposed to divide meteorological stations. It consists of an auto-encoder (AE) and k-means. The method can overcome the weakness of conventional k-means that the clustering result is easily affected by the initial clusters.

The rest of this paper is organized as follows. The case study, along with training and testing datasets, is introduced in Section 2. The framework and algorithms of the method are described in Section 3. In Section 4, results and discussions based on the case study are presented. The conclusion is drawn in Section 5.

2. Study area and datasets

A total of 30 meteorological stations in China with vast climatic differences are studied in this paper. Their geographical locations and annual average weather values are shown in Table 1. The stations cover wide ranges of latitude, from 18.04°N in Sanya to 46.14°N in Jiamusi, of longitude, from 75.16°E in Kashgar to 130.05°E in Jiamusi, and of altitude, from 2.5 m in Tianjin to 4507.0 m in Nagqu, as presented in Fig. 1. Because deep learning methods possess a powerful representational capability, the daily global solar radiation at these stations will be estimated either with one overall deep learning model or with a small number of models after clustering.

The meteorological records used in this study are obtained from the China Meteorological Data Sharing Service System (CMDSSS). They include daily maximum dry-bulb temperature (Tmax), daily minimum dry-bulb temperature (Tmin), daily mean dry-bulb temperature (Tmean), daily mean wind speed (WS), daily mean relative humidity (RH), daily sunshine duration (S) and daily global solar radiation on a horizontal surface (H), all measured at daily intervals. The datasets span 22 years, from 1 January 1994 to 31 December 2015, with relatively small measurement deviations. In the raw dataset, missing and outlier values account for about 0.08% of all values, indicating good data quality. Outliers are identified using a two-tailed test with a critical value of three times the standard deviation. Since the missing and outlier values are few in number and scattered in distribution, they are replaced using quadratic spline interpolation. After data cleaning and normalization, all the above meteorological elements, along with the geographical locations and the day number in the year (nday), are taken as input variables of the estimation models. Daily global solar radiation is the output.
In order to ensure training effects and prevent over-fitting problems, the data from 1994 to 2012 are selected as training samples (about 85%) and those from 2013 to 2015 as testing samples.

Table 1
Geographical locations and annual average values of meteorological data at the 30 stations studied in China.

Station     Latitude (°N)  Longitude (°E)  Altitude (m)  Tmax (°C)  Tmin (°C)  Tmean (°C)  S (hours)  WS (m/s)  RH    H (MJ/m2)
Beijing     39.13          116.08          31.3          18.6       8.5        13.3        6.7        2.3       0.53  13.55
Changchun   43.15          125.04          236.8         11.7       1.7        6.5         7.0        3.2       0.61  13.46
Changsha    28.04          112.15          68.0          22.1       14.9       17.9        4.2        2.2       0.77  10.77
Dongsheng   39.14          109.16          1460.4        12.9       2.8        7.4         8.4        2.8       0.47  16.11
Fuzhou      26.01          119.05          84.0          25.2       17.6       20.6        4.2        2.6       0.73  12.41
Guangzhou   23.03          113.06          41.0          26.9       19.4       22.5        4.3        1.8       0.75  11.75
Haikou      20.01          110.06          13.9          28.4       22.1       24.6        5.0        2.6       0.82  14.19
Hami        42.14          93.09           737.2         18.8       3.4        10.5        9.3        1.3       0.45  16.67
Hangzhou    30.04          120.03          41.7          22.0       14.1       17.5        4.5        2.0       0.73  12.13
Harbin      45.13          126.13          142.3         10.7       0.2        5.3         6.3        2.6       0.64  12.98
Hefei       31.14          117.04          27.9          21.2       13.0       16.7        4.9        2.6       0.74  12.39
Jiamusi     46.14          130.05          81.2          9.9        -1.4       4.2         6.5        2.8       0.66  12.59
Jinan       36.10          117.01          170.3         19.9       10.8       14.9        6.0        3.0       0.57  13.19
Kashgar     39.08          75.16           1288.7        18.9       6.9        12.8        8.0        1.8       0.48  15.45
Kunming     25.00          102.11          1892.4        21.8       11.8       16.1        6.0        2.1       0.68  15.55
Lhasa       29.11          91.02           3648.7        16.8       3.1        9.1         8.2        1.7       0.40  20.41
Nagqu       31.08          92.01           4507.0        7.6        -6.5       -0.1        7.2        2.3       0.52  19.09
Nanchang    28.10          115.15          46.7          22.4       15.5       18.5        5.0        1.9       0.74  12.14
Nanjing     32.00          118.13          7.1           21.0       12.7       16.4        5.3        2.3       0.73  12.50
Nanning     22.11          108.04          121.6         26.5       18.5       21.8        4.1        1.4       0.79  12.38
Sanya       18.04          109.09          5.9           28.8       23.0       25.3        6.1        2.7       0.80  16.46
Shanghai    31.07          121.08          6.0           20.8       14.3       17.2        4.8        3.1       0.73  12.57
Shenyang    41.12          123.08          44.7          14.4       3.3        8.6         6.5        2.7       0.64  13.44
Taiyuan     37.13          112.09          778.3         17.8       5.2        11.0        6.7        1.9       0.56  13.77
Tianjin     39.01          117.01          2.5           18.6       8.6        13.1        6.2        2.4       0.61  13.36
Urumqi      43.13          87.11           935.0         13.2       3.7        7.9         7.3        2.4       0.57  14.47
Wuhan       30.10          114.02          23.1          22.0       14.1       17.5        5.0        1.4       0.75  11.85
Xining      36.12          101.13          2295.2        14.6       -0.3       6.0         7.0        1.0       0.58  15.49
Yinchuan    38.08          106.04          1111.4        16.8       4.4        10.2        7.5        2.2       0.52  15.77
Zhengzhou   34.12          113.11          110.4         20.8       10.8       15.4        5.1        2.2       0.62  12.93
Fig. 1. The geographical locations of the 30 studied meteorological stations in China.
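The data preparation described in this section can be illustrated with a minimal NumPy sketch. It flags missing values and values more than three standard deviations from the mean, fills the flagged entries, and splits samples chronologically at the end of 2012. The function names are hypothetical, and linear interpolation (`np.interp`) is used here only as a simple stand-in for the quadratic spline interpolation applied in the study.

```python
import numpy as np

def flag_bad_values(values, n_sigma=3.0):
    """Mask entries that are missing (NaN) or lie beyond n_sigma standard deviations."""
    values = np.asarray(values, dtype=float)
    mean = np.nanmean(values)
    std = np.nanstd(values)
    return np.isnan(values) | (np.abs(values - mean) > n_sigma * std)

def fill_flagged(values, mask):
    """Fill flagged entries from their valid neighbours.

    Assumption: linear interpolation replaces the quadratic spline of the study.
    """
    values = np.asarray(values, dtype=float)
    idx = np.arange(values.size)
    cleaned = values.copy()
    cleaned[mask] = np.interp(idx[mask], idx[~mask], values[~mask])
    return cleaned

def split_by_year(years, last_train_year=2012):
    """Chronological split: 1994 to 2012 for training, 2013 to 2015 for testing."""
    years = np.asarray(years)
    return years <= last_train_year, years > last_train_year
```

With a single series, `fill_flagged(x, flag_bad_values(x))` returns the cleaned series. In practice each station and each variable would presumably be cleaned separately before normalization.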
3. Methodology

3.1. Estimation framework for daily global solar radiation

As shown in Fig. 2, solar radiation estimation builds and trains a model whose inputs are meteorological elements and whose output is daily global solar radiation. Because empirical and conventional machine learning methods usually have only a small number of trainable parameters, one model is suitable for only one station. To avoid building separate estimation models for different stations, and thus to improve the convenience and efficiency of practical applications, deep learning models that specialize in analyzing big datasets, i.e. functional DBNs, are introduced to estimate solar radiation. In this study, two estimation approaches based on deep learning are proposed. The first builds one functional DBN on the overall dataset, which involves radiation samples from all 30 stations. Since such a large model may be difficult to train, the second approach combines functional DBNs with the EC method. EC is a novel clustering method that splits the dataset into several parts according to the dissimilar curve shapes of daily solar radiation, and the different parts are then analyzed under different models. The principles and algorithms of those methods are described in detail below.
Fig. 2. The overall framework of the proposed estimation method for daily global solar radiation. Panels: basic principle of estimation; proposed embedding clustering (EC) + functional deep belief network (DBN), comprising (i) estimation merely using functional DBN, (ii) estimation using EC + functional DBN and (iii) framework of functional DBN; comparison of conventional method and deep learning method.
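To make the contrast between site-specific models and one pooled model concrete, the following sketch stacks per-station samples into a single training matrix and prepends each station's latitude, longitude and altitude as extra input columns. The dictionary layout and the function name `pool_stations` are assumptions made for illustration; the paper does not prescribe a data structure.

```python
import numpy as np

def pool_stations(station_data):
    """Stack samples from all stations into one design matrix.

    station_data maps a station name to a dict with keys
    "lat", "lon", "alt", "X" (n_i x d weather inputs) and "y" (n_i targets).
    The three location values are repeated for every sample of the station.
    """
    x_parts, y_parts = [], []
    for rec in station_data.values():
        n_samples = rec["X"].shape[0]
        location = np.tile([rec["lat"], rec["lon"], rec["alt"]], (n_samples, 1))
        x_parts.append(np.hstack([location, rec["X"]]))
        y_parts.append(rec["y"])
    return np.vstack(x_parts), np.concatenate(y_parts)
```

A conventional site-specific workflow would instead fit one model per entry of `station_data`; the pooled matrix is what allows a single network to serve every station.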
3.2. Radiation curves clustering

As mentioned before, two modeling approaches are considered. One establishes a single deep-learning-based model to estimate solar radiation at all 30 stations, and the other uses a clustering method to divide those stations into several groups that are modeled separately. In either approach, there is no need to establish a model for each individual station. Because radiation curves hold certain similarities, they are analyzed and clustered as the representatives of stations in this study. Moreover, to overcome the problem that conventional clustering methods are heavily affected by initial centers, a novel EC method that combines AE and k-means clustering is introduced, whose structure is shown in Fig. 3. The operating procedure of the proposed EC method can be summarized as follows:

1. Initialize the number of clusters;
2. Train an AE model to calculate the classification probabilities of clusters for stations, where the annual average daily solar radiation curves of those stations are chosen as inputs;
3. Initialize cluster centers based on the classification probabilities and run k-means clustering;
4. Evaluate the clustering results with the indicator called residual sum of squares (RSS), and finally determine the number of clusters.
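Steps 3 and 4 above can be sketched in NumPy as follows. The sketch assumes the cluster probabilities are already available (in the paper they come from the trained AE encoder, which is not reproduced here) and weights each curve by its squared membership probability, as in the centre-initialization formula of this section. The function names are hypothetical.

```python
import numpy as np

def init_centers(curves, probs):
    """Probability-weighted initial centers.

    curves: (n_stations, n_days) radiation curves X_k.
    probs:  (n_stations, n_clusters) membership probabilities p_ik.
    Returns C_i = sum_k p_ik^2 X_k / sum_k p_ik^2 for every cluster i.
    """
    weights = probs ** 2
    return (weights.T @ curves) / weights.sum(axis=0)[:, None]

def kmeans(curves, centers, n_iter=100):
    """Plain k-means started from the given centers."""
    centers = centers.copy()
    for _ in range(n_iter):
        dist = ((curves[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for i in range(centers.shape[0]):
            members = curves[labels == i]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[i] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def correlation_indicator(curves, labels, centers):
    """Sum of Pearson correlations between each curve and its cluster center."""
    return sum(np.corrcoef(centers[lab], x)[0, 1] for x, lab in zip(curves, labels))
```

Running the three functions for several candidate cluster counts and comparing the indicator values mirrors step 4, where the number of clusters is chosen by observing how the indicator changes.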
Fig. 3. The structure of embedding clustering (EC).
An AE model consists of two parts, i.e. an encoder and a decoder, which are both made of dense layers (fully-connected layers) [47]:

$$ y_i^l = \alpha\left(\sum_{j=1}^{n} w_{ij}^l x_j^{l-1} + b_i^l\right) \qquad (1) $$

where $w_{ij}^l$ is the weight of the lth layer that connects the ith output and the jth input; $b_i^l$ represents the ith bias of the lth layer; $x_j^{l-1}$ and $y_i^l$ are the jth input from the (l-1)th layer and the ith output of the lth layer, respectively; $\alpha(\cdot)$ represents an activation function and n denotes the number of input units. The outputs of the final dense layer in the encoder represent the extracted high-order features. In the proposed EC method, their values indicate the classification probabilities of the different clusters, so the number of those outputs is equal to the number of clusters. Based on the classification probabilities, the decoder aims to reconstruct the input features with minimum deviation:

$$ \mathcal{L}_{AE} = \left\lVert X_{AE} - \hat{X}_{AE} \right\rVert_2 \qquad (2) $$
where $\mathcal{L}_{AE}$ is the training loss function of the AE; $X_{AE}$ and $\hat{X}_{AE}$ denote the input vector and the reconstructed output vector of the AE, respectively. Additionally, the model used in this study is not stacked from shallow AEs. Instead, it is directly built with a deep encoder and a deep decoder, each with four layers. According to Fig. 3, three different activation functions are applied:

$$ \alpha_{ReLU}(x) = \max(0, x) \qquad (3) $$

$$ \alpha_{sigmoid}(x) = \frac{1}{1 + e^{-x}} \qquad (4) $$

$$ \alpha_{softmax}(x_i) = \frac{e^{x_i}}{\sum_j e^{x_j}} \qquad (5) $$
where $x_i$ is the ith unit for activation. Sigmoid is a conventional nonlinear activation function, ReLU (rectified linear unit) benefits the back-propagation training algorithm and helps prevent gradient vanishing, and the softmax function is used to calculate classification probabilities. After training an AE model, the outputs of the encoder are used to initialize the cluster centers:

$$ C_i = \frac{\sum_k p_{ik}^2 X_k}{\sum_k p_{ik}^2} \qquad (6) $$
where $X_k$ is the radiation curve vector of the kth station and $p_{ik}$ denotes the probability that the kth radiation curve belongs to the ith cluster. The initial centers ($C_i$) are adopted in k-means clustering, a simple and fast aggregation algorithm that determines the representative of a cluster by computing the mean value of its members [48]. It should be noted that the embedding method can also be combined with other clustering methods that need initial cluster centers, e.g. fuzzy c-means [49] and k-medians [50]. In order to decide the number of clusters, the RSS indicator based on Pearson correlation is applied in this study as:

$$ RSS = \sum_i \sum_k \rho\left(\tilde{C}_i, \tilde{X}_{ik}\right) \qquad (7) $$

where $\rho(\cdot)$ is the Pearson correlation function; $\tilde{C}_i$ and $\tilde{X}_{ik}$ are the final results obtained by the EC method, which denote the ith cluster center and the kth curve vector that belongs to the ith cluster, respectively. The number of clusters can be determined by observing the changes of RSS, and finally the clustering results of solar radiation curves can be obtained.

3.3. Functional deep belief network
Fig. 4. The functional deep belief network structure for daily global solar radiation. Layers shown: input, knowledge, polynomial, elimination and DBN.
Functional DBN is a hybrid model combining DBN and FN. The structure of the functional DBN used here is shown in Fig. 4. It consists of four parts, i.e. a knowledge functional layer, a polynomial functional layer, a neuron elimination layer and a DBN. A neuron (node) in the functional layers can be described as:

$$ y_i^l = f_i\left(x_1^{l-1}, x_2^{l-1}, \ldots, x_j^{l-1}, \ldots, x_n^{l-1}\right) \qquad (8) $$

where $f_i(\cdot)$ is the computational function of the ith neuron in a functional layer. Hence, the connections of neuron nodes in functional layers are realized by those functions, rather than by the weight matrices used in dense layers. There are various families of functions, e.g. polynomial, exponential and trigonometric ones, and it is hard to decide exactly which ones to use. Therefore, knowledge extracted from empirical solar radiation models, which usually takes the form of nonlinear elements, is selected as the set of basic functions. Subsequently, those basic functions are further transformed into polynomial functions. Due to the large number of polynomial combinations, a backward elimination (BE) algorithm is used to eliminate redundant functions. The core of BE is to test each function by deleting it from the model. If deleting the function causes no deterioration in model performance, the function is redundant and should be eliminated [51].

3.3.1 Knowledge extraction from empirical models
As mentioned previously, knowledge extracted from empirical models is selected as basic functions, which are the neurons of the knowledge functional layer. Knowledge extraction has two merits. On the one hand, as those neurons themselves contain strong nonlinear characterizations, the network for solar radiation already achieves a high complexity and need not be designed with an extremely deep architecture. On the other hand, the functional neurons help extract features from the original inputs in advance, which reduces the difficulty of model training and improves the estimation accuracy.

As for empirical models, meteorological variables hold a strong correlation with global solar radiation, especially sunshine duration and temperature. Thus, SDB and TB models have been applied worldwide. Among those, the Angstrom-Prescott linear model is one of the most common SDB models, which is established to calculate the clearness index (H/Ho) [52]:

$$ \frac{H}{H_o} = a + b\frac{S}{S_o} \qquad (9) $$

Furthermore, polynomial SDB models are also widely developed [53-55]:

$$ \frac{H}{H_o} = a + b\frac{S}{S_o} + c\left(\frac{S}{S_o}\right)^2 + d\left(\frac{S}{S_o}\right)^3 \qquad (10) $$

where a, b, c and d are trainable parameters. The daily extraterrestrial solar radiation (Ho) and the daily maximum possible sunshine duration (So) can be calculated as [56]:

$$ H_o = \frac{24}{\pi} I_{SC}\left[1 + 0.034\cos\left(\frac{360\,n_{day}}{365}\right)\right] \times \left[\cos\varphi\cos\delta\sin\omega_s + \frac{2\pi\omega_s}{360}\sin\varphi\sin\delta\right] \qquad (11) $$

$$ S_o = \frac{2}{15}\cos^{-1}\left(-\tan\varphi\tan\delta\right) \qquad (12) $$
where $\varphi$ is the latitude of a station and $I_{SC}$ denotes the solar constant, which is equal to 1367 W/m2. The daily solar declination ($\delta$) and the daily sunrise hour angle ($\omega_s$) are computed with the following equations:

$$ \delta = 23.45\sin\left(\frac{360\,(n_{day} + 284)}{365}\right) \qquad (13) $$

$$ \omega_s = \cos^{-1}\left(-\tan\delta\tan\varphi\right) \qquad (14) $$

Similar to the SDB model, the TB model aims at estimating daily global solar radiation using ambient temperature. Different temperature types, i.e. Tmax, Tmin and Tmean, have been merged in building TB models. The Hargreaves model is a typical TB model for estimating global solar radiation [57, 58]:

$$ \frac{H}{H_o} = a\,\Delta T + b + \frac{c}{H_o} \qquad (15) $$
where a, b and c are parameters. $\Delta T$ represents the daily temperature difference, which is equal to Tmax - Tmin. Another two TB models combining daily temperature difference and maximum possible sunshine duration are introduced in [59]:

$$ \frac{H}{H_o} = a + b\frac{\Delta T}{S_o} + c\left(\frac{\Delta T}{S_o}\right)^2 + d\left(\frac{\Delta T}{S_o}\right)^3 \qquad (16) $$

$$ \frac{H}{H_o} = a + e^{b(\Delta T / S_o)} \qquad (17) $$

In [60], a modified Hargreaves model is proposed based on the logarithmic temperature difference:

$$ \frac{H}{H_o} = a + b\ln(\Delta T) \qquad (18) $$

Besides, hybrid variable models are also proven effective in solar radiation estimation. The following two, which involve temperature and humidity, are referred to in this study [61, 62]:

$$ \frac{H}{H_o} = a + b\left(\frac{\Delta T + RH}{S_o}\right)^{0.5} \qquad (19) $$

$$ H = a\cos\varphi + b\cos(n_{day}) + cT_{max} + d\frac{S}{S_o} + e\frac{T_{max}}{RH} + f\left(\frac{T_{max}}{RH}\right)^2 + g \qquad (20) $$
where a, b, c, d, e, f and g are trainable parameters. Based on all the above empirical models, the basic neurons in the knowledge functional layer can be decided. There are 21 such neurons, as presented in Table 2. In the polynomial functional layer, those neurons are combined to constitute a 3rd-order polynomial family of functions. Furthermore, the final number of inputs to the DBN is chosen as 182 after neuron elimination via the BE method.

Table 2
Basic functional neurons in the knowledge functional layer for estimating daily global solar radiation.

Neuron                  Type         Description / Relevant empirical models
φ                       basic input  latitude
λ                       basic input  longitude
h                       basic input  altitude
Tmax                    basic input  daily maximum dry-bulb temperature
Tmin                    basic input  daily minimum dry-bulb temperature
Tmean                   basic input  daily mean dry-bulb temperature
RH                      basic input  daily mean relative humidity
WS                      basic input  daily mean wind speed
nday                    basic input  the number of the day in a year
S                       basic input  daily sunshine duration
cos φ                   function     the cosine function of latitude
sin φ                   function     the sine function of latitude
cos δ                   function     the cosine function of daily solar declination
sin δ                   function     the sine function of daily solar declination
S/So                    function     H/Ho = a + b(S/So); H/Ho = a + b(S/So) + c(S/So)^2 + d(S/So)^3
Tmax - Tmin             function     H/Ho = a ΔT + b + c/Ho
(Tmax - Tmin)/So        function     H/Ho = a + b(ΔT/So) + c(ΔT/So)^2 + d(ΔT/So)^3
e^((Tmax - Tmin)/So)    function     H/Ho = a + e^(b(ΔT/So))
ln(Tmax - Tmin)         function     H/Ho = a + b ln(ΔT)
(Tmax - Tmin + RH)/So   function     H/Ho = a + b((ΔT + RH)/So)^0.5
Tmax/RH                 function     H = a cos φ + b cos(nday) + c Tmax + d(S/So) + e(Tmax/RH) + f(Tmax/RH)^2 + g
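As an illustration of how the function-type neurons of Table 2 could be evaluated, the sketch below implements the expressions for δ, ω_s, S_o and H_o from Eqs. (11) to (14) and then derives the eleven function neurons. Angles are handled in degrees as in the equations. The conversion of H_o from Wh/m2 to MJ/m2, the function names and the dictionary keys are assumptions made for this example, and the sketch is only valid for latitudes without polar day or night.

```python
import numpy as np

I_SC = 1367.0  # solar constant in W/m^2

def solar_geometry(lat_deg, nday):
    """Return (delta in degrees, S_o in hours, H_o in MJ/m^2) for one day."""
    phi = np.radians(lat_deg)
    delta_deg = 23.45 * np.sin(np.radians(360.0 * (nday + 284) / 365.0))
    delta = np.radians(delta_deg)
    omega_s_deg = np.degrees(np.arccos(-np.tan(delta) * np.tan(phi)))
    s_o = 2.0 / 15.0 * omega_s_deg
    omega_s = np.radians(omega_s_deg)  # equals 2*pi*omega_s_deg/360
    eccentricity = 1.0 + 0.034 * np.cos(np.radians(360.0 * nday / 365.0))
    h_o_wh = (24.0 / np.pi) * I_SC * eccentricity * (
        np.cos(phi) * np.cos(delta) * np.sin(omega_s)
        + omega_s * np.sin(phi) * np.sin(delta)
    )
    return delta_deg, s_o, h_o_wh * 3600.0 / 1e6  # assumed Wh -> MJ conversion

def knowledge_neurons(lat_deg, nday, t_max, t_min, rh, s):
    """Evaluate the function-type neurons of the knowledge layer."""
    delta_deg, s_o, _ = solar_geometry(lat_deg, nday)
    d_t = t_max - t_min
    return {
        "cos_phi": np.cos(np.radians(lat_deg)),
        "sin_phi": np.sin(np.radians(lat_deg)),
        "cos_delta": np.cos(np.radians(delta_deg)),
        "sin_delta": np.sin(np.radians(delta_deg)),
        "S/So": s / s_o,
        "dT": d_t,
        "dT/So": d_t / s_o,
        "exp(dT/So)": np.exp(d_t / s_o),
        "ln(dT)": np.log(d_t),
        "(dT+RH)/So": (d_t + rh) / s_o,
        "Tmax/RH": t_max / rh,
    }
```

Together with the ten basic inputs, these eleven values make up the 21 neurons listed in Table 2; the polynomial layer would then form products of them up to third order.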
3.3.2 Unsupervised learning strategy of deep belief network

Fig. 5. The hierarchical architecture of deep belief network (DBN).
DBN is a deep network stacked from a special kind of dense layers, namely RBMs, whose parameters can be pre-trained with an unsupervised learning strategy. Hence, the entire training process of a DBN can be divided into two phases: an unsupervised pre-training phase and a supervised back-propagation (fine-tuning) phase. The RBM is an energy-based model whose aim is to extract hidden features and pre-train the weights of dense layers, in order to accelerate the convergence rate of training a deep network and to avoid gradient vanishing and local optimum problems. As shown in Fig. 5, it is made up of a visible layer and a hidden layer, and is defined by an energy function ε [63]:

$$ \varepsilon(v, h \mid \theta) = -\sum_{i=1}^{n}\sum_{j=1}^{m} \omega_{ij} h_i v_j - \sum_{j=1}^{m} a_j v_j - \sum_{i=1}^{n} b_i h_i \qquad (21) $$
where $v_j$ and $h_i$ are the jth unit in the visible layer and the ith unit in the hidden layer, respectively; m and n are the numbers of visible units and hidden units, respectively. $\theta = \{\omega_{ij}, a_j, b_i : 1 \le i \le n, 1 \le j \le m\}$ is the trainable parameter set, where $\omega_{ij}$ represents the weight between $v_j$ and $h_i$, and $a_j$ and $b_i$ are the biases in the visible layer and hidden layer, respectively. RBMs are trained layer by layer, where the outputs of the former hidden layer are taken as the inputs of the latter visible layer. The strategy is to learn a probability distribution from the visible layer to the hidden layer. First, the joint probability distribution P is defined in terms of the energy function:

$$ P(v, h \mid \theta) = \frac{1}{Z(\theta)} e^{-\varepsilon(v, h \mid \theta)} \qquad (22) $$

$$ Z(\theta) = \sum_{v,h} e^{-\varepsilon(v, h \mid \theta)} \qquad (23) $$

where Z(θ) denotes the partition function. Next, the conditional probabilities of $v_j$ given h and of $h_i$ given v can be deduced as follows [64]:

$$ P(v_j = 1 \mid h) = \alpha_{sigmoid}\left(a_j + \sum_{i=1}^{n} \omega_{ij} h_i\right) \qquad (24) $$

$$ P(h_i = 1 \mid v) = \alpha_{sigmoid}\left(b_i + \sum_{j=1}^{m} \omega_{ij} v_j\right) \qquad (25) $$
Since training an RBM means adjusting its parameter set θ, the objective is defined as maximizing the log-likelihood:

$$ \mathcal{L}_{RBM}(\theta) = \ln \prod_{i=1}^{n_s} P(v^{(i)} \mid \theta) = \sum_{i=1}^{n_s} \ln P(v^{(i)} \mid \theta) \qquad (26) $$

where $\mathcal{L}_{RBM}(\theta)$ denotes the log-likelihood function in terms of the parameter set θ, $n_s$ is the number of samples and $v^{(i)}$ is the visible vector of the ith sample. Then, the gradient ascent algorithm is adopted to solve the objective function and adjust the parameters:

$$
\begin{aligned}
\frac{\partial \mathcal{L}_{RBM}(\theta)}{\partial \theta}
&= \frac{\partial}{\partial \theta} \sum_{i=1}^{n_s} \ln \sum_{h} P(v^{(i)}, h \mid \theta) \\
&= \frac{\partial}{\partial \theta} \sum_{i=1}^{n_s} \ln \frac{\sum_{h} e^{-\varepsilon(v^{(i)}, h \mid \theta)}}{\sum_{v,h} e^{-\varepsilon(v, h \mid \theta)}} \\
&= \sum_{i=1}^{n_s} \left[ -E_{P(h \mid v^{(i)}, \theta)}\left[\frac{\partial \varepsilon(v^{(i)}, h \mid \theta)}{\partial \theta}\right] + E_{P(v, h \mid \theta)}\left[\frac{\partial \varepsilon(v, h \mid \theta)}{\partial \theta}\right] \right]
\end{aligned}
\qquad (27)
$$

where $E_P(\cdot)$ represents the mathematical expectation under distribution P. When the units in the first visible layer are given as training samples, the parameters of each RBM can be adjusted. However, evaluating the gradient exactly has a computational complexity of $O(2^{n+m})$, which makes the training of deep networks inefficient. Therefore, Markov chain Monte Carlo (MCMC) with Gibbs sampling can be applied to estimate the expectations. Further, contrastive divergence (CD) is introduced for approximate sampling, since obtaining those expectation samples is usually also intractable [65]. After the CD operation, the pre-training phase is completed.
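The pre-training update can be made concrete with a minimal NumPy sketch of one contrastive-divergence step that uses a single Gibbs transition (often called CD-1). It follows the conditional probabilities of Eqs. (24) and (25) and approximates the two expectations in Eq. (27) by the data-driven and the reconstructed statistics. The function name `cd1_step`, the learning rate, and the use of probabilities instead of binary samples in the reconstruction are illustrative choices, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, a, b, lr=0.1, rng=None):
    """Apply one CD-1 update in place and return the reconstruction error.

    v0: (batch, m) visible data with values in [0, 1].
    W:  (n, m) weights omega_ij (i indexes hidden, j indexes visible units).
    a:  (m,) visible biases; b: (n,) hidden biases.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Positive phase: P(h_i = 1 | v) for the training data.
    ph0 = sigmoid(b + v0 @ W.T)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step back to the visible layer and up again.
    pv1 = sigmoid(a + h0 @ W)
    ph1 = sigmoid(b + pv1 @ W.T)
    # Gradient ascent on the approximated log-likelihood gradient.
    batch = v0.shape[0]
    W += lr * (ph0.T @ v0 - ph1.T @ pv1) / batch
    a += lr * (v0 - pv1).mean(axis=0)
    b += lr * (ph0 - ph1).mean(axis=0)
    return float(np.mean((v0 - pv1) ** 2))
```

In a stacked DBN, the hidden probabilities of one trained RBM would serve as the visible data of the next, after which the whole network is fine-tuned by back-propagation.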
4. Results and discussions
To validate the reliability and accuracy of the proposed methods, a case study covering 30 meteorological stations in China is adopted. A DBN model without functional neurons, the proposed functional DBN, and the functional DBN with the EC method are established to estimate daily global solar radiation. Three machine learning models, namely SVR, GPR and ANFIS, are built for comparison. As they are inefficient to train on a large number of samples, they are modeled separately for each station, i.e., as site-specific models. Furthermore, three empirical models from the related literature, which outperform conventional models on the dataset of China, are also established in this study:
(1) Empirical model 1 (EM1), which is taken from the P3 model in [52]:

$$H = H_o \left[ a + b \left( \frac{S}{S_o} \right)^{c} + d \ln(\Delta T) + e T_{mean} \right] \tag{28}$$

(2) Empirical model 2 (EM2), named the expansion of the improved Bristow-Campbell model [66], which introduces more weather elements into the Bristow-Campbell TB model:
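$$H = H_o \left[ a + b \sin\left(\frac{2\pi j_{day}}{365}\right) + c \cos\left(\frac{2\pi j_{day}}{365}\right) + d R_H + e S \right] \times \left[ 1 - \exp\left(-f \Delta T^{g}\right) \right] \tag{29}$$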
where jday represents the Julian day. (3) Empirical model 3 (EM3), which is a general model considering the locations of stations [67]:
$$\begin{cases}
H = H_o \left( a^{*} \Delta T + b^{*} R_H + c^{*} \dfrac{S}{S_o} \right) + d^{*} \\
a^{*} = a_1 \varphi + a_2 \lambda + a_3 h + a_4 \\
b^{*} = b_1 \varphi + b_2 \lambda + b_3 h + b_4 \\
c^{*} = c_1 \varphi + c_2 \lambda + c_3 h + c_4 \\
d^{*} = d_1 \varphi + d_2 \lambda + d_3 h + d_4
\end{cases} \tag{30}$$

where $a^{*}$, $b^{*}$, $c^{*}$ and $d^{*}$ are coefficients that depend on location (i.e. latitude $\varphi$, longitude $\lambda$ and altitude $h$), and $a_1$~$a_4$, $b_1$~$b_4$, $c_1$~$c_4$ and $d_1$~$d_4$ are the trainable parameters. The estimation results of all the above models, together with comparisons and discussions, are presented in this section. Moreover, the proposed method is extended to daily global solar radiation forecasting and hourly global solar radiation estimation; those results are presented and discussed as well.
4.1. Performance criteria
Four criteria are used in this study to measure the estimation performance of the different models, i.e. mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE) and correlation coefficient (R) [66-69]. They are calculated with the following equations:

$$MAE = \frac{1}{n_s} \sum_{i=1}^{n_s} \left| H_{m,i} - H_{e,i} \right| \tag{31}$$

$$RMSE = \sqrt{\frac{1}{n_s} \sum_{i=1}^{n_s} \left( H_{m,i} - H_{e,i} \right)^2} \tag{32}$$

$$MAPE = \frac{1}{n_s} \sum_{i=1}^{n_s} \left| \frac{H_{m,i} - H_{e,i}}{H_{m,i}} \right| \times 100\% \tag{33}$$

$$R = \frac{\sum_{i=1}^{n_s} \left( H_{m,i} - \bar{H}_m \right) \left( H_{e,i} - \bar{H}_e \right)}{\sqrt{\sum_{i=1}^{n_s} \left( H_{m,i} - \bar{H}_m \right)^2 \sum_{i=1}^{n_s} \left( H_{e,i} - \bar{H}_e \right)^2}} \tag{34}$$
where $n_s$ is the number of samples; $H_{m,i}$ and $H_{e,i}$ are the $i$th measured value and the $i$th estimated value, respectively; and $\bar{H}_m$ and $\bar{H}_e$ are the mean values of $H_{m,i}$ and $H_{e,i}$, respectively. MAE, RMSE and MAPE are non-negative, and a value close to zero denotes high estimation accuracy. R lies within [-1, 1], and a value closer to 1 indicates a stronger positive correlation between measured and estimated values.
4.2. Clustering results
Based on the proposed EC method, the global solar radiation curves are clustered, and the centers of the clusters are shown in Fig. 6. Stations with dissimilar solar radiation curves are thus separated, and their names are marked in different colors in Fig. 1. According to the EC results and locations, stations in the same cluster show certain geographic correlations. Kunming station belongs to a special cluster, i.e. cluster 2, which receives a great amount of global solar radiation over the year. Moreover, the number of clusters is determined from the changes of RSS, which are presented in Fig. 7. RSS keeps decreasing as the number of clusters increases, but the decrease slows down gradually. Hence, the number of clusters is set to 4, as more clusters do not significantly reduce the value of RSS. The structure of an AE model with 4 clusters is shown in Table 3. The results of the EC method can be combined with the proposed functional DBN, and whether clustering improves the estimation performance is validated further in this study.
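The selection rule described above (stop adding clusters once the drop in RSS becomes small) can be sketched as follows. This is only an illustration: the paper chooses the number of clusters by inspecting Fig. 7, whereas the function name, the threshold `min_drop` and the RSS values in the usage line are hypothetical.

```python
def choose_num_clusters(rss, min_drop):
    """Return the smallest k for which adding one more cluster lowers RSS by less than min_drop.

    rss[k - 1] is the RSS indicator obtained with k clusters.
    """
    for k in range(1, len(rss)):
        drop = rss[k - 1] - rss[k]  # RSS reduction when going from k to k + 1 clusters
        if drop < min_drop:
            return k
    return len(rss)


# Hypothetical RSS values for 1 to 5 clusters
num_clusters = choose_num_clusters([1.18, 1.12, 1.08, 1.06, 1.055], min_drop=0.01)
```

With these made-up values the fifth cluster lowers RSS by only 0.005, so the rule settles on 4 clusters. A fixed threshold is a simplification of the visual judgment used in the study.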
Fig. 6. The clustering centers of radiation curves from 30 meteorological stations. (Panels: Cluster 1, Cluster 2, Cluster 3 and Cluster 4; horizontal axis: nday; vertical axis: daily solar radiation (MJ/m2).)
Fig. 7. The indicator of RSS and its changes based on different numbers of clusters. (Horizontal axis: number of clusters; vertical axes: RSS and changes of RSS.)

Table 3 The structure and hyper-parameters of an AE with 4 clusters.

| Layer | Hyper-parameters | The number of trainable parameters |
|---|---|---|
| Input | Input number: 365 | None |
| Encoder 1 (Dense) | Node number: 5; Activation: ReLU | 1,830 |
| Encoder 2 (Dense) | Node number: 16; Activation: ReLU | 96 |
| Cluster (Dense) | Node number: 4; Activation: softmax | 68 |
| Decoder 1 (Dense) | Node number: 16; Activation: ReLU | 80 |
| Decoder 2 (Dense) | Node number: 5; Activation: ReLU | 85 |
| Output (Dense) | Node number: 365; Activation: sigmoid | 2,190 |
| Summary | – | 4,349 |
4.3. Daily global solar radiation estimation
The structures and hyper-parameters of the utilized DBN and functional DBN are presented in Table 4, where the DBN has no knowledge, polynomial and elimination layers. In the functional DBN, neurons in the functional layer and the elimination layer are fixed before training, so trainable parameters exist only in the DBN part, which is pre-trained by RBMs. The numbers of hidden layers and hidden neurons are equal in both DBN and functional DBN, for the convenience of comparison. The estimation results of daily global solar radiation for 30 stations in China, obtained from the three empirical models, SVR, GPR, ANFIS, DBN, functional DBN and EC + functional DBN, are shown in Table 5. The EM1, EM2, EM3, SVR, GPR and ANFIS models are built and applied at each meteorological station independently. The DBN and functional DBN models are established and trained on the overall dataset that involves all 30 stations, while EC + functional DBN divides the dataset into four clusters and then builds a separate estimation model for each cluster. According to the comparisons, functional DBN with EC achieves the best average scores of the performance criteria over the entire testing dataset, with 1.706 MJ/m2 of MAE, 2.352 MJ/m2 of RMSE, 13.71% of MAPE and 0.955 of R, indicating high accuracy and practicability. It should be noted that all the validated models are trained 10 times, and the comparison results in Table 5 are the mean values of the performance criteria. The box-plots of those performance criteria are shown in Fig. 8. The case studies are conducted on a personal computer with an Intel Core i7-7700K CPU, an NVIDIA GTX 1080 GPU and 16 GB RAM, using Python 3.5. The total processing time of training models for 30 stations is provided in Table 6. For the empirical models, the boxes of EM2 are the highest and longest, indicating low estimation accuracy and poor stability. Thus, an empirical model with complex and strongly nonlinear equations would be hard to train, as it falls into local minima more easily. EM3 is the most stable and accurate empirical model, but it also has the most parameters and the longest training time.
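The four criteria of Section 4.1 and the averaging over repeated trainings mentioned above can be sketched in NumPy as follows. The function names are illustrative, and the MAPE line assumes that every measured value is strictly positive, which the paper does not state explicitly.

```python
import numpy as np


def performance_criteria(h_m, h_e):
    """Return MAE, RMSE, MAPE (%) and R for measured h_m and estimated h_e."""
    h_m = np.asarray(h_m, dtype=float)
    h_e = np.asarray(h_e, dtype=float)
    err = h_m - h_e
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    mape = np.mean(np.abs(err) / h_m) * 100.0  # assumes every measured value is > 0
    dm = h_m - h_m.mean()
    de = h_e - h_e.mean()
    r = np.sum(dm * de) / np.sqrt(np.sum(dm ** 2) * np.sum(de ** 2))
    return mae, rmse, mape, r


def mean_over_runs(runs):
    """Average the four criteria over repeated trainings.

    runs is a list of (h_m, h_e) pairs, one pair per training run.
    """
    return np.mean([performance_criteria(m, e) for m, e in runs], axis=0)
```

Passing ten `(h_m, h_e)` pairs to `mean_over_runs` would mirror the reporting of mean values over 10 trainings, under the assumption that each run is scored on the same test samples.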
For the machine learning methods, the boxes are all lower than those of the empirical models and flatter than those of the deep learning models. Hence, they are more precise than the empirical models. Among them, GPR shows the best accuracy, with 1.817 MJ/m2 of MAE, 2.445 MJ/m2 of RMSE, 14.42% of MAPE and 0.952 of R, slightly worse than DBN. Judging by the length of the boxes, because the machine learning models are site-dependent and built at each station independently, their estimation errors fluctuate less than those of the deep learning models. However, although each site-dependent model is easy to train and needs little training time, training a large number of models for different stations still takes a long total processing time. By contrast, a well-trained deep learning model can be used directly at all those stations, which decreases the total training time. From Table 6, the total processing time of SVR and GPR over the 30 stations is longer than that of DBN and functional DBN, even though training a single site-specific model for one station takes less than one minute. In particular, the total training time of GPR is 1442.32 s, about 2.7 times that of functional DBN. Moreover, in terms of precision, the average estimation errors can be further decreased using deep learning models. In the box-plots of the three DBN methods, the box of functional DBN is flatter and lower than that of the common DBN. Thus, with the assistance of functional neurons and empirical knowledge, functional DBN indeed obtains better stability and precision than the common DBN. The estimation performance of the functional DBN improves further when it is combined with the EC method, where the height of its box is remarkably narrowed. The ranges of MAE, RMSE, MAPE and R are reduced by 0.080 MJ/m2, 0.074 MJ/m2, 0.43% and 0.002 from DBN to the hybrid model, respectively. As a result, the proposed EC + functional DBN method shows both high accuracy and robustness.
However, it should be noted that the proposed hybrid method requires the longest training time among all tested models, more than 1500 s, as four deep models must be trained. In summary, based on the above comparisons of accuracy and processing time, operators can choose an estimation model according to the number of stations and training samples.
Four meteorological stations that belong to different clusters, i.e. Beijing, Kunming, Changsha and Hefei, are chosen as representatives. The probability distributions of the errors between the measured and estimated values at the four stations are presented in Fig. 9. A distribution with a sharper curve close to zero denotes higher estimation accuracy. The curves of the three deep learning models in Beijing and Kunming are clearly sharper than those of the other benchmark models, demonstrating the superiority of the proposed methods. Especially at Kunming station, the indicators of MAE, RMSE, MAPE and R are improved from 3.431 MJ/m2, 4.475 MJ/m2, 22.21% and 0.898 using EM2 to 1.414 MJ/m2, 1.920 MJ/m2, 11.09% and 0.959 using EC + functional DBN, respectively. In Changsha, ANFIS achieves results similar to those of DBN and functional DBN, but it performs worse than EC + functional DBN. In Hefei, all models obtain good estimation results with sharp distribution curves, while the curve of EC + functional DBN is slightly closer to zero. Therefore, the proposed hybrid method based on clustering and functional DBN achieves the best performance at all four test stations.

Table 4 The structure and hyper-parameters of the proposed functional DBN.

| Layer | Hyper-parameters | The number of trainable parameters |
|---|---|---|
| Input | Input number: 10 | None |
| Knowledge | Node number: 21 | None |
| Polynomial | Node number: 274 | None |
| Elimination | Node number: 182 | None |
| RBM 1 (Dense) | Node number: 128; Activation: ReLU | 23,424 |
| RBM 2 (Dense) | Node number: 64; Activation: ReLU | 8,256 |
| RBM 3 (Dense) | Node number: 64; Activation: ReLU | 4,160 |
| RBM 4 (Dense) | Node number: 50; Activation: ReLU | 3,250 |
| Output (Dense) | Node number: 1; Activation: sigmoid | 51 |
| Summary | – | 39,141 |
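The trainable-parameter counts in Table 4 appear consistent with fully connected layers that each hold a weight matrix and a bias vector, taking the 182 outputs of the elimination layer as the input of RBM 1. The short check below reproduces them under that assumption; the helper name `dense_param_counts` is illustrative.

```python
def dense_param_counts(layer_sizes):
    """Weights plus biases for each fully connected layer in a chain of layer sizes."""
    return [n_in * n_out + n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]


# Elimination layer output (182) followed by RBM 1 to RBM 4 and the output layer
fdbn_counts = dense_param_counts([182, 128, 64, 64, 50, 1])
print(fdbn_counts, sum(fdbn_counts))
```

The same helper applied to the layer sizes 365, 5, 16, 4, 16, 5, 365 gives the per-layer counts and the total of 4,349 listed for the AE in Table 3.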
Table 5 Comparisons of performance criteria based on different solar radiation models at the 30 stations studied in China. Each model cell lists MAE (MJ/m2) / RMSE (MJ/m2) / MAPE (%) / R.

| Station | Cluster | EM1 | EM2 | EM3 | SVR | GPR | ANFIS | DBN | Functional DBN | EC + Functional DBN |
|---|---|---|---|---|---|---|---|---|---|---|
| Beijing | 1 | 1.624 / 2.144 / 13.22 / 0.963 | 1.603 / 2.220 / 13.84 / 0.965 | 1.603 / 2.106 / 14.38 / 0.962 | 1.415 / 1.814 / 12.48 / 0.974 | 1.289 / 1.687 / 11.44 / 0.978 | 1.522 / 1.956 / 14.00 / 0.968 | 1.294 / 1.734 / 11.60 / 0.975 | 1.222 / 1.652 / 11.10 / 0.977 | 1.165 / 1.581 / 10.73 / 0.978 |
| Changchun | 1 | 1.749 / 2.252 / 15.76 / 0.946 | 1.441 / 1.959 / 13.24 / 0.969 | 1.526 / 2.019 / 14.48 / 0.961 | 1.389 / 1.803 / 13.03 / 0.969 | 1.266 / 1.690 / 11.72 / 0.973 | 1.301 / 1.750 / 11.89 / 0.971 | 1.285 / 1.733 / 11.61 / 0.971 | 1.219 / 1.672 / 11.11 / 0.974 | 1.187 / 1.631 / 10.99 / 0.975 |
| Changsha | 3 | 1.724 / 2.282 / 19.89 / 0.953 | 2.709 / 3.449 / 27.77 / 0.930 | 1.651 / 2.237 / 17.49 / 0.955 | 1.767 / 2.187 / 19.52 / 0.960 | 1.579 / 2.001 / 19.15 / 0.966 | 1.543 / 2.064 / 18.85 / 0.962 | 1.566 / 2.101 / 19.46 / 0.961 | 1.509 / 2.011 / 18.26 / 0.965 | 1.385 / 1.854 / 16.59 / 0.970 |
| Dongsheng | 1 | 2.263 / 3.514 / 12.81 / 0.947 | 2.318 / 3.264 / 14.40 / 0.954 | 2.215 / 3.427 / 13.38 / 0.941 | 2.365 / 3.427 / 14.18 / 0.954 | 2.205 / 3.298 / 12.87 / 0.956 | 2.328 / 3.428 / 13.68 / 0.939 | 1.993 / 3.115 / 11.24 / 0.958 | 1.978 / 3.097 / 11.35 / 0.957 | 1.953 / 3.077 / 11.21 / 0.958 |
| Fuzhou | 3 | 1.924 / 2.390 / 18.78 / 0.950 | 3.001 / 3.682 / 26.07 / 0.920 | 1.819 / 2.350 / 16.12 / 0.954 | 2.130 / 2.731 / 18.61 / 0.934 | 1.774 / 2.334 / 16.70 / 0.953 | 1.657 / 2.180 / 16.62 / 0.961 | 1.819 / 2.337 / 17.43 / 0.961 | 1.616 / 2.129 / 15.65 / 0.965 | 1.592 / 2.104 / 15.88 / 0.965 |
| Guangzhou | 3 | 1.836 / 2.214 / 18.28 / 0.943 | 2.889 / 3.442 / 25.82 / 0.931 | 1.659 / 2.025 / 15.80 / 0.962 | 1.422 / 1.814 / 14.30 / 0.969 | 1.367 / 1.757 / 14.17 / 0.970 | 1.424 / 1.814 / 14.60 / 0.968 | 1.320 / 1.669 / 13.64 / 0.970 | 1.350 / 1.722 / 13.97 / 0.971 | 1.340 / 1.721 / 13.93 / 0.971 |
| Haikou | 3 | 2.318 / 2.909 / 17.58 / 0.926 | 2.881 / 3.552 / 20.18 / 0.914 | 2.094 / 2.615 / 14.56 / 0.938 | 1.805 / 2.327 / 14.16 / 0.952 | 1.776 / 2.296 / 14.01 / 0.953 | 1.823 / 2.358 / 14.44 / 0.950 | 1.803 / 2.350 / 14.07 / 0.952 | 1.801 / 2.324 / 14.07 / 0.952 | 1.781 / 2.296 / 14.08 / 0.953 |
| Hami | 1 | 1.733 / 2.203 / 13.61 / 0.967 | 1.859 / 2.607 / 13.24 / 0.968 | 1.673 / 2.117 / 13.02 / 0.971 | 1.645 / 2.127 / 12.86 / 0.968 | 1.465 / 1.972 / 11.27 / 0.974 | 1.670 / 2.278 / 13.24 / 0.963 | 1.300 / 1.883 / 9.71 / 0.977 | 1.357 / 1.945 / 10.05 / 0.976 | 1.318 / 1.879 / 9.87 / 0.977 |
| Hangzhou | 4 | 3.187 / 4.522 / 23.71 / 0.905 | 3.590 / 4.807 / 28.38 / 0.901 | 3.098 / 4.454 / 21.75 / 0.911 | 2.962 / 4.488 / 20.03 / 0.905 | 2.905 / 4.462 / 19.68 / 0.908 | 2.977 / 4.486 / 20.73 / 0.910 | 3.050 / 4.505 / 21.46 / 0.913 | 2.934 / 4.402 / 20.62 / 0.911 | 2.824 / 4.382 / 19.41 / 0.913 |
| Harbin | 1 | 2.729 / 3.482 / 20.25 / 0.919 | 3.058 / 3.793 / 24.54 / 0.918 | 2.471 / 3.272 / 19.75 / 0.930 | 2.240 / 2.900 / 17.65 / 0.946 | 2.242 / 2.886 / 17.49 / 0.949 | 2.509 / 3.235 / 19.64 / 0.935 | 2.487 / 3.135 / 19.03 / 0.949 | 2.277 / 2.900 / 17.97 / 0.952 | 2.237 / 2.849 / 17.82 / 0.953 |
| Hefei | 4 | 1.918 / 2.534 / 17.56 / 0.950 | 2.544 / 3.262 / 22.45 / 0.930 | 1.911 / 2.565 / 16.21 / 0.947 | 1.862 / 2.405 / 16.71 / 0.960 | 1.776 / 2.321 / 16.16 / 0.962 | 1.801 / 2.371 / 16.58 / 0.959 | 1.857 / 2.418 / 16.94 / 0.961 | 1.784 / 2.347 / 16.22 / 0.962 | 1.652 / 2.225 / 15.25 / 0.962 |
| Jiamusi | 1 | 2.115 / 2.899 / 17.18 / 0.931 | 1.817 / 2.576 / 14.59 / 0.946 | 2.067 / 2.817 / 18.14 / 0.926 | 1.848 / 2.556 / 16.07 / 0.947 | 1.742 / 2.479 / 14.64 / 0.950 | 1.805 / 2.580 / 14.84 / 0.944 | 1.710 / 2.513 / 13.61 / 0.949 | 1.671 / 2.419 / 13.78 / 0.952 | 1.640 / 2.424 / 13.44 / 0.952 |
| Jinan | 1 | 1.623 / 2.106 / 13.43 / 0.949 | 1.979 / 2.459 / 17.01 / 0.961 | 1.445 / 1.896 / 12.83 / 0.964 | 2.206 / 2.730 / 18.19 / 0.930 | 1.661 / 2.073 / 14.12 / 0.960 | 1.392 / 1.784 / 12.67 / 0.968 | 1.315 / 1.691 / 11.99 / 0.972 | 1.284 / 1.660 / 11.96 / 0.973 | 1.289 / 1.647 / 11.79 / 0.973 |
| Kashgar | 1 | 1.774 / 2.292 / 12.48 / 0.964 | 2.131 / 2.855 / 17.21 / 0.953 | 1.779 / 2.264 / 13.73 / 0.963 | 1.605 / 2.067 / 11.82 / 0.971 | 1.466 / 1.892 / 10.74 / 0.976 | 1.570 / 2.025 / 11.84 / 0.970 | 1.342 / 1.743 / 9.68 / 0.979 | 1.308 / 1.693 / 9.60 / 0.980 | 1.290 / 1.668 / 9.56 / 0.980 |
| Kunming | 2 | 2.014 / 2.658 / 14.41 / 0.937 | 3.431 / 4.475 / 22.21 / 0.898 | 1.627 / 2.166 / 11.62 / 0.947 | 1.585 / 2.062 / 12.01 / 0.950 | 1.485 / 1.962 / 11.47 / 0.956 | 1.525 / 2.041 / 11.81 / 0.954 | 1.481 / 1.994 / 11.76 / 0.956 | 1.436 / 1.949 / 11.55 / 0.957 | 1.414 / 1.920 / 11.09 / 0.959 |
| Lhasa | 1 | 1.673 / 2.223 / 8.99 / 0.909 | 1.730 / 2.248 / 9.44 / 0.936 | 1.248 / 1.630 / 7.07 / 0.960 | 1.425 / 1.772 / 8.06 / 0.955 | 1.333 / 1.669 / 7.50 / 0.960 | 1.338 / 1.732 / 7.44 / 0.953 | 1.215 / 1.582 / 6.76 / 0.962 | 1.261 / 1.629 / 7.08 / 0.962 | 1.236 / 1.600 / 6.90 / 0.963 |
| Nagqu | 1 | 2.139 / 2.889 / 11.84 / 0.867 | 3.816 / 4.908 / 20.04 / 0.838 | 2.027 / 2.722 / 11.29 / 0.877 | 2.096 / 2.786 / 11.80 / 0.871 | 2.030 / 2.714 / 11.39 / 0.878 | 2.043 / 2.718 / 11.44 / 0.876 | 1.971 / 2.662 / 10.94 / 0.880 | 1.981 / 2.666 / 11.06 / 0.883 | 1.988 / 2.684 / 11.11 / 0.881 |
| Nanchang | 3 | 1.579 / 2.153 / 15.79 / 0.958 | 2.100 / 2.747 / 19.55 / 0.948 | 1.483 / 2.078 / 14.38 / 0.960 | 1.556 / 2.047 / 15.48 / 0.960 | 1.429 / 1.934 / 14.57 / 0.965 | 1.417 / 1.949 / 15.10 / 0.964 | 1.364 / 1.863 / 14.57 / 0.967 | 1.375 / 1.894 / 14.38 / 0.967 | 1.311 / 1.842 / 14.03 / 0.969 |
| Nanjing | 4 | 1.608 / 2.019 / 15.27 / 0.971 | 1.974 / 2.436 / 17.92 / 0.957 | 1.691 / 2.126 / 15.31 / 0.967 | 1.593 / 1.976 / 14.83 / 0.972 | 1.531 / 1.907 / 14.93 / 0.974 | 1.526 / 1.927 / 15.15 / 0.972 | 1.459 / 1.831 / 14.56 / 0.975 | 1.441 / 1.826 / 14.39 / 0.974 | 1.431 / 1.822 / 14.32 / 0.975 |
| Nanning | 3 | 1.892 / 2.388 / 18.93 / 0.940 | 2.688 / 3.293 / 27.71 / 0.937 | 1.598 / 2.026 / 16.20 / 0.955 | 1.302 / 1.684 / 14.43 / 0.970 | 1.253 / 1.631 / 14.22 / 0.972 | 1.323 / 1.729 / 14.63 / 0.969 | 1.311 / 1.711 / 14.29 / 0.970 | 1.257 / 1.630 / 14.26 / 0.972 | 1.253 / 1.621 / 14.63 / 0.973 |
| Sanya | 3 | 2.418 / 3.017 / 15.21 / 0.883 | 4.723 / 5.560 / 29.57 / 0.837 | 3.237 / 3.790 / 19.89 / 0.897 | 2.217 / 2.811 / 14.31 / 0.914 | 2.167 / 2.761 / 13.93 / 0.918 | 2.226 / 2.826 / 14.52 / 0.911 | 2.553 / 3.212 / 17.27 / 0.915 | 2.304 / 2.940 / 15.68 / 0.913 | 2.193 / 2.828 / 14.55 / 0.919 |
| Shanghai | 4 | 1.658 / 2.186 / 17.19 / 0.961 | 2.418 / 2.985 / 21.62 / 0.939 | 1.887 / 2.450 / 17.28 / 0.948 | 1.717 / 2.198 / 16.42 / 0.965 | 1.615 / 2.092 / 15.62 / 0.968 | 1.560 / 2.035 / 15.50 / 0.968 | 1.641 / 2.157 / 15.92 / 0.967 | 1.613 / 2.092 / 15.79 / 0.967 | 1.506 / 1.986 / 15.42 / 0.968 |
| Shenyang | 1 | 2.345 / 3.272 / 17.24 / 0.895 | 2.379 / 3.537 / 18.61 / 0.902 | 2.149 / 3.061 / 17.97 / 0.910 | 1.924 / 2.859 / 15.28 / 0.922 | 1.808 / 2.770 / 14.51 / 0.927 | 1.909 / 2.904 / 15.49 / 0.920 | 1.815 / 2.776 / 14.31 / 0.928 | 1.739 / 2.735 / 13.72 / 0.929 | 1.729 / 2.736 / 13.61 / 0.929 |
| Taiyuan | 1 | 3.170 / 4.091 / 18.61 / 0.951 | 3.019 / 3.842 / 18.18 / 0.955 | 2.987 / 3.878 / 18.28 / 0.949 | 3.414 / 4.257 / 20.47 / 0.957 | 3.242 / 4.079 / 19.43 / 0.960 | 3.134 / 4.022 / 18.55 / 0.944 | 2.782 / 3.586 / 16.95 / 0.961 | 2.901 / 3.722 / 17.43 / 0.961 | 2.910 / 3.730 / 17.49 / 0.960 |
| Tianjin | 1 | 2.256 / 3.107 / 17.12 / 0.933 | 2.051 / 2.909 / 15.43 / 0.934 | 2.268 / 3.167 / 17.92 / 0.924 | 2.073 / 2.888 / 15.93 / 0.936 | 1.994 / 2.825 / 15.20 / 0.939 | 2.151 / 2.924 / 17.08 / 0.931 | 2.033 / 2.899 / 15.60 / 0.940 | 1.971 / 2.840 / 15.18 / 0.941 | 1.940 / 2.793 / 15.08 / 0.941 |
| Urumqi | 1 | 3.932 / 5.277 / 23.31 / 0.899 | 3.881 / 5.210 / 24.77 / 0.893 | 3.853 / 5.184 / 24.61 / 0.889 | 4.227 / 5.606 / 26.03 / 0.887 | 4.016 / 5.395 / 24.84 / 0.894 | 3.978 / 5.437 / 24.31 / 0.875 | 3.742 / 5.153 / 22.91 / 0.898 | 3.701 / 5.097 / 23.06 / 0.897 | 3.673 / 5.070 / 22.97 / 0.898 |
| Wuhan | 4 | 1.973 / 2.912 / 16.93 / 0.928 | 2.406 / 3.235 / 20.30 / 0.922 | 1.944 / 2.899 / 15.27 / 0.929 | 2.040 / 2.870 / 17.22 / 0.934 | 1.920 / 2.809 / 16.48 / 0.935 | 1.891 / 2.834 / 16.25 / 0.932 | 1.835 / 2.736 / 15.90 / 0.938 | 1.795 / 2.749 / 15.39 / 0.937 | 1.828 / 2.807 / 16.18 / 0.934 |
| Xining | 1 | 1.697 / 2.289 / 13.16 / 0.945 | 2.521 / 3.330 / 19.30 / 0.926 | 1.687 / 2.226 / 13.57 / 0.955 | 1.549 / 2.062 / 12.50 / 0.959 | 1.497 / 2.025 / 11.99 / 0.962 | 1.619 / 2.138 / 13.49 / 0.959 | 1.507 / 2.063 / 11.53 / 0.960 | 1.492 / 2.048 / 11.52 / 0.962 | 1.493 / 2.062 / 11.58 / 0.963 |
| Yinchuan | 1 | 1.955 / 2.706 / 13.13 / 0.919 | 2.291 / 3.108 / 17.54 / 0.951 | 1.630 / 2.226 / 12.43 / 0.955 | 1.505 / 2.019 / 11.61 / 0.960 | 1.423 / 1.984 / 10.93 / 0.962 | 1.840 / 2.440 / 14.42 / 0.946 | 1.554 / 2.085 / 11.59 / 0.959 | 1.462 / 2.112 / 10.76 / 0.959 | 1.429 / 2.090 / 10.46 / 0.960 |
| Zhengzhou | 1 | 1.441 / 1.942 / 13.00 / 0.963 | 1.772 / 2.335 / 15.04 / 0.955 | 1.586 / 2.140 / 13.35 / 0.955 | 1.325 / 1.727 / 12.28 / 0.970 | 1.241 / 1.643 / 11.45 / 0.973 | 1.314 / 1.739 / 12.11 / 0.969 | 1.511 / 1.925 / 13.34 / 0.972 | 1.304 / 1.749 / 11.76 / 0.972 | 1.206 / 1.635 / 11.39 / 0.973 |
Table 6 The total processing time of training models for 30 stations.

| Method | Time (total) | Number of models | Time (average per model) |
|---|---|---|---|
| EM1 | 10.26 s | 30 | 0.34 s |
| EM2 | 50.44 s | 30 | 1.68 s |
| EM3 | 92.56 s | 30 | 3.09 s |
| SVR | 781.23 s | 30 | 26.04 s |
| GPR | 1442.32 s | 30 | 48.08 s |
| ANFIS | 336.13 s | 30 | 11.20 s |
| DBN | 376.79 s | 1 | 376.79 s |
| Functional DBN | 527.02 s | 1 | 527.02 s |
| EC + Functional DBN | 1530.93 s | 4 | 382.73 s |
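As a small arithmetic check on Table 6, the snippet below recomputes the average time per model and the ratio between the total times of GPR and functional DBN. The figures are copied from the table; the dictionary layout is just an illustrative choice.

```python
# Total training time in seconds and number of trained models, copied from Table 6
timings = {
    "SVR": (781.23, 30),
    "GPR": (1442.32, 30),
    "DBN": (376.79, 1),
    "Functional DBN": (527.02, 1),
    "EC + Functional DBN": (1530.93, 4),
}

# Average time per model is the total time divided by the number of models
for name, (total, n_models) in timings.items():
    print(f"{name}: {total / n_models:.2f} s per model")

# Ratio of total training times, GPR versus functional DBN
ratio = timings["GPR"][0] / timings["Functional DBN"][0]
print(f"GPR / Functional DBN total time: {ratio:.2f}")
```

The ratio comes out at roughly 2.7, and the hybrid method averages about 383 s for each of its four cluster models.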
Fig. 8. The box-plots of performance criteria from 10 times of validations on the overall testing dataset. (Panels: mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE, %) and correlation coefficient (R) for EM1, EM2, EM3, SVR, GPR, ANFIS, DBN, FDBN and EC + FDBN; each box marks outliers, upper adjacent, 75th percentile, median, mean, 25th percentile and lower adjacent.)

Fig. 9. Probabilistic density curves of estimation errors in four different stations. (Panels: Beijing (cluster 1), Kunming (cluster 2), Changsha (cluster 3) and Hefei (cluster 4); horizontal axis: estimation error $H_{m,i} - H_{e,i}$ (MJ/m2); vertical axis: probability density function; curves: EM1, EM2, EM3, SVR, GPR, ANFIS, DBN, functional DBN and EC + functional DBN.)
4.4. Multi-day-ahead daily global solar radiation forecasting
Besides daily solar radiation estimation, a case study of multi-day-ahead forecasting, from one to three days ahead, is also conducted. Estimation and forecasting differ in their inputs. In solar radiation estimation, the meteorological elements measured on the day to be estimated are chosen as inputs, while numerical weather prediction (NWP) values are used in forecasting. Historical solar radiation data are commonly involved in forecasting as well. In this study, four example stations from different clusters are selected to study multi-day-ahead solar radiation forecasting. The forecasting models use the same daily elements as provided in Table 2, taken as numerically predicted values. The historical daily global solar radiation of the last week is also used as input. Accordingly, the machine learning methods and the proposed deep-learning-based methods are adopted. The daily global solar radiation data of the whole year 2015 are chosen as the testing dataset, and the forecasting results are presented in Table 7. The prediction curves using EC + functional DBN, along with comparison bar-plots of the forecasting metrics, are shown in Fig. 10. From the prediction curves in Fig. 10, the best forecasting results are obtained in Kunming, with the smallest MAE, RMSE and MAPE. The large and stable quantity of daily global solar radiation received by Kunming contributes to the good precision. The indicators of one-day-ahead forecasting using the proposed hybrid method in Kunming are 3.658 MJ/m2 of MAE, 4.603 MJ/m2 of RMSE, 20.68% of MAPE and 0.693 of R. For Beijing station, the solar radiation curves possess the best seasonal regularity; hence, the best R is achieved in Beijing. The other three indicators in Beijing are also smaller than those in Changsha and Hefei.
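The input arrangement described above (NWP predictors for the target day plus the last week of measured radiation) can be sketched as follows. This is an assumption-laden illustration: the function name, the array layout, and in particular the convention that a forecast for day t is issued on day t minus the horizon, using the seven most recent measurements available on that issue day, are not specified in the paper.

```python
import numpy as np


def build_forecast_inputs(nwp, radiation, horizon, n_lags=7):
    """Stack NWP predictors for each target day with the last n_lags measured radiation values.

    nwp       : (n_days, n_features) predicted weather elements, one row per target day
    radiation : (n_days,) measured daily global solar radiation
    horizon   : days ahead (1, 2 or 3); the forecast for day t is issued on day t - horizon
    """
    nwp = np.asarray(nwp, dtype=float)
    radiation = np.asarray(radiation, dtype=float)
    features, targets = [], []
    for t in range(horizon + n_lags - 1, len(radiation)):
        last_known = t - horizon  # most recent measured day at issue time (assumption)
        lags = radiation[last_known - n_lags + 1 : last_known + 1]
        features.append(np.concatenate([nwp[t], lags]))
        targets.append(radiation[t])
    return np.array(features), np.array(targets)
```

Each returned row holds the NWP features followed by seven lagged radiation values, so the same model structure can be reused for the 1-, 2- and 3-day-ahead cases by changing `horizon`.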
According to Table 7 and the bar-plots, the forecasting errors of the two functional DBN methods are usually smaller, indicating their effectiveness and stability, although the differences among the forecasting models are minor. The accuracy may not have improved much because the intrinsic prediction errors in the NWP data restrict the forecasting performance, especially the high errors of predicted wind speed and relative humidity. The forecasting errors rise remarkably in the two-day-ahead and three-day-ahead cases. As the prediction errors of NWP increase, the model is trained to forecast more conservatively, so the extreme points of the solar radiation curves cannot be well predicted. Based on the above results and analysis, the precision of NWP information is vital in solar radiation forecasting and greatly affects the performance of prediction models. Besides, since the proposed hybrid method also obtains stable results in multi-day-ahead forecasting, it is feasible and practical to extend the method to the field of solar radiation forecasting.

Table 7 Comparisons of multi-day-ahead forecasting performance at Beijing, Kunming, Changsha and Hefei stations. Each station cell lists MAE (MJ/m2) / RMSE (MJ/m2) / MAPE (%) / R.

| Ahead | Model | Beijing | Kunming | Changsha | Hefei |
|---|---|---|---|---|---|
| 1-day | SVR | 4.283 / 5.512 / 24.48 / 0.703 | 3.743 / 4.681 / 21.18 / 0.680 | 4.928 / 6.041 / 34.71 / 0.629 | 4.829 / 6.273 / 26.24 / 0.586 |
| 1-day | GPR | 4.552 / 5.864 / 26.19 / 0.659 | 3.698 / 4.673 / 20.87 / 0.679 | 4.827 / 6.000 / 33.73 / 0.637 | 5.474 / 6.994 / 31.70 / 0.467 |
| 1-day | ANFIS | 4.074 / 5.230 / 23.98 / 0.746 | 3.742 / 4.700 / 21.04 / 0.675 | 4.832 / 5.996 / 33.42 / 0.636 | 4.810 / 6.079 / 26.49 / 0.601 |
| 1-day | DBN | 4.058 / 5.226 / 23.41 / 0.743 | 3.699 / 4.639 / 20.63 / 0.687 | 4.785 / 5.943 / 32.93 / 0.644 | 4.855 / 6.111 / 27.12 / 0.596 |
| 1-day | Functional DBN | 4.053 / 5.213 / 23.22 / 0.745 | 3.698 / 4.650 / 20.59 / 0.685 | 4.796 / 5.945 / 33.49 / 0.644 | 4.821 / 6.074 / 26.35 / 0.602 |
| 1-day | EC + Functional DBN | 4.033 / 5.206 / 23.43 / 0.748 | 3.658 / 4.603 / 20.68 / 0.693 | 4.744 / 5.924 / 32.80 / 0.648 | 4.820 / 6.105 / 26.05 / 0.598 |
| 2-days | SVR | 4.474 / 5.581 / 24.87 / 0.704 | 4.250 / 5.207 / 22.62 / 0.575 | 5.429 / 6.493 / 35.55 / 0.549 | 5.464 / 6.617 / 28.96 / 0.489 |
| 2-days | GPR | 5.032 / 6.350 / 28.69 / 0.598 | 4.707 / 5.781 / 24.74 / 0.465 | 5.456 / 6.608 / 34.92 / 0.534 | 6.655 / 8.294 / 38.23 / 0.435 |
| 2-days | ANFIS | 4.466 / 5.535 / 25.24 / 0.710 | 4.282 / 5.281 / 22.57 / 0.559 | 5.642 / 6.661 / 36.04 / 0.517 | 5.613 / 6.781 / 29.23 / 0.456 |
| 2-days | DBN | 4.411 / 5.550 / 23.78 / 0.710 | 4.284 / 5.229 / 22.95 / 0.576 | 5.389 / 6.392 / 34.46 / 0.566 | 5.485 / 6.620 / 29.68 / 0.493 |
| 2-days | Functional DBN | 4.418 / 5.528 / 24.29 / 0.713 | 4.210 / 5.222 / 21.99 / 0.577 | 5.378 / 6.398 / 33.62 / 0.565 | 5.452 / 6.707 / 27.70 / 0.496 |
| 2-days | EC + Functional DBN | 4.331 / 5.521 / 23.02 / 0.708 | 4.270 / 5.234 / 22.38 / 0.574 | 5.350 / 6.380 / 35.44 / 0.568 | 5.479 / 6.605 / 28.30 / 0.493 |
| 3-days | SVR | 4.476 / 5.586 / 24.02 / 0.697 | 4.419 / 5.370 / 23.05 / 0.549 | 5.483 / 6.484 / 36.37 / 0.548 | 5.621 / 6.878 / 30.77 / 0.460 |
| 3-days | GPR | 5.013 / 6.587 / 27.23 / 0.554 | 4.966 / 6.031 / 25.86 / 0.409 | 5.580 / 6.637 / 37.48 / 0.524 | 6.378 / 8.045 / 36.48 / 0.453 |
| 3-days | ANFIS | 4.607 / 5.708 / 24.15 / 0.687 | 4.437 / 5.426 / 22.93 / 0.524 | 5.487 / 6.456 / 35.97 / 0.554 | 5.840 / 7.049 / 31.62 / 0.406 |
| 3-days | DBN | 4.474 / 5.580 / 23.88 / 0.708 | 4.365 / 5.303 / 22.72 / 0.567 | 5.574 / 6.529 / 36.52 / 0.540 | 5.641 / 6.735 / 30.87 / 0.466 |
| 3-days | Functional DBN | 4.489 / 5.613 / 23.74 / 0.705 | 4.386 / 5.281 / 22.98 / 0.572 | 5.559 / 6.544 / 36.51 / 0.536 | 5.598 / 6.690 / 28.49 / 0.476 |
| 3-days | EC + Functional DBN | 4.355 / 5.543 / 23.15 / 0.702 | 4.353 / 5.330 / 22.55 / 0.549 | 5.346 / 6.382 / 35.05 / 0.569 | 5.582 / 6.707 / 29.86 / 0.473 |
Fig. 10. Prediction curves and comparison of metrics based on the multi-day-ahead forecasting case. (a) Beijing; (b) Kunming; (c) Changsha; (d) Hefei. (Each panel shows the measured curve and the 1-, 2- and 3-day-ahead prediction curves under EC + functional DBN, plotted as daily global solar radiation (MJ/m2) against time (day), together with bar-plots of MAE, RMSE, MAPE and R for SVR, GPR, ANFIS, DBN, functional DBN and EC + functional DBN.)
4.5. Hourly global solar radiation estimation
In addition to daily solar radiation estimation and prediction, the feasibility of estimating hourly radiation with the proposed method is also validated in this study. In the hourly estimation case, data measured at a daily interval, including daily mean, maximum and minimum temperature, daily sunshine duration, etc., are not involved. Instead, hourly measured data and the solar hour angle are used as inputs. The solar hour angle can be computed as [70]:

$$\theta_h = 15 \times (12 - t_s) \tag{35}$$

where $t_s$ is the true solar time. The basic and functional neurons for hourly solar radiation estimation are provided in Table 8. It should be noted that the hourly temperature in the knowledge functions is converted to Kelvin, in order to avoid negative arguments in the square root and logarithmic functions. From Table 9, the proposed hybrid method usually obtains the best indicators. In Beijing, the method achieves the best scores on all four indicators, i.e. 0.137 MJ/m2 of MAE, 0.282 MJ/m2 of RMSE, 23.25% of MAPE and 0.950 of R. The probability density curves of estimation errors are shown in Fig. 11. Among the stations, the hybrid method outperforms the others in Beijing, Kunming and Changsha, where it obtains clearly sharper curves. Hence, given accurate hourly measurements of meteorological elements, the proposed method can also be extended to estimate hourly global solar radiation.

Table 8 Basic and functional neurons for estimating hourly global solar radiation.

| Neuron | Description |
|---|---|
| $\varphi$ | latitude |
| $\lambda$ | longitude |
| $h$ | altitude |
| $T_h$ | hourly mean dry-bulb temperature |
| $R_h$ | hourly mean relative humidity |
| $W_h$ | hourly mean wind speed |
| $n_{day}$ | the number of day in a year |
| $\theta_h$ | solar hour angle |
| $\cos\varphi$ | the cosine function of latitude |
| $\sin\varphi$ | the sine function of latitude |
| $\cos\delta$ | the cosine function of daily solar declination |
| $\sin\delta$ | the sine function of daily solar declination |
| $\cos\theta_h$ | the cosine function of solar hour angle |
| $\sin\theta_h$ | the sine function of solar hour angle |
| $\sqrt{T_h}$ | the square root function of temperature |
| $\ln T_h$ | the logarithmic function of temperature |
| $\sqrt{T_h + R_h}$ | the square root function of the sum of temperature and relative humidity |
| $T_h / R_h$ | the ratio of temperature to relative humidity |
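Eq. (35) and the functional neurons of Table 8 can be sketched in Python as follows. The function and key names are illustrative, the solar declination is taken as an input because its formula is not given in this section, and the input temperature is assumed to be in degrees Celsius with a strictly positive relative humidity.

```python
import math


def solar_hour_angle(t_s):
    """Eq. (35): solar hour angle in degrees from the true solar time t_s in hours."""
    return 15.0 * (12.0 - t_s)


def hourly_functional_neurons(lat_deg, decl_deg, t_s, temp_c, rel_hum):
    """Functional neurons of Table 8 for one hourly sample (illustrative sketch).

    decl_deg is the daily solar declination in degrees, supplied by the caller.
    temp_c is in degrees Celsius (assumption) and is converted to Kelvin.
    rel_hum must be strictly positive for the ratio feature.
    """
    phi = math.radians(lat_deg)
    delta = math.radians(decl_deg)
    theta_h = math.radians(solar_hour_angle(t_s))
    t_k = temp_c + 273.15  # Kelvin keeps the square root and logarithm well defined
    return {
        "cos_phi": math.cos(phi),
        "sin_phi": math.sin(phi),
        "cos_delta": math.cos(delta),
        "sin_delta": math.sin(delta),
        "cos_theta_h": math.cos(theta_h),
        "sin_theta_h": math.sin(theta_h),
        "sqrt_T": math.sqrt(t_k),
        "ln_T": math.log(t_k),
        "sqrt_T_plus_R": math.sqrt(t_k + rel_hum),
        "T_over_R": t_k / rel_hum,
    }
```

For example, a true solar time of 10 h gives an hour angle of 30 degrees, and at solar noon the angle is zero so its cosine feature equals 1.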
Table 9 Comparisons of hourly estimation performance at Beijing, Kunming, Changsha and Hefei stations. Each station cell lists MAE (MJ/m2) / RMSE (MJ/m2) / MAPE (%) / R.

| Model | Beijing | Kunming | Changsha | Hefei |
|---|---|---|---|---|
| SVR | 0.180 / 0.306 / 24.73 / 0.942 | 0.240 / 0.353 / 32.13 / 0.931 | 0.160 / 0.271 / 30.43 / 0.941 | 0.183 / 0.268 / 26.60 / 0.951 |
| GPR | 0.164 / 0.285 / 23.54 / 0.949 | 0.216 / 0.345 / 57.84 / 0.934 | 0.144 / 0.256 / 29.10 / 0.946 | 0.147 / 0.259 / 27.65 / 0.952 |
| ANFIS | 0.154 / 0.287 / 24.77 / 0.948 | 0.223 / 0.355 / 57.47 / 0.930 | 0.146 / 0.267 / 29.97 / 0.941 | 0.146 / 0.259 / 28.14 / 0.950 |
| DBN | 0.148 / 0.291 / 24.42 / 0.947 | 0.201 / 0.344 / 31.24 / 0.934 | 0.137 / 0.270 / 31.18 / 0.940 | 0.146 / 0.273 / 28.61 / 0.945 |
| Functional DBN | 0.147 / 0.289 / 24.43 / 0.947 | 0.199 / 0.333 / 29.88 / 0.938 | 0.132 / 0.263 / 30.04 / 0.943 | 0.145 / 0.266 / 28.24 / 0.948 |
| EC + Functional DBN | 0.137 / 0.282 / 23.25 / 0.950 | 0.187 / 0.340 / 21.86 / 0.935 | 0.125 / 0.252 / 29.41 / 0.948 | 0.140 / 0.254 / 27.30 / 0.952 |
Fig. 11. Probabilistic density curves of hourly estimation errors in four different stations. (Panels: Beijing (cluster 1), Kunming (cluster 2), Changsha (cluster 3) and Hefei (cluster 4); horizontal axis: hourly estimation errors (MJ/m2); vertical axis: probability density function; curves: SVR, GPR, ANFIS, DBN, functional DBN and EC + functional DBN.)
5. Conclusion
Daily global solar radiation estimation is an essential step for solar energy utilization. Various meteorological elements, e.g. sunshine duration, wind speed, temperature and humidity, can help improve estimation accuracy, but how to merge them efficiently into an estimation model remains unsolved. Empirical models and machine learning models are commonly used in this field, but they often show poor adaptability and must be re-calibrated before they can be used at another location. Because deep learning models have strong generalization capability and are especially suitable for training on large datasets, they are introduced and validated in this study to estimate the daily global solar radiation of different meteorological stations in China. The feasibility of the proposed method for multi-day-ahead global solar radiation forecasting and hourly radiation estimation is studied as well.
DBN, FN and EC are combined in the proposed hybrid estimation method. DBN is a deep learning model built by stacking RBMs, which pre-train the parameters of the DBN and reduce the difficulty of training the entire neural network. FN brings empirical knowledge into the estimation model, which further improves its reliability and robustness. EC is a new clustering method used to divide radiation curves, so that similar curves can be estimated by one deep learning model.
After validation and comparison, the proposed functional DBN with EC achieves the best performance according to several criteria on the overall testing dataset. The MAE, RMSE, MAPE and R values of the proposed EC + functional DBN are 1.706 MJ/m2, 2.352 MJ/m2, 13.71% and 0.955, respectively, exhibiting its accuracy and superiority. The present study validates the performance of several DBN models for global solar radiation estimation and prediction.
Although a single DBN can produce more precise results than machine learning models such as SVR and ANFIS, those results vary greatly across repeated validation runs, because it is difficult to train a deep neural network to a global optimum. Therefore, further work can address state-of-the-art training techniques and approaches for deep learning.
Acknowledgments
The research is supported by the National Natural Science Foundation of China (Program No. 51507052) and the Fundamental Research Funds for the Central Universities (Program No. 2018B15414). The authors would also like to extend their gratitude to the China Meteorological Administration.
Highlights
A single deep model can be applied at multiple sites to estimate solar radiation.
Pre-training with RBMs reduces the optimization difficulty of deep networks.
Knowledge from empirical equations is incorporated by means of functional neurons.
The embedding clustering method provides initial cluster centers for k-means.
The proposed method is feasible for both daily and hourly estimation.