Application of functional deep belief network for estimating daily global solar radiation: A case study in China


Journal Pre-proof

Haixiang Zang, Lilin Cheng, Tao Ding, Kwok W. Cheung, Miaomiao Wang, Zhinong Wei, Guoqiang Sun

PII: S0360-5442(19)32197-8
DOI: https://doi.org/10.1016/j.energy.2019.116502
Reference: EGY 116502
To appear in: Energy
Received Date: 31 March 2019
Revised Date: 5 September 2019
Accepted Date: 6 November 2019

Please cite this article as: Zang H, Cheng L, Ding T, Cheung KW, Wang M, Wei Z, Sun G, Application of functional deep belief network for estimating daily global solar radiation: A case study in China, Energy (2019), doi: https://doi.org/10.1016/j.energy.2019.116502. This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2019 Published by Elsevier Ltd.

Application of functional deep belief network for estimating daily global solar radiation: A case study in China

Haixiang Zang a,*, Lilin Cheng a, Tao Ding b, Kwok W. Cheung c, Miaomiao Wang a, Zhinong Wei a, Guoqiang Sun a

a College of Energy and Electrical Engineering, Hohai University, Nanjing 210098, China
b Department of Electrical Engineering, Xi’an Jiaotong University, Xi’an 710049, China
c GE Grid Solutions, Redmond, WA 98052, USA

Abstract

Solar energy plays an essential role in environmental governance and resource protection, as it is pollution-free and widely accessible. Accurate knowledge of solar radiation benefits the deployment of solar energy installations, including photovoltaic and solar thermal systems. In this study, a deep learning method is proposed for estimating daily global solar radiation, which combines embedding clustering (EC) with a functional deep belief network (DBN). Based on the curve shapes of daily solar radiation, EC divides the overall dataset into subsets that can be modeled separately. Knowledge from empirical radiation models is also merged as the input of the functional DBN. Owing to its strong nonlinear representation, the model can be applied directly to solar radiation estimation at various stations. A case study in China involving radiation data from 30 stations is adopted to validate the practicability and accuracy of the proposed method. The results show that the method obtains better estimation precision with empirical knowledge, achieving a mean absolute error (MAE) of 1.706 MJ/m2, a root mean square error (RMSE) of 2.352 MJ/m2 and a mean absolute percentage error (MAPE) of 13.71%, averaged over the 30 stations.

Keywords: Daily global solar radiation; deep belief network; empirical knowledge; solar radiation model

* Corresponding author. Tel.: +86 13770719919; fax: +86 25 58099087. E-mail address: [email protected].

1. Introduction

Owing to the rising awareness of fossil fuel substitution and environmental protection, developing and promoting renewable energy sources has become increasingly necessary. Among these, solar energy is a significant resource whose exploitation is growing rapidly, as it is abundant, stable and widely distributed [1, 2]. For urban planning, photovoltaic array sizing and solar thermal system design [3, 4], proper knowledge of daily global solar radiation is required in various regions [5-7]. However, daily solar radiation data are often missing or even unavailable at many meteorological stations, owing to the complex equipment and high maintenance cost of the measurements [8, 9]. Therefore, numerous studies on solar radiation estimation have been carried out recently.

Three major methods are commonly applied in estimating daily global solar radiation, i.e. satellite-derived, stochastic and relationship methods [10]. Satellite-derived methods can analyze the earth-atmospheric reflectivity of irradiation [11], gather daily radiation values accurately [12], and estimate solar radiation over a large range of regions [13, 14]. Nevertheless, the equipment cost of satellites is quite high, which is unacceptable at some meteorological stations. In stochastic algorithms, daily global solar radiation data are generated based on the mean values of historical weather observations [15]. Evidently, this method is infeasible if a station possesses no recent historical solar radiation records. Relationship methods aim to establish relations between solar radiation and other meteorological elements [16], commonly through empirical equations or soft-computing models. These methods have been the most broadly studied, as they can achieve high estimation precision given correct weather inputs. Another advantage is that the historical solar radiation data used for training those models are sometimes not needed as model inputs at the forecasting phase.

Relationship models that apply empirical equations are also called empirical models. They use different weather elements, e.g. temperature, sunshine duration and cloud cover, to form different empirical models [17]. Temperature-based (TB) models are the most commonly used, as ambient temperature records are more accessible than any other meteorological data [18]. The difference between the daily maximum and minimum temperature values is usually considered in TB models, because it is able to indicate the fluctuation of daily global solar radiation [19, 20]. In [21], a new TB model also involves daily mean temperature and demonstrates great estimation performance. Similarly, sunshine duration is an important factor that influences the level of radiation. Hence, sunshine-duration-based (SDB) models are proposed to analyze this factor [22-24]. A remaining problem is that sunshine duration data are unavailable at some stations. In this case, cloud-based models can be adopted instead of SDB models [25, 26], as cloud cover can be calculated according to daily sunshine duration.
However, cloud-based models are sometimes not appropriate in regions with heavy haze owing to the poor air condition [27]. Moreover, the day-of-the-year-based (DYB) model is another kind of empirical model, also called an independent model [28]. It has a concise formula that uses no meteorological inputs [29-32], whereas its precision is usually worse than that of TB and SDB models.

In contrast to empirical models, soft-computing methods utilize historical datasets to train a machine learning model that captures the relations between solar radiation and meteorological elements. They can merge various weather inputs efficiently and adapt to different climatic regions, and have therefore received more attention from researchers than conventional empirical equations. In [33-35], the artificial neural network (ANN) has been validated for estimating and predicting solar radiation. It has great potential for self-learning, but often suffers from local optimum problems. Support vector regression (SVR) and the adaptive neuro-fuzzy inference system (ANFIS) are two machine learning models suitable for high-dimensional inputs, which are frequently applied and studied in the field of solar radiation estimation [36, 37]. Nevertheless, they can hardly deal with big datasets. Besides, as the parameters in Gaussian process regression (GPR) and the extreme learning machine (ELM) can be optimized by adaptive searching approaches, hybrid models based on the two methods are proposed in [38, 39]. Among those, the GPR model is usually able to acquire promising results with less computational cost [40, 41]. It is noted that their accuracy may decrease when a large number of inputs raises the difficulty of optimization.

The above machine learning models have all been proven feasible and practicable in the field of solar radiation estimation. However, they share a common weakness: their training efficiency is often low on a large number of samples.
For example, the computational complexity of the kernel functions in SVR and GPR rises sharply with the number of samples, so that the training process can hardly be completed. Hence, those models are normally site-dependent, which means that they are trained and applied at different sites individually. By contrast, deep neural networks utilize a mini-batch gradient descent training strategy, which is suitable for very large sample sets. Additionally, the estimation accuracy can be further improved under deep models, as they usually possess better nonlinear representation abilities than conventional machine learning models. Thus, deep-learning-based models have been proposed and applied recently in the field of renewable energy assessment. In [42], a long short-term memory (LSTM) network is proposed to forecast global solar radiation. Owing to the recurrent use of hidden layer units, the LSTM network is suitable for processing series inputs. Another common kind of deep model is built by stacking many hidden layers. Such models can adapt to point inputs of multiple elements and have been established for solar radiation prediction and estimation [43, 44]. However, a deep model with a great number of layers may also face gradient vanishing problems [45], leading to non-convergence issues. Motivated by this, a deep learning model named deep belief network (DBN) is

introduced to estimate daily global solar radiation in this study. It contains a pre-training operation when stacking hidden layers, which alleviates the non-convergence problem [46]. A single DBN model contains a great number of parameters and can be directly utilized at different stations. The knowledge of empirical equations is also merged into the proposed DBN using a functional network (FN). As few studies of solar radiation estimation currently focus on deep learning methods [43], the functional DBN is proposed and validated on a case study in China.

The main contributions of this study can be summarized as follows:

(1) A deep learning method is introduced to solar radiation estimation. Owing to its strong capabilities of nonlinear representation and big data analysis, modeling solar radiation separately at different meteorological stations can be avoided.
(2) The DBN model is utilized instead of an ordinary deep neural network. Its parameters can be pre-trained with restricted Boltzmann machines (RBMs), which reduces the training difficulty of deep networks.
(3) The knowledge from empirical equations is involved in the estimation model by means of FN. Thus, reliability of the proposed model is guaranteed and its training difficulty can be further decreased.
(4) A novel clustering method called embedding clustering (EC) is proposed to divide meteorological stations. It consists of an auto-encoder (AE) and k-means. The method can overcome the weakness of conventional k-means that the clustering result is easily affected by the initial clusters.

The rest of this paper is organized as follows. The case study, along with training and testing datasets, is introduced in Section 2. The framework and algorithms of the utilized method are described in Section 3. In Section 4, results and discussions based on the case study are presented. The conclusion is drawn in Section 5.

2. Study area and datasets

A total of 30 meteorological stations in China with vast climatic differences are studied in this paper. Their geographical locations and annual average weather values are shown in Table 1. The stations cover wide ranges of latitudes from 18.04°N in Sanya to 46.14°N in Jiamusi, longitudes from 75.16°E in Kashgar to 130.05°E in Jiamusi and altitudes from 2.5 m in Tianjin to 4507.0 m in Nagqu, as presented in Fig. 1. Since deep learning methods possess a powerful representational capability, the daily global solar radiation data of all stations will be estimated either under a single deep learning model or under a small number of models after clustering.

Meteorological data records utilized in this study are obtained from the China Meteorological Data Sharing Service System (CMDSSS), including daily maximum dry-bulb temperature (Tmax), daily minimum dry-bulb temperature (Tmin), daily mean dry-bulb temperature (Tmean), daily mean wind speed (WS), daily mean relative humidity (RH), daily sunshine duration (S) and daily global solar radiation on a horizontal surface (H), which are all measured at daily intervals. The overall datasets cover 22 years altogether, from 1 January 1994 to 31 December 2015, with relatively small measurement deviations. In the raw dataset, missing and outlier values account for about 0.08% of all values, indicating good data quality. Outliers are identified using a two-tailed test with a critical value of three times the standard deviation. Since the missing and outlier values are small in number and scattered in distribution, they are cleaned based on quadratic spline interpolation. After data cleaning and normalization, all the above meteorological elements, along with geographical locations and the number of the day in the year (nday), are taken as input variables of the estimation models. Daily global solar radiation is the output.
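The cleaning step described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, and linear interpolation between the nearest valid neighbours is used as a simpler stand-in for the quadratic spline interpolation named in the text.

```python
import statistics

def flag_outliers(values, k=3.0):
    """Indices of missing values (None) or values farther than k standard
    deviations from the mean (two-tailed test, k = 3 as in the text)."""
    valid = [v for v in values if v is not None]
    mean = statistics.fmean(valid)
    std = statistics.pstdev(valid)
    return [i for i, v in enumerate(values)
            if v is None or abs(v - mean) > k * std]

def fill_linear(values, bad):
    """Replace flagged entries by linear interpolation between the nearest
    valid neighbours (stand-in for quadratic spline interpolation).
    Assumes at least one valid entry exists."""
    bad = set(bad)
    good = [i for i in range(len(values)) if i not in bad]
    cleaned = list(values)
    for i in bad:
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None:            # gap at the start: copy the next valid value
            cleaned[i] = values[right]
        elif right is None:         # gap at the end: copy the previous valid value
            cleaned[i] = values[left]
        else:
            t = (i - left) / (right - left)
            cleaned[i] = values[left] + t * (values[right] - values[left])
    return cleaned

# Illustrative series: one spike among otherwise steady readings.
series = [10.0] * 15 + [100.0] + [10.0] * 15
bad = flag_outliers(series)
cleaned = fill_linear(series, bad)
```

Because the three-sigma threshold is computed from the series itself, a single spike is only flagged when the series is long enough; on very short series no point can lie more than three standard deviations from the mean.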
In order to ensure training effects and prevent over-fitting problems, datasets in 1994~2012 are selected as training samples (about 85%) and those in 2013~2015 are testing ones.

Table 1 Geographical locations and annual average values of meteorological data at the 30 stations studied in China.

Station    Latitude  Longitude  Altitude  Tmax   Tmin   Tmean  S        WS     RH    H
           (°N)      (°E)       (m)       (°C)   (°C)   (°C)   (hours)  (m/s)        (MJ/m2)
Beijing    39.13     116.08     31.3      18.6   8.5    13.3   6.7      2.3    0.53  13.55
Changchun  43.15     125.04     236.8     11.7   1.7    6.5    7.0      3.2    0.61  13.46
Changsha   28.04     112.15     68.0      22.1   14.9   17.9   4.2      2.2    0.77  10.77
Dongsheng  39.14     109.16     1460.4    12.9   2.8    7.4    8.4      2.8    0.47  16.11
Fuzhou     26.01     119.05     84.0      25.2   17.6   20.6   4.2      2.6    0.73  12.41
Guangzhou  23.03     113.06     41.0      26.9   19.4   22.5   4.3      1.8    0.75  11.75
Haikou     20.01     110.06     13.9      28.4   22.1   24.6   5.0      2.6    0.82  14.19
Hami       42.14     93.09      737.2     18.8   3.4    10.5   9.3      1.3    0.45  16.67
Hangzhou   30.04     120.03     41.7      22.0   14.1   17.5   4.5      2.0    0.73  12.13
Harbin     45.13     126.13     142.3     10.7   0.2    5.3    6.3      2.6    0.64  12.98
Hefei      31.14     117.04     27.9      21.2   13.0   16.7   4.9      2.6    0.74  12.39
Jiamusi    46.14     130.05     81.2      9.9    -1.4   4.2    6.5      2.8    0.66  12.59
Jinan      36.10     117.01     170.3     19.9   10.8   14.9   6.0      3.0    0.57  13.19
Kashgar    39.08     75.16      1288.7    18.9   6.9    12.8   8.0      1.8    0.48  15.45
Kunming    25.00     102.11     1892.4    21.8   11.8   16.1   6.0      2.1    0.68  15.55
Lhasa      29.11     91.02      3648.7    16.8   3.1    9.1    8.2      1.7    0.40  20.41
Nagqu      31.08     92.01      4507.0    7.6    -6.5   -0.1   7.2      2.3    0.52  19.09
Nanchang   28.10     115.15     46.7      22.4   15.5   18.5   5.0      1.9    0.74  12.14
Nanjing    32.00     118.13     7.1       21.0   12.7   16.4   5.3      2.3    0.73  12.50
Nanning    22.11     108.04     121.6     26.5   18.5   21.8   4.1      1.4    0.79  12.38
Sanya      18.04     109.09     5.9       28.8   23.0   25.3   6.1      2.7    0.80  16.46
Shanghai   31.07     121.08     6.0       20.8   14.3   17.2   4.8      3.1    0.73  12.57
Shenyang   41.12     123.08     44.7      14.4   3.3    8.6    6.5      2.7    0.64  13.44
Taiyuan    37.13     112.09     778.3     17.8   5.2    11.0   6.7      1.9    0.56  13.77
Tianjin    39.01     117.01     2.5       18.6   8.6    13.1   6.2      2.4    0.61  13.36
Urumqi     43.13     87.11      935.0     13.2   3.7    7.9    7.3      2.4    0.57  14.47
Wuhan      30.10     114.02     23.1      22.0   14.1   17.5   5.0      1.4    0.75  11.85
Xining     36.12     101.13     2295.2    14.6   -0.3   6.0    7.0      1.0    0.58  15.49
Yinchuan   38.08     106.04     1111.4    16.8   4.4    10.2   7.5      2.2    0.52  15.77
Zhengzhou  34.12     113.11     110.4     20.8   10.8   15.4   5.1      2.2    0.62  12.93
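The chronological split stated before Table 1 can be checked with a short calculation. The sketch below counts calendar days per station in each period; `n_days` is a hypothetical helper, and the count ignores the small share of missing records, so it is only an approximation of the actual sample sizes.

```python
from datetime import date

def n_days(first, last):
    """Number of calendar days from first to last, both inclusive."""
    return (last - first).days + 1

train_days = n_days(date(1994, 1, 1), date(2012, 12, 31))  # training period
test_days = n_days(date(2013, 1, 1), date(2015, 12, 31))   # testing period
train_share = train_days / (train_days + test_days)
```

Under these assumptions the training period holds roughly 86% of the daily records per station, which is in line with the "about 85%" quoted in the text.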

[Figure 1: map of China spanning roughly 70°E to 130°E and 20°N to 50°N, with the 30 stations listed in Table 1 marked.]

Fig. 1. The geographical locations of the 30 studied meteorological stations in China.

3. Methodology

3.1. Estimation framework for daily global solar radiation

As shown in Fig. 2, solar radiation estimation builds and trains a model whose inputs are meteorological elements and whose output is daily global solar radiation. Because empirical and conventional machine learning methods usually have only a small number of trainable parameters, one such model is suitable for merely one station. To avoid building estimation models separately for different stations, and thereby to enhance the convenience and efficiency of practical applications, deep learning models that specialize in analyzing big datasets, i.e. functional DBNs, are introduced to estimate solar radiation. In this study, two estimation approaches based on deep learning are proposed. One is to build a functional DBN based on the overall dataset that involves radiation samples from all 30 stations. As it may be difficult to train such a huge model, the other approach combines functional DBNs with the EC method. EC is a novel clustering method that splits the dataset into several parts according to the dissimilar curve shapes of daily solar radiation, and the different parts are analyzed under different models. The principles and algorithms of those methods are described in detail below.

[Figure 2 panels: (1) basic principle of estimation (meteorological elements, estimation model, daily global solar radiation); (2) the proposed embedding clustering (EC) + functional deep belief network (DBN), with (i) estimation merely using functional DBN, (ii) estimation using EC + functional DBNs 1 to 4, and (iii) the framework of functional DBN (meteorological elements input, knowledge functions from empirical models, polynomial functions for knowledge neurons, elimination of redundant neurons, DBN); (3) comparison of conventional per-station empirical or machine learning models with the single proposed deep learning model.]

Fig. 2. The overall framework of the proposed estimation method for daily global solar radiation.

3.2. Radiation curves clustering

As mentioned before, two modeling approaches are considered: one establishes a single deep-learning-based model to estimate solar radiation at all 30 stations, and the other utilizes a clustering method to divide those stations into several groups that are modeled separately. Thus, neither approach requires a model to be established individually for each station. Because radiation curves hold certain similarities, they are analyzed and clustered as the representatives of stations in this study. Moreover, in order to overcome the problem that conventional clustering methods are heavily affected by initial centers, a novel EC method combining AE and k-means clustering is introduced, whose structure is shown in Fig. 3. The operating procedures of the proposed EC method can be summarized as follows:

1. Initialize the number of clusters;
2. Train an AE model to calculate the classification probabilities of clusters for stations, where the annual average daily solar radiation curves of those stations are chosen as inputs;
3. Initialize cluster centers based on the classification probabilities and operate k-means clustering;
4. Evaluate the clustering results under the indicator called residual sum of squares (RSS), and finally determine the number of clusters.
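Steps 3 and 4 of the procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the classification probabilities are hand-set here instead of coming from a trained AE, all function names are hypothetical, the initial centers weight each curve by its squared cluster probability (which is how Eq. (6) is read here), and the RSS indicator is taken as the sum of Pearson correlations between each curve and its cluster center, following Eq. (7).

```python
import math

def init_centers(probs, curves):
    """Initial centers: curves averaged with squared cluster probabilities as weights."""
    centers = []
    for i in range(len(probs[0])):
        weights = [p[i] ** 2 for p in probs]
        total = sum(weights)
        centers.append([sum(w * x[d] for w, x in zip(weights, curves)) / total
                        for d in range(len(curves[0]))])
    return centers

def kmeans(curves, centers, n_iter=20):
    """Plain k-means (Euclidean distance) started from the given centers."""
    centers = [list(c) for c in centers]
    labels = [0] * len(curves)
    for _ in range(n_iter):
        labels = [min(range(len(centers)),
                      key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centers[i])))
                  for x in curves]
        for i in range(len(centers)):
            members = [x for x, lab in zip(curves, labels) if lab == i]
            if members:  # keep the previous center if a cluster becomes empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant vectors."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def rss(curves, centers, labels):
    """Sum of correlations between every curve and the center of its cluster."""
    return sum(pearson(centers[lab], x) for x, lab in zip(curves, labels))

# Toy example: four short "radiation curves" and made-up AE probabilities.
curves = [[1.0, 2.0, 3.0, 2.0], [1.1, 2.1, 3.2, 2.0],
          [5.0, 4.0, 3.0, 4.0], [5.2, 4.1, 2.9, 4.0]]
probs = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
centers, labels = kmeans(curves, init_centers(probs, curves))
score = rss(curves, centers, labels)
```

Repeating the run for several candidate cluster numbers and comparing the resulting `score` values mirrors step 4, where the number of clusters is chosen by observing how the indicator changes.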

[Figure 3 layout: auto-encoder + k-means clustering. The encoder and the decoder are built from dense layers with ReLU activations; softmax and sigmoid activations are also used.]

Fig. 3. The structure of embedding clustering (EC).

An AE model consists of two parts, i.e. an encoder and a decoder, which are both made of dense layers (fully-connected layers) [47]:

$$y_i^l = \alpha\left( \sum_{j=1}^{n} w_{ij}^l x_j^{l-1} + b_i^l \right) \tag{1}$$

where $w_{ij}^l$ is the weight of the lth layer that connects the ith output and the jth input; $b_i^l$ represents the ith bias of the lth layer; $x_j^{l-1}$ and $y_i^l$ are the jth input from the (l-1)th layer and the ith output of the lth layer, respectively; $\alpha(\cdot)$ represents an activation function and n denotes the number of input units. The outputs of the final dense layer in the encoder represent the extracted high-order features. In the proposed EC method, their values indicate the classification probabilities of different clusters. As a result, the number of those outputs is equal to the number of clusters. Based on the classification probabilities, the decoder aims to reconstruct the input features with the minimum deviation:

$$\mathcal{L}_{AE} = \left\| X_{AE} - \hat{X}_{AE} \right\|^2 \tag{2}$$

where $\mathcal{L}_{AE}$ is the training loss function of the AE; $X_{AE}$ and $\hat{X}_{AE}$ denote the input vector and the reconstructed output vector of the AE, respectively. Additionally, the model utilized in this study is not stacked from shallow AEs. Instead, it is directly built with a deep encoder and a deep decoder, each of four layers. According to Fig. 3, three different activation functions are applied:

$$\alpha_{ReLU}(x) = \max(0, x) \tag{3}$$

$$\alpha_{sigmoid}(x) = \frac{1}{1 + e^{-x}} \tag{4}$$

$$\alpha_{softmax}(x_i) = \frac{e^{x_i}}{\sum_j e^{x_j}} \tag{5}$$

where $x_i$ is the ith unit for activation. Sigmoid is a conventional nonlinear activation function, ReLU (rectified linear unit) benefits the back-propagation training algorithm and helps prevent gradient vanishing, and the softmax function is used to calculate classification probabilities. After training an AE model, the outputs of the encoder are used to initialize the cluster centers:

$$C_i = \frac{\sum_k p_{ik}^2 X_k}{\sum_k p_{ik}^2} \tag{6}$$

where $X_k$ is the radiation curve vector of the kth station and $p_{ik}$ denotes the probability that the kth radiation curve belongs to the ith cluster. The initial centers ($C_i$) are adopted in k-means clustering, a simple and fast aggregation algorithm that determines the representative of a cluster by computing the mean value of its members [48]. It should be noted that the embedding method can also be combined with other clustering methods that need initial cluster centers, e.g. fuzzy c-means [49] and k-medians [50]. In particular, in order to decide the number of clusters, the RSS indicator based on Pearson correlation is applied in this study:

$$RSS = \sum_i \sum_k \rho\left( \tilde{C}_i, \tilde{X}_{ik} \right) \tag{7}$$

where $\rho(\cdot)$ is the Pearson correlation function; $\tilde{C}_i$ and $\tilde{X}_{ik}$ are the final results obtained by the EC method, denoting the ith cluster center and the kth curve vector belonging to the ith cluster, respectively. The number of clusters can be determined by observing the changes of RSS, after which the clustering results of solar radiation curves are obtained.

3.3. Functional deep belief network

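[Figure 4 layout: four stages labelled Input, Knowledge, Polynomial and Elimination, followed by the DBN. Inputs x (e.g. Tmax, Tmin, RH, nday) feed knowledge neurons g (e.g. S/So, ln(Tmax-Tmin), e^((Tmax-Tmin)/So)), which are combined by polynomial neurons f before redundant neurons are eliminated.]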

Fig. 4. The functional deep belief network structure for daily global solar radiation.

The functional DBN is a hybrid model combining DBN and FN. The structure of the utilized functional DBN is shown in Fig. 4. It consists of four parts, i.e. the knowledge functional layer, the polynomial functional layer, the neuron elimination layer and the DBN. Among those, a neuron (node) in the functional layers can be described as:

$$y_i^l = f_i\left( x_1^{l-1}, x_2^{l-1}, \ldots, x_j^{l-1}, \ldots, x_n^{l-1} \right) \tag{8}$$

where $f_i(\cdot)$ is the computational function of the ith neuron in a functional layer. Hence, the connections of neuron nodes in functional layers are realized by those functions, rather than by the weight matrices used in dense layers. There are various families of functions, e.g. polynomial, exponential and trigonometric ones, and it is hard to decide exactly which ones to utilize. Therefore, knowledge extracted from empirical solar radiation models, which is usually in the form of nonlinear elements, is selected as the basic functions. Subsequently, those basic functions are further transformed into polynomial functions. Due to the large number of polynomial combinations, a backward elimination (BE) algorithm is utilized to eliminate redundant functions. The core of BE is to test each function by deleting it from the model. If the deletion causes no deterioration in the performance of the model, the function is redundant and should be eliminated [51].

3.3.1 Knowledge extraction from empirical models

It has been mentioned previously that knowledge extracted from empirical models is selected as basic functions, which form the neurons of the knowledge functional layer. Knowledge extraction has two merits. On the one hand, as those neurons themselves contain strong nonlinear characterizations, the network for solar radiation already achieves a high complexity and need not be designed with an extremely deep architecture. On the other hand, the functional neurons help extract features from the original inputs in advance, which reduces the difficulty of model training and improves the estimation accuracy.

As for empirical models, meteorological variables hold a strong correlation with global solar radiation, especially sunshine duration and temperature. Thus, SDB and TB models have been applied worldwide. Among those, the Angstrom-Prescott linear model is one of the most common SDB models, which is established to calculate the clearness index ($H/H_o$) [52]:

$$\frac{H}{H_o} = a + b \frac{S}{S_o} \tag{9}$$

Furthermore, polynomial SDB models are also widely developed [53-55]:

$$\frac{H}{H_o} = a + b \frac{S}{S_o} + c \left( \frac{S}{S_o} \right)^2 + d \left( \frac{S}{S_o} \right)^3 \tag{10}$$

where a, b, c and d are trainable parameters. The daily extraterrestrial solar radiation ($H_o$) and the daily maximum possible sunshine duration ($S_o$) can be calculated as [56]:

$$H_o = \frac{24}{\pi} I_{SC} \left[ 1 + 0.034 \cos\left( \frac{360\, n_{day}}{365} \right) \right] \times \left[ \cos\varphi \cos\delta \sin\omega_s + \frac{2\pi\omega_s}{360} \sin\varphi \sin\delta \right] \tag{11}$$

$$S_o = \frac{2}{15} \cos^{-1}\left( -\tan\varphi \tan\delta \right) \tag{12}$$

where $\varphi$ is the latitude of a station and $I_{SC}$ denotes the solar constant, which is equal to 1367 W/m2. The daily solar declination ($\delta$) and the daily sunrise hour angle ($\omega_s$) are computed by the following equations:

$$\delta = 23.45 \sin\left[ \frac{360 (n_{day} + 284)}{365} \right] \tag{13}$$

$$\omega_s = \cos^{-1}\left( -\tan\delta \tan\varphi \right) \tag{14}$$

Similar to SDB models, TB models aim at estimating daily global solar radiation using ambient temperature. Different temperature types, i.e. Tmax, Tmin and Tmean, have been merged in building TB models. The Hargreaves model is a typical TB model for estimating global solar radiation [57, 58]:

$$\frac{H}{H_o} = a \Delta T + b + \frac{c}{H_o} \tag{15}$$

where a, b and c are parameters, and $\Delta T$ represents the daily temperature difference, which is equal to Tmax-Tmin. Another two TB models combining the daily temperature difference and the maximum possible sunshine duration are introduced in [59]:

$$\frac{H}{H_o} = a + b \frac{\Delta T}{S_o} + c \left( \frac{\Delta T}{S_o} \right)^2 + d \left( \frac{\Delta T}{S_o} \right)^3 \tag{16}$$

$$\frac{H}{H_o} = a + e^{b (\Delta T / S_o)} \tag{17}$$

In [60], a modified Hargreaves model is proposed based on the logarithmic temperature difference:

$$\frac{H}{H_o} = a + b \ln(\Delta T) \tag{18}$$

Besides, hybrid variable models have also proven effective in solar radiation estimation. The following two, which involve temperature and humidity, are referred to in this study [61, 62]:

$$\frac{H}{H_o} = a + b \left( \frac{\Delta T + RH}{S_o} \right)^{0.5} \tag{19}$$

$$H = a \cos\varphi + b \cos(n_{day}) + c\, T_{max} + d \left( \frac{S}{S_o} \right) + e \left( \frac{T_{max}}{RH} \right) + f \left( \frac{T_{max}}{RH} \right)^2 + g \tag{20}$$

where a, b, c, d, e, f and g are trainable parameters. According to all the above empirical models, the basic neurons in the knowledge functional layer can be decided. The number of neurons is 21, as presented in Table 2. In the polynomial functional layer, those neurons are combined to constitute a third-order polynomial family of functions. Furthermore, the final number of inputs in the DBN is chosen as 182 after neuron elimination via the BE method.

Table 2

Basic functional neurons in the knowledge functional layer for estimating daily global solar radiation.

Neuron                        Type         Description / Relevant empirical models
φ                             basic input  latitude
λ                             basic input  longitude
h                             basic input  altitude
Tmax                          basic input  daily maximum dry-bulb temperature
Tmin                          basic input  daily minimum dry-bulb temperature
Tmean                         basic input  daily mean dry-bulb temperature
RH                            basic input  daily mean relative humidity
WS                            basic input  daily mean wind speed
nday                          basic input  the number of the day in a year
S                             basic input  daily sunshine duration
cos φ                         function     the cosine function of latitude
sin φ                         function     the sine function of latitude
cos δ                         function     the cosine function of daily solar declination
sin δ                         function     the sine function of daily solar declination
S/So                          function     H/Ho = a + b(S/So) + c(S/So)^2 + d(S/So)^3
Tmax - Tmin                   function     H/Ho = a ΔT + b + c/Ho
(Tmax - Tmin)/So              function     H/Ho = a + b(ΔT/So) + c(ΔT/So)^2 + d(ΔT/So)^3
e^((Tmax - Tmin)/So)          function     H/Ho = a + e^(b(ΔT/So))
ln(Tmax - Tmin)               function     H/Ho = a + b ln(ΔT)
((Tmax - Tmin + RH)/So)^0.5   function     H/Ho = a + b((ΔT + RH)/So)^0.5
Tmax/RH                       function     H = a cos φ + b cos(nday) + c Tmax + d(S/So) + e(Tmax/RH) + f(Tmax/RH)^2 + g
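The solar geometry of Eqs. (11)-(14) and the function-type neurons of Table 2 can be sketched as follows. This is an illustrative reading of the formulas, not the authors' code: the function names and dictionary keys are hypothetical, angles are handled in degrees as in the equations, and the final conversion of Ho from Wh/m2 to MJ/m2 is an assumption made here so that the result is comparable with H in Table 1.

```python
import math

I_SC = 1367.0  # solar constant in W/m2, as stated in the text

def declination_deg(nday):
    """Daily solar declination in degrees, Eq. (13)."""
    return 23.45 * math.sin(math.radians(360.0 * (nday + 284) / 365.0))

def sunrise_hour_angle_deg(lat_deg, nday):
    """Sunrise hour angle in degrees, Eq. (14); valid away from polar latitudes."""
    delta = math.radians(declination_deg(nday))
    phi = math.radians(lat_deg)
    return math.degrees(math.acos(-math.tan(delta) * math.tan(phi)))

def max_sunshine_hours(lat_deg, nday):
    """Maximum possible sunshine duration So in hours, Eq. (12)."""
    return 2.0 / 15.0 * sunrise_hour_angle_deg(lat_deg, nday)

def extraterrestrial_radiation(lat_deg, nday):
    """Daily extraterrestrial radiation Ho, Eq. (11), converted to MJ/m2 (assumption)."""
    phi = math.radians(lat_deg)
    delta = math.radians(declination_deg(nday))
    ws_deg = sunrise_hour_angle_deg(lat_deg, nday)
    eccentricity = 1.0 + 0.034 * math.cos(math.radians(360.0 * nday / 365.0))
    geometry = (math.cos(phi) * math.cos(delta) * math.sin(math.radians(ws_deg))
                + (2.0 * math.pi * ws_deg / 360.0) * math.sin(phi) * math.sin(delta))
    watt_hours = 24.0 / math.pi * I_SC * eccentricity * geometry
    return watt_hours * 3600.0 / 1e6

def knowledge_neurons(lat_deg, nday, t_max, t_min, rh, s):
    """The 11 function-type neurons of Table 2 for one daily sample (requires t_max > t_min)."""
    so = max_sunshine_hours(lat_deg, nday)
    phi = math.radians(lat_deg)
    delta = math.radians(declination_deg(nday))
    dt = t_max - t_min
    return {
        "cos_phi": math.cos(phi), "sin_phi": math.sin(phi),
        "cos_delta": math.cos(delta), "sin_delta": math.sin(delta),
        "S/So": s / so,
        "dT": dt,
        "dT/So": dt / so,
        "exp(dT/So)": math.exp(dt / so),
        "ln(dT)": math.log(dt),
        "((dT+RH)/So)^0.5": math.sqrt((dt + rh) / so),
        "Tmax/RH": t_max / rh,
    }

# Illustration only: Beijing's latitude and annual averages from Table 1, on day 172.
ho = extraterrestrial_radiation(39.13, 172)
neurons = knowledge_neurons(39.13, 172, t_max=18.6, t_min=8.5, rh=0.53, s=6.7)
```

Together with the 10 basic inputs, these 11 values give the 21 neurons counted in the text; the polynomial layer and backward elimination would then operate on that 21-element vector.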

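3.3.2 Unsupervised learning strategy of deep belief network

[Figure 5 layout: a DBN built from two stacked RBMs. RBM1 appears to map the input (visible units v1 to v4, with reconstructions v^1 to v^4) to Feature I (hidden units h1 to h3); RBM2 maps Feature I (visible units v1 to v3) to Feature II (hidden units h1 and h2), which feeds the output.]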

Fig. 5. The hierarchical architecture of deep belief network (DBN).

DBN is a deep network stacked by a special kind of dense layers, namely RBMs, whose parameters can be pre-trained under an unsupervised learning strategy. Hence, the entire training process of DBN can be divided into two phases: unsupervised pre-training phase and supervised back-propagation (finetuning) phase. RBM is an energy-based model, whose aim is to extract hidden features and pre-train weights of dense layers, in order to accelerate converge rate of training a deep network, as well as to avoid gradient vanishing and local optimum problems. As shown in Fig. 5, it is made up of a visible layer and a hidden layer, which can be defined as an energy function ε [63]: n

m

m

n

j =1

i=1

ε (vj , hi | θ ) = −∑∑ωij hv i j − ∑aj v j − ∑bh i i i=1 j =1

(21)

9

where vj and hi are the jth unit in a visible layer and the ith unit in a hidden layer, respectively; m and n are the numbers of visible units and hidden units, respectively. θ = {ω ij , a j , bi :1 ≤ i ≤ n ,1 ≤ j ≤ m} is the trainable parameter set, where ωij represents the weight between vj and hi, aj and bi are the biases in the visible layer and hidden layer, respectively. The training process of RBM is realized layer by layer, where the outputs of the former hidden layer are drawn as the inputs of the latter visible layer. As a result, its strategy is to learn a probability distribution from the visible layer to the hidden layer. First, their joint probability distribution P is defined in terms of the energy function:

$$P(v, h \mid \theta) = \frac{1}{Z(\theta)}\, e^{-\varepsilon(v, h \mid \theta)} \qquad (22)$$

$$Z(\theta) = \sum_{v,h} e^{-\varepsilon(v, h \mid \theta)} \qquad (23)$$

where Z(θ) denotes the partition function. Next, the individual probabilities, of vj given h and of hi given v, can be deduced as follows [64]:

$$P(v_j = 1 \mid h) = \operatorname{sigmoid}\left(a_j + \sum_{i=1}^{n} \omega_{ij} h_i\right) \qquad (24)$$

$$P(h_i = 1 \mid v) = \operatorname{sigmoid}\left(b_i + \sum_{j=1}^{m} \omega_{ij} v_j\right) \qquad (25)$$
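As a sanity check on Eqs. (21) to (25), the sketch below enumerates every state of a very small binary RBM and compares the conditional probability obtained from the joint distribution with the sigmoid form of Eq. (25). The layer sizes, random parameters and function names are illustrative assumptions and do not come from the paper.

```python
import itertools
import numpy as np

def energy(v, h, W, a, b):
    """Energy of Eq. (21); W has shape (n_hidden, m_visible)."""
    return -(h @ W @ v) - a @ v - b @ h

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Assumed toy sizes: m = 3 visible units, n = 2 hidden units.
rng = np.random.default_rng(0)
m, n = 3, 2
W = rng.normal(size=(n, m))
a = rng.normal(size=m)
b = rng.normal(size=n)

states_v = [np.array(s, dtype=float) for s in itertools.product([0, 1], repeat=m)]
states_h = [np.array(s, dtype=float) for s in itertools.product([0, 1], repeat=n)]

# Partition function of Eq. (23): one term per joint state, 2**(n + m) in total.
terms = [np.exp(-energy(v, h, W, a, b)) for v in states_v for h in states_h]
Z = sum(terms)
n_terms = len(terms)

# P(h_0 = 1 | v) computed from the joint distribution of Eq. (22) ...
v = np.array([1.0, 0.0, 1.0])
weights = [(h, np.exp(-energy(v, h, W, a, b)) / Z) for h in states_h]
p_brute = sum(p for h, p in weights if h[0] == 1.0) / sum(p for _, p in weights)

# ... and from the closed form of Eq. (25).
p_formula = sigmoid(b[0] + W[0] @ v)
```

The two values agree up to floating-point error, while the number of terms in Z doubles with every added unit, which is why exact evaluation quickly becomes impractical.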

Since training an RBM means adjusting its parameter set θ, the objective function is defined to maximize the log-likelihood:

$$L_{RBM}(\theta) = \ln \prod_{i=1}^{n_s} P(v^{(i)} \mid \theta) = \sum_{i=1}^{n_s} \ln P(v^{(i)} \mid \theta) \qquad (26)$$

where $L_{RBM}(\theta)$ denotes the log-likelihood function in terms of the parameter set θ, $n_s$ is the number of samples and $v^{(i)}$ is the visible vector of the ith sample. Then, the gradient ascent algorithm is adopted to solve the objective function and adjust the parameters:

$$\begin{aligned}
\frac{\partial L_{RBM}(\theta)}{\partial \theta} &= \frac{\partial}{\partial \theta}\left[\sum_{i=1}^{n_s} \ln \sum_{h} P(v^{(i)}, h \mid \theta)\right] \\
&= \frac{\partial}{\partial \theta}\left[\sum_{i=1}^{n_s} \ln \frac{\sum_{h} e^{-\varepsilon(v^{(i)}, h \mid \theta)}}{\sum_{v,h} e^{-\varepsilon(v, h \mid \theta)}}\right] \\
&= \sum_{i=1}^{n_s}\left[-E_{P(h \mid v^{(i)}, \theta)}\left(\frac{\partial \varepsilon(v^{(i)}, h \mid \theta)}{\partial \theta}\right) + E_{P(v, h \mid \theta)}\left(\frac{\partial \varepsilon(v, h \mid \theta)}{\partial \theta}\right)\right]
\end{aligned} \qquad (27)$$

where $E_P(\cdot)$ represents the mathematical expectation under distribution P. When units in the first visible layer are given as training samples, the parameters of each RBM can be adjusted. However, evaluating the gradient exactly has a computational complexity of $O(2^{n+m})$, which makes training deep networks inefficient. Therefore, Markov chain Monte Carlo (MCMC) and Gibbs sampling can be applied to estimate the expectations. Further, contrastive divergence (CD) is introduced for approximate sampling, since obtaining those expectation samples is usually also intractable [65]. After the CD operation, the pre-training phase can be completed.
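The pre-training step described above can be illustrated with a minimal NumPy sketch of one-step contrastive divergence (CD-1). It assumes binary visible and hidden units, a toy random dataset and a fixed learning rate. These choices and the function name `cd1_update` are assumptions made for illustration, not the configuration used in the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(V, W, a, b, lr, rng):
    """One CD-1 step on a batch V of shape (batch, m); W has shape (n, m)."""
    # Positive phase: hidden probabilities given the data, Eq. (25).
    ph_data = sigmoid(b + V @ W.T)
    h_sample = (rng.random(ph_data.shape) < ph_data).astype(float)
    # Negative phase: one Gibbs step back to the visible layer, Eq. (24),
    # then hidden probabilities for the reconstruction.
    pv_model = sigmoid(a + h_sample @ W)
    v_sample = (rng.random(pv_model.shape) < pv_model).astype(float)
    ph_model = sigmoid(b + v_sample @ W.T)
    # Approximate gradient of Eq. (27): data statistics minus model statistics.
    batch = V.shape[0]
    W = W + lr * (ph_data.T @ V - ph_model.T @ v_sample) / batch
    a = a + lr * (V - v_sample).mean(axis=0)
    b = b + lr * (ph_data - ph_model).mean(axis=0)
    return W, a, b

# Toy usage with assumed sizes: 6 visible units, 4 hidden units, 64 samples.
rng = np.random.default_rng(0)
V = (rng.random((64, 6)) < 0.5).astype(float)
W = 0.01 * rng.normal(size=(4, 6))
a = np.zeros(6)
b = np.zeros(4)
for _ in range(50):
    W, a, b = cd1_update(V, W, a, b, lr=0.1, rng=rng)
```

In a stacked setting, the hidden probabilities of a trained RBM would serve as the visible inputs of the next one, in line with the layer-by-layer procedure described earlier.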

4. Results and discussions

In order to validate the reliability and accuracy of the proposed methods, a case study including 30 meteorological stations in China is adopted. In this study, a DBN model without functional neurons, the proposed functional DBN and the functional DBN with the EC method are established to estimate daily global solar radiation. Three machine learning models, namely SVR, GPR and ANFIS, are built for comparison. As they demonstrate low efficiency when trained on a large number of samples, they are modeled separately for each station, i.e., as site-specific models. Furthermore, three empirical models from related literature, which outperform conventional models on the dataset of China, are also established in this study:

(1) Empirical model 1 (EM1), which is taken from the P3 model in [52]:

$$H = H_o\left[a + b\left(\frac{S}{S_o}\right)^{c} + d\ln(\Delta T) + eT_{mean}\right] \qquad (28)$$

(2) Empirical model 2 (EM2), named the expansion of the improved Bristow-Campbell model [66], which incorporates more weather elements into the Bristow-Campbell TB model:

$$\frac{H}{H_o} = \left[a + b\sin\left(\frac{2\pi j_{day}}{365}\right) + c\cos\left(\frac{2\pi j_{day}}{365}\right) + dR_H + eS\right] \times \left[1 - \exp\left(-f\,\Delta T^{g}\right)\right] \qquad (29)$$

where $j_{day}$ represents the Julian day.

(3) Empirical model 3 (EM3), which is a general model considering the locations of stations [67]:

$$H = H_o\left(a^{*}\Delta T + b^{*}R_H + c^{*}\frac{S}{S_o}\right) + d^{*}, \qquad \begin{cases} a^{*} = a_1\varphi + a_2\lambda + a_3 h + a_4 \\ b^{*} = b_1\varphi + b_2\lambda + b_3 h + b_4 \\ c^{*} = c_1\varphi + c_2\lambda + c_3 h + c_4 \\ d^{*} = d_1\varphi + d_2\lambda + d_3 h + d_4 \end{cases} \qquad (30)$$

where $a^{*}$, $b^{*}$, $c^{*}$ and $d^{*}$ are coefficients in terms of locations (i.e. latitude φ, longitude λ and altitude h); a1~a4, b1~b4, c1~c4 and d1~d4 are the trainable parameters.

The estimation results of all the above models, as well as the comparisons and discussions, are presented in this section. Moreover, the proposed method is extended to daily global solar radiation forecasting and hourly global solar radiation estimation. These results are presented and discussed as well.

4.1. Performance criteria

Four criteria are utilized in this study to measure the estimation performance of different models, i.e. mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE) and correlation coefficient (R) [66-69]. They are calculated with the following equations:

$$MAE = \frac{1}{n_s}\sum_{i=1}^{n_s}\left|H_{m,i} - H_{e,i}\right| \qquad (31)$$

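$$RMSE = \sqrt{\frac{1}{n_s}\sum_{i=1}^{n_s}\left(H_{m,i} - H_{e,i}\right)^{2}} \qquad (32)$$

$$MAPE = \frac{1}{n_s}\sum_{i=1}^{n_s}\frac{\left|H_{m,i} - H_{e,i}\right|}{H_{m,i}} \times 100\% \qquad (33)$$

$$R = \frac{\sum_{i=1}^{n_s}\left(H_{m,i} - \bar{H}_m\right)\left(H_{e,i} - \bar{H}_e\right)}{\sqrt{\sum_{i=1}^{n_s}\left(H_{m,i} - \bar{H}_m\right)^{2}\,\sum_{i=1}^{n_s}\left(H_{e,i} - \bar{H}_e\right)^{2}}} \qquad (34)$$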

where ns is the number of samples; Hm,i and He,i are the ith measured value and the ith estimated value, respectively; $\bar{H}_m$ and $\bar{H}_e$ are the mean values of Hm,i and He,i, respectively. MAE, RMSE and MAPE are non-negative, and a value close to zero denotes a high estimation accuracy. R lies within the range of [−1, 1], and a larger R value indicates a stronger correlation between measured and estimated values.

4.2. Clustering results

Based on the proposed EC method, the clustering results of the global solar radiation curves are obtained, and the centers of the clusters are shown in Fig. 6. The stations with dissimilar solar radiation curves are thus separated, and their names are marked in different colors in Fig. 1. According to the EC results and locations, stations in the same cluster possess certain geographic correlations. Kunming station belongs to a special cluster, i.e. cluster 2, which receives a great amount of global solar radiation over the year. Moreover, the number of clusters is determined based on the changes of RSS, which are presented in Fig. 7. RSS keeps decreasing as the number of clusters increases, but the decrease slows down gradually. Hence, the number of clusters is set to 4, as more clusters do not significantly reduce the value of RSS. The structure of an AE model with 4 clusters is shown in Table 3. The results of the EC method can be combined with the proposed functional DBN, and whether the estimation performance improves after clustering is further validated in this study.
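The rule for choosing the number of clusters, stopping once an additional cluster no longer reduces RSS noticeably, can be sketched as below. The RSS values and the threshold `min_drop` are hypothetical and are not read from Fig. 7. The paper describes the decision qualitatively, so the explicit threshold is an assumption.

```python
def choose_n_clusters(rss, min_drop):
    """rss[k - 1] is the RSS obtained with k clusters.

    Returns the smallest k for which adding one more cluster lowers the RSS
    by less than min_drop, or the largest tested k if that never happens.
    """
    for k in range(1, len(rss)):
        if rss[k - 1] - rss[k] < min_drop:
            return k
    return len(rss)

# Hypothetical RSS values for 1 to 10 clusters (illustrative only).
rss = [1.18, 1.10, 1.06, 1.03, 1.02, 1.012, 1.006, 1.002, 1.000, 0.999]
n_clusters = choose_n_clusters(rss, min_drop=0.02)
```

With these made-up values the drops are 0.08, 0.04 and 0.03 before falling to 0.01, so the rule stops at 4 clusters. A different threshold would shift the answer, which is why the change of RSS is usually also inspected visually.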

[Figure: four panels (Cluster 1 to Cluster 4) plotting daily solar radiation (MJ/m2) against nday.]

Fig. 6. The clustering centers of radiation curves from 30 meteorological stations.

[Figure: RSS and changes of RSS plotted against the number of clusters (1 to 10).]

Fig. 7. The indicator of RSS and its changes based on different numbers of clusters.

Table 3 The structure and hyper-parameters of an AE with 4 clusters.

| Layer | Hyper-parameters | The number of trainable parameters |
|---|---|---|
| Input | Input number: 365 | None |
| Encoder 1 (Dense) | Node number: 5; Activation: ReLU | 1,830 |
| Encoder 2 (Dense) | Node number: 16; Activation: ReLU | 96 |
| Cluster (Dense) | Node number: 4; Activation: softmax | 68 |
| Decoder 1 (Dense) | Node number: 16; Activation: ReLU | 80 |
| Decoder 2 (Dense) | Node number: 5; Activation: ReLU | 85 |
| Output (Dense) | Node number: 365; Activation: sigmoid | 2,190 |
| Summary | – | 4,349 |
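The parameter counts in Table 3 follow from the usual dense-layer rule of inputs times outputs plus one bias per output. The short check below reproduces them from the layer widths listed in the table. The helper name `dense_params` is introduced here for illustration only.

```python
def dense_params(n_in, n_out):
    """Trainable parameters of a fully connected layer: weights plus biases."""
    return n_in * n_out + n_out

# Layer widths from Table 3: input, two encoders, cluster layer, two decoders, output.
layer_sizes = [365, 5, 16, 4, 16, 5, 365]
counts = [dense_params(i, o) for i, o in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(counts)
# counts -> [1830, 96, 68, 80, 85, 2190], total -> 4349
```

The same rule applied to the widths 182, 128, 64, 64, 50 and 1 gives the 39,141 trainable parameters reported for the functional DBN.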

4.3. Daily global solar radiation estimation

The structures and hyper-parameters of the utilized DBN and functional DBN are presented in Table 4, where the DBN has no knowledge, polynomial and elimination layers. In the functional DBN, neurons in the functional layer and elimination layer are fixed before training, so trainable parameters only exist in the DBN part, which is pre-trained by RBMs. The numbers of hidden layers and hidden neurons are equal in both DBN and functional DBN, for the convenience of comparison. The estimation results of daily global solar radiation for 30 stations in China, obtained from the three empirical models, SVR, GPR, ANFIS, DBN, functional DBN and EC + functional DBN, are shown in Table 5. The EM1, EM2, EM3, SVR, GPR and ANFIS models are built and applied on each meteorological station independently. The DBN and functional DBN models are established and trained on the overall dataset that involves all 30 stations, while EC + functional DBN divides the dataset into four clusters and then builds estimation models for those clusters separately. According to the comparisons, functional DBN with EC achieves the best average scores of the performance criteria over the entire testing dataset, with 1.706 MJ/m2 of MAE, 2.352 MJ/m2 of RMSE, 13.71% of MAPE and 0.955 of R, indicating high accuracy and practicability. It should be noted that all the validated models are trained 10 times, and the comparison results in Table 5 are the mean values of the performance criteria. The box-plots of those performance criteria are shown in Fig. 8. Besides, the case studies are conducted using a personal computer with an Intel Core-i7-7700K CPU, an NVIDIA GTX-1080 GPU and 16GB RAM, running Python 3.5. The total processing time of training models for 30 stations is provided in Table 6. For empirical models, the boxes of EM2 are the highest and longest, indicating low estimation accuracy and poor stability. Thus, an empirical model with complex and strongly nonlinear equations would be hard to train, as it falls into local minima more easily. EM3 is the most stable and accurate empirical model, but it also has the most parameters and the longest training time.

For machine learning methods, the boxes are all lower than those of the empirical models and flatter than those of the deep learning ones. Hence, they are more precise than the empirical models. Among them, GPR shows the best accuracy with 1.817 MJ/m2 of MAE, 2.445 MJ/m2 of RMSE, 14.42% of MAPE and 0.952 of R, slightly worse than that of DBN. According to the length of the boxes, since machine learning models are site-dependent and built at each station independently, the fluctuation of their estimation errors is smaller than that of deep learning models. However, although those site-dependent models are easily trained and need little training time at each station, training a large number of models at different stations still costs a long processing time. On the contrary, a well-trained deep learning model can be directly used on those stations, decreasing the total training time. From Table 6, the total processing time of SVR and GPR on 30 stations is longer than that of DBN and functional DBN, though training a single site-specific model for one station may cost less than one minute. In particular, the total training time of GPR is 1442.32 s, which is nearly three times that of functional DBN. Moreover, in terms of precision, the average estimation errors can be further decreased using deep learning models. By examining the box-plots of the three DBN methods, the box of functional DBN is flatter and lower than that of the common DBN. Thus, with the assistance of functional neurons and empirical knowledge, functional DBN indeed obtains better stability and precision than the common DBN. Moreover, the estimation performance of the functional DBN is further improved when it is combined with the EC method, where the height of its box is remarkably narrowed. The ranges of MAE, RMSE, MAPE and R are reduced by 0.080 MJ/m2, 0.074 MJ/m2, 0.43% and 0.002 from DBN to the hybrid model, respectively. As a result, the proposed EC + functional DBN method shows both high accuracy and robustness.

It should be noted, however, that the proposed hybrid method requires the longest training time among all tested models, more than 1500 s, as four trained deep models are needed. In summary, based on all the above comparisons and discussions, including accuracy and processing time, operators can select the estimation models according to the number of stations and training samples. Four meteorological stations that belong to different clusters, i.e. Beijing, Kunming, Changsha and Hefei, are chosen as representatives for exhibition. The probabilistic distributions of errors between the measured and estimated values in the four stations are presented in Fig. 9. A probabilistic distribution with a sharper curve close to zero denotes higher estimation accuracy. The curves of the three deep learning models in Beijing and Kunming are obviously sharper than those of the other benchmark models, demonstrating the superiority of the proposed methods. Especially in Kunming station, the indicators of MAE, RMSE, MAPE and R are improved from 3.431 MJ/m2, 4.475 MJ/m2, 22.21% and 0.898 using EM2 to 1.414 MJ/m2, 1.920 MJ/m2, 11.09% and 0.959 using EC + functional DBN, respectively. In Changsha, ANFIS achieves results similar to those of DBN and functional DBN, but it performs worse than EC + functional DBN. In Hefei, all models obtain good estimation results with sharp distribution curves, while the curve of EC + functional DBN is slightly closer to zero. Therefore, the proposed hybrid method based on clustering and functional DBN achieves the best performance at all four testing stations.

Table 4 The structure and hyper-parameters of the proposed functional DBN.

| Layer | Hyper-parameters | The number of trainable parameters |
|---|---|---|
| Input | Input number: 10 | None |
| Knowledge | Node number: 21 | None |
| Polynomial | Node number: 274 | None |
| Elimination | Node number: 182 | None |
| RBM 1 (Dense) | Node number: 128; Activation: ReLU | 23,424 |
| RBM 2 (Dense) | Node number: 64; Activation: ReLU | 8,256 |
| RBM 3 (Dense) | Node number: 64; Activation: ReLU | 4,160 |
| RBM 4 (Dense) | Node number: 50; Activation: ReLU | 3,250 |
| Output (Dense) | Node number: 1; Activation: sigmoid | 51 |
| Summary | – | 39,141 |

Table 5 Comparisons of performance criteria based on different solar radiation models at the 30 stations studied in China. Each cell lists MAE (MJ/m2) / RMSE (MJ/m2) / MAPE (%) / R.

| Station | Cluster | EM1 | EM2 | EM3 | SVR | GPR | ANFIS | DBN | Functional DBN | EC + Functional DBN |
|---|---|---|---|---|---|---|---|---|---|---|
| Beijing | 1 | 1.624 / 2.144 / 13.22 / 0.963 | 1.603 / 2.220 / 13.84 / 0.965 | 1.603 / 2.106 / 14.38 / 0.962 | 1.415 / 1.814 / 12.48 / 0.974 | 1.289 / 1.687 / 11.44 / 0.978 | 1.522 / 1.956 / 14.00 / 0.968 | 1.294 / 1.734 / 11.60 / 0.975 | 1.222 / 1.652 / 11.10 / 0.977 | 1.165 / 1.581 / 10.73 / 0.978 |
| Changchun | 1 | 1.749 / 2.252 / 15.76 / 0.946 | 1.441 / 1.959 / 13.24 / 0.969 | 1.526 / 2.019 / 14.48 / 0.961 | 1.389 / 1.803 / 13.03 / 0.969 | 1.266 / 1.690 / 11.72 / 0.973 | 1.301 / 1.750 / 11.89 / 0.971 | 1.285 / 1.733 / 11.61 / 0.971 | 1.219 / 1.672 / 11.11 / 0.974 | 1.187 / 1.631 / 10.99 / 0.975 |
| Changsha | 3 | 1.724 / 2.282 / 19.89 / 0.953 | 2.709 / 3.449 / 27.77 / 0.930 | 1.651 / 2.237 / 17.49 / 0.955 | 1.767 / 2.187 / 19.52 / 0.960 | 1.579 / 2.001 / 19.15 / 0.966 | 1.543 / 2.064 / 18.85 / 0.962 | 1.566 / 2.101 / 19.46 / 0.961 | 1.509 / 2.011 / 18.26 / 0.965 | 1.385 / 1.854 / 16.59 / 0.970 |
| Dongsheng | 1 | 2.263 / 3.514 / 12.81 / 0.947 | 2.318 / 3.264 / 14.40 / 0.954 | 2.215 / 3.427 / 13.38 / 0.941 | 2.365 / 3.427 / 14.18 / 0.954 | 2.205 / 3.298 / 12.87 / 0.956 | 2.328 / 3.428 / 13.68 / 0.939 | 1.993 / 3.115 / 11.24 / 0.958 | 1.978 / 3.097 / 11.35 / 0.957 | 1.953 / 3.077 / 11.21 / 0.958 |
| Fuzhou | 3 | 1.924 / 2.390 / 18.78 / 0.950 | 3.001 / 3.682 / 26.07 / 0.920 | 1.819 / 2.350 / 16.12 / 0.954 | 2.130 / 2.731 / 18.61 / 0.934 | 1.774 / 2.334 / 16.70 / 0.953 | 1.657 / 2.180 / 16.62 / 0.961 | 1.819 / 2.337 / 17.43 / 0.961 | 1.616 / 2.129 / 15.65 / 0.965 | 1.592 / 2.104 / 15.88 / 0.965 |
| Guangzhou | 3 | 1.836 / 2.214 / 18.28 / 0.943 | 2.889 / 3.442 / 25.82 / 0.931 | 1.659 / 2.025 / 15.80 / 0.962 | 1.422 / 1.814 / 14.30 / 0.969 | 1.367 / 1.757 / 14.17 / 0.970 | 1.424 / 1.814 / 14.60 / 0.968 | 1.320 / 1.669 / 13.64 / 0.970 | 1.350 / 1.722 / 13.97 / 0.971 | 1.340 / 1.721 / 13.93 / 0.971 |
| Haikou | 3 | 2.318 / 2.909 / 17.58 / 0.926 | 2.881 / 3.552 / 20.18 / 0.914 | 2.094 / 2.615 / 14.56 / 0.938 | 1.805 / 2.327 / 14.16 / 0.952 | 1.776 / 2.296 / 14.01 / 0.953 | 1.823 / 2.358 / 14.44 / 0.950 | 1.803 / 2.350 / 14.07 / 0.952 | 1.801 / 2.324 / 14.07 / 0.952 | 1.781 / 2.296 / 14.08 / 0.953 |
| Hami | 1 | 1.733 / 2.203 / 13.61 / 0.967 | 1.859 / 2.607 / 13.24 / 0.968 | 1.673 / 2.117 / 13.02 / 0.971 | 1.645 / 2.127 / 12.86 / 0.968 | 1.465 / 1.972 / 11.27 / 0.974 | 1.670 / 2.278 / 13.24 / 0.963 | 1.300 / 1.883 / 9.71 / 0.977 | 1.357 / 1.945 / 10.05 / 0.976 | 1.318 / 1.879 / 9.87 / 0.977 |
| Hangzhou | 4 | 3.187 / 4.522 / 23.71 / 0.905 | 3.590 / 4.807 / 28.38 / 0.901 | 3.098 / 4.454 / 21.75 / 0.911 | 2.962 / 4.488 / 20.03 / 0.905 | 2.905 / 4.462 / 19.68 / 0.908 | 2.977 / 4.486 / 20.73 / 0.910 | 3.050 / 4.505 / 21.46 / 0.913 | 2.934 / 4.402 / 20.62 / 0.911 | 2.824 / 4.382 / 19.41 / 0.913 |
| Harbin | 1 | 2.729 / 3.482 / 20.25 / 0.919 | 3.058 / 3.793 / 24.54 / 0.918 | 2.471 / 3.272 / 19.75 / 0.930 | 2.240 / 2.900 / 17.65 / 0.946 | 2.242 / 2.886 / 17.49 / 0.949 | 2.509 / 3.235 / 19.64 / 0.935 | 2.487 / 3.135 / 19.03 / 0.949 | 2.277 / 2.900 / 17.97 / 0.952 | 2.237 / 2.849 / 17.82 / 0.953 |
| Hefei | 4 | 1.918 / 2.534 / 17.56 / 0.950 | 2.544 / 3.262 / 22.45 / 0.930 | 1.911 / 2.565 / 16.21 / 0.947 | 1.862 / 2.405 / 16.71 / 0.960 | 1.776 / 2.321 / 16.16 / 0.962 | 1.801 / 2.371 / 16.58 / 0.959 | 1.857 / 2.418 / 16.94 / 0.961 | 1.784 / 2.347 / 16.22 / 0.962 | 1.652 / 2.225 / 15.25 / 0.962 |
| Jiamusi | 1 | 2.115 / 2.899 / 17.18 / 0.931 | 1.817 / 2.576 / 14.59 / 0.946 | 2.067 / 2.817 / 18.14 / 0.926 | 1.848 / 2.556 / 16.07 / 0.947 | 1.742 / 2.479 / 14.64 / 0.950 | 1.805 / 2.580 / 14.84 / 0.944 | 1.710 / 2.513 / 13.61 / 0.949 | 1.671 / 2.419 / 13.78 / 0.952 | 1.640 / 2.424 / 13.44 / 0.952 |
| Jinan | 1 | 1.623 / 2.106 / 13.43 / 0.949 | 1.979 / 2.459 / 17.01 / 0.961 | 1.445 / 1.896 / 12.83 / 0.964 | 2.206 / 2.730 / 18.19 / 0.930 | 1.661 / 2.073 / 14.12 / 0.960 | 1.392 / 1.784 / 12.67 / 0.968 | 1.315 / 1.691 / 11.99 / 0.972 | 1.284 / 1.660 / 11.96 / 0.973 | 1.289 / 1.647 / 11.79 / 0.973 |
| Kashgar | 1 | 1.774 / 2.292 / 12.48 / 0.964 | 2.131 / 2.855 / 17.21 / 0.953 | 1.779 / 2.264 / 13.73 / 0.963 | 1.605 / 2.067 / 11.82 / 0.971 | 1.466 / 1.892 / 10.74 / 0.976 | 1.570 / 2.025 / 11.84 / 0.970 | 1.342 / 1.743 / 9.68 / 0.979 | 1.308 / 1.693 / 9.60 / 0.980 | 1.290 / 1.668 / 9.56 / 0.980 |
| Kunming | 2 | 2.014 / 2.658 / 14.41 / 0.937 | 3.431 / 4.475 / 22.21 / 0.898 | 1.627 / 2.166 / 11.62 / 0.947 | 1.585 / 2.062 / 12.01 / 0.950 | 1.485 / 1.962 / 11.47 / 0.956 | 1.525 / 2.041 / 11.81 / 0.954 | 1.481 / 1.994 / 11.76 / 0.956 | 1.436 / 1.949 / 11.55 / 0.957 | 1.414 / 1.920 / 11.09 / 0.959 |
| Lhasa | 1 | 1.673 / 2.223 / 8.99 / 0.909 | 1.730 / 2.248 / 9.44 / 0.936 | 1.248 / 1.630 / 7.07 / 0.960 | 1.425 / 1.772 / 8.06 / 0.955 | 1.333 / 1.669 / 7.50 / 0.960 | 1.338 / 1.732 / 7.44 / 0.953 | 1.215 / 1.582 / 6.76 / 0.962 | 1.261 / 1.629 / 7.08 / 0.962 | 1.236 / 1.600 / 6.90 / 0.963 |
| Nagqu | 1 | 2.139 / 2.889 / 11.84 / 0.867 | 3.816 / 4.908 / 20.04 / 0.838 | 2.027 / 2.722 / 11.29 / 0.877 | 2.096 / 2.786 / 11.80 / 0.871 | 2.030 / 2.714 / 11.39 / 0.878 | 2.043 / 2.718 / 11.44 / 0.876 | 1.971 / 2.662 / 10.94 / 0.880 | 1.981 / 2.666 / 11.06 / 0.883 | 1.988 / 2.684 / 11.11 / 0.881 |
| Nanchang | 3 | 1.579 / 2.153 / 15.79 / 0.958 | 2.100 / 2.747 / 19.55 / 0.948 | 1.483 / 2.078 / 14.38 / 0.960 | 1.556 / 2.047 / 15.48 / 0.960 | 1.429 / 1.934 / 14.57 / 0.965 | 1.417 / 1.949 / 15.10 / 0.964 | 1.364 / 1.863 / 14.57 / 0.967 | 1.375 / 1.894 / 14.38 / 0.967 | 1.311 / 1.842 / 14.03 / 0.969 |
| Nanjing | 4 | 1.608 / 2.019 / 15.27 / 0.971 | 1.974 / 2.436 / 17.92 / 0.957 | 1.691 / 2.126 / 15.31 / 0.967 | 1.593 / 1.976 / 14.83 / 0.972 | 1.531 / 1.907 / 14.93 / 0.974 | 1.526 / 1.927 / 15.15 / 0.972 | 1.459 / 1.831 / 14.56 / 0.975 | 1.441 / 1.826 / 14.39 / 0.974 | 1.431 / 1.822 / 14.32 / 0.975 |
| Nanning | 3 | 1.892 / 2.388 / 18.93 / 0.940 | 2.688 / 3.293 / 27.71 / 0.937 | 1.598 / 2.026 / 16.20 / 0.955 | 1.302 / 1.684 / 14.43 / 0.970 | 1.253 / 1.631 / 14.22 / 0.972 | 1.323 / 1.729 / 14.63 / 0.969 | 1.311 / 1.711 / 14.29 / 0.970 | 1.257 / 1.630 / 14.26 / 0.972 | 1.253 / 1.621 / 14.63 / 0.973 |
| Sanya | 3 | 2.418 / 3.017 / 15.21 / 0.883 | 4.723 / 5.560 / 29.57 / 0.837 | 3.237 / 3.790 / 19.89 / 0.897 | 2.217 / 2.811 / 14.31 / 0.914 | 2.167 / 2.761 / 13.93 / 0.918 | 2.226 / 2.826 / 14.52 / 0.911 | 2.553 / 3.212 / 17.27 / 0.915 | 2.304 / 2.940 / 15.68 / 0.913 | 2.193 / 2.828 / 14.55 / 0.919 |
| Shanghai | 4 | 1.658 / 2.186 / 17.19 / 0.961 | 2.418 / 2.985 / 21.62 / 0.939 | 1.887 / 2.450 / 17.28 / 0.948 | 1.717 / 2.198 / 16.42 / 0.965 | 1.615 / 2.092 / 15.62 / 0.968 | 1.560 / 2.035 / 15.50 / 0.968 | 1.641 / 2.157 / 15.92 / 0.967 | 1.613 / 2.092 / 15.79 / 0.967 | 1.506 / 1.986 / 15.42 / 0.968 |
| Shenyang | 1 | 2.345 / 3.272 / 17.24 / 0.895 | 2.379 / 3.537 / 18.61 / 0.902 | 2.149 / 3.061 / 17.97 / 0.910 | 1.924 / 2.859 / 15.28 / 0.922 | 1.808 / 2.770 / 14.51 / 0.927 | 1.909 / 2.904 / 15.49 / 0.920 | 1.815 / 2.776 / 14.31 / 0.928 | 1.739 / 2.735 / 13.72 / 0.929 | 1.729 / 2.736 / 13.61 / 0.929 |
| Taiyuan | 1 | 3.170 / 4.091 / 18.61 / 0.951 | 3.019 / 3.842 / 18.18 / 0.955 | 2.987 / 3.878 / 18.28 / 0.949 | 3.414 / 4.257 / 20.47 / 0.957 | 3.242 / 4.079 / 19.43 / 0.960 | 3.134 / 4.022 / 18.55 / 0.944 | 2.782 / 3.586 / 16.95 / 0.961 | 2.901 / 3.722 / 17.43 / 0.961 | 2.910 / 3.730 / 17.49 / 0.960 |
| Tianjin | 1 | 2.256 / 3.107 / 17.12 / 0.933 | 2.051 / 2.909 / 15.43 / 0.934 | 2.268 / 3.167 / 17.92 / 0.924 | 2.073 / 2.888 / 15.93 / 0.936 | 1.994 / 2.825 / 15.20 / 0.939 | 2.151 / 2.924 / 17.08 / 0.931 | 2.033 / 2.899 / 15.60 / 0.940 | 1.971 / 2.840 / 15.18 / 0.941 | 1.940 / 2.793 / 15.08 / 0.941 |
| Urumqi | 1 | 3.932 / 5.277 / 23.31 / 0.899 | 3.881 / 5.210 / 24.77 / 0.893 | 3.853 / 5.184 / 24.61 / 0.889 | 4.227 / 5.606 / 26.03 / 0.887 | 4.016 / 5.395 / 24.84 / 0.894 | 3.978 / 5.437 / 24.31 / 0.875 | 3.742 / 5.153 / 22.91 / 0.898 | 3.701 / 5.097 / 23.06 / 0.897 | 3.673 / 5.070 / 22.97 / 0.898 |
| Wuhan | 4 | 1.973 / 2.912 / 16.93 / 0.928 | 2.406 / 3.235 / 20.30 / 0.922 | 1.944 / 2.899 / 15.27 / 0.929 | 2.040 / 2.870 / 17.22 / 0.934 | 1.920 / 2.809 / 16.48 / 0.935 | 1.891 / 2.834 / 16.25 / 0.932 | 1.835 / 2.736 / 15.90 / 0.938 | 1.795 / 2.749 / 15.39 / 0.937 | 1.828 / 2.807 / 16.18 / 0.934 |
| Xining | 1 | 1.697 / 2.289 / 13.16 / 0.945 | 2.521 / 3.330 / 19.30 / 0.926 | 1.687 / 2.226 / 13.57 / 0.955 | 1.549 / 2.062 / 12.50 / 0.959 | 1.497 / 2.025 / 11.99 / 0.962 | 1.619 / 2.138 / 13.49 / 0.959 | 1.507 / 2.063 / 11.53 / 0.960 | 1.492 / 2.048 / 11.52 / 0.962 | 1.493 / 2.062 / 11.58 / 0.963 |
| Yinchuan | 1 | 1.955 / 2.706 / 13.13 / 0.919 | 2.291 / 3.108 / 17.54 / 0.951 | 1.630 / 2.226 / 12.43 / 0.955 | 1.505 / 2.019 / 11.61 / 0.960 | 1.423 / 1.984 / 10.93 / 0.962 | 1.840 / 2.440 / 14.42 / 0.946 | 1.554 / 2.085 / 11.59 / 0.959 | 1.462 / 2.112 / 10.76 / 0.959 | 1.429 / 2.090 / 10.46 / 0.960 |
| Zhengzhou | 1 | 1.441 / 1.942 / 13.00 / 0.963 | 1.772 / 2.335 / 15.04 / 0.955 | 1.586 / 2.140 / 13.35 / 0.955 | 1.325 / 1.727 / 12.28 / 0.970 | 1.241 / 1.643 / 11.45 / 0.973 | 1.314 / 1.739 / 12.11 / 0.969 | 1.511 / 1.925 / 13.34 / 0.972 | 1.304 / 1.749 / 11.76 / 0.972 | 1.206 / 1.635 / 11.39 / 0.973 |
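The four criteria reported in Table 5 follow Eqs. (31) to (34). A minimal NumPy sketch of their computation is given below. The function name `evaluate` and the three-sample example are illustrative assumptions and are not data from the study.

```python
import numpy as np

def evaluate(h_measured, h_estimated):
    """Return MAE, RMSE, MAPE (%) and R for measured and estimated radiation."""
    hm = np.asarray(h_measured, dtype=float)
    he = np.asarray(h_estimated, dtype=float)
    err = hm - he
    mae = np.mean(np.abs(err))                         # Eq. (31)
    rmse = np.sqrt(np.mean(err ** 2))                  # Eq. (32)
    # Eq. (33); undefined when a measured value is zero.
    mape = np.mean(np.abs(err) / hm) * 100.0
    dm = hm - hm.mean()
    de = he - he.mean()
    r = np.sum(dm * de) / np.sqrt(np.sum(dm ** 2) * np.sum(de ** 2))  # Eq. (34)
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R": r}

# Made-up values in MJ/m2, for illustration only.
scores = evaluate([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
```

Because MAPE divides by the measured value, samples with zero measured radiation (for example night-time hours in hourly data) would need to be excluded before applying it.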

Table 6 The total processing time of training models for 30 stations.

| Method | Time (total) | Number of models | Time (average per model) |
|---|---|---|---|
| EM1 | 10.26 s | 30 | 0.34 s |
| EM2 | 50.44 s | 30 | 1.68 s |
| EM3 | 92.56 s | 30 | 3.09 s |
| SVR | 781.23 s | 30 | 26.04 s |
| GPR | 1442.32 s | 30 | 48.08 s |
| ANFIS | 336.13 s | 30 | 11.20 s |
| DBN | 376.79 s | 1 | 376.79 s |
| Functional DBN | 527.02 s | 1 | 527.02 s |
| EC + Functional DBN | 1530.93 s | 4 | 382.73 s |

[Figure: box-plots of mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE, %) and correlation coefficient (R) for EM1, EM2, EM3, SVR, GPR, ANFIS, DBN, FDBN and EC + FDBN; the legend marks outliers, upper and lower adjacent values, 75th and 25th percentiles, median and mean.]

Fig. 8. The box-plots of performance criteria from 10 times of validations on the overall testing dataset.

[Figure: four panels, Beijing (cluster 1), Kunming (cluster 2), Changsha (cluster 3) and Hefei (cluster 4), plotting the probability density function of the estimation error Hm,i – He,i (MJ/m2) for EM1, EM2, EM3, SVR, GPR, ANFIS, DBN, functional DBN and EC + functional DBN.]

Fig. 9. Probabilistic density curves of estimation errors in four different stations.

4.4. Multi-day-ahead daily global solar radiation forecasting

Besides daily solar radiation estimation, a case study of multi-day-ahead forecasting, from one to three days ahead, is also conducted. The difference between estimation and forecasting lies in their inputs. In solar radiation estimation, the meteorological elements measured on the day to be estimated are chosen as inputs, while numerical weather prediction (NWP) data are used as inputs in forecasting. Historical solar radiation data are commonly involved in forecasting as well. In this study, four example stations from different clusters are selected to study multi-day-ahead solar radiation forecasting. For the forecasting models, the same daily elements as provided in Table 2 are utilized, but as numerically predicted values. The historical daily global solar radiation of the last week is also used as input. Therefore, the machine learning methods and the proposed deep-learning-based methods are adopted. The daily global solar radiation data of the whole year of 2015 are chosen as the testing dataset, and the forecasting results are presented in Table 7. The prediction curves using EC + functional DBN, along with the comparison bar-plots of the forecasting metrics, are shown in Fig. 10. From the prediction curves in Fig. 10, the best forecasting results are obtained in Kunming, with the smallest MAE, RMSE and MAPE. The large and stable quantity of daily global solar radiation received by Kunming contributes to the good precision. The indicators of one-day-ahead forecasting using the proposed hybrid method in Kunming are 3.658 MJ/m2 of MAE, 4.603 MJ/m2 of RMSE, 20.68% of MAPE and 0.693 of R, respectively. For Beijing station, the solar radiation curves possess the best seasonal regularity. Hence, the best R indicator is achieved by Beijing. The other three indicators in Beijing are also smaller than those in Changsha and Hefei.

According to Table 7 and the bar-plots, the forecasting errors of the two functional DBN methods are usually smaller, indicating their effectiveness and stability, but the differences among the forecasting models are in fact minor. The reason why the accuracy has not been improved much may be that the intrinsic prediction errors in NWP data restrict the forecasting performance, especially the high errors of predicted wind speed and relative humidity. It can be observed that the forecasting errors rise remarkably in the two-day-ahead and three-day-ahead situations. Due to the increasing prediction errors of NWP, the model is trained to perform more conservative forecasting. As a result, the extreme points of solar radiation curves cannot be well predicted. Based on the above results and analysis, the precision of NWP information is most vital in solar radiation forecasting, as it greatly affects the performance of prediction models. Besides, since the proposed hybrid method can obtain stable results in multi-day-ahead forecasting as well, it is feasible and practical to extend the method into the field of solar radiation forecasting.

Table 7 Comparisons of multi-day-ahead forecasting performance at Beijing, Kunming, Changsha and Hefei stations.

Each cell lists MAE (MJ/m2) / RMSE (MJ/m2) / MAPE (%) / R.

| Ahead | Model | Beijing | Kunming | Changsha | Hefei |
|---|---|---|---|---|---|
| 1-day | SVR | 4.283 / 5.512 / 24.48 / 0.703 | 3.743 / 4.681 / 21.18 / 0.680 | 4.928 / 6.041 / 34.71 / 0.629 | 4.829 / 6.273 / 26.24 / 0.586 |
| 1-day | GPR | 4.552 / 5.864 / 26.19 / 0.659 | 3.698 / 4.673 / 20.87 / 0.679 | 4.827 / 6.000 / 33.73 / 0.637 | 5.474 / 6.994 / 31.70 / 0.467 |
| 1-day | ANFIS | 4.074 / 5.230 / 23.98 / 0.746 | 3.742 / 4.700 / 21.04 / 0.675 | 4.832 / 5.996 / 33.42 / 0.636 | 4.810 / 6.079 / 26.49 / 0.601 |
| 1-day | DBN | 4.058 / 5.226 / 23.41 / 0.743 | 3.699 / 4.639 / 20.63 / 0.687 | 4.785 / 5.943 / 32.93 / 0.644 | 4.855 / 6.111 / 27.12 / 0.596 |
| 1-day | Functional DBN | 4.053 / 5.213 / 23.22 / 0.745 | 3.698 / 4.650 / 20.59 / 0.685 | 4.796 / 5.945 / 33.49 / 0.644 | 4.821 / 6.074 / 26.35 / 0.602 |
| 1-day | EC + Functional DBN | 4.033 / 5.206 / 23.43 / 0.748 | 3.658 / 4.603 / 20.68 / 0.693 | 4.744 / 5.924 / 32.80 / 0.648 | 4.820 / 6.105 / 26.05 / 0.598 |
| 2-days | SVR | 4.474 / 5.581 / 24.87 / 0.704 | 4.250 / 5.207 / 22.62 / 0.575 | 5.429 / 6.493 / 35.55 / 0.549 | 5.464 / 6.617 / 28.96 / 0.489 |
| 2-days | GPR | 5.032 / 6.350 / 28.69 / 0.598 | 4.707 / 5.781 / 24.74 / 0.465 | 5.456 / 6.608 / 34.92 / 0.534 | 6.655 / 8.294 / 38.23 / 0.435 |
| 2-days | ANFIS | 4.466 / 5.535 / 25.24 / 0.710 | 4.282 / 5.281 / 22.57 / 0.559 | 5.642 / 6.661 / 36.04 / 0.517 | 5.613 / 6.781 / 29.23 / 0.456 |
| 2-days | DBN | 4.411 / 5.550 / 23.78 / 0.710 | 4.284 / 5.229 / 22.95 / 0.576 | 5.389 / 6.392 / 34.46 / 0.566 | 5.485 / 6.620 / 29.68 / 0.493 |
| 2-days | Functional DBN | 4.418 / 5.528 / 24.29 / 0.713 | 4.210 / 5.222 / 21.99 / 0.577 | 5.378 / 6.398 / 33.62 / 0.565 | 5.452 / 6.707 / 27.70 / 0.496 |
| 2-days | EC + Functional DBN | 4.331 / 5.521 / 23.02 / 0.708 | 4.270 / 5.234 / 22.38 / 0.574 | 5.350 / 6.380 / 35.44 / 0.568 | 5.479 / 6.605 / 28.30 / 0.493 |
| 3-days | SVR | 4.476 / 5.586 / 24.02 / 0.697 | 4.419 / 5.370 / 23.05 / 0.549 | 5.483 / 6.484 / 36.37 / 0.548 | 5.621 / 6.878 / 30.77 / 0.460 |
| 3-days | GPR | 5.013 / 6.587 / 27.23 / 0.554 | 4.966 / 6.031 / 25.86 / 0.409 | 5.580 / 6.637 / 37.48 / 0.524 | 6.378 / 8.045 / 36.48 / 0.453 |
| 3-days | ANFIS | 4.607 / 5.708 / 24.15 / 0.687 | 4.437 / 5.426 / 22.93 / 0.524 | 5.487 / 6.456 / 35.97 / 0.554 | 5.840 / 7.049 / 31.62 / 0.406 |
| 3-days | DBN | 4.474 / 5.580 / 23.88 / 0.708 | 4.365 / 5.303 / 22.72 / 0.567 | 5.574 / 6.529 / 36.52 / 0.540 | 5.641 / 6.735 / 30.87 / 0.466 |
| 3-days | Functional DBN | 4.489 / 5.613 / 23.74 / 0.705 | 4.386 / 5.281 / 22.98 / 0.572 | 5.559 / 6.544 / 36.51 / 0.536 | 5.598 / 6.690 / 28.49 / 0.476 |
| 3-days | EC + Functional DBN | 4.355 / 5.543 / 23.15 / 0.702 | 4.353 / 5.330 / 22.55 / 0.549 | 5.346 / 6.382 / 35.05 / 0.569 | 5.582 / 6.707 / 29.86 / 0.473 |
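The forecasting inputs described in Section 4.4, i.e. the last week of measured radiation plus NWP values for the target day, can be assembled as in the sketch below. The exact sample layout used in the paper is not stated, so the alignment of history and target day, the array shapes and the function name `build_forecast_samples` are assumptions.

```python
import numpy as np

def build_forecast_samples(radiation, nwp, horizon, n_lags=7):
    """Build (X, y) pairs for horizon-day-ahead forecasting.

    radiation: array of shape (n_days,), measured daily global solar radiation.
    nwp: array of shape (n_days, n_features), predicted weather elements per day.
    Each sample joins the last n_lags measured values up to day t with the NWP
    values of the target day t + horizon.
    """
    X, y = [], []
    for t in range(n_lags - 1, len(radiation) - horizon):
        history = radiation[t - n_lags + 1 : t + 1]
        target_day = t + horizon
        X.append(np.concatenate([history, nwp[target_day]]))
        y.append(radiation[target_day])
    return np.array(X), np.array(y)

# Synthetic placeholder data: 20 days and 3 NWP features per day.
radiation = np.arange(20, dtype=float)
nwp = np.arange(60, dtype=float).reshape(20, 3)
X, y = build_forecast_samples(radiation, nwp, horizon=2)
```

Under this layout, a longer horizon only changes which day supplies the NWP features and the target. This matches the observation that forecasting accuracy is bounded by the quality of the NWP inputs.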

[Figure: for each station, prediction curves under EC + functional DBN (1-, 2- and 3-day-ahead forecasts against the measured daily global solar radiation in MJ/m2 over time in days) and bar-plots of MAE, RMSE, MAPE and R for SVR, GPR, ANFIS, DBN, functional DBN and EC + functional DBN.]

Fig. 10. Prediction curves and comparison of metrics based on the multi-day-ahead forecasting case. (a) Beijing; (b) Kunming; (c) Changsha; (d) Hefei.

4.5. Hourly global solar radiation estimation

In addition to daily solar radiation estimation and prediction, the feasibility of the proposed method for estimating hourly radiation data is also validated in this study. In the hourly case, data measured at daily intervals (daily mean, maximum and minimum temperature, daily sunshine duration, etc.) are not used. Instead, hourly measured data and the solar hour angle are utilized as inputs. The solar hour angle can be computed as [70]:

$$\theta_h = 15 \times (12 - t_s) \tag{35}$$

where t_s is the true solar time in hours, so that θ_h is expressed in degrees. The basic and functional neurons for hourly solar radiation estimation are provided in Table 8. It should be noted that the hourly temperature in the knowledge functions is converted to Kelvin, in order to avoid negative values in the square root and logarithmic functions.

From Table 9, the proposed hybrid method usually obtains the best indicators. In Beijing, the method achieves the best scores on all four indicators, i.e. 0.137 MJ/m2 of MAE, 0.282 MJ/m2 of RMSE, 23.25% of MAPE and 0.950 of R. The probabilistic density curves of estimation errors are shown in Fig. 11. The hybrid method outperforms the others at the Beijing, Kunming and Changsha stations, as it obtains visibly sharper curves. Hence, given accurate hourly measurements of meteorological elements, the proposed method can also be extended to estimate hourly global solar radiation.

Table 8 Basic and functional neurons for estimating hourly global solar radiation.

Neuron          Description
ϕ               latitude
λ               longitude
h               altitude
T_h             hourly mean dry-bulb temperature
R_h             hourly mean relative humidity
W_h             hourly mean wind speed
n_day           the day number in the year
θ_h             solar hour angle
cos ϕ           the cosine function of latitude
sin ϕ           the sine function of latitude
cos δ           the cosine function of daily solar declination
sin δ           the sine function of daily solar declination
cos θ_h         the cosine function of solar hour angle
sin θ_h         the sine function of solar hour angle
√T_h            the square root function of temperature
ln T_h          the logarithmic function of temperature
√(T_h + R_h)    the square root function of the sum of temperature and relative humidity
T_h / R_h       the ratio of temperature to relative humidity
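As an illustration of Eq. (35) and the knowledge functions listed in Table 8, the following minimal Python sketch computes the functional-neuron inputs for one hourly record. It is not the authors' implementation. The function and key names are hypothetical, the declination formula (Cooper's approximation) is an assumption because the paper's own declination equation is not shown in this section, and relative humidity is assumed to be given as a positive percentage.

```python
import math


def solar_hour_angle(true_solar_time_h):
    """Eq. (35): hour angle in degrees from the true solar time in hours."""
    return 15.0 * (12.0 - true_solar_time_h)


def solar_declination_deg(day_of_year):
    # ASSUMPTION: Cooper's approximation; the paper's own formula may differ.
    return 23.45 * math.sin(math.radians(360.0 * (284 + day_of_year) / 365.0))


def functional_neurons(lat_deg, day_of_year, true_solar_time_h, temp_c, rel_humidity):
    """Return the Table 8 knowledge functions for one hourly sample.

    temp_c is the hourly mean dry-bulb temperature in degrees Celsius;
    it is converted to Kelvin so the square root and logarithm stay defined.
    rel_humidity is assumed to be a positive percentage (hypothetical unit choice).
    """
    temp_k = temp_c + 273.15
    phi = math.radians(lat_deg)
    delta = math.radians(solar_declination_deg(day_of_year))
    theta = math.radians(solar_hour_angle(true_solar_time_h))
    return {
        "cos_phi": math.cos(phi),
        "sin_phi": math.sin(phi),
        "cos_delta": math.cos(delta),
        "sin_delta": math.sin(delta),
        "cos_theta_h": math.cos(theta),
        "sin_theta_h": math.sin(theta),
        "sqrt_T": math.sqrt(temp_k),
        "ln_T": math.log(temp_k),
        "sqrt_T_plus_R": math.sqrt(temp_k + rel_humidity),
        "T_over_R": temp_k / rel_humidity,
    }


# Example: a winter hour at solar noon, latitude 39.9 degrees north.
features = functional_neurons(39.9, 15, 12.0, -10.0, 50.0)
```

With a sub-zero Celsius temperature such as -10 °C, the Kelvin conversion keeps both `sqrt_T` and `ln_T` real-valued, which is the motivation stated in the text.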

Table 9 Comparisons of hourly estimation performance at Beijing, Kunming, Changsha and Hefei stations (MAE and RMSE in MJ/m2, MAPE in %).

                     Beijing                       Kunming                       Changsha                      Hefei
Model                MAE    RMSE   MAPE   R        MAE    RMSE   MAPE   R        MAE    RMSE   MAPE   R        MAE    RMSE   MAPE   R
SVR                  0.180  0.306  24.73  0.942    0.240  0.353  32.13  0.931    0.160  0.271  30.43  0.941    0.183  0.268  26.60  0.951
GPR                  0.164  0.285  23.54  0.949    0.216  0.345  57.84  0.934    0.144  0.256  29.10  0.946    0.147  0.259  27.65  0.952
ANFIS                0.154  0.287  24.77  0.948    0.223  0.355  57.47  0.930    0.146  0.267  29.97  0.941    0.146  0.259  28.14  0.950
DBN                  0.148  0.291  24.42  0.947    0.201  0.344  31.24  0.934    0.137  0.270  31.18  0.940    0.146  0.273  28.61  0.945
Functional DBN       0.147  0.289  24.43  0.947    0.199  0.333  29.88  0.938    0.132  0.263  30.04  0.943    0.145  0.266  28.24  0.948
EC + Functional DBN  0.137  0.282  23.25  0.950    0.187  0.340  21.86  0.935    0.125  0.252  29.41  0.948    0.140  0.254  27.30  0.952

Fig. 11. Probabilistic density curves of hourly estimation errors in four different stations. [Figure not reproduced in this text version. Panels: Beijing (cluster 1), Kunming (cluster 2), Changsha (cluster 3) and Hefei (cluster 4); horizontal axis: hourly estimation errors (MJ/m2); vertical axis: probability density function; curves: SVR, GPR, ANFIS, DBN, Functional DBN and EC + Functional DBN.]
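Density curves such as those in Fig. 11 can be obtained from a list of estimation errors with a kernel density estimate. The paper does not state how its curves were computed, so the sketch below is only one plausible approach: a Gaussian kernel with a normal-reference rule-of-thumb bandwidth. The function name `error_density` and the sample errors are hypothetical.

```python
import math


def error_density(errors, grid, bandwidth=None):
    """Gaussian kernel density estimate of `errors`, evaluated at each grid point."""
    n = len(errors)
    if n < 2:
        raise ValueError("need at least two error samples")
    if bandwidth is None:
        mean = sum(errors) / n
        std = math.sqrt(sum((e - mean) ** 2 for e in errors) / (n - 1))
        # ASSUMPTION: normal-reference rule of thumb, h = 1.06 * std * n^(-1/5).
        bandwidth = 1.06 * std * n ** (-0.2)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - e) / bandwidth) ** 2) for e in errors)
        for x in grid
    ]


# Illustrative errors in MJ/m2; these are not data from the paper.
sample_errors = [-0.05, -0.01, 0.0, 0.02, 0.04]
grid = [i / 1000.0 for i in range(-1000, 1001)]
density = error_density(sample_errors, grid)
```

A model whose errors are concentrated near zero yields a taller, narrower curve, which is the sense in which the text describes the hybrid method's curves as sharper.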

5. Conclusion

Daily global solar radiation estimation is an essential step for solar energy utilization. Various meteorological elements, e.g. sunshine duration, wind speed, temperature and humidity, can help improve estimation accuracy, but how to merge them efficiently into an estimation model remains unsolved. Empirical models and machine learning models are commonly used in this field, but they often show poor adaptability and must be re-calibrated before they can be used at another location. Because deep learning models have strong generalization capability and are especially suitable for training on a large number of datasets, they are introduced and validated in this study to estimate daily global solar radiation at different meteorological stations in China. The feasibility of the proposed method for multi-day-ahead global solar radiation forecasting and hourly radiation estimation is studied as well.

DBN, FN and EC are combined in the proposed hybrid estimation method. DBN is a deep learning model built by stacking RBMs, which pre-train the parameters of the DBN and reduce the difficulty of training the entire neural network. FN incorporates empirical knowledge into the estimation model, which further improves its reliability and robustness. EC is a new clustering method used to divide radiation curves, so that similar curves can be estimated by one deep learning model. After validation and comparison, the proposed functional DBN with EC achieves the best performance according to several criteria on the overall testing dataset. The MAE, RMSE, MAPE and R values of the proposed EC + functional DBN are 1.706 MJ/m2, 2.352 MJ/m2, 13.71% and 0.955, respectively, demonstrating its accuracy and superiority.

The present study validates the performance of several DBN models for global solar radiation estimation and prediction. Although a standalone DBN is able to compute more precise results than machine learning models like SVR and ANFIS, those results vary greatly across repeated validation runs, due to the difficulty of training a deep neural network to the global optimum. Therefore, further work can address state-of-the-art training technologies and approaches for deep learning.

Acknowledgments

The research is supported by the National Natural Science Foundation of China (Program No. 51507052) and the Fundamental Research Funds for the Central Universities (Program No. 2018B15414). The authors would also like to extend their gratitude to the China Meteorological Administration.

References

[1] Khosravi A, Koury RNN, Machado L, Pabon JJG. Prediction of hourly solar radiation in Abu Musa Island using machine learning algorithms. Journal of Cleaner Production. 2018;176:63-75.
[2] Monjoly S, Andre M, Calif R, Soubdhan T. Hourly forecasting of global solar radiation based on multiscale decomposition methods: A hybrid approach. Energy. 2017;119:288-98.
[3] Lund H. Renewable heating strategies and their consequences for storage and grid infrastructures comparing a smart grid to a smart energy systems approach. Energy. 2018;151:94-102.
[4] Kalogirou S. The potential of solar industrial process heat applications. Appl Energ. 2003;76(4):337-61.
[5] Yildirim HB, Celik O, Teke A, Barutcu B. Estimating daily Global solar radiation with graphical user interface in Eastern Mediterranean region of Turkey. Renew Sust Energ Rev. 2018;82:1528-37.
[6] Hartner M, Mayr D, Kollmann A, Haas R. Optimal sizing of residential PV-systems from a household and social cost perspective - A case study in Austria. Sol Energy. 2017;141:49-58.
[7] Achour L, Bouharkat M, Assas O, Behar O. Hybrid model for estimating monthly global solar radiation for the Southern of Algeria: (Case study: Tamanrasset, Algeria). Energy. 2017;135:526-39.
[8] Hassan MA, Khalil A, Kaseb S, Kassem MA. Potential of four different machine-learning algorithms in modeling daily global solar radiation. Renew Energ. 2017;111:52-62.
[9] Halabi LM, Mekhilef S, Hossain M. Performance evaluation of hybrid adaptive neuro-fuzzy inference system models for predicting monthly global solar radiation. Appl Energ. 2018;213:247-61.
[10] Quej VH, Almorox J, Arnaldo JA, Saito L. ANFIS, SVM and ANN soft-computing techniques to estimate daily global solar radiation in a warm sub-humid environment. Journal of Atmospheric and Solar-Terrestrial Physics. 2017;155:62-70.
[11] Janjai S, Pankaew P, Laksanaboonsong J, Kitichantaropas R. Estimation of solar radiation over Cambodia from long-term satellite data. Renew Energ. 2011;36(4):1214-20.
[12] Bakirci K. Prediction of global solar radiation and comparison with satellite data. Journal of Atmospheric and Solar-Terrestrial Physics. 2017;152:41-9.
[13] Polo J, Wilbert S, Ruiz-Arias JA, Meyer R, Gueymard C, Suri M, et al. Preliminary survey on site-adaptation techniques for satellite-derived and reanalysis solar radiation datasets. Sol Energy. 2016;132:25-37.
[14] Vindel JM, Valenzuela RX, Navarro AA, Zarzalejo LF. Methodology for optimizing a photosynthetically active radiation monitoring network from satellite-derived estimations: A case study over mainland Spain. Atmos Res. 2018;212:227-39.
[15] Hocaoglu FO. Stochastic approach for daily solar radiation modeling. Sol Energy. 2011;85(2):278-87.
[16] Ayodele TR, Ogunjuyigbe ASO. Prediction of monthly average global solar radiation based on statistical distribution of clearness index. Energy. 2015;90:1733-42.
[17] Chukwujindu NS. A comprehensive review of empirical models for estimating global solar radiation in Africa. Renew Sust Energ Rev. 2017;78:955-95.
[18] Almorox J, Bocco M, Willington E. Estimation of daily global solar radiation from measured temperatures at Canada de Luque, Cordoba, Argentina. Renew Energ. 2013;60:382-7.
[19] Fan JL, Chen BQ, Wu LF, Zhang FC, Lu XH, Xiang YZ. Evaluation and development of temperature-based empirical models for estimating daily global solar radiation in humid regions. Energy. 2018;144:903-14.
[20] Almorox J, Hontoria C, Benito M. Models for obtaining daily global solar radiation with measured air temperature data in Madrid (Spain). Appl Energ. 2011;88(5):1703-9.
[21] Hassan GE, Youssef ME, Mohamed ZE, Ali MA, Hanafy AA. New Temperature-based Models for Predicting Global Solar Radiation. Appl Energ. 2016;179:437-50.
[22] Chelbi M, Gagnon Y, Waewsak J. Solar radiation mapping using sunshine duration-based models and interpolation techniques: Application to Tunisia. Energ Convers Manage. 2015;101:203-15.
[23] Fan JL, Wu LF, Zhang FC, Cai HJ, Zeng WZ, Wang XK, et al. Empirical and machine learning models for predicting daily global solar radiation from sunshine duration: A review and case study in China. Renew Sust Energ Rev. 2019;100:186-212.
[24] Makade RG, Jamil B. Statistical analysis of sunshine based global solar radiation (GSR) models for tropical wet and dry climatic Region in Nagpur, India: A case study. Renew Sust Energ Rev. 2018;87:22-43.
[25] Ogunjobi KO, Kim YJ, He Z. Influence of the total atmospheric optical depth and cloud cover on solar irradiance components. Atmos Res. 2004;70(3-4):209-27.
[26] Komar L, Kocifaj M. Statistical cloud coverage as determined from sunshine duration: a model applicable in daylighting and solar energy forecasting. Journal of Atmospheric and Solar-Terrestrial Physics. 2016;150:1-8.
[27] Liu JD, Linderholm H, Chen DL, Zhou XJ, Flerchinger GN, Yu Q, et al. Changes in the relationship between solar radiation and sunshine duration in large cities of China. Energy. 2015;82:589-600.
[28] Hassan MA, Khalil A, Kaseb S, Kassem MA. Independent models for estimation of daily global solar radiation: A review and a case study. Renew Sust Energ Rev. 2018;82:1565-75.
[29] Quej VH, Almorox J, Ibrakhimov M, Saito L. Estimating daily global solar radiation by day of the year in six cities located in the Yucatan Peninsula, Mexico. Journal of Cleaner Production. 2017;141:75-82.
[30] Hassan GE, Youssef ME, Ali MA, Mohamed ZE, Shehata AI. Performance assessment of different day-of-the-year-based models for estimating global solar radiation - Case study: Egypt. Journal of Atmospheric and Solar-Terrestrial Physics. 2016;149:69-80.
[31] Khorasanizadeh H, Mohammadi K, Jalilvand M. A statistical comparative study to demonstrate the merit of day of the year-based models for estimation of horizontal global solar radiation. Energ Convers Manage. 2014;87:37-47.
[32] Zang H, Cheng L, Ding T, Cheung KW, Wang M, Wei Z, et al. Estimation and validation of daily global solar radiation by day of the year-based models for different climates in China. Renew Energ. 2018.
[33] Marzo A, Trigo-Gonzalez M, Alonso-Montesinos J, Martinez-Durban M, Lopez G, Ferrada P, et al. Daily global solar radiation estimation in desert areas using daily extreme temperatures and extraterrestrial radiation. Renew Energ. 2017;113:303-11.
[34] Loghmari I, Timoumi Y, Messadi A. Performance comparison of two global solar radiation models for spatial interpolation purposes. Renew Sust Energ Rev. 2018;82:837-44.
[35] David M, Luis MA, Lauret P. Comparison of intraday probabilistic forecasting of solar irradiance using only endogenous data. Int J Forecasting. 2018;34(3):529-47.
[36] Zendehboudi A, Baseer MA, Saidur R. Application of support vector machine models for forecasting solar and wind energy resources: A review. Journal of Cleaner Production. 2018;199:272-85.
[37] Mohammadi K, Shamshirband S, Tong CW, Alam KA, Petković D. Potential of adaptive neuro-fuzzy system for prediction of daily global solar radiation by day of the year. Energ Convers Manage. 2015;93:406-13.
[38] Guermoui M, Melgani F, Danilo C. Multi-step ahead forecasting of daily global and direct solar radiation: A review and case study of Ghardaia region. Journal of Cleaner Production. 2018;201:716-34.
[39] Salcedo-Sanz S, Deo RC, Cornejo-Bueno L, Camacho-Gomez C, Ghimire S. An efficient neuro-evolutionary hybrid modelling mechanism for the estimation of daily global solar radiation in the Sunshine State of Australia. Appl Energ. 2018;209:79-94.
[40] Sharifzadeh M, Sikinioti-Lock A, Shah N. Machine-learning methods for integrated renewable power generation: A comparative study of artificial neural networks, support vector regression, and Gaussian Process Regression. Renew Sust Energ Rev. 2019;108:513-38.
[41] Cornejo-Bueno L, Casanova-Mateo C, Sanz-Justo J, Salcedo-Sanz S. Machine learning regressors for solar radiation estimation from satellite data. Sol Energy. 2019;183:768-75.
[42] Srivastava S, Lessmann S. A comparative study of LSTM neural networks in forecasting day-ahead global horizontal irradiance with satellite data. Sol Energy. 2018;162:232-47.
[43] Kaba K, Sarigul M, Avci M, Kandirmaz HM. Estimation of daily global solar radiation using deep learning model. Energy. 2018;162:126-35.
[44] Lago J, De Brabandere K, De Ridder F, De Schutter B. Short-term forecasting of solar irradiance without local telemetry: A generalized model using satellite data. Sol Energy. 2018;173:566-77.
[45] Hanin B. Which Neural Net Architectures Give Rise to Exploding and Vanishing Gradients? Advances in Neural Information Processing Systems 31 (Nips 2018). 2018;31.
[46] Qureshi AS, Khan A, Zameer A, Usman A. Wind power prediction using deep neural network based meta regression and transfer learning. Applied Soft Computing. 2017;58:742-55.
[47] Jiao RH, Huang XJ, Ma XH, Han LY, Tian W. A Model Combining Stacked Auto Encoder and Back Propagation Algorithm for Short-Term Wind Power Forecasting. Ieee Access. 2018;6:17851-8.
[48] Schutz T, Schraven MH, Fuchs M, Remmen P, Muller D. Comparison of clustering algorithms for the selection of typical demand days for energy system synthesis. Renew Energ. 2018;129:570-82.
[49] Zhang Y, Li ZM, Zhang H, Yu Z, Lu TT. Fuzzy c-means clustering-based mating restriction for multiobjective optimization. Int J Mach Learn Cyb. 2018;9(10):1609-21.
[50] Brusco MJ, Shireman E, Steinley D. A Comparison of Latent Class, K-Means, and K-Median Methods for Clustering Dichotomous Data. Psychol Methods. 2017;22(3):563-80.
[51] Ahmed A, Khalid M. An intelligent framework for short-term multi-step wind speed forecasting based on Functional Networks. Appl Energ. 2018;225:902-11.
[52] Fan JL, Wang XK, Wu LF, Zhang FC, Bai H, Lu XH, et al. New combined models for estimating daily global solar radiation based on sunshine duration in humid regions: A case study in South China. Energ Convers Manage. 2018;156:618-25.

[53] Bayrakci HC, Demircan C, Kecebas A. The development of empirical models for estimating global solar radiation on horizontal surface: A case study. Renew Sust Energ Rev. 2018;81:2771-82.
[54] Yaniktepe B, Genc YA. Establishing new model for predicting the global solar radiation on horizontal surface. Int J Hydrogen Energ. 2015;40(44):15278-83.
[55] Ouali K, Alkama R. A new Model of global solar radiation based on meteorological data in Bejaia City (Algeria). Enrgy Proced. 2014;50:670-6.
[56] Despotovic M, Nedic V, Despotovic D, Cvetanovic S. Review and statistical analysis of different global solar radiation sunshine models. Renew Sust Energ Rev. 2015;52:1869-80.
[57] Rivero M, Orozco S, Sellschopp FS, Loera-Palomo R. A new methodology to extend the validity of the Hargreaves-Samani model to estimate global solar radiation in different climates: Case study Mexico. Renew Energ. 2017;114:1340-52.
[58] Chen RS, Ersi K, Yang JP, Lu SH, Zhao WZ. Validation of five global radiation models with measured daily data in China. Energ Convers Manage. 2004;45(11-12):1759-69.
[59] Ayodele TR, Ogunjuyigbe ASO. Performance assessment of empirical models for prediction of daily and monthly average global solar radiation: the case study of Ibadan, Nigeria. International Journal of Ambient Energy. 2016;38(8):803-13.
[60] Chegaar M, Chibani A. Global solar radiation estimation in Algeria. Energ Convers Manage. 2001;42(8):967-73.
[61] Kolebaje OT, Ikusika A, Akinyemi P. Estimating solar radiation in ikeja and port harcourt via correlation with relative humidity and temperature. International Journal of Energy Production and Management. 2016;1(3):253-62.
[62] Ajayi OO, Ohijeagbon OD, Nwadialo CE, Olasope O. New model to estimate daily global solar radiation over Nigeria. Sustainable Energy Technologies and Assessments. 2014;5:28-36.
[63] Hinton GE, Osindero S, Teh YW. A fast learning algorithm for deep belief nets. Neural Comput. 2006;18(7):1527-54.
[64] Wang HZ, Wang GB, Li GQ, Peng JC, Liu YT. Deep belief network based deterministic and probabilistic wind speed forecasting approach. Appl Energ. 2016;182:80-93.
[65] Wu F, Wang ZH, Lu WM, Li X, Yang Y, Luo JB, et al. Regularized Deep Belief Network for Image Attribute Detection. Ieee T Circ Syst Vid. 2017;27(7):1464-77.
[66] Zou L, Wang LC, Xia L, Lin AW, Hu B, Zhu HJ. Prediction and comparison of solar radiation using improved empirical models and Adaptive Neuro-Fuzzy Inference Systems. Renew Energ. 2017;106:343-53.
[67] Li MF, Tang XP, Wu W, Liu HB. General models for estimating daily global solar radiation for different solar radiation zones in mainland China. Energ Convers Manage. 2013;70:139-48.
[68] Zhao N, Zeng XF, Han SM. Solar radiation estimation using sunshine hour and air pollution index in China. Energ Convers Manage. 2013;76:846-51.
[69] Feng L, Lin AW, Wang LC, Qin WM, Gong W. Evaluation of sunshine-based models for predicting diffuse solar radiation in China. Renew Sust Energ Rev. 2018;94:168-82.
[70] Loutfi H, Bernatchou A, Raoui Y, Tadili R. Learning Processes to Predict the Hourly Global, Direct, and Diffuse Solar Irradiance from Daily Global Radiation with Artificial Neural Networks. Int J Photoenergy. 2017.

Highlights

• One sole deep model can be applied at multiple sites to estimate solar radiation.
• The pre-training of RBM reduces the optimization difficulty of deep networks.
• The knowledge from empirical equations is involved by means of functional neurons.
• The embedding clustering method provides initial cluster centers for k-means.
• The proposed method is feasible for both daily and hourly estimation.