Accepted Manuscript Real time energy management and control strategy for micro-grid based on deep learning adaptive dynamic programming
Nan Wu, Honglei Wang

PII: S0959-6526(18)32766-5
DOI: 10.1016/j.jclepro.2018.09.052
Reference: JCLP 14187
To appear in: Journal of Cleaner Production
Received Date: 30 September 2017
Accepted Date: 07 September 2018

Please cite this article as: Nan Wu, Honglei Wang, Real time energy management and control strategy for micro-grid based on deep learning adaptive dynamic programming, Journal of Cleaner Production (2018), doi: 10.1016/j.jclepro.2018.09.052
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Real time energy management and control strategy for micro-grid based on deep learning adaptive dynamic programming

Nan Wu a,b, Honglei Wang a

a School of Management, Guizhou University, Guizhou 550025, China
b School of Mechanical and Electrical Engineering, Guizhou Normal University, Guizhou 550025, China

E-mail addresses: [email protected] (H.L. Wang), [email protected] (N. Wu)
Abstract: With the widespread use of distributed new energy in power systems, energy management problems in the micro-grid have become increasingly significant. Offline or static optimization methods are frequently used to solve these problems, which are typically discrete and nonlinear. The few existing online optimization methods are not only complex but normally consider the distributed generations and loads in the micro-grid as a whole. As a result, the online methods fail to reflect the composition of distributed multi-energy and the contribution that developing new energy makes to reducing the consumption of traditional fossil energy. Moreover, results obtained using either the offline or the online methods can deviate to a certain extent from the true values. Using system control theory, this paper treats the management of distributed energy in the micro-grid as an optimal control problem and, based on deep learning adaptive dynamic programming, establishes a real-time management strategy for distributed energy in the micro-grid. Owing to the introduction of closed-loop feedback, the new management and control strategy is real-time, and the achieved accuracy of managing and controlling the objective function is higher, which suggests that the proposed strategy can improve energy management in the micro-grid. Finally, the real-time performance and effectiveness of the proposed control strategy are demonstrated by simulation.

Key words: Micro-grid, deep learning, adaptive dynamic programming, energy management, real-time, distributed energy
1. Introduction

With the rapidly rising global consumption of fossil energy, carbon emissions and environmental pollution are becoming more and more serious. As a clean energy, renewable energy has therefore been used increasingly all over the world and has become a focus of the future development of global energy. The power system, as a key area of fossil energy consumption, has also begun to gradually increase the use of clean energy such as wind and solar energy, which effectively reduces the consumption of traditional fossil fuels (Newbery, 2012). Among the modes of using renewable energy, the regional distributed multi-energy supply mode integrated with the smart grid is one of the mainstream developments of the smart grid. One region can form a micro-grid, which consists of distributed generations, batteries, and various loads and is a power supply system that can either operate independently or integrate with a power distribution network. By connecting with the distribution network, the micro-grid can sell redundant clean power to the network or purchase power from it if more power is needed (Katiraei et al., 2008). The micro-grid can not only effectively reduce long-distance power transmission losses, but also reduce fossil-fuel power generation and thereby carbon emissions. At present, the micro-grid has been researched for energy services, network expansion, and improving the efficiency of energy use, and there have been some practical applications (Amin and Wollenberg, 2005; Vojdani, 2008; Ipakchi and Albuyeh, 2008). The aim of an energy management strategy (EMS) for the micro-grid is to optimize its operation; for instance, long-term running costs can be optimized by purchasing power from, or selling power to, distribution systems while renewable energy is fully used.
The composition of the micro-grid is complex because it contains a variety of distributed generations and energy storage devices. The energy management problem for the micro-grid is therefore a typical offline nonlinear problem of day-ahead scheduling optimization (Chaouachi et al., 2011; Palma-Behnke et al., 2013; Khodaei, 2014; Shi et al., 2015), which requires prediction before optimization. However, distributed energy is intermittent, and most loads can change instantaneously, making it difficult to obtain predictions close to real values; there is still a gap between the optimal control strategy and the actual operation of the system. At present, offline optimization algorithms are mostly particle swarm optimization, mixed integer programming, sequential quadratic programming, etc. (Choi et al., 2011; Cecati et al., 2011). In addition, an offline optimization strategy is not real-time: it cannot adjust the energy management and control strategy in real time according to the changing environment. If the environment does not change in accordance with the prior forecast, the method becomes invalid, which greatly reduces the efficiency of energy management. To solve the offline optimization problems mentioned above, Siano (Siano et al., 2012) and Pourmousavi (Pourmousavi et al., 2010) proposed real-time energy optimization strategies based on dividing the timeline into small intervals to improve the efficiency of energy management. Other researchers tried to solve the problems of real-time energy management for the micro-grid arising from intermittent energy, loads, and demands (Salinas et al., 2013; Huang et al., 2014; Sun et al., 2014). These methods can handle the optimization of the long-term cost of new energy in the micro-grid under environmental changes, but they all assume that the supply and demand of the system are balanced, ignoring the constraints of operating the distributed power system in the micro-grid. Furthermore, they have no optimal correction target, which makes them difficult to apply to practical control decision-making. To make the energy management strategy for the micro-grid more applicable, real-time, and precise, this paper, based on the closed-loop control principle of approximation and correction, treats the management of distributed energy in the micro-grid as an optimal control problem.
By giving optimal control targets and critic targets and using function approximation structures, we establish, based on deep learning adaptive dynamic programming, a real-time optimization strategy for the management of distributed energy in the micro-grid. The proposed strategy can achieve the following goals: (1) realizing optimal control of the long-term operation cost of the micro-grid, so that new energy can be fully used by the micro-grid, carbon emissions and environmental pollution caused by the consumption of fossil energy can be reduced, and the energy management strategy can be optimized; (2) enabling the micro-grid to provide higher-quality power supply services to users while the energy management is optimized; (3) providing real-time management and control decisions for the energy management of the micro-grid. This paper is organized as follows: the system model and objective function are introduced in Section 2; a deep learning adaptive dynamic programming based strategy of online energy management for the micro-grid is proposed in Section 3; the effectiveness of the proposed strategy is verified in Section 4; and the outcome of the paper is summarized in Section 5.

2. System model

The model of the micro-grid system is described in this section. Assume that the entire micro-grid is run by an operator whose goal is to minimize the cost of power in the micro-grid while ensuring the power quality of the supply service. The models of the distributed energy resources (DERs), including adjustable and non-adjustable parts, are given first, followed by the cost model of the storage devices and the shedding model of the flexible loads, and finally the purchase and sale model between the micro-grid and the distribution network and the overall cost optimization model of the system.

2.1 Overview of the micro-grid system

Various kinds of distributed generation supply units, such as wind, solar, and gas, exist in the micro-grid. The distributed generation supply units are represented as
$G = \{g_1, g_2, g_3, \dots, g_G\}$. To use intermittent energy effectively without causing impacts on the grid, the energy is usually stored before being transmitted to customers. Storage devices in the grid are expressed as $S = \{b_1, b_2, b_3, \dots, b_S\}$. The loads in the grid can be divided into rigid and flexible loads. Operators of the micro-grid can divide or combine the flexible loads to achieve the lowest cost of power supply, while ensuring that the quality of the power supply service of the entire system is not reduced significantly. The loads in the grid can be expressed as $L = \{l_1, l_2, l_3, \dots, l_L\}$. All the distributed power supplies, energy storage devices, and
controllable loads in the micro-grid are connected to the control center through a two-way communication network. The micro-grid energy control center (MGECC) can acquire real-time information on the distributed power supplies and flexible loads, and then adjust these devices through local controllers (LCs) according to the optimization strategy of energy management. The diagram of the system is shown in Figure 1.

[Figure 1 shows solar, wind, hydro, and gas power units, energy storage, and elastic and inelastic loads, each connected through a local controller (LC) to the MGECC.]

Figure 1 Diagram of micro-grid system
2.2 Distributed generation model

In the micro-grid, the distributed power supplies can be divided into two categories. The first comprises the intermittent power sources, such as wind and solar power, whose generated power is constrained by the natural environment (i.e., the intensity of the light and the strength of the wind) rather than manually controlled. The second comprises the adjustable power sources, such as gas and hydro power.

2.2.1 Intermittent distributed renewable generation

Intermittent distributed renewable generation units consist of solar and wind power, expressed as $g_r \in G$. These units can be considered as incurring only maintenance cost and no generation cost; the generation cost of new energy in this paper is therefore taken to be zero. To overcome the instability caused by the intermittent power supplies, generated energy is usually stored in the storage devices before being transmitted to loads.

2.2.2 Traditional power generation

Traditional power generation can be adjusted according to changes of power consumption in the micro-grid. The adjustable power supplies in the micro-grid are mainly gas and hydro power. A hydro generation unit is expressed as $g_w \in G$, with operating constraints

$$0 \le p_w(t) \le p_{w\max} \tag{1}$$

$$|p_w(t) - p_w(t-1)| \le \Delta p_{\max} \tag{2}$$

where $p_{w\max}$ is the maximum generated power of the hydro unit. The generation cost at time $t$ is modeled as (Lin et al., 2007)

$$C_w(p_w(t)) = a_w p_w(t)^2 + \beta_w p_w(t) + \gamma_w \tag{3}$$

where $a_w$, $\beta_w$, $\gamma_w$ are constants. The other adjustable power supply is gas, with generation unit $g_g \in G$; its operating constraints and generation cost at time $t$ are

$$0 \le p_g(t) \le p_{g\max} \tag{4}$$

$$|p_g(t) - p_g(t-1)| \le \Delta p_{g\max} \tag{5}$$

$$C_g(p_g(t)) = a_g p_g(t) \tag{6}$$
where $p_{g\max}$ is the maximum output of the gas station and $a_g$ is a constant.

2.3 Storage device model

In this paper, the storage devices for distributed renewable energy in the micro-grid are storage batteries. For a battery $b \in S$ in the micro-grid, $p_s(t)$ is positive if power is charged at time $t$, and negative if power is discharged at time $t$. The operating constraints are as follows:

$$-p_{d\max} \le p_s(t) \le p_{c\max} \tag{7}$$

$$E_s(t+1) = E_s(t) + p_s(t) \tag{8}$$

$$E_{s\min} \le E_s(t) \le E_{s\max} \tag{9}$$

where $p_{d\max}$ is the maximum allowable discharging rate, $p_{c\max}$ is the maximum allowable charging rate, $E_s(t)$ is the capacity of the battery at time $t$, $E_{s\min}$ is the lowest capacity required to be retained in the discharging state, and $E_{s\max}$ is the maximum capacity of the battery. To avoid the impact of fast charging and discharging on the life of the storage battery, a cost function is used to control the battery, modeled as (Sun et al., 2014)

$$C_s(p_s(t)) = a_s p_s(t)^2 + \beta_s \tag{10}$$
where $a_s$, $\beta_s$ are constants.

2.4 Flexible load model

Loads in the micro-grid mainly consist of rigid and flexible loads. Under normal conditions, the power supply to rigid loads must be ensured. Flexible loads, such as central air conditioning and charging stations for electric vehicles, can be supplied according to the condition of the power supply, or appropriate supply strategies can be made for them. For each load unit $l \in L$ in the micro-grid, the supplied power $p_l(t)$ at time $t$ must satisfy

$$p_{l\min}(t) \le p_l(t) \le p_{l\max}(t) \tag{11}$$

where $p_{l\min}(t)$ is the rigid load that must be satisfied and $p_{l\max}(t)$ is the maximum power demand in the grid. Micro-grid operators should try to meet the needs of customers; the ideal state is that both sides of (11) are equal. However, the power demand of customers is random and unpredictable, so a shortage or surplus of the distributed power supplies can be expected in the micro-grid. In particular, when the shortage of power supplies is severe, many flexible loads in the grid will be cut off to ensure the lowest cost of power supply, causing a sharp decline in customer satisfaction. To avoid this situation, power supplies in the micro-grid must be optimally managed. In this paper, so that micro-grid operators can provide high-quality supply services and avoid cutting flexible loads merely to achieve the minimum cost of power supply, a cost function is used to represent the operation of cutting flexible loads, penalizing the act of sacrificing customer satisfaction for the lowest supply cost:

$$C_l(p_l(t)) = \lambda_l \left(p_{l\max} - p_l(t)\right)^{1/2} \tag{12}$$
where $\lambda_l$ is a positive penalty factor.

2.5 Optimization of the objective function

In the micro-grid, the aim of energy management is to retain the lowest operating cost of power generation while providing stable, high-quality power supplies, and furthermore to make real-time control decisions according to changes of energy. However, the generation of distributed energy resources is intermittent, which challenges the real-time balance of supply and demand in the micro-grid. Faced with this problem, micro-grid operators can neither cut off too many flexible loads when the power supplies are insufficient, nor purchase a large quantity of thermal power to satisfy all customers, which would increase the operating cost. Therefore, a strategy of real-time energy management is needed. Because the environment and the season strongly influence distributed energy resources, insufficiency of power supplies is frequent in the micro-grid. In this situation, micro-grid operators need to purchase power from elsewhere to supply the customers, at cost

$$C_m(p_m(t)) = \rho \, p_m(t) \tag{13}$$

where $\rho$ is the price of power and $p_m(t)$ is the amount of power purchased. To ensure the minimum cost of power supplies, the operators usually will not buy all the needed power, but will buy a portion of it and cut off some flexible loads at the same time. To guarantee the quality of the supply services, a shedding function of flexible load is used to limit the operation of cutting off flexible loads:

$$u(t) = \frac{1}{T}\sum_{t=1}^{T}\frac{p_{\max} - E(p_l(t))}{p_{\max} - p_{\min}} \le \delta \tag{14}$$

where $p_{\max} - p_{\min}$ is the total amount of flexible load that can be cut off, $p_{\max} - E(p_l(t))$ is the amount of flexible load being cut off, and $\delta$ is the removal ratio of flexible loads.
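As a concrete illustration, the shedding-ratio constraint of Eq. (14) can be sketched in a few lines of Python. The period data, the demand cap `p_max`, the rigid floor `p_min`, and the 15 % cap are illustrative assumptions, not values taken from the model itself:

```python
# Hedged sketch of the flexible-load shedding ratio in Eq. (14).
# served[t] is the power actually supplied to the flexible load at period t.

def shedding_ratio(served, p_max, p_min):
    """Average fraction of sheddable load actually cut over T periods."""
    T = len(served)
    return sum((p_max - p) / (p_max - p_min) for p in served) / T

# Example: demand cap 100, rigid floor 50, three periods of served power.
served = [95, 80, 90]
u = shedding_ratio(served, p_max=100, p_min=50)
print(round(u, 3))   # → 0.233  (average removal ratio)
print(u <= 0.15)     # → False  (violates a 15 % shedding cap)
```

A controller enforcing (14) would reject any shedding plan for which this ratio exceeds the chosen $\delta$.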
One of the goals of operating the micro-grid is to retain the lowest long-term cost of power generation and supply, including the costs of generating power, energy storage and conversion, purchasing power, and cutting off flexible loads. The objective function of the optimization is therefore

$$\min \sum_t \big(\omega_w C_w(p_w(t)) + \omega_g C_g(p_g(t)) + \omega_s C_s(p_s(t)) + \omega_l C_l(p_l(t)) + \omega_m C_m(p_m(t))\big) \tag{15}$$

$$\text{s.t. } (1),(2),(4),(5),(7)\text{-}(9),(11),\quad u(t) \le \delta$$
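The per-period cost inside objective (15) can be sketched as follows. The hydro, gas, storage, and shedding coefficients follow the simulation settings of Section 4, while the purchase price `rho`, the load cap `p_l_max`, and the sample inputs are illustrative assumptions:

```python
# Hedged sketch of one period's operating cost from Eq. (15):
# quadratic hydro (Eq. 3), linear gas (Eq. 6), quadratic storage (Eq. 10),
# square-root shedding penalty (Eq. 12) and linear purchase cost (Eq. 13).

def step_cost(p_w, p_g, p_s, p_l, p_m,
              a_w=5.0, b_w=12.0, c_w=0.0,   # hydro coefficients
              a_g=0.2,                      # gas coefficient
              a_s=1.0, b_s=0.0,             # storage coefficients
              lam=0.3, p_l_max=100.0,       # shedding penalty (assumed cap)
              rho=0.5):                     # purchase price (assumed)
    C_w = a_w * p_w**2 + b_w * p_w + c_w
    C_g = a_g * p_g
    C_s = a_s * p_s**2 + b_s
    C_l = lam * (p_l_max - p_l) ** 0.5
    C_m = rho * p_m
    return C_w + C_g + C_s + C_l + C_m   # unit weights, as in Section 4

print(round(step_cost(p_w=1.0, p_g=10.0, p_s=2.0, p_l=96.0, p_m=4.0), 2))  # → 25.6
```

Summing `step_cost` over all periods and minimizing subject to (1)-(11) and $u(t) \le \delta$ reproduces the structure of (15).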
The optimization of the above objective is constrained by (14): the micro-grid operators cannot cut off too many flexible loads in pursuit of the lowest cost, and the removal ratio of flexible loads cannot exceed $\delta$. This is a typical closed-loop control problem. If the objective function is regarded as the controlled object, and the removal ratio of flexible loads is considered as the error of the feedback control, the problem becomes a typical optimal control problem. Meanwhile, real-time performance is one of the basic properties of a control problem, which guarantees the dynamic and real-time properties of the optimization target. To solve such optimal control problems, this paper proposes a deep learning adaptive dynamic programming based approach, which not only meets the characteristics mentioned above but also improves the accuracy of optimization so that the obtained results are closer to the true values. This approach is therefore used to solve the optimization problem and obtain the optimal energy management strategy for the micro-grid.

3. Deep learning adaptive dynamic programming real-time energy management strategy

3.1 Principle of dynamic programming and adaptive dynamic programming

Dynamic systems are ubiquitous in nature, and optimal control is an important branch of dynamic systems theory. Optimal control has been widely used in many fields, such as systems engineering, economic management, and decision making. In 1957, Bellman proposed an effective tool for solving optimal control problems: dynamic programming (Bellman, 1957). The core of this method is Bellman's principle of optimality. Consider a nonlinear system whose dynamic
equation is

$$x(k+1) = F(x(k), u(k), k) \tag{16}$$

where $x(k)$ is the state of the system with given initial state, $u(k)$ is the control input of the system, and $F(\cdot)$ is the utility function of the system. The performance index function of the system can be defined as

$$J(x(i), i) = \sum_{k=i}^{\infty} \gamma^{k-i} F(x(k), u(k), k) \tag{17}$$

where $\gamma$ is a discount factor. The target of control is to find the admissible control (decision) sequence $u(k),\ k=1,2,3,\dots$ such that the cost function (17) reaches its minimum. According to Bellman's principle, the minimum cost of any state starting from time $k$ consists of two parts: the minimum cost required at time $k$, and the sum of the minimum costs from time $k+1$ to infinity, i.e.

$$J^*(x(k)) = \min_{u(k)}\{F(x(k), u(k), k) + J^*(x(k+1))\} \tag{18}$$

At this point, the corresponding control $u(k)$ at time $k$ is also optimal, i.e.

$$u^*(k) = \arg\min_{u(k)}\{F(x(k), u(k), k) + J^*(x(k+1))\} \tag{19}$$
Therefore, dynamic programming is a powerful tool for solving optimal control problems (Zhang et al., 2013). However, it is difficult to use dynamic programming directly in practical applications, because optimal control must act on a system in time sequence and give out the optimal control indicators of a control sequence, while the performance index function of the entire system is completely unknown before the sequence is completed, i.e., the problem of the "curse of dimensionality". In 1977, Werbos first proposed adaptive dynamic programming (ADP) to solve this problem. The ADP method essentially uses function approximation structures to approximate the cost function and control strategy of dynamic programming, so as to obtain solutions for the optimal control of nonlinear systems (Powell, 2007). A typical ADP structure is shown in Figure 2.
[Figure 2 shows the action network producing $u(k)$ from the state $x(k)$, the model network predicting $x(k+1)$, and the critic network evaluating $J(x(k))$ and $J(x(k+1))$ against the utility $l(x(k), u(k))$.]

Figure 2 Structure of adaptive dynamic programming
In Figure 2, the performance index function can be expressed as

$$J(x_k) = l(x_k, u(x_k)) + J(x_{k+1}) \tag{20}$$

where $u(x_k)$ is the feedback control variable, and the performance index functions $J(x_k)$ and $J(x_{k+1})$ are outputs of the critic network. If the weight of the critic network is $w$, the right side of (20) can be written as

$$d(x_k, w) = l(x_k, u(x_k)) + J(x_{k+1}, w) \tag{21}$$

The left side of (20) can be written as $J(x_k, w)$, and then by adjusting the weight $w$ of the critic network the mean square error function is minimized:

$$w^* = \arg\min_w \left\| J(x_k, w) - d(x_k, w) \right\|^2 \tag{22}$$

so the optimal performance index function is obtained. According to the principle of optimality, the optimal control has to satisfy the first-order necessary condition:

$$\frac{\partial J(x_k)}{\partial u(k)} = \frac{\partial l(x_k, u(k))}{\partial u(k)} + \frac{\partial J(x_{k+1})}{\partial u(k)} = \frac{\partial l(x_k, u(k))}{\partial u(k)} + \frac{\partial J(x_{k+1})}{\partial x_{k+1}}\,\frac{\partial f(x_k, u(k))}{\partial u(k)} = 0 \tag{23}$$

The obtained optimal control is

$$u^* = \arg\min_{u(k)} \left\| \frac{\partial l(x_k, u(k))}{\partial u(k)} + \frac{\partial J^*(x_{k+1})}{\partial x_{k+1}}\,\frac{\partial f(x_k, u(k))}{\partial u(k)} \right\|$$
(24)

In recent years, the theory of adaptive dynamic programming has developed further. Online adaptive methods have been proposed to solve optimal control problems of nonlinear continuous systems, and optimal tracking and stabilization problems of nonlinear discrete systems (Vamvoudakis et al., 2009; Vamvoudakis and Lewis, 2010; Dierks and Jagannathan, 2009; Vamvoudakis, 2012). Adaptive dynamic programming has therefore become an extremely important method for solving both scientific and engineering problems of modern complex systems (Lewis and Liu, 2013). The method has already been used in many practical engineering systems, such as population control, temperature control, and tracking control of UAV helicopters (David et al., 2013; Zhang et al., 2012; Yadav et al., 2007; Tang et al., 2007).

3.2 Deep learning adaptive dynamic programming real-time energy management strategy

By simplifying the basic form of adaptive dynamic programming, this paper proposes a deep learning adaptive dynamic programming optimization strategy to solve the energy management problem of the micro-grid. This novel method does not require predicting the system state and contains only the critic network and the action network, reducing the dependence of the controller on the system model. The structure has the framework of a closed-loop feedback control system; a basic diagram is shown in Figure 3. Deep learning originates from neural networks; it can give deep-level representations of features and avoid the complexity of selecting features manually (LeCun et al., 2015).

[Figure 3 shows the action network mapping the state $x(k)$ to the control $u(k)$ applied to the dynamic system, and the critic network evaluating $J(x(k))$ and $J(x(k+1))$.]

Figure 3 Basic structure of deep learning adaptive dynamic programming
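The critic update implied by Eqs. (20)-(22) can be sketched with a one-weight linear critic; the dynamics, utility function, and learning rate below are illustrative assumptions chosen only to show the temporal-difference mechanics:

```python
# Hedged sketch of the critic update in the loop of Fig. 3:
# a linear critic J(x) ≈ w·x is nudged toward the Bellman target
# l(x, u) + J(x') from Eq. (20).

def critic_update(w, x, u, lr=0.05):
    x_next = 0.9 * x + u            # assumed system model
    l = x**2 + u**2                 # assumed utility (stage cost)
    target = l + w * x_next         # right-hand side of Eq. (20)
    error = w * x - target          # temporal-difference error
    return w - lr * error * x       # gradient step on the squared error

w = 0.0
for _ in range(50):                 # repeated sweeps from state x = 1, u = 0
    w = critic_update(w, x=1.0, u=0.0)
print(round(w, 3))                  # → 2.217, converging toward w* = 10
```

With these numbers the fixed point satisfies $w^* = 1 + 0.9\,w^*$, i.e. $w^* = 10$; each sweep closes a constant fraction of the remaining gap, which is the convergence behaviour the critic network exploits.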
In this system, the input of the action network is the state of the controlled object, i.e., the amounts of distributed power in the micro-grid, $p_w$, $p_g$ and $p_s$. The output is the control strategy, i.e., the amount of power $p_m$ that has to be purchased in order to satisfy the minimum-cost objective function. The power $p_m$ would be zero if the constraint given by (14) did not exist; in that case, the system cost is minimal while no power is purchased from the distribution network. However, because of the constraint (14), the amount $p_m$ has to be optimized to compute the minimum cost of the system, and since the output of the action network $u(t) = p_m(t)$ needs to satisfy the constraints, the output $p_m$ has to be trained. The input of the critic network is the state of the controlled object together with the control strategy; the output is the cost function. The utility function can be defined according to the target of control; the objective function in this paper is defined to realize the lowest cost of generating and supplying power in the micro-grid. To improve the approximation of the objective function and achieve higher calculation accuracy, it is necessary to control the error of the output of the action network, i.e., to reduce it to the lowest acceptable value (the minimum-error principle of control systems). Therefore, for both the action network and the critic network in Figure 3, a multilayer deep learning neural network is used. The topology is shown in Figure 4.
[Figure 4 shows a feed-forward network with inputs $X_1, \dots, X_n$ and $U$, hidden layers with weights $W_r^1$ and $W_r^2$, and a single output $S$.]

Figure 4 Topology of the multilayer deep learning neural network
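The topology of Figure 4 amounts to a feed-forward pass through stacked nonlinear layers. A minimal sketch follows; the layer sizes, tanh activation, weight scale, and random seed are all illustrative assumptions rather than the paper's actual configuration:

```python
# Sketch of the multilayer (deep) approximator of Fig. 4: a feed-forward
# network with two hidden tanh layers mapping the system state to one output.
import numpy as np

rng = np.random.default_rng(0)
sizes = [3, 8, 8, 1]                       # input, two hidden layers, output
weights = [rng.normal(0, 0.5, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x):
    for W in weights[:-1]:
        x = np.tanh(x @ W)                 # hidden layers
    return x @ weights[-1]                 # linear output layer

state = np.array([0.2, -0.1, 0.5])         # e.g. normalized [p_w, p_g, p_s]
print(forward(state).shape)                # → (1,)  one scalar per state
```

Training such a network consists of adjusting `weights` by backpropagation so the scalar output tracks either the control strategy (action network) or the cost function (critic network).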
For the micro-grid, environmental change causes fluctuations in the output of the distributed power supplies, which requires real-time decision making as a basis for energy management. To retain the long-term minimum cost of generating and supplying power in the micro-grid, appropriate measures must be adopted for the flexible loads, including purchasing a portion of the demanded power and removing some of the flexible loads (or energy-saving measures); these have been reflected in the objective function of system optimization described earlier in this paper. Based on the above model structure, a flowchart of the implementation of the deep learning adaptive dynamic programming optimization strategy, enabling real-time optimization of the objective function, is shown in Figure 5.

[Figure 5 is a flowchart: set the parameters of the action and critic networks; collect distributed-energy data and initialize both networks; check whether the current objective-function values meet the user-satisfaction constraint; if not, train the action network on the state of the controlled object to obtain the control strategy, then train the critic network on the state and control strategy to obtain the cost function; save the control strategy, calculate the corrected value and the next state; repeat until the corrected objective function meets the user-satisfaction constraint, then output the real-time control strategy, the amount of purchased power, and the objective-function values.]
Fig 5 Flowchart of micro-grid energy management strategy based on deep learning adaptive dynamic programming
The steps are as follows:

1) Set the parameters of the system structure, the action network, and the critic network. Both networks use a multilayer deep learning neural network; the settings include the learning rate, the numbers of nodes in the input, hidden, and output layers, and the maximum number of iterations.

2) Collect data and set the input parameters of the controlled state. The collected data include the real-time output power of the distributed power supplies.

3) Initialize the training of the action network and critic network, including the input variables and the learning rate.

4) Use equation (14) to calculate and decide whether the constraint on satisfaction with the service is met. If not, the next network training is carried out; if yes, no changes are made to the current values.

5) Use the state of the controlled object as the input of the action network, train the action network, update its weights, and obtain the output, i.e., the control strategy.

6) Use the state of the controlled object and the control strategy as the inputs of the critic network, train the critic network, update its weights, and obtain the output, i.e., the cost function.

7) Save the control strategy and calculate the corrected objective function.

8) Repeat steps 4)-7) during the observation period until the end of the optimization process, and obtain the output, i.e., the control strategy.

After training the action network, the amount of purchased power $p_m(t)$ is obtained. If both the condition on the objective function and satisfaction with the service are met, the training stops; otherwise, the action and critic networks are trained repeatedly until an optimization strategy meeting the conditions is found.

4. Simulation and analysis

To analyze the effectiveness of the proposed optimal control strategy, the distributed power supplies in the micro-grid were sampled every 15 minutes, in accordance with the standard of the Power Grid Corp. The total length of the acquired data was 465 minutes, as shown in Figure 6. The parameters of the distributed power cost models were set as $a_w = 5$, $\beta_w = 12$, $\gamma_w = 0$, $a_g = 0.2$, $a_s = 1$, $\beta_s = 0$, $\lambda_l = 0.3$, $\delta = 0.4$, and the weights of the optimization objective were set as $\omega_w = \omega_g = \omega_s = \omega_l = \omega_m = 1$. The minimum demand of rigid load was set to 50% of the maximum load demand. To guarantee the service quality of the power supplies in the micro-grid, the rate of cutting off loads had to be less than 15% of the flexible loads. The capacity of each storage battery in the micro-grid was 6 kW, and a capacity of 3 kW was set during operation. The distributed wind and photovoltaic power was converted to the power of standard frequency and voltage through the storage devices and then transmitted to customers.
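The alternation of steps 4)-8) can be compressed into a drastically simplified training loop. Everything here is an illustrative assumption: the linear actor and critic stand in for the deep networks, the dynamics $x' = 0.9x$ and stage cost are invented, and the state data are toy values:

```python
# Hedged sketch of the actor-critic alternation in steps 4)-8):
# one critic step (fit the Bellman target) and one actor step
# (descend the stage cost) per collected state, repeated until convergence.

def train(states, lr=0.01, epochs=200):
    theta, w = 1.0, 0.0                       # actor / critic weights
    for _ in range(epochs):
        for x in states:                      # step 2: collected data
            u = theta * x                     # step 5: actor output (e.g. p_m)
            cost = x**2 + u**2                # assumed stage cost
            target = cost + 0.9 * w * x       # step 6: target with x' = 0.9 x
            w -= lr * (w * x - target) * x    # critic weight update
            theta -= lr * (2 * u) * x         # step 5: reduce control cost
    return theta, w

theta, w = train(states=[0.5, 1.0, 1.5])
print(0 < theta < 1e-3, w > 0)   # → True True  (actor decays, critic learns)
```

The real method replaces both linear maps with the deep networks of Figure 4 and adds the shedding-constraint check of step 4) as the stopping condition.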
[Figure 6: three panels plotting $P_w$ (MW), $P_g$ (MW), and $P_s$ (MW) against time.]
Fig 6 Amount of power of the distributed power supplies and energy storage devices
The collected data and the above parameters were substituted into the deep learning adaptive dynamic programming strategy proposed in this paper. A five-layer neural network was used in deep learning to train and optimize the action network and the critic network.
[Figure 7: (a) training iterations, showing the regression of outputs against targets (Output ≈ 1·Target, R = 1); (b) training times, iterations, and errors, showing the mean squared error over 167 epochs.]
Figure 7 Training process of deep learning
Figure 7 shows that good training results were obtained using the deep learning method. The target values of the training process were basically consistent with the output values, and the best approximation, with a mean square error of 4.3967×10⁻²³, was achieved after 167 iterations. The results suggest that the proposed method converges quickly to the actual values of the objective function, and moreover that the obtained control strategy was real-time and accurate. Figure 8 shows the process of approximating the objective cost function of the micro-grid using the energy management and control strategy proposed in this paper.
[Figure: cost (approximately 1.0382×10⁴) plotted against time, converging as the optimal control is approached]
Figure 8 Approximating process of the objective function of the micro-grid with optimal cost control
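The approximation process behind Figure 8 rests on Bellman's principle of optimality: the cost-to-go estimate is iterated until it stops changing. A toy value-iteration sketch illustrates this; the states, actions, stage costs and transition rule below are illustrative assumptions, not the paper's micro-grid model.

```python
# Toy value iteration illustrating the iterative approximation of an
# optimal cost function; states, actions and costs are illustrative only.
import itertools

states = range(3)     # e.g. battery level: low / mid / high (hypothetical)
actions = range(2)    # e.g. 0 = buy from grid, 1 = discharge (hypothetical)
gamma = 0.9

def stage_cost(s, a):
    return [[4, 1], [2, 1], [2, 5]][s][a]   # arbitrary illustrative costs

def next_state(s, a):
    return min(2, s + 1) if a == 0 else max(0, s - 1)

V = [0.0, 0.0, 0.0]
for it in itertools.count():
    # Bellman backup: best action's cost plus discounted cost-to-go
    V_new = [min(stage_cost(s, a) + gamma * V[next_state(s, a)]
                 for a in actions) for s in states]
    if max(abs(a - b) for a, b in zip(V_new, V)) < 1e-9:
        break
    V = V_new

print(it, [round(v, 3) for v in V])   # iterations needed, optimal cost-to-go
```

The cost estimates rise monotonically toward the fixed point of the Bellman equation, mirroring the convergence of the curve in Figure 8.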
Meanwhile, Figure 9 shows the real-time power supply strategy in the micro-grid. When the distributed power supplies were insufficient, the micro-grid operators could follow the principle of lowest cost and buy a portion of the demanded power from the distribution network to meet the power requirements of most flexible loads.
[Figure: two panels of Power (MW) versus time; panel (a) compares the maximum demanded power with the power purchased from the distribution network, panel (b) compares the maximum demanded power, the total supplied power and the minimum demanded power]
(a) Maximum demanded power and purchased power (b) Maximum demanded power, supplied power and minimum demanded power
Figure 9 Strategy of real-time power supply in the micro-grid
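The lowest-cost supply rule just described can be sketched for a single interval: cover demand with distributed generation first, then buy the shortfall from the distribution network, and shed flexible load only when buying is the costlier option. All prices, limits and names here are hypothetical, not values from the paper.

```python
# Hypothetical sketch of the lowest-cost supply decision described in
# the text. Prices and limits are illustrative assumptions.

def dispatch(demand_kw, distributed_kw, grid_price, shed_price,
             max_shed_kw):
    """Return (purchased_kw, shed_kw) for one scheduling interval."""
    shortfall = max(0.0, demand_kw - distributed_kw)
    if grid_price <= shed_price:          # buying is cheaper: buy it all
        return shortfall, 0.0
    shed = min(shortfall, max_shed_kw)    # otherwise shed flexible load first
    return shortfall - shed, shed

print(dispatch(700.0, 500.0, 0.4, 0.9, 60.0))   # cheap grid: buy 200, shed 0
print(dispatch(700.0, 500.0, 0.9, 0.4, 60.0))   # dear grid: shed 60, buy 140
```

In the paper's strategy this comparison is made continuously by the control network rather than by a fixed price rule, but the economic trade-off is the same.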
Figure 9 also shows that the micro-grid operators cut off some flexible loads from the network in order to achieve the lowest cost. This can prompt some users in the micro-grid to change their power consumption habits and to adopt advanced energy-saving technologies, especially for large power-consuming devices such as central air conditioners. Furthermore, it can drive some flexible loads to develop appropriate power consumption strategies so that their power demand is met as far as possible. The simulation results also indicate that the cost components of the power supplies greatly affected the optimization strategy of power supply in the micro-grid. This effect leaves room for engineers to further improve and upgrade the relevant technologies and equipment, for example by improving the efficiency of power generation from new energy and by developing storage batteries with larger capacities and lower charging and discharging costs. Achieving these goals will not only greatly improve the efficiency of energy management for micro-grids but also help to meet global energy-saving and emission-reduction targets.

5. Conclusion

Based on the optimal control theory of closed-loop feedback, a real-time energy management strategy for micro-grids based on deep learning adaptive dynamic programming was proposed in this paper. Using deep learning adaptive dynamic programming, the cost function and control strategy of dynamic programming can be approximated. Bellman's principle of optimality was used to solve the optimal control problem in real time and to obtain the performance indicator function, so that the defined optimization problem could be solved. Compared with offline management strategies and some online strategies, the proposed method was real-time, reliable and accurate. Analysis of the collected real-time data using the adaptive neural network of deep learning shows that the accuracy of the decision data for real-time energy management of the micro-grid was greatly improved. Moreover, thanks to the real-time correction of the control error, the control strategy moved ever closer to the actual changes and was able to follow their trend. The simulation analysis proves the effectiveness of the proposed management strategy. The strategy yields the best value of the objective function from optimal real-time data, which helps micro-grid operators make online real-time decisions, optimize the cost of operation and improve the efficiency of utilizing new energy. The benefit will be that carbon emissions and environmental pollution caused by the extensive use of fossil fuels are reduced effectively.
References

Amin, S.M. & Wollenberg, B.F. 2005. 'Toward a smart grid: Power delivery for the 21st century.' IEEE Power & Energy Magazine, 3, 34-41.
Bellman, R.E. 1957. 'Dynamic Programming.' Princeton: Princeton University Press.
Chaouachi, A., Kamel, R.M., Andoulsi, R. & Nagasaka, K. 2011. 'Multi-objective intelligent energy management for a micro-grid.' IEEE Transactions on Industrial Electronics, 60, 468-476.
Choi, S., Park, S., Kang, D.J., Han, S.J. & Kim, H.M. 2011. 'A micro-grid energy management system for inducing optimal demand response.' in: Proceedings of IEEE International Conference on Smart Grid Communications, Brussels, Belgium, 47, 19-24.
Cecati, C., Citro, C. & Siano, P. 2011. 'Combined operations of renewable energy systems and responsive demand in a smart grid.' IEEE Transactions on Sustainable Energy, 2, 468-476.
Dierks, T. & Jagannathan, S. 2009. 'Optimal tracking control of affine nonlinear discrete-time systems with unknown internal dynamics.' in: Proceedings of the 48th IEEE Conference on Decision and Control, Shanghai, China: IEEE, 6750-6755.
David, N., Hassan, Z. & Sarangapani, J. 2013. 'Neural network based optimal adaptive output feedback control of a helicopter UAV.' IEEE Transactions on Neural Networks and Learning Systems, 4, 1061-1073.
Huang, Y., Mao, S. & Nelms, R.M. 2013. 'Adaptive electricity scheduling in microgrids.' IEEE Transactions on Smart Grid, 5, 270-281.
Ipakchi, A. & Albuyeh, F. 2009. 'Grid of the future: Are we ready to transition to a smart grid?' IEEE Power & Energy Magazine, 7, 52-62.
Katiraei, F., Iravani, R., Hatziargyriou, N. & Dimeas, A. 2008. 'Micro-grids management.' IEEE Power & Energy Magazine, 6, 54-65.
Vamvoudakis, K.G. & Lewis, F.L. 2012. 'Online solution of nonlinear two-player zero-sum games using synchronous policy iteration.' International Journal of Robust and Nonlinear Control, 22, 1460-1483.
Khodaei, A. 2014. 'Micro-grid optimal scheduling with multi-period islanding constraints.' IEEE Transactions on Power Systems, 29, 1383-1392.
Liu, Y., Tan, C.S. & Lo, K.L. 2007. 'The effects of location marginal price on electricity market with distributed generation.' Proceedings of the CSEE, 27, 89-97.
Lewis, F.L. & Liu, D.R. 2013. 'Reinforcement learning and approximate dynamic programming for feedback control.' New York: John Wiley & Sons.
LeCun, Y., Bengio, Y. & Hinton, G. 2015. 'Deep learning.' Nature, 521, 436-444.
Newbery, D.M. 2012. 'Reforming competitive electricity markets to meet environment targets.' Economics of Energy & Environmental Policy, 1, 69-82.
Powell, W.B. 2007. 'Approximate Dynamic Programming: Solving the Curses of Dimensionality.' New York: John Wiley & Sons.
Pourmousavi, S., Nehrir, M., Colson, C. & Wang, C. 2010. 'Real-time energy management of a stand-alone hybrid wind micro-turbine energy system using particle swarm optimization.' IEEE Transactions on Sustainable Energy, 1, 193-201.
Palma-Behnke, R., Benavides, C., Lanas, F., Severino, B. & Reyes, L. 2013. 'A microgrid energy management system based on the rolling horizon strategy.' IEEE Transactions on Smart Grid, 4, 996-1006.
Siano, P., Cecati, C., Yu, H. & Kolbusz, J. 2012. 'Real time operation of smart grids via FCN networks and optimal power flow.' IEEE Transactions on Industrial Informatics, 8, 944-952.
Salinas, S., Li, M., Li, P. & Fu, Y. 2013. 'Dynamic energy management for the smart grid with distributed energy resources.' IEEE Transactions on Smart Grid, 4, 2139-2151.
Sun, S., Dong, M. & Liang, B. 2014. 'Joint supply, demand, and energy storage management towards micro-grid cost minimization.' in: Proceedings of IEEE International Conference on Smart Grid Communications, 14, 109-114.
Shi, W., Xie, X., Chu, C.C. & Gadh, R. 2015. 'Distributed optimal energy management in micro-grids.' IEEE Transactions on Smart Grid, 6, 1137-1146.
Tang, G., Zhao, Y. & Zhang, B. 2007. 'Optimal output tracking control for nonlinear systems via successive approximation approach.' Nonlinear Analysis: Theory, Methods & Applications, 66, 1365-1377.
Vojdani, A. 2008. 'Smart Integration.' IEEE Power and Energy Magazine, 6, 71-79.
Vamvoudakis, K.G., Vrabie, D. & Lewis, F.L. 2009. 'Online policy iteration based algorithms to solve the continuous-time infinite horizon optimal control problem.' in: Proceedings of the 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, Nashville, USA: IEEE, 36-41.
Vamvoudakis, K.G. & Lewis, F.L. 2010. 'Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem.' Automatica, 46, 878-888.
Yadav, V., Padhi, R. & Balakrishnan, S.N. 2007. 'Robust/optimal temperature profile control of a high-speed aerospace vehicle using neural networks.' IEEE Transactions on Neural Networks, 18, 1115-1128.
Zhang, T.P., Lu, Y., Zhu, Q. & Shen, Q.K. 2012. 'Adaptive dynamic surface control with unmodeled dynamics.' in: Proceedings of the 31st Chinese Control Conference (CCC), Hefei, China: IEEE, 2859-2864.
Zhang, H.G., Zhang, X., Luo, Y.H. & Yang, J. 2013. 'An overview of research on adaptive dynamic programming.' Acta Automatica Sinica, 39, 303-311.