Approximate dynamic programming based supplementary reactive power control for DFIG wind farm to enhance power system stability


Author's Accepted Manuscript

Approximate Dynamic Programming Based Supplementary Reactive Power Control for DFIG Wind Farm to Enhance Power System Stability Wentao Guo, Feng Liu, Jennie Si, Dawei He, Ronald Harley, Shengwei Mei

www.elsevier.com/locate/neucom

PII: S0925-2312(15)00870-X
DOI: http://dx.doi.org/10.1016/j.neucom.2015.03.089
Reference: NEUCOM15691

To appear in: Neurocomputing

Received date: 1 September 2014
Revised date: 8 March 2015
Accepted date: 19 March 2015

Cite this article as: Wentao Guo, Feng Liu, Jennie Si, Dawei He, Ronald Harley, Shengwei Mei, Approximate Dynamic Programming Based Supplementary Reactive Power Control for DFIG Wind Farm to Enhance Power System Stability, Neurocomputing, http://dx.doi.org/10.1016/j.neucom.2015.03.089

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting galley proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Approximate Dynamic Programming Based Supplementary Reactive Power Control for DFIG Wind Farm to Enhance Power System Stability ✩,✩✩

Wentao Guo (a), Feng Liu (a), Jennie Si (b,∗), Dawei He (c), Ronald Harley (c), Shengwei Mei (a)

(a) State Key Laboratory of Power Systems, Department of Electrical Engineering, Tsinghua University, Beijing 100084, China
(b) Department of Electrical Engineering, Arizona State University, Tempe, AZ 85287, USA
(c) School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA

Abstract

Reactive power control of doubly fed induction generators (DFIGs) has been a hot topic in transient stability control of power systems in recent years. By using a new online supplementary learning control (OSLC) approach based on the theory of approximate dynamic programming (ADP), this paper develops an optimal and adaptive design method for the supplementary reactive power control of DFIGs to improve the transient stability of power systems. To augment the reactive power command of the rotor-side converter (RSC), a supplementary controller is designed to reduce the voltage sag at the point of common coupling during a fault, and to mitigate the active power oscillation of the wind farm after a fault. As a result, the transient stability of both the DFIGs and the power system is enhanced. For the supplementary controller design, an action dependent cost function is introduced to make the OSLC model-free and completely data-driven. Furthermore, a least-squares based policy iteration algorithm is employed to train the supplementary controller with convergence and stability guarantees. By using such techniques, the supplementary reactive power controller can be trained directly from data measurements, and can therefore adapt to system or external changes without an explicit offline system identification process. Simulations carried out in Power System Computer Aided Design/Electro Magnetic Transient in DC System (PSCAD/EMTDC) show that the OSLC based supplementary reactive power controller can significantly improve the transient performance of the wind farm and enhance the transient stability of the power system after severe faults.

Keywords: Approximate dynamic programming, Doubly fed induction generator (DFIG), Reactive power control, Online supplementary learning control

2010 MSC: 00-01, 99-00

✩ This work was supported in part by the National Natural Science Foundation of China (No. 51377092, No. 51321005) and the Special Fund of National Basic Research Program of China (No. 2012CB215103).
✩✩ An abbreviated version of some portions of this article appeared in [1] as part of the IJCNN 2014 Conference Proceedings, published under IEEE copyright.
∗ Corresponding author. Email address: [email protected] (Jennie Si)

Preprint submitted to Neurocomputing, June 30, 2015

1. Introduction

With rising concerns about fossil energy shortages and environmental protection, wind power generation has grown rapidly in recent years. As one of the most important types of wind generators, doubly fed induction generators (DFIGs) are widely used for their flexibility in active/reactive power control and low cost [2, 3].

As the penetration level of grid-connected DFIGs increases rapidly, the question of how the integration of DFIG based wind farms affects overall power system stability has been brought to the forefront of wind power research. How the system behaves during and after a severe fault is especially critical and has drawn extensive attention. Usually, a fault in the power system may cause a voltage sag at the point of common coupling (PCC) of the wind farm, which in turn leads to a temporary imbalance between the input wind power and the output electrical power of DFIGs. As a result, large current surges can occur in both the stator and the rotor windings of DFIGs. Furthermore, the input-output power imbalance may excite shaft vibrations of DFIGs, resulting in low-frequency oscillations of the active power output of the wind farm [3]. Two important issues therefore need to be considered in the control design of DFIGs: the first is how to protect the converters of DFIGs from overcurrent and maintain uninterrupted operation of DFIGs during a fault, which is referred to as the problem of low voltage ride through (LVRT) [4, 5]; the second is how to control DFIGs to enhance system stability during and after a fault, provided that DFIGs can successfully ride through the fault [3, 6].

Transient reactive power control of wind farms is an effective method to solve the aforementioned problems. In [3, 6, 7, 8, 9, 10], dynamic reactive compensators such as static synchronous compensators (STATCOMs) and static Var compensators (SVCs) are used to provide transient reactive power compensation. Nevertheless, due to economic concerns, many wind farms based on DFIGs are not equipped with dynamic reactive power compensators [11].

This makes it necessary to fully exploit the dynamic reactive power regulation capability of the DFIGs themselves through proper control strategies. In [11], a reactive power control strategy for a DFIG based wind farm is proposed to supply reactive power to a nearby fixed speed induction generator (FSIG) based wind farm during a fault. In [2], reactive power control of DFIGs under both normal and on-fault operating conditions is comprehensively studied. However, neither the optimality nor the adaptability of the controller is considered in the reactive power control strategies of [2, 11].

As an optimal and adaptive control method, approximate dynamic programming (ADP) [12], also called adaptive dynamic programming [13, 14], is used extensively to design power system controllers. ADP can obtain the optimal control policy while circumventing the need to solve the associated Hamilton-Jacobi-Bellman (HJB) equation directly. ADP has proved effective in many power system control problems, including turbogenerator control [15, 16], wind farm control [1, 3, 6, 17], dynamic stochastic optimal power flow control [18, 19], and inter-area damping control [20, 21, 22].

In [3], [6], and [17], coordinated reactive power control of DFIG and STATCOM is realized by using adaptive critic designs (ACD) [23], direct heuristic dynamic programming (direct HDP) [24], and goal representation heuristic dynamic programming (GrHDP) [25], respectively. In [10], GrHDP is employed to coordinate the control of DFIG, STATCOM, and a high-voltage direct current (HVDC) link. It should be noted that ACD is essentially an offline design approach that requires training a model network in advance. On the other hand, direct HDP and GrHDP are model-free learning methods based on stochastic approximation, which have no theoretical guarantee of deterministic stability during learning. Besides, ACD, direct HDP, and GrHDP use gradient descent as the training algorithm, which is less efficient in utilizing training samples, sensitive to learning parameters, and slow to converge. In summary, these existing ADP approaches cannot guarantee stability while conducting online and model-free learning. This has hindered their application to realistic problems, since unmodeled system dynamics and uncertainties can only be handled effectively by online learning approaches.

Different from these ADP approaches, the reactive power controller proposed in this paper is designed by using an online supplementary learning control (OSLC) [26] method based on ADP. In the OSLC, a policy iteration algorithm [27] is employed. As such, we achieve not only a stability guarantee but also fast convergence during learning. Under this problem formulation, we are able to solve for the cost function by the least squares method, which is efficient in both computation time and sample utilization. We also introduce an action dependent cost function so that the learning process is independent of the system model. The training proceeds along with online sample acquisition and is completely data-driven. Therefore, the proposed OSLC is suitable for online optimal control and system adaptation in an ever-changing environment.

In this paper, we extend the work in [1] to design an optimal adaptive reactive power controller for DFIGs when additional dynamic reactive power compensators are not available. The control objective is to reduce the voltage sag at the PCC during a fault, to damp the active power oscillation of the wind farm after a fault, and ultimately to enhance system stability and dynamic performance. Compared to [1], there are two advancements in this paper. First, we take two realistic design constraints into consideration, namely, the power rating of the DFIG and the current limit of the converters. These parameters are crucially important for the protection of the DFIG and its converters in real applications. Accordingly, a limit is set on the reference value of the d-axis current in the RSC. Second, the contribution of the supplementary reactive power control to enhancing power system stability is comprehensively studied by simulation. The influence of the supplementary reactive power control for the DFIG based wind farm on the bulk power system, such as on conventional generators and tie lines, is shown.

To summarize, the major contributions of this paper are as follows. First, different from the ADP approaches used in the literature, a new ADP approach, the OSLC, is used for online training of the reactive power controller, which is both model-free and stable. Second, since the OSLC is suitable for online implementation, we show that the reactive power controller can re-optimize the control performance online after unmodeled changes in the system. Third, the impact of the supplementary reactive power control for the DFIG based wind farm on power system stability is studied extensively.

The rest of this paper is organized as follows. Modeling of the benchmark power system and the DFIG wind farm is introduced in Section 2. In Section 3, both the structure and the training algorithm of the supplementary reactive power controller are presented. Simulation results are shown in Section 4 to demonstrate the online optimization and adaptation capability of the proposed controller. The conclusion is given in Section 5.

2. Modeling of Power System and Wind Farm

2.1. Benchmark Power System

In this paper, we use the 12-bus benchmark power system from [3, 28] to study the reactive power control of a DFIG based wind farm. The single-line diagram of the system is illustrated in Figure 1. The test system contains three geographical areas. Area 1 is a generation center. Generator G1 in area 1 is modeled as an ideal voltage source, which makes bus 9 an infinite bus. The other generator in this area, G2, is modeled in detail as a hydro generator with a governor and an exciter. Area 3 is a load center with some local generation represented by a thermal generator G3, which is also modeled in detail with a governor and an exciter. Most of the load demand in area 3 is met by generation from area 1, through two 230 kV transmission lines and one 345 kV transmission line. Between area 1 and area 3 is area 2. Similar to [3], a 400 MW wind farm is connected to bus 12 in area 2. Part of the wind generation is consumed locally, and the rest is delivered to the load center in area 3. The wind farm only has some fixed-capacitor compensation at the high tension side of the step-up transformer. No dynamic reactive compensator such as a STATCOM or SVC is installed in or around the wind farm. As discussed above, a fault in the system, such as at bus 1, can cause a voltage sag at bus 6 and an oscillation in the active power output of the wind farm. In what follows, a supplementary reactive power controller for the wind farm is developed to reduce the voltage sag and to mitigate the active power oscillation.

Figure 1: Single-line diagram of the benchmark system [3, 28]

2.2. DFIG Based Wind Farm

It is not easy to mathematically model a wind farm with a large number of DFIGs. Modeling all DFIGs in detail would create a huge simulation overhead. As such, representing a wind farm by a simplified model has been the focus of modeling efforts. According to [29], when the collective response of a wind farm is investigated for stability studies, a reduced model using one single DFIG can be successfully applied. Hence, similar to the approaches in [3, 30, 10, 6], we model the DFIG wind farm as a single equivalent DFIG with a wind turbine.

The schematic of the DFIG wind generator is shown in Figure 2. Physically, the wind turbine drives the induction generator through a shaft system. For transient stability studies, the shaft system is modeled as a combination of a low-speed shaft, a high-speed shaft, and a gearbox between them. Details on the modeling of the mechanical system can be found in [3]. The stator of the DFIG is directly connected to the power grid, while the wound rotor of the DFIG is connected to the grid through a back-to-back ac/dc/ac converter, which consists of a rotor-side converter (RSC), a grid-side converter (GSC), and a dc-link capacitor in between.

Figure 2: Schematic of DFIG [3]

To achieve bidirectional power flow between the rotor and the grid, the RSC and the GSC are four-quadrant converters based on insulated-gate bipolar transistors (IGBTs). To protect the RSC from overcurrent, a crow-bar circuit is used to short-circuit the RSC during a severe fault.

The RSC realizes the active and reactive power control of the stator in a decoupled manner [3, 6, 31]. Usually, the rotor current in the abc reference frame is projected onto the stator-flux oriented reference frame, whereby the rotor current is transformed into d-axis and q-axis components, i.e., $i_{dr}$ and $i_{qr}$. The stator active power $P_s$ is determined by the q-axis component, while the stator reactive power $Q_s$ is determined by the d-axis component. A schematic of the RSC control is shown in Figure 3. It includes a very fast inner loop for current tracking and a relatively slow outer loop for active and reactive power regulation. The inner loop generates voltage commands for the pulse-width modulation (PWM) converters through two proportional-integral (PI) controllers. The outer loop, in turn, generates reference current commands for the inner loop through two PI controllers. To keep the DFIG working at unity power factor, the reactive power command $Q_s^*$ is normally set to zero in a wind farm with adequate reactive power compensation.

Figure 3: Schematic of RSC control [3, 6, 31]

The GSC is controlled to maintain the dc-link voltage and to regulate the reactive power exchange between the GSC and the power grid [3, 6, 31]. Different from the RSC, a stator-voltage oriented reference frame is employed for the control of the GSC. In such a reference frame, the dc-link voltage $V_{dc}$ is determined by the d-axis component of the current fed into the grid, while the reactive power exchange $Q_g$ is determined by the q-axis component. A schematic of the GSC control is shown in Figure 4. Similar to the RSC, the GSC control scheme also includes a fast inner loop for current tracking and a slow outer loop for dc-link voltage and reactive power regulation. As the converter rating is only 25%-30% of the generator rating [3] and the converter is mainly used for active power exchange, the command value of the reactive power exchange is usually set to $Q_g^* = 0$.

Figure 4: Schematic of GSC control [3, 6, 31]

3. Supplementary Reactive Power Control of Wind Farm

3.1. Supplementary Learning Structure for Transient Reactive Power Control of DFIGs

According to the above analysis, there are two objectives for the transient reactive power control of DFIGs: 1) to reduce the voltage sag at the PCC during a fault; 2) to damp the active power output oscillation of the wind farm after a fault. To accomplish these goals, the voltage at bus 6, $V_6$, and the active power output of the wind farm, $P_{g4}$, are fed into the reactive power controller [3]. A regulatory signal, $\Delta Q_s^*$, is generated to augment the original RSC reactive power command, as illustrated in Figure 5. The idea of such a supplementary controller is to compensate for the on-fault voltage sag by enforcing reactive power injection from the DFIGs to the grid, and to damp the oscillation of active power after the fault by rapidly regulating the reactive power output of the DFIGs.

The input/output relationship of the supplementary dynamic reactive power controller is also shown in Figure 5. The input is denoted as a vector, $x = [\Delta V_6, \Delta P_{g4}]^T$, in which $\Delta V_6$ is the scaled voltage deviation and $\Delta P_{g4}$ is the scaled active power deviation. The supplementary reactive power command $\Delta Q_s^*$ is determined by the output of the supplementary controller, $u_s$, and a fixed modulation factor $Q_B$.

In Figure 5, $P_{g4,ref}$ and $V_{6,ref}$ are the steady-state values of the active power output of the wind farm and the voltage of bus 6, respectively. In this way, the output of the supplementary reactive power controller in steady state is 0. The supplementary reactive power controller is activated only when the system is in a transient state.

Figure 5: Introducing a supplementary control signal in the reactive power control loop of RSC (signal path: $V_6 - V_{6,ref}$ divided by $V_B$ and $P_{g4} - P_{g4,ref}$ divided by $P_B$ feed the ADP supplementary controller; its output $u_s$ multiplied by $Q_B$ gives $\Delta Q_s^*$, which is added to $Q_{s,0}^*$ to form $Q_s^*$ for the RSC control)

Taking the power rating and the converter current limit into consideration, a maximum limit is set for the rotor current, which is denoted as $i_{r,m}$. Hence, the current reference values for the RSC control should satisfy

$$(i_{qr}^*)^2 + (i_{dr}^*)^2 \le (i_{r,m})^2, \tag{1}$$

where $i_{qr}^*$ and $i_{dr}^*$ are the reference values for the q-axis and d-axis rotor currents, respectively. Since the reactive power controller is added in the d-axis control loop of the RSC control, we limit $i_{dr}^*$ by the following constraint:

$$-\sqrt{(i_{r,m})^2 - (i_{qr}^*)^2} \le i_{dr}^* \le \sqrt{(i_{r,m})^2 - (i_{qr}^*)^2}. \tag{2}$$
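Constraint (2) amounts to clamping the d-axis current reference to the headroom left by the q-axis reference. A minimal sketch follows; the function name `limit_idr_ref` and the handling of a q-axis reference that already exceeds the rotor current limit (zero d-axis headroom) are assumptions, not taken from the paper.

```python
import math

def limit_idr_ref(idr_ref, iqr_ref, ir_max):
    """Clamp the d-axis reference so that iqr_ref^2 + idr_ref^2 stays within ir_max^2."""
    headroom_sq = ir_max ** 2 - iqr_ref ** 2
    # Assumption: if the q-axis reference alone uses up the limit, no d-axis current is allowed.
    bound = math.sqrt(max(headroom_sq, 0.0))
    return max(-bound, min(bound, idr_ref))
```

For example, with a 20 kA limit and a 16 kA q-axis reference, the d-axis reference is confined to the interval from -12 kA to 12 kA.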

3.2. Controller Training Using the OSLC

For the supplementary reactive power control of DFIGs, an instantaneous cost function is defined as

$$U(x_k, u_{s,k}) = R_V \Delta V_{6,k}^2 + R_P \Delta P_{g4,k}^2 + R_u u_{s,k}^2, \quad k = 0, 1, 2, \cdots, \tag{3}$$

where $R_V$, $R_P$, and $R_u$ are positive numbers, and $k$ is the discrete time step. The action dependent cost function for the supplementary controller $u_s(x_k)$ is defined as

$$Q_{u_s}(x_k, u_{s,k}) = U(x_k, u_{s,k}) + \sum_{i=k}^{\infty} U(x_{i+1}, u_s(x_{i+1})). \tag{4}$$

According to Bellman's principle of optimality [32], the optimal cost function $Q^*(x_k, u_{s,k})$ satisfies

$$Q^*(x_k, u_{s,k}) = U(x_k, u_{s,k}) + \min_{u_{s,k+1}} Q^*(x_{k+1}, u_{s,k+1}). \tag{5}$$

The optimal supplementary controller can be computed from

$$u_s^*(x_k) = \arg\min_{u_{s,k}} Q^*(x_k, u_{s,k}). \tag{6}$$
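A step that is implicit between definition (4) and the policy evaluation equation (7) is that the action dependent cost function of a fixed controller $u_s$ obeys a one-step recursion. Writing (4) at time $k+1$ with the action $u_s(x_{k+1})$ and substituting back gives:

```latex
\begin{align*}
Q_{u_s}(x_{k+1}, u_s(x_{k+1}))
  &= U(x_{k+1}, u_s(x_{k+1})) + \sum_{i=k+1}^{\infty} U(x_{i+1}, u_s(x_{i+1})) \\
  &= \sum_{i=k}^{\infty} U(x_{i+1}, u_s(x_{i+1})), \\
Q_{u_s}(x_k, u_{s,k})
  &= U(x_k, u_{s,k}) + Q_{u_s}(x_{k+1}, u_s(x_{k+1})).
\end{align*}
```

The optimality equation (5) has the same structure, with the controller's action at step $k+1$ replaced by the minimizing action.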

Equation (5) is essentially a discrete-time HJB equation. Since the HJB equation cannot be solved analytically in general, we employ a new ADP based approach, the OSLC [26], to solve the optimal reactive power control problem approximately and iteratively.

The basic idea of the OSLC is to complement an existing controller with a supplementary controller based on ADP. In this way, the prior knowledge embedded in the original controller can be utilized. At the same time, the OSLC makes the original controller “smarter” by incorporating an actor-critic structure [12]. The critic evaluates the current supplementary controller, while the actor produces a supplementary control signal to improve the control performance. In the OSLC, the policy iteration algorithm [13, 27, 33] is used to train the actor-critic structure. The policy iteration algorithm iterates between two steps: policy evaluation via the critic, and policy improvement via the actor guided by the critic. The algorithm can be described as follows.

Step 1: policy evaluation

$$Q^{(i)}(x_k, u_{s,k}) = U(x_k, u_{s,k}) + Q^{(i)}(x_{k+1}, u_s^{(i)}(x_{k+1})), \quad i = 0, 1, 2, \cdots, \ \forall x_k. \tag{7}$$

In the above equation, $Q^{(i)}(x_k, u_{s,k})$ is a cost function of the form (4) corresponding to the supplementary controller $u_s^{(i)}(x_k)$. When updating the critic, a small probing noise should be added to the supplementary control signal $u_{s,k} = u_s^{(i)}(x_k)$. In the case study, it is found that the probing noise can be very small, although it is necessary. The cost function $Q^{(i)}(x_k, u_{s,k})$ can be approximated by approximators such as a multilayer perceptron (MLP), a radial basis function (RBF) network, and others. The simplest option for the cost function approximator is the following quadratic function,

$$Q^{(i)}(x_k, u_s^{(i)}(x_k)) = [x_k^T, u_s^{(i)}(x_k)] \, S^{(i)} \, [x_k^T, u_s^{(i)}(x_k)]^T, \tag{8}$$

where $S^{(i)} \in \mathbb{R}^{3 \times 3}$ can be obtained by solving a least squares problem. Define

$$\phi(x_k, u_{s,k}) = [\Delta V_{6,k}^2, \ 2\Delta V_{6,k}\Delta P_{g4,k}, \ 2\Delta V_{6,k}u_{s,k}, \ \Delta P_{g4,k}^2, \ 2\Delta P_{g4,k}u_{s,k}, \ u_{s,k}^2]^T$$

and

$$W^{(i)} = [S_{11}^{(i)}, S_{12}^{(i)}, S_{13}^{(i)}, S_{22}^{(i)}, S_{23}^{(i)}, S_{33}^{(i)}]^T,$$

where $S_{pq}^{(i)}$ is the element of $S^{(i)}$ in the $p$th row and the $q$th column. The least squares method goes as follows. First, $N$ sample data pairs are collected, i.e., $\phi(x_k, u_{s,k}) - \phi(x_{k+1}, u_s^{(i)}(x_{k+1}))$ and $U(x_k, u_{s,k})$ for time steps $k, k+1, \cdots, k+N-1$, where typically $N \ge L$ and $L = 6$ is the number of elements of $W^{(i)}$. These samples are arranged column by column and denoted as $\psi_k \in \mathbb{R}^{L \times N}$ and $\mu_k \in \mathbb{R}^{1 \times N}$, respectively. Then $W^{(i)}$ can be obtained as

$$W^{(i)} = (\psi_k^{\dagger})^T \mu_k^T, \tag{9}$$

where $\psi_k^{\dagger}$ is the pseudoinverse of $\psi_k$. Note that the least squares method is efficient in data utilization and does not need any learning parameter.

Step 2: policy improvement

$$u_s^{(i+1)}(x_k) = \arg\min_{u_{s,k}} Q^{(i)}(x_k, u_{s,k}), \quad i = 0, 1, 2, \cdots, \ \forall x_k. \tag{10}$$

$u_s^{(i+1)}(x_k)$ can also be parameterized by a function approximator. If the cost function is parameterized in the quadratic form (8), $u_s^{(i+1)}(x_k)$ can be solved directly from (10) as the following linear feedback controller:

$$u_s^{(i+1)}(x_k) = K_V^{(i+1)} \Delta V_{6,k} + K_P^{(i+1)} \Delta P_{g4,k} = -\frac{S_{13}^{(i)}}{S_{33}^{(i)}} \Delta V_{6,k} - \frac{S_{23}^{(i)}}{S_{33}^{(i)}} \Delta P_{g4,k}, \tag{11}$$

where $K_V^{(i+1)}$ and $K_P^{(i+1)}$ are the feedback gains of $\Delta V_6$ and $\Delta P_{g4}$ in the $(i+1)$th iteration, respectively.

Remark 1. The supplementary reactive power controller can be trained from data measurements, and can therefore adapt to system or external changes without an explicit offline system identification process. Before learning, the supplementary controller, i.e., the actor in the OSLC, is initialized as a zero-output controller. Then the controller is trained during online system operation. It should be noted that the supplementary controller cannot be trained if the system operates steadily at the equilibrium. Fortunately, the power system is constantly excited by random events, such as load changes, wind speed variations, and transmission line faults. Once the system deviates from the equilibrium, the responses of $V_6$ and $P_{g4}$ can be measured. Such data are referred to as training samples. After enough training samples are collected online, the OSLC can compute the action dependent cost function online by the least squares method. Accordingly, the supplementary controller can be updated online. This is referred to as an online training iteration. By use of an action dependent cost function, the learning control can be implemented in a model-free and data-driven manner.

Remark 2. As a policy iteration method, the above algorithm guarantees system stability during learning and converges to the optimal cost function and the optimal supplementary controller [26]. Besides, the policy iteration algorithm is highly efficient in policy search, with a quadratic convergence rate [13].

Remark 3. An effective interface neurocontroller (INC) for coordinated reactive power control of DFIG and STATCOM is designed in [3], where an ACD approach is used. When training the INC, a forced training phase is necessary. In this phase, a pseudorandom binary signal (PRBS) is injected into the controlled system. However, such a PRBS may not be allowed in a real power system in operation. For the approach proposed in this paper, the probing noise for the policy evaluation is found to be much smaller than the PRBS in the case study, which avoids undesired negative effects on online system operation.

4. Simulation Study

The simulation study is carried out in Power System Computer Aided Design/Electro Magnetic Transient in DC System (PSCAD/EMTDC). The configuration and parameters of the benchmark power system can be found in [3, 28, 1]. The wind farm is modeled as a reduced model using one equivalent DFIG with one wind turbine. The parameters of the equivalent wind turbine and DFIG are the same as in [3, 1].

Two cases are investigated in this section. The first case tests the online optimization capability of the supplementary reactive power controller, while the second tests the adaptability of the proposed controller when the system operating condition changes.

In the simulation, the following configuration is used: the wind speed is 11 m/s; the steady-state active power output of the wind farm is $P_{g4,ref} = 300$ MW; the steady-state voltage of bus 6 is $V_{6,ref} = 1.02$ pu; the steady-state reactive power command of the DFIG is $Q_{s,0}^* = 0$; the scale factors are $V_B = 230$ kV and $P_B = 400$ MW; the modulation factor is $Q_B = 40$ MVar; the maximum rotor current is $i_{r,m} = 20$ kA; the probing noise is uniformly distributed in $[-0.01, 0.01]$; the sampling time step is 0.001 s; and the parameters of the instantaneous cost function are $R_V = R_P = 1000$ and $R_u = 10$. The quadratic cost function approximation (8) is used for the OSLC, which is a locally quadratic approximation to the cost function around a certain operating point [34]. $N = 3000$ training samples are collected for the policy evaluation in every training iteration. That is, the supplementary reactive power controller is updated every 3 s while the power system is operating online. The online training algorithm is given in Table 1.

4.1. Case 1

The parameters of the supplementary controller during training are shown in Figure 6. $K_V$ and $K_P$ are the feedback gains of $\Delta V_6$ and $\Delta P_{g4}$, respectively, as defined in (11). It can be seen that the training process converges within 10 iterations.

Table 1: Online training of the supplementary reactive power controller

Algorithm: Online training of the supplementary reactive power controller
Step 1. Initialization: $i = 0$, initial supplementary controller $u_s^{(0)}(x_k) = 0$, number of samples for the policy evaluation $N$.
Step 2. Implement the supplementary controller $u_s^{(i)}(x_k)$ in the controlled system and collect $N$ samples for the policy evaluation.
Step 3. Obtain the cost function $Q^{(i)}(x_k, u_s^{(i)}(x_k))$ by using (9).
Step 4. Obtain the supplementary controller $u_s^{(i+1)}(x_k)$ by using (11).
Step 5. $i = i + 1$. Go to Step 2.
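The steps of Table 1 can be sketched on a toy problem as follows. This is an illustrative sketch only, not the authors' PSCAD/EMTDC implementation: the linear plant `A_TOY`, `B_TOY` standing in for the power system, the random state resets that mimic disturbances, the probing-noise amplitude, the sample count, and all function names are assumptions. The plant is used only to generate samples; the learning steps (9) and (11) see nothing but the recorded data.

```python
import numpy as np

def features(x, u):
    """Quadratic basis phi(x, u) matching W = [S11, S12, S13, S22, S23, S33]."""
    dv, dp = x
    return np.array([dv * dv, 2 * dv * dp, 2 * dv * u, dp * dp, 2 * dp * u, u * u])

def train_oslc(A, B, r_v=1000.0, r_p=1000.0, r_u=10.0,
               n_iter=10, n_samples=300, noise=0.1, seed=0):
    """Least-squares policy iteration for u = K @ x on a toy plant x+ = A x + B u."""
    rng = np.random.default_rng(seed)
    K = np.zeros(2)                      # Step 1: zero-output initial controller
    history = [K.copy()]
    x = rng.uniform(-1.0, 1.0, size=2)
    for _ in range(n_iter):
        psi = np.empty((6, n_samples))   # columns: phi_k - phi_{k+1}
        mu = np.empty(n_samples)         # instantaneous costs U_k
        for j in range(n_samples):       # Step 2: run the controller, collect samples
            if j % 10 == 0:
                x = rng.uniform(-1.0, 1.0, size=2)   # assumed random disturbance
            u = K @ x + rng.uniform(-noise, noise)   # control plus probing noise
            x_next = A @ x + B * u
            u_next = K @ x_next                      # policy action, no probing noise
            psi[:, j] = features(x, u) - features(x_next, u_next)
            mu[j] = r_v * x[0] ** 2 + r_p * x[1] ** 2 + r_u * u ** 2
            x = x_next
        W = np.linalg.pinv(psi).T @ mu               # Step 3: eq. (9)
        K = np.array([-W[2] / W[5], -W[4] / W[5]])   # Step 4: eq. (11)
        history.append(K.copy())                     # Step 5: next iteration
    return K, history

# Hypothetical stable two-state plant, used only to generate data.
A_TOY = np.array([[0.9, 0.1], [0.0, 0.8]])
B_TOY = np.array([0.1, 0.2])
K_final, gain_history = train_oslc(A_TOY, B_TOY)
print("learned gains [K_V, K_P]:", K_final)
```

For a linear plant with quadratic cost, the gains learned this way can be checked against the solution of the discrete-time Riccati equation. In the paper's setting the samples would instead be the scaled measurements of $V_6$ and $P_{g4}$ recorded during online operation.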

Figure 6: Parameters of supplementary controller ($K_V$ and $K_P$ versus training iteration)

To demonstrate the control performance of the converged OSLC based supplementary reactive power controller, its performance is compared to the baseline control without any transient reactive power control, and the transient reactive power control strategy proposed in [2]. The idea of the control strategy in [2] is also to add a supplementary control signal on the reactive power command of the RSC. The supplementary control signal is determined based on a PI controller for the voltage deviation of the PCC.  ∗ ΔQs = kV p ΔV6 + kV i ΔV6 dt, 15

(12)

where ΔQ*_s is the supplementary reactive power control signal fed to the RSC (MVar); ΔV_6 is the voltage deviation of bus 6 (pu); and k_Vp and k_Vi are the proportional and integral gains of the PI controller, respectively. After careful trial-and-error tuning, k_Vp = 1,000 and k_Vi = 100 are used to obtain good control performance. In what follows, we refer to the transient reactive power control strategy in [2] as PI control.

In the simulation, the control performance of the converged supplementary reactive power controller based on the OSLC, the baseline control, and the PI control is tested under a three-phase short circuit. At t = 1 s, a three-phase grounded short circuit occurs at bus 1 and lasts for 150 ms before it is cleared. The response of the voltage at bus 6, i.e., the high-voltage side of the step-up transformer, is illustrated in Figure 7. The active power output of the wind farm is illustrated in Figure 8. Figures 7-8 show that the transient reactive power control provided by the PI control and the OSLC significantly improves the dynamic performance of the PCC voltage and the active power output of the wind farm. According to Figure 7, with the transient reactive power control based on the PI control and the OSLC, the lowest voltage is raised from 0.49 pu under the baseline control to around 0.65 pu, and the voltage oscillation after the fault is better damped. Note that raising the PCC voltage helps enhance the LVRT capability of the wind farm, as wind generators will be tripped by low-voltage protection when the PCC voltage falls below a preset trigger value [8]. According to Figure 8, the active power oscillation is also better damped by the transient reactive power control. Note that the performance of the OSLC is slightly better than or similar to that of the PI control. This is reasonable because the PI control is carefully tuned at the specified operating condition. It will
be demonstrated in the next case that such PI control cannot adapt to changes in operating condition.

The rotor current of the DFIG shown in Figure 9 indicates that the proposed supplementary reactive power controller remarkably shortens the duration of large rotor current, alleviating the risk of rotor circuit overheating. Note that the largest rotor current is reduced from 19.11 kA under the baseline control to 18.72 kA under the OSLC, although reducing the rotor current is not explicitly reflected in the cost function. Also, note that the rotor current remains within the maximum current limit.

Figure 7: Voltage of bus 6 [plot: voltage of bus 6 (pu) versus time (s); curves: baseline control, PI control, OSLC]

Figure 8: Active power output of wind farm [plot: output active power of wind farm (MW) versus time (s); curves: baseline control, PI control, OSLC]

Next, the influence of the OSLC on the system dynamic performance is studied, including the response of conventional generators, the power flow through tie lines, and the voltage at load-side buses.

Figure 9: Rotor current of DFIG [plot: rotor current of DFIG (kA) versus time (s); curves: baseline control, PI control, OSLC]

Figure 10: Rotor speed of hydro generator G2 [plot: speed of generator G2 versus time (s); curves: baseline control, PI control, OSLC]

Figures 10-11 show the rotor speed and the active power output of the hydro generator G2 in area 1, respectively. The OSLC has little influence on the response of G2 because G2 is very close to the fault location. Figures 12-13 show the rotor speed and the active power output of the thermal generator G3 in area 3, respectively. The OSLC achieves a more significant improvement on G3 than on G2. As G3 is far away from the fault, mitigating the active power oscillation of the wind farm can help alleviate the active power oscillation of G3.

Figure 11: Active power output of hydro generator G2 [plot: output active power of generator G2 (MW) versus time (s); curves: baseline control, PI control, OSLC]

Figure 12: Rotor speed of thermal generator G3 [plot: speed of generator G3 versus time (s); curves: baseline control, PI control, OSLC]

The active power delivered through tie lines 1-6, 2-5, 3-4, and 6-4 is presented in Figures 14-17, respectively. The active power of tie line 1-6 is not improved during the fault because the fault occurs right at bus 1. Even so, with the OSLC the active power of tie line 1-6 recovers rapidly and is well damped after the fault. Figures 14-17 also show that the active power of the other tie lines is better damped and that the oscillation amplitude is remarkably reduced. This is because the active power output oscillation of the wind farm is mitigated, so the transient active power exchange through the tie lines is reduced.

Figure 13: Active power output of thermal generator G3 [plot: output active power of generator G3 (MW) versus time (s); curves: baseline control, PI control, OSLC]

A part of the active power generated by the wind farm is delivered to the load center through tie line 6-4. The receiving-end voltage at bus 4 is also raised by the OSLC during the fault, as illustrated in Figure 18.

To further compare the control performance, the integral of squared error (ISE) [35] is employed. The ISE of a variable x is defined as

\[ \mathrm{ISE} = \int_{t_0}^{t_f} \left( \frac{x - x_e}{x_e} \right)^2 dt, \tag{13} \]

where x_e is the steady-state value of x, and t_0 and t_f are the start time and the end time of the ISE calculation, respectively. In this paper, t_0 = 1 s and t_f = 6 s are used.

The ISE indices of the converged OSLC, the baseline control, and the PI control are listed in Table 2. Note that a smaller ISE indicates better control performance. Table 2 shows that the ISE indices of most variables are improved by the proposed OSLC-based supplementary reactive power control. Also, the ISE of the OSLC is slightly better than, or at least similar to, that of the well-tuned PI control.
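As a minimal sketch of how the ISE in (13) could be evaluated from sampled simulation output, the Python function below applies the trapezoidal rule to the squared relative deviation over [t_0, t_f]. The function name `ise`, the use of NumPy, and the synthetic voltage trace are illustrative assumptions; they are not taken from the simulation platform used in this paper.

```python
import numpy as np


def ise(t, x, x_e, t0=1.0, tf=6.0):
    """Approximate the ISE of (13) for samples x(t) about the steady-state value x_e."""
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    # Keep only the samples inside the evaluation window [t0, tf].
    mask = (t >= t0) & (t <= tf)
    tt = t[mask]
    # Squared relative deviation from the steady-state value.
    e2 = ((x[mask] - x_e) / x_e) ** 2
    # Trapezoidal rule written out explicitly.
    return float(np.sum(0.5 * (e2[1:] + e2[:-1]) * np.diff(tt)))


# Hypothetical trace: 1.0 pu in steady state, 0.5 pu during a 150 ms fault at t = 1 s.
t = np.linspace(0.0, 6.0, 6001)
v = np.where((t >= 1.0) & (t < 1.15), 0.5, 1.0)
print(ise(t, v, x_e=1.0))
```

For this synthetic trace the result is close to 0.25 × 0.15 = 0.0375, because the relative deviation is −0.5 for about 150 ms and zero elsewhere.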

Figure 14: Tie line active power: bus 1 to bus 6 [plot: tie line active power from bus 1 to bus 6 (MW) versus time (s); curves: baseline control, PI control, OSLC]

Figure 15: Tie line active power: bus 2 to bus 5 [plot: tie line active power from bus 2 to bus 5 (MW) versus time (s); curves: baseline control, PI control, OSLC]

4.2. Case 2

In this case, the local load at bus 6 decreases by 25%, which changes the system power flow. As a consequence, the environment of the wind farm varies accordingly. The previously well-trained supplementary reactive power controller may not behave as well as before in the new environment. Fortunately, owing to the inherent learning ability of the OSLC, the proposed reactive power controller is capable of adapting to the new environment. The parameter evolution of the supplementary reactive power controller is illustrated in Figure 19. It can be observed that the training process converges in 10 training iterations.

Figure 16: Tie line active power: bus 3 to bus 4 [plot: tie line active power from bus 3 to bus 4 (MW) versus time (s); curves: baseline control, PI control, OSLC]

Figure 17: Tie line active power: bus 6 to bus 4 [plot: tie line active power from bus 6 to bus 4 (MW) versus time (s); curves: baseline control, PI control, OSLC]

Under the same simulation scenario as in case 1, the control performance of the updated supplementary controller is compared with that of the supplementary controller obtained in case 1 and that of the PI-control-based transient reactive power control. Figures 20-21 illustrate the voltage at bus 6 and the active power output of the wind farm, respectively. Their ISE indices are shown in Table 3. It can be seen that after the power flow changes, the performance of the supplementary controller obtained in case 1 and of the PI control deteriorates, especially in the response of the active power output of the wind farm. In contrast, the online updated supplementary controller shows improved performance.

Figure 18: Voltage of bus 4 [plot: voltage of bus 4 (pu) versus time (s); curves: baseline control, PI control, OSLC]

Table 2: ISE of different controllers: case 1

Variable                                    Baseline control   PI control     OSLC
Voltage of bus 6                            0.0371             0.0163         0.0165
Active power output of wind farm            0.0290             0.0047         0.0046
Rotor current of DFIG                       0.1101             0.0978         0.0951
Speed of generator G2                       2.743×10^−5        2.351×10^−5    2.358×10^−5
Active power output of generator G2         0.0980             0.0865         0.0867
Speed of generator G3                       4.545×10^−6        6.211×10^−6    4.802×10^−6
Active power output of generator G3         0.0220             0.0241         0.0189
Tie line active power from bus 1 to bus 6   0.2087             0.1465         0.1386
Tie line active power from bus 2 to bus 5   0.0393             0.0304         0.0307
Tie line active power from bus 3 to bus 4   0.0674             0.0362         0.0389
Tie line active power from bus 6 to bus 4   0.5383             0.2931         0.2049
Voltage of bus 4                            0.0265             0.0125         0.0124

Under the updated supplementary controller, the oscillation of the wind farm active power output is significantly mitigated, and the voltage oscillation of bus 6 after the fault is better damped. This case validates the online optimization and adaptation capability of the proposed method.

Figure 19: Parameters of supplementary controller [plot: parameters KP and KV of the supplementary controller versus training iteration]

Figure 20: Voltage of bus 6 [plot: voltage of bus 6 (pu) versus time (s); curves: PI control, OSLC in case 1, updated OSLC]

From case 1 and case 2, we can summarize the advantages of the proposed supplementary reactive power control method over the PI-control-based transient reactive power control strategy in [2]. With careful trial-and-error tuning of the PI parameters, the PI control in [2] can achieve performance similar to that of the proposed method. The proposed method, however, is systematic: the supplementary reactive power controller is trained online without human intervention. Another advantage of the proposed supplementary reactive power control method is its online adaptation capability, whereas the PI control in [2] is designed for a specified operating condition and cannot adapt to changes in operating condition.
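For reference, a fixed-gain PI law of the kind used for comparison can be sketched in discrete time as follows. The sketch assumes the standard form ΔQ*_s = k_Vp ΔV_6 + k_Vi ∫ΔV_6 dt with the gains reported for case 1. The class name `PIController`, the forward-Euler integration, the time step, and the absence of output limits are illustrative assumptions rather than details taken from [2] or from the simulation model.

```python
class PIController:
    """Fixed-gain PI law mapping a voltage deviation (pu) to a reactive power signal (MVar)."""

    def __init__(self, k_vp=1000.0, k_vi=100.0, dt=1e-3):
        self.k_vp = k_vp      # proportional gain (value reported for case 1)
        self.k_vi = k_vi      # integral gain (value reported for case 1)
        self.dt = dt          # sampling step in seconds (assumed)
        self.integral = 0.0   # running integral of the voltage deviation

    def step(self, dv6):
        """Advance one sample and return the supplementary control signal."""
        # Forward-Euler accumulation of the integral term.
        self.integral += dv6 * self.dt
        return self.k_vp * dv6 + self.k_vi * self.integral


# Hypothetical usage: a constant 0.1 pu deviation held for three samples.
pi = PIController()
for _ in range(3):
    print(pi.step(0.1))
```

Because `k_vp` and `k_vi` are constants, such a law has no mechanism to retune itself when the operating condition changes, which is the limitation discussed above.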

Figure 21: Active power output of wind farm [plot: output active power of wind farm (MW) versus time (s); curves: PI control, OSLC in case 1, updated OSLC]

Table 3: ISE of different controllers: case 2

Variable                           PI control   OSLC in case 1   Updated OSLC
Voltage of bus 6                   0.0319       0.0325           0.0326
Active power output of wind farm   0.2315       0.1794           0.0328
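To make the case 2 comparison concrete, the relative ISE reduction of one controller with respect to another can be computed as sketched below. The helper name `relative_reduction` is hypothetical; the numbers are the ISE values of the wind farm active power output reported in Table 3.

```python
def relative_reduction(ise_ref, ise_new):
    """Fractional ISE reduction of a controller relative to a reference controller."""
    if ise_ref <= 0:
        raise ValueError("reference ISE must be positive")
    return (ise_ref - ise_new) / ise_ref


# ISE of the wind farm active power output in case 2 (values from Table 3).
ise_power = {
    "PI control": 0.2315,
    "OSLC in case 1": 0.1794,
    "Updated OSLC": 0.0328,
}

for name in ("PI control", "OSLC in case 1"):
    r = relative_reduction(ise_power[name], ise_power["Updated OSLC"])
    print(f"Updated OSLC vs {name}: {100 * r:.1f}% lower ISE")
```

With these values, the updated OSLC lowers the active power ISE by roughly 86% relative to the PI control and by roughly 82% relative to the controller obtained in case 1, while the bus 6 voltage ISE values of the three controllers remain nearly equal.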

5. Conclusion

In this paper, a supplementary transient reactive power controller for a DFIG-based wind farm is developed to enhance the transient stability of power systems. By making use of the OSLC based on ADP, the training of the supplementary controller can be model-free with guaranteed stability during learning, which makes the proposed method suitable for online implementation. With the supplementary reactive power controller, the voltage sag at the PCC during a fault can be reduced, and the active power oscillation of the wind farm after a fault can be mitigated. Furthermore, the dynamic responses of conventional generators and tie lines are improved system-wide. Also, when the operating condition of the power system changes, the proposed method can adapt the supplementary reactive power controller online and re-optimize the control performance. It should be noted that the enhancement of power system stability is achieved by an add-on intelligent controller for the DFIG-based wind farm, rather than by installing a new advanced control device such as a STATCOM. Therefore, the proposed method is easy to implement and cost-effective.

References

[1] W. Guo, F. Liu, D. He, J. Si, R. Harley, S. Mei, Reactive power control of DFIG wind farm using online supplementary learning controller based on approximate dynamic programming, in: Proc. 2014 International Joint Conference on Neural Networks, Beijing, 2014, pp. 1–7.

[2] M. Kayıkçı, J. V. Milanović, Reactive power control strategies for DFIG-based plants, IEEE Trans. Energy Convers. 22 (2) (2007) 389–396.

[3] W. Qiao, R. G. Harley, G. K. Venayagamoorthy, Coordinated reactive power control of a large wind farm and a STATCOM using heuristic dynamic programming, IEEE Trans. Energy Convers. 24 (2) (2009) 493–503.

[4] J. Liang, W. Qiao, R. G. Harley, Feed-forward transient current control for low-voltage ride-through enhancement of DFIG wind turbines, IEEE Trans. Energy Convers. 25 (3) (2010) 836–843.

[5] L. G. Meegahapola, T. Littler, D. Flynn, Decoupled-DFIG fault ride-through strategy for enhanced stability performance during grid faults, IEEE Trans. Sustain. Energy 1 (3) (2010) 152–162.

[6] Y. Tang, H. He, Z. Ni, J. Wen, X. Sui, Reactive power control of grid-connected wind farm based on adaptive dynamic programming, Neurocomputing 125 (11) (2014) 125–133.

[7] T. Masaud, P. K. Sen, A comparative study of the implementation of STATCOM and SVC on DFIG-based wind farm connected to a power system, in: Proc. 2012 IEEE PES General Meeting, San Diego, CA, 2012, pp. 1–1.

[8] T. Masaud, P. K. Sen, Study of the implementation of STATCOM on DFIG-based wind farm connected to a power system, in: Proc. 2012 IEEE PES Innovative Smart Grid Technologies, Washington, DC, 2012, pp. 1–7.

[9] W. Qiao, G. K. Venayagamoorthy, R. G. Harley, Real-time implementation of a STATCOM on a wind farm equipped with doubly fed induction generators, IEEE Trans. Ind. Appl. 45 (1) (2009) 98–107.

[10] Y. Tang, H. He, J. Wen, Adaptive control for an HVDC transmission link with FACTS and a wind farm, in: Proc. 2013 IEEE PES Innovative Smart Grid Technologies, Washington, DC, 2012, pp. 1–6.

[11] S. Foster, L. Xu, B. Fox, Coordinated reactive power control for facilitating fault ride through of doubly fed induction generator- and fixed speed induction generator-based wind farms, IET Renewable Power Generation 4 (2) (2010) 128–138.

[12] J. Si, A. G. Barto, W. B. Powell, D. Wunsch, Handbook of learning and approximate dynamic programming: scaling up to the real world, Wiley-IEEE Press, Piscataway, NJ, 2004.

[13] F. L. Lewis, D. Vrabie, K. G. Vamvoudakis, Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers, IEEE Control Syst. Mag. 32 (6) (2012) 76–105.

[14] F.-Y. Wang, H. Zhang, D. Liu, Adaptive dynamic programming: an introduction, IEEE Comput. Intell. Mag. 4 (2) (2009) 39–47.

[15] G. K. Venayagamoorthy, R. G. Harley, D. C. Wunsch, Comparison of heuristic dynamic programming and dual heuristic programming adaptive critics for neurocontrol of a turbogenerator, IEEE Trans. Neural Netw. 13 (3) (2002) 764–773.

[16] G. K. Venayagamoorthy, R. G. Harley, D. C. Wunsch, Implementation of adaptive critic-based neurocontrollers for turbogenerators in a multimachine power system, IEEE Trans. Neural Netw. 14 (5) (2003) 1047–1064.

[17] Y. Tang, H. He, J. Wen, J. Liu, Power system stability control for a wind farm based on adaptive dynamic programming, IEEE Trans. Smart Grid 6 (1) (2015) 166–177.

[18] J. Liang, G. Venayagamoorthy, R. Harley, Wide-area measurement based dynamic stochastic optimal power flow control for smart grids with high variability and uncertainty, IEEE Trans. Smart Grid 3 (1) (2012) 59–69.

[19] J. Liang, D. D. Molina, G. K. Venayagamoorthy, R. G. Harley, Two-level dynamic stochastic optimal power flow control for power systems with intermittent renewable generation, IEEE Trans. Power Syst. 28 (3) (2013) 2670–2678.

[20] Z. Ni, Y. Tang, H. He, J. Wen, Multi-machine power system control based on dual heuristic dynamic programming, in: Proc. 2014 IEEE Symposium on Computational Intelligence Applications in Smart Grid (CIASG), 2014, pp. 1–7.

[21] X. Sui, Y. Tang, H. He, J. Wen, Energy-storage-based low-frequency oscillation damping control using particle swarm optimization and heuristic dynamic programming, IEEE Trans. Power Syst. 29 (5) (2014) 2539–2548.

[22] C. Lu, J. Si, X. Xie, Direct heuristic dynamic programming for damping oscillations in a large power system, IEEE Trans. Syst., Man, Cybern. B 38 (4) (2008) 1008–1013.

[23] D. V. Prokhorov, D. C. Wunsch, Adaptive critic designs, IEEE Trans. Neural Netw. 8 (5) (1997) 997–1007.

[24] J. Si, Y.-T. Wang, Online learning control by association and reinforcement, IEEE Trans. Neural Netw. 12 (2) (2001) 264–276.

[25] H. He, Z. Ni, J. Fu, A three-network architecture for on-line learning and optimization based on adaptive dynamic programming, Neurocomputing 78 (1) (2012) 3–13.

[26] W. Guo, F. Liu, J. Si, D. He, R. Harley, S. Mei, Online supplementary ADP learning controller design and application to power system frequency control with large-scale renewable energy integration, submitted to IEEE Trans. Neural Netw. Learn. Syst.

[27] M. G. Lagoudakis, R. Parr, Least-squares policy iteration, Journal of Machine Learning Research 4 (2003) 1107–1149.

[28] S. Jiang, U. D. Annakkage, A. M. Gole, A platform for validation of FACTS models, IEEE Trans. Power Del. 21 (1) (2006) 484–491.

[29] V. Akhmatov, Analysis of dynamic behavior of electric power systems with large amount of wind power, Ph.D. dissertation, Technical University of Denmark, Kgs. Lyngby, Denmark.

[30] W. Qiao, R. G. Harley, G. K. Venayagamoorthy, Effects of FACTS devices on a power system which includes a large wind farm, in: Proc. IEEE PES Power Syst. Conf. Expo. (PSCE 2006), Atlanta, GA, 2006, pp. 2070–2076.

[31] S. Faried, I. Unal, D. Rai, J. Mahseredjian, Utilizing DFIG-based wind farms for damping subsynchronous resonance in nearby turbine-generators, IEEE Trans. Power Syst. 28 (1) (2013) 452–459.

[32] R. E. Bellman, Dynamic Programming, Princeton Univ. Press, Princeton, NJ, 1957.

[33] R. A. Howard, Dynamic Programming and Markov Processes, MIT Press, Cambridge, MA, 1960.

[34] J. J. Murray, C. J. Cox, G. G. Lendaris, R. Saeks, Adaptive dynamic programming, IEEE Trans. Syst., Man, Cybern. C 32 (2) (2002) 140–153.

[35] Z.-L. Gaing, A particle swarm optimization approach for optimum design of PID controller in AVR system, IEEE Trans. Energy Convers. 19 (2) (2004) 384–391.