Copyright © IFAC 12th Triennial World Congress, Sydney, Australia, 1993

NUCLEAR REACTOR CONTROL USING DIAGONAL RECURRENT NEURAL NETWORKS

C.-C. Ku*, K. Y. Lee* and R. M. Edwards**

*Department of Electrical and Computer Engineering, The Pennsylvania State University, University Park, PA 16802, USA
**Department of Nuclear Engineering, The Pennsylvania State University, University Park, PA 16802, USA
Abstract. A new approach for wide-range optimal reactor temperature control using Diagonal Recurrent Neural Networks (DRNN) with an adaptive learning rate scheme is presented. The drawback of the usual feedforward neural network (FNN) is that it is a static mapping and requires a large number of neurons. The usual fixed learning rate, based on an empirical trial-and-error scheme, is slow and does not guarantee convergence. The DRNN performs dynamic mapping and requires far fewer neurons and weights, and thus converges faster than the FNN. Rapid convergence of this DRNN-based control system is demonstrated when applied to improve reactor temperature performance.
Key Words. Artificial neural nets; nuclear reactor control; nonlinear system control
1. INTRODUCTION

An observer-based optimal state feedback control theory has been developed to improve the temperature response of a nuclear reactor (Edwards et al., 1992). A controller was designed in a robust manner to account for the uncertainties introduced by linearization, unmodeled dynamics, and significant parameter variations. As an alternative to the model-based controller design, this paper considers artificial neural networks, which not only identify nonlinear plant dynamics under various operating conditions, but also control the plant to give an improved temperature response over a wide range of operation.

Artificial neural networks have been used in many control problems of dynamical systems, and some applications in nuclear power plant control are reported (Ku et al., 1992a; Eryurek and Upadhyaya, 1990). Recently, the architecture of the diagonal recurrent neural network (DRNN) was developed as a minimal realization of the recurrent neural network (RNN) (Ku et al., 1992a; Ku and Lee, 1992c). It is minimal in the sense that it has only one hidden layer of self-recurrent neurons, each feeding back its output only into itself and not to other neurons in the same layer. Since there are no interlinks among neurons, the DRNN has considerably fewer weights than the fully recurrent neural network (FRNN).

This paper demonstrates the robustness and adaptivity of the DRNN in controlling reactor temperature and compares the results with desired optimal reference responses. The DRNN-based controller is shown to converge very rapidly due to an adaptive dynamic learning algorithm (Ku and Lee, 1992b, 1993). Simulation results show that it performs very well not only locally for normal operation, but also over a wide range of operation.

2. DIAGONAL RECURRENT NEURAL NETWORK BASED CONTROL

The structure of the Diagonal Recurrent Neural Network (DRNN) is used for both control and system identification (see Fig. 1). The DRNN-based control system has two DRNNs (see Fig. 2). The controller network, called a Diagonal Recurrent Neurocontroller (DRNC), uses three layers: input, output, and hidden layers. The hidden layer has a number of self-recurrent neurons, each feeding back its output only into itself and not to other neurons in the same layer. The input to the DRNC consists of the reference signal, a bias, and the delayed outputs of the DRNC and the plant. The output of the DRNC is the input signal to the plant. The identifier network, called a Diagonal Recurrent Neuroidentifier (DRNI), also has only one hidden layer. The input to the DRNI consists of the control signal generated by the DRNC and the delayed output of the plant.

Fig. 1. Diagonal recurrent neural network
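The claim that the diagonal structure needs far fewer weights than a fully recurrent layer can be checked with a quick parameter count. The following sketch is illustrative only; the layer sizes and function names are ours, not the paper's, and bias terms are omitted for simplicity.

```python
# Rough parameter counts for a fully recurrent network (FRNN) versus a
# diagonal recurrent network (DRNN) with the same layer sizes.
# Sizes below are illustrative, not taken from the paper; biases omitted.

def frnn_weights(n_in: int, n_hidden: int, n_out: int) -> int:
    # full n_hidden x n_hidden recurrent matrix interlinks every neuron
    return n_in * n_hidden + n_hidden * n_hidden + n_hidden * n_out

def drnn_weights(n_in: int, n_hidden: int, n_out: int) -> int:
    # diagonal recurrence: one self-feedback weight per hidden neuron
    return n_in * n_hidden + n_hidden + n_hidden * n_out

if __name__ == "__main__":
    n_in, n_hidden, n_out = 4, 9, 1
    print(frnn_weights(n_in, n_hidden, n_out))  # 126
    print(drnn_weights(n_in, n_hidden, n_out))  # 54
```

The gap grows quadratically with the hidden-layer size, since the full recurrent matrix costs n_hidden^2 weights while the diagonal costs only n_hidden.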
Fig. 2. DRNN based reactor control system

2.1. Dynamic Representation of DRNN

The mathematical model for the DRNN (see Fig. 1) is shown below:

    O(k) = sum_j W_j^O X_j(k),  X_j(k) = f(S_j(k)),            (1)

    S_j(k) = W_j^D X_j(k-1) + sum_i W_ij^I I_i(k),              (2)

where I_i(k) is the i-th input, S_j(k) is the sum of inputs to the j-th recurrent neuron, X_j(k) is its output, O(k) is the network output, f is the sigmoid function, and W^I, W^D, and W^O are the input, recurrent (diagonal), and output weights, respectively.

Let d(k) and y(k) be the desired and actual responses of the plant, respectively; then an error function for the DRNC can be defined as

    E_c(k) = (1/2)(d(k) - y(k))^2.                              (3)

The error function (3) is also modified for the DRNI by replacing d(k) and y(k) with y(k) and y_m(k), respectively, where y_m(k) is the output of the DRNI, i.e.,

    E_m(k) = (1/2)(y(k) - y_m(k))^2,                            (4)

where y_m(k) = O(k) in (1).

The gradient of the error in (3) with respect to an arbitrary weight vector W in R^N is represented by

    dE_c/dW = -e_c(k) y_u(k) dO(k)/dW,                          (5)

where e_c(k) = d(k) - y(k) is the error between the desired and output responses of the plant, and the factor y_u(k) := dy(k)/du(k) represents the sensitivity of the plant with respect to its input. In the case of the DRNI, the gradient of the error in (4) simply becomes

    dE_m/dW = -e_m(k) dO(k)/dW,                                 (6)

where e_m(k) := y(k) - y_m(k) is the error between the plant and the DRNI responses. The output gradient dO(k)/dW is common in (5) and (6), and needs to be computed for both DRNC and DRNI. The gradients with respect to the output, recurrent, and input weights, respectively, are computed using the following equations (Ku and Lee, 1992c):

    dO(k)/dW_j^O = X_j(k),                                      (7a)
    dO(k)/dW_j^D = W_j^O P_j(k),                                (7b)
    dO(k)/dW_ij^I = W_j^O Q_ij(k),                              (7c)

where

    P_j(k) = f'(S_j)(X_j(k-1) + W_j^D P_j(k-1)),                (8a)
    Q_ij(k) = f'(S_j)(I_i(k) + W_j^D Q_ij(k-1)).                (8b)

2.2. Dynamic Backpropagation for DRNN

From (6), the negative gradient of the error with respect to a weight vector in R^N is

    -dE_m/dW = e_m(k) dO(k)/dW.                                 (9)

The weights can now be adjusted following the steepest descent method, i.e., the update rule of the weights becomes

    W(n+1) = W(n) + eta * DeltaW(n),                            (10)

where eta is a learning rate and DeltaW(n) = -dE/dW represents the change in weight in the n-th iteration.

In the case of the DRNC, from (5), the negative gradient of the error with respect to a weight vector in R^N is

    -dE_c/dW = e_c(k) y_u(k) dO(k)/dW.                          (11)

The sensitivity y_u(k) was shown in (Ku and Lee, 1992c) to be

    y_u(k) = sum_j W_j^O f'(S_j) W_uj^I,                        (12)

where the variables and weights are those found in the DRNI, and W_u^I represents the input weight of the DRNI corresponding to u(k) as its input.

The speed of convergence depends upon the learning rate eta in the dynamic backpropagation algorithm. However, an arbitrarily large learning rate makes the algorithm unstable. An efficient way of determining the largest learning rate that guarantees convergence was developed (Ku and Lee, 1993), and its results are summarized here.
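As an illustration, the forward pass (1)-(2) and the recursive gradient terms of (7)-(8) can be sketched in a few lines. This is a minimal sketch under our own naming conventions (class and variable names are hypothetical, not from the paper); a sigmoid activation is assumed so that f'(S) = f(S)(1 - f(S)).

```python
import numpy as np

# Sketch of the DRNN forward pass (1)-(2) and the recursive gradient
# terms P_j(k), Q_ij(k) of (8a)-(8b). Names are ours, not the paper's.

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

class DRNN:
    def __init__(self, n_in, n_hidden, rng=None):
        rng = rng or np.random.default_rng(0)
        self.WI = rng.normal(scale=0.1, size=(n_hidden, n_in))  # input weights W^I
        self.WD = rng.normal(scale=0.1, size=n_hidden)          # diagonal recurrent W^D
        self.WO = rng.normal(scale=0.1, size=n_hidden)          # output weights W^O
        self.X = np.zeros(n_hidden)                             # X_j(k-1)
        self.P = np.zeros(n_hidden)                             # P_j(k-1) of (8a)
        self.Q = np.zeros((n_hidden, n_in))                     # Q_ij(k-1) of (8b)

    def step(self, I):
        """One time step: returns O(k) and the gradients (7a)-(7c)."""
        S = self.WD * self.X + self.WI @ I          # S_j(k), eq. (2)
        Xnew = sigmoid(S)                           # X_j(k), eq. (1)
        O = self.WO @ Xnew                          # O(k),   eq. (1)
        fp = Xnew * (1.0 - Xnew)                    # f'(S_j) for the sigmoid
        self.P = fp * (self.X + self.WD * self.P)                        # (8a)
        self.Q = fp[:, None] * (I[None, :] + self.WD[:, None] * self.Q)  # (8b)
        dO_dWO = Xnew                               # (7a)
        dO_dWD = self.WO * self.P                   # (7b)
        dO_dWI = self.WO[:, None] * self.Q          # (7c)
        self.X = Xnew
        return O, dO_dWO, dO_dWD, dO_dWI
```

For the DRNI these gradients are scaled by e_m(k) as in (9); for the DRNC they are additionally scaled by e_c(k) y_u(k) as in (11) before the update (10).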
Convergence of DRNI. Let eta_I, the learning rate for the weights of the DRNI, satisfy eta_I = eta_1 / g_I,max^2, with 0 < eta_1 < 2, and g_I,max defined as g_I,max := max_k ||g_I(k)||, where g_I(k) = dy_m(k)/dW and ||.|| is the usual Euclidean norm in R^N. Then the convergence is guaranteed if eta_I is chosen as

    0 < eta_I < 2 / g_I,max^2.                                  (13)

Convergence of DRNC. Let eta_C, the learning rate for the weights of the DRNC, satisfy eta_C = eta_2 / g_C,max^2, with 0 < eta_2 < 2, and g_C,max defined as g_C,max := max_k ||g_C(k)||, where g_C(k) = dy(k)/dW. Then the convergence is guaranteed if eta_C is chosen as

    0 < eta_C < 2 / g_C,max^2.                                  (14)

3. APPLICATION TO REACTOR CONTROL

The proposed DRNN-based control system is applied to a nuclear reactor model to improve the reactor temperature response. As shown in Fig. 2, the inputs to the DRNC are the desired power level n_d', the control rod reactivity worth gain e_c', the delayed control signal u(k-1), and the delayed output of the plant y(k-1). The desired power level n_d' is split into two parts, n_d' = n_eq' + delta_n_d'. The term n_eq' represents an equilibrium power level for which a time-varying linear reference model is defined in region i; delta_n_d' represents a deviation from n_eq'. Inputs to the DRNI are the control signal u(k) generated by the DRNC and the delayed output of the plant y(k-1). The time-varying linear reference model is approximated by a set of nine time-invariant models representing different operating points (e_c', n_eq'), and the DRNN was trained with the reference model of each set in its region i.

3.1. The Reactor Power Plant Modeling

To demonstrate the proposed DRNN-based controller, a simplified model of a pressurized water reactor (PWR) was used. The model represents the point kinetics with one delayed neutron group and temperature feedback from lumped fuel and coolant temperature calculations. While this model is ideal for an initial comparison of control concepts on an equal basis, more detailed models, analysis, and experiments are required for verification of a system to be implemented on an operating power plant (Edwards et al., 1992).

3.2. Linear Reference Models

Reactor safety operation dictates the specification of desired temperature responses, and the DRNN-based controller is capable of tracking any reasonably specified reference response. This paper chooses the model-based optimal controller (Edwards et al., 1992) as a good candidate for generating reference training responses. The reference model is constituted by nine different time-invariant linear reference models, as defined by the regions in Table 1, and the optimal state feedback for each reference model was used to obtain the desired temperature response.

TABLE 1. The Classification of Operation Regions

    n_eq' \ e_c'    0.0290      0.0145      0.0070
    0.1             Region 3    Region 2    Region 9
    0.5             Region 4    Region 1    Region 8
    1.0             Region 5    Region 6    Region 7

4. SIMULATION RESULTS

In the following studies, the total numbers of neurons and weights for the DRNN-based control system are N_T = h_C + h_I + 2 and W_T = (n_C + 3)h_C + (n_I + 3)h_I, respectively. The sets of inputs are P_C = {e_c', n_d', u(k-1), y(k-1)} and P_I = {u(k), y(k-1)}; thus n_C = 4, n_I = 2, N_T = 16, and W_T = 88. Here, u(k) and y(k) are the input and output of the plant. First, all nine regions are trained sequentially (from region 1 to region 9) two times to track their respective reference models (global training), each with +/-10% changes in power level. This is followed by training only region 6 four times (local training), which is the normal 100% power operating region with the average control rod worth (e_c' = 0.0145).

Once the training procedure is completed, both the local and global controls are tested to evaluate the performance of the DRNN-based controller.

4.1. Local Control

After training is completed (global training followed by local training in Region 6), the neural network is tested again in Region 6 with the power level changing in steps: 100% -> 90% -> 100%. The responses of reactor power, exit temperature, and control rod speed are shown in Fig. 3. Since this is a local control, the performance of the neural network is very good, as expected. Similar testing was performed for all nine regions for local control, where the network was trained locally in each region following the global training. The responses of the reference model and the plant matched very closely in each region.

Fig. 3. Local control in region 6 for 100% -> 90% -> 100% power level change. (a) Relative reactor power. (b) Exit temperature (unit: °C). (c) Control rod speed (unit: fraction of core length per second). (Solid line: reference. Dotted line: plant output.)
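The adaptive learning-rate rule summarized above can be sketched as a small scheduler: tie the rate to the largest gradient norm seen so far, eta = eta_1 / g_max^2 with 0 < eta_1 < 2, which keeps eta inside the convergence bounds of the form (13)/(14). The class below is a hypothetical simplification of ours (the paper computes the bound per theorem, not with this running maximum).

```python
import numpy as np

# Sketch of an adaptive learning rate: eta = eta1 / g_max**2, 0 < eta1 < 2,
# where g_max tracks max_k ||g(k)|| over the gradients seen so far.
# This running-max scheme is our simplification, not the paper's exact rule.

class AdaptiveRate:
    def __init__(self, eta1: float = 1.0):
        assert 0.0 < eta1 < 2.0, "eta1 must lie in (0, 2) for the bound to hold"
        self.eta1 = eta1
        self.g_max = 1e-8  # small floor to avoid division by zero

    def rate(self, gradient: np.ndarray) -> float:
        # update the running maximum of the gradient norm, then bound eta
        self.g_max = max(self.g_max, float(np.linalg.norm(gradient)))
        return self.eta1 / self.g_max ** 2

if __name__ == "__main__":
    sched = AdaptiveRate(eta1=1.0)
    print(sched.rate(np.array([3.0, 4.0])))  # norm 5 -> 1.0 / 25 = 0.04
```

Because g_max is non-decreasing, the returned rate never exceeds eta_1 / g_max^2 < 2 / g_max^2, matching the form of the convergence conditions.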
4.2. Global Operation

In this study, the controller whose training was completed in Region 6 (normal operation) is tested in three other regions: Regions 1, 3, and 7. In all regions a step change of +/-10% in power level was applied. Since Region 7 is in the high power region, as expected, the results for this case are very good. When Region 1, which is the mid-power region, is tested, the results for reactor power, exit temperature, and control rod speed are shown in Fig. 4. In this case, the results are still good. For Region 3, which is a low power region, the simulation results of reactor power, exit temperature, and control rod speed responses are shown in Fig. 5. In low power regions, the performances are fair compared to the reference model responses.

Fig. 4. Global control in region 1 for 40% -> 50% -> 40% power level change. (a) Relative reactor power. (b) Exit temperature (unit: °C). (c) Control rod speed (unit: fraction of core length per second). (Solid line: reference. Dotted line: plant output.)

Fig. 5. Global control in region 3 for 20% -> 10% -> 20% power level change. (a) Relative reactor power. (b) Exit temperature (unit: °C). (c) Control rod speed (unit: fraction of core length per second). (Solid line: reference. Dotted line: plant output.)

5. CONCLUSION

This paper described the diagonal recurrent neural network (DRNN) based control system, which includes the diagonal recurrent neurocontroller and neuroidentifier. The DRNN architecture was shown to have very good dynamic mapping characteristics. Moreover, it requires far fewer neurons and weights when compared with the feedforward network. These features allow the DRNN to be considered for on-line applications. Also, the use of adaptive learning rates guarantees convergence for any stable plant, in which case the learning rates are adjusted in each step so that they remain within the corresponding bounds.

Several case studies, which include local and global control, have been investigated to observe the performance of the proposed DRNN-based controller. The results show that after the global training procedure for the neural networks is completed, the neurocontroller can be used not only as an effective local controller but also as a global controller. Although in low power regions the plant output does not track the reference model very closely, the performance is still acceptable.

ACKNOWLEDGEMENT

This work was supported in part by a grant from the Department of Energy, University Grant DE-FG-07-89ER12889, Intelligent Distributed Control for Nuclear Power Plants. However, any findings, conclusions, or recommendations expressed herein are those of the authors and do not necessarily reflect the views of DOE.

REFERENCES

R. M. Edwards, K. Y. Lee, and A. Ray (1992), "Robust Optimal Control of Nuclear Reactors and Power Plants," Nuclear Technology, Vol. 98, pp. 137-148.

E. Eryurek and B. R. Upadhyaya (1990), "Sensor Validation For Power Plants Using Adaptive Backpropagation Neural Network," IEEE Trans. on Nuclear Science, Vol. 37, No. 2, pp. 1040-1047.

C. C. Ku, K. Y. Lee, and R. M. Edwards (1992a), "Diagonal Recurrent Neural Networks for Nuclear Power Plant Control," The 8th Power Plant Dynamics, Control, and Testing Symposium, pp. 61.01-61.14, Knoxville, May 27-29.

C. C. Ku and K. Y. Lee (1992b), "Diagonal Recurrent Neural Networks for Nonlinear System Control," Proc. International Joint Conference on Neural Networks, pp. 315-320, Baltimore.

C. C. Ku and K. Y. Lee (1992c), "System Identification and Control Using Diagonal Recurrent Neural Networks," Proc. 1992 American Control Conference, pp. 545-549, Chicago.

C. C. Ku and K. Y. Lee (1993), "Diagonal Recurrent Neural Networks for Dynamic Systems Control," to appear in IEEE Trans. on Neural Networks.