PREDICTING THE GREENHOUSE INSIDE AIR TEMPERATURE WITH RBF NEURAL NETWORKS

Ferreira, P.M.* and Ruano, A.E.*,**

* Universidade do Algarve, Faculdade de Ciências e Tecnologia, Campus de Gambelas, 8000 Faro, Portugal. Email: [email protected], [email protected]
** Institute of Systems & Robotics, Portugal
Abstract: The application of the radial basis function neural network to greenhouse inside air temperature modelling has been previously investigated by the authors. In those studies, the inside air temperature is modelled as a function of the inside relative humidity and of the outside temperature and solar radiation. A second-order model structure, previously selected in the context of dynamic temperature model identification, is used. Several training and learning methods were compared and the application of the Levenberg-Marquardt optimisation method was found to be the best way to determine the neural network parameters. Such a model is intended to be incorporated in a real-time predictive greenhouse environmental control strategy, which implies that prediction horizons greater than one time step will be necessary. In this paper the radial basis function neural network will be compared to conventional auto-regressive with exogenous inputs models on the prediction of the greenhouse inside air temperature, considering prediction horizons greater than one time step.

Keywords: Neural Networks, Greenhouse Environmental Control, Modelling, Radial Basis Functions, Temperature Prediction
1. INTRODUCTION
Feed-forward layered neural networks (NNs) are widely applied in many fields of engineering to perform some type of non-linear data processing. In the fields of identification and modelling of non-linear systems their universal approximator property is exploited. In situations where the data generating function is a non-linear, time-varying function, it is standard practice to first train the networks off-line and subsequently to adapt the trained neural networks on-line. One type of feed-forward NN which has received growing interest in recent years is the radial basis function (RBF) NN. In this paper one application of RBF NNs to a greenhouse environmental control (GEC) problem is discussed. The main purpose of greenhouses is to improve the environmental conditions in which plants are grown. The aim of GEC is to provide means to further improve these conditions in order to optimise the plant production process. The greenhouse climate is influenced by many factors, for example the outside weather, the actuators and the crop. Methods aimed at efficiently controlling the greenhouse climate must take these influences into account, and that is achieved by the use of models. The design problem being considered is to model the inside air temperature of a hydroponic vegetable production greenhouse as a function of the outside solar radiation and temperature, and the inside relative humidity. The application of the RBF NN to greenhouse inside air temperature modelling has been previously investigated (Ferreira et al., 2000a). This type of feed-forward NN is structurally simple and may be characterised by a nonlinear-linear topology in the parameters.
¹ The authors would like to acknowledge the FCT (project MGSn39()(,f9')()() and grant SFRH/RD1I2Vi/2000) for supporting this work.
The network performs a mapping, f, from an input space, X, to an output space, Y. The hidden layer applies a non-linear transformation to the inputs, generating a hidden space which in general has a higher dimension than X. Broomhead and Lowe (1988) proposed the RBF network, which is described by the following equation:
$$f(x_j) = \sum_{i=1}^{n} a_i\,\varphi\!\left(\|x_j - c_i\|\right) \qquad (1)$$
where {c_i}, i = 1, ..., n, is a set of points called centres which, together with the set of weights {a_i}, i = 1, ..., n, have to be chosen in order to minimise the distance from the approximation f to the target y, stated as:
Fig. 1. Hierarchical greenhouse environmental control strategy

Existing hybrid training methods already reflect this structure, as found in RBFs, but fail to fully exploit it in the minimisation of a single explicit training criterion. An algorithm based on unconstrained deterministic optimisation using Levenberg-Marquardt (LM) methods, which exploits this feature in the minimisation of a new training criterion (Ferreira and Ruano, 2000), has been proposed and analysed, and a strategy for its on-line application is also suggested (Ferreira et al., 2000c).
$$\mathcal{E}(f) = \sum_{j=1}^{N} \big(y_j - f(x_j)\big)^2 \qquad (2)$$
Defining a = [a_1, ..., a_n]^T as the linear weight vector, eq. (1) can be solved for the weights in the following compact form:

$$a = \Phi^{+} y \qquad (3)$$
Finally, a comparison study (Ferreira et al., 2000c), where various off-line and on-line methods were considered, revealed that the on-line LM method exploiting the separability of parameters achieved the best performance on this modelling problem. The basis for comparison was the one-step-ahead prediction error and the network size. This model is intended to be used in a greenhouse adaptive predictive hierarchical control scheme, as shown in fig. 1. The use of predictive control in the GEC problem implies that prediction horizons greater than one time step will be necessary. In this paper the application of the off-line and on-line LM methods will be compared to conventional auto-regressive with exogenous inputs (ARX) models, considering prediction horizons greater than one time step.
where y is nn N -by-I vector of the desired tnrsct vul\.Ies and 4l is nn N-by-n matrix whOseel()Jnenls tp;,1 are the Vllluesot'the radial bll~i$ f~lI1ction~ centred at {Cl H~ Isndevalunted a.t lhCPOh"~S{.:~J}7"" t · <1>+ detlotesthe pseudo-inverse of 4>. The most used function in RBF NNs is a Gaussia'n fUnction ofthe form:
" i4lltj ~J,rll'
tp; (XI) =tt:""r
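As a concrete illustration of eqs. (1)-(3), the following minimal NumPy sketch builds the matrix Φ from Gaussian basis functions with given centres and spreads and obtains the linear weights through the pseudo-inverse. The function and variable names are illustrative and are not taken from the original paper.

```python
import numpy as np

def rbf_design_matrix(X, centres, sigmas):
    """Phi[j, i] = exp(-||x_j - c_i||^2 / sigma_i^2), the Gaussian basis of eq. (1)."""
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)  # squared distances
    return np.exp(-d2 / sigmas ** 2)

def fit_linear_weights(X, y, centres, sigmas):
    """Least-squares output weights a = pinv(Phi) @ y, eqs. (2)-(3)."""
    Phi = rbf_design_matrix(X, centres, sigmas)
    return np.linalg.pinv(Phi) @ y

def rbf_predict(X, centres, sigmas, a):
    """Network output f(x_j) = sum_i a_i * phi_i(x_j)."""
    return rbf_design_matrix(X, centres, sigmas) @ a
```

With fixed centres and spreads, training the output layer therefore reduces to a single linear least-squares solve, which is the property exploited by the hybrid training schemes referred to in the introduction.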
3. TRAINING METHODS

3.1 Off-line training
In this approach the centre locations, the spreads of the centres and the output linear weights are all determined under a supervised learning procedure based on unconstrained deterministic optimisation. Basically, new parameter values are calculated in an iterated manner in order to minimise the cost function,
$$\mathcal{E} = \frac{1}{2}\sum_{i=1}^{N} \big(t(i) - y(i)\big)^2 \qquad (4)$$

where t is the vector of target values and the vector y is defined from eq. (1) as y = [f(x_1), ..., f(x_N)]^T. Eq. (4) can also be rewritten as:
2. OVERVIEW OF RBF NNs

A RBF NN consists of three fully connected layers. The first is the input layer, connecting the source nodes to the hidden layer of the network, which is composed of a certain number of units, called neurons. The outputs of the hidden layer are then linearly combined by a set of parameters to produce the overall network response in the output layer.
$$\mathcal{E} = \frac{1}{2}\,\|t - y\|^2 \qquad (5)$$
where u = [a_1, ..., a_n]^T, v = [c_1^T, ..., c_n^T, σ_1, ..., σ_n]^T and w = [v^T, u^T]^T. The outputs of the neurons form the N-by-n matrix O(v), whose elements are the basis function values φ_i(x_j).
The LM method is iterated over the window Z_M(k) = {x_i, y_i}, i = k-M+1, ..., k, of the most recent training data until the termination criteria (9) are met. At k+1 the first input-output pair in Z is discarded and the one pertaining to time step k+1 is added. Assuming that the dimension of Z is large enough, two conclusions can be drawn: its statistical properties at k+1 are essentially the same as at k, and its distribution in input space is representative of the process data to some extent. As a consequence, the point w in parameter space that minimises Ω at time k+1 will be the same as at k, up to a slight correction. The choice of M is application and problem dependent, and care should be taken with its choice in order to satisfy the assumptions made. A more detailed explanation and analysis of both the off-line and on-line algorithms can be found in Ferreira et al. (2001) and Ferreira and Ruano (2000).
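The sliding-window adaptation just described can be outlined as follows. This is a schematic sketch only: `lm_train` stands for an assumed routine that iterates the LM method on the current window until the termination criteria are met, and is not part of the original paper.

```python
from collections import deque

def online_adaptation(stream, M, w0, lm_train):
    """Sliding-window on-line learning: at each time step the LM optimisation is
    re-run on the last M input-output pairs, warm-started from the previous
    parameter vector (the 'slight correction' discussed above)."""
    Z = deque(maxlen=M)   # window Z_M(k); the oldest pair is dropped automatically
    w = w0                # network parameters (centres, spreads, linear weights)
    for k, (x_k, y_k) in enumerate(stream):
        Z.append((x_k, y_k))
        if len(Z) == M:                      # adapt only once the window is full
            w = lm_train(list(Z), w_init=w)  # iterate LM until criteria (9) hold
        yield k, w
```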
The network can then be described by eq. (6), where the linear dependence of the network on the output weights and the dependence of O on v have been made explicit:

$$y = O(v)\,u \qquad (6)$$
Eq. (5) now becomes:

$$\mathcal{E}(w) = \frac{1}{2}\,\|t - O(v)\,u\|^2 \qquad (7)$$
The formulation presented so far involves all the network parameters in the optimisation procedure. As already mentioned, the output weights can be optimally determined by the least squares (LS) solution. Substituting the target values vector, t, in eq. (6), denoting matrix O(v) by A and solving for u yields:
$$u = A^{+} t$$

where A⁺ stands for the pseudo-inverse of matrix A. Substituting this result in eq. (7) gives the new training criterion:

$$\mathcal{E}(v) = \frac{1}{2}\,\|t - A A^{+} t\|^2 \qquad (8)$$

4. THE ARX MODEL

The ARX model was previously employed (Ferreira et al., 2000b) in the identification of dynamic temperature models. The model parameters are recursively identified by a recursive least squares (RLS) algorithm with exponential forgetting and anti-windup of the estimator (Astrom and Wittenmark, 1989). As the parameters are slowly varying, exponential forgetting was applied in order to have a time-varying weighting of the data, thereby keeping track of parameter changes and consequently of the process dynamics. Some of the input signals of the model being identified show no change over relatively long periods of time, for instance solar radiation during the night. This may lead to a typical problem of exponential forgetting, called estimator windup.
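A minimal sketch of one RLS step with exponential forgetting is given below. The conditional update, which freezes the estimate when the regressor carries almost no new information, is only an illustrative precaution against estimator windup and is not necessarily the anti-windup mechanism of Astrom and Wittenmark (1989).

```python
import numpy as np

def rls_step(theta, P, phi, y, lam=0.99, min_excitation=1e-6):
    """One recursive least squares update with forgetting factor lam.
    theta: parameter estimate, P: covariance matrix, phi: regressor, y: new output."""
    if phi @ P @ phi < min_excitation:
        return theta, P                        # skip update: avoids covariance windup
    K = P @ phi / (lam + phi @ P @ phi)        # gain vector
    e = y - phi @ theta                        # one-step-ahead prediction error
    theta = theta + K * e
    P = (P - np.outer(K, phi @ P)) / lam       # forgetting keeps the estimator alert
    return theta, P
```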
This new training criterion does not depend on the linear parameters, u, and explicitly incorporates the fact that, whatever values the non-linear parameters v take, the u parameters employed are the optimal ones. For non-linear LS problems the LM algorithm is recognised as the best method, as it exploits the sum-of-squares characteristic of the problem (Ruano et al., 1992). Let Ω_k denote the training criterion in iteration k. The optimisation procedure is iterated until a set of termination criteria is met (Gill et al., 1981). Assume θ_k is a measure of absolute accuracy, where τ_f is a measure of the desired number of correct figures in the objective function: θ_k = τ_f (1 + Ω_k).
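The reduced criterion (8) can be evaluated for a given set of non-linear parameters as in the following self-contained sketch; the packing of centres and spreads into v is an illustrative choice, not the authors' implementation. A general-purpose non-linear least-squares routine, e.g. scipy.optimize.least_squares applied to the residual vector, could then play the role of the LM iteration described in the text.

```python
import numpy as np

def reduced_criterion(v, X, t, n_centres):
    """Criterion (8): E(v) = 0.5 * ||t - A A^+ t||^2, with A = O(v).
    v packs the centres and the spreads; the optimal linear weights u = A^+ t
    are implicit and never appear as free optimisation variables."""
    d = X.shape[1]
    centres = v[:n_centres * d].reshape(n_centres, d)
    sigmas = v[n_centres * d:]
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    A = np.exp(-d2 / sigmas ** 2)              # hidden-layer output matrix O(v)
    r = t - A @ (np.linalg.pinv(A) @ t)        # residual with the optimal weights
    return 0.5 * float(r @ r)
```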
5. EXPERIMENTAL SETUP

The experiments consist of predicting the greenhouse inside air temperature employing the three methods presented in the previous sections. The input-output model structure was selected from a previous work (Cunha et al., 1996), using the ARX model, where several hypotheses were tested and the best one chosen by means of the Akaike information criterion (Akaike, 1974). It is a second-order model with one delay from the outside solar radiation to the inside air temperature. The data set used is composed of 4257 points acquired with a sample rate of 5 minutes. This data set is shown in fig. 2. All DC terms were subtracted from the signals, which were then scaled to an amplitude of one, in the [-0.5, 0.5] interval. Table 1 shows the values of the DC terms subtracted from the signals and the amplitude intervals from which they were scaled. The prediction horizon considered is 24 time steps. For the off-line trained NNs the complete test data set will also be considered as a prediction horizon. Three sampling times are used, 5, 10 and 15 minutes, corresponding to 2, 4 and 6 hours of predicted values, respectively.
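The signal preprocessing described above can be sketched as follows. Taking the mid-range value as the DC term is an assumption made here so that the scaled signal falls exactly in the [-0.5, 0.5] interval; the actual constants used by the authors are those listed in Table 1.

```python
import numpy as np

def preprocess(signal):
    """Remove the DC term and scale the signal to an amplitude of one, [-0.5, 0.5]."""
    lo, hi = float(signal.min()), float(signal.max())
    dc = (hi + lo) / 2.0                  # assumed DC term (mid-range value)
    scaled = (signal - dc) / (hi - lo)    # amplitude one: values lie in [-0.5, 0.5]
    return scaled, dc, hi - lo

def postprocess(scaled, dc, span):
    """Invert the preprocessing to recover engineering units."""
    return scaled * span + dc
```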
The optimisation stops when all the following conditions are met:

$$\Omega_{k-1} - \Omega_k < \theta_k, \qquad \|v_{k-1} - v_k\| < \sqrt{\tau_f}\,(1 + \|v_k\|), \qquad \|g_k\| \leq \sqrt[3]{\tau_f}\,(1 + \Omega_k) \qquad (9)$$
g_k is the gradient vector involved in the LM optimisation method.
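One plausible implementation of such a stopping test, assuming the three standard conditions of Gill et al. (1981) on criterion decrease, parameter change and gradient norm, is sketched below; the exact form used by the authors may differ.

```python
import numpy as np

def lm_should_stop(E_prev, E_curr, v_prev, v_curr, g, tau_f):
    """Return True when the decrease of the criterion, the change of the non-linear
    parameters and the gradient norm are all small, relative to tau_f."""
    theta = tau_f * (1.0 + E_curr)                      # absolute accuracy measure
    small_decrease = (E_prev - E_curr) < theta
    small_step = np.linalg.norm(v_prev - v_curr) < np.sqrt(tau_f) * (1.0 + np.linalg.norm(v_curr))
    small_gradient = np.linalg.norm(g) <= tau_f ** (1.0 / 3.0) * (1.0 + E_curr)
    return small_decrease and small_step and small_gradient
```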
3.2 On-line learning
The on-line learning algorithm considered derives from the LM method presented in the previous section, and the reasoning behind its implementation is as follows. The LM optimisation method is iterated a certain number of times, at each time step k, over a subset Z_M(k) containing the M most recent input-output pairs of the training data.
Fig. 3. Results obtained with the ARX model.

Fig. 2. Input-output data. From top: inside air temperature, outside air temperature, inside relative humidity and outside solar radiation.
Table 1. Values involved in signal preprocessing.

Table 2. Summary data for the ARX model predictions (RMSE).

              t+1       t+8      t+16       t+24   |   Min.      Mean       Max.
  5 min    0.0119    0.1258    3.0686   134.5828   | 0.0017    1.6703   1.3327E03
 10 min    0.0163    0.1049    0.8223    13.2921   | 0.0043    0.3770    127.7854
 15 min    0.0178    0.0940    0.3557     2.0277   | 0.0046    0.1593     14.4282
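The two groups of values reported in Table 2 (and later in Tables 3 and 4) can be obtained from a matrix of prediction errors as sketched below, assuming err[k, h] holds the error of the (h+1)-step-ahead prediction issued at time step k; the variable names are illustrative.

```python
import numpy as np

def horizon_rmse_summary(err):
    """err[k, h]: prediction error at horizon instant h+1 for the horizon started at k.
    Returns the RMSE at each horizon instant over all time steps (e.g. t+1, t+8,
    t+16, t+24) and the min/mean/max of the per-horizon RMSE values."""
    rmse_per_instant = np.sqrt((err ** 2).mean(axis=0))
    rmse_per_horizon = np.sqrt((err ** 2).mean(axis=1))
    summary = (rmse_per_horizon.min(), rmse_per_horizon.mean(), rmse_per_horizon.max())
    return rmse_per_instant, summary
```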
Table 2 summarises the results obtained by the ARX model. The first group of cells presents the RMSE obtained at discrete instants in the prediction horizon over all time steps. The second group of cells shows the minimum (Min.), mean and maximum (Max.) RMSE values obtained over each prediction horizon, for all time steps. Observing fig. 3 and analysing table 2 it can be seen that better results are obtained with larger sampling intervals, sometimes by an order of magnitude. Increasing the sampling interval probably has a filtering effect which produces some denoising of the input signals, resulting in a better signal-to-noise ratio. Considering the NNs adapted on-line by the LM method, fig. 4 presents the RMSE values obtained over the prediction horizon at all time steps. Each column of plots corresponds to one of the considered sampling intervals, as indicated. The three plots at the top relate to the LM method minimising (8), while the bottom plots relate to the minimisation of (5). Each of the lines in all plots corresponds to one value of the termination criterion (9): the solid line is for τ_f = 0.1, the dashed line for τ_f = 0.05 and the dotted line for τ_f = 0.01. Table 3 shows the RMSE values obtained over the prediction horizon for all time steps. This table is composed of six groups of cells arranged in three lines and two columns. Each grouped line corresponds to one of the sampling intervals considered, 5, 10 and 15 minutes, respectively from the top. Each grouped column relates to the minimisation of (8) and of (5). For each group of cells the columns correspond to the values of τ_f involved in the termination criterion (9), and the lines correspond to the minimum, mean and maximum (from the top) RMSE values obtained. Fig. 5 presents results obtained with two NNs adapted on-line.
For the off-line trained NNs the first 1000, 500 and 333 points, corresponding to each sampling interval, are employed to train the network and the rest for testing purposes. Over the prediction horizon the model's predicted values are used for the inputs related to past values of the inside air temperature. The size of the networks was chosen as the best performing one from a previous comparison study (Ferreira et al., 2001): 6 and 8 neurons are used for the on-line adapted and off-line trained NNs, respectively. For the on-line NNs three different values, τ_f in {0.1, 0.05, 0.01}, of the termination criterion (9) are tested and M takes the size of one day of data. In the off-line case the value of 0.01 is employed. The initial values for the centres of the RBF NNs are obtained by one iteration of the optimal adaptive k-means clustering algorithm (Chinrungrueng and Sequin, 1995). The spreads of the Gaussian activation functions are determined as in Haykin (1998). The initial linear weight vector for the LM minimising the new criterion is determined using the LS optimal values with a small perturbation.
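The multi-step prediction scheme described above, in which the model's own temperature predictions replace the measured past-temperature inputs along the horizon, can be sketched as follows; the model callable and the regressor layout are illustrative placeholders, and the exogenous inputs are assumed to be available over the horizon.

```python
def predict_horizon(model, past_temp, exogenous, horizon=24):
    """Recursive multi-step prediction of the inside air temperature.
    past_temp: the two most recent inside-temperature values (second-order model).
    exogenous[h]: exogenous inputs (outside temperature, solar radiation, inside
    relative humidity) assumed known or forecast for horizon step h."""
    temp = list(past_temp)
    predictions = []
    for h in range(horizon):
        x = (temp[-1], temp[-2]) + tuple(exogenous[h])   # regressor for this step
        y_hat = model(x)                                  # one-step-ahead prediction
        predictions.append(y_hat)
        temp.append(y_hat)     # feed the prediction back as a past-temperature input
    return predictions
```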
6. RESULTS AND DISCUSSION

Regarding the ARX model recursively identified with the RLS estimator, fig. 3 presents results for the various sampling intervals considered. For every sampling interval the plot at the top is the root mean square error (RMSE) obtained over the prediction horizon at every time step. The second graph presents the RMSE for each prediction instant over all time steps. The third and fourth plots present prediction sequences corresponding to the best RMSE obtained and to the one nearest to the mean.
Fig. 5. Results obtained with the on-line adapted NNs.
Fig. 4. RMSE values over the prediction horizon for the on-line adapted NNs.
Table 3. RMSE values obtained at each time step, over the prediction horizon.

                        LM minimising (8)            LM minimising (5)
          τ_f:        0.01     0.05      0.1        0.05     0.01      0.1
  5 min   Min.      0.0016   0.0018   0.0019      0.0017   0.0020   0.0021
          Mean      0.0299   0.0279   0.0326      0.0307   0.0273   0.0301
          Max.      0.1987   0.2909   0.6066      0.3490   0.3073   0.5662
 10 min   Min.      0.0022   0.0029   0.0035      0.0037   0.0034   0.0023
          Mean      0.0379   0.0359   0.0385      0.0343   0.0485   0.0320
          Max.      0.3155   0.2619   0.2538      0.1184   0.1809   1.3554
 15 min   Min.      0.0031   0.0055   0.0032      0.0037   0.0046   0.0053
          Mean      0.0481   0.0510   0.0419      0.0350   0.0390   0.0668
          Max.      0.1712   0.4599   0.5108      0.7324   0.1119   0.2270

Table 4. RMSE values obtained over the prediction horizon by the off-line trained NNs.

                   LM minimising (8)             LM minimising (5)
                 Min.     Mean     Max.        Min.     Mean     Max.
  5 min        0.0013   0.0482   0.2060      0.0020   0.0415   0.1815
 10 min        0.0055   0.0575   0.2062      0.0029   0.0639   0.2506
 15 min        0.0124   0.0559   0.1451      0.0058   0.0546   0.1500
The plots on the left are for a NN designed by the minimisation of (8), with a sampling interval of 15 minutes and τ_f = 0.01. The plots on the right correspond to a NN designed by minimising (5), under the same conditions. The plots at the top correspond to the best RMSE obtained over one prediction horizon, those in the middle are for the RMSE nearest to its mean value, and the bottom plots show the distribution of RMSE values obtained. Looking at fig. 4 it can be seen that almost all networks present similar results. The exception is those obtained by minimisation of (5), which seem to be more sensitive to variations of the sampling interval and to the values of τ_f. Analysing table 3 it can be observed that, in general, better results are obtained with smaller sampling intervals. Also, the LM methods are less sensitive to the variation of τ_f than they are to the sampling interval. Fig. 5 shows that the distribution of errors for the LM method minimising (8) is slightly better concentrated near zero. For the off-line trained networks, table 4 presents a summary of the RMSE values obtained during prediction. Each line corresponds to the sampling intervals of 5, 10 and 15 minutes, respectively, from the top. Fig. 6 shows results for the off-line trained networks with data acquired with a sampling interval of 5 minutes. The left plots relate to networks obtained by minimisation of (8) and those on the right are for networks determined by minimising (5). The plots at the top correspond to the best RMSE obtained over one prediction horizon, those in the middle to the mean RMSE, and the bottom graphs present the RMSE distribution.
Fig. 6. Results obtained with the off-line trained NNs.

The behaviour is the same as for the on-line adapted networks, although the RMSE results can, in general, be considered slightly worse. Table 5 presents the RMSE values obtained during training and prediction for the networks designed by minimisation of the standard and of the new error criterion. In this case prediction is done from the beginning of the test data set, as if no measured values of the inside temperature existed for the rest of the data set. SI stands for sampling interval. Table 6 shows data related to the absolute error values obtained during prediction for the same networks. Each line corresponds to the sampling intervals 5, 10 and 15 minutes (from top). Fig. 7 shows results regarding the off-line networks for data with a sampling rate of 5 minutes. The left plots are for the network minimising (8) and those on the right for the network minimising (5). The plots at the top present the actual (solid) and predicted (dotted) values for the last 1257 points of the data set. The bottom plots show the prediction error distribution.
Table 5. Summary of RMSE values obtained with the off-line trained NNs.

           LM minimising (8)        LM minimising (5)
   SI       train      test         train      test
    5      0.0057    0.0962        0.0054    0.1064
   10      0.0085    0.0916        0.0075    0.0876
   15      0.0095    0.0703        0.0087    0.0642
Table 6. Summary of the absolute prediction error values obtained with the off-line trained NNs.

             LM minimising (8)               LM minimising (5)
   SI      Min.     Mean     Max.          Min.     Mean     Max.
    5     0.0000   0.0707   0.3246        0.0000   0.0760   0.3497
   10     0.0001   0.0687   0.3273        0.0002   0.0678   0.2788
   15     0.0000   0.0610   0.1905        0.0001   0.0525   0.1733
Fig. 7. Results obtained with the off-line adapted NNs.
These results show that the model is stable and is not sensitive to its own predictions. It can be expected that with data from another season of the year these results would be worse.
7. CONCLUSIONS AND FUTURE WORK

Analysing the results shown in the last section, it can be concluded that the NN models provide substantially better results than the ARX model. Comparing on-line adaptation with the use of a fixed model, the former provides slightly better results. Larger improvements should be expected in continuous on-line operation, as the data vary significantly throughout the year. Regarding the use of criterion (8) versus (5) for on-line adaptation, in general the former achieves slightly better results with a decrease in computation time. Future work will focus on model input selection, as the structure used was derived from an ARX model. Additionally, a means to predict the external perturbations and the inside relative humidity must be devised.
REFERENCES

Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control AC-19, 716-723.

Astrom, K.J. and B. Wittenmark (1989). Parameter tracking and estimator windup. In: Adaptive Control. First ed., Chap. 11, pp. 410-413. Addison-Wesley.

Broomhead, D.S. and D. Lowe (1988). Multivariable functional interpolation and adaptive networks. Complex Systems 2, 321-355.

Chinrungrueng, C. and C.H. Sequin (1995). Optimal adaptive K-means algorithm with dynamic adjustment of learning rate. IEEE Transactions on Neural Networks 6(1), 157-169.

Cunha, J. Boaventura, A.E. Ruano and E.A. Faria (1996). Dynamic temperature models of a soilless greenhouse. In: Proceedings of the 2nd Portuguese Conference on Automatic Control. Vol. 1. Portuguese Association of Automatic Control. Porto, Portugal. pp. 77-81.

Ferreira, P.M. and A.E. Ruano (2000). Exploiting the separability of linear and nonlinear parameters in radial basis function networks. In: IEEE Symposium 2000: Adaptive Systems for Signal Processing, Communications, and Control. Lake Louise, Alberta, Canada. pp. 321-326.

Ferreira, P.M., E.A. Faria and A.E. Ruano (2000a). Application of radial basis function neural networks to a greenhouse inside air temperature model. In: IFAC Agricontrol 2000, International Conference on Modelling and Control in Agriculture, Horticulture and Post-harvested Processing. Vol. 2. Wageningen, The Netherlands. pp. 172-177.

Ferreira, P.M., E.A. Faria and A.E. Ruano (2000b). Design and implementation of a real-time data acquisition system for the identification of dynamic temperature models in a hydroponic greenhouse. In: Acta Horticulturae 519: Computers and Automation, Electronic Information in Horticulture. ISHS - XXV International Horticultural Congress. Brussels, Belgium. pp. 191-197.

Ferreira, P.M., E.A. Faria and A.E. Ruano (2000c). Neural network models in greenhouse environmental control. In: Proceedings of the 6th International Conference on Engineering Applications of Neural Networks (EANN'2000) (Dimitris Tsaptsinos, Ed.). Kingston University, England. pp. 100-107.

Ferreira, P.M., E.A. Faria and A.E. Ruano (2001). Neural network models in greenhouse air temperature prediction. Neurocomputing. To appear in a forthcoming special issue on Engineering Applications of Neural Networks.

Gill, Philip E., Walter Murray and Margaret H. Wright (1981). Practicalities. In: Practical Optimization. Chap. 8, p. 306. Academic Press, Inc.

Haykin, Simon (1998). Learning strategies. In: Neural Networks: A Comprehensive Foundation. Second ed., Chap. 5, p. 299. Prentice Hall.

Ruano, A.E., P.J. Fleming and D.J. Jones (1992). A connectionist approach to PID autotuning. In: IEE Proceedings, Part D. Vol. 139. Brighton, U.K. pp. 279-285.