Copyright © IFAC Transportation Systems, Chania, Greece, 1997
NEURO-FUZZY TECHNIQUES FOR TRAFFIC CONTROL
J.J. Henry, J.L. Farges and J.L. Gallego
Centre d'Etudes et de Recherches de Toulouse, BP 4025 - Toulouse Cedex 31055, Tel: +33 562252TI9, Fax: +33 562252564
e-mail:
[email protected]
Abstract: Intersection stage control using Forward Dynamic Programming (FDP) with a sample time of five seconds is already effective in the field. Neuro-fuzzy techniques are proposed here for controlling each light each second. Rules, fuzzyfication and inference are modeled by a neural network. For each signal, the neuro-fuzzy control selects the highest membership value between 'switch on' and 'switch off' and presents it to a Petri net. For neuro-fuzzy acceleration of FDP, only controls with low membership value differences are enumerated. Simulations on different intersections show delay reductions with respect to fixed time from 0% to 30% for neuro-fuzzy control and from 15% to 35% for neuro-fuzzy acceleration of FDP.
Keywords: Transportation control, Traffic control, Fuzzy control, Neural networks, Optimal control, Algorithms, Dynamic Programming.
1. INTRODUCTION
Modern traffic control techniques applied to intersection control have been proven to be effective in the field (Mauro and Di Taranto, 1990; Gartner, 1990; Farges, et al., 1991). Those methods are based on the modeling of the process by discrete time state equations, on the computation of the control with a sampling time of a few seconds, on the separation principle and on a rolling horizon optimization.
PRODYN, developed by CERT and presented for the first time at the IFAC Baden-Baden symposium (Henry, et al., 1983), is a real time control algorithm based on commutation of stages, on a 5 second sampling period, on a Bayesian estimation of queues and on the use of Forward Dynamic Programming (FDP) optimization on a rolling horizon.
To cover more functionalities, CERT undertook research aiming at a one second sampling period and at the control of each light with the same, or even better, optimization. This long term work presents three main aspects: the design of a controller which filters the controls which are not safe and can be modeled by state equations, the improvement of the estimation and the acceleration of the optimization process.
This third aspect implies the search for alternatives to FDP for the optimization of intersection control. Fuzzy control, neural learning and the combination of these methods (Takagi, 1990; Werbos, 1992; Langari and Berenji, 1992) are presented as attractive solutions to control problems. For those solutions, the fuzzy controller is modeled by an "action selection" neural network which produces, in accordance with given rules, a control for any given value of the state of the process (Langari and Berenji, 1992).
Neuro-fuzzy control is most of the time used to control processes without having a model of the process. In the case of traffic control, models of the process are already developed and allow the combination of optimal control methods with neuro-fuzzy techniques, leading to heuristic searches. The knowledge on the process control given by the rules is mixed with the knowledge on the process behavior given by a state equation model.
The first part of the paper is devoted to the application of the neuro-fuzzy approach to intersection control. A set of rules is defined and from this set a neural network is built. This neural network is used in the second part to accelerate an FDP algorithm. The third part is devoted to the test of the developed methods using a microscopic simulator.
2. NEURO-FUZZY CONTROL
The main steps in the development of a neuro-fuzzy control are:
(i) Definition of fuzzy rules.
(ii) Translation of fuzzyfication, inference through rules and defuzzyfication into a neural network.
2.1. Definition of fuzzy rules
Here, the whole loop of control includes a state estimator which is able to produce at each sample time an estimation of the state of the intersection. This state includes the state of the controller, which is perfectly known, and, for each link of the intersection, the queue, the vehicles traveling at free speed and the probability of incident.
The objective of the fuzzy rules is to describe, depending on the state of the intersection ($x$), which kind of control must be applied at each signal. The rules are defined in order to switch on signals with demand and switch off signals without demand, to switch on the set of compatible signals which ensures the maximum output flow, to avoid switching on a signal in conflict with another signal which may reach its maximum waiting time, to switch off signals which present incidents upstream and to switch off signals with saturated links downstream. Those rules are based on four specific quantities which are given or can be computed from the intersection state: (i) the demands in the links, which are defined as the sum of queuing vehicles and vehicles traveling at free speed, (ii) the probability of an incident in a link, (iii) the value of the remaining time before reaching the maximum waiting time for each signal and (iv) the color of the signals.
Demand rules. This set of rules states that if the demand on one link is different from zero the downstream signal must be switched on and that a signal with null demand on upstream links must be switched off.
Maximum output flow rules. In order to describe those rules the conflict graph between signals ($A$) is introduced. Two signals $i$ and $j$ are antagonists if and only if $iAj$. Then if the set of green signals is denoted by $\varphi$, the rules consider all the possible sets of signals ($\psi$) such that the sum of the saturation flows associated to the signals of $\psi$ is higher than the sum for the signals of $\varphi$ and such that all signals of $\psi$ are antagonist to the signals of $\varphi$. Then if the demands of all signals of $\psi$ are higher than given bounds, the signals of $\psi$ must be switched on and the signals of $\varphi$ must be switched off. The bounds are computed from saturation flows and clearance time.
Maximum waiting time rules. These rules take into account the pairs of antagonist signals. Any signal with a low remaining time before reaching its maximum waiting time switches off its own antagonist signals.
Incident rules. Those rules state that if an incident is detected in a link upstream of a signal, this signal must be switched off. The incident is detected by the estimator.
Queue spill back rules. Those rules state that if the demand of a link downstream of and significantly fed by a signal is too large, the signal must be switched off.
2.2. Neural fuzzyfication
The fuzzyfication process consists in associating to a numerical variable membership values for the different intervals associated to a linguistic split of the variable. The fuzzyfication is performed on demand variables for the demand, maximum output flow and queue spill back rules. The linguistic terms are 'null' and 'not null' for demand rules, 'higher than bound' and 'lower than bound' for the maximum output flow rule and 'too large' and 'not too large' for the queue spill back rule. The fuzzyfication is performed on the remaining time variable for the maximum waiting time rule with the linguistic terms 'low' and 'high'. For the incident rule the membership value for 'incident' is a function of the probability of incident. The color of signals, which is used in the maximum output flow and waiting time rules, is not fuzzyfied.
The neural implementation of the fuzzyfication is based on the sigmoid function:
$$S(y) = (1 + \exp(-y))^{-1} \qquad (1)$$
This process consists in defining the membership function as the difference of two sigmoids for an interval between two values and as the output of one sigmoid for an interval between one value and infinity. For instance:
$$\mu_{\text{not null}}(x) = S(w\,d_l(x) - b) \qquad (2)$$
and:
$$\mu_{\text{null}}(x) = 1 - S(w\,d_l(x) - b) \qquad (3)$$
where $\mu_{\text{interval}}(x)$ is the membership value for the interval, $d_l(x)$ is the demand on link $l$ computed from the value of the state $x$, and the coefficients $w$ and $b$ are respectively the weight of the input and the bias.
For the color of signals, which is not fuzzyfied, sigmoids are also used in order to be consistent with the other variables, but the weight of the input variable is quite high and the bias is set to half of the weight. The set of sigmoids associated to fuzzyfication constitutes the first layer of the action selection neural network.
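The sigmoid-based fuzzyfication described in this section can be illustrated by a minimal Python sketch. The numerical values of the weight `w` and of the bias `b`, as well as all function names, are assumptions chosen for illustration; the paper does not give them.

```python
import math

def sigmoid(y):
    """Equation (1): S(y) = 1 / (1 + exp(-y))."""
    return 1.0 / (1.0 + math.exp(-y))

def demand_memberships(demand, w=4.0, b=2.0):
    """Equations (2) and (3): memberships of 'not null' and 'null'.
    w and b are illustrative values, not taken from the paper."""
    not_null = sigmoid(w * demand - b)
    return {"not null": not_null, "null": 1.0 - not_null}

def interval_membership(value, low, high, w=4.0):
    """Interval between two values: difference of two sigmoids."""
    return sigmoid(w * (value - low)) - sigmoid(w * (value - high))

def color_membership(is_green, w=20.0):
    """Crisp color input (0 or 1): high weight, bias equal to half the weight."""
    return sigmoid(w * is_green - w / 2.0)
```

With these assumed parameters, the two demand memberships always sum to one, and the color input is mapped to a value very close to 0 or 1, which is the intended effect of a high weight with a bias of half the weight.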
network the control with the highest membership value is selected for each signal. Because they are produced by fuzzy rules, those controls may not fulfill the intersection safety constraints. It has been shown by Gallego, et al. (1996) that the design of controllers using timed Petri nets can be made in such a way that the controller acts as a filter of switch-over signal sequences. If the input sequences do not comply with safety constraints, the output sequences applied in the field, on the contrary, do.
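The selection of the control with the highest membership value can be sketched as follows. The data layout (a dictionary of `(mu_on, mu_off)` pairs), the signal names and the tie-breaking in favor of 'switch on' are assumptions made for this illustration. The safety filtering by the timed Petri net is not modeled here.

```python
def defuzzify(memberships):
    """Pick, for each signal, the control with the highest membership value.

    memberships maps a signal name to a (mu_on, mu_off) pair; the names
    and the data layout are hypothetical.
    Returns {signal: (control, membership)}.
    """
    selected = {}
    for signal, (mu_on, mu_off) in memberships.items():
        if mu_on >= mu_off:  # tie resolved as 'switch on' (assumption)
            selected[signal] = ("switch on", mu_on)
        else:
            selected[signal] = ("switch off", mu_off)
    return selected
```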
The selected values are successively presented to a timed Petri net in decreasing order of membership. As described by Gallego, et al. (1996), this Petri net implements the following safety constraints: cycle of colors with minimum or fixed timings, clearance time for antagonisms between signals and maximum waiting time.
For a future use of the "action selection" network in a backward mode such as learning by backpropagation, opposite Boolean values would be respectively assigned to the membership values 'switch on' and 'switch off' of each signal.
2.3. Neural inference
The rules are described in the following formalism:
If ([$c_{1,1}$ and $c_{1,2}$ and ... $c_{1,n(1)}$] or [$c_{2,1}$ and $c_{2,2}$ and ... $c_{2,n(2)}$] or ... or [$c_{m,1}$ and $c_{m,2}$ and ... $c_{m,n(m)}$]) then $r$
The fuzzy logic of Lukasiewicz is used in order to compute the membership value of $r$ from the membership values of $c_{i,j}$, which are given by the first layer. Note that to each index '$i,j$' corresponds a semantic description such as, for instance, 'demand link $l$ null'. The operations 'and' and 'or' are formalized as a sum of the variables plus a bias (different for 'and' and 'or') which is the argument of a maximum or minimum function. Denoting by $\mu_{i,j}(x)$ the membership value of $c_{i,j}$ and $\mu_r(x)$ the membership value for $r$, we have:
$$a_i(x) = \max\left(0,\ \mu_{i,1}(x) + \mu_{i,2}(x) + \dots + \mu_{i,n(i)}(x) - n(i) + 1\right) \qquad (4)$$
and:
$$\mu_r(x) = \min\left(1,\ a_1(x) + a_2(x) + \dots + a_m(x)\right) \qquad (5)$$
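The two Lukasiewicz operations of equations (4) and (5) can be written directly in Python. This is a minimal sketch; the function names and the list-of-lists layout of a rule are hypothetical.

```python
def fuzzy_and(memberships):
    """Equation (4): Lukasiewicz 'and' of n membership values."""
    n = len(memberships)
    return max(0.0, sum(memberships) - n + 1)

def fuzzy_or(values):
    """Equation (5): Lukasiewicz 'or' (bounded sum)."""
    return min(1.0, sum(values))

def infer(conjunctions):
    """Membership of r for a rule 'if [c11 and ...] or [c21 and ...] then r'.
    conjunctions is a list of lists of membership values."""
    return fuzzy_or([fuzzy_and(c) for c in conjunctions])
```

With crisp inputs (0 or 1) the two functions reduce to the Boolean 'and' and 'or', which is a quick way to check a rule base before using graded memberships.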
Noticing that the membership values and the $a_i(x)$ values vary between 0 and 1, the max(0, .) and min(1, .) functions can be merged in a single function. This function is equal to 0 from -infinity to 0, equal to the identity from 0 to 1, equal to 1 from 1 to +infinity and approximated by a sigmoid. Equations (4) and (5) become:
$$a_i(x) = S\left(\alpha\mu_{i,1}(x) + \alpha\mu_{i,2}(x) + \dots + \alpha\mu_{i,n(i)}(x) - \alpha(n(i) - 1) - \alpha/2\right) \qquad (6)$$
and:
$$\mu_r(x) = S\left(\alpha a_1(x) + \alpha a_2(x) + \dots + \alpha a_m(x) - \alpha/2\right) \qquad (7)$$
where the value of $\alpha$ is $2\ln(9)$, which ensures that the error on the approximation is less than 10%. Finally the inference process is modeled in the action selection neural network by two additional layers of neurons. The first layer computes the membership values of the 'and' (for instance [$c_{1,1}$ and $c_{1,2}$ and ...]) and the second layer the membership value of the results ($r$).
3. SPEEDING FORWARD DYNAMIC PROGRAMMING
3.1. Optimization criterion
The criterion chosen for the rolling horizon optimization is the discounted delay. The form of the instantaneous reward $r()$ is:
$$r(x, u, i) = \alpha^i D(x, u) \qquad (8)$$
where $x$ is the state, $u$ the control, $i$ the time index inside the horizon, $\alpha$ the discount factor (a value slightly less than 1) and $D()$ the delay of vehicles for the time slice number $i$. At the end of the horizon a terminal criterion which is an approximation of the delay from the end of the horizon to infinity is added. The contribution of each queue to the terminal criterion is computed assuming that the average arrival and exit rates observed until now will stay constant. As a consequence there is a strong penalty on large queues at the end of the horizon.
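The discounted delay criterion can be illustrated by a short Python sketch. The time index starting at 0, the weighting of the terminal term by the discount factor and the default value of `alpha` are all assumptions; the paper only states that the discount factor is slightly less than 1 and that a terminal criterion is added at the end of the horizon.

```python
def discounted_delay(delays, terminal, alpha=0.98):
    """Sum alpha**i * delays[i] over the horizon, plus a terminal term.

    delays[i] stands for the delay D() on time slice i; terminal stands
    for the terminal criterion. Weighting the terminal term by
    alpha**len(delays) is an assumption of this sketch.
    """
    total = sum(alpha ** i * d for i, d in enumerate(delays))
    return total + alpha ** len(delays) * terminal
```

For example, with `alpha=0.5` and a constant delay of 10 over three time slices, the criterion without terminal term is 10 + 5 + 2.5 = 17.5.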
2.4. Defuzzyfication
For each signal the neural network produces a membership value for 'switch on' and a membership value for 'switch off'. For a forward use of the neural
3.2. Forward Dynamic Programming
Practical optimization by FDP was described by Farges (1990) and is based on equation (9).
computed as the absolute value of the difference between the membership values of 'switch on' and 'switch off'.
$$R_{i+1}(S_{i+1}) = \min_{u_i,\,S_i \,/\, f(x_i(S_i), u_i, i) \in S_{i+1},\ u_i \in U} \left\{ r(x_i(S_i), u_i, i) + R_i(S_i) \right\} \qquad (9)$$
where $i$ is the time index, $S$ a subset of the reachable state space where costs can be compared, $R$ is the optimal cost associated to a subset, $r()$ is the revenue given by equation (8), $x$ is the value of the state which ensures the optimal cost, $u$ is the control, $f()$ is the state equation and $U$ is the set of admissible controls. The initialization of the equation at the beginning of the horizon is:
$$x_1(S_1) = X, \quad R_1(S_1) = 0 \qquad (10)$$
where $X$ is the state predicted by the estimator and the set $S_1$ is the one such that $X \in S_1$. The information associated to a subset $S_i$ is the state $x_i(S_i)$, the optimal cost $R_i(S_i)$, and the control $u_{i-1}(S_i)$ and previous set $S_{i-1}(S_i)$ which ensure the minimization by (9). Naming this information a structure, the algorithm is as follows:
(a) Store inside structure 0 the initial state and a null cost. Link the structure 0 to the time 1 root. Do b for time 1 to the end of the horizon. At the end of the horizon select the structure which gets the lowest $R$. Recover controls using backward links $S_{i-1}(S_i)$.
(b) Do c for all the structures of the current time.
(c) Do d for all discrete values of control inside $U$.
(d) Apply the state equation to the state associated to the structure and the control to get a new state. Add the revenue to the cost of the structure to get a new cost. Determine the set associated to the new state. If no structure associated to the set is found in the next time chain do e. If one is found and its associated cost is larger than the new cost, do f.
(e) Create a new structure, link it to the next time and fill it with the new state, the new cost, the discrete value of control and the backward link.
(f) Fill up the existing structure with the new state, the new cost, the discrete value of control and the backward link.
Fig. 1. Confidence of fuzzy control (plane of $\mu_{\text{switch on}}$ and $\mu_{\text{switch off}}$ with regions 'confident switch on', 'confident switch off', 'conflicting rules' and 'no rules')
This leads to the following formula:
$$\upsilon(x) = \left| \mu_{\text{switch on}}(x) - \mu_{\text{switch off}}(x) \right| \qquad (11)$$
The bound on the confidence level is initially set to zero and modified at each sampling period ($k$) by the integration of a feedback of the difference between the time taken to compute the sub-optimal control ($T_r$) and a desired computation time ($T_d$) which is lower than one second.
$$\varepsilon_{k+1} = \varepsilon_k + K(T_d - T_r) \qquad (12)$$
The feedback coefficient ($K$) is set to a very low positive value in order to avoid an oscillatory behavior. At the initialization of the algorithm $\varepsilon$ is set to zero. This ensures a very quick computation of the control, because in that case there is no optimization by FDP and the algorithm consists only in applying the controls given by the neural network for the successive states of the trajectory over the horizon. Then $\varepsilon$ is increased by equation (12) until $T_r$ reaches $T_d$.
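The restriction of the enumeration by the confidence level of equation (11) and the adaptation of the bound by equation (12) can be sketched in Python as follows. The encoding of controls as 0 and 1, the function names and the data layout are assumptions made for this illustration, not the authors' implementation.

```python
from itertools import product

def confidence(mu_on, mu_off):
    """Equation (11): confidence level of the neural control of one signal."""
    return abs(mu_on - mu_off)

def candidate_controls(memberships, eps):
    """Enumerate only the signals whose confidence is lower than eps.

    memberships is a list of (mu_on, mu_off) pairs, one per signal.
    Returns a list of control tuples (1 = switch on, 0 = switch off).
    """
    options = []
    for mu_on, mu_off in memberships:
        neural_choice = 1 if mu_on >= mu_off else 0
        if confidence(mu_on, mu_off) < eps:
            options.append((0, 1))            # low confidence: try both
        else:
            options.append((neural_choice,))  # confident: keep neural control
    return list(product(*options))

def update_bound(eps, gain, t_desired, t_taken):
    """Equation (12): integral feedback on the bound."""
    return eps + gain * (t_desired - t_taken)
```

With `eps` equal to zero no signal is enumerated and the single candidate is the neural control, which matches the initialization described in the text. As `eps` grows, more signals are enumerated and the number of candidates passed to step (c) increases.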
3.3. Mixing FDP and Neuro-fuzzy control
In the context of signal by signal control (defined as the opposite of stage control) the multiplication of alternatives makes the optimization based on a traffic model very cumbersome. Indeed the step c of the FDP algorithm consists in ordering and enumerating all the switch-over possibilities at each signal. In order to minimize the exploration of alternatives, only the controls for which the confidence level ($\upsilon(x)$) is lower than a given bound ($\varepsilon$) are enumerated; the other controls are given by the neural network. As shown by figure 1, the confidence level of the control of one signal at a given sampling time can be
4. SIMULATION TESTS
4.1. Scenario
The two methods previously described (neuro-fuzzy control and acceleration of FDP by neuro-fuzzy control) were tested on three intersections of increasing complexity with the SITRA-B+ (Pommier and Barbier, 1992) microscopic simulator. TRANSYT 7 (Robertson, 1969) was used to compute plans for the reference case. The simplest intersection, shown on figure 2, presents 2 links and 2 signals. The length of the links is 100m. The average demand at this intersection is 1080 veh./h.
Fig. 2. Simple intersection (legend: signal, loop)
Figure 3 presents the medium complexity intersection. This 8 links, 6 signals intersection presents specific pockets and signals for left turns on one axis. The length of the pockets is 80m and the length of the four other links is 100m. The average demand at this intersection is 1650 veh./h.
Fig. 3. Medium complexity intersection (legend: signal, loop)
The complex intersection shown on figure 4 presents 10 links and 8 signals. The length of the 3 internal links is 50m and the length of other links 100m. The demand at this intersection is 1924 veh./h.
Fig. 4. Complex intersection (legend: signal, loop)
4.2. Results
Tables 1, 2 and 3 present the total travel time results for respectively simple, medium complexity and complex intersections. In those tables the total travel time at free speed, computed from the demand, is also given in order to estimate the total delay.
Table 1 Total travel time results for the simple intersection (veh.*s)
                             Travel time    Delay
Free speed travel time             30073        0
TRANSYT 7                          78252    48179
Neuro-fuzzy control                63733    33660
Neuro-fuzzy accel. of FDP          61200    31127
Table 2 Total travel time results for the medium complexity intersection (veh.*s)
                             Travel time    Delay
Free speed travel time             46205        0
TRANSYT 7                         140165    93960
Neuro-fuzzy control               119156    72951
Neuro-fuzzy accel. of FDP         112751    66546
Table 3 Total travel time results for the complex intersection (veh.*s)
                             Travel time    Delay
Free speed travel time             49485        0
TRANSYT 7                         162700   113215
Neuro-fuzzy control               162781   113296
Neuro-fuzzy accel. of FDP         145546    96061
Table 4 Delay reduction (%) with respect to optimized fixed time control
                             Intersection
Method                       simple   medium   complex
Neuro-fuzzy control              30       22         0
Neuro-fuzzy accel. of FDP        35       29        15
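The delays and delay reductions of the tables can be recomputed from the total travel times with a few lines of Python. The dictionary keys and function names are hypothetical; the figures are those of Tables 1 to 3.

```python
# Total travel times (veh.*s) copied from Tables 1 to 3.
TRAVEL_TIME = {
    "simple":  {"free": 30073, "transyt": 78252,  "nf": 63733,  "nf_fdp": 61200},
    "medium":  {"free": 46205, "transyt": 140165, "nf": 119156, "nf_fdp": 112751},
    "complex": {"free": 49485, "transyt": 162700, "nf": 162781, "nf_fdp": 145546},
}

def delay(case, method):
    """Delay = total travel time minus free speed travel time."""
    row = TRAVEL_TIME[case]
    return row[method] - row["free"]

def reduction(case, method):
    """Delay reduction (%) with respect to the TRANSYT 7 fixed time plan."""
    ref = delay(case, "transyt")
    return 100.0 * (ref - delay(case, method)) / ref

def average_reduction(method):
    """Mean of the rounded reductions over the three intersections."""
    values = [round(reduction(case, method)) for case in TRAVEL_TIME]
    return sum(values) / len(values)
```

Rounded to the nearest percent, `reduction` reproduces the entries of Table 4, and `average_reduction` gives about 17 for the neuro-fuzzy control and about 26 for the neuro-fuzzy acceleration of FDP.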
Table 4 summarizes the results obtained in terms of delay reduction.
4.3. Comments
The neuro-fuzzy control gives an average benefit of 17%. This average benefit is of the same order of magnitude as the benefit expected with PRODYN.
Thus this approach can be considered efficient and it is expected that the method is less sensitive to parameter tuning because only saturation flows are considered in the rule definition. Good results are obtained for simple and medium complexity intersections, but the neuro-fuzzy control is not able to produce a good control for the complex intersection. The use of a traffic model seems necessary to manage that kind of intersection.
The fusion of neuro-fuzzy control with optimal control based on the modeling of the traffic process gives a satisfactory solution with an average benefit on delay of 26%. This average benefit is larger than the benefit expected with PRODYN. It seems that in the fusion each method compensates the weak points of the other one: (i) the acceleration of the FDP algorithm by the neuro-fuzzy control allows a smaller sampling time than PRODYN and (ii) the actual optimization of the control by the FDP improves the performance of the neuro-fuzzy control. Moreover with this solution and the terminal criterion presented here the queues remain stable all along the simulation period for the large intersection. This is not the case for the neuro-fuzzy control or for an optimization without terminal criterion.
5. CONCLUSION
The objective of this study was to define and assess a satisfactory real time intersection control method with the constraints of a sampling time of 1 second and a signal by signal control. In a first step a neuro-fuzzy control method was developed and shows good results for simple and medium complexity intersections but a poor performance on a complex intersection. In a second step the idea of mixing neuro-fuzzy control with optimal control leads to a control method which presents good performance (15% to 35% delay reduction with respect to TRANSYT 7) for all tested intersections.
Future research will continue on that topic. Indeed the control found by neuro-fuzzy acceleration of FDP can be learned by the neural network using backpropagation or reinforcement techniques. If that process converges, a slow improvement of the control can be expected.
REFERENCES
Farges, J.L. (1990). Optimal control of pedestrian crossings. In: Proceedings of the 8o congresso brasileiro de automatica (J.N. Garcez and J.R. Brito de Souza (Ed.)), 348-353. IFAC / SBA.
Farges, J.L., I. Kamdem and J.B. Lesort (1991). Realization and test of a prototype for real time urban traffic control. In: Advanced Telematics in Road Transport - Proceedings of the DRIVE conference (CEC-DG XIII (Ed.)), 527-542. Elsevier, Amsterdam.
Gallego, J.L., J.L. Farges and J.J. Henry (1996). Design by Petri nets of an intersection signal controller. Transportation Research - Part C, Vol. 4, No 4, 231-248.
Gartner, N.H. (1990). OPAC: strategy for demand-responsive decentralized traffic signal control. In: Control Computers Communications in Transportation (J.P. Perrin (Ed.)), 241-244. Pergamon Press, Oxford.
Henry, J.J., J.L. Farges and J. Tuffal (1983). The PRODYN real time traffic algorithm. In: Proceedings of the 4th IFAC-IFIP-IFORS Conference on Control in Transportation Systems, Baden-Baden.
Langari, R. and H.R. Berenji (1992). Fuzzy logic in control engineering. In: Handbook of intelligent control (D.A. White and D.A. Sofge (Ed.)), 93-140. Van Nostrand Reinhold, New York.
Mauro, V. and C. Di Taranto (1990). UTOPIA. In: Control Computers Communications in Transportation (J.P. Perrin (Ed.)), 245-252. Pergamon Press, Oxford.
Pommier, L. and M. Barbier (1992). Manuel utilisateur de SITRA-B+. Technical report, CERT-ONERA, Toulouse.
Robertson, D.I. (1969). TRANSYT: a traffic network study tool. Technical report, TRRL, Crowthorne.
Takagi, H. (1990). Fusion technology of fuzzy theory and neural networks - survey and future directions. In: Proceedings of the International Conference on Fuzzy Logic & Neural Networks, 13-26. Iizuka, Japan.
Werbos, P.J. (1992). Neurocontrol and fuzzy logic: connections and designs. Journal of approximate reasoning, 185-219.