Computers & Industrial Engineering, Vol. 31, No. 3/4, pp. 855-859, 1996. Published by Elsevier Science Ltd.
Optimal Design of a Star-LAN Using Neural Networks

Mitsuo GEN, Yasuhiro TSUJIMURA, Syunsuke ISHIZAKI
Department of Industrial and Systems Engineering
Ashikaga Institute of Technology, 268 Ohomae-cho, Ashikaga 326, Japan
E-mail: {gen, tujimr, g94501}@genlab.ashitech.ac.jp

Abstract: Optimal design of a Star-LAN includes a few important and difficult sub-problems, one of which is the optimal HUB allocation problem. Neural networks based on the Boltzmann machine are suitable for solving such problems. In this paper, we apply Boltzmann Machine Neural Networks (BMNN) to the optimal HUB allocation problem on a Star-LAN computer network. We also present numerical experiments to demonstrate the performance of the approach.

Keywords: Boltzmann Machine Neural Networks, Star-LAN, Optimal HUB Allocation Problem, Simulated Annealing
1 Introduction
We assume that a communication subnet is available, and we want to design a Star-LAN that connects a collection of terminals with known demands to the subnet. This problem is often addressed in the context of a hierarchical strategy, whereby groups of terminals are connected, perhaps through Local Area Networks (LANs), to various types of HUB, which are in turn connected to higher levels of HUB, and so on. It is difficult to recommend a global design strategy without knowledge of the given practical situation. However, there are a few well-defined sub-problems, such as the HUB allocation problem. The HUB allocation problem is generally formulated as a 0-1 Integer Programming (0-1 IP) problem. However, solving it with conventional techniques takes a long time when the problem is very large, as in real-world cases. Therefore, it is necessary to develop an efficient method for solving such large-scale problems.

On the other hand, the field of neural networks is a very broad one, drawing researchers from many fields such as computer science, engineering, physics, neurology and biology, who often have quite different perceptions of what neural networks are. In particular, neural networks are among the most powerful tools for solving nonlinear optimization problems. In this paper, we propose an efficient method for the optimal allocation of HUBs using the BMNN. Neural networks based on the Boltzmann machine are especially suitable for solving combinatorial optimization problems. The Boltzmann machine is a kind of stochastic feedback neural network consisting of binary neurons which appear probabilistically in one of two states, ON or OFF. The algorithm used by the Boltzmann machine to locate energy function minima is a simulated annealing approach. Simulated Annealing (SA) is a stochastic strategy for searching for the state of the neurons corresponding to the global minimum of the energy function. Because the optimal HUB allocation problem on a complex, real-size Star-LAN is hard to solve, the BMNN is very efficient for obtaining the optimal design of the network. We also present numerical experiments to demonstrate the performance of the method.
2 Optimal HUB Allocation Problem
The optimal HUB allocation problem is to locate an adequate number of HUBs at n potential HUB sites, and to connect m computers to the HUBs, which are themselves connected to a communication subnet, as shown in Fig. 1. The objective of this problem is to minimize the total cost of locating HUBs and connecting computers to the located HUBs, while deciding a number of HUBs adequate to connect all of the computers.
Fig. 1 Illustration of the optimal HUB allocation problem (computers, potential HUB sites, and HUBs connected to a communication subnet)
The optimal HUB allocation problem is formulated as follows:

$$\min\; C(x, y) = \sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij} x_{ij} + \sum_{j=1}^{n} b_j y_j \qquad (1)$$

$$\text{s.t.}\quad \sum_{j=1}^{n} x_{ij} = 1, \quad i = 1, 2, \ldots, m \qquad (2)$$

$$\sum_{i=1}^{m} x_{ij} \le K_j y_j, \quad j = 1, 2, \ldots, n \qquad (3)$$

$$\sum_{j=1}^{n} y_j \le n \qquad (4)$$

$$x_{ij} = 0 \text{ or } 1,\; y_j = 0 \text{ or } 1, \quad i = 1, 2, \ldots, m,\; j = 1, 2, \ldots, n \qquad (5)$$
where C(x, y) is the total cost of locating HUBs and connecting computers to the HUBs, c_{ij} is the cost of connecting computer i to HUB j, b_j is the cost of locating HUB j, K_j is the maximum number of computers that can be handled by HUB j, and the decision variables are

$$x_{ij} = \begin{cases} 1 & \text{if computer } i \text{ is connected to HUB } j \\ 0 & \text{otherwise} \end{cases} \qquad y_j = \begin{cases} 1 & \text{if a HUB is located at site } j \\ 0 & \text{otherwise} \end{cases}$$
Constraint (2) ensures that each computer is connected to exactly one HUB, constraint (3) ensures that computers are connected only to located HUBs and that HUB j handles at most K_j computers, and constraint (5) is the 0-1 integer constraint. This problem has been treated extensively in the operations research literature, where it is known as the warehouse location problem. It is considered to be a difficult combinatorial problem that can usually be solved only approximately.
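To make the model concrete, the following is a minimal sketch that enumerates all 0-1 assignments of eqs. (1)-(5) for a tiny instance; the cost data are invented for illustration and are not taken from the paper.

```python
# A minimal sketch of the 0-1 model of eqs. (1)-(5), solved by brute-force
# enumeration on a tiny instance; the cost data below are invented for
# illustration and are not taken from the paper.
import itertools

m, n = 4, 2                            # m computers, n potential HUB sites
c = [[3, 5], [4, 2], [6, 3], [2, 4]]   # c[i][j]: cost of connecting computer i to HUB j
b = [10, 12]                           # b[j]: cost of locating HUB j
K = [3, 3]                             # K[j]: capacity of HUB j

best = (float("inf"), None, None)
for y in itertools.product((0, 1), repeat=n):             # y_j: HUB located at site j?
    for assign in itertools.product(range(n), repeat=m):  # assign[i]: HUB of computer i
        # constraint (2) holds by construction: each computer picks exactly one HUB
        if any(y[j] == 0 for j in assign):
            continue                   # computers may only use located HUBs
        if any(assign.count(j) > K[j] for j in range(n)):
            continue                   # capacity constraint (3)
        cost = sum(c[i][assign[i]] for i in range(m)) + sum(b[j] * y[j] for j in range(n))
        if cost < best[0]:
            best = (cost, y, assign)

print("minimum cost:", best[0], "located HUBs y:", best[1], "assignment:", best[2])
```

Exhaustive search grows as 2^n n^m and is hopeless at real-world sizes, which motivates the stochastic neural approach described next.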
3 Boltzmann Machine and Simulated Annealing
A large class of combinatorial optimization problems can be solved by neural networks with k neurons described by the following computational energy function:

$$E(z) = -\frac{1}{2}\sum_{i=1}^{k}\sum_{j=1}^{k} w_{ij} z_i z_j - \sum_{i=1}^{k} z_i \theta_i \qquad (6)$$
where the vector z = [z_1, z_2, ..., z_k]^T represents the state of the neural network, the matrix W = [w_{ij}] is a symmetric matrix which represents the synaptic weights between the neurons, and the vector Θ = [θ_1, θ_2, ..., θ_k]^T contains the input (external) bias signals. The energy function E(z) may contain many local minima, so it may be very difficult to find a good solution using the Hopfield neural network, whose updating rule is only guaranteed to converge to the nearest local minimum. In order to escape from bad local minima one may be forced to use more sophisticated optimization strategies than gradient descent. There exist different stochastic procedures for performing the hill-climbing necessary to avoid getting stuck in local minima. A frequently exercised and promising approach is the use of the Boltzmann machine [5]. The Boltzmann machine is a kind of stochastic feedback neural network [4][5]. In fact, the Boltzmann machine is an energy minimization network consisting of statistical neurons which appear probabilistically in one of two states, ON or OFF (e.g. +1, -1). The algorithm used by the Boltzmann machine to locate energy function minima is an SA approach
[4][6]. SA is a stochastic strategy for searching for the state of the neurons corresponding to the global minimum of the energy function (6). This strategy has an analogy to the physical behavior of annealing a molten solid (e.g. a metal) [6]. The Boltzmann machine introduces artificial thermal noise which is gradually decreased in time. This noise allows occasional hill-climbing interspersed with descents. Strictly speaking, the fluctuations of the energy function E(z) are assumed to follow a Boltzmann probability distribution (hence the name of the network):

$$P(z) = \frac{\exp\left(-\dfrac{E(z)}{T}\right)}{\displaystyle\sum_{z'} \exp\left(-\dfrac{E(z')}{T}\right)} \qquad (7)$$

where the sum runs over all possible configurations of the states, and T is a controlling parameter called the computational temperature.
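As a quick numerical illustration of eq. (7), the sketch below enumerates all 2^k states of a tiny network and computes their Boltzmann probabilities; the 3-neuron weights and biases are arbitrary assumptions, not values from the paper.

```python
# A small numerical check of eq. (7): enumerate all 2^k states of a tiny
# network and compute their Boltzmann probabilities at temperature T.
# The 3-neuron weights and biases here are arbitrary assumptions.
import itertools, math

W = [[0.0, 1.0, -2.0],
     [1.0, 0.0, 1.0],
     [-2.0, 1.0, 0.0]]          # symmetric synaptic weights, zero diagonal
theta = [0.5, -0.5, 0.0]        # external bias signals
T = 1.0                         # computational temperature

def energy(z):
    """Computational energy E(z) of eq. (6)."""
    k = len(z)
    quad = sum(W[i][j] * z[i] * z[j] for i in range(k) for j in range(k))
    return -0.5 * quad - sum(theta[i] * z[i] for i in range(k))

states = list(itertools.product((0, 1), repeat=3))
Z = sum(math.exp(-energy(z) / T) for z in states)   # normalizing sum in eq. (7)
for z in states:
    print(z, math.exp(-energy(z) / T) / Z)          # P(z)
```

Lowering T concentrates this distribution on the minimum-energy states, which is exactly what the annealing schedule exploits.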
In physical systems the temperature T has a physical meaning; in the Boltzmann machine the temperature is simply a parameter which controls the magnitude of the fluctuations of the energy function E(z). The idea is to apply uniform random perturbations to the output states of the neurons and then determine the resulting change ΔE in the energy. If the energy is reduced (i.e. ΔE < 0), the new configuration is accepted. However, if the energy is increased (i.e. ΔE > 0), the new configuration may also be accepted, but with a probability proportional to exp(-ΔE/T). In other words, we must select a random number N_r between zero and one using a uniform density function. If N_r < exp(-ΔE/T), then the new state is accepted; otherwise it is rejected. At a high temperature the probability of uphill moves in the energy function is large. However, at a low temperature the probability is low, i.e. as the temperature decreases, fewer uphill moves are allowed. Simulated annealing thus allows uphill moves in a controlled fashion, so there is little danger of jumping out of a local minimum and falling into a worse one. In practice the probabilistic acceptance or rejection of a state is achieved by adding to each neuron a separate "thermal" noise N_i. So the output state of any neuron can be computed as:
$$z_i = \varphi\!\left(\gamma\left(\sum_{\substack{j=1 \\ j \ne i}}^{k} w_{ij} z_j + \theta_i + N_i\right)\right) \qquad (8)$$
The gain γ should be high enough that the sigmoid activation function φ closely approximates the signum function. Each neuron should be fed by an additive zero-mean independent (uncorrelated) noise source in such a way that its state is unaffected by the noise applied to the other neurons [8]. The noise must be slowly reduced in time in order to perform a process of SA. The updating rule based on the SA algorithm can be performed as follows (a code sketch follows the list):

1. Get an initial system configuration; begin with an arbitrary state z(0).
2. Define a parameter T which represents the computational temperature; start with T at a large value.
3. Make a small random change in the state.
4. Evaluate the resulting change ΔE in the energy function E(z).
5. If the energy function is reduced (improved), retain the new state; otherwise accept the transition to the new state with probability P = exp(-ΔE/T). For this purpose select a random number N_r from a uniform distribution between zero and one. If P(ΔE) is greater than N_r, retain the new state; otherwise return to the previous state.
6. Repeat steps 3 through 5 until the system reaches an equilibrium, i.e. until the number of accepted transitions becomes insignificant.
7. Update the temperature T according to an annealing schedule and repeat steps 3 through 6. In practice the decrease factor of the temperature between two steps is chosen between 0.85 and 0.96 (typically T_new = 0.93 T_old). The algorithm stops when the temperature is small enough to consider the system to have reached a state near the ground state (absolute minimum).
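The sketch below is a minimal rendering of steps 1-7, written generically for any energy function over a binary state vector; the parameter values are illustrative assumptions, not the paper's settings.

```python
# A minimal sketch of the SA updating rule (steps 1-7 above), written for
# an arbitrary energy function over a binary state vector; parameter
# values are illustrative, not the paper's.
import math, random

def simulated_annealing(E, z, T=10.0, T_end=0.01, alpha=0.93, sweeps=100):
    """Anneal the 0/1 state list z in place under the energy function E."""
    e = E(z)                                  # step 1: arbitrary initial state z(0)
    while T > T_end:                          # step 2: start with T at a large value
        for _ in range(sweeps):               # steps 3-6 at a fixed temperature
            i = random.randrange(len(z))
            z[i] ^= 1                         # step 3: small random change
            e_new = E(z)                      # step 4: evaluate the change in E
            dE = e_new - e
            if dE <= 0 or random.random() < math.exp(-dE / T):
                e = e_new                     # step 5: accept the transition
            else:
                z[i] ^= 1                     # step 5: reject, restore the old state
        T *= alpha                            # step 7: e.g. T_new = 0.93 * T_old
    return z, e
```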
4 Algorithm Based on BMNN
We introduce the algorithm implementing the SA updating rule. Before presenting the algorithm, we transform the optimal HUB allocation problem given by eqs. (1)-(5), using the exterior penalty function method, into the following unconstrained single-objective quadratic penalty function to be minimized:
$$C'(x, y, r) = \sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij} x_{ij} + \sum_{j=1}^{n} b_j y_j + r\left\{ \sum_{i=1}^{m} \big( h_i(x) \big)^2 + \sum_{j=1}^{n} \big( [\,g_j(x, y)\,]_-\big)^2 + \big( [\,g(y)\,]_-\big)^2 \right\} \qquad (9)$$

with initial conditions x_{ij}(0) = x_{ij}^{(0)}, y_j(0) = y_j^{(0)}, i = 1, 2, ..., m, j = 1, 2, ..., n, where h_i(x), g_j(x, y) and g(y) denote the constraint functions of eqs. (2), (3) and (4) respectively, and [·]_- = min(0, ·).
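A minimal sketch of eq. (9) follows, assuming h_i, g_j and g are the constraint functions of eqs. (2)-(4) and that the penalty weight r is a fixed assumed control parameter.

```python
# A sketch of the exterior penalty function C'(x, y, r) of eq. (9);
# h_i, g_j, g are assumed to be the constraint functions of eqs. (2)-(4),
# and [.]_- denotes min(0, .). The value of r is an assumption.
def penalized_cost(x, y, c, b, K, r=50.0):
    m, n = len(x), len(y)
    cost = sum(c[i][j] * x[i][j] for i in range(m) for j in range(n)) \
         + sum(b[j] * y[j] for j in range(n))
    # equality constraint (2): h_i(x) = sum_j x_ij - 1, squared penalty
    pen = sum((sum(x[i]) - 1) ** 2 for i in range(m))
    # capacity constraint (3): g_j(x, y) = K_j*y_j - sum_i x_ij, penalized when negative
    pen += sum(min(0, K[j] * y[j] - sum(x[i][j] for i in range(m))) ** 2
               for j in range(n))
    # constraint (4): g(y) = n - sum_j y_j, always satisfied for binary y_j
    pen += min(0, n - sum(y)) ** 2
    return cost + r * pen
```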
The constraints of the original problem are thus added as penalty terms. By replacing w_{ij} and z_i of eq. (6) with c_{ij}, b_j and x_{ij}, y_j of eq. (9) respectively, we can solve the optimal HUB allocation problem by the BMNN. The algorithm based on the BMNN is as follows:

Step 1: Set the initial temperature Ts and the terminal temperature Te of the temperature parameter T, and input the control coefficients ε, so, ca, mgnt and the number of learning iterations I_max. Set the initial weights w_{ij}^{(0)} = c_{ij}, w_j^{(0)} = b_j and l = 0.

Step 2: Transform the original optimal HUB allocation problem given by eqs. (1)-(5) into the energy function E(x, y, r) by using the exterior penalty function method.

Step 3: Iterate Steps 3.2 to 3.6 until the temperature parameter T reaches the stable state.

Step 3.1: Let x_{ij}^{(0)}, y_j^{(0)}, the time t, and δw_{ij}^{(0)}, δw_j^{(0)} be zero, and set T = Ts, k = 0.

Step 3.2: Generate an integer random number ir either from [1, m] or from [1, n]: if the decision variable is x_{ij}, the range [1, m] is used for ir; if the decision variable is y_j, the range [1, n] is used. Calculate u, the net input to the selected neuron under the energy function:

$$u = \sum_{\substack{j=1 \\ j \ne ir}}^{k} w_{ir,j}^{(k)} z_j^{(k)} + \theta_{ir}$$

Step 3.3: Generate a random number rn from [0, 1], and decide the values of x_{ir,j}^{(k+1)} (or y_{ir}^{(k+1)}) by the stochastic Boltzmann rule: the selected neuron is switched ON or OFF according to how rn compares with the acceptance probability determined by u, T and the control coefficients so, ca and mgnt.

Step 3.4: Calculate the cost function C(x, y), and update δw_{ij}^{(k+1)} and δw_j^{(k+1)} as follows:

$$\delta w_{ij}^{(k+1)} = \begin{cases} \delta w_{ij}^{(k)} + 1 & \text{if } x_{ij}^{(k+1)} \ne 0 \\ \delta w_{ij}^{(k)} & \text{otherwise} \end{cases} \qquad \delta w_j^{(k+1)} = \begin{cases} \delta w_j^{(k)} + 1 & \text{if } y_j^{(k+1)} \ne 0 \\ \delta w_j^{(k)} & \text{otherwise} \end{cases}$$

Step 3.5: Update the temperature parameter as:

$$T = \frac{T}{1 + t}$$

Step 3.6: If T > Te, set t = t + 1, k = k + 1, and return to Step 3.2. Otherwise, go to the next step.

Step 4: Update the weights as:

$$w_{ij}^{(l+1)} = w_{ij}^{(l)} + \varepsilon\, \delta w_{ij}^{(l)}, \qquad w_j^{(l+1)} = w_j^{(l)} + \varepsilon\, \delta w_j^{(l)}$$

Step 5: If l = I_max, output the values of the decision variables and the cost function C(x, y), and terminate. Else, set l = l + 1, and return to Step 3.
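The following sketch combines the eq. (9) penalty with the SA rule of Section 3 to mimic the outer structure of the algorithm above. It reuses penalized_cost and simulated_annealing from the earlier sketches, and it deliberately omits the weight-learning loop and the control coefficients so, ca and mgnt, whose exact roles are specific to the paper's implementation; these simplifications are assumptions.

```python
# A sketch of BMNN-style HUB allocation: the x_ij and y_j neurons are
# flattened into one binary state vector and annealed under the penalty
# energy of eq. (9). Reuses penalized_cost and simulated_annealing from
# the earlier sketches; this is an approximation of the outer structure,
# not the paper's exact Step 1-5 procedure.
def solve_hub(c, b, K, r=50.0):
    m, n = len(c), len(b)

    def E(s):
        # s[:m*n] holds x_ij row by row, s[m*n:] holds y_j
        x = [s[i * n:(i + 1) * n] for i in range(m)]
        y = s[m * n:]
        return penalized_cost(x, y, c, b, K, r)

    s0 = [0] * (m * n + n)            # start with every neuron OFF
    s, e = simulated_annealing(E, s0)
    x = [s[i * n:(i + 1) * n] for i in range(m)]
    return x, s[m * n:], e            # assignment x, HUB locations y, final energy
```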
5 Numerical Example

As a numerical example, we solve the following optimal design problem with 3 HUB sites and 8 computers, shown in Fig. 2, by using the proposed method.
Fig. 2 Optimal design problem with 3 HUB sites and 8 computers
The costs of locating HUBs and the costs of connecting computers to HUBs are given in Table 1. The maximum number of computers that can be handled by HUB j is K_j = 4 (j = 1, 2, 3).

Table 1. HUB allocation costs and computer connection costs
This problem is formulated as follows:

$$\min\; C(x, y) = \sum_{i=1}^{8}\sum_{j=1}^{3} c_{ij} x_{ij} + \sum_{j=1}^{3} b_j y_j$$

$$\text{s.t.}\quad \sum_{j=1}^{3} x_{ij} = 1, \quad i = 1, 2, \ldots, 8$$

$$\sum_{i=1}^{8} x_{ij} \le 4 y_j, \quad j = 1, 2, 3$$

$$\sum_{j=1}^{3} y_j \le 3$$

$$x_{ij} = 0 \text{ or } 1,\; y_j = 0 \text{ or } 1, \quad i = 1, 2, \ldots, 8,\; j = 1, 2, 3$$
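For illustration, an instance of this form can be fed to the solve_hub sketch from Section 4. Since the Table 1 entries are not reproduced here, the cost values below are invented placeholders, not the paper's data.

```python
# Hypothetical data for the 8-computer / 3-HUB-site instance; the true
# Table 1 costs are not reproduced here, so these values are invented.
import random

c = [[random.randint(5, 30) for _ in range(3)] for _ in range(8)]  # c_ij
b = [100, 100, 100]                                                # b_j
K = [4, 4, 4]                                                      # K_j = 4

x, y, cost = solve_hub(c, b, K)      # sketch from Section 4
print("located HUBs y:", y)
print("penalized cost:", cost)
```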
[Solutions] The results of the simulation are summarized in Table 2. In this table, solutions obtained by a typical Linear Programming (LP) method are also included, to confirm that our solutions are correct.

Table 2. Resulting solutions (the 0-1 values of x_ij and y_j obtained by the BMNN and by the LP method)
6 Conclusion
In this paper, we applied Boltzmann Machine Neural Networks to optimal HUB allocation for designing a Star-LAN, and we presented a numerical experiment to demonstrate good performance in solving the problem. In an illustrative example, the optimal solution was found by the Boltzmann machine neural network used as a combinatorial optimization technique. Finally, a part of this research was supported by University-to-University Cooperative Research under the International Scientific Research Program (No. 07045032), a Grant-in-Aid for Scientific Research from the Japanese Government.
References

[1] Bertsekas, D. and R. Gallager: Data Networks, 2nd ed., Prentice-Hall, pp. 448-451, 1992.
[2] Sagara, N.: Introduction to Mathematical Programming, Morikita, pp. 65-68, 1976. (in Japanese)
[3] Cichocki, A. and R. Unbehauen: Neural Networks for Optimization and Signal Processing, Wiley, 526 pp., 1993.
[4] Aarts, E. and J. Korst: Simulated Annealing and Boltzmann Machines, Wiley, 272 pp., 1990.
[5] Hinton, G. E., T. J. Sejnowski and D. H. Ackley: Boltzmann Machines: Constraint Satisfaction Networks that Learn, Tech. Rep. CMU-CS-84-119, Dept. of Computer Science, Carnegie-Mellon University, 1984.
[6] Kirkpatrick, S., C. D. Gelatt and M. P. Vecchi: Optimization by simulated annealing, Science, Vol. 220, pp. 671-680, 1983.
[7] Peterson, C. and J. R. Anderson: A Mean Field Theory Learning Algorithm for Neural Networks, Complex Systems, Vol. 1, pp. 995-1019, 1987.
[8] Alspector, J., J. W. Gannett, S. Haber, M. B. Parker and R. Chu: A VLSI-Efficient Technique for Generating Multiple Uncorrelated Noise Sources and Its Application to Stochastic Neural Networks, IEEE Trans. Circuits and Systems, Vol. 38, No. 1, pp. 109-123, 1991.
[9] Nakano, K.: Introduction and Practice of Neuro Computers, Gijyutsuhyouronsya, 318 pp., 1989. (in Japanese)
[10] Nakano, K.: Basics of Neural Computing, Corona-sha, 248 pp., 1990. (in Japanese)