Neural Networks, Vol. 9, No. 6, pp. 979-989, 1996. Copyright © 1996 Elsevier Science Ltd. All rights reserved.
CONTRIBUTED ARTICLE
A New Family of Multivalued Networks

MAHMUT H. ERDEM† AND YUSUF OZTURK

Ege University
(Received 9 December 1994; accepted 6 November 1995)
Abstract: This paper introduces a new family of multivalued neural networks. We have interpreted the Hopfield network as encoding the same/different information of elements of binary patterns in the connections, and developed a scheme which encodes bigger/smaller information of multivalued patterns in the connections with the aid of the signum function. The model can be constructed as an autoassociative memory (multivalued counterpart of Hopfield) or a heteroassociative memory (multivalued counterpart of BAM). We have used Lyapunov stability analysis in showing the stability of the networks. In simulations, the energy surface topography of the model is compared to that of Hopfield. Also, the asymptotic stability and basins of attraction of the stored patterns are examined. The proposed model can also be used in solving optimization problems. Mapping a problem on to the network is relatively easy compared to the Hopfield model because of the multivalued representation. Very good results are obtained in traveling salesperson problem simulations. Copyright © 1996 Elsevier Science Ltd.

Keywords: Associative memory, Parallel processing, Content addressable memory.

† This article is dedicated to the memory of Mahmut Hilmi Erdem. Requests for reprints should be sent to: Yusuf Ozturk, Dept. of Computer Engineering, Ege University, 35100 Izmir, Turkey; Tel: +00 232 388 72 21; Fax: +00 232 388 72 30; E-mail: ozturkyf@staff.ege.edu.tr.
1. INTRODUCTION
It has long been understood that knowledge is essential to intelligence. Data is immense and can be acquired easily. However, extracting or forming knowledge from raw data is a different matter, and it is very difficult. Learning is the process by which we acquire new knowledge, and memory is the process by which we retain that knowledge over time. Learning is a major vehicle for behavioral adaptation. Learning often involves association of simultaneous stimuli and permits storage of information about a single event that happens in a particular time and place. For example, we recognize the face of an acquaintance in a specific context: the stimuli of the face and the setting act simultaneously to help us recognize the person. Associative memory is not only a memory system in the sense of a conventional computer but also a system that implies higher level brain functions such as learning and perception. Storing these associations in the brain involves changes in the strength of synaptic connections. Cellular learning rules suggest that the associative mechanisms need not require complex neural networks (Kandel & Hawkins, 1992).

Although binary (or bipolar) representations are extensively used in engineering applications for their simplicity, real world quantities are analog. Representing analog values with two-state elements is a fundamental problem (Lippman, 1987). A multivalued representation is a more relevant and direct approximation to real world data. In this paper we propose a new multivalued, recurrent, nonlinear associative memory structure inspired by Hopfield's binary associative memory. Originally it was developed as an autoassociative memory (MAREN) (Erdem & Ozturk, 1994a). The objective of MAREN is to store multivalued patterns as asymptotically stable states. Hence, when the network is started at a state close to a stored pattern, it is expected that the system evolves towards the stored pattern by updating its neurons until the neurons do not change anymore.

In general, associative memories are heteroassociative (Kosko, 1988). We have developed a heteroassociative version of MAREN: the multivalued multidirectional (associative) memory (M3) (Erdem & Ozturk, 1994b). In M3, the aim is to store associated multivalued patterns in a neural network and recall them later. In recall, the input can be any
combination of noisy or incomplete patterns. A vector n-tuple (A, B, C, ...) defines a state of the network. In proving the stability, we used Lyapunov stability analysis, identifying a Lyapunov or energy function E with each state. In M3 the energy function has local maxima at the states corresponding to stored n-tuples (in an n-directional memory) and lower values elsewhere. In the recall phase, neurons update their states according to a rule that monotonically increases the energy function. Hence, the network finally reaches and stops at a local maximum corresponding to a stored pattern. By storing the patterns as asymptotically stable states with a basin of attraction around them, they can be reached from their noisy versions.

Much like the Hopfield model, MAREN can also be used as an optimization network (Erdem & Ozturk, 1994c) by defining an energy function which translates the problem to a different domain, enabling the network to solve it. It is interesting that, although a discrete Hopfield network cannot solve optimization problems, MAREN, which is also a discrete network, can. There are two major advantages: it is easy to map a problem on to the network, and it is easy to implement on a digital system.

After a brief discussion of the Hopfield network in Section 2, Section 3 introduces the MAREN concept. Section 4 shows how the model can be used in a multidirectional memory system. We suggest using MAREN in optimization problems in Section 5 by giving two example problems: the traveling salesperson problem and the graph K-partitioning problem. Each section concludes with simulation results. Comments and comparisons with other networks are presented in Section 6.

2. HOPFIELD MODEL

In 1982 Hopfield introduced a powerful neural network model for binary associative or content addressable memory (Hopfield, 1982). Later, Hopfield and Tank (1985) demonstrated that the model (actually the analog Hopfield model) is capable of solving NP-complete or simpler optimization problems (Tank & Hopfield, 1986). Although its basic elements (neurons) are very simple, a system composed of many of them is very hard to analyze. In the analysis of his model, Hopfield used an approach called Lyapunov stability analysis. In this method an energy function is defined for the system under consideration, and that function is shown to decrease or stay at the same level; hence, the system will settle at some minimal energy state. The most attractive property of the method is that the equations of motion need not be known or solved. On the other hand, deriving an energy function for a system is hard and requires trial and error (Ogata, 1970).

To use the Lyapunov method in CAM applications, there are two requirements to be satisfied. First, the topography of the energy function must have local minima only at the points where stored patterns lie, and not elsewhere. Second, an updating rule must be found which monotonically decreases the system's energy, so that eventually the system settles down to a local minimum. In Hopfield's model, processing elements (neurons) asynchronously and randomly update their states according to

$$V_i = \begin{cases} 1 & \text{if } \sum_j T_{ij} V_j \geq 0 \\ 0 & \text{otherwise.} \end{cases} \tag{1}$$
Here V_i is the state of neuron i and T_ij is the connection between neurons i and j. Since all neurons are connected to all other neurons, there is a very strong back coupling. Hopfield claims all interesting results arise as consequences of this feedback. The learning rule of the model is

$$T_{ij} = \sum_{s} (2V_i^s - 1)(2V_j^s - 1), \tag{2}$$

where T_ji = T_ij. This equation says: if the two neurons are the same, increase the connection strength between them; if they are different, decrease it. Thus, by this learning rule the information of "same" and "different" between all pairs of a binary pattern is encoded. The energy function of the Hopfield model is defined as

$$E = -\frac{1}{2} \sum_{i} \sum_{j \neq i} T_{ij} V_i V_j, \tag{3}$$

which can be rewritten as

$$E = -\frac{1}{2} \sum_{i} \sum_{j \neq i} \sum_{s} (2V_i^s - 1)(2V_j^s - 1) V_i V_j. \tag{4}$$
As stated before, since the topography of the energy function must have local minima at the points where stored patterns lie but not elsewhere, E must have low values for stored patterns and higher values for random patterns. Let us analyze eqn (4) considering eqn (2):
$$\begin{aligned} E/2 = {} & -(2V_1^1-1)(2V_2^1-1)V_1V_2 - (2V_1^2-1)(2V_2^2-1)V_1V_2 - \cdots - (2V_1^p-1)(2V_2^p-1)V_1V_2 \\ & - (2V_1^1-1)(2V_3^1-1)V_1V_3 - (2V_1^2-1)(2V_3^2-1)V_1V_3 - \cdots - (2V_1^p-1)(2V_3^p-1)V_1V_3 - \cdots \\ & - (2V_{N-1}^1-1)(2V_N^1-1)V_{N-1}V_N - (2V_{N-1}^2-1)(2V_N^2-1)V_{N-1}V_N - \cdots - (2V_{N-1}^p-1)(2V_N^p-1)V_{N-1}V_N. \end{aligned} \tag{5}$$
If the pattern is a stored one, say S¹ (V_i = V_i^1), one quarter of the s = 1 terms will each be 1, and the mean of the other terms will be 0. Thus the energy of a stored pattern will be

$$E_{\text{stored}} \approx -(N^2 - N)/8. \tag{6}$$

If the pattern is a random one, the mean of all terms in eqn (5) will be 0. The greater the similarity of a pattern to a stored one, the closer its energy is to -(N² - N)/8.

Let us see why eqn (1) monotonically decreases the energy function. In eqn (1), the current state of the updating neuron is ignored. Table 1 shows the possible combinations of the state of neuron i and Σ_j T_ij V_j, and the effect of the update on the energy. Since updating neuron i changes only the terms including V_i,

$$\Delta E = \Delta E_i = E(V_i(t+1)) - E(V_i(t)) = -\tfrac{1}{2}\,(V_i(t+1) - V_i(t)) \sum_j T_{ij} V_j = -\tfrac{1}{2}\,\Delta V_i \sum_j T_{ij} V_j. \tag{7}$$

TABLE 1. Neuron updates in Hopfield

  V_i(t)   Σ_j T_ij V_j   V_i(t+1)   ΔV_i   ΔE
  0        ≥ 0            1          1      < 0
  0        < 0            0          0      0
  1        ≥ 0            1          0      0
  1        < 0            0          -1     < 0

From Table 1 it is obvious that eqn (1) satisfies ΔE_i ≤ 0 in all situations. The capacity of the Hopfield network is about 0.15N, where N is the number of neurons. Detailed information about the capacity and other issues can be found in Hopfield (1982) and several other papers.

Hopfield and Tank (1985, 1986) explain how such a network can be used to solve optimization problems. The problems to be solved must be formulated in terms of desired optima, often subject to constraints. The basic property of the network described above is the minimization of E. Through the construction of an appropriate energy function for the system and a strategy for interpreting the states of the neurons as a solution, an optimization problem may be "mapped" on to the network (Hopfield & Tank, 1985). These problems can be as hard as TSP or as simple as A/D converters.
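To make eqns (1)-(3) concrete, here is a minimal simulation sketch (ours, not from the paper; the function names and toy patterns are illustrative) of binary storage and asynchronous recall:

```python
import numpy as np

def hopfield_store(patterns):
    # Eqn (2): T_ij = sum_s (2 V_i^s - 1)(2 V_j^s - 1), with T_ii = 0.
    X = 2 * np.asarray(patterns) - 1            # map {0, 1} -> {-1, +1}
    T = X.T @ X
    np.fill_diagonal(T, 0)
    return T

def hopfield_energy(T, v):
    # Eqn (3): E = -1/2 sum_{i != j} T_ij V_i V_j.
    return -0.5 * v @ T @ v

def hopfield_recall(T, v, sweeps=20, rng=None):
    # Eqn (1): pick neurons at random; V_i <- 1 if sum_j T_ij V_j >= 0, else 0.
    if rng is None:
        rng = np.random.default_rng(0)
    v = v.copy()
    for _ in range(sweeps * len(v)):
        i = rng.integers(len(v))
        v[i] = 1 if T[i] @ v >= 0 else 0
    return v

patterns = np.array([[1, 0, 1, 0, 1, 1, 0, 0],
                     [0, 1, 1, 1, 0, 0, 1, 0]])
T = hopfield_store(patterns)
noisy = patterns[0].copy()
noisy[0] ^= 1                                   # flip one bit
print(hopfield_recall(T, noisy))                # usually recovers patterns[0]
print(hopfield_energy(T, patterns[0]))          # low energy at a stored pattern
```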
3. MAREN MODEL

The architecture of the proposed network is identical to that of Hopfield: every neuron is connected to all others with connection strengths T_ij. Suppose we wish to store the set of states V^s (s = 1...p) whose elements V_i^s (i = 1...N) can take one of M discrete levels. The storage algorithm of the proposed model is defined as

$$T_{ij} = \sum_{s=1}^{p} \operatorname{sgn}(V_i^s - V_j^s). \tag{8}$$

The signum function is defined as sgn(x) = 1, -1, 0 if x > 0, x < 0 and x = 0 respectively. This definition of the signum function is valid throughout this paper. The storage prescription in eqn (8) increases the connection strength between neurons i and j if V_i > V_j and decreases it if V_j > V_i. If V_i = V_j the connection strength is left unchanged. Thus, we encode the "bigger" and "smaller" information between all pairs of a multivalued pattern. When learning is completed, T_ij > 0 indicates that there are more patterns in which V_i > V_j than patterns with V_j > V_i, and vice versa. The energy function of the model is defined as

$$E = \sum_{i} \sum_{j \neq i} T_{ij} \operatorname{sgn}(V_i - V_j), \tag{9}$$

which can be extended as
$$\begin{aligned} E/2 = {} & \operatorname{sgn}(V_1^1 - V_2^1)\operatorname{sgn}(V_1 - V_2) + \operatorname{sgn}(V_1^2 - V_2^2)\operatorname{sgn}(V_1 - V_2) + \cdots + \operatorname{sgn}(V_1^p - V_2^p)\operatorname{sgn}(V_1 - V_2) \\ & + \operatorname{sgn}(V_1^1 - V_3^1)\operatorname{sgn}(V_1 - V_3) + \operatorname{sgn}(V_1^2 - V_3^2)\operatorname{sgn}(V_1 - V_3) + \cdots + \operatorname{sgn}(V_1^p - V_3^p)\operatorname{sgn}(V_1 - V_3) + \cdots \\ & + \operatorname{sgn}(V_{N-1}^1 - V_N^1)\operatorname{sgn}(V_{N-1} - V_N) + \operatorname{sgn}(V_{N-1}^2 - V_N^2)\operatorname{sgn}(V_{N-1} - V_N) + \cdots + \operatorname{sgn}(V_{N-1}^p - V_N^p)\operatorname{sgn}(V_{N-1} - V_N). \end{aligned} \tag{10}$$
It can be seen that if the input pattern is equal to one of the stored patterns, say S¹, the s = 1 terms will each be 1, except the terms in which V_i = V_j. The sum of all the other terms will probabilistically be near 0. Then the total energy will be

$$E_{\text{stored}} \approx N^2 - N. \tag{11}$$

If the pattern is a random one, the sum of all terms will be near 0.

From the above analysis it can be seen that the model forms an energy surface with local maxima near the stored patterns. On the other hand, in eqn (9) we only consider whether a neuron is greater or smaller than another neuron. Hence, two patterns having the same ordering among their components will have identical energies. Consider the following two patterns: P1 = (2, 3, 1, 4) and P2 = (1, 5, 0, 7); E(P1) = E(P2). Thus, if P1 is a stable pattern, P2 is also stable. However, if the number of discrete levels a neuron takes, M, is smaller than the number of neurons N, the probability of finding two patterns with the same ordering will be very small, and if such patterns exist they will be similar. Another case is patterns with many equal components (V_i = V_j = V_k = ...); as can be seen from eqn (10), the energies of these patterns will be small, and they cannot be stored as asymptotically stable points. These are the two constraints of the proposed model. Remember that similar constraints exist in the Hopfield model: patterns with many zeros cannot be stored, and because of the symmetry, if (1011...) is a stable pattern, so is (0100...). To partially overcome these constraints and also improve the reliability of the proposed model, another term can be added to the energy function:

$$E = \sum_{i} \sum_{j \neq i} T_{ij} \operatorname{sgn}(V_i - V_j) + \sum_{i} \sum_{j=1}^{M} T'_{ij} \operatorname{sgn}(V_i - V_j^f), \tag{13}$$

where V_j^f (j = 1...M) are fixed neurons which are assigned fixed values corresponding to each of the M discrete levels. There is no need for connections among the fixed neurons themselves (T^f_ij = 0). Using these neurons, the number of neurons in the system is increased by M. We have tested the contribution of fixed neurons by simulations. Although they increased the performance slightly, we have preferred to omit this modification in order to keep the system robust and compact.
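The ordering-equivalence property described above can be checked numerically. The following sketch (ours; the names are illustrative) stores P1 with rule (8) and evaluates energy (9) for P1 = (2, 3, 1, 4) and P2 = (1, 5, 0, 7), which share the same component ordering and therefore receive identical energies:

```python
import numpy as np

def maren_store(patterns):
    # Eqn (8): T_ij = sum_s sgn(V_i^s - V_j^s).
    return sum(np.sign(np.subtract.outer(p, p)) for p in np.asarray(patterns))

def maren_energy(T, v):
    # Eqn (9): E = sum_{i != j} T_ij sgn(V_i - V_j)  (diagonal terms are zero).
    return int(np.sum(T * np.sign(np.subtract.outer(v, v))))

T = maren_store([[2, 3, 1, 4]])
print(maren_energy(T, np.array([2, 3, 1, 4])))  # N^2 - N = 12, cf. eqn (11)
print(maren_energy(T, np.array([1, 5, 0, 7])))  # same ordering -> also 12
```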
3.1. Updating Rule

The neurons must be updated in a way that never decreases the energy. Let d be the difference between two consecutive discrete values, and let us define the energy changes when we update neuron i as

$$\Delta E_i^+ = E(V_i + d) - E(V_i), \tag{14}$$

$$\Delta E_i^- = E(V_i - d) - E(V_i), \tag{15}$$

i.e., ΔE_i^+ and ΔE_i^- denote the difference in total E when we increment or decrement V_i, respectively. Every neuron update must satisfy ΔE_i ≥ 0. This can be achieved by simply incrementing and decrementing V_i and observing the change in total energy: if the energy increases by increasing or decreasing V_i, we increase or decrease V_i respectively; if both changes decrease the energy, no change must be made. These situations are depicted in Table 2. Consider the updating rule below:

$$V_i = \begin{cases} V_i + d & \text{if } k_i > 0 \\ V_i - d & \text{if } k_i < 0 \\ V_i & \text{if } k_i = 0, \end{cases} \qquad k_i = \operatorname{sgn}(\Delta E_i^+) - \operatorname{sgn}(\Delta E_i^-). \tag{16}$$

Equation (16) realizes the table except for the last row: when a change of V_i in either direction increases the energy, the update rule does not change the neuron's state. Hence, eqn (16) guarantees ΔE_i ≥ 0. Updating V_i changes only the terms including V_i in eqn (10), so we can rewrite eqns (14) and (15) as

$$\Delta E_i^{\pm} = \sum_j T_{ij} \left[ \operatorname{sgn}(V_i \pm d - V_j) - \operatorname{sgn}(V_i - V_j) \right]. \tag{17}$$

TABLE 2. Neuron updates in MAREN

  ΔE_i^+   ΔE_i^-   V_i(t+1)
  = 0      = 0      V_i + d or V_i
  > 0      < 0      V_i + d
  < 0      > 0      V_i - d
  < 0      < 0      V_i
  = 0      < 0      V_i + d or V_i
  < 0      = 0      V_i - d or V_i
  > 0      > 0      V_i + d

An alternative updating rule can be considered: assign a random value V_r to V_i, accept it if ΔE_i ≥ 0, and do not accept it otherwise:
$$V_i = \begin{cases} V_r & \text{if } \Delta E_i \geq 0 \\ V_i & \text{if } \Delta E_i < 0, \end{cases} \tag{18}$$

$$\Delta E_i = \sum_j T_{ij} \left[ \operatorname{sgn}(V_r - V_j) - \operatorname{sgn}(V_i - V_j) \right]. \tag{19}$$

Although this updating rule is simpler than the previous one, it requires a random number generator for every neuron. In our simulations we have used eqn (16).

TABLE 3. Energy surface topography of MAREN and Hopfield. S: stored, R: random, A: average, D: standard deviation, P: stored patterns, N: number of neurons, M: number of values. N = 30 throughout.

                 S (A)    S (D)     R (A)    R (D)
  Hopfield
    P = 3        -112     35.91      0       18.97
    P = 5        -106     43.38     -1.3     26.07
    P = 8        -116     50.69     -0.5     30.33
  MAREN, M = 5
    P = 3         830     143.4     27.2     160.58
    P = 5         851     208.6    -13.8     219.67
    P = 8         850     305.8     34.1     253.94
  MAREN, M = 20
    P = 3         856     142.01   -12.1     186.06
    P = 5         816     204.09    14.2     237.65
    P = 8         829     295.89    -4.6     313.96
  MAREN, M = 40
    P = 3         720     187.56    -3.8     205.7
    P = 5         686     231.16    -0.1     267.2
    P = 8         650     320.34    -0.8     302.4
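As a concrete illustration of the recall dynamics, the sketch below (ours, not the authors' code; the parameter values and the clipping of V_i at the extreme levels are our assumptions) stores random multivalued patterns with eqn (8) and recalls a corrupted pattern with the update rule of eqn (16):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, P, d = 30, 5, 3, 1                        # neurons, levels, patterns, step

patterns = rng.integers(0, M, size=(P, N))      # components in {0, ..., M-1}
T = sum(np.sign(np.subtract.outer(p, p)) for p in patterns)   # eqn (8)

def energy(v):                                  # eqn (9)
    return np.sum(T * np.sign(np.subtract.outer(v, v)))

def update(v, i):
    # Eqn (16): k_i = sgn(dE+) - sgn(dE-); move V_i by +/- d accordingly.
    e0 = energy(v)
    up, dn = v.copy(), v.copy()
    up[i] = min(up[i] + d, M - 1)               # clipping is our assumption
    dn[i] = max(dn[i] - d, 0)
    k = np.sign(energy(up) - e0) - np.sign(energy(dn) - e0)
    return up if k > 0 else dn if k < 0 else v

v = patterns[0].copy()
v[:3] = rng.integers(0, M, size=3)              # corrupt three neurons
for _ in range(20 * N):                         # random asynchronous updates
    v = update(v, rng.integers(N))
print("L1 distance to the stored pattern:", np.abs(v - patterns[0]).sum())
```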
3.2. Simulation Results

First of all we wanted to observe the energy surface topography of MAREN and compare it with the Hopfield model. We have computed the energies of stored and random patterns for various capacities (P) and discrete levels (M). Table 3 shows the averages and standard deviations. We have used

$$\left( \sum_i (X_i - X_{\text{av}})^2 \right)^{1/2}$$

as the standard deviation measure. The averages are about -(N² - N)/8 for Hopfield and N² - N for MAREN, as expected. As the capacity increases, the standard deviations also increase. In Hopfield, the deviations of random patterns are less than those of stored patterns, whereas in MAREN the random pattern deviations are greater than those of stored patterns.

We have also investigated the asymptotic stability of stored patterns. We input stored patterns and let the network converge to a stable state. Then we computed the distance between the output and initial patterns;

$$L1(X, Y) = \sum_i |X_i - Y_i|$$

is a suitable distance measure. (L1)av is computed by taking the average of L1 over various network trials. Table 4 shows the results for various P and M. Patterns are stable with about 20% error ((L1)av/M). For M = 40 the results are better, but this is misleading: when M is increased, the updating rule given in eqn (16) can not increase the energy by …
TABLE 4. Asymptotic stability analysis: distribution (percentage of network trials) of (L1)av values, binned from 1 up to >10, for N = 30; P = 5 and P = 8; M = 5, 20, 40.
TABLE 5. Percentage of random patterns converged to near a stored pattern: distribution of (L1)av values, binned from 1 up to >12, for N = 30; P = 5 and P = 8; M = 5, 20, 40.
Also, the basins of attraction of the stored patterns were examined by starting the network from random patterns; Table 5 shows the percentage of random patterns that converged to near a stored pattern.

4. M3 MODEL

In M3, associated multivalued patterns are stored as n-tuples and recalled from partial or noisy inputs. In the bidirectional case, the pattern pairs (A1, B1), (A2, B2), ..., (Ap, Bp) are stored in a network of two layers, A and B, in which every neuron of one layer is connected to every neuron of the other layer. Analogous to eqn (8), the connection strengths are computed as

$$T_{ij} = \sum_{s=1}^{p} \operatorname{sgn}(V_{Ai}^s - V_{Bj}^s), \tag{20}$$

and, analogous to eqn (9), the energy of a state pair (A, B) is defined as

$$E(A, B) = \sum_{i=1}^{N} \sum_{j=1}^{N} T_{ij} \operatorname{sgn}(V_{Ai} - V_{Bj}).$$
Equations (16) and (17) become (for layer A)

$$V_{Ai} = \begin{cases} V_{Ai} + d & \text{if } k_{Ai} > 0 \\ V_{Ai} - d & \text{if } k_{Ai} < 0 \\ V_{Ai} & \text{if } k_{Ai} = 0, \end{cases} \qquad k_{Ai} = \operatorname{sgn}(\Delta E_{Ai}^+) - \operatorname{sgn}(\Delta E_{Ai}^-), \tag{25}$$

$$\Delta E_{Ai}^{\pm} = \sum_j T_{ij} \left[ \operatorname{sgn}(V_{Ai} \pm d - V_{Bj}) - \operatorname{sgn}(V_{Ai} - V_{Bj}) \right]. \tag{26}$$

Also, eqns (18) and (19) become

$$V_{Ai} = \begin{cases} V_r & \text{if } \Delta E_{Ai} \geq 0 \\ V_{Ai} & \text{if } \Delta E_{Ai} < 0, \end{cases} \tag{27}$$

$$\Delta E = \Delta E_{Ai} = \sum_j T_{ij} \left[ \operatorname{sgn}(V_r - V_{Bj}) - \operatorname{sgn}(V_{Ai} - V_{Bj}) \right]. \tag{28}$$

In the simulations we have used eqn (27). In the multidirectional case, the network consists of many layers. Every neuron is connected to every other neuron except the neurons in its own layer. Figure 2 depicts the network architecture of a three-directional memory. In synchronous update, every layer updates one at a time and every neuron in a layer updates simultaneously. Connection strengths and the energy function are computed similarly to eqns (8) and (9), e.g., for the layer pair (A, B),

$$T_{ij} = \sum_{s=1}^{p} \operatorname{sgn}(V_{Ai}^s - V_{Bj}^s). \tag{29}$$

MAREN can be considered as a special case of M3 in which there are N layers, each consisting of only one neuron.

FIGURE 2. Three-dimensional network architecture.
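A bidirectional recall can be sketched as follows (our illustrative code, not the authors'; the sweep count and random seed are arbitrary). It stores pattern pairs with eqn (20) and recovers a B pattern from its associated A pattern using the random-value rule of eqns (27)-(28):

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, P = 30, 5, 3                              # neurons per layer, levels, pairs

A = rng.integers(0, M, size=(P, N))             # layer-A patterns
B = rng.integers(0, M, size=(P, N))             # associated layer-B patterns
T = sum(np.sign(np.subtract.outer(a, b)) for a, b in zip(A, B))   # eqn (20)

def energy(va, vb):
    # Energy of a state pair, analogous to eqn (9).
    return np.sum(T * np.sign(np.subtract.outer(va, vb)))

def recall_B(va, sweeps=100):
    vb = rng.integers(0, M, size=N)             # start layer B at random
    e = energy(va, vb)
    for _ in range(sweeps * N):
        j = rng.integers(N)
        old = vb[j]
        vb[j] = rng.integers(M)                 # try a random value V_r
        e_new = energy(va, vb)
        if e_new >= e:                          # eqn (27): accept if dE >= 0
            e = e_new
        else:
            vb[j] = old                         # otherwise reject
    return vb

vb = recall_B(A[0])
print("L1/N recall error:", np.abs(vb - B[0]).mean())
```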
4.1. Simulation Results

We tested the pattern completion capability of M3 in the bidirectional case. In the simulations we stored P pattern pairs (Ai, Bi) in the network and gave the original pattern Ai or Bi as the input. Then we let the network produce the other pattern by updating the layer corresponding to the unknown pattern several times (synchronous update) until the energy did not increase anymore. As the error measure we used L1 (Σ_i |V_i^orig - V_i| / N), which corresponds to the average absolute error between the elements of the original pattern and the pattern produced by the network. Figure 3 depicts the results for various M (the number of discrete values each neuron can take) and P (capacity) values. L1av/M is computed by taking the average of L1 over several simulations and normalizing it by M; N, the number of neurons in each layer, was 30. Of course, the layers can have different numbers of neurons; then the capacity is determined by the minimum of these numbers. As the simulations suggest, about 0.13N patterns (comparable to BAM) can be stored in M3 with about 0.15 error (L1av/M) in recall.

FIGURE 3. Pattern completion results: L1av/M versus M for P = 3, 6, 12.

5. MAREN SOLVES OPTIMIZATION PROBLEMS

There are a wide variety of problems in engineering, science and commerce which can be cast as optimization problems. In an optimization problem there is a "cost function" which must be minimized or maximized and a set of constraints which must be satisfied. The aim is to find the variables' values that minimize (or maximize) the cost function while satisfying the constraints. Such problems can be found in robotics, perception, pattern recognition, etc. In many problems, huge numbers of variables and combinatorial complexity make it very hard to obtain an optimal solution. In addition, many of these problems are NP-complete. Examples include the traveling salesperson problem (TSP), graph k-partitioning, etc. Often, instead of the best solution, a very good (but fast) solution is sufficient in real world applications.

In a pioneering work, Hopfield and Tank (1985) demonstrated how a neural network can be used to solve an optimization problem, namely TSP. Later, they and others showed that the analog Hopfield net is capable of solving many NP-complete or simpler optimization problems (Tank & Hopfield, 1986). In an analog Hopfield network (Hopfield, 1984), differential equations govern the evolution of the neurons. Although neuronal computations are carried out in analog mode, the neurons converge to binary values as the parallel computation proceeds. Since the network is analog, it is very easy to implement electronically with conventional circuit elements; however, it is hard to simulate on a digital computer. Furthermore, it has been pointed out that the network has convergence difficulties in large scale problems, and some techniques have been developed to overcome this problem (Wilson & Pawley, 1988).

In Section 3 we derived the energy function for MAREN. Neuronal state changes can only increase this function. This is achieved by simple trial and error: if a new value does not decrease the energy function, the updating neuron takes it as its new state. Thus, the essential character of the network's evolution is to increase the energy function. If a representation scheme which decodes the final states of the neurons as a solution, and an energy function whose maximum value corresponds to the best solution of the problem, can be found, we may solve the problem with the network.
5.1. Mapping TSP on the Network

The traveling salesperson problem is defined as follows: there are N cities, where every city is connected to every other city by a path whose length is d_ij. A tour is a closed path which starts at a particular city, visits every city only once and returns to the first city. The problem is to find the shortest tour.

In an N-city TSP network there are N neurons and the discrete levels are 1, 2, ..., N. Every neuron denotes a city, and the state of this neuron represents the visiting order of that city; e.g., if V_B = 3, city B will be the third city to be visited, etc. When the network converges to a legitimate tour, the neuronal states will be like (in a 5-city TSP, say) 3 5 2 4 1. Since the sequence stands for V_A V_B V_C V_D V_E, it defines a closed tour: ECADBE.

There are two necessities in defining the energy function: a tour must be a valid tour, and the energy of a short tour must be greater than that of a long tour. To satisfy the former, valid tours must have higher energies than invalid ones. In a valid tour all neurons will have distinct values; in an invalid tour some neurons (at least two) will have identical states. So, we must "punish" the states which have some equal valued neurons. Consider the equation below:
$$E_1 = A \sum_{i} \sum_{j \neq i} \left[ \operatorname{sgn}(V_i - V_j) \right]^2. \tag{31}$$

Here A is a constant. If V_i = V_j, the squared term will be zero; otherwise it will be equal to 1. Thus, E_1 will have a maximum value if V_i ≠ V_j for all i ≠ j. In this case,

$$E_1 = A(N^2 - N). \tag{32}$$
In addition, the energy function must favor short tours rather than long tours among the valid ones. In a valid tour,

$$\operatorname{abs}(V_i - V_j) \geq 1 \quad \text{where } i \neq j. \tag{33}$$

Here, abs(x) is the absolute value function: abs(x) = x, -x, 0 if x > 0, x < 0 and x = 0 respectively. Hence,

$$\operatorname{sgn}[\operatorname{abs}(V_i - V_j) - 1] = 0 \text{ or } 1. \tag{34}$$

Equation (34) will be 0 if V_i = V_j ± 1 and 1 if V_i = V_j ± x (x > 1). Now consider eqn (35):

$$E_2 = -\tfrac{1}{2} B \sum_{i} \sum_{j \neq i} d_{ij} \left[ 1 - \operatorname{sgn}(\operatorname{abs}(V_i - V_j) - 1) \right]. \tag{35}$$
In eqn (35), B is a constant. If V_i = V_j ± 1, i.e., if city j will be visited directly after or before city i, d_ij will be added (here d_ij is the distance between the ith and jth cities); otherwise it will not. The division by 2 reflects that the distance between two cities is added twice, as d_ij and d_ji. To include the distance between the first and the last cities, another term must be added:

$$E_3 = -\tfrac{1}{2} B \sum_{i} \sum_{j \neq i} d_{ij} \left[ 1 + \operatorname{sgn}(\operatorname{abs}(V_i - V_j) - N + 1) \right]. \tag{36}$$

E_2 + E_3 is exactly (minus) the tour length times B. The total E will be

$$E = E_1 + E_2 + E_3. \tag{37}$$

Hence, the energy of a valid tour is E = A(N² - N) - B × (tour length).
It is obvious that E_max corresponds to the best solution when A and B are chosen properly. In the network, T_ij = d_ij and T_ii = d_ii = 0. The updating rule must monotonically increase the energy function E: a random value is chosen for neuron i and, with ΔE_i the energy change due to the change of neuron i, if ΔE_i ≥ 0 the new state is accepted; otherwise no state change takes place. Random asynchronous or synchronous updates can be considered. In asynchronous update, a neuron is randomly chosen and it is decided whether it should take a new random value (V_r) according to eqn (38):

$$V_i = \begin{cases} V_r & \text{if } \Delta E_i \geq 0 \\ V_i & \text{if } \Delta E_i < 0, \end{cases} \tag{38}$$

$$\begin{aligned} \Delta E_i = E(V_r) - E(V_i) = {} & A \sum_j \left[ (\operatorname{sgn}(V_r - V_j))^2 - (\operatorname{sgn}(V_i - V_j))^2 \right] \\ & - \tfrac{1}{2} B \sum_j d_{ij} \left[ [1 - \operatorname{sgn}(\operatorname{abs}(V_r - V_j) - 1)] - [1 - \operatorname{sgn}(\operatorname{abs}(V_i - V_j) - 1)] \right] \\ & - \tfrac{1}{2} B \sum_j d_{ij} \left[ [1 + \operatorname{sgn}(\operatorname{abs}(V_r - V_j) - N + 1)] - [1 + \operatorname{sgn}(\operatorname{abs}(V_i - V_j) - N + 1)] \right]. \end{aligned} \tag{39}$$
In synchronous mode, all neurons update simultaneously according to the same rule. The process continues until no energy change takes place; then the network is said to have converged.
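The whole mapping can be put together in a short sketch (ours; the city layout, seed and iteration budget are arbitrary, and A, B are taken from the 10-city row of Table 6). It evaluates E1, E2 and E3 of eqns (31), (35) and (36) directly and applies the asynchronous rule of eqn (38):

```python
import numpy as np

rng = np.random.default_rng(3)
Nc, A, B = 10, 1.5, 8.0                          # cities; A, B from Table 6
xy = rng.random((Nc, 2))                         # cities in the unit square
dist = np.linalg.norm(xy[:, None] - xy[None, :], axis=-1)
OFF = ~np.eye(Nc, dtype=bool)                    # i != j mask

def energy(v):
    diff = np.subtract.outer(v, v)
    ad = np.abs(diff)
    e1 = A * np.sum((np.sign(diff) ** 2)[OFF])                          # eqn (31)
    e2 = -0.5 * B * np.sum((dist * (1 - np.sign(ad - 1)))[OFF])         # eqn (35)
    e3 = -0.5 * B * np.sum((dist * (1 + np.sign(ad - (Nc - 1))))[OFF])  # eqn (36)
    return e1 + e2 + e3

v = np.full(Nc, Nc // 2)                         # start every neuron at N/2
e = energy(v)
for _ in range(500 * Nc):                        # random asynchronous updates
    i = rng.integers(Nc)
    old = v[i]
    v[i] = rng.integers(1, Nc + 1)               # try a random order V_r
    e_new = energy(v)
    if e_new >= e:                               # eqn (38): accept if dE >= 0
        e = e_new
    else:
        v[i] = old

if len(set(v)) == Nc:                            # valid tour: all orders distinct
    tour = np.argsort(v)                         # decode: k-th visited city
    length = sum(dist[tour[k], tour[(k + 1) % Nc]] for k in range(Nc))
    print("valid tour", tour, "length", length)
else:
    print("converged to an invalid tour", v)
```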
5.2. Mapping the Graph K-Partitioning Problem

The graph k-partitioning problem is an extensively studied combinatorial optimization problem. Given a graph G = (V, E) with N vertices with vertex weights w_i and edges e_ij between the vertices, the problem is to partition the graph into k partitions of nearly equal weight (i.e., the sums of the weights of the vertices assigned to the partitions are nearly equal) such that the cut size (i.e., the number of edges with end points in different partitions) is minimized.

The MAREN network for this problem consists of N neurons corresponding to the vertices of the graph, and each neuron can take a value between 1 and K corresponding to the K partitions. Also, each neuron holds a value w_i which stands for the weight of the corresponding vertex. In the steady state of the network, the vertices of the graph whose corresponding neurons have the same state belong to the same partition. The energy function has two components: (a) Σ e_ij (where V_i ≠ V_j) must be minimized; (b) Σ w_i w_j (where V_i = V_j) has a minimum value when the sums of the vertex weights assigned to each of the K partitions are equal, and a high value when there are fewer than K partitions. The term sgn(abs(V_i - V_j)) takes 0 when V_i = V_j and 1 when V_i ≠ V_j. Hence, the energy function is defined as

$$E = \tfrac{1}{2} A \sum_{i} \sum_{j \neq i} e_{ij} \operatorname{sgn}(\operatorname{abs}(V_i - V_j)) + B \sum_{i} \sum_{j \neq i} w_i w_j \left( 1 - \operatorname{sgn}(\operatorname{abs}(V_i - V_j)) \right). \tag{40}$$
Here, E must be minimized in order to reach a solution. The first term minimizes the weighted sum of the edges which belong to the cut, and the second term ensures that the partitions are balanced. The parameters A and B denote the relative importance of the constraint that the partitions be of equal size as compared to the value of the cut. To minimize the energy function we can follow the strategy used in the previous subsection: a random value for neuron i is tried, and it becomes the new state if ΔE_i < 0.

In the Hopfield network for this problem there are NK neurons arranged in matrix form; V_ij denotes the assignment of vertex i to partition j. The corresponding energy function contains four terms. The first and second terms guarantee that each vertex is assigned to one and only one partition; we do not need these terms in MAREN because this constraint is always satisfied. The third and fourth terms are functionally equivalent to the first and second terms of eqn (40).
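A minimal sketch of this mapping (ours; the random graph, the weights, and the values of A and B are illustrative assumptions) minimizes eqn (40) with the trial-and-error rule described above:

```python
import numpy as np

rng = np.random.default_rng(4)
N, K, A, B = 12, 3, 1.0, 0.1                     # vertices, partitions; A, B ours

w = rng.integers(1, 5, size=N)                   # vertex weights w_i
upper = np.triu(rng.random((N, N)) < 0.3, 1)     # random undirected edges
e_ij = (upper | upper.T).astype(float)           # edge weights e_ij (0/1 here)

def energy(v):
    # Eqn (40): cut term + balance term; sgn(abs(V_i - V_j)) is 1 across the cut.
    cut = np.sign(np.abs(np.subtract.outer(v, v)))
    off = ~np.eye(N, dtype=bool)
    return (0.5 * A * np.sum((e_ij * cut)[off])
            + B * np.sum((np.outer(w, w) * (1 - cut))[off]))

v = rng.integers(1, K + 1, size=N)               # random initial partition labels
e = energy(v)
for _ in range(300 * N):
    i = rng.integers(N)
    old = v[i]
    v[i] = rng.integers(1, K + 1)                # try a random partition V_r
    e_new = energy(v)
    if e_new < e:                                # accept only if E decreases
        e = e_new
    else:
        v[i] = old

print("partition weights:", [int(w[v == k].sum()) for k in range(1, K + 1)])
print("cut size:", 0.5 * np.sum(e_ij * np.sign(np.abs(np.subtract.outer(v, v)))))
```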
TABLE 6. TSP simulation results

              10^5 random trials     100 network trials
  Cities      Average   Minimum     % Better than Minimum   Minimum   % valid    A     B
  10          5.37      3.14        0                       3.14      55         1.5   8
  20          10.05     6.19        10                      5.39      72         2     8
  30          14.38     9.07        23                      6.49      64         1.4   5
  30*         14.38     9.07        13                      7.93      59         3.8   10
  50          22.57     16.58       54                      11.84     83         4     10

  * Synchronous update.
5.3. TSP Simulation Results

Simulations of the TSP for 10, 20, 30, 50 and 80 cities were performed on a microcomputer system. The locations of the cities were chosen at random in the interior of a square of edge length 1. The A and B parameters in eqns (31), (35), and (36) were easily found by a number of random choices. Choosing A >> B results in almost always valid tours, but the tour lengths then become longer. Choosing B >> A strongly favors short tours rather than long ones; however, most of the trials then converge to invalid tours. The network was started by setting all the neurons equal to N/2 in order not to favour any tour. Then each neuron was updated randomly and asynchronously according to eqn (38) until the energy did not increase anymore. Tour lengths obtained by the model were compared to random tour lengths. In Table 6, 10^5 random tours are compared to 100 network trials. As the number of cities increases, the results get better. For very large scale problems, convergence difficulties are expected to emerge; however, the network was seen to converge easily even for 80 cities. The second and third columns in Table 6 indicate the average and minimum of 100 000 random tours. The fourth column gives the percentage of the 100 network simulations that produced tours better than the third column. The minimum of the 100 trials can be found in column 5. The percentage of valid tours is depicted in column 6. The last two columns show the parameters A and B used in the simulations. For the 50-city problem the network did an excellent job, performing better than the minimum of 10^5 random tours 54% of the time. Remember that in a 50-city TSP there are 3.04 × 10^62 (50!/(2 × 50)) distinct tours. Synchronous update was also tested, and somewhat inferior results were obtained.
6. CONCLUSION

In this paper, three different multivalued neural network paradigms based on the same core idea are presented. All three structures hold information in a distributed manner. The crucial point which must be emphasized is that sgn(V_i - V_j) in MAREN and M3 encodes the ordering ("bigger" or "smaller") information among the components of a stored pattern, just as (2V_i - 1)(2V_j - 1) in Hopfield (for binary patterns) and V_i V_j in BAM (for bipolar vectors) encode the "same" or "different" information.

Admittedly, the error rates in MAREN and M3 are not small. However, considering the 20% error, it must be pointed out that MAREN and Hopfield use the same amount of information (connections); yet, unlike the Hopfield memory, which only decides whether a neuron's value is 1 or 0, MAREN is expected to find a value among M possibilities. Also, compared to BAM, M3 is expected to find a neuron's value among M possibilities.

Results obtained by the MAREN optimization model are comparable to Hopfield's. The proposed network has several advantages. (a) It is a discrete network, so it can be easily implemented or simulated with existing digital technology; remember that a discrete Hopfield network can not perform better than random (Hopfield & Tank, 1985). (b) It is much faster (especially in synchronous mode) than discrete models which use simulated annealing. (c) Unlike the Hopfield model, which has convergence difficulties, it can be applied to large scale problems. (d) Because it has a multivalued and not a binary character, it is easy to map a problem on to the network; e.g., there is no need for a permutation matrix in TSP. The major disadvantage is that it is not as robust as Hopfield, and the computations performed in the neurons are relatively more complex.
REFERENCES

Erdem, M. H., & Ozturk, Y. (1994a). MAREN: multivalued recurrent network. Accepted in 3rd Inter. Conf. on Automation, Robotics and Computer Vision, Singapore.

Erdem, M. H., & Ozturk, Y. (1994b). M3: multivalued multidirectional (associative) memory. Accepted in 3rd Inter. Conf. on Automation, Robotics and Computer Vision, Singapore.

Erdem, M. H., & Ozturk, Y. (1994c). MAREN solves optimization problems. Accepted in 3rd Inter. Conf. on Automation, Robotics and Computer Vision, Singapore.

Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proc. Nat. Acad. Sci. USA, 79, 2554-2558.

Hopfield, J. J. (1984). Neurons with graded response have collective computational properties like those of two-state neurons. Proc. Nat. Acad. Sci. USA, 81, 3088-3092.

Hopfield, J. J., & Tank, D. W. (1985). Neural computation of decisions in optimization problems. Biological Cybernetics, 52, 141-152.

Kandel, E., & Hawkins, R. (1992). The biological basis of learning and individuality. Scientific American, September.

Kosko, B. (1988). Bidirectional associative memories. IEEE Trans. on Systems, Man, and Cybernetics, 18(1).

Lippman, R. P. (1987). An introduction to computing with neural nets. IEEE ASSP Magazine, April, 4-22.

Ogata, K. (1970). Modern control engineering. Englewood Cliffs, NJ: Prentice Hall.

Tank, D. W., & Hopfield, J. J. (1986). Simple neural optimization networks. IEEE Trans. on Circuits and Systems, CAS-33, 533-541.

Wilson, G. V., & Pawley, G. S. (1988). On the stability of the traveling salesman problem algorithm of Hopfield and Tank. Biological Cybernetics, 58.