On the performance of Hopfield network for graph search problem


Neurocomputing 14 (1997) 365-381

Gursel Serpen a,*, Azadeh Parvin b

a Department of Electrical Engineering and Computer Science, The University of Toledo, Toledo, OH 43606, USA
b Department of Civil Engineering, The University of Toledo, Toledo, OH 43606, USA

Received 9 June 1995; accepted 31 May 1996

Abstract

This paper presents a study of the performance of the Hopfield neural network algorithm on the graph path search problem. Specifically, the performance of the Hopfield network is studied from the dynamic systems stability perspective. Simulations of the time behavior of the neural network are augmented with an exhaustive stability analysis of the equilibrium points of the network dynamics. The goal is to understand the reasons for a well-known deficiency of the Hopfield network algorithm: its inability to scale up with the problem size. A recent procedure, which establishes solutions as stable equilibrium points in the state space of the network dynamics, is employed to define the constraint weight parameters of the Hopfield neural network. A simulation study of the network and a stability analysis of equilibrium points indicate that a large set of non-solution equilibrium points also becomes stable whenever the constraint weight parameters are set to make the solution equilibrium points stable. As the problem size grows, the number of stable non-solution equilibrium points increases at a much faster rate than the number of stable solution equilibrium points: the network becomes more likely to converge to a non-solution equilibrium point.

Keywords: Neural networks; Hopfield networks; Weight parameters; Stability analysis; Dynamic system; Optimization; Constraint satisfaction

* Corresponding author. Email: [email protected].

0925-2312/97/$17.00 Copyright © 1997 Elsevier Science B.V. All rights reserved. PII S0925-2312(96)00037-9


1. Introduction

Hopfield networks have been employed as fixed-point attractors to solve a large set of constraint satisfaction and optimization problems [3-7]. The time behavior of the network can be succinctly expressed as a gradient descent search in the space of the Lyapunov function, which provides a measure for the degree to which the set of constraints associated with a given problem is satisfied [16]. The main promise of the Hopfield network is to converge to the stable fixed point located within the basin of attraction implied by the initial conditions of the network dynamics. The Hopfield network can successfully address certain constraint satisfaction or optimization problems for which a local optimum solution is acceptable [2,12,15]. A Boltzmann machine, a stochastic version of the Hopfield neural network algorithm, is employed if the global optimum solution is desired [21].

When the Hopfield network is employed in a fixed-point attractor mode, the neural network design task incorporates establishing the solutions of a given optimization problem as local minimum points in the Lyapunov space or, equivalently, stable points in the state space. This task was performed in an ad hoc manner until recently, since the original Hopfield algorithm did not propose a way to define the constraint weight parameters [1,2,13,14,17-19,21,23,24]. In most cases, constraint weight parameters were set using empirical guidelines. As a result, the stability properties of solution equilibrium points could not be determined. Recently, Serpen [19] and Abe [1] have devised procedures, employing two different approaches, to set the constraint weight parameters so as to establish solutions as stable equilibrium points in the state space of the problem dynamics. Most studies in the literature consider the Traveling Salesman Problem (TSP), which is NP-complete, as the benchmark problem for performance analysis of the Hopfield network algorithm.
In an earlier work [18], it was shown that the Hopfield network converged to a solution of the TSP after each relaxation when the constraint weight parameters were set in accordance with the procedure specified in Serpen [19]. However, the solution quality was, at best, locally optimum. A state space analysis indicated that all solutions were stable and no other equilibrium point was stable. In summary, the Hopfield network always located a solution for the TSP, but the quality of solutions, in terms of the overall travel distance, was average. On the other hand, initial work with the path search problem in graphs indicated that the Hopfield network with the constraint weight parameters defined as in Serpen [19] often failed to converge to a solution. The graph path search problem (GPSP) proved to be "hard" for the Hopfield network paradigm.

The goal of this paper is to investigate the reasons for the poor performance of the Hopfield network on the GPSP. Towards that goal, an extensive set of simulations and a mathematical stability analysis will be performed. Specifically, the stability of solution points, the set of stable equilibrium points, and the convergence properties of the Hopfield network algorithm will be studied.

The structure of the paper is as follows. The discrete Hopfield network is presented in the remainder of Section 1. Definitions of the GPSP, the network topology, the energy function, and bounds on the constraint weight parameters are given in Section 2. A simulation study and a


mathematical stability analysis are presented in Section 3. Conclusions are presented in Section 4.

1.1. List of symbols

Presented below is a list of symbols employed in the remainder of the paper.

s_i      output of network node i.
w_ij     weight between nodes i and j.
b_i      external bias term for node i.
θ_i      threshold of node i.
E        energy or Lyapunov function.
net_i    network input for node i.
C_φ      constraint φ.
g_φ      weight parameter for C_φ.
δ_ij^φ   function representing the interaction between nodes i and j under C_φ.
d_ij^φ   function representing the cost of the interaction between nodes i and j under C_φ.

1.2. Definitions

A list of definitions which will be employed in the remainder of the paper is presented next.

Definition 1. The state space set contains all 2^N N-bit binary vectors for an N-node network.

Definition 2. The stable point set includes those binary vectors which are stable points of the Hopfield network dynamics for a given problem.

Definition 3. The solution set contains those N-bit binary vectors for an N-node network which are solutions of a problem.

Definition 4. The stable solution set consists of those solutions of a given problem which are stable points of the network dynamics.

Definition 5. A relaxation (iteration) of the Hopfield network is the total computation effort for the network to start from an initial state and to converge to a final state.

Definition 6. The convergence rate is computed as the number of convergences to solutions divided by the number of total relaxations attempted.

Definition 7. The convergence ratio is calculated by dividing the number of elements in the stable solution set by the number of elements in the stable point set.


Definition 8. An operating point is an instance of the values for the set of constraint weight parameters.

Definition 9. For an N × N array network topology, row(i) and col(i) are equal to the row and column indices of the node s_i with i = 1,2,...,N², respectively.

Definition 10. A constraint is called hard if violating it necessarily prevents the network from finding a solution.

Definition 11. A soft constraint is employed to map a cost measure associated with the quality of a solution, as typically found in optimization problems.

1.3. The discrete Hopfield network

The discrete Hopfield network [8-11] is a non-linear dynamic system with the following formal definition. Let s_i represent a node output, where s_i ∈ {0,1} for i = 1,2,...,N and N is the number of network nodes. Then, the equation given by

E = -(1/2) Σ_{i=1}^{N} Σ_{j=1, j≠i}^{N} w_ij s_i s_j − Σ_{i=1}^{N} b_i s_i + Σ_{i=1}^{N} θ_i s_i    (1)

is the Lyapunov function whose local minima are the final states of the network with node dynamics defined by

s_i^{k+1} = 0 if net_i^k < θ_i,
s_i^{k+1} = 1 if net_i^k > θ_i,
s_i^{k+1} = s_i^k if net_i^k = θ_i,  i = 1,2,...,N,    (2)

where k is a discrete time index,

net_i^k = Σ_{j=1}^{N} w_ij s_j^k + b_i    (3)

with i ≠ j, and θ_i is the threshold of node s_i. The weight term is defined by

w_ij = Σ_{φ=1}^{Z} g_φ δ_ij^φ d_ij^φ,

where Z is the number of constraints. Given the set of constraints C_φ ∈ {C_1, C_2,...,C_Z}, g_φ ∈ R+ if the hypotheses nodes s_i and s_j represent under C_φ are mutually supporting, and g_φ ∈ R− if the same hypotheses are mutually conflicting. The term δ_ij^φ is equal to 1 if the two hypotheses represented by nodes s_i and s_j are related under C_φ and is equal to 0 otherwise. The d_ij^φ term is equal to 1 for all i and j under a hard constraint and is a predefined cost for a soft constraint, which is typically associated with a cost term in optimization problems.
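As an illustration, the asynchronous node update of Eqs. (2) and (3) can be sketched in Python with NumPy. This is a minimal sketch, not the simulator used in the paper; the function name `relax` and the sweep limit are our own choices.

```python
import numpy as np

def relax(w, b, theta, s, rng, max_sweeps=100):
    """Relax a discrete Hopfield network to a fixed point.

    w: (N, N) symmetric weights with zero diagonal (enforces i != j)
    b: (N,) external biases; theta: (N,) thresholds
    s: (N,) initial binary state with entries in {0, 1}
    """
    n = len(s)
    for _ in range(max_sweeps):
        changed = False
        for i in rng.permutation(n):       # asynchronous updates, random order
            net = w[i] @ s + b[i]          # Eq. (3)
            if net > theta[i] and s[i] == 0:
                s[i], changed = 1, True    # Eq. (2): net input above threshold
            elif net < theta[i] and s[i] == 1:
                s[i], changed = 0, True    # Eq. (2): net input below threshold
            # net == theta[i]: node output left unchanged
        if not changed:                    # a full sweep changed nothing: fixed point
            break
    return s
```

With two mutually inhibitory nodes and positive biases, for example, the network settles into a state with exactly one node active, whichever node is updated first.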


2. Graph path search problem

A graph with directed edges is used as a mathematical model for a number of real-life problems [5]; example problems include path planning in a task state space, the non-linear multi-commodity flow problem, and the routing problem in computer networks. Determining a solution to those problems often requires computation of the shortest path between two vertices of a graph: the source vertex and the target vertex.

A directed graph is a set of m vertices, V_i, i = 1,2,...,m, and k (= m²) directed edges, e_ij, i,j = 1,2,...,m, where some of the edges may not exist. Each directed edge in the graph has an associated length or cost. The length of an edge is represented by a real number. A path between a source vertex, V_s, and a target vertex, V_t, is an ordered sequence of non-zero length edges. The path length is given by the algebraic sum of the lengths of all the edges in that path. A solution for the GPSP is then to identify the path which has the minimum length. For the case where all edges have the same length, the shortest path is equivalent to the minimum number of edges which connect the source vertex to the target vertex. The scope of this paper is limited to directed graphs with non-weighted edges, which will be adequate for the purposes of this study.

From the graph-theoretic viewpoint, the shortest path between two vertices of a non-weighted directed graph is defined as a sub-graph which meets all of the following criteria:
- The sub-graph representing a path is both asymmetric and irreflexive.
- Each vertex, except the source and target vertices, must have in-degree of one and out-degree of one.
- The source vertex has in-degree of zero and out-degree of one.
- The target vertex has in-degree of one and out-degree of zero.
- The length of the shortest path is equal to the power of the adjacency matrix which has the first non-zero entry in the row and column locations defined by the source and target vertices, respectively.
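The last criterion, reading the shortest path length off successive powers of the adjacency matrix, can be illustrated in Python with NumPy; this is a sketch under our own naming, not code from the paper:

```python
import numpy as np

def shortest_path_length(adj, source, target):
    """Shortest path length from source to target in a non-weighted
    digraph: the lowest power of the adjacency matrix whose
    (source, target) entry is non-zero."""
    m = adj.shape[0]
    power = np.eye(m, dtype=int)
    for length in range(1, m):
        power = (power @ adj > 0).astype(int)  # Boolean matrix power
        if power[source, target]:
            return length
    return None  # target not reachable from source

# 4-vertex digraph with edges 0->1, 0->2, 1->2, 2->3
adj = np.array([[0, 1, 1, 0],
                [0, 0, 1, 0],
                [0, 0, 0, 1],
                [0, 0, 0, 0]])
```

Here the shortest path from vertex 0 to vertex 3 is 0 → 2 → 3, and the function reports its length as the power of the adjacency matrix at which the (0, 3) entry first becomes non-zero.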
Given the graph-theoretic constraints a path specification has to satisfy, the next step is to define the corresponding topological constraints of the GPSP. The topology of the network is modeled after the adjacency matrix of a given graph, where each network node represents a graph edge. An active node at row r and column c indicates that the edge from vertex V_r to vertex V_c is included in the path specification. Since the shortest path is irreflexive, the nodes located along the main diagonal are clamped to zero, as these nodes represent the hypothesis that the shortest path has a self-loop at vertex V_i. In order to map the in-degree-of-zero constraint for the source vertex, the nodes in the column labeled by the source vertex are clamped to zero. Similarly, the nodes in the row labeled by the target vertex are clamped to zero to enforce the out-degree-of-zero constraint for the target vertex. The nodes in the locations where there is a zero entry in the associated adjacency matrix are also clamped to zero, since a zero entry in the adjacency matrix implies that the graph does not have an edge between the related vertices. An adjacency matrix has N² entries for an N-vertex digraph. Given that certain nodes are clamped to zero, the network will have K < N² unclamped nodes. Note that the clamped nodes need not be included in the computations associated with the network simulations.
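The clamping rules above can be collected into a single Boolean mask; the following is a sketch, and the helper name `clamp_mask` is ours:

```python
import numpy as np

def clamp_mask(adj, source, target):
    """Boolean mask of the network nodes clamped to zero.

    Node (r, c) hypothesizes that edge V_r -> V_c lies on the path.
    Clamped to zero: the main diagonal (no self-loops), the source
    vertex column (in-degree zero), the target vertex row (out-degree
    zero), and positions where the graph has no edge.
    """
    n = adj.shape[0]
    clamped = np.zeros((n, n), dtype=bool)
    np.fill_diagonal(clamped, True)   # irreflexivity
    clamped[:, source] = True         # in-degree of source vertex is zero
    clamped[target, :] = True         # out-degree of target vertex is zero
    clamped[adj == 0] = True          # absent edge: hypothesis excluded
    return clamped
```

The number of unclamped nodes, K = (~clamp_mask(adj, s, t)).sum(), is then strictly less than N².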

2.1. Definition of the energy function

The asymmetry of a graph which represents the shortest path requires that at most one of the two entries located at symmetric positions with respect to the main diagonal of the adjacency matrix be equal to one. Any two nodes in the network interact if the row index of node s_i equals the column index of node s_j and the column index of node s_i equals the row index of node s_j. Since only one of two interacting nodes can be equal to 1, the type of interaction between the nodes is inhibitory, g_a ∈ R−. The energy term for this inhibitory constraint is of the form

E_a = -(g_a/2) Σ_{i=1}^{K} Σ_{j=1, j≠i}^{K} δ_ij^a s_i s_j,

where δ_ij^a = 1 if row(i) = col(j) and col(i) = row(j); otherwise δ_ij^a = 0, for i,j = 1,2,...,K and i ≠ j. The subscript/superscript a indicates the asymmetry inhibition.

Consider the network nodes in rows and columns not associated with the source or target vertices. If a vertex is included in the shortest path specification, then there is exactly one active node in both the row and the column labeled by this vertex. On the other hand, there are no active nodes within the rows and columns labeled by vertices which do not belong to the path specification. Thus, when the network converges to a solution, there can be at most one node active per row or column. A digraph vertex belonging to the path specification and having in-degree of one implies the existence of a single 1 in the associated column of the adjacency matrix. The condition for two nodes to interact under the column constraint is that the column index of node s_i equals the column index of node s_j. Similarly, a digraph vertex with an out-degree of one requires exactly a single 1 to exist in the associated row of the adjacency matrix. Two nodes interact under the row constraint if the row index of node s_i equals the row index of node s_j. These constraints permit at most one of the nodes in a column or row to be active at any time; thus the type of interaction is inhibitory, with g_r, g_c ∈ R−. The energy term for the column inhibition constraint is defined by

E_c = -(g_c/2) Σ_{i=1}^{K} Σ_{j=1, j≠i}^{K} δ_ij^c s_i s_j,

where δ_ij^c = 1 if col(i) = col(j); otherwise δ_ij^c = 0, for i,j = 1,2,...,K and i ≠ j, given that both col(i) and col(j) do not correspond to the target vertex column. Similarly, the energy term for the row inhibition constraint is defined by

E_r = -(g_r/2) Σ_{i=1}^{K} Σ_{j=1, j≠i}^{K} δ_ij^r s_i s_j,

where δ_ij^r = 1 if row(i) = row(j); otherwise δ_ij^r = 0, for i,j = 1,2,...,K and i ≠ j, given that both row(i) and row(j) are not the source vertex row. Subscripts/superscripts r and c indicate the row and the column inhibition constraints, respectively. Since the source vertex must have out-degree of one, there is exactly one node active within the source vertex row for a solution array. Given the target vertex has in-degree

of one, exactly one node is active within the target vertex column for a solution array. These two constraints can be mapped employing the energy terms defined by

E_s = -(g_r/2) (Σ_{l=1}^{N} s_kl − 1)²  and  E_t = -(g_c/2) (Σ_{m=1}^{N} s_mn − 1)²,

where k represents the source vertex row and l is the index for the columns of the network in the first energy term, n represents the target vertex column and m is the index for the rows of the network in the second energy term, and g_r ∈ R− and g_c ∈ R− are the constraint weight parameters associated with the row and the column inhibition constraints, respectively.

The in-degree and out-degree values for the vertices of a solution graph depend on whether a particular vertex belongs to the shortest path specification. Any vertex other than the source and the target vertices has both the in-degree and the out-degree equal to one if that vertex belongs to the path specification. The source vertex has in-degree of zero and out-degree of one. The target vertex has in-degree of one and out-degree of zero. A vertex not in the path specification has both the in-degree and the out-degree equal to zero. In terms of the adjacency matrix, a vertex with both the in-degree and the out-degree equal to one implies a 1 in both the row and the column labeled by this vertex. In general, if there exists a 1 in a particular row r and column c, then there must exist a 1 in the corresponding column r and row c of the solution array. This constraint, which enforces both the in-degree and the out-degree of a vertex within the path specification to be equal to 1, will be decomposed into two sub-constraints, called the sub-constraint column-to-row excitation and the sub-constraint row-to-column excitation, for ease of mapping to the network topology. The sub-constraint column-to-row excitation states that if the in-degree of vertex V_i is one, then the out-degree of the same vertex must also be one. For the sub-constraint row-to-column excitation, if the out-degree of vertex V_i is one, then the in-degree of the same vertex must also be one.
In terms of the network topology, if the column sum associated with the vertex V_i is equal to one, then the row sum associated with the same vertex must also be one for the sub-constraint column-to-row excitation. The nodes in the target vertex column do not belong to the interaction topology of this sub-constraint, since the out-degree of the target vertex is set to zero. Similarly, if the row sum associated with the vertex V_i is one, then the column sum associated with the same vertex must be one for the sub-constraint row-to-column excitation. The nodes in the source vertex row do not interact under the sub-constraint row-to-column excitation, since the nodes in the source vertex column are clamped to zero to set the in-degree of the source vertex to zero. Thus, two nodes, s_i and s_j, interact under the sub-constraint column-to-row excitation if the column index of s_i is equal to the row index of s_j and s_i is not located within the target vertex column, because the nodes in the target vertex row are clamped to zero. Similarly, two nodes, s_i and s_j, interact under the sub-constraint row-to-column excitation if the row index of s_i is equal to the column index of s_j and s_i is not located within the source vertex row, since the nodes within the source vertex column are clamped to zero. The nodes of a given row and column pair interact under the sub-constraint


column-to-row excitation or the sub-constraint row-to-column excitation. Sub-constraint column-to-row excitation can be mapped by

E_cr = g_x Σ_{c1} (Σ_{r1} s_{r1 c1}) (1 − Σ_{c2} s_{c1 c2}),

where r_1 is the index for the rows and c_1, c_2 are the indices for the columns of the network topology, with the property that both c_1 and c_2 are not the target vertex column. Similarly, sub-constraint row-to-column excitation can be mapped by

E_rc = g_x Σ_{r1} (Σ_{c1} s_{r1 c1}) (1 − Σ_{r2} s_{r2 r1}),

where c_1 is the index for the columns and r_1, r_2 are the indices for the rows of the network topology, with the property that both r_1 and r_2 are not the source vertex row. These energy terms have a value of zero if the corresponding column/row sums are both equal to zero or both equal to one. The values of the energy terms are greater than zero if a column/row sum is zero and the corresponding row/column sum is greater than zero. On the other hand, the energy term values become less than zero if a column/row sum is equal to one and the corresponding row/column sum is greater than one. Although the energy terms favor each of the interacting columns and rows having more than one node active, other constraints force the network dynamics to favor row and column sums with a value of one. A single constraint weight parameter for both sub-constraints, g_x ∈ R+, will be employed, given that the two sub-constraints are obtained by decomposing the original constraint.

Any solution path must be of minimum length. The length of the shortest path can be computed without knowing the actual path itself. The global inhibition constraint can be used to force the network to have M-out-of-K nodes active. The value of M is equal to the length of the shortest path less 2, since the nodes within the source vertex row and the target vertex column do not interact under this constraint and a solution array has one active node within both the source vertex row and the target vertex column. The following energy term is minimum when exactly M nodes are active,

E_g = -(g_g/2) (Σ_i s_i − M)²,

where the sum runs over i = 1,2,...,K such that row(i) and col(i) are not the source vertex row and the target vertex column, respectively. The energy function for the network is the algebraic sum of all individual energy terms and is given by E = E_a + E_r + E_c + E_s + E_t + E_cr + E_rc + E_g. In accordance with the work by Serpen [19], the applicable bound on the constraint weight parameters for solutions of the GPSP to be stable is given by

|g_r| + |g_c| + M·|g_g| ≤ g_x.    (4)

Note that this inequality establishes a relationship only between the magnitudes of the constraint weight parameters.
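To make the weight construction concrete, the pairwise inhibition constraints (row, column, and asymmetry) can be superposed into a weight matrix as below. This is only a sketch: it ignores the source-row/target-column exclusions, the excitation sub-constraints, and the global inhibition term, and the function name and flattening convention are our own.

```python
import numpy as np

def inhibition_weights(n, g_r, g_c, g_a):
    """Superpose row, column, and asymmetry inhibition into w_ij.

    Nodes are the cells of an n x n array, flattened as
    i = row(i) * n + col(i); all three g parameters are negative.
    """
    k = n * n
    w = np.zeros((k, k))
    for i in range(k):
        ri, ci = divmod(i, n)
        for j in range(k):
            if i == j:
                continue
            rj, cj = divmod(j, n)
            if ri == rj:
                w[i, j] += g_r                 # row inhibition
            if ci == cj:
                w[i, j] += g_c                 # column inhibition
            if ri == cj and ci == rj:
                w[i, j] += g_a                 # asymmetry inhibition
    return w
```

Because every interaction condition is symmetric in i and j, the resulting weight matrix is symmetric with a zero diagonal, as Eq. (1) requires.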

3. Simulation analysis

The identification of the set-ordering relationship between two sets of interest, the solution set and the stable point set, is necessary to understand the convergence characteristics of the Hopfield network. The solution set represents all those output vectors that satisfy the constraints of a given problem, and the stable point set consists of all those output vectors that are stable equilibrium points in the state space of the Hopfield network dynamics. It is important to observe that if the solution set is equal to the stable point set, then the Hopfield network will always converge to a solution point after each relaxation. Any other relationship between these two sets will either cause some solution points to be unstable (in the case where the stable point set is a proper subset of the solution point set) or some non-solution points to be stable (in the case where the solution point set is a proper subset of the stable point set). The latter is the only feasible case for the bounds given by Eq. (4) since, by definition, the bounds on the constraint weight parameters, as suggested by Abe [1] and Serpen [19], establish the stability of all solution points.

3.1. Description of the simulation study and testing methodology

Evaluation of the network performance is realized by employing two techniques: a simulation-based relaxation study and a mathematical stability analysis. The simulation-based relaxation study is used to observe the time behavior of the network dynamics. Specifically, the rate at which the network converges to fixed points is used as the measure for the network performance. The simulation-based relaxation study provides a statistical estimate of the convergence rate. Stability analysis of equilibrium points in the problem state space is used to identify the set of stable points.
A second measure of network performance, the convergence ratio, is computed by dividing the number of stable solution points by the number of stable equilibrium points, using the findings of the mathematical stability analysis. The two network performance measures, the convergence rate and the convergence ratio, are expected to correlate to a very large degree by definition.

Stability analysis of all equilibrium points in the problem state space is not computationally feasible for large problem sizes. A K-node network with discrete node dynamics will have 2^K states in the K-dimensional state space. This large number indicates that the upper limit for the problem size is the 5 × 5 node network for problems which require two-dimensional arrays as network topologies, where no nodes are clamped to either zero or one.

The parameter set employed in the study included the graph size, the connectivity of the graph, and the path length. Instances of the directed graphs which were employed in the simulation study and in the stability analysis were created by modifying:
1. the number of vertices in the directed graph,
2. the connectivity of the digraph, which is the ratio of the number of edges in the graph to the number of edges in the same graph when it is fully connected, and
3. the path length between the source and the target vertices of the digraph.
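The exhaustive stability analysis described above amounts to checking, for each of the 2^K binary states, that no single node update of Eq. (2) changes the state. A sketch under our own naming, feasible only for small K:

```python
import itertools
import numpy as np

def stable_points(w, b, theta):
    """Return all states of a K-node discrete Hopfield network that are
    fixed points of the node dynamics: every active node keeps its net
    input at or above its threshold, every inactive node at or below it."""
    k = len(b)
    stable = []
    for bits in itertools.product((0, 1), repeat=k):
        s = np.array(bits)
        net = w @ s + b                  # net inputs with the state held fixed
        on = s == 1
        if np.all(net[on] >= theta[on]) and np.all(net[~on] <= theta[~on]):
            stable.append(bits)
    return stable
```

For two mutually inhibitory nodes with positive biases, for example, exactly the two one-active states are stable.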

Table 1
Operating point definitions for the GPSP: each of the nine operating points (1-9) is an instance of values for the constraint weight parameters.

Graph edges were randomly defined in order to establish the generality of the simulation results. A random variable p_i, uniform in the interval [0,1], was used. An edge e_i was included in the graph specification if the following inequality held: p_i > 1 − Connectivity Level. To test the performance of the network, the number of vertices, the connectivity level, and the desired path length were provided to the algorithm. The length of a path between any two pairs of vertices of the digraph was computed using the adjacency matrix, although the path itself was not known. A path length of less than three was not used in any of the test cases. The operating points for the network were generated using Eq. (4). Relative magnitudes of the constraint weight parameters, rather than their absolute values, were manipulated to generate a complete set of operating points, which are presented in Table 1.

3.2. Simulation results

An initial evaluation of the network performance indicated that the convergence rate was less than 100% and varied significantly as the graph size, the operating point, the path length, the connectivity level, and the graph instances differed. Therefore, structured tests were run to better understand the relationship between the convergence rate and the variables in the parameter set. Additionally, the dependencies between the variables in the parameter set were also taken into consideration, and the simulation study was modified accordingly. An example of this is the dependence of the path length on the graph size, the connectivity level, and the graph instance, where the path length decreases as the connectivity level increases. In the tests that follow, three out of four variables (the graph size, the connectivity level, the path length, and the operating point) are fixed and the convergence rate is observed as the fourth one varies. The fifth variable, the graph instance, cannot be controlled since the graph edges are randomly created.
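The random graph generation used in these tests (an edge included when a uniform [0,1] draw exceeds 1 − Connectivity Level) can be sketched as follows; the function name and the use of NumPy's random generator are our own choices:

```python
import numpy as np

def random_digraph(m, connectivity, rng):
    """Random m-vertex digraph: each off-diagonal adjacency entry is set
    to 1 when a uniform [0, 1] draw p satisfies p > 1 - connectivity."""
    p = rng.random((m, m))
    adj = (p > 1.0 - connectivity).astype(int)
    np.fill_diagonal(adj, 0)  # shortest paths are irreflexive: no self-loops
    return adj
```

The expected connectivity of the generated graph, measured as the fraction of possible off-diagonal edges present, approaches the requested level as m grows.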
The first test was run for the case where the operating point, the graph size, and the path length were fixed and the connectivity level varied. The network performance strongly depends on the connectivity level, as shown in Fig. 1. Specifically, the convergence rate and the convergence ratio decrease as the connectivity increases: they drop from 50-60% to about 15% as the connectivity increases from 0.2 to 0.8. There is a high level of correlation between the convergence ratio and the convergence rate as the connectivity varies (the correlation coefficient is 0.9908). The decrease in the convergence ratio indicates that the cardinality of the stable


Fig. 1. Performance analysis at various connectivity levels for the GPSP (performance vs. connectivity; 6-vertex graph, path length of 3, operating point 7).

solution set becomes much smaller than that of the stable point set as the connectivity increases. As a result, the network does not always converge to a solution.

The second test case involved the evaluation of the network performance as the operating point varied. This test was performed with the following parameter values: a digraph size of six vertices, a path length of 3, and a connectivity of 0.6. The same graph instance was used for all experiments. The size of the state space set and the number of solutions for this graph instance were 4096 and 3, respectively. The results presented in Fig. 2 indicate that the network performance varied significantly as the operating point changed. Operating point 1 guided the network towards an almost 70% convergence rate, while operating point 5 caused the network performance to drop down to approximately a 20% convergence rate. Additionally, the convergence ratio is not highly correlated with the convergence rate. It is tempting to infer that the initialization of the network and the shape and size of the basins of attraction associated with the stable equilibrium points played a significant role in the lack of correlation. Exhaustive stability evaluation of equilibrium points demonstrated that all solutions are stable.

The third test case involved studying the variation of the convergence rate and the convergence ratio with respect to the variation in the graph size. In the simulation studies, a graph connectivity level of 0.3 and a path length of half the number of vertices were used. For the graphs with an odd number of vertices, the path length was determined by rounding the computed value up to the next larger integer. The operating point employed in the experiments was number 8 in Table 1. Simulation results, presented in Fig. 3, indicate that the network performance degraded drastically (the convergence rate dropped

Fig. 2. Performance analysis at various operating points for the GPSP (performance vs. operating point; 6-vertex graph, path length of 3, connectivity 0.6).


Fig. 3. Performance analysis for various GPSP sizes (performance vs. problem size; operating point 8, connectivity 0.3).

from 70% to about 10% for a minimal increase in the vertex count from 6 to 10) as the graph size increased. Similarly, the convergence ratio dropped from 30% to 1% for the same increase in the graph size, which correlated highly with the variations in the convergence rate.

The next test case was set up to observe the change in the convergence rate with respect to changes in the path length and the connectivity level for a 10-vertex graph. The connectivity level and the path length varied in the ranges [0.1,0.8] and [4,8], respectively. Operating point number 8 in Table 1 was employed for this test case. The results are summarized in Table 2. The symbol "×" in a box located at row r and column c means that a 10-vertex graph with a path length of c and a connectivity of r could not be generated in 200 attempts. The data in Table 2 demonstrate that the convergence rate is significantly higher for low values of the graph connectivity or the path length. As an example, the convergence rate is 56% for a connectivity level of 0.3 and a path length of 4. The convergence rate drops to 1% for a path length of 8 and the same connectivity level. The simulation results presented in Table 2 correlate with the findings in Fig. 2 and Fig. 3 to a large degree. Exhaustive stability analysis was not conducted for this test, since the 10-vertex problem with a 0.8 connectivity level generates approximately 5.7 × 10^17 100-bit binary vectors to evaluate.

Another study analyzed the performance of the network with respect to variations in the graph instances. A graph with 10 vertices, 0.3 connectivity, and a path length of 5 was employed. The constraint weight parameters were set in accordance with the

Table 2. Convergence rate vs. path length × connectivity level for the GPSP (rows: connectivity levels 0.1 through 0.6; columns: path lengths 4 through 8). Rates range from 0.56 at connectivity 0.3 and path length 4 down to 0.01 at the same connectivity and path length 8; "X" marks instances that could not be generated in 200 attempts.

Fig. 4. Convergence rate vs. graph instance for the GPSP (histogram of graph instances binned by convergence-rate interval, from [.04, .06) through [.16, .18)).

operating point number 8 in Table 1. Analysis of the data in Fig. 4 indicates that the network performance varies considerably as the graph instance changes for fixed values of the operating point, the connectivity level, the path length, and the graph size. The simulation studies indicate that the performance of the network depends heavily on the choice of operating point, the problem size, the path length, and the connectivity of the graphs. The exhaustive mathematical stability analysis confirmed that the stable equilibrium point set included all solution points and often a significant number of non-solution points. The number of non-solution equilibrium points that are stable depended on the values of the operating point, the problem size, the graph connectivity, the path length, and the graph instance. Therefore, as these parameters were changed, the cardinality of the stable equilibrium point set, which necessarily included all solutions, also changed, as reflected by the values of the convergence ratio parameter. The convergence ratio showed a good degree of correlation with the convergence rate and helped explain the results obtained through simulation.

3.2.1. Performance vs. problem size

The performance of the network for graph sizes of up to, and including, 50 vertices was studied next. The previous work indicates that the convergence rate is likely to drop significantly as the graph size increases. Thus, the variables (the graph connectivity, the path length, and the operating point) were set to maximize the convergence rate, so that the network would be likely to converge to a solution without requiring unreasonable simulation time. Earlier simulation work shows that the convergence rate is highest for low graph connectivity, low path lengths, and the operating point given by number 1 in Table 1. The operating point used in the testing of the 20- through 50-vertex graphs is very close to number 1 in the constraint weight parameter space and is defined by number 8 in Table 1.
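The exhaustive stability analysis referred to above can be illustrated with a small sketch. This is not the authors' code, only the standard stability test for a discrete Hopfield network: a binary state is a stable equilibrium point when every unit's value already agrees with the thresholded net input it receives, so no asynchronous update can change it. For small networks, all 2^n states can simply be enumerated.

```python
# Illustrative sketch: exhaustive stability analysis of a discrete Hopfield
# network. Convention assumed here: unit output is 1 if net input > 0,
# else 0 (W is the weight matrix, b the bias/external-input vector).
from itertools import product

def is_stable(v, W, b):
    """True if binary state v is a fixed point of the asynchronous update rule."""
    n = len(v)
    for i in range(n):
        net = sum(W[i][j] * v[j] for j in range(n)) + b[i]
        if (1 if net > 0 else 0) != v[i]:
            return False
    return True

def stable_points(W, b):
    """Enumerate all 2^n binary states; return the stable equilibrium points."""
    n = len(b)
    return [v for v in product((0, 1), repeat=n) if is_stable(v, W, b)]

# Tiny example: two mutually inhibitory units with positive bias form a
# winner-take-all pair; exactly the two one-hot states are stable.
W = [[0, -2], [-2, 0]]
b = [1, 1]
print(stable_points(W, b))  # prints [(0, 1), (1, 0)]
```

Comparing the enumerated stable set against the known solution set is precisely what reveals the non-solution stable equilibrium points discussed in this section; the enumeration cost is what rules this analysis out for the larger problem instances.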
To save computational effort, the tests were run only until the network converged to a solution; the simulation was then terminated and the number of relaxations required to reach the solution was recorded, rather than running the simulation for 100 relaxations. Additionally, not all test cases in the test matrices were implemented, since they would otherwise require a very large amount of computation. In all the tables that follow, "> 100" indicates that the network is likely to require more than 100 relaxations to locate a stable solution. The symbol "X" denotes that a graph with the corresponding

specifications could not be generated within 200 attempts and, thus, no test results are available. The blank boxes represent the cases where no testing was done. The performance analysis of the network configured to solve the 20-, 30-, 40-, and 50-vertex GPSP is presented in Table 3, Table 4, Table 5, and Table 6, respectively.

Table 3. Iterations for convergence to a solution for the 20-vertex GPSP (rows: connectivity levels 0.05, 0.10, 0.15, 0.20; columns: path lengths 5, 8, and 10).

Table 4. Iterations for convergence to a solution for the 30-vertex GPSP (rows: connectivity levels 0.05, 0.10, 0.30, 0.40, 0.50; columns: path lengths 3, 4, and 7).

Table 5. Iterations for convergence to a solution for the 40-vertex GPSP (connectivity levels 0.05 through 0.50; path lengths 3, 5, and 10).

Table 6. Iterations for convergence to a solution for the 50-vertex GPSP.

The simulation data indicate that the network performance degrades significantly as the problem size increases: the network only occasionally manages to locate a solution. Earlier studies on smaller graphs showed that the size of the stable equilibrium point set increased as the problem size increased, which helped to explain the significant drops in the convergence rate. The network attempted 20 relaxations for the 30-vertex GPSP with a connectivity level of 0.05 and a path length of 7 (Table 4) before locating a solution; in other words, it converged to 19 non-solution stable equilibrium points before it was able to converge to a solution. The same network required 73 relaxations for the 40-vertex GPSP with a connectivity of 0.10 and a path length of 5 (Table 5) to converge to a solution. The increase in the number of non-solution stable equilibrium points as the problem size grows is most likely the reason for the degradation of the network performance.
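The relaxation-counting protocol described above can be sketched as a simple driver loop. This is an assumed reconstruction, not the authors' code: `relax` (run one relaxation from a random initial state and return the stable point reached) and `is_solution` (check that point against the problem constraints) are hypothetical callbacks standing in for the network simulation.

```python
# Hypothetical sketch of the test protocol: repeat relaxations until a
# solution equilibrium point is reached, count them, and cap at 100
# (reported as "> 100" in the tables above).
import random

def relaxations_to_solution(relax, is_solution, limit=100, seed=None):
    """Count relaxations until `relax` yields a solution, or None past `limit`."""
    rng = random.Random(seed)
    for count in range(1, limit + 1):
        state = relax(rng)           # one full relaxation of the network
        if is_solution(state):
            return count             # converged to a solution equilibrium point
    return None                      # corresponds to a "> 100" table entry
```

A returned count of 20, for example, means the network first converged to 19 non-solution stable equilibrium points, matching the 30-vertex case discussed above.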

4. Conclusions

The simulation results show that the network algorithm does not scale well with increases in the size of the problem. The mathematical stability analysis of equilibrium points in the state space of the problem indicates that the stable point set includes all the solutions as well as some non-solution points. The same analysis also shows that the number of non-solution stable equilibrium points becomes much larger than the number of solutions in the stable point set as the size of the problem increases. Therefore, the network becomes more likely to converge to non-solution stable equilibrium points as the problem size increases. Even though many non-solution equilibrium points become stable under the bounds on the constraint weight parameters that make the solutions stable, the size of the stable point set is only a very small fraction of the size of the equilibrium point set. This indicates that the established bounds on the constraint weight parameters confine the search effort to a very small subspace of the problem state space. The topology of the network and the second-order Lyapunov function in its generic form, as defined by Hopfield, are not adequate to guide the network to a solution during each relaxation.

There are a number of areas which might be studied further to improve the existing Hopfield network algorithm. The problem representation and the definition of the problem-specific Lyapunov (energy) function could be studied further. The idea behind a new and better representation for the problem is to discover a form for the Lyapunov function which will result in the two sets of interest being equal, namely the stable point set and the solution set. The results of this work confirmed once more that a quick, average-quality solution is the promise of the Hopfield network. Global search algorithms like simulated annealing and genetic programming are shown to be better for problems where near-optimal solutions are needed [22].
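For reference, the second-order Lyapunov (energy) function in Hopfield's generic form, for binary unit outputs $v_i$, symmetric weights $w_{ij}$, and external inputs $I_i$, is:

```latex
E = -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\, v_i v_j \;-\; \sum_{i=1}^{n} I_i\, v_i
```

Network relaxation monotonically decreases $E$ (for symmetric $w_{ij}$ with zero diagonal), so every stable equilibrium point is a local minimum of $E$; a problem representation in which the local minima of $E$ are exactly the solution states is the open issue identified above.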
There are also efforts in the literature to combine the Hopfield network paradigm, which is a local search algorithm, with the genetic programming paradigm, which is a global search algorithm, in order to benefit from the desirable features of both paradigms [20].


Acknowledgements

We wish to express our appreciation to the anonymous referees of this paper for their valuable feedback.

References

[1] S. Abe, J. Kawahama and K. Hirasawa, Solving inequality constrained combinatorial optimization problems by the Hopfield neural networks, Neural Networks 5 (1992) 663-670.

[2] S.V.B. Aiyer, M. Niranjan and F. Fallside, A theoretical investigation into the performance of the Hopfield model, IEEE Transactions on Neural Networks 2 (1990) 204-215.

[3] M.K.M. Ali and F. Kamoun, Neural networks for shortest path computation and routing in computer networks, IEEE Transactions on Neural Networks 4 (1993) 941-954.

[4] A. Bouzerdoum and T. Pattison, Neural network for quadratic optimization with bound constraints, IEEE Transactions on Neural Networks 4 (1993) 293-303.

[5] S. Cavalieri et al., Optimal path determination in a graph by Hopfield neural network, Neural Networks 7(2) (1994) 397-404.

[6] C. Chiu, Y. Maa and M.A. Shanblatt, Energy function analysis of dynamic programming neural networks, IEEE Transactions on Neural Networks 4 (1991) 418-426.

[7] B.J. Hellstrom and L.N. Kanal, Knapsack packing networks, IEEE Transactions on Neural Networks 3 (1992) 302-307.

[8] J.J. Hopfield and D.W. Tank, Computing with neural networks: A model, Science 233 (1986) 625-632.

[9] J.J. Hopfield and D.W. Tank, Neural computations of decisions in optimization problems, Biological Cybernetics 52 (1985) 141-152.

[10] J.J. Hopfield, Neurons with graded response have collective computational properties like those of two-state neurons, Proceedings of the National Academy of Sciences (U.S.A.) 81 (1984) 3088-3092.

[11] J.J. Hopfield, Neural networks and physical systems with emergent collective computational properties, Proceedings of the National Academy of Sciences (U.S.A.) 79 (1982) 2554-2558.

[12] Y. Kobuchi, State evaluation functions and Lyapunov functions for neural networks, Neural Networks 4 (1991) 505-510.

[13] B.W. Lee and B.J. Sheu, Modified Hopfield neural networks for retrieving the optimal solution, IEEE Transactions on Neural Networks 1 (1991) 137-142.

[14] W.E. Lillo, M.H. Loh, S. Hui and S.H. Zak, On solving constrained optimization problems with neural networks: A penalty method approach, IEEE Transactions on Neural Networks 4 (1993) 931-939.

[15] W. Lin, J.G. Delgado-Frias, G.G. Pechanek and S. Vassiliadis, Impact of energy function on a neural network for optimization problems, IEEE World Congress on Computational Intelligence (1994) 4518-4523.

[16] F-L. Luo and Y-D. Li, A theorem concerning the energy function of Hopfield continuous-variable neural networks, International Journal of Electronics 76(3) (1994) 443-446.

[17] G. Serpen and D.L. Livingston, Analysis of the relationship between weight parameters and stability of solutions in Hopfield networks from dynamic systems viewpoint, Neural, Parallel and Scientific Computation 2 (1994) 361-372.

[18] G. Serpen and D.L. Livingston, Bounds on the weight parameters of Hopfield networks for the stability of solutions, Submitted to a journal.

[19] G. Serpen, Bounds on constraint weight parameters of Hopfield networks for stability of optimization problem solutions, Ph.D. Dissertation, Old Dominion University, Norfolk, VA, 1992.

[20] H. Shirai et al., A solution of combinatorial optimization problem by uniting genetic algorithms with Hopfield's model, IEEE World Congress on Computational Intelligence (1994) 4704-4709.

[21] Y. Shrivastava, S. Dasgupta and S.M. Reddy, Guaranteed convergence in a class of Hopfield networks, IEEE Transactions on Neural Networks 3 (1992) 951-961.


[22] M.A. Styblinski and T.S. Tang, Experiments in nonconvex optimization: Stochastic approximation with function smoothing and simulated annealing, Neural Networks 3 (1990) 467-483.

[23] K.T. Sun and H.C. Fu, A hybrid neural network model for solving optimization problems, IEEE Transactions on Computers 42 (1993) 218-227.

[24] M. Vidyasagar, Location and stability of the high-gain equilibria of nonlinear neural networks, IEEE Transactions on Neural Networks 4 (1993) 660-671.

Gursel Serpen is currently an Assistant Professor with the Electrical Engineering and Computer Science Department of the University of Toledo. Dr. Serpen received his Ph.D. in Electrical Engineering from Old Dominion University, Norfolk, Virginia in 1992. He worked as an Applications Engineer and Senior Software Engineer for Integrated Systems, Inc., Santa Clara, California, before he joined the faculty of the University of Toledo in 1993. He is a member of the IEEE and the International Neural Networks Society. Dr. Serpen has been actively involved and has published in the area of neural networks. His current research interests include the theory of recurrent neural networks.

Azadeh Parvin received her D.Sc. degree in Structural Engineering from The George Washington University in 1992. She served as a research advisor for the SEAS Laboratory at The George Washington University. She worked as a project engineer for DMI Engineering, Inc., and Engineering Consulting Service, Ltd., both in Washington, D.C., before joining the Civil Engineering Department of the University of Toledo in January 1994 as an Assistant Professor. Her current research interests include static and dynamic analysis and design of structures and intelligent systems applications in structures. She is a member of the Society of Women Engineers, The Masonry Society, ACI and ASCE.