A neural network theory for constrained optimization


Neurocomputing 24 (1999) 117—161

Yoshikane Takahashi*
NTT Information and Communication Systems Laboratories, Yokosuka, Kanagawa 239-0847, Japan
Received 19 May 1997; accepted 2 September 1998

Abstract

A variety of real-world problems can be formulated as continuous optimization problems with constraint equalities. Such real-world problems include, for example, the traveling salesman problem, Dido's isoperimetric problem, Hitchcock's transportation problem, the network flow problem and the associative memory problem. Despite their significance, no robust solving method has yet been developed that works efficiently across a broad spectrum of optimization problems. Hopfield's recent neural network method for solving the traveling salesman problem is a potentially promising candidate because of the efficiency afforded by its parallel processing. His method, however, has certain drawbacks that must be removed before it can qualify as an efficient, robust solving method. These are: (a) locally minimum solutions instead of globally minimum ones; (b) possibly infeasible solutions; (c) heuristic choice of network parameters and an initial state; (d) quadratic objective functions instead of arbitrary nonlinear objective functions with arbitrary nonlinear equality constraints; and (e) an unorganized mathematical formulation of the network, hindering extension. This paper develops from the Hopfield method an efficient, robust network method for solving the continuous optimization problem with constraint equalities that resolves all of these drawbacks except (a), which has already been resolved by others. The development is mathematically rigorous and thus constitutes a solid foundation of a neural network theory for constrained optimization. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Neural networks; Constrained optimization; Dynamical systems; Transformation systems; Compatibility conditions; Traveling salesman problem; Hopfield's network

* Tel.: +81 468 59-2681; fax: +81 468 59-3428; e-mail: [email protected]
0925-2312/99/$ - see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0925-2312(98)00092-7


1. Introduction

Optimization problems, continuous or combinatorial, have been attracting many researchers in science and engineering fields from classic to modern times [6]. One of the reasons for this is that a variety of real-world problems, from our daily life to pure research, can be formulated as an optimization problem in a simple abstract form: minimize an objective function subject to variable constraints, equalities and/or inequalities. It is well known that there are many popular optimization problems in various fields, classic and modern, that have equality constraints only (refer to [6] for instance). In fact, let us name several examples: Dido's isoperimetric problem [4,42]; the traveling salesman problem (TSP) [31]; transportation-type problems such as Hitchcock's transportation problem and the assignment problem [20,30]; the network flow problem for linear graphs [3,29]; logic-related problems such as propositional logic satisfiability and nonmonotonic reasoning [38,39]; and the associative memory problem [8,22,23,28]. This paper thus considers equality-constrained optimization problems, continuous or combinatorial (often referred to simply as problems).

Needless to say, most of them originally arose in individual application fields and were studied separately (refer to [6,16], for instance, for conventional optimization methods). More powerful unified solving methods were then developed: the gradient method for the continuous problem [19,32], and linear/nonlinear programming [33] and simulated annealing [27] mainly for the combinatorial one. These powerful unified methods, we believe, have recently culminated in Hopfield's neural network method [5,23,24,26]. It is innovative across various scientific and engineering fields, for the following reasons. First of all, it is capable of solving both continuous and combinatorial problems, though some conventional methods are as well. Second, it is a parallel-processing version of the gradient method and thus can be more powerful than most previous methods. Third, Hopfield's network was furthermore implemented with a simple electronic circuit, which solves the TSP quite efficiently, within a very short time. Hopfield's method thus explored an analog VLSI approach to the optimization problem [36]. This paper thus aims to develop his method into a more general problem-solving method beyond the TSP.

It is well known, however, that his method has been subjected to intense criticism. Its major drawbacks were revealed (refer to [49] for instance): (a) locally minimum solutions instead of globally minimum ones; (b) possibly infeasible solutions; (c) heuristic choice of network parameters and an initial state; (d) quadratic objective functions instead of arbitrary nonlinear objective functions with arbitrary nonlinear equality constraints; and (e) an unorganized mathematical formulation of the network, hindering extension. Fortunately, enthusiastic studies on their removal over more than a decade have produced a variety of improvements. Let us then describe our approach to these drawbacks together with a brief survey of those studies that have had a significant influence on it.

The drawback (a) has been removed by previous work to the extent that it can produce reasonably approximate solutions to many linear problems in a


sufficiently efficient manner [1]. It was also demonstrated that most of the methods developed could solve nonlinear optimization problems with no constraint [1,2,9-15,25,37,43,48,51]. This paper thus excludes (a) from its scope.

The drawback (b) was partly removed by Xu with the aid of the Lagrange method [50]; Reklaitis et al. [40] and Urahama [47] did related work as well. Their networks, unfortunately, could not always produce feasible solutions to the constrained optimization problem, even for the TSP. It is the work [46] by Takahashi that completely removed (b) for the first time. He developed techniques that fitted the TSP as a constrained optimization problem. That is, the minimization of the objective function and the satisfaction of the constraint equality are respectively mapped onto a unit dynamical system and a synapse dynamical system. Those techniques also enabled the constraint to be satisfied autonomously at any final state for any choice of network parameters and initial state. The drawback (c) was thus overcome as well. Takahashi [45] further extended the work [46] to the Cohen-Grossberg network [8], which could solve a certain larger class of problems than the TSP. Thus the drawback (d) was partly, but not yet fully, removed.

The complete removal of (d) needs to be considered in connection with the drawback (e). The removal of (d) and (e) requires two technical developments. One is a network general enough to solve problems that consist of an arbitrary nonlinear objective function and arbitrary nonlinear equality constraints. The other is a mapping method from the problem onto the network that can be an extension of the work [45]. Takahashi [44] developed, in a mathematically rigorous manner, a network for problem solving that serves our purpose well. Consequently, this paper develops, mathematically rigorously, a mapping method from the problem onto Takahashi's network that resolves all the Hopfield drawbacks (b)-(e) ((a) having already been resolved by others).

The mapping to be developed consists of a series of mappings from real-world problems onto the network (refer to Fig. 1). The optimization problem is considered in three versions: a real-world version (real-world problem), its formulated version (application problem) and its network-oriented version (network problem). In addition, the network is constructed in two stages. First, a dynamical system is constructed that represents the outward behavior of the network toward the problem. Second, a transformation system is constructed that represents the inside parallel processing for the problem. The dynamical system and the transformation system work in cooperation with each other. It is mathematically proved that the network so constructed can produce solutions to the problem.

This paper comprises three parts. Part I (Sections 2-4) specifies two problem versions (the application problem and the network problem) and demonstrates that there exists a specific constructible mapping from the first to the second. Part II (Sections 5 and 6) then constructs from the network problem a dynamical system that can mathematically produce solutions to the problem.


Fig. 1. Network construction from real-world problems.

Finally, Part III (Sections 7-9) embeds the dynamical system into a parallel-processing transformation system that can be substantiated with some engineering method. Furthermore, it demonstrates that the network really does solve the problem.

PART I: OPTIMIZATION PROBLEMS WITH CONSTRAINT EQUALITIES

This part (Sections 2-4) discusses the upper half of the diagram depicted in Fig. 1. First, Sections 2 and 3 specify real-world problems in two versions: the application problem and the network problem. Then Section 4 demonstrates that there exists a specific constructible mapping from the former onto the latter.


2. Application problems

2.1. Specification

Suppose that a vector variable $x \equiv (x_i)$ $(i = 1, \ldots, n)$ moves within a closed domain $\mathrm{Dom}(x)$ that lies in a real space $R^n$ or an integer space $Z^n$. Assume also that the functions $f(x)$ and $r_k(x)$ $(k = 1, \ldots, K^*)$ are all in $C^1$ (continuously differentiable) when $\mathrm{Dom}(x) \subset R^n$, and integer/real-valued when $\mathrm{Dom}(x) \subset Z^n$. Then we specify application problems as follows.

Definition 1. An application problem is to globally minimize or locally minimize an objective function $f(x)$ in $\mathrm{Dom}(x)$ subject to constraint equalities $r_k(x) = 0$ $(k = 1, \ldots, K^*$ with $K^* = 0$ or $K^* \geq 2)$. Here it is assumed that there exists at least one globally minimum or locally minimum point $x^M \in \mathrm{Dom}(x)$ of $f(x)$ such that

$r_k(x^M) = 0$ for all $k = 1, \ldots, K^*$.  (1)

Let us add some interpretation of this Definition 1.

1. The term "constraint equalities $r_k(x) = 0$ $(k = 1, \ldots, K^*$ with $K^* = 0)$" indicates "with no constraint equality". Thus the application problem includes, as a special case, problems with no constraint.

2. We have dropped the case $K^* = 1$ from Definition 1, even though we recognize that this case covers not a few real-world problems. This is entirely due to our network (dynamical system) construction technique to be developed later in Part II. It thus is left for a future study to incorporate the case $K^* = 1$ into Definition 1.

3. The restriction (1) is a necessary condition for the existence of a solution to the application problem.

4. Some local-minimization problems are placed as locally approximate versions of the global-minimization problem. Others, however, can represent real-world problems that cannot be represented by the global-minimization problem. This is why we have put the two in juxtaposition.

2.2. Examples

2.2.1. Examples of the global-minimization problem

The global-minimization problem of Definition 1 includes the following, for instance.

1. Dido's isoperimetric problem [4,42].
2. The TSP [24,31].
3. Transportation-type problems such as Hitchcock's transportation problem, the assignment problem, the maximum flow problem and the shortest path problem [20,30].
4. The network flow problem for linear graphs [3,29].


5. Logic-related problems such as propositional logic satisfiability, maximal consistent subsets, automatic deduction and nonmonotonic reasoning [38,39].

From among these we pick out the TSP for the specific expression in terms of $f(x)$ and $r_k(x) = 0$ of Definition 1. This is because it was used as the example of Hopfield's method [24] and thus is naturally employed as the main example of the present paper. Accordingly, we adapt his TSP expression in [24] for our purpose.

The TSP [31] is to find the shortest closed tour by which every city out of a set of $M$ cities is visited once and only once. Let us denote by $X$ and $Y$ representative cities and express by a real constant $d_{XY}$ the distance between two cities $X$ and $Y$. In addition, let us represent any tour by an $M \times M$ matrix $x \equiv (x_{Xm})$, $x_{Xm} \in \{0,1\}$ $(X, m = 1, \ldots, M)$. Here, row $X$ and column $m$ of $x$ correspond to city $X$ and position $m$ in a tour, respectively; and besides, $x_{Xm} = 1$ ($x_{Xm} = 0$) represents that city $X$ is (not) visited at the $m$th position. Then the TSP is expressed as follows.

Definition 2. An objective function $f(x)$ and constraint equalities $r_k(x) = 0$ $(k = 1, 2, 3)$ of the TSP are specified as follows:

$f(x) = \sum_X \sum_{Y \neq X} \sum_m d_{XY}\, x_{Xm}(x_{Y,m+1} + x_{Y,m-1})$,  (2)

$r_1(x) = \sum_X \sum_m \sum_{n \neq m} x_{Xm} x_{Xn} = 0$,  (3a)

$r_2(x) = \sum_m \sum_X \sum_{Y \neq X} x_{Xm} x_{Ym} = 0$,  (3b)

$r_3(x) = \left(\sum_X \sum_m x_{Xm} - M\right)^2 = 0$.  (3c)

The function $f(x)$ represents the total path length of a tour. On the other hand, the constraint equalities $r_k(x) = 0$ $(k = 1, 2, 3)$ represent a necessary and sufficient condition for feasible solutions to the TSP. Specifically, each $r_k(x) = 0$ $(k = 1, 2, 3)$ is interpreted as follows. The first constraint equality $r_1(x) = 0$ holds if and only if each city row $X$ of $x$ contains no more than one "1" (the rest of the entries being zero). The second constraint equality $r_2(x) = 0$ holds if and only if each "position in tour" column of $x$ contains no more than one "1" (the rest of the entries being zero). The third constraint equality $r_3(x) = 0$ holds if and only if there are $M$ entries of "1" in the entire matrix $x$.





2.2.2. Examples of the local-minimization problem

There are many real-world optimization problems in computer engineering that can be expressed as the local-minimization problem. These include, for instance, the associative memory problem, generalization, familiarity recognition, categorization, error correction and time-sequence retention [8,22,23,28]. Among them, the associative memory problem is extremely significant since all the other problems above can be considered slightly different versions of it. We thus focus


on its specific expression in terms of $f(x)$ and $r_k(x) = 0$ of Definition 1. The specification is quoted from those by Cohen and Grossberg [8] and Hopfield [23], which are inherently involved with the network.

Associative memory is a kind of content-addressable memory that stores and recalls data representing objects, concepts and so forth. The problem is to design a network that substantiates such a memory for any given data set. In the work [8,23], a network is required to be designed such that each individual piece of data can be placed at an asymptotically stable point of the network, where the network is viewed mostly as a dynamical system. This network memory can recall, as its stable state, the stored data that is best associated with an input piece of data.

The time behavior of the network, viewed as a dynamical system, is summarized by its Lyapunov function [19,32,34]. In [8,23], certain Lyapunov functions were found such that any asymptotically stable point of the network could be some locally minimum point of the Lyapunov function, and vice versa. Let us briefly look at their Lyapunov functions specifically.

Hopfield [23] implemented his network with an electronic circuit that has the following energy function $E(x)$, a typical Lyapunov function:

$E(x) \equiv -(1/2) \sum_i \sum_{j \neq i} T_{ij} x_i x_j + \sum_i (1/R_i) \int_0^{x_i} h_i^{-1}(x)\, dx$.  (4)

Here, a real variable $x_i$ indicates the activation level of a neuron, while the constants $T_{ij}$ and $R_i$ indicate a synapse weight and a circuit resistance, respectively; besides, the function $h_i(\cdot)$ is the sigmoidal input-output transformation function at a neuron.

Then Cohen and Grossberg [8] mathematically developed a network memory that is more general than Hopfield's. Its Lyapunov function $L(x)$ is expressed as follows:

$L(x) = -2 \sum_i \int_0^{x_i} b_i(x_i)\{dc_i(x_i)/dx_i\}\, dx_i + \sum_j \sum_k T_{jk} c_j(x_j) c_k(x_k) + \sum_i a_i(x)$.  (5)

Here, the notation $x_i$, $x$ and $T_{ij}$ is used similarly; in addition, $a_i(\cdot) \in C^1$, $b_i(\cdot) \in C^1$ and $c_i(\cdot) \in C^1$, with certain additional restrictive conditions on $a_i(\cdot)$ and $b_i(\cdot)$.

Therefore the associative memory problem is formulated as a problem to construct a network (a dynamical system in this case) such that its Lyapunov function coincides with a given function $f(x)$. Furthermore, there is no constraint condition $r_k(x) = 0$ imposed on the associative memory problem in [8,23]. In consequence, it has turned out that the associative memory problem is expressed as a local-minimization problem with no constraint of Definition 1.

3. Network problems

This section first characterizes the locally minimum solutions to the application problem of Definition 1 that can usually be produced with most networks. It then


derives from the application problem the specification of network problems that are more suitable to be solved with networks.

3.1. Characterization of locally minimum solutions to the application problem

Most networks, including Hopfield's, usually produce locally minimum solutions to the problem. Thus, this paper necessarily and naturally employs the locally minimum type of solution. Let us then characterize what fields of real-world problems this solution type is applicable to.

In light of solution precision, real-world problems can be categorized into two classes, i.e., a strictly minimum class and an approximately minimum class. The locally minimum type of solution is usually applicable to the latter class, in particular to global-minimization problems. Hence this paper presupposes the latter class of problems. Let us then look into this class of problems specifically. Consider, for instance, real-time problems such as biological and robotics tasks of perception and pattern recognition. For these problems, solutions are usually produced with practical engineering methods. In addition, they typically have an immense number of variables, and the task of searching for the mathematical optimum of the criterion can often be of considerable combinatorial difficulty, and hence time consuming. Thus, for them, a "very good" or "good" solution computed on a time scale short enough that the solution can be used in the choice of appropriate action is more important than a nominally better "minimum" solution. Consequently, they exemplify well the approximately minimum class of problems.

3.2. Specification

We reformulate the application problem of Definition 1 into problems, named network problems, in the approximately minimum class whose solutions can be more naturally identified with the locally minimum solutions produced with networks.

First of all, let $V \equiv (V_i)$ and $T \equiv (T_{ij})$ denote a vector variable and a matrix variable that move within closed domains $\mathrm{Dom}(V) \subset R^N$ and $\mathrm{Dom}(T) \subset R^{N \times N}$, respectively $(i, j = 1, \ldots, N)$. These continuous variables $V$ and $T$ are intended to correspond to neuron activation levels and synapse weights of networks. Let us also denote by $F(V)$ and $R_k(V, T)$ functions on $\mathrm{Dom}(V)$ and $\mathrm{Dom}(V, T) \equiv \mathrm{Dom}(V) \times \mathrm{Dom}(T)$; besides, assume that $F(V) \in C^1$ and $R_k(V, T) \in C^1$ $(k = 1, \ldots, K)$. Then we specify network problems as follows.

Definition 3. A network problem is to locally minimize an objective function $F(V)$ subject to constraint equalities $R_k(V, T) = 0$ $(k = 1, \ldots, K$ with $K = 0$ or $K \geq 1)$. Here it is assumed that there exists at least one locally minimum point $V^M \in \mathrm{Dom}(V)$ of $F(V)$ such that

$\exists T^M \in \mathrm{Dom}(T)$: $R_k(V^M, T^M) = 0$ for all $k = 1, \ldots, K$.  (6)


It is moreover assumed that each constraint function $R_k(V, T)$ satisfies the following three conditions.

(a) A symmetric condition for $T$:

$R_k(V, \ldots, T_{ij}, \ldots, T_{ji}, \ldots) = R_k(V, \ldots, T_{ji}, \ldots, T_{ij}, \ldots)$ for all pairs $(i, j)$ $(i < j$; $i, j = 1, \ldots, N)$,  (7a)

$R_k(V, \ldots, T_{ii}, \ldots) = R_k(V, \ldots, 0, \ldots)$, i.e., $T_{ii} = 0$ for all $i = 1, \ldots, N$.  (7b)

(b) A non-negative condition for $(V, T)$:

$R_k(V, T) \geq 0$ for all $(V, T) \in \mathrm{Dom}(V, T)$.  (8)

(c) A two-variable condition for $T$: each constraint function $R_k(V, T)$ includes at least two different variables $T_{i_1(k) j_1(k)}$ and $T_{i_2(k) j_2(k)}$ $(i_1(k) < j_1(k)$, $i_2(k) < j_2(k))$ from $T = (T_{ij})$ that satisfy the following condition:

$\{T_{i_1(k') j_1(k')}, T_{i_2(k') j_2(k')}\} \cap \{T_{i_1(k) j_1(k)}, T_{i_2(k) j_2(k)}\} = \emptyset$ for any pair $(k, k')$ with $k \neq k'$ $(k, k' = 1, \ldots, K)$.  (9)

Let us make some remarks on the network problem of Definition 3 in connection with the application problem of Definition 1.

1. There can exist the case $K = 1$ for the network problem; this is different from the application problem, where the case $K^* = 1$ has been dropped. The reason is that we can construct such a network problem from the application problem, as shown later in Theorem 1.

2. The condition $K^* \geq 2$ for the application problem is caused by condition (9) on $R_k(V, T)$ for the network problem. Condition (9) in turn is essentially required for our network (dynamical system) construction technique to be developed later in Part II.

3. Conditions (7) and (8) on $R_k(V, T)$ for the network problem originally come from the electronic circuit that Hopfield's network was implemented with. They are considered natural conditions for engineering devices in general. We thus have put them on the network problem so that solving networks can be implemented naturally and easily.

4. The network problem can be combinatorial as well as continuous, though Definition 3 apparently allows continuous problems only. This is because any integer condition $V_i = m_{il}$ $(l = 1, \ldots, L_i$; $i = 1, \ldots, N)$ can be substantiated as an additional constraint equality, say $R_0(V, T) = 0$, where $R_0(V, T)$ is defined as follows:

$R_0(V, T) \equiv \sum_i (1/2)(T^*_{i,i+1} + T^*_{i+1,i}) R_{0i}(V)$ $(i = 1, \ldots, N-1)$,  (10a)

$T^*_{i,i+1} \equiv 1 + \exp(T_{i,i+1}) > 0$,  (10b)

$T_{i,i+1} = T_{i+1,i}$ and $T_{ii} = 0$ for all $i = 1, \ldots, N$,  (10c)

where $R_{0i}(V) \equiv \prod_l (V_i - m_{il})^2$ $(l = 1, \ldots, L_i)$. Here, at least two elements of $(T_{i,i+1})$ must be selected as variables, while the rest can be chosen as variables or constants.
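A minimal illustration of remark 4 (my own sketch; the synapse factor $(1/2)(T^*_{i,i+1} + T^*_{i+1,i})$ of Eq. (10a) is dropped for brevity): the product $\prod_l (V_i - m_{il})^2$ is non-negative and vanishes exactly when $V_i$ sits on one of its allowed levels.

```python
def integer_condition(V, levels):
    """Sum of R_{0i}-style terms of Eq. (10): for each V_i, prod_l (V_i - m_il)^2.

    Each term is non-negative and is zero iff V_i equals one of its allowed
    levels m_il, so the sum vanishes exactly at integer-feasible points.
    """
    total = 0.0
    for Vi, m_i in zip(V, levels):
        term = 1.0
        for m_il in m_i:
            term *= (Vi - m_il) ** 2
        total += term
    return total

print(integer_condition([0.0, 1.0], [[0, 1], [0, 1]]))   # 0.0: both coordinates on allowed levels
print(integer_condition([0.3, 1.0], [[0, 1], [0, 1]]))   # > 0: first coordinate is off-level
```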


Furthermore, one can even deal with the 0-1 condition $V_i = 0$ or $V_i = 1$ in an engineering manner instead of through $R_0(V, T) = 0$. That condition can be approximately implemented with networks, including Hopfield's, where the input-output transformation at neuron $i$ is subject to a sigmoidal function expressed by

$V_i = h_i(u_i) \equiv (1/2)\{1 + \tanh(u_i/u_0)\}$,  (11a)

$u_i \equiv \sum_j T_{ij} V_j - \theta_i$.  (11b)

Here, $u_i$ and $V_i$ indicate the input to and the output from neuron $i$; besides, $u_0$ and $\theta_i$ are real constants. It is then easily seen from Eq. (11) that $V_i$ takes on almost exactly 0 or 1 when $u_0$ is taken extremely small. Small values of $u_0$ can, for instance, be implemented with a high amplifier gain in Hopfield's electronic circuit.
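A quick numerical illustration of Eq. (11a) (not in the original): as the gain constant $u_0$ shrinks, the sigmoid output is pushed to the binary values 0 and 1.

```python
import numpy as np

def h(u, u0):
    """Sigmoidal transformation of Eq. (11a): V = 0.5 * (1 + tanh(u / u0))."""
    return 0.5 * (1.0 + np.tanh(u / u0))

u = np.array([-0.5, -0.1, 0.1, 0.5])
for u0 in (1.0, 0.1, 0.01):
    print(u0, np.round(h(u, u0), 3))
# For u0 = 0.01 the outputs are already 0.000 or 1.000 to three decimals,
# which is how a high-gain amplifier approximates the 0-1 condition.
```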

5. In general, solutions to the network problem are far fewer than all the locally minimum points of $F(V)$. This is because a solution must be a locally minimum point that satisfies all the constraint equalities $R_k(V, T) = 0$ $(k = 1, \ldots, K)$, as specified by Eq. (6). Hence, it is likely for a solution to be a globally minimum point of $F(V)$; the more so as the set of constraint equalities $R_k(V, T) = 0$ increases in cardinality and/or complexity.

4. Mapping from the application problem onto the network problem

This section first discusses mapping methods from the application problem onto the network problem, distinguishing two types of mapping: centralized and distributed. It then constructs a specific centralized mapping and therewith demonstrates that any application problem can be mapped onto some network problem. It also constructs a special distributed mapping particular to the TSP.

4.1. Mapping methods

There are many conceivable mapping methods from the application problem onto the network one. Any such mapping determines how the objective function $f(x)$ and the constraint functions $r_k(x)$ of the application problem are mapped onto the objective function $F(V)$ and the constraint functions $R_k(V, T)$ of the network problem [45]. Needless to say, we here discuss only proper constraints $R_k(V, T)$, excluding the integer constraint $R_0(V, T)$ of Eq. (10).

On the one hand, it is quite natural that $f(x)$ is mapped straightforwardly onto $F(V)$. That is, $V \equiv x$, $N \equiv n$ and $F(V) \equiv f(x)$. We thus adopt this mapping with respect to $f(x)$. From the TSP objective function $f(x)$ of Eq. (2), for instance, we construct a TSP objective function $F(V)$ as follows:

$F(V) = \sum_X \sum_{Y \neq X} \sum_m d_{XY}\, V_{Xm}(V_{Y,m+1} + V_{Y,m-1})$.  (12)


On the other hand, there are many constructible methods that map $r_k(x)$ onto $R_k(V, T)$. We categorize these into two contrastive types of mapping, i.e., centralized and distributed. Each type is specified as follows.

4.1.1. A centralized type of mapping

A centralized mapping is specified as one that maps all the constraint functions $r_k(x)$ $(k = 1, \ldots, K^*$ with $K^* \geq 2)$ onto a single constraint function, say $R_1(V, T)$. In addition, $R_1(V, T)$ includes the minimum number, namely $K^*$, of variables $T_{ij}$. Here the number of variables $T_{ij}$ $(i, j = 1, \ldots, n)$ of $R_k(V, T)$ can vary from $K^*$ up to $(1/2)n(n-1)$ because of the symmetric condition (7). This type of mapping is the simplest and the easiest to construct.

4.1.2. A distributed type of mapping

A distributed mapping is specified as one that maps the constraint functions $r_k(x)$ $(k = 1, \ldots, K^*$ with $K^* \geq 2)$ onto more than one constraint function $R_k(V, T)$ $(k = 1, \ldots, K)$ including more than $K^*$ variables $T_{ij}$.

Compared to the centralized type, the distributed type can be substantiated with networks that are more oriented to autonomous, balanced parallel processing. That processing is just one of the most distinguished features of the network. Thus, the more distributed the constraint functions $R_k(V, T)$ are constructed, the more desirable the networks that can be substantiated.

Distributed mappings can in general be characterized by the number of variables $T_{ij}$ $(i, j = 1, \ldots, N)$ in addition to the number of constraint functions $R_k(V, T)$ $(k = 1, \ldots, K)$. Moreover, special distributed mappings particular to the TSP can display the full variety of distributiveness, including the most distributed one. Hence we will take up later, in Section 4.3, this most distributed mapping as a distinctly contrastive example to the centralized one.

4.2. Construction of specific centralized mappings

4.2.1. General construction

We demonstrate in Theorem 1 that any application problem of Definition 1 can be mapped onto some network problem of Definition 3. In addition, we construct a specific centralized mapping in the proof.

Theorem 1. (A) From any given global-minimization problem, there always exists some constructible network problem such that its solutions produce locally minimum approximate solutions to the given one. (B) From any given local-minimization problem, there always exists some constructible network problem such that its solutions produce exact solutions to the given one.

Proof. Let us construct a specific centralized mapping from the application problem onto the network problem. It is then sufficient to construct a constraint equality $R_1(V, T) = 0$ satisfying conditions (7)-(9) onto which all the constraint equalities $r_k(x) = 0$ $(k = 1, \ldots, K^*$ with $K^* \geq 2)$ are mapped.


First of all, we construct constituent functions $R_k(V)$ $(k = 1, \ldots, K^*)$ as follows:

$R_k(V) \equiv r_k(x)$ if $r_k(x) \geq 0$ for all $x \in \mathrm{Dom}(x)$,
$R_k(V) \equiv \{r_k(x)\}^2$ if $\exists x \in \mathrm{Dom}(x)$: $r_k(x) < 0$.  (13)

We then construct a constraint function $R_1(V, T)$ as follows:

$R_1(V, T) \equiv \sum_k (1/2)(T^*_{k,k+1} + T^*_{k+1,k}) R_k(V)$ $(k = 1, \ldots, K^*$ with $K^* \geq 2)$,  (14a)

$T^*_{k,k+1} \equiv 1 + \exp(T_{k,k+1}) > 0$, $T_{k,k+1} = T_{k+1,k}$ for all $k = 1, \ldots, K^*$.  (14b)

Here, at least two elements of $(T_{k,k+1})$ must be selected as variables, while the rest of $(T_{k,k+1})$ can be chosen as variables or constants. The rest of the variables $T_{ij}$ $(i, j = 1, \ldots, N)$ not used in Eq. (14) are all set to zero. For the sake of simplicity, we have assumed here that $K^* + 2 \leq N$; if not, we have only to add appropriate variables $T_{ij}$.

It is easily seen from Eqs. (13) and (14) that all the constraint equalities $r_k(x) = 0$ are mathematically equivalent to $R_1(V, T) = 0$. Furthermore, $R_1(V, T)$ also satisfies the conditions (7)-(9). Moreover, it follows from $T_{k,k+1} = T_{k+1,k}$ in Eq. (14b) that $R_1(V, T)$ only includes the $K^*$ variables $T_{k,k+1}$ from $(T_{ij})$, where $K^*$ indicates the minimum number of variables necessary. This completes the proof. □
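To illustrate the centralized mapping just constructed (a sketch under my own naming, not the paper's code): Eq. (13) squares any constraint that can go negative, and Eq. (14a) attaches a strictly positive factor $T^*_k = 1 + \exp(T_k)$ to each constituent, so the single equality $R_1(V, T) = 0$ holds exactly when every original constraint holds.

```python
import math

def constituent(r_k, can_be_negative):
    """R_k(V) of Eq. (13): r_k itself if it is non-negative on Dom(x), otherwise its square."""
    return (lambda x: r_k(x) ** 2) if can_be_negative else r_k

def centralized_constraint(R_list):
    """R_1(V, T) in the spirit of Eq. (14a): sum_k T*_k R_k(V) with T*_k = 1 + exp(T_k) > 0.

    Under the symmetry T_{k,k+1} = T_{k+1,k} the factor (1/2)(T*_{k,k+1} + T*_{k+1,k})
    collapses to a single T*_k, which is what is used here.  Since every T*_k is strictly
    positive and every R_k is non-negative, R_1 = 0 iff all R_k = 0.
    """
    def R1(x, T):
        return sum((1.0 + math.exp(T[k])) * R_k(x) for k, R_k in enumerate(R_list))
    return R1

# Two toy constraints on x = (x0, x1): x0 + x1 - 1 = 0 (can be negative) and x0*x1 = 0 (non-negative on [0,1]^2).
R = [constituent(lambda x: x[0] + x[1] - 1.0, True),
     constituent(lambda x: x[0] * x[1], False)]
R1 = centralized_constraint(R)
print(R1([1.0, 0.0], T=[0.3, -0.2]))   # 0.0: both original constraints hold
print(R1([0.5, 0.0], T=[0.3, -0.2]))   # > 0: the first constraint is violated
```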



4.2.2. Special construction for the TSP

Hopfield [24] constructed a network problem for the TSP of Definition 2. That problem can be characterized as a special version of the constraint function $R_1(V, T)$ of Eqs. (13) and (14) that is particular to the TSP. Specifically, it is defined as follows.

Definition 4. A centralized network TSP consists of the objective function $F(V)$ of Eq. (12) and a constraint equality $R(V, T) = 0$, where $R(V, T)$ is defined as follows:

$R(V, T) = T^*_1 \sum_X \sum_m \sum_{n \neq m} V_{Xm} V_{Xn} + T^*_2 \sum_m \sum_X \sum_{Y \neq X} V_{Xm} V_{Ym} + T^*_3 \left(\sum_X \sum_m V_{Xm} - M\right)^2 = 0$ $(X, Y, m, n = 1, \ldots, M)$,  (15a)

$T^*_k = 1 + \exp(T_{k,k+1}) > 0$, $T_{k,k+1} = T_{k+1,k}$ for all $k = 1, 2, 3$.  (15b)

Here, at least two of the three $T_{12}$, $T_{23}$ and $T_{34}$ must be selected as variables, while each of the rest can be chosen as either a variable or a constant.


The constraint function $R(V, T)$ of Eq. (15) includes as a special version Hopfield's function $R(V)$ defined as follows:

$R(V) = A^* \sum_X \sum_m \sum_{n \neq m} V_{Xm} V_{Xn} + B^* \sum_m \sum_X \sum_{Y \neq X} V_{Xm} V_{Ym} + C^* \left(\sum_X \sum_m V_{Xm} - M\right)^2 = 0$, $A^* > 0$, $B^* > 0$ and $C^* > 0$ $(X, Y, m, n = 1, \ldots, M)$.  (16)

Here $A^*$, $B^*$ and $C^*$ indicate positive real constants. □

One can easily see that the constraint function $R(V, T)$ of Eq. (15) satisfies conditions (7)-(9) in Definition 3.

4.3. Construction of the most distributed mapping for the TSP

We construct the most distributed constraint functions $R_k(V, T)$ for the TSP, which reveal themselves in marked contrast with the centralized network TSP of Definition 4. For this purpose, it is quite essential to construct from the TSP of Definition 2 a variable $T \equiv (T_{Xm,Yn})$ $(X, Y, m, n = 1, \ldots, M)$ of constraint functions $R_k(V, T)$. Thus, let us first introduce $2M(M-1)+1$ real variables denoted by $A_{Xm,Xn}$ $(X, m, n = 1, \ldots, M$; $n \neq m)$, $B_{Xm,Ym}$ $(X, Y, m = 1, \ldots, M$; $Y \neq X)$, and $C$, such that $A_{Xm,Xn}$ and $B_{Xm,Ym}$ satisfy the following symmetric conditions:

$A_{Xm,Xn} = A_{Xn,Xm}$ for all $X, m, n = 1, \ldots, M$ $(n \neq m)$,  (17a)

$B_{Xm,Ym} = B_{Ym,Xm}$ for all $X, Y, m = 1, \ldots, M$ $(Y \neq X)$.  (17b)

For notational simplicity, we next write $Z_k \equiv A_{Xm,Xn}$ for $k = 1, \ldots, M(M-1)$; $Z_k \equiv B_{Xm,Ym}$ for $k = M(M-1)+1, \ldots, 2M(M-1)$; and $Z_k \equiv C$ for $k = 2M(M-1)+1$. In addition, we put

$Z^*_k \equiv 1 + \exp(Z_k) > 0$ $(k = 1, \ldots, 2M(M-1)+1)$.  (18)

Then we proceed to express $T = (T_{Xm,Yn})$ in terms of $Z^*_k$. That is, let us specify the real variables $T_{Xm,Yn}$ as follows:

$T_{Xm,Yn} \equiv -A^*_{Xm,Xn}\,\delta_{XY}(1-\delta_{mn}) - B^*\,\delta_{mn}(1-\delta_{XY}) - C^* - D^*\,d_{XY}(1-\delta_{XY})(\delta_{n,m+1}+\delta_{n,m-1})$ when $Y = X$ and $n \neq m$,  (19a)

$T_{Xm,Yn} \equiv -A^*\,\delta_{XY}(1-\delta_{mn}) - B^*_{Xm,Ym}\,\delta_{mn}(1-\delta_{XY}) - C^* - D^*\,d_{XY}(1-\delta_{XY})(\delta_{n,m+1}+\delta_{n,m-1})$ when $Y \neq X$ and $n = m$,  (19b)


$T_{Xm,Yn} \equiv -A^*\,\delta_{XY}(1-\delta_{mn}) - B^*\,\delta_{mn}(1-\delta_{XY}) - C^* - D^*\,d_{XY}(1-\delta_{XY})(\delta_{n,m+1}+\delta_{n,m-1})$ when $Y \neq X$ and $n \neq m$,  (19c)

$T_{Xm,Xm} \equiv 0$.  (19d)

Here, $A^*$, $B^*$ and $D^*$ denote any positive real constants.

The transformation (18), (19) provides an expression of $T = (T_{Xm,Yn})$ by $Z \equiv (Z_k)$. This allows us to consider the variables $Z_k$ substantial and independent, while the variables $T_{Xm,Yn}$ are apparent and dependent. Hence we employ the variable $Z$ instead of $T$, and thus $R_k(V, Z)$ instead of $R_k(V, T)$. Now, let us construct the most distributed network TSP as follows.

Definition 5. A distributed network TSP consists of the objective function $F(V)$ of Eq. (12) and $M(M-1)$ constraint equalities $R_k(V, Z) = 0$ $(k = 1, \ldots, M(M-1))$ defined as follows:

$R_{k_0}(V, Z) \equiv R_{1 k_0}(V, Z) + R_{2, M(M-1)+k_0}(V, Z) + R_3(V, Z) = 0$ $(1 \leq k_0 \leq M(M-1))$,  (20a)

$R_k(V, Z) \equiv R_{1 k}(V, Z) + R_{2, M(M-1)+k}(V, Z) = 0$ for all $k = 1, \ldots, M(M-1)$ $(k \neq k_0)$,  (20b)

where each constraint function of Eq. (20) is defined by

$R_{1 j}(V, Z) \equiv A^*_j V_{Xm} V_{Xn}$ for all $j = 1, \ldots, M(M-1)$,  (21a)

$R_{2 j}(V, Z) \equiv B^*_j V_{Xm} V_{Ym}$ for all $j = M(M-1)+1, \ldots, 2M(M-1)$,  (21b)

$R_3(V, Z) \equiv C^* \left(\sum_X \sum_m V_{Xm} - M\right)^2$.  (21c)

The constraint equalities (20) and (21) are the farthest decomposition of the TSP constraint equalities (3). They are thus the most distributed. That is, the pair formed by the number of variables $Z_k$ $(k = 1, \ldots, 2M(M-1)+1)$ and the number of constraint equalities $R_k(V, Z) = 0$ $(k = 1, \ldots, M(M-1))$ attains the maximum among all particular network problems that are derivable from the TSP of Definition 2.

PART II: SOLVING THE NETWORK PROBLEM WITH A DYNAMICAL SYSTEM

This part (Sections 5 and 6) discusses the middle part of the diagram depicted in Fig. 1, i.e., the construction from the network problem of a dynamical system that solves it. First, Section 5 views the network as a dynamical system in light of its outward


behavior, and constructs a dynamical system from the network problem. Then Section 6 demonstrates that the dynamical system produces solutions to the network problem.

5. Construction of a dynamical system from the network problem

Golden [17] developed a unified framework for networks in which network outward behavior, viewed as a maximum a posteriori estimation algorithm, is represented by a dynamical system with Lyapunov functions. Takahashi's network [44] and problem-solving technique [45,46] were based on his framework. Hence this paper necessarily employs it. This section first extends his dynamical system model and then constructs a dynamical system from the network problem. The development in this section is thus a generalization of Takahashi's work [44,45].

5.1. Mathematical preparations

For the paper to be self-contained, we recall two classical theorems well known in differential equation and dynamical system theory [7,18,19,32,34].

5.1.1. Cauchy's solution existence theorem

Cauchy's solution existence theorem is stated as follows.

Cauchy's Solution Existence Theorem. Consider a canonical dynamical system expressed as follows:

$dV_i/dt = \varphi^*_i(t, V, T)$,  (22a)

$dT_{ij}/dt = \psi^*_{ij}(t, V, T)$.  (22b)

Here the functions $\varphi^*_i(t, V, T)$ and $\psi^*_{ij}(t, V, T)$ are bounded and in $C^0$ (continuous); besides, they satisfy a Lipschitz condition on a closed domain $[0, t^*] \times \mathrm{Dom}(V, T)$, where $t^*$ denotes any positive real number. Suppose also that $(V^0, T^0)$ is any point of $\mathrm{Dom}(V, T)$. Then there exists a unique trajectory $(V, T) \equiv (V(t), T(t))$ of the dynamical system (22) that starts from the initial point $(V(0), T(0)) = (V^0, T^0)$.

In practice, continuous differentiability ($\varphi^*_i(t, V, T), \psi^*_{ij}(t, V, T) \in C^1$) is usually used instead of the Lipschitz condition.

5.1.2. The Lyapunov stability theorem

First of all, the Lyapunov function is defined as follows.

Definition of the Lyapunov Function. Suppose that $(V^E, T^E)$ is an equilibrium point of the dynamical system (22). Furthermore, assume that a function $L(V, T)$ is in $C^0$ on some neighborhood $U$ of $(V^E, T^E)$ and differentiable on $U - (V^E, T^E)$. Then, $L(V, T)$ is


called a (local) Lyapunov function for the dynamical system and $(V^E, T^E)$ if it satisfies the following Lyapunov conditions:

(L1) $L(V, T) > L(V^E, T^E)$ for all $(V, T) \in U - (V^E, T^E)$.

(L2) $dL(V, T)/dt < 0$ along all the trajectories $(V, T)$ of Eq. (22) contained in $U - (V^E, T^E)$.

Then the Lyapunov stability theorem is stated as follows.

Lyapunov's Stability Theorem. Suppose that $(V^E, T^E)$ is any equilibrium point of the dynamical system (22). Assume also that it has some Lyapunov function for $(V^E, T^E)$. Then $(V^E, T^E)$ is asymptotically stable. That is, the trajectory $(V(t), T(t))$ converges to $(V^E, T^E)$.

5.2. Golden's dynamical system of the network

In Golden's framework, network behavior for problem solving is represented as a dynamical system in a unified manner as follows.

(1) A neuron dynamical system. Network behavior for problems such as the TSP, pattern matching and pattern classification is characterized by a neuron dynamical system with some Lyapunov function $G(V, T)$. Here the variable $V = V(t)$ and the constant $T$ indicate a neuron activation level and a synapse weight. The neuron dynamical system is expressed as follows:

$dV/dt = -\partial G(V, T)/\partial V$.  (23)

(2) A synapse dynamical system. Network behavior for problems such as learning is characterized by a synapse dynamical system with some Lyapunov function $H(V, T)$ [41]. Here the variable $(V, T) = (V(t), T(t))$ represents a pair of a neuron activation level and a synapse weight. The synapse dynamical system is expressed as follows:

$dT/dt = -\partial H(V, T)/\partial T$.  (24)

In the meantime, it is also well known that the membrane current of biological nerve fibers is precisely described by the Hodgkin-Huxley second-order partial differential equation system [21]. As pointed out by Golden, however, almost all current networks can be represented by the simpler first-order dynamical system (23) or (24). This paper thus extends the dynamical systems (23), (24) for solving the network problem.

5.3. Construction of a dynamical system from the network problem

Let us denote by $g(V, T)$ any $C^1$-function of the variable $(V, T) \in \mathrm{Dom}(V, T) \subset R^N \times R^{N \times N}$. Consider then dynamical systems that locally minimize $g(V, T)$ in $\mathrm{Dom}(V, T)$ without constraint.


First of all, classical optimization theory [19,32] instructs us that such a dynamical system can be constructed as follows:

$dV_i/dt = -\rho_i(V, T)\{\partial g(V, T)/\partial V_i\}$ $(i = 1, \ldots, N)$,  (25a)

$dT_{ij}/dt = -\sigma_{ij}(V, T)\{\partial g(V, T)/\partial T_{ij}\}$ $(i, j = 1, \ldots, N)$.  (25b)

Here the functions $\rho_i(V, T)$ and $\sigma_{ij}(V, T)$ are in $C^1$ and such that

$\rho_i(V, T) > 0$ for all $(V, T) \in \mathrm{Dom}(V, T)$,  (26a)

$\sigma_{ij}(V, T) > 0$ for all $(V, T) \in \mathrm{Dom}(V, T)$.  (26b)

Let us then extend this classical method to the network problem, which includes the constraint equalities $R_k(V, T) = 0$ $(k = 1, \ldots, K)$. We construct from it a dynamical system, which is specified as follows [44,45].

Definition 6. Assume that the network problem is given. From it, construct a Lyapunov function $L(V, T)$ as follows:

$L(V, T) \equiv F(V) + C \sum_k R_k(V, T)$, $C$: a positive real parameter.  (27)

Then a dynamical system is specified as a composite system consisting of a neuron dynamical system and a synapse dynamical system. Each constituent system is specified as follows.

(a) A neuron dynamical system. A neuron dynamical system is expressed by the following system consisting of $N$ equations for $V_i$ $(i = 1, \ldots, N)$:

d» /dt"!o (V, T )+*¸(V, T )/*» , (i"1,2,N). (28) G G G Here each function o (V, T ) is in C such that G o (V, T )'0 for all (V, T )3Dom(V, T ). (29) G (b) A synapse dynamical system. It is assumed that the synapse weights ¹ "¹ (t) (i, j"1,2,N; i(j) satisfy the following symmetric condition: GH GH ¹ (t)"¹ (t) and ¹ (t)"0 for all t3[0,R), (i, j"1,2,N). (30) GH HG GG Then a synapse dynamical system is expressed by the following system consisting of (1/2)(N!1)N equations for ¹ : GH





d¹ /dt"!p (V, T ) C *R (V, T )/*¹ (i, j) GH GH I GH I O(i (kH), j (kH)),(i (kH), j (kH); kH"1,2,K),    





C *R (V, T )/*¹ * (d¹ * /dt) I GI  HI* GI  HI* I





# C *R (V, T )/*¹ * (d¹ * /dt) I GI  HI* GI  HI* I

(31a)

134

Y. Takahashi/Neurocomputing 24 (1999) 117–161

 

 

"!p * (V, T ) C *R (V, T )/*¹ * GI  HI* I GI  HI* I !p * (V, T ) C *R (V, T )/*¹ * GI  HI* I GI  HI* I





(kH"1,2,K),

(31b)

+*RH(V, T )/*» ,(d» /dt)# +*RH(V, T )/*¹ ,(d¹ /dt) I GH GH I G G G G H (31c) "!RH(V, T ) (k"1,2,K). I Here, RH(V, T ) is defined by I RH(V, T ),R (V, T )+1#G(¹ ), (k"1,2,K), (32a) I I GI HI where G(¹ ) in turn is defined by GI HI G(¹ ),(1/2)+1#tanh(¹ ), (k"1,2,K). (32b) GI HI GI HI Moreover each function p (V, T ) is in C such that GH p (V, T )'0 for all V, T )3Dom(V, T ). (33) GH Each equation for ¹ in Eq. (31a) can also be replaced by the trivial differential GH equation, d¹ /dt"0 if ¹ is constant in the network problem. GH GH Furthermore, the equation system (28), (31) is equivalently rewritten in the following special form of the canonical system (22). d» /dt"uN (V, T ) (i"1,2,N), (34a) G G d¹ /dt"tM (V, T ) (i, j"1,2,N); tM "tM and tM "0. (34b) GH GH GH HG GG Here functions uN (V, T ) and tM (V, T ) are in C on Dom(V, T ). G GH Finally, assume that this dynamical system (28), (31), or equivalently Eq. (34), starts from an initial point (V(0), T(0))"(V,T)3Dom(V, T ) arbitrarily chosen and synaptically symmetric, namely, ¹ "¹ and ¹"0 (i, j"1,2,N). GH HG GG Note. (1) The dynamical system of Definition 6 is parametrized by C used in Eq. (27). (2) Apply the Cauchy’s solution existence theorem to the dynamical system (34). Then one can know that there exists a unique trajectory (V(t), T(t)) of Eq. (34) that starts from the initial point (V,T). Now, let us expound the background of Definition 6, i.e., how we have constructed the dynamical system [44,45]. First of all, we constructed the following dynamical system by putting g(V, T )"¸(V, T ) in the classical method (25): d» /dt"!o (V, T )+*¸(V, T )/*» , (i"1,2,N), G G G d¹ /dt"!p (V, T )+*¸(V, T )/*¹ , (i, j"1,2,N). GH GH GH

(35a) (35b)
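For intuition about this kind of gradient system (an illustrative sketch only: it integrates the unconstrained classical form (25)/(35) with $\rho_i \equiv 1$ by forward Euler, not the full composite system (28), (31)), the code below shows a trajectory settling at a locally minimum point of a toy Lyapunov function.

```python
import numpy as np

def euler_descent(grad_L, V0, steps=2000, dt=1e-2):
    """Forward-Euler integration of dV/dt = -grad L(V), the rho_i = 1 case of Eq. (25a)/(35a)."""
    V = np.array(V0, dtype=float)
    for _ in range(steps):
        V -= dt * grad_L(V)
    return V

# Toy Lyapunov function L(V) = (V0 - 1)^2 + (V1 + 2)^2 with its minimum at (1, -2).
grad_L = lambda V: np.array([2.0 * (V[0] - 1.0), 2.0 * (V[1] + 2.0)])
print(euler_descent(grad_L, V0=[0.0, 0.0]))   # approaches [1, -2], an asymptotically stable point
```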


The equation system (35a) has remained as the neuron dynamical system (28). On the other hand, the equation system (35b) needs to be slightly modified to incorporate the constraint equalities $R_k(V, T) = 0$. There can be more than one such method. Among them, we selected one that requires the least number of equations from Eq. (35b) to be modified, since this method is considered one of the simplest.

To begin with, we have left the $(1/2)(N-1)N - 2K$ equations for $T_{ij}$ $\bigl((i, j) \neq (i_1(k^*), j_1(k^*)), (i_2(k^*), j_2(k^*))$; $k^* = 1, \ldots, K\bigr)$ among those in Eq. (35b) as they are. They constitute Eq. (31a).

Then we have incorporated the constraint equalities $R_k(V, T) = 0$ $(k = 1, \ldots, K)$ with $K$ synapse dynamical equations that are constructed as follows:

$dR^*_k(V, T)/dt = -R^*_k(V, T)$ $(k = 1, \ldots, K)$.  (36)

The reason follows. The $k$th equation of Eq. (36) is equivalently rewritten as

$R^*_k(V(t), T(t)) = R^*_k(V(0), T(0)) \exp(-t)$ for all $t \in [0, \infty)$.  (37)

It furthermore follows from Eq. (32b) that

$1 < \{1 + G(T_{i_k j_k})\} < 2$ for all real values $T_{i_k j_k}$.  (38)

It thus follows from Eqs. (32a), (37) and (38) that $R_k(V(t), T(t)) = 0$ is satisfied at $t = \infty$. Note that we are constructing the dynamical system so that its motion becomes stable at $t = \infty$, as we will rigorously confirm later in Theorem 2 in Section 6.1. In the meantime, one can easily see that the $k$th equation of Eq. (36) is also equivalently transformed into the $k$th equation of Eq. (31c).

Let us here additionally explain why we could not use in Eq. (31c) the constraint function $R_k(V, T)$ straightforwardly instead of $R^*_k(V, T)$ of Eq. (32a). Imagine, on the contrary, that $R_k(V, T)$ were used in Eq. (31c). Then Eq. (31c) would have to include the term $\sum_i \sum_j \{\partial R_k(V, T)/\partial T_{ij}\}$ that is also produced from Eq. (31b), to be constructed next. This implies that Eq. (31c) could not be independent of Eq. (31b). We have eluded this failure by putting some bias $G(T_{i_k j_k})$ solely on the variable $T_{i_k j_k}$ of $R^*_k(V, T)$, as specified in Eq. (32).

We have so far constructed the synapse dynamical equations (31a) and (31c); besides, the equations in Eq. (35b) for $(i, j) = (i_1(k^*), j_1(k^*)), (i_2(k^*), j_2(k^*))$ $(k^* = 1, \ldots, K)$ still remain. Thus the total number of equations sums up to $(1/2)(N-1)N + K$; in addition, the number of unknown functions $T_{ij}(t)$ is $(1/2)(N-1)N$. Hence, we are required to reduce the total number of equations by $K$. Also, that reduction must not affect the non-increasing property of the function $L(V(t), T(t))$ along the trajectory $(V(t), T(t))$. We have resolved these requirements by combining each pair of two equations in Eq. (35b) for $T_{ij}$ $\bigl((i, j) = (i_1(k^*), j_1(k^*)), (i_2(k^*), j_2(k^*))\bigr)$ into the equation for $k^*$ in Eq. (31b). In fact, the following inequality, which holds for Eq. (35), remains to hold for (28) and (31) as well (refer to Theorem 2 later in Section 6.1 for rigorous confirmation):

$dL(V(t), T(t))/dt \leq 0$ for all $t \in [0, \infty)$.  (39)
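A tiny numerical check of the mechanism behind Eqs. (36)-(38) (my illustration): $dR^*/dt = -R^*$ forces $R^*$ to decay as $R^*(0)e^{-t}$, and since the factor $\{1 + G(\cdot)\}$ lies strictly between 1 and 2, the raw constraint value $R_k$ is squeezed to zero as well.

```python
import math

R_star0 = 1.7                                 # initial value of R*_k = R_k * {1 + G(T)}
for t in (0.0, 5.0, 20.0):
    R_star = R_star0 * math.exp(-t)           # Eq. (37), the solution of dR*/dt = -R* (Eq. (36))
    # By Eq. (38), 1 < 1 + G(T) < 2, so R_k is bracketed between R*/2 and R*:
    print(f"t={t:5.1f}  R*={R_star:.2e}  R_k in ({R_star/2:.2e}, {R_star:.2e})")
```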


Consequently, we have constructed the dynamical system of Eqs. (28) and (31), which consists of $(1/2)N(N+1)$ differential equations for the $(1/2)N(N+1)$ unknown functions $(V, T)$.

5.4. TSP examples: construction of dynamical systems from the network TSPs

5.4.1. Construction of a dynamical system from the distributed network TSP

Apply the construction method of the dynamical system in Definition 6 to the distributed network TSP of Definition 5. Then we obtain an example of the dynamical system for the TSP, which is specified as follows.

Definition 7. Assume that the distributed network TSP is given. From it, construct a Lyapunov function $L(V, Z)$ as follows:

$L(V, Z) = F(V) + R(V, Z)$.  (40)

Here the objective function $F(V)$ was previously defined by Eq. (12). In addition, the constraint function $R(V, Z)$ is defined by

$R(V, Z) = \sum_k R_k(V, Z)$ $(k = 1, \ldots, M(M-1))$,  (41)

where each constraint function $R_k(V, Z)$ was previously defined by Eqs. (20) and (21). Then a distributed dynamical system for the TSP consists of a neuron dynamical system and a synapse dynamical system, which are specified as follows.

(a) A neuron dynamical system. A neuron dynamical system is expressed by

$dV_{Xm}/dt = -2V_{Xm}(1 - V_{Xm})\,\partial L(V, Z)/\partial V_{Xm}$ $(X, m = 1, \ldots, M)$.  (42)

(b) A synapse dynamical system. It is assumed that the variables $Z_k(t) = A_{Xm,Xn}(t)$ $(k = 1, \ldots, M(M-1))$ and $Z_k(t) = B_{Xm,Ym}(t)$ $(k = M(M-1)+1, \ldots, 2M(M-1))$ satisfy the following symmetric conditions:

$A_{Xm,Xn}(t) = A_{Xn,Xm}(t)$ for all $t \in [0, \infty)$ and for all $X, m, n = 1, \ldots, M$ $(n \neq m)$,  (43a)

$B_{Xm,Ym}(t) = B_{Ym,Xm}(t)$ for all $t \in [0, \infty)$ and for all $X, Y, m = 1, \ldots, M$ $(Y \neq X)$.  (43b)

Then a synapse dynamical system is expressed by the following system consisting of $2M(M-1)+1$ equations:

$dC/dt = -\sum_k \partial R_k(V, Z)/\partial C$,  (44a)






  

 

$\left\{\sum_k \partial R_k(V, Z)/\partial Z_{k^*}\right\}(dZ_{k^*}/dt) + \left\{\sum_k \partial R_k(V, Z)/\partial Z_{M(M-1)+k^*}\right\}(dZ_{M(M-1)+k^*}/dt)$
$= -\left\{\sum_k \partial R_k(V, Z)/\partial Z_{k^*}\right\}^2 - \left\{\sum_k \partial R_k(V, Z)/\partial Z_{M(M-1)+k^*}\right\}^2$ $(k^* = 1, \ldots, M(M-1))$,  (44b)

$\sum_X \sum_m \{\partial R^*_k(V, Z)/\partial V_{Xm}\}(dV_{Xm}/dt) + \sum_{k^*} \{\partial R^*_k(V, Z)/\partial Z_{k^*}\}(dZ_{k^*}/dt) = -R^*_k(V, Z)$ $(k = 1, \ldots, M(M-1))$.  (44c)

Here, $R^*_k(V, Z)$ is defined by

$R^*_k(V, Z) \equiv R_k(V, Z)\{1 + G(Z_k)\}$ $(k = 1, \ldots, M(M-1))$.  (45)

Furthermore, the equation system (42)-(45) can be rewritten equivalently into the following special form of the canonical system (22):

$dV_{Xm}/dt = \bar{\varphi}_{Xm}(V, Z)$ $(X, m = 1, \ldots, M)$,  (46a)

$dZ_k/dt = \bar{\psi}_k(V, Z)$ $(k = 1, \ldots, 2M(M-1)+1)$; $\bar{\psi}_{Xm,Yn} = \bar{\psi}_{Yn,Xm}$ and $\bar{\psi}_{Xm,Xm} = 0$ $(X, Y, m, n = 1, \ldots, M)$.  (46b)

Finally, assume that this dynamical system (42)-(45), or equivalently Eq. (46), starts from an arbitrarily chosen initial point $(V(0), Z(0)) = (V^0, Z^0) \in \mathrm{Dom}(V, Z) \equiv [0,1]^{M^2} \times (-\infty, +\infty)^{2M(M-1)+1}$ that satisfies a symmetric condition with respect to $Z(0) = (Z_k(0))$ similar to Eq. (43). □

Let us add some remarks on this Definition 7.

1. The Lyapunov function $L(V, Z)$ of Eq. (40) does not include any positive real parameter $C$, though the function $L(V, T)$ of Eq. (27) does. The rationale follows. First of all, we are to construct in Part III a particular network for the TSP that has the sigmoidal transformation function $h_i(u_i)$ of Eq. (11) at each neuron $i = Xm$, where the constant $u_0$ is taken extremely small. Thus, each variable $V_{Xm}$ takes on almost exactly one of the two values 0 and 1. This construction is essentially based on the work by Hopfield [24]. For the dynamical system of his network [24], Hopfield demonstrated that its asymptotically stable points lie in the $2^N$ $(N = M^2)$ corners of the hypercube $[0,1]^N$, each of which in turn is a locally minimum point of the Lyapunov function $L(V, Z)$ of Eq. (40). Moreover, as will be confirmed later in Theorem 2 in Section 6.1, the constraint equality $R_k(V(t), T(t)) = 0$ of the network problem is satisfied at any asymptotically stable point of the dynamical system. Therefore, any asymptotically stable point of the dynamical system is some locally minimum point of the function $F(V)$ of Eq. (12) that satisfies the constraint equality $R(V, Z) = 0$. This is true for whatever choice of $C$. Consequently, we are allowed to have put $C = 1$ in Eq. (40).


2. In the construction of the neuron dynamical system (42), we have selected $2V_{Xm}(1 - V_{Xm})$ as the $\rho_i(V, T)$ of Eqs. (28) and (29) from among many possible choices. That is,

$\rho_{Xm}(V, Z) = 2V_{Xm}(1 - V_{Xm})$.  (47)

This is because we have inherited the coefficient $2V_{Xm}(1 - V_{Xm})$ from the dynamical system of Hopfield's network [24]. It was implemented with the electronic circuit; the coefficient comes from the equation of motion of that electronic circuit. If consideration were to focus on purely mathematical matters, we could construct other, simpler neuron dynamical systems such as

$dV_{Xm}/dt = -\partial L(V, Z)/\partial V_{Xm}$ $(X, m = 1, \ldots, M)$.  (48)

3. The dynamical system of Definition 7 is the most distributed, i.e., autonomous and balanced, among those constructible from the TSP. It is thus expected that the dynamical system can be implemented without much difficulty with physical devices such as VLSI [36].

5.4.2. Construction of a dynamical system from the centralized network TSP

Apply the construction method of the dynamical system in Definition 6 to the centralized network TSP of Definition 4. Then we obtain precisely the dynamical system of Hopfield's network, which is reformulated as follows [24].

Definition 8. Assume that the centralized network TSP is given. From it, construct a Lyapunov function $L(V)$ as follows:

$L(V) = F(V) + R(V)$.  (49)

Here the objective function $F(V)$ and the constraint function $R(V)$ were previously defined by Eqs. (12) and (16), respectively. Then Hopfield's dynamical system consists of a neuron dynamical system and a synapse dynamical system. The neuron dynamical system is expressed by

$dV_{Xm}/dt = -2V_{Xm}(1 - V_{Xm})\,\partial L(V)/\partial V_{Xm}$ $(X, m = 1, \ldots, M)$.  (50)

On the other hand, the synapse dynamical system is the trivial one expressed by

$dZ_k/dt = 0$ $(k = 1, \ldots, 2M(M-1)+1)$;
$Z^*_k = A^*$ for $k = 1, \ldots, M(M-1)$,
$Z^*_k = B^*$ for $k = M(M-1)+1, \ldots, 2M(M-1)$,
$Z^*_k = C^*$ for $k = 2M(M-1)+1$.  (51)

Finally, assume that Hopfield's dynamical system starts from an initial point $V(0) = V^0 \in \mathrm{Dom}(V) \equiv [0,1]^{M^2}$ arbitrarily chosen.


6. Solving the network problem with the dynamical system

This section demonstrates, mathematically rigorously, that the dynamical system of Definition 6 produces solutions to the network problem of Definition 3.

6.1. Solving the network problem with the dynamical system: a general theorem

Let us state the results as the following Theorem 2.

Theorem 2. Assume that the network problem is given. Then the dynamical system produces a solution $V^S$ to it. Specifically, the dynamical system produces a point $(V^S, T^S)$ that satisfies the following three conditions.

(DS-1) The point $(V^S, T^S)$ is an asymptotically stable point of the dynamical system.

(DS-2) The point $(V^S, T^S)$ satisfies the constraint equalities:

$R_k(V^S, T^S) = 0$ for all $k = 1, \ldots, K$.  (52)

(DS-3) There exists some positive real parameter $C$ for the Lyapunov function $L(V, T)$ of Eq. (27) such that $V^S$ is a locally minimum point of the objective function $F(V)$ of the network problem.

Remark. The selection of the parameter $C$ in condition (DS-3) generally depends on the interrelationship between $F(V)$ and $R_k(V, T)$.

Proof. First of all, the Lyapunov stability theorem in Section 5.1.2 enables us to rewrite the conditions (DS-1)-(DS-3) as follows.

(DS-1) The motion of the dynamical system always converges to some asymptotically stable point, whatever point $(V^0, T^0)$ is chosen as its initial point $(V(0), T(0))$. Specifically, the following two conditions hold.

(DS-1a) For any equilibrium point $P^E \equiv (V^E, T^E)$ of the dynamical system, the function $L(V, T)$ of Eq. (27) is really a Lyapunov function for the dynamical system and $P^E$. That is, $L(V, T)$ satisfies the following conditions.

(DS-1a-L0) $L(V, T)$ is continuous on some neighborhood $U$ of $P^E$ and differentiable on $U - P^E$.

(DS-1a-L1) $L(V, T) > L(V^E, T^E)$ for all $(V, T) \in U - P^E$.

(DS-1a-L2) $dL(V, T)/dt < 0$ along all the trajectories $(V(t), T(t))$ of the dynamical system contained in $U - P^E$.

(DS-1b) Any equilibrium point $P^E$ is an asymptotically stable point $P^S \equiv (V^S, T^S)$.

(DS-2) At each asymptotically stable point $P^S = (V^S, T^S)$ of the dynamical system, all the constraint functions $R_k(V, T)$ vanish. That is,

$R_k(V^S, T^S) = 0$ for all $k = 1, \ldots, K$.  (53)

(DS-3) There exists some positive real parameter $C$ for $L(V, T)$ such that any asymptotically stable point $P^S = (V^S, T^S)$ includes a solution $V^S$. Specifically, the following two conditions hold.


(DS-3a) Any asymptotically stable point $P^S$ is a locally minimum point $P^{M*} \equiv (V^{M*}, T^{M*})$ of $L(V, T)$ for which all the following constraint equalities hold:

$R_k(V^{M*}, T^{M*}) = 0$ for all $k = 1, \ldots, K$.  (54)

(DS-3b) There exists some positive real parameter $C$ for $L(V, T)$ such that any locally minimum point $P^{M*} = (V^{M*}, T^{M*})$ of $L(V, T)$ satisfying Eq. (54) includes a locally minimum point $V^M \equiv V^{M*}$ of $F(V)$. In particular, the point $V^{M*} = V^M$ is a solution $V^S$ to the network problem.

Now, let us demonstrate each of the conditions (DS-1)-(DS-3) one by one.

(1) Satisfaction of (DS-1)

(a) Satisfaction of (DS-1a). Suppose that the point $P^E$ is any equilibrium point. For this equilibrium point $P^E$, we examine the conditions (DS-1a-L0)-(DS-1a-L2).

First, the condition (DS-1a-L0) evidently proves true. This is because $L(V, T) \in C^1$ follows from Eq. (27) coupled with $F(V) \in C^1$ and $R_k(V, T) \in C^1$.

We then proceed to (DS-1a-L1). The point $P^E$, by definition of an equilibrium point [19,32], complies with the following condition:

$dV_i/dt = 0$ $(i = 1, \ldots, N)$,  (55a)

$dT_{ij}/dt = 0$ $(i, j = 1, \ldots, N)$.  (55b)

It thus follows from Eqs. (28) and (31) that Eq. (55) is rewritten as

$\rho_i(V, T)\{\partial L(V, T)/\partial V_i\} = 0$ $(i = 1, \ldots, N)$,  (56a)

$\sigma_{ij}(V, T)\left\{C \sum_k \partial R_k(V, T)/\partial T_{ij}\right\} = 0$ $\bigl((i, j) \neq (i_1(k^*), j_1(k^*)), (i_2(k^*), j_2(k^*))$; $k^* = 1, \ldots, K\bigr)$,  (56b)

$\sigma_{i_1(k^*) j_1(k^*)}(V, T)\left\{C \sum_k \partial R_k(V, T)/\partial T_{i_1(k^*) j_1(k^*)}\right\}^2 + \sigma_{i_2(k^*) j_2(k^*)}(V, T)\left\{C \sum_k \partial R_k(V, T)/\partial T_{i_2(k^*) j_2(k^*)}\right\}^2 = 0$ $(k^* = 1, \ldots, K)$,  (56c)

$R^*_k(V, T) = 0$ $(k = 1, \ldots, K)$.  (56d)

It then follows from Eqs. (29), (32a), (32b) and (33), coupled with $C > 0$, that Eq. (56) is further rewritten as

$\partial L(V, T)/\partial V_i = 0$ $(i = 1, \ldots, N)$,  (57a)

$\sum_k \partial R_k(V, T)/\partial T_{ij} = 0$ $(i, j = 1, \ldots, N)$,  (57b)

$R_k(V, T) = 0$ $(k = 1, \ldots, K)$.  (57c)


In the meantime, any extreme point (locally maximum or minimum) of $L(V, T)$ satisfies the following condition:

$\partial L(V, T)/\partial V_i = 0$ $(i = 1, \ldots, N)$,  (58a)

$\partial L(V, T)/\partial T_{ij} = 0$ $(i, j = 1, \ldots, N)$.  (58b)

It follows from Eq. (27), coupled with $C > 0$, that Eq. (58) is further rewritten as

$\partial L(V, T)/\partial V_i = 0$ $(i = 1, \ldots, N)$,  (59a)

$\sum_k \partial R_k(V, T)/\partial T_{ij} = 0$ $(i, j = 1, \ldots, N)$.  (59b)

Evidently, Eq. (57) is sufficient for Eq. (59). Hence, $P^E$ is an extreme point of $L(V, T)$. Furthermore, condition (DS-1a-L2), to be demonstrated next, enables us to state that $P^E$ is a locally minimum point of $L(V, T)$. Therefore we have proved (DS-1a-L1).

Finally, let us examine (DS-1a-L2). First, differentiate $L(V, T)$ with respect to $t$. Then we have the following equation:

$dL(V, T)/dt = \sum_i \{\partial L(V, T)/\partial V_i\}(dV_i/dt) + \sum_i \sum_j \{\partial L(V, T)/\partial T_{ij}\}(dT_{ij}/dt)$
$= \sum_i \{\partial L(V, T)/\partial V_i\}(dV_i/dt) + \sum_i \sum_j \left\{C \sum_k \partial R_k(V, T)/\partial T_{ij}\right\}(dT_{ij}/dt)$.  (60)

Next, let us evaluate each term on the right-hand side of Eq. (60). It follows from Eqs. (28) and (29) that the first term is transformed into and evaluated by

$\sum_i \{\partial L(V, T)/\partial V_i\}(dV_i/dt) = -\sum_i \rho_i(V, T)\{\partial L(V, T)/\partial V_i\}^2 \leq 0$.  (61)

It furthermore follows from Eqs. (31) and (33) that the second term is transformed into and evaluated by





$\sum_i \sum_j \left\{C \sum_k \partial R_k(V, T)/\partial T_{ij}\right\}(dT_{ij}/dt) = -\sum_i \sum_j \sigma_{ij}(V, T)\left\{C \sum_k \partial R_k(V, T)/\partial T_{ij}\right\}^2 \leq 0$.  (62)

Therefore, by substituting the right-hand sides of Eqs. (61) and (62) for the corresponding terms of Eq. (60), we have the following evaluation:

$dL(V, T)/dt = -\sum_i \rho_i(V, T)\{\partial L(V, T)/\partial V_i\}^2 - \sum_i \sum_j \sigma_{ij}(V, T)\left\{C \sum_k \partial R_k(V, T)/\partial T_{ij}\right\}^2 \leq 0$.  (63)


Moreover, it follows from Eq. (63), coupled with Eqs. (29) and (33), that the necessary and sufficient condition for $dL(V, T)/dt = 0$ is identical with Eq. (59). Remember here that any equilibrium point $P^E$ is also an extreme point. In addition, it is sufficient to assume that the functions $F(V)$ and $R_k(V, T)$ $(k = 1, \ldots, K)$ are all generic, and hence so is $L(V, T)$. Consequently, we can take some neighborhood $U$ of $P^E$ such that

$U \cap \{(V, T) \in \mathrm{Dom}(V, T) \mid$ the condition (62) holds for $(V, T)\} = \{P^E\}$.  (64)

This proves (DS-1a-L2); and thus completes the proof of (DS-1a). (b) Satisfaction of (DS-1b). In the proof of (DS-1a), we have already demonstrated that the dynamical system has a Lyapunov function ¸(V, T ) for any equilibrium point P#"(V #, T #). We thus know from the Lyapunov’s stability theorem that P# is asymptotically stable. This has confirmed (DS-1b). (2) Satisfaction of (DS-2). Any asymptotically stable point P, by definition, is an equilibrium point P#. Then, P# satisfies Eq. (57) including the condition Eq. (57c) that coincides with Eq. (53). This concludes the proof. (3) Satisfaction of (DS-3). The condition (DS-3a) is derived straightforwardly from (DS-1a-L1) and (DS-2). We then proceed to (DS-3b). Let P+H be any locally minimum point of ¸(V, T ) satisfying Eq. (54). First of all, P+H necessarily satisfies Eq. (59). It follows from Eq. (27) that Eq. (59a) is further rewritten as *¸(V, T )/*» "*F(V )/*» #C *R (V, T )/*» "0 (i"1,2,N). (65) G G I G I Remember here that R (V, T )3C satisfies the non-negative condition (8). Then it I follows from Eq. (54) that P+H is a locally minimum point of R (V, T ) and hence I satisfies the following condition: *R (V, T )/*» "0 (i"1,2,N; k"1,2,K), (66a) I G *R (V, T )/*¹ "0 (i, j"1,2,N; k"1,2,K). (66b) I GH Therefore it follows from Eq. (65) coupled with Eq. (66a) that V +H satisfies the following condition: *F(V )/*» "0 (i"1,2,N). (67) G H It hence follows from Eq. (67) that V + is also an extreme point of F(V). Furthermore, for all the locally minimum points P+H"(V +H, T +H) of ¸(V, T ), we select some parameter C, not dependent on any locally minimum point P+H, that satisfies the following condition: There exists some neighborhood º(P +H) of P +H such that the following condition holds along V(t) of every trajectory (V(t), T(t)) contained in Proj(º(V +H))5Dom(V )!V +H. dF(V )/dt(0. Here, Proj(º(V +H)) denotes the projection of º(P+H) over V-coordinates.

(68)


In fact, this selection of the parameter C is possible. Let us specifically find some such parameter C. First, it follows from Eqs. (27) and (28) that Eq. (68) is equivalent to the following inequality:

Σ_i ρ_i(V, T)[{∂F(V)/∂V_i}² + C{∂F(V)/∂V_i} Σ_k ∂R_k(V, T)/∂V_i] > 0.  (69)

Because of Eq. (29), we can employ the following termwise-positive sufficient condition for Eq. (69):

{∂F(V)/∂V_i}² + C{∂F(V)/∂V_i} Σ_k ∂R_k(V, T)/∂V_i > 0  for all i = 1, …, N.  (70)

Furthermore, all the derivatives ∂F(V)/∂V_i and ∂R_k(V, T)/∂V_i are bounded on their whole domains since F(V) ∈ C² and R_k(V, T) ∈ C² on the closed domains Dom(V) and Dom(V, T), respectively. Hence, for all the locally minimum points P^{M*} = (V^{M*}, T^{M*}) of L(V, T), we can find some parameter C and a neighborhood U(P^{M*}) of P^{M*} such that the following condition holds:

{∂F(V)/∂V_i}² + C{∂F(V)/∂V_i} Σ_k ∂R_k(V, T)/∂V_i > 0
for all (V, T) ∈ U(P^{M*}) ∩ Dom(V, T) − {P^{M*}} and for all i = 1, …, N.  (71)
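Condition (70) can be checked mechanically once F and the R_k are given. The sketch below is illustrative only: the one-dimensional F, the single constraint R₁, the grid standing in for a neighborhood and the candidate list of C values are all assumptions; it simply evaluates the termwise inequality (70) over the grid and reports which candidate values of C satisfy it.

```python
import numpy as np

# Termwise check of the sufficient condition (70):
#   (dF/dV_i)^2 + C * (dF/dV_i) * sum_k dR_k/dV_i > 0.
# One-dimensional toy functions (assumed for illustration only):
#   F(V) = (V - 1)^2,  R_1(V) = (2V - 1)^2.
def dF(V):
    return 2.0 * (V - 1.0)

def dR(V):
    return 4.0 * (2.0 * V - 1.0)

def condition_70(C, V, eps=1e-12):
    lhs = dF(V)**2 + C * dF(V) * dR(V)
    return np.all(lhs[np.abs(dF(V)) > eps] > 0.0)

# Region over which the condition is required (an assumed subset of Dom(V)).
V_grid = np.linspace(0.6, 0.9, 301)

candidates = [1.0, 0.5, 0.1, 0.05, 0.01]
admissible = [C for C in candidates if condition_70(C, V_grid)]
print("candidate C values:", candidates)
print("values satisfying the termwise condition (70) on the grid:", admissible)
```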

We thus obtain a parameter C that ensures Eq. (68). Condition (68) evidently shows that the extreme point V^{M*} of F(V) is actually a locally minimum point V^M of F(V); that is, V^M = V^{M*}. This concludes the proof of (DS-3b). □

6.2. TSP examples: solving the network TSP with the dynamical systems

This section shows two TSP examples of Theorem 2: solving the distributed network TSP of Definition 5 with the TSP distributed dynamical system of Definition 7; and solving the centralized network TSP of Definition 4 with the Hopfield dynamical system of Definition 8.

6.2.1. Solving the distributed network TSP with the TSP distributed dynamical system

Let us apply Theorem 2 to the distributed network TSP. Then we have the following corollary.

Corollary 1. Assume that the distributed network TSP is given. Then the TSP distributed dynamical system produces a point V^S that is a solution to it and thus a feasible one to the original TSP of Definition 2. Specifically, the TSP distributed dynamical system produces a point (V^S, Z^S) ∈ {0,1}^{M²} × (−∞, +∞)^{2M(M−1)+1} that satisfies the following three conditions.

(DS-1) The point (V^S, Z^S) is an asymptotically stable point of the TSP distributed dynamical system.


(DS-2) The point (V^S, Z^S) satisfies the constraint equalities:

R_k(V^S, Z^S) = 0 for k = 1, …, M(M−1).  (72)

(DS-3) The point V^S is a locally minimum point of the TSP objective function F(V) of Eq. (12).

In (DS-3) of Theorem 2, the parameter C for L(V, T) must be found suitably. On the other hand, no such discussion is required for (DS-3) of Corollary 1. This is because we have already put C = 1 as in Eq. (40).

6.2.2. Solving the centralized network TSP with the Hopfield dynamical system

Let us apply Theorem 2 to the centralized network TSP. Then we have the following corollary.

Corollary 2. Assume that the centralized network TSP is given. Then the Hopfield dynamical system produces a point V^S that is a solution to it if the positive real constants A*, B* and C* of the TSP constraint function R(V) of Eq. (16) are all selected appropriately. Specifically, the Hopfield dynamical system produces a point V^S ∈ {0,1}^{M²} that satisfies the following three conditions.

(DS-1) The point V^S is an asymptotically stable point of the Hopfield dynamical system.

(DS-2) The point V^S satisfies the constraint equality:

R(V^S) = 0.  (73)

This satisfaction of the constraint equality is obtained only if the positive real constants A*, B* and C* of R(V) of Eq. (16) are all determined by the final-state parameters Z_k(∞) of the Hopfield dynamical system. That is:

A* = 1 + exp(Z_k(∞)) = 1 + exp(A_{XmXn}(∞))  (k = 1, …, M(M−1)),  (74a)
B* = 1 + exp(Z_k(∞)) = 1 + exp(B_{XmYm}(∞))  (k = M(M−1)+1, …, 2M(M−1)),  (74b)
C* = any positive real constant.  (74c)

(DS-3) The point V^S is a locally minimum point of the TSP objective function F(V) of Eq. (12).

As described in the Introduction, the Hopfield dynamical system has major drawbacks, among them (b) possible infeasible solutions and (c) heuristic choice of network parameters and an initial state. Comparison of (DS-2) in Corollaries 1 and 2 clearly shows that the TSP distributed dynamical system has removed both the drawbacks (b) and (c) completely. Let us expound this improvement in some more detail.

The Hopfield dynamical system can asymptotically converge on any of the 2^{M²} corners of the hypercube [0,1]^{M²}, depending on the choice of the parameters A*, B*


and C* as well as of the initial point V(0). For any V(0), it is extremely difficult to find by trial and error suitable positive real constants A*, B* and C*, such as those of Eq. (74), that lead to feasible solutions to the TSP. In the meantime, the satisfaction of Eq. (73) is necessary and sufficient for any TSP solution V^S to be feasible. The Hopfield dynamical system cannot always converge on a point V^S satisfying Eq. (73), because not all of the 2^{M²} corners of [0,1]^{M²} satisfy Eq. (73). It thus often produces infeasible solutions to the TSP. Standing in marked contrast to the Hopfield dynamical system, the TSP distributed dynamical system always produces a point (V^S, Z^S) satisfying Eq. (72) for any initial point (V(0), Z(0)) chosen. This is achieved by the elastic synapse dynamical system (44), (45), which is a decisive improvement over the trivial one (51) of Hopfield's.
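The scarcity of feasible corners is easy to make concrete. In the sketch below the row/column/total penalty is the standard Hopfield–Tank form and is only assumed here (Eq. (16) itself is not restated in this section); the code counts how many corners of {0,1}^{M×M} make the assumed penalty vanish, i.e. how few of the 2^{M²} corners could ever satisfy a condition of the type (73).

```python
import itertools
import numpy as np

M = 3  # number of cities (illustrative)

def penalty(V, A=1.0, B=1.0, C=1.0):
    """Hopfield-Tank style constraint penalty (an assumed form, not Eq. (16) itself):
    zero exactly when V is a permutation matrix, i.e. a feasible tour."""
    rows = np.sum((V.sum(axis=1) - 1.0) ** 2)      # each city visited once
    cols = np.sum((V.sum(axis=0) - 1.0) ** 2)      # each position used once
    total = (V.sum() - M) ** 2                     # exactly M entries active
    return A * rows + B * cols + C * total

corners = [np.array(bits, dtype=float).reshape(M, M)
           for bits in itertools.product([0, 1], repeat=M * M)]
feasible = sum(1 for V in corners if penalty(V) == 0.0)
print(f"{feasible} feasible corners out of {len(corners)}")
# For M = 3 this prints 6 feasible corners out of 512: the 3! permutation matrices.
```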

PART III: SOLVING THE NETWORK PROBLEM WITH A NETWORK

This part (Sections 7–9) discusses the lower half of the diagram depicted in Fig. 1. First, Section 7 embeds the dynamical system into a parallel-processing transformation system; the resultant composite system constitutes a network. Then Section 8 constructs from the network problem a particular network for problem-solving. Finally, Section 9 demonstrates that the network produces solutions to the network problem.

7. A network

By extending Golden's framework [17], Takahashi [44] developed, in a mathematically rigorous way, a network model for problem-solving. This paper employs his network model. To keep the paper self-contained, this section adapts that model from [44] for the purpose of problem-solving.

7.1. Overall specification of a network

In contrast with most previous networks, which were developed in engineering, somewhat bottom-up ways, we specify our network in a mathematically rigorous, top-down way. We thus specify a network as a set of its constituent functions that must comply with all restrictive conditions, necessary and sufficient. Note that every constituent function is inherently abstract. Hence, the specification does not put any restriction on mappings from problems onto networks, including engineering implementation methods of the mapping.

Prior to the specification, we need to prepare some notation. Consider analog, mesh-shaped, interconnected networks. Let N denote the number of neurons and t ∈ [0, ∞) the time. Then, associated with each neuron i at each time t, there are two quantities u_i ≡ u_i(t) ∈ R and V_i ≡ V_i(t) ∈ R that represent an input to and an output from neuron i, respectively (i = 1, …, N). Furthermore, associated with each synapse


ij at each time t, there is a quantity T_ij ≡ T_ij(t) ∈ R that represents a synapse weight (i, j = 1, …, N). It is supposed that Dom(V, T) = ∏_i [a_i, b_i] × R^{N²}, where a_i and b_i both indicate real numbers.

Now, let us proceed to the specification. We first give an overall specification of a network as follows; details of each network constituent are specified in the subsequent series of sections.

Definition 9. A network is an analog, mesh-shaped, interconnected network consisting of N neurons and N² synapses that satisfies the following three conditions.
(a) The network is a composite system that consists of a transformation system and a dynamical system.
(b) The dynamical system consists of a neuron dynamical system and a synapse dynamical system. The constituent dynamical systems represent the time-evolutionary state-changes of the activation level V(t) (output) and of the synapse weight T(t), which are expressed by the following canonical form of a first-order differential equation system:

dV_i/dt = φ_i(V, T), φ_i ∈ C¹  (i = 1, …, N),  (75a)
dT_ij/dt = ψ_ij(V, T); ψ_ij ∈ C¹, ψ_ij = ψ_ji and ψ_ii = 0  (i, j = 1, …, N).  (75b)

Moreover, the dynamical system has some local (not necessarily global) Lyapunov function for any equilibrium point.
(c) The constituent functions of the network must comply with some Δt-shift form of an ensemble of global compatibility conditions and local compatibility conditions.

7.2. Specification of a transformation system

The main parallel-processing power of the network lies in each neuron. Its processing task is to transform inputs into outputs, where the neuron activation level is used as the output. McCulloch and Pitts [35] developed a primitive neuron model. With their neuron model, this processing task is mathematically identified with a threshold function that transforms inputs into outputs. Though their neuron model is quite simple, many previous networks employ its analog version. That is, many current neurons adopt a sigmoidal function as a transformation function. We thus reach the following specification of a transformation system that is comprised of sigmoidal neurons.

Definition 10. The ith transformation system is a system that incorporates the following deterministic, time-invariant input–output transformation rule at neuron i (1 ≤ i ≤ N):

V_i = ω_i(u_i),  (76a)
u_i = Σ_j v_ij(T_ij) g_j(V_j) − θ_i(V_i); v_ij = v_ji and v_ii = 0  (i, j = 1, …, N).  (76b)


Here, a transformation function ω_i(u_i) ∈ C² (twice continuously differentiable) is bounded and monotonically increasing; besides, v_ij(T_ij) ∈ C¹, g_j(V_j) ∈ C¹ and θ_i(V_i) ∈ C¹.

Note. The smoothness assumption u ≡ (v, g, θ) ∈ C¹ and ω ∈ C² can be achieved by usual network implementation techniques such as electronic circuits and VLSI [24,36].

Let us give two special examples of the transformation rule (76) that underlie Definition 10. The first one comes from the McCulloch–Pitts neuron [35]. Its transformation rule is expressed by

V_i = 1[u_i] = 1 if u_i ≥ 0,
V_i = 1[u_i] = 0 if u_i < 0,  (77a)
u_i = Σ_j T_ij V_j − θ_i.  (77b)

Here, 1[u_i] denotes a threshold function and, besides, the real constants T_ij and θ_i represent a synapse weight and a threshold, respectively. This transformation rule (77) is a special case of Eq. (76) in the sense that 1[u_i] is a limiting case of the sigmoidal function ω_i(u_i).

The second one comes from the Hopfield network. Its transformation rule was previously expressed by Eq. (11). Formally, the rule (11) is a special form of Eq. (76). Actually, we have specified the transformation rule (76) by mathematically generalizing Hopfield's rule (11).

7.3. Specification of compatibility conditions

Each neuron of the network supports its own part of the transformation system and of the dynamical system. Thus, the piecewise specifications (75) and (76) of the network must be consistent with one another.

First of all, the Lyapunov stability theorem ensures that the dynamical system (75), if separated from the transformation system (76), performs consistent state-changes and converges asymptotically on some equilibrium point. Hence, the specification consistency is reduced to conditions requiring that the specification (76) be non-contradictory by itself and, besides, compatible with the specification (75). We express those conditions by an ensemble of two sets of compatibility conditions: global and local compatibility conditions. They are intuitively characterized as follows.

1. A set of global compatibility conditions. The collection of all the piecewise transformation rules (76), each specified locally at neuron i, must form a non-contradictory transformation rule specified globally over all the neurons and synapses. Specifically, each transformation rule must be compatible with the rest. Such compatibility is specified by global compatibility conditions on the functions (u, ω, φ, ψ) as well as the variables (V, T).


2. A set of local compatibility conditions. Each transformation rule (76) at neuron i must be compatible with the state-change rules (75) of neuron i and of the synapses ij (j = 1, …, N) for all t ∈ [0, ∞). Such compatibility is specified by local compatibility conditions on the functions (u, ω, φ, ψ) as well as the variables (V, T).

Now, let us specify the compatibility conditions mathematically rigorously as follows.

Definition 11. (a) A set of global compatibility conditions is specified as a necessary and sufficient condition that ensures solution existence of the following functional equation system:

Σ_j v_ij(T_ij) g_j(V_j) − θ_i(V_i) − ω_i^{−1}(V_i) = 0  (i = 1, …, N).  (78)

Here, u_i = ω_i^{−1}(V_i) indicates the inverse function of V_i = ω_i(u_i).

(b) A set of local compatibility conditions is specified as a necessary and sufficient condition that ensures solution existence of the following functional equation system:

φ_i − (dω_i/du_i)[Σ_j v_ij (dg_j/dV_j) φ_j + Σ_j (dv_ij/dT_ij) ψ_ij g_j − (dθ_i/dV_i) φ_i] = 0  (i = 1, …, N).  (79)
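Before deriving these conditions, it may help to see what the global compatibility condition (78) asks concretely. With the simple choices v_ij(T_ij) = T_ij, g_j(V_j) = V_j, θ_i(V_i) = θ_i and a logistic ω_i, Eq. (78) reduces to the fixed-point equation V = ω(TV − θ). The sketch below (the weights T, the thresholds θ and the logistic ω are assumptions chosen for illustration) finds such a point by iteration and checks the residual of (78).

```python
import numpy as np

def omega(u):
    return 1.0 / (1.0 + np.exp(-u))          # a bounded, increasing transformation function

# Illustrative data (assumptions): 3 neurons, symmetric T with zero diagonal.
T = np.array([[0.0, 0.4, -0.2],
              [0.4, 0.0, 0.6],
              [-0.2, 0.6, 0.0]])
theta = np.array([0.1, -0.3, 0.2])

# Iterate V <- omega(T V - theta); a fixed point satisfies Eq. (78) with
# v_ij(T_ij) = T_ij, g_j(V_j) = V_j, theta_i(V_i) = theta_i.
V = np.full(3, 0.5)
for _ in range(500):
    V = omega(T @ V - theta)

residual = T @ V - theta - np.log(V / (1.0 - V))   # omega^{-1}(V) = logit(V)
print("fixed point V =", V)
print("residual of Eq. (78):", residual)           # near zero at the fixed point
```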

Note. (1) The functional equation system (78) consists of N equations with u = (v, g, θ) and ω unknown functions and (V, T) independent variables. Its solution existence requires that there exist at least one valid combination (u, ω) satisfying Eq. (78) for at least one point (V, T) shared by the domains of all of the functions (u, ω).
(2) The functional equation system (79) consists of N equations with (u, ω, φ, ψ), u = (v, g, θ), unknown functions and (V, T) independent variables. Its solution existence requires that there exist at least one valid combination (u, ω, φ, ψ) satisfying Eq. (79) for at least one point (V, T) shared by the domains of all of the functions (u, ω, φ, ψ).

We must additionally explain how we have derived expressions (78) and (79). The derivation of Eq. (78) is easy: we have obtained it from Eq. (76) by substituting the right-hand side of Eq. (76b) for u_i on the right-hand side of Eq. (76a).

On the other hand, the derivation of Eq. (79) needs a little more elaboration. We have obtained it from Eqs. (75) and (76). First of all, we substituted the right-hand side of Eq. (76b) for u_i on the right-hand side of Eq. (76a). We then differentiated the resultant equation with respect to the time t, producing the following differential equation system:

dV_i/dt = (dω_i/du_i)[Σ_j v_ij (dg_j/dV_j)(dV_j/dt) + Σ_j (dv_ij/dT_ij)(dT_ij/dt) g_j − (dθ_i/dV_i)(dV_i/dt)].  (80)
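The step from Eq. (76) to Eq. (80) is an application of the chain rule along a trajectory, which can be verified symbolically on a small instance. In the sketch below the concrete choices of ω, v, g and θ for a two-neuron toy are assumptions made only to exercise the identity; they are not the paper's general functions.

```python
import sympy as sp

t = sp.symbols('t')
V1, V2, T12 = sp.Function('V1')(t), sp.Function('V2')(t), sp.Function('T12')(t)

# Assumed concrete instances of the constituent functions for neuron 1:
#   omega(u) = tanh(u), v(T) = T, g(V) = V, theta(V) = V/2  (self term v_11 = 0).
u1 = T12 * V2 - V1 / 2                              # Eq. (76b) for neuron 1

lhs = sp.diff(sp.tanh(u1), t)                       # d/dt of V1 = omega(u1), as in Eq. (80)

# Right-hand side of Eq. (80), written out by the chain rule:
#   omega'(u1) * [ v(T12) g'(V2) dV2/dt + v'(T12) dT12/dt g(V2) - theta'(V1) dV1/dt ]
rhs = (1 - sp.tanh(u1)**2) * (T12 * sp.diff(V2, t)
                              + sp.diff(T12, t) * V2
                              - sp.Rational(1, 2) * sp.diff(V1, t))

print(sp.simplify(lhs - rhs))   # prints 0: the two expressions are identical
```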


Finally, we substituted the right-hand sides of Eq. (75) for dV_i/dt and dT_ij/dt on both sides of Eq. (80). We have thus obtained Eq. (79).

7.4. Specification of Δt-shift forms of the compatibility conditions

Engineering devices cannot avoid involving a time delay somewhere in the operation within the transformation system and/or the dynamical system, and/or between the transformation system and the dynamical system. It is usually the case, however, that the time delay occurs in the operation within the transformation system. This is because many devices can hardly satisfy the transformation rule (76) exactly. Instead, they satisfy transformation rules in a delayed mode expressed by

V_i(t + Δ_i t) = ω_i(u_i(t)),  (81a)
u_i(t) = Σ_j v_ij(T_ij(t)) g_j(V_j(t)) − θ_i(V_i(t)).  (81b)

Here, Δ_i t represents a delay time in the input–output transformation at neuron i. As contrasted with the expression (81), the exact transformation rule (76) is often rewritten as

V_i(t) = ω_i(u_i(t)),  (82a)
u_i(t) = Σ_j v_ij(T_ij(t)) g_j(V_j(t)) − θ_i(V_i(t)).  (82b)

In the meantime, the network problem of Definition 3 can be categorized into two classes in light of the delayed-mode transformation system (81) that is to be applied to it. The first class is the one where the time delay is negligible for the solving network; that is, the time delay does not affect any solutions to the network problem produced by the network. This class includes, for instance, pattern classification and pattern learning [11,13,41]. The second class is the one where the time delay is significant for the solving network; that is, the time delay has a serious influence on solutions. This class includes, for instance, the TSP [8–10,12,24]. Our concept of a Δt-shift form of the compatibility conditions enables the network to be applied to both classes of the network problem in an integrated manner. Let us thus specify Δt-shift forms as follows.

Definition 12. Denote by Δ_i t and Δ_ij t non-negative real numbers that represent small quantities of time: Δ_i t ≥ 0 and Δ_ij t ≥ 0 (i, j = 1, …, N). Consider also the following functional equation system consisting of 2N functional equations:

H_k(u, ω, φ, ψ) ≡ H_k(u(V(t), T(t)), ω(u(V(t), T(t))), φ(V(t), T(t)), ψ(V(t), T(t))) = 0  (k = 1, …, 2N).  (83)

Here, H_k(u, ω, φ, ψ) ∈ C¹. Then, replace any one or more variable instances from among the variables V_i(t) and T_ij(t) in Eq. (83) with V_i(t + Δ_i t) and T_ij(t + Δ_ij t), respectively. The


resultant functional equation system is named a Δt-shift form of Eq. (83). Furthermore, a collection of functions (u, ω, φ, ψ) is said to comply with the Δt-shift form if it satisfies that Δt-shift form for at least one point (V, T) shared by the domains of all of the functions (u, ω, φ, ψ).

Note. The functional equation system (83) is itself a special Δt-shift form of its own in which the delay times Δ_i t and Δ_ij t are all set to zero.

One can see from Definitions 11 and 12 that any Δt-shift form of the compatibility conditions (78), (79) is a differential–difference equation system for the unknown functions (u, ω, φ, ψ). Thus item (c) in Definition 9 means that the network constituent functions (u, ω, φ, ψ) must be a solution to such a differential–difference equation system.

7.5. The network is well-specified

We now justify the specification of the network of Definitions 9–12. That is, let us confirm as follows that the network is well-specified in its motion.

Proposition 1. The network works consistently. Specifically, it satisfies the following three conditions.
(C1) The dynamical system (75) performs the state-changes consistently for all the time t ∈ [0, ∞).
(C2) The transformation system (76) performs the input–output transformations consistently for all the inputs–outputs (u, V).
(C3) The cooperation between the dynamical system and the transformation system is performed consistently for all the constituent functions (u, ω, φ, ψ), the variables (V, T) and the time t ∈ [0, ∞).

Proof. Actually, we have specified the network with this Proposition 1 in mind. Hence, conditions (C1)–(C3) have already been embedded, though somewhat scattered, in the specification. We thus have only to collect the relevant discussion here.
1. It is ensured by the Lyapunov stability theorem that the motion of the dynamical system (75) converges asymptotically on some equilibrium point. Thus (C1) is satisfied.
2. Conditions (C2) and (C3) are identical with those necessary and sufficient for the network to comply with some Δt-shift form of the compatibility conditions (78) and (79). Obviously, they are nothing but the specification item (c) in Definition 9. □
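Operationally, a Δt-shift form says no more than that the output read out at time t + Δt is computed from the state at time t. A minimal timing sketch follows; the toy dynamics φ, the vanishing ψ, the sigmoid ω and all numerical values are assumptions chosen only to exhibit the interplay of the state-change rule (75) and the delayed-mode rule (81).

```python
import numpy as np

def omega(u):
    return np.tanh(u)

# Assumed toy dynamics (75): V relaxes toward tanh(T V), T stays fixed (psi = 0).
def phi(V, T):
    return -V + omega(T @ V)

N, dt, delay_steps = 3, 0.01, 5          # delay Delta t = 5 * dt (assumed)
T = 0.3 * (np.ones((N, N)) - np.eye(N))  # symmetric weights, zero diagonal
theta = np.zeros(N)

V = np.array([0.2, -0.1, 0.4])
buffered_u = []                          # inputs waiting to be read out after Delta t
outputs = []

for step in range(200):
    u = T @ V - theta                    # Eq. (81b): input computed from the state at time t
    buffered_u.append(u)
    if step >= delay_steps:              # Eq. (81a): V_i(t + Delta t) = omega_i(u_i(t))
        outputs.append(omega(buffered_u[step - delay_steps]))
    V = V + dt * phi(V, T)               # Eq. (75a): state change of the dynamical system

print("last delayed output:", outputs[-1])
```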

8. Construction of a particular network from the network problem

This section constructs from the network problem of Definition 3 a particular network that can solve the problem. The construction techniques are an extension of those developed by Takahashi [45,46].


8.1. A particular network: general construction from the network problem

First of all, the network of Definitions 9–12 can be identified with its constituent functions (u, ω, φ, ψ) that satisfy the following two requirements.
(a) The functions (u, ω, φ, ψ) must be a solution of a differential–difference equation system consisting of some Δt-shift form of the compatibility conditions (78), (79).
(b) The functions (φ, ψ) must be constructed so that the dynamical system (75) can have some Lyapunov function for any of its equilibrium points.

Now, the equation system (78), (79) is indefinite. This is because it is comprised of 2N equations for (1/2)N(N−1) + 4N unknown functions (u, ω, φ, ψ) with u = (v, g, θ). There are thus several conceivable methods for constructing a set of functions (u, ω, φ, ψ) from the network problem. Every construction method depends on the way in which that indefiniteness in Eqs. (78) and (79) is removed. It seems difficult, however, to study all such possible construction methods exhaustively. We thus develop one particular, standard construction method instead. The essential idea of the development lies in constructing the functions (φ, ψ) and (u, ω) separately. Specifics are described as follows.

We first construct a particular version of the dynamical system (75), namely by adopting the dynamical system (28)–(33) that was previously constructed from the network problem in Section 5.3 of Part II. It thus follows from Eqs. (34) and (75) that

φ = φ̄ and ψ = ψ̄.  (84)

The construction of (φ̄, ψ̄) fulfills the requirement (b) above, as ensured previously by Theorem 2. We then construct a particular version (ū, ω) of the transformation system (76). Specifically, we generalize the Hopfield transformation system (u_i, h_i) of Eq. (11) into a necessary minimum for solving the network problem. Here we can choose the transformation system (ū, ω) without considering the network problem; that is, (ū, ω) can be independent of the network problem. It is also expected that the generalization can practically cover many engineering devices developable in the future.

Now, the resultant network (ū, ω, φ̄, ψ̄) is specified as follows, while it is left to Theorem 3 in Section 9.1 to confirm that the network (ū, ω, φ̄, ψ̄) fulfills the requirement (a).

Definition 13. A particular network (ū, ω, φ̄, ψ̄) consists of a dynamical system (φ̄, ψ̄) and a transformation system (ū, ω), which are specified as follows.
(a) The dynamical system (φ̄, ψ̄) is the same as that of Eqs. (28)–(30) and (33), or equivalently its particular canonical form (34), in Definition 6.
(b) The transformation system (ū, ω) is the collection of the ith transformation systems (ū_i, ω_i) over all the neurons i. Each ith transformation system (ū_i, ω_i) is in delayed mode and is expressed by

V_i(t + Δt) = ω_i(ū_i(t))  (i = 1, …, N),  (85a)


ū_i(t) = Σ_j v̄_ij(T_ij(t)) ḡ_j(V_j(t)) − θ̄_i(V_i(t))
       ≡ Σ_j T_ij(t) V_j(t) − θ_i  (i, j = 1, …, N),  (85b)
v̄_ij(T_ij) ≡ T_ij; v̄_ij = v̄_ji and v̄_ii = 0,  (86a)
ḡ_j(V_j) ≡ V_j,  (86b)
θ̄_i(V_i) ≡ θ_i.  (86c)

Here, ω_i ∈ C² is bounded and monotonically increasing. In addition, θ_i ∈ R, and Δt ≥ 0 represents the delay time common to all the neurons.

Note. The ith transformation system (ū_i, ω_i) of Eqs. (85) and (86) is a special version of the general one (81). The input function ū = (v̄, ḡ, θ̄) of Eqs. (85) and (86) is also a fairly simple special version of the general one u = (v, g, θ) of Eq. (76b). This is because ū_i of Eqs. (85) and (86) is a necessary minimum generalization of Hopfield's input function u_i of Eq. (11b).

8.2. TSP examples: specific network construction from the TSP

8.2.1. Network construction from the distributed network TSP

Apply the network construction method of Definition 13 to the distributed network TSP of Definition 5. Then we obtain an example of the particular network for the TSP, which is specified as follows.

Definition 14. A TSP distributed network (ū, ω, φ̄, ψ̄) consists of a dynamical system (φ̄, ψ̄) and a transformation system (ū, ω), which are specified as follows.
(a) The dynamical system (φ̄, ψ̄) is the same as that of Eqs. (42), (44) and (45), or equivalently its particular canonical form (46), in Definition 7.
(b) The transformation system (ū, ω) is the collection of the Xmth transformation systems (ū_Xm, ω_Xm) over all the neurons Xm. Each Xmth transformation system (ū_Xm, ω_Xm) is in delayed mode and is expressed by

V_Xm(t + Δt) = ω_Xm(ū_Xm(t)) ≡ h_Xm(ū_Xm(t)) ≡ (1/2){1 + tanh(ū_Xm(t)/u₀)}  (X, m = 1, …, M),  (87a)
ū_Xm(t) = Σ_{Y,n} v̄_{XmYn}(T_{XmYn}(t)) ḡ_{Yn}(V_{Yn}(t)) − θ̄_Xm(V_Xm(t))
        ≡ Σ_{Y,n} T_{XmYn}(t) V_{Yn}(t) − θ_Xm  (X, m, Y, n = 1, …, M).  (87b)

Here, u₀ ∈ R is taken extremely small. In addition, θ_Xm ∈ R and Δt ≥ 0. □
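Read as an update rule, Eq. (87) is straightforward to program. The sketch below takes only the form of (87a), (87b) from Definition 14; the synapse weights T, the thresholds θ, the gain u₀ and the initial activations are assumed values for illustration.

```python
import numpy as np

M = 3                     # cities; neurons are indexed by pairs (X, m)
u0 = 0.02                 # gain parameter u_0, "taken extremely small"
theta = np.zeros((M, M))  # thresholds theta_Xm (assumed zero)

def transform(V, T, theta, u0):
    """One delayed-mode application of Eq. (87):
    u_Xm = sum_{Y,n} T[Xm,Yn] V[Yn] - theta_Xm,  V_Xm <- (1/2)(1 + tanh(u_Xm / u0))."""
    u = np.einsum('XmYn,Yn->Xm', T, V) - theta        # Eq. (87b)
    return 0.5 * (1.0 + np.tanh(u / u0))              # Eq. (87a)

rng = np.random.default_rng(1)
T = 0.1 * rng.standard_normal((M, M, M, M))           # assumed synapse weights
V = rng.uniform(0.0, 1.0, size=(M, M))                # assumed initial activations

V_next = transform(V, T, theta, u0)
print(np.round(V_next, 3))   # with small u0 the outputs are pushed toward 0 or 1
```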


Note. The transformation function ω_Xm of Eq. (87a) has been adopted straightforwardly from Hopfield's transformation function h_Xm of Eq. (11a).

8.2.2. Network construction from the centralized network TSP

Apply the network construction method of Definition 13 to the centralized network TSP of Definition 4. Then we obtain just the Hopfield network, which is reformulated as follows [24].

Definition 15. The Hopfield network (ū, ω, φ̄, ψ̄) for the centralized network TSP consists of a dynamical system (φ̄, ψ̄) and a transformation system (ū, ω), which are specified as follows.
(a) The dynamical system (φ̄, ψ̄) is the same as that of Eqs. (50) and (51) in Definition 8, which is expressed as the following special canonical form:

dV_Xm/dt = φ̄_Xm(V) ≡ −2V_Xm(1 − V_Xm) ∂L(V)/∂V_Xm  (X, m = 1, …, M),  (88a)
dZ_k/dt = ψ̄_k(V) ≡ 0  (k = 1, …, 2M(M−1)+1).  (88b)

(b) The transformation system (ū, ω) is the same as the transformation system (u, h) of Eq. (11). The Hopfield network is also indicated by (u, h, φ̄, ψ̄).

Note. The Hopfield transformation system (u, h) of Eq. (11) is in turn a special version of Eq. (87) for the distributed network TSP in Definition 14.
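The special canonical form (88a) can be integrated directly in V-space. In the sketch below only the form dV/dt = −2V(1 − V) ∂L/∂V is taken from Eq. (88a); the objective L(V) (a simple row/column penalty), the step size and the initial point are assumptions. The factor 2V(1 − V) keeps the trajectory inside the hypercube while L is driven downhill.

```python
import numpy as np

# Illustrative objective on a 2x2 block of neurons (an assumption, not Eq. (12)/(16)):
# L(V) = 0.5 * sum((row sums - 1)^2) + 0.5 * sum((col sums - 1)^2)
def grad_L(V):
    rows = V.sum(axis=1, keepdims=True) - 1.0
    cols = V.sum(axis=0, keepdims=True) - 1.0
    return rows + cols                        # dL/dV_Xm

rng = np.random.default_rng(2)
V = rng.uniform(0.05, 0.95, size=(2, 2))
dt = 0.05

for _ in range(2000):
    dV = -2.0 * V * (1.0 - V) * grad_L(V)     # Eq. (88a)
    V = V + dt * dV

print(np.round(V, 3))   # V stays inside [0,1] and approaches unit row and column sums
```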

9. Solving the network problem with the particular network

This section demonstrates mathematically rigorously that the particular network of Definition 13 produces solutions to the network problem of Definition 3.

9.1. Solving the network problem with the particular network: a general theorem

Theorem 2 in Section 6.1 is here further elaborated into a parallel-processing network version. The result constitutes a core part of the present paper and is stated as follows.

Theorem 3. Assume that the network problem is given. Then the particular network (ū, ω, φ̄, ψ̄) produces a solution V^S to it. Specifically, the particular network and the point V^S satisfy the following two conditions.
(NET-1) The particular network works consistently. Specifically, it satisfies the following three conditions.
(NET-1a) The dynamical system (φ̄, ψ̄) of Eq. (34) performs the state-changes consistently for all the time t ∈ [0, ∞).
(NET-1b) The transformation system (ū, ω) of Eq. (85) performs the input–output transformations consistently for all the inputs–outputs (ū, V).


(NET-1c) The cooperation between the dynamical system and the transformation system is performed consistently for all the constituent functions (ū, ω, φ̄, ψ̄), the variables (V, T) and the time t ∈ [0, ∞).
(NET-2) The particular network produces at t = ∞ a point (V^S, T^S) that satisfies all three conditions (DS-1)–(DS-3) in Theorem 2.

Proof. Theorem 2 and Proposition 1 have already completed the proof. □

The essential advancement of Theorem 3 over Theorem 2 lies in the condition (NET-1). We thus provide a deeper insight into it here by demonstrating (NET-1) directly, instead of by appeal to Proposition 1. That is, let us confirm that the particular network (ū, ω, φ̄, ψ̄) complies with some Δt-shift form of the compatibility conditions (78) and (79).

First of all, it is ensured by the Cauchy solution-existence theorem that the dynamical system (34) has a unique trajectory (V, T) = (V(t), T(t)) starting from the initial point (V(0), T(0)). We thus substitute this solution (V, T) for all the variables (V, T) in Eqs. (34) and (85).

Let us then construct a natural Δt-shift form of the global compatibility conditions (78). It follows from Eq. (85) that the following condition is satisfied for any bounded, monotonically increasing C²-function ω_i, any real constant θ_i and any non-negative real number Δt ≥ 0:

Σ_j T_ij(t) V_j(t) − θ_i − ω_i^{−1}(V_i(t + Δt)) = 0  (i = 1, …, N).  (89)

This condition (89) is nothing but a specific Δt-shift form of Eq. (78).

Next, let us proceed to construct a natural Δt-shift form of the local compatibility conditions (79). We first substitute the right-hand side of Eq. (85b) for ū_i on the right-hand side of Eq. (85a), and obtain the following equation:

V_i(t + Δt) = ω_i(Σ_j v̄_ij(T_ij(t)) ḡ_j(V_j(t)) − θ̄_i(V_i(t))) = ω_i(Σ_j T_ij(t) V_j(t) − θ_i).  (90)

We then differentiate both sides of Eq. (90) with respect to the time t. This produces the following differential–difference equation system:

dV_i(t + Δt)/dt = (dω_i(ū_i(t))/dū_i)[Σ_j v̄_ij(T_ij(t))(dḡ_j(V_j(t))/dV_j)(dV_j(t)/dt)
                 + Σ_j (dv̄_ij(T_ij(t))/dT_ij)(dT_ij(t)/dt) ḡ_j(V_j(t)) − (dθ̄_i(V_i(t))/dV_i)(dV_i(t)/dt)]
               = (dω_i(ū_i(t))/dū_i)[Σ_j T_ij(t)(dV_j(t)/dt) + Σ_j (dT_ij(t)/dt) V_j(t)].  (91)


Here, dV_i(t + Δt)/dt indicates the derivative of V_i(t) with respect to t, evaluated at t + Δt. Finally, we substitute the right-hand sides of Eq. (34) for dV_i(t + Δt)/dt on the left-hand side and for dV_i(t)/dt and dT_ij(t)/dt on the right-hand side of Eq. (91). We then obtain the following differential–difference equation system:

φ̄_i(V(t + Δt), T(t + Δt)) − (dω_i(ū_i(t))/dū_i)[Σ_j T_ij(t) φ̄_j(V(t), T(t)) + Σ_j ψ̄_ij(V(t), T(t)) V_j(t)] = 0.  (92)

This equation system (92) is nothing but a specific Δt-shift form of Eq. (79). Furthermore, Eq. (92) is satisfied for all t ≥ 0; and it is satisfied for the dynamical system (φ̄, ψ̄) of Eq. (34), for any bounded, monotonically increasing transformation functions ω(ū) ∈ C² of Eq. (85) and for any non-negative real number Δt ≥ 0. Consequently, we have finished demonstrating (NET-1) directly, i.e., confirming that the particular network (ū, ω, φ̄, ψ̄) complies with the natural Δt-shift form (89), (92) of the compatibility conditions (78) and (79).

9.2. TSP examples: solving the network TSP with the particular networks

The essential advancement of Theorem 3 over Theorem 2 becomes much more visible and more easily understandable when we apply it to the TSP and compare the results with Corollaries 1 and 2 to Theorem 2. This section thus shows two TSP examples of Theorem 3: solving the distributed network TSP of Definition 5 with the TSP distributed network of Definition 14; and solving the centralized network TSP of Definition 4 with the Hopfield network of Definition 15.

9.2.1. Solving the distributed network TSP with the TSP distributed network

Let us apply Theorem 3 to the distributed network TSP. Then we have the following Corollary 3.

Corollary 3. Assume that the distributed network TSP is given. Then the TSP distributed network (ū, ω, φ̄, ψ̄) produces a point V^S that is a solution to it and thus a feasible one to the original TSP of Definition 2. Specifically, the TSP distributed network and the point V^S satisfy the following two conditions.
(NET-1) The TSP distributed network works consistently. Specifically, it satisfies the following three conditions.
(NET-1a) The dynamical system (φ̄, ψ̄) of Eq. (46) performs the state-changes consistently for all the time t ∈ [0, ∞).
(NET-1b) The transformation system (ū, ω) of Eq. (87) performs the input–output transformations consistently for all the inputs–outputs (ū, V).
(NET-1c) The cooperation between the dynamical system and the transformation system is performed consistently for all the constituent functions (ū, ω, φ̄, ψ̄), the variables (V, T) and the time t ∈ [0, ∞).


(NET-2) The TSP distributed network produces at t = ∞ a point (V^S, Z^S) ∈ {0,1}^{M²} × (−∞, +∞)^{2M(M−1)+1} that satisfies all three conditions (DS-1)–(DS-3) in Corollary 1. □

The essential advancement of Corollary 3 over Corollary 1 can be shown more visibly with a direct demonstration of the condition (NET-1), instead of the appeal to Proposition 1. We thus confirm that the TSP distributed network (ū, ω, φ̄, ψ̄) of Eqs. (46) and (87) complies with some Δt-shift form of the compatibility conditions (78) and (79).

First of all, it follows from Eq. (80) coupled with Eqs. (18), (19), (46) and (87) that the local compatibility condition (79) is rewritten as

φ̄_Xm − (dω_Xm/dū_Xm)[Σ_{Y,n} T_{XmYn} φ̄_Yn + Σ_{Y,n} Σ_k ψ̄_k V_{Yn}{(∂T_{XmYn}/∂Z_k) + (∂T_{XmYn}/∂C)}] = 0  (X, m, Y, n = 1, …, M).  (93)

This equation system (93) is an indefinite algebraic equation system for the unknown functions (V, Z), where the apparent function T(t) can be expressed through the substantial and independent function Z(t) subject to Eqs. (18) and (19). Hence, there exists at least one solution (V, Z) to Eq. (93). We then substitute that solution (V, Z) for all the variables (V, Z) in Eq. (93) as well as in the transformation system (ū, ω) of Eq. (87).

Let us next proceed to the construction of a specific Δt-shift form of Eqs. (78) and (79). We construct from Eq. (87) a specific Δt-shift form of the global compatibility condition (78) as follows:

Σ_{Y,n} T_{XmYn}(t) V_{Yn}(t) − θ_Xm − ω_Xm^{−1}(V_Xm(t + Δt)) = 0  (X, m, Y, n = 1, …, M).  (94)

This equation system (94) holds for any Δt ≥ 0. In addition, we construct from Eqs. (87) and (93) a specific Δt-shift form of the local compatibility condition (79) as follows:

φ̄_Xm(V(t + Δt), Z(t + Δt)) − (dω_Xm(ū_Xm(t))/dū_Xm)[Σ_{Y,n} T_{XmYn}(t) φ̄_Yn(V(t), Z(t))
  + Σ_{Y,n} Σ_k ψ̄_k(V(t), Z(t)) V_{Yn}(t){(∂T_{XmYn}(t)/∂Z_k) + (∂T_{XmYn}(t)/∂C)}] = 0  (X, m, Y, n = 1, …, M).  (95)

This equation system (95) also holds for any Δt ≥ 0. Consequently, we have finished constructing the specific Δt-shift forms (94) and (95) of the compatibility conditions (78) and (79) for the TSP distributed network (ū, ω, φ̄, ψ̄).


9.2.2. Solving the centralized network TSP with the Hopfield network

Let us apply Theorem 3 to the centralized network TSP. Then we have the following Corollary 4.

Corollary 4. Assume that the centralized network TSP is given. Then the Hopfield network (u, h, φ̄, ψ̄) produces a point V^S that is a solution to it and thus a feasible one to the original TSP of Definition 2 if the positive real constants A*, B* and C* of the TSP constraint function R(V) of Eq. (16) are all selected appropriately. Specifically, the Hopfield network and the point V^S satisfy the following two conditions.
(NET-1) The Hopfield network works consistently. Specifically, it satisfies the following three conditions.
(NET-1a) The dynamical system (φ̄, ψ̄) of Eq. (88) performs the state-changes consistently for all the time t ∈ [0, ∞).
(NET-1b) The transformation system (u, h) of Eq. (11) performs the input–output transformations consistently for all the inputs–outputs (u, V).
(NET-1c) The cooperation between the dynamical system and the transformation system is performed consistently for all the constituent functions (u, h, φ̄, ψ̄), the variables (V, T) and the time t ∈ [0, ∞).
(NET-2) The Hopfield network produces at t = ∞ a point V^S ∈ {0,1}^{M²} that satisfies all three conditions (DS-1)–(DS-3) in Corollary 2.

Similarly to Corollary 3, let us show the essential advancement of Corollary 4 over Corollary 2 more visibly with a direct demonstration of the condition (NET-1). The demonstration is much simpler. We thus construct a specific Δt-shift form of the compatibility conditions (78) and (79) for the Hopfield network (u, h, φ̄, ψ̄) of Eqs. (11) and (88).

Let us first consider the transformation system (11) with no time delay, which is written out as follows:

V_Xm(t) = h_Xm(u_Xm(t)) = (1/2){1 + tanh(u_Xm(t)/u₀)}  (X, m = 1, …, M),  (96a)
u_Xm(t) = Σ_{Y,n} T_{XmYn} V_{Yn}(t) − θ_Xm  (X, m, Y, n = 1, …, M).  (96b)

Next, substitute the right-hand side of Eq. (96a) for V_Xm on both sides of Eq. (88a). Then the specification of the centralized network TSP in Definition 8 helps us to obtain the following equation system for the input function u(t):

du_Xm(t)/dt + (u₀ − 1) u_Xm(t) + (C*M − θ_Xm) = 0  (X, m = 1, …, M).  (97)

This equation system (97), a first-order linear ordinary differential equation system, expresses nothing but the time-evolutionary motion of the Hopfield electronic circuit. One can thus see intuitively that Eq. (97) has solutions. Let us prove this fact mathematically rigorously. Consider the canonical first-order linear ordinary differential equation

du/dt + p(t)u + q(t) = 0.  (98)


Here, u = u(t) denotes an unknown function, while p(t) and q(t) are known continuous functions. Then the general solutions of Eq. (98) are expressed by

u = [c − ∫ q(t)P(t) dt] / P(t),  P(t) ≡ exp(∫ p(t) dt).  (99)

Here, c denotes an arbitrary real constant. Obviously, each equation of Eq. (97) for u_Xm is independent of the others and, besides, takes a special form of Eq. (98). Hence, a solution u_Xm of Eq. (97) does exist and, furthermore, is expressed as a special form of Eq. (99), as follows:

u_Xm = c_Xm exp{−(u₀ − 1)t} − (C*M − θ_Xm)(u₀ − 1)^{−1}  (X, m = 1, …, M).  (100)
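Formula (99) is the usual integrating-factor solution of Eq. (98); it can be confirmed by comparing it against a direct numerical integration. In the sketch below the coefficients p(t) and q(t), the constant c and the integration horizon are assumed for illustration.

```python
import numpy as np

# du/dt + p(t) u + q(t) = 0 with assumed coefficients (for illustration only):
p = lambda t: 1.0 + 0.5 * np.sin(t)
q = lambda t: np.cos(t)
c, t_end, n = 2.0, 5.0, 100000
ts = np.linspace(0.0, t_end, n + 1)
h = ts[1] - ts[0]

# Closed form (99): u = (c - int q P dt) / P, with P(t) = exp(int p dt);
# both integrals are taken from 0 to t and evaluated with the trapezoidal rule.
int_p = np.concatenate(([0.0], np.cumsum((p(ts[1:]) + p(ts[:-1])) * h / 2.0)))
P = np.exp(int_p)
qP = q(ts) * P
int_qP = np.concatenate(([0.0], np.cumsum((qP[1:] + qP[:-1]) * h / 2.0)))
u_closed = (c - int_qP) / P

# Direct numerical integration of Eq. (98) (explicit Euler), same initial value u(0) = c.
u = np.empty_like(ts)
u[0] = c
for i in range(n):
    u[i + 1] = u[i] + h * (-(p(ts[i]) * u[i] + q(ts[i])))

print("max |closed form - Euler| =", np.max(np.abs(u_closed - u)))  # small: the two agree to O(h)
```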

We then substitute this solution u_Xm of Eq. (100) for all the input functions u in Eq. (11), obtaining the corresponding output functions V. Furthermore, we substitute the resultant output functions V for all the functions V in the dynamical system (88). This substitution into Eqs. (11) and (88) finally produces the following specific Δt-shift form of the compatibility conditions (78), (79) for the Hopfield network:

Σ_{Y,n} T_{XmYn} V_{Yn}(t) − θ_Xm − h_Xm^{−1}(V_Xm(t + Δt)) = 0  (X, m, Y, n = 1, …, M),  (101)
φ̄_Xm(V(t + Δt), T(t + Δt)) − (dh_Xm(u_Xm(t))/du_Xm)[Σ_{Y,n} T_{XmYn} φ̄_Yn(V(t), T(t))] = 0  (X, m, Y, n = 1, …, M).  (102)

This concludes the direct demonstration of the condition (NET-1), which well exemplifies the significant advancement of Theorem 3 over Theorem 2 that the present paper has achieved.
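For the constant-coefficient case to which Eq. (97) reduces, the solution has the exponential-plus-constant shape of Eq. (100). A one-line symbolic check is sketched below; the symbols a and b stand for a generic constant coefficient and constant term and are placeholders, not the paper's specific values.

```python
import sympy as sp

t, c, a, b = sp.symbols('t c a b', real=True)

# Candidate solution in the shape of Eq. (100): u(t) = c*exp(-a t) - b/a.
u = c * sp.exp(-a * t) - b / a

# It must satisfy du/dt + a*u + b = 0, the constant-coefficient case of Eq. (98)
# to which each component equation of Eq. (97) reduces.
residual = sp.diff(u, t) + a * u + b
print(sp.simplify(residual))   # prints 0
```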

10. Conclusion

This paper has mathematically developed an efficient, robust network method for solving the continuous optimization problem with constraint equalities in a general setting. The results are based on Takahashi's network [44] and on a generalization of Takahashi's network problem-solving techniques [45,46]. That work aimed at a general extension of the Hopfield method [24], along with the removal of the four Hopfield drawbacks (b)–(e) described in Section 1. The present paper has thus completed that aim.

Let us summarize the specific results. The problem-solving techniques developed consist of a series of mappings and constructions (refer to Fig. 1 again). First, in Theorem 1, we mapped the real-world problem into the network problem. We then constructed from it the dynamical system which, as proved in


Theorem 2, solves it. The essential advancement over conventional dynamical system constructions lies in the construction of the synapse dynamical system, which ensures the satisfaction of the constraint equalities. Finally, we constructed from the network problem the particular network that consists of the dynamical system and the transformation system. It has been proved in Theorem 3, the main theorem, that the particular network solves the network problem. Applying Theorems 2 and 3 to the TSP has produced Corollaries 1–4, which put the Hopfield method in perspective.

The results are mathematically rigorous. We thus can claim that the present paper constitutes a solid foundation of a neural network theory for constrained optimization. It can provide scientists with deeper insights into network problem-solving and serve as a starting point for developing further techniques applicable to more complicated real-world problems. It is also expected that the results of this paper can stimulate engineers to implement the network in devices with high performance such as VLSI.

Finally, we must refer to some future studies. The drawback (a) of the Hopfield method in Section 1 needs to be removed for our method as well. We believe that the idea of a dynamic transformation function can be as powerful as simulated annealing and can thus produce an effective remedy for it. It is also left for future study to extend our method to inequality-constrained problems and to probabilistic networks.

Acknowledgements

The author is grateful to the anonymous reviewers for their valuable suggestions and comments, which helped him to improve the presentation of the results in this paper significantly.

References

[1] E.H.L. Aarts, J. Korst, Simulated Annealing and Boltzmann Machines — A Guide to Neural Networks and Combinatorial Optimization, Wiley, New York, 1989. [2] B. Angeniol, G.L.C. Vaubois, J.Y. Texier, Self-organizing feature maps and the traveling salesman problem, Neural Networks 1 (1988) 289—293. [3] C. Berge, Theorie des graphes et ses applications, Dunod, Paris, 1963. [4] W. Blaschke, Kreis und Kugel, Verlag von Veit, Chelsea, New York, 1949. [5] M. Budinich, A self-organising neural network for the traveling salesman problem that is competitive with simulated annealing, Proc. ICANN, vol. 1, Sorrento, Italy, May 1994, pp. 359—361. [6] A. Cichocki, R. Unbehauen, Neural Networks for Optimization and Signal Processing, Wiley, New York, 1996. [7] E.A. Coddington, N. Levinson, Theory of Ordinary Differential Equations, McGraw-Hill, New York, 1955. [8] M.A. Cohen, S. Grossberg, Absolute stability of global pattern formation and parallel memory storage by competitive neural networks, IEEE Trans. Systems Man Cybernet. 13 (5) (1983) 815—826. [9] R. Durbin, R. Szeliski, A.L. Yuille, An analysis of the elastic net approach to the traveling salesman problem, Neural Comp. 1 (1989) 348—358.


[10] R. Durbin, D. Willshaw, An analogue approach to the traveling salesman problem using an elastic net method, Nature 326 (1987) 689—691. [11] F. Favata, R. Walker, A study of the application of Kohonen-type neural networks to the traveling salesman problem, Biol. Cybernet. 64 (1991) 463—468. [12] D.B. Fogel, An evolutionary approach to the traveling salesman problem, Biol. Cybernet. 60 (1989) 139—144. [13] J. Fort, Solving a combinatorial problem via self-organizing process: an application of the Kohonen algorithm to the travelling salesman problem, Biol. Cybernet. 59 (1988) 33—40. [14] A.H. Gee, S.V.B. Aiyer, R.W. Prager, An analytical framework for optimizing neural networks, Neural Networks 6 (1993) 79—97. [15] A.H. Gee, R.W. Prager, Polyhedral combinatorics and neural networks, Neural Comput. 6 (1994) 161—180. [16] D.E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, AddisonWesley, Reading, MA, 1989. [17] R.M. Golden, A unified framework for connectionist systems, Biol. Cybernet. 59 (1983) 109—120. [18] L.T. Grujic, Exact both construction of Lyapunov function and asymptotic stability domain determination, IEEE Int. Conf. on Systems, Man and Cybernetics, Le Touquet, France, vol. 1, 1993, pp. 331—336. [19] M.W. Hirsch, S. Smale, Differential Equations, Dynamical Systems, and Linear Algebra, Academic Press, New York, 1974. [20] F.L. Hitchcock, The distribution of a product from several sources to numerous localities, J. Math. Phys. 20 (1941) 224—230. [21] A.L. Hodgkin, A.F. Huxley, A quantitative description of membrane current and its application to conduction and excitation in nerve, J. Physiol. 117 (1952) 500—544. [22] J.J. Hopfield, Neural networks and physical systems with emergent collective computational abilities, Proc. Natl. Acad. Sci. USA, vol. 79, 1982, pp. 2554—2558. [23] J.J. Hopfield, Neurons with graded response have collective computational properties like those of two-state neurons, Proc. Natl. Acad. Sci. USA, vol. 81, 1984, pp. 3088—3092. [24] J.J. Hopfield, D. Tank, Neural computation of decisions in optimization problems, Biol. Cybernet. 52 (1985) 141—152. [25] A. Jagota, M. Garzon, On the mappings of optimization problems to neural networks, Proc. of the WCNN, vol. 2, San Diego, June 1994, pp. II-391-II-398. [26] B. Kamgar-Parsi, B. Kamgar-Parsi, On problem solving with Hopfield neural networks, Biol. Cybernet. 62 (1990) 415—423. [27] S. Kirkpatrick, C.D. Gelatt, M.P. Vecchi, Optimization by simulated annealing, Science 220 (1983) 671—680. [28] T. Kohonen, Associative Memory: A System-Theoretical Approach., Springer, Berlin, 1977. [29] D. Konig, Theorie der endlichen und unendlichen Graphen, Akademische Verlag., Leipzig, 1936. [30] T.C. Koopmans, S. Reiter, A model of transporation, in: T.C. Koopmans (Ed.), Activity Analysis of Production and Allocation, Wiley, New York, 1951, pp. 222—259. [31] E.L. Lawler, J.K. Lenstra, H.G. Rinnooy Kan, P.B. Shmoys, The Traveling Salesman Problem, Wiley, New York, 1985. [32] D.G. Luenberger, Introduction to Dynamic Systems: Theory Models, and Applications, Wiley, New York, 1979. [33] D.G. Luenberger, Linear and Nonlinear Programming, Addison-Wesley, Reading, MA, 1984. [34] A.M. Lyapunov, The general problem of the stability of motion, Kharkov Mathematical Society, Kharkov, 1892 (in Russian). English translation: Taylor & Francis, London, 1992. [35] W.S. McCulloch, W.H. Pitts, A logical calculus of the ideas immanent in neural nets, Bull. Math. Biophys. 5 (1943) 115—133. [36] C. 
Mead, Analog VLSI and Neural Systems, Addison-Wesley, Reading, MA, 1989. [37] C. Peterson, B. Soderberg, A new method for mapping optimization problems onto neural networks, Int. J. Neural Systems 1 (1) (1989) 3—22.


[38] G. Pinkas, Symmetric neural networks and propositional logic satisfiability, Neural Comput. 3 (1991) 282—291. [39] G. Pinkas, Reasoning, nonmonotonicity and learning in connectionist networks that capture propositional knowledge, Artificial Intell. 77 (1995) 203—247. [40] G.V. Reklaitis, A.G. Tsirukis, M.F. Tenorio, Generalized Hopfield networks and nonlinear optimization, Proc. of the NIPS, Denver, November 1989, pp. 355—362. [41] D. Rumelhart, J. McClelland, PDP Research Group Parallel Distributed Processing, MIT Press, Cambridge, 1986. [42] L.A. Santalo, La desiguadad isoperimetrica sobre superificies de curvatura constante negativa, Rev. Univ. Tucuman 3 (1942) 243—259. [43] P. Simic, Statistical mechanics as the underlying theory of ‘‘elastic” and ‘‘neural” optimization, NETWORK: Comp. Neural Systems I (1) (1990) 89—103. [44] Y. Takahashi, A unified constructive network model for problem-solving, Theoret. Comput. Sci. 156 (1996) 217—261. [45] Y. Takahashi, Solving optimization problems with variable-constraint by an extended CohenGrossberg model, Theoret. Comput. Sci. 158 (1996) 279—341. [46] Y. Takahashi, Mathematical improvement of the Hopfield model for TSP feasible solutions by synapse dynamical systems, Neurocomputing 15 (1997) 15—43. [47] K. Urahama, Gradient projection network: analog solver for linearly constrained nonlinear programming, Neural Comput. 8 (1996) 1061—1073. [48] M.M. VanHulle, A goal programming network for linear programming, Biol. Cybernet. 65 (1991) 243—252. [49] G.V. Wilson, G.S. Pawley, On the stability of the traveling salesman problem algorithm of Hopfield and tank, Biol. Cybernet. 58 (1988) 63—70. [50] L. Xu, Combinatorial optimization neural nets based on a hybrid of Lagrange and transformation approaches, Proc. WCNN, vol. 2, San Diego, June 1994, pp. II-399-II-404. [51] A.L. Yuille, Generalized deformable models, statistical physics, and matching problems, Neural Comput. 2 (1990) 1—24.

Yoshikane Takahashi received the M.Sc. degree in mathematics from The University of Tokyo, Tokyo, Japan in 1975. He is currently with NTT Information and Communication Systems Laboratories, Kanagawa, Japan. His research fields include communications protocol, fuzzy theory, neural networks, nonmonotonic logic, genetic algorithms, and semantics information theory. Mr. Takahashi was awarded the first Moto-oka Commemorative Award in 1986. He is a member of the Japanese Institute of Electronics, Information and Communication Engineers, and the Information Processing Society of Japan.