A recurrent neural network for solving a class of generalized convex optimization problems

Neural Networks 44 (2013) 78–86

Alireza Hosseini a, Jun Wang b,c,∗, S. Mohammad Hosseini a,1

a Department of Mathematics, Tarbiat Modares University, P.O. Box 14115-175, Tehran, Iran
b Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
c School of Control Science and Engineering, Dalian University of Technology, Dalian, Liaoning, 116023, China

Article history: Received 4 November 2012; received in revised form 12 March 2013; accepted 14 March 2013.

Keywords: Recurrent neural networks; Differential inclusion; Nonsmooth optimization; Generalized convexity; Pseudoconvexity

Abstract: In this paper, we propose a penalty-based recurrent neural network for solving a class of constrained optimization problems with generalized convex objective functions. The model has a simple structure described by a differential inclusion. It is also applicable to any nonsmooth optimization problem with affine equality and convex inequality constraints, provided that the objective function is regular and pseudoconvex on the feasible region of the problem. It is proven herein that the state vector of the proposed neural network globally converges to, and stays thereafter in, the feasible region in finite time, and converges to the optimal solution set of the problem. © 2013 Elsevier Ltd. All rights reserved.

1. Introduction

Constrained optimization arises in numerous scientific, engineering and business applications. In many engineering applications, such as automatic control and robotics, real-time solutions to constrained optimization problems are often needed; see, e.g., Wang, Hu, and Jiang (1999). With some resemblance to human brains, recurrent neural networks emerged as popular parallel computational models for real-time optimization. The first neurodynamic optimization model was introduced by Tank and Hopfield in 1986 for solving linear programming problems. Inspired by the Tank–Hopfield network, many recurrent neural network models were proposed thereafter for solving various constrained optimization problems. For example, for linear programming problems, Kennedy and Chua (1988), Xia (1996) and Xia and Wang (1995) proposed neural network models for problems with various kinds of constraints. For quadratic programming problems, we can refer to Xia, Feng, and Wang (2004) and Liu and Wang (2006). For more general models, we can point to neural network models for solving convex programming problems

∗ Corresponding author at: Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong. Tel.: +852 39438472. E-mail addresses: [email protected] (A. Hosseini), [email protected], [email protected] (J. Wang), [email protected] (S.M. Hosseini). 1 Tel.: +98 21 82883454; fax: +98 21 82883454. 0893-6080/$ – see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.neunet.2013.03.010

and models described for some applications (Wang, 1994; Wang et al., 1999; Xia, Leung, & Wang, 2002; Xia & Wang, 2004 and references therein). In all of these models, the authors have tried to design an ordinary differential equation in which any state vector converges to an element, or an approximation of an element, of the optimal solution set. Most of the designed models are based on well-known optimality equations and conditions: the penalty-based approach (Kennedy & Chua, 1988), primal–dual conditions (Liu & Wang, 2006; Xia, 1996; Xia & Wang, 1995), and various forms of Kuhn–Tucker conditions, such as projection-form conditions (Xia et al., 2002; Xia & Wang, 2004). In the last decade, some neural network models based on the differential inclusion approach have been introduced (see, for example, Bian & Xue, 2009; Forti, Nistri, & Quincampoix, 2004; Hosseini & Hosseini, 2013; Hosseini, Hosseini, & Soleimani-damaneh, 2011; Li, Song, & Wu, 2010; Liu, Guo, & Wang, 2012; Liu & Wang, 2011, 2013). Such networks generalize differential-equation models to the case where the right-hand sides are discontinuous; the underlying theory of differential equations with discontinuous right-hand sides is developed in Aubin and Cellina (1984) and Filippov (1988). Differential inclusion-based neural networks can use discontinuous activation functions as well, but one should make sure that the activation function is upper semicontinuous to guarantee the existence of a solution. Forti et al. (2004) introduced the first differential inclusion-based neural network model for solving nonsmooth convex optimization problems with inequality constraints; generalizing Kennedy and Chua's model, they used a penalty approach to design the neural network. Bian and Xue (2009) extended the Forti


et al. model to solve problems with both inequality and affine equality constraints. The importance of convex functions in optimization is well known. Convex functions come up in many mathematical models used in economics, engineering, and elsewhere. Convexity is also used to obtain sufficiency for conditions that are only necessary, as with the classical Fermat theorem or with the Kuhn–Tucker conditions in nonlinear programming. In microeconomics, convexity plays a fundamental role in general equilibrium theory and in duality theory. For more applications and historical references, see Arrow and Intriligator (1981), Guerraggio and Molho (2004) and Islam and Craven (2005). The convexity of sets and the convexity and concavity of functions have been the object of many studies during the past one hundred years; early contributions to convex analysis were made by Hölder (1889), Jensen (1906), and Minkowski (1910). Although the property of convexity is invariant with respect to certain operations and transformations, convexity often does not appear as a natural property of the functions and domains encountered in such models. Hence, for many problems encountered in economics and engineering, the notion of convexity no longer suffices, and it is necessary to extend it to the notions of pseudoconvexity, quasiconvexity, etc. Generalized convex optimization problems arise in many applications such as portfolio optimization; for example, minimizing the condition number of a set of matrices has recently been used to estimate the covariance matrix for the Markowitz portfolio selection model (see Maréchal & Ye, 2010). Let us consider the following condition number problem: min κ(A)

s.t. A ∈ Ω,  (1)

where Ω is a compact convex subset of S+n, the cone of symmetric positive semidefinite n × n matrices, and κ(A) denotes the condition number of A. If we denote the eigenvalues of A in decreasing order by λ1(A), . . . , λn(A), then the function κ(A) is defined by

κ(A) = λ1(A)/λn(A), if λn(A) > 0;
κ(A) = ∞, if λn(A) = 0 and λ1(A) > 0;  (2)
κ(A) = 0, if A = 0.
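As a quick numerical illustration, the piecewise definition (2) can be evaluated directly from the eigenvalues. The sketch below is our own; the tolerance parameter for treating an eigenvalue as zero is an assumption, not part of the paper.

```python
import numpy as np

def kappa(A, tol=1e-12):
    """Condition number kappa(A) of a symmetric PSD matrix, following the
    piecewise definition in Eq. (2); `tol` is an assumed numerical
    threshold for treating an eigenvalue as zero."""
    lam = np.linalg.eigvalsh(A)        # eigenvalues in ascending order
    lam_max, lam_min = lam[-1], lam[0]
    if lam_max < tol:                  # A = 0
        return 0.0
    if lam_min < tol:                  # lambda_n(A) = 0, lambda_1(A) > 0
        return float("inf")
    return lam_max / lam_min           # lambda_1(A) / lambda_n(A)
```

For instance, κ of the identity matrix is 1, while a singular nonzero matrix has κ = ∞.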

It is proved that the function κ is strongly pseudoconvex on the cone of symmetric positive definite n × n matrices. Designing neural network models for nonconvex optimization problems is very difficult, because a neural network model must retain a parallel structure and cannot rely on the complicated iterative schemes of traditional optimization methods. Some authors have tried to design models for nonconvex problems. A neural network model for solving a special class of nonconvex optimization problems was suggested by Liu and Wang (2011); this model also handles problems with affine inequality and equality constraints. Guo, Liu and Wang designed two further neural network models for solving pseudoconvex optimization problems, one for linear equality constraints (Guo, Liu, & Wang, 2011) and the other for both box constraints and affine equality constraints (Liu et al., 2012); in the second paper, they also employed the model for portfolio optimization. In this paper, we design a neural network for a class of nonsmooth nonconvex optimization problems, assuming only that the objective function is pseudoconvex on the feasible region. This neural network model is more general than Guo, Liu and Wang's networks, because the new model applies to problems with general convex constraints comprising affine equality constraints and general nonsmooth convex inequality constraints. The paper is organized as follows. In Section 2, the problem and some definitions and preliminaries needed in the other sections are given. In Section 3, we introduce the new neural network model and analytically prove the existence of its solution. We also


show that the state vector of the neural network converges to the feasible region in finite time and stays there thereafter. At the end of the section, we prove that any state vector of the neural network converges to an optimal solution of the optimization problem. In Section 4, we delineate the simulation results of the neural network for solving three nonsmooth nonconvex optimization problems to substantiate our theoretical findings.

2. Preliminaries

In this section, we state the problem together with some definitions and preliminary results concerning nonsmooth optimization, set-valued maps and nonsmooth analysis which are needed in the sequel. Let us consider the constrained optimization problem

min f(x)
s.t. x ∈ Ω,  (3)

where Ω = Ω1 ∩ Ω2, Ω1 = {x ∈ ℜn : G(x) ≤ 0}, Ω2 = {x ∈ ℜn : H(x) = 0}, G = (g1, g2, . . . , gm)T : ℜn → ℜm is an m-dimensional vector-valued function of n variables, H = (h1, h2, . . . , hk)T : ℜn → ℜk is a k-dimensional vector-valued affine function of n variables with hj(x) = ajT x + bj, j = 1, 2, . . . , k, where A = (a1, a2, . . . , ak)T is a full row rank matrix. Assume that g(x) = max{gi(x), i = 1, 2, . . . , m} is convex on ℜn and that f is a regular function that is pseudoconvex on Ω. Suppose that Ω∗ is the optimal solution set, Ω1⁰ is the interior of Ω1, ∂Ω1 is the boundary of Ω1, and Ωc is the complement of Ω. We also use B(x0, l) to denote an open ball with center x0 and radius l. Suppose that the following assumptions hold:

Assumption 1. There exists x0 ∈ ℜn such that x0 ∈ Ω1⁰, that is, g(x0) < 0.

Assumption 2. There exist s ∈ Ω and r > 0 such that Ω1 ⊂ B(s, r).

Assume that h(x) = max{|hi(x)|, i = 1, 2, . . . , k}, and that f, the gi's and consequently g are Lipschitz functions on B(s, r). Take l1, l2 and l3 as Lipschitz constants of f, g and h, respectively. We assume that problem (3) is feasible and, in general, nonsmooth, i.e., the functions involved in the problem can be nondifferentiable. We will analyze the existence of solutions of a new neural network and their convergence to optimal solutions of problem (3).

Definition 1 (Upper Semicontinuity; Aubin & Cellina, 1984). Let X and Y be normed spaces. We say that the set-valued map F : X → 2Y, where 2Y is the set of all subsets of Y, is upper semicontinuous (USC) at x0 if, given ϵ > 0, there exists δ > 0 such that F(x0 + δB(0, 1)) ⊆ F(x0) + ϵB(0, 1). We say that F is USC if it is USC at every x0 ∈ X.

Definition 2 (Clarke, 1983). Assume that f is Lipschitz near x. The generalized directional derivative of f at x in the direction v ∈ ℜn is given by

f⁰(x; v) = lim sup_{y→x, t↓0} [f(y + tv) − f(y)]/t,

and Clarke's generalized gradient of f is defined as

∂f(x) = {y ∈ ℜn : f⁰(x; v) ≥ yT v, ∀v ∈ ℜn}.

Definition 3. Let E ⊂ ℜn be a nonempty convex set. A function f : E → ℜ is said to be pseudoconvex on E if, for every pair of


distinct points x, y ∈ E,

∃η ∈ ∂f(x) : ηT(y − x) ≥ 0 ⟹ f(y) ≥ f(x).

f is strictly pseudoconvex if

∃η ∈ ∂f(x) : ηT(y − x) ≥ 0 ⟹ f(y) > f(x),

and f is strongly pseudoconvex if there exists a constant β > 0 such that

∃η ∈ ∂f(x) : ηT(y − x) ≥ 0 ⟹ f(y) ≥ f(x) + β∥y − x∥².
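To see how pseudoconvexity is strictly weaker than convexity, consider a hypothetical one-dimensional example of our own (not from the paper): f(x) = x + x³ has derivative 1 + 3x² > 0, so the implication in Definition 3 holds on all of ℜ, yet f is not convex, since f″(x) = 6x < 0 for x < 0. A randomized check:

```python
import numpy as np

# f(x) = x + x^3: pseudoconvex on R (f'(x) = 1 + 3x^2 > 0) but not convex.
f = lambda x: x + x**3
df = lambda x: 1.0 + 3.0 * x**2      # f is smooth, so the subgradient is {f'(x)}

rng = np.random.default_rng(0)
for _ in range(1000):
    x, y = rng.uniform(-2.0, 2.0, size=2)
    # Definition 3: df(x)*(y - x) >= 0 must imply f(y) >= f(x)
    if df(x) * (y - x) >= 0.0:
        assert f(y) >= f(x)

# nonconvexity: f lies strictly above the chord joining (-1, f(-1)) and (0, f(0))
assert f(-0.5) > (f(-1.0) + f(0.0)) / 2.0
```

The chord test at the midpoint of [−1, 0] fails the convexity inequality, confirming that pseudoconvex functions need not be convex.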

Definition 4 (Aubin & Cellina, 1984). Assume that F : ℜn → 2^{ℜn} is a set-valued map. A vector function x(·) : [0, ∞) → ℜn is called a solution of ẋ(t) ∈ F(x(t)) on [t1, t2] if x(·) is absolutely continuous on [t1, t2] and ẋ(t) ∈ F(x(t)) for a.e. t ∈ [t1, t2].

Proposition 1 (Chain Rule; Forti et al., 2004). If W : ℜn → ℜ is a regular function at x(t) and x(·) : ℜ → ℜn is differentiable at t and Lipschitz near t, then

dW(x(t))/dt = ζT ẋ(t), ∀ζ ∈ ∂W(x(t)).  (4)

3. Main results

In this section we propose a new neural network for solving a class of nonsmooth nonconvex optimization problems. Consider the following differential inclusion:

ẋ(t) ∈ −∂f(x) − θ ∂k(x),  (5)

where θ is a positive parameter, k(x) = h(x) + g+(x), h(x) = max{|hi(x)|, i = 1, 2, . . . , k}, and

g+(x) = 0 if g(x) ≤ 0, and g+(x) = g(x) if g(x) > 0.

Remark 1. Note that from the convexity of g, if 0 ∈ ∂g(x), then x is a global minimizer of g on ℜn. Thus if g(x) ≥ 0, that is, x ∈ Ω1c ∪ ∂Ω1, then 0 ∉ ∂g(x), because otherwise x would be a minimizer of g on ℜn, contradicting g(x0) < g(x).

In Table 1, we compare some related neurodynamic optimization models with the present one.

Remark 2. Compared with the recurrent neural network models for constrained nonsmooth optimization in Liu et al. (2012) and Liu and Wang (2011), the present model has only one design parameter θ instead of two, and it aims at solving nonsmooth optimization problems with general convex inequality constraints as well as affine equality ones.

Proposition 2. For any initial state x0 ∈ ℜn, the neural network model (5) has a local solution x(t).

Proof. From Clarke (1983), the right-hand side of the differential inclusion (5) is an upper semicontinuous set-valued map with nonempty compact convex values. Hence, the local existence of a solution x(t) of (5) on [0, t1] (t1 > 0) with x(0) = x0 is a straightforward consequence of Theorem 1 on p. 77 of Filippov (1988). □

Next, the convergence to the optimal solution set will be proved.

Lemma 1. Consider

D = {x ∈ ℜn : ∥x − s∥ = r},  (6)

the boundary of B(s, r), where s is as in Assumption 2. Then there exists υ > 0 such that for each x ∈ D and η ∈ ∂g+(x), ηT(s − x) < −υ.

Proof. Since D ∩ Ω1 = ∅, obviously ∂g+(x) = ∂g(x) for each x ∈ D. From the convexity of g, ηT(s − x) ≤ g(s) − g(x) for each x ∈ ℜn and η ∈ ∂g(x). On the other hand, if x ∈ D, then g(s) − g(x) < 0. By the compactness of D and the continuity of g, there exists υ > 0 such that g(s) − g(x) < −υ for each x ∈ D, and this means that ηT(s − x) < −υ for each η ∈ ∂g+(x). □

Theorem 1. Let x0 ∈ B(s, r). Any state of the neural network (5) satisfies x(t) ∈ B(s, r) if θ > l1 r/υ, where υ is as defined in Lemma 1.

Proof. Let ρ(x(t)) = ∥x(t) − s∥²/2. Using the chain rule, dρ(x(t))/dt = ẋ(t)T(x(t) − s); that is, for each t there exist ψ(t) ∈ ∂f(x(t)), η(t) ∈ ∂g+(x(t)) and ζ(t) ∈ ∂h(x(t)) such that

dρ(x(t))/dt = −ψ(t)T(x(t) − s) − θ(η(t) + ζ(t))T(x(t) − s).  (7)

On the other hand, by the convexity and nonnegativity of h and because h(s) = 0, for each x ∈ D and ζ ∈ ∂h(x) we have ζT(s − x) ≤ h(s) − h(x) = −h(x) ≤ 0. From (7) and Lemma 1, if x(t) ∈ D, we get

dρ(x(t))/dt < ∥x(t) − s∥ ∥ψ(t)∥ − θυ.  (8)

From this we can show that if θ > l1 r/υ, then x(t) ∈ B(s, r). If this were not the case, then x(t) would leave B(s, r) at some time t1, and consequently dρ(x(t))/dt(t1) > 0, as x(t1) ∈ D. But from (8) and ∥ψ∥ < l1, we have

dρ(x(t))/dt < r l1 − θυ < 0,  (9)

which contradicts dρ(x(t))/dt(t1) > 0. This establishes the theorem. □

Corollary 1. If θ > l1 r/υ, then any solution of the differential inclusion (5) with any initial state x0 ∈ B(s, r) exists globally.

Proof. The conclusion follows from Theorem 1 and Proposition 2. □

Lemma 2. Suppose {φi, i = 1, 2, . . . , m} is a finite collection of functions, each of which is regular and Lipschitz near x. Consider

φ(x) = max{φi(x), i = 1, 2, . . . , m};

then ∂φ(x) = conv{∂φi(x), i ∈ I(x)}, where I(x) = {i ∈ {1, 2, . . . , m} : φi(x) = φ(x)} and "conv" denotes the convex hull of a set.

Corollary 2. If h(x) > 0, then

∂h(x) ⊆ { Σ_{i∈J} αi ai : Σ_{i∈J} |αi| = 1 },  (10)

and if h(x) = 0, then

∂h(x) ⊆ { Σ_{i∈J} [0, 1]αi ai : Σ_{i∈J} |αi| = 1 },  (11)

where J = {i : |hi(x)| = h(x), i = 1, 2, . . . , k}.


Table 1
Comparison of related neural networks in terms of performance and convergence criteria.

Method              | Convergence condition                 | Objective function | Equality constraints | Inequality constraints
Herein              | f(x) is pseudoconvex on Ω             | Nonsmooth          | Linear               | Convex and nonsmooth
Liu et al. (2012)   | f(x) is pseudoconvex on S             | Nonsmooth          | Linear               | Box
Guo et al. (2011)   | f(x) is pseudoconvex on S             | Nonsmooth          | Linear               | –
Liu and Wang (2013) | f(x) is pseudoconvex on S             | Nonsmooth          | Linear               | Box
Forti et al. (2004) | f(x) is convex                        | Nonsmooth          | –                    | Convex and nonsmooth
Xia and Wang (2005) | f(x) is strictly convex on Ω          | Smooth             | Linear               | –
Xia et al. (2008)   | ∇²x L(x, y) is positive semidefinite  | Smooth             | –                    | Convex

Proof. From |hi(x)| = |aiT x + bi|, it is easy to see that if h(x) > 0, then ∂|hi(x)| = {sign(hi(x)) ai} for each i ∈ J, and if h(x) = 0, then ∂|hi(x)| = [−1, 1] ai for each i ∈ J. Consequently, from Lemma 2 we have

∂h(x) = { Σ_{i∈J} αi ∂|hi(x)| : Σ_{i∈J} αi = 1, αi ≥ 0 }.  (12)

Therefore if h(x) > 0, (10) holds, and if h(x) = 0, (11) holds. □
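Corollary 2 can be made concrete with a small numerical sketch; the matrix A and vector b below are made-up data, not from the paper. When h(x) > 0, picking any active index j ∈ J gives sign(hj(x)) aj as one element of ∂h(x).

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0]])            # rows a_i^T (full row rank), assumed data
b = np.array([0.0, -1.0])

def h_and_subgradient(x):
    """h(x) = max_i |a_i^T x + b_i| and one element of its generalized
    gradient, selected from the active index set J of Corollary 2."""
    vals = A @ x + b                   # h_i(x) = a_i^T x + b_i
    h = np.abs(vals).max()
    J = np.flatnonzero(np.isclose(np.abs(vals), h))   # active indices
    j = J[0]                           # any active index works
    return h, np.sign(vals[j]) * A[j]  # alpha_j = 1, all other alphas = 0

h, grad = h_and_subgradient(np.array([0.5, 0.0]))
```

At x = (0.5, 0), the second row is active with h2(x) = −1, so the selected subgradient is −a2; other convex combinations over J would be equally valid.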

Lemma 3. If the set {γ ∈ ∂g+(z) : γ ≠ 0} ∪ {ai : i = 1, . . . , k} is linearly independent for each z ∈ Ω ∩ ∂Ω1, then there exists δ > 0 such that for each x ∈ B(s, r) − Ω and each γ ∈ ∂k(x), ∥γ∥ > δ.

Proof. We prove this lemma in three steps.

Step 1: We prove that there exists δ1 > 0 such that for each x ∈ B(s, r) with g+(x) > 0 and each η ∈ ∂g+(x), we have ∥η∥ > δ1. If y ∈ B(s, r) ∩ (Ω1c ∪ ∂Ω1), then by Assumption 1 and Remark 1, 0 ∉ ∂g(y). Consequently there exists δ1 > 0 such that for each y ∈ B(s, r) ∩ (Ω1c ∪ ∂Ω1) and each η ∈ ∂g(y), we have ∥η∥ > δ1. If this were not the case, there would exist yn ∈ B(s, r) ∩ (Ω1c ∪ ∂Ω1) and ηn ∈ ∂g(yn) such that ∥ηn∥ < 1/n. From the compactness of B(s, r) ∩ (Ω1c ∪ ∂Ω1), the sequence {yn} has a convergent subsequence; for simplicity, suppose {yn} itself converges, yn → ȳ ∈ B(s, r) ∩ (Ω1c ∪ ∂Ω1). Since ∂g is an upper semicontinuous set-valued map with compact convex values, ηn → η̄ ∈ ∂g(ȳ). On the other hand ∥ηn∥ → 0, that is, η̄ = 0, which contradicts 0 ∉ ∂g(ȳ). Now for x ∈ ℜn, if g+(x) > 0, then ∂g+(x) = ∂g(x). Thus if g+(x) > 0 and η ∈ ∂g+(x), then x ∈ B(s, r) − Ω1 and consequently x ∈ B(s, r) ∩ (Ω1c ∪ ∂Ω1); hence ∥η∥ > δ1, and Step 1 is complete.

Step 2: We prove that there exists δ2 > 0 such that for each x ∈ ℜn with h(x) > 0 and each ζ ∈ ∂h(x), we have ∥ζ∥ > δ2. If h(x) > 0, then from Corollary 2,

∂h(x) ⊆ { Σ_{i∈J} αi ai : Σ_{i∈J} |αi| = 1 },  (13)

where J = {i : |hi(x)| = h(x), i = 1, 2, . . . , k}. But A is a full row rank matrix; thus the ai's are linearly independent, and therefore 0 ∉ ∂h(x). Now we prove the existence of δ2. If there were no such δ2, there would exist a sequence {xn} such that for each n, h(xn) > 0 and ∥ζn∥ < 1/n for some ζn ∈ ∂h(xn). Hence, from (13),

ζn = Σ_{i=1}^{k} αi^n ai, with Σ_{i=1}^{k} |αi^n| = 1,  (14)

for some αi^n. Each sequence {αi^n}, i = 1, 2, . . . , k, is bounded. Hence {α1^n} has a convergent subsequence, say {α1^{nτ¹}}. Also {α2^{nτ¹}} has a convergent subsequence, say {α2^{nτ²}} (note that {α2^{nτ²}} is a subsequence of {α2^{nτ¹}}, not of {α2^n}). Similarly, the other subsequences {α3^{nτ³}}, . . . , {αk^{nτᵏ}} are obtained. For simplicity we take nτᵏ = nτ. Then every sequence {αi^{nτ}}, i = 1, . . . , k, is convergent; assume αi^{nτ} → ᾱi as τ → ∞. Then Σ_{i=1}^{k} |ᾱi| = 1 and Σ_{i=1}^{k} ᾱi ai = 0. But this contradicts the linear independence of the ai's.

Step 3: We prove the existence of δ. Suppose x ∈ B(s, r) − Ω and γ ∈ ∂k(x); then

γ = η + ζ, η ∈ ∂g+(x), ζ ∈ ∂h(x).  (15)

Suppose there is no such δ. Then there exist sequences {xn} ⊂ B(s, r), xn ∉ Ω, {ηn}, ηn ∈ ∂g+(xn), and {ζn}, ζn ∈ ∂h(xn), such that

γn = ηn + ζn,  (16)

with ∥γn∥ < 1/n. As xn ∉ Ω, either g+(xn) > 0 or h(xn) > 0, so two cases may happen:

Case I: there exists a convergent subsequence {xnτ} such that g+(xnτ) > 0 for each τ.
Case II: there exists a convergent subsequence {xnτ} such that h(xnτ) > 0 for each τ.

If Case I happens, then ∂g(xnτ) = ∂g+(xnτ). Assume xnτ → x̄. From the properties of the generalized gradient, without loss of generality we can assume that the sequences {ηnτ} and {ζnτ} converge as τ → ∞; suppose ηnτ → η̄ ∈ ∂g+(x̄) and ζnτ → ζ̄ ∈ ∂h(x̄). We know xnτ ∈ B(s, r) ∩ (Ω1c ∪ ∂Ω1). By the compactness of B(s, r) ∩ (Ω1c ∪ ∂Ω1), Assumption 1, Remark 1 and the result of Step 1, we obtain ∥η̄∥ > δ1. From Corollary 2,

ζn ∈ { Σ_{i∈J} [0, 1]αi^n ai : Σ_{i∈J} |αi^n| = 1 }.  (17)

Note that if h(xn) > 0, then equality (14) holds, and if h(xn) = 0, then (17) holds; consequently we can assume that (17) holds for each n. Then exactly one of the following holds:

ζ̄ = 0,  (18)

or

ζ̄ = Σ_{i∈J} β̄i ai,  (19)

in which the β̄i are not all zero. Also,

lim_{τ→∞} γnτ = η̄ + ζ̄ = γ̄ = 0.  (20)

But 0 = γ̄ ∈ ∂k(x̄). From the convexity and nonnegativity of k, k(x̄) = 0. Consequently g+(x̄) = h(x̄) = 0, that is, x̄ ∈ Ω. On the other hand, g+(xnτ) = g(xnτ) > 0, and taking the limit as τ → ∞ gives g(x̄) ≥ 0. Also x̄ ∈ Ω yields g(x̄) ≤ 0; thus g(x̄) = 0, and therefore x̄ ∈ ∂Ω1 ∩ Ω. If (18) holds, then from (20), η̄ = 0, which contradicts ∥η̄∥ > δ1. If (19) holds, then (20) and the linear independence assumption lead to a contradiction. Thus if Case I happens, the existence of δ is guaranteed.


If Case II happens, we again assume xnτ → x̄; from the properties of the generalized gradient, without loss of generality we can assume that the sequences {ηnτ} and {ζnτ} converge as τ → ∞. Suppose ηnτ → η̄ ∈ ∂g+(x̄) and ζnτ → ζ̄ ∈ ∂h(x̄). From h(xnτ) > 0 and the result of Step 2, ∥ζnτ∥ > δ2; thus ∥ζ̄∥ > δ2/2. Obviously (20) holds in this case too; consequently x̄ ∈ Ω. If η̄ = 0, then from (20), ζ̄ = 0, which is a contradiction. If η̄ ≠ 0, then there is some N > 0 such that g(xnτ) ≥ 0 for all τ > N; otherwise there would exist a subsequence {xnτl} of {xnτ} with g(xnτl) < 0 for each l, so that ∂g+(xnτl) = {0}, that is, ηnτl = 0, which means η̄ = 0, again a contradiction. Thus for some N, if τ > N, then g(xnτ) ≥ 0. Consequently, taking the limit as τ → ∞, we have g(x̄) ≥ 0. But from (20), 0 ∈ ∂k(x̄), that is, k(x̄) = 0 and g(x̄) ≤ 0; therefore x̄ ∈ Ω ∩ ∂Ω1. Now (20) again contradicts the linear independence assumption. Therefore if Case II happens, the existence of δ is also guaranteed, and the proof of the lemma is complete. □
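In a concrete instance, the linear-independence hypothesis of Lemma 3 can be checked numerically. The sketch below uses made-up data (the matrix A and the subgradient gamma are our own assumptions): it stacks the rows ai of A together with a nonzero subgradient of g+ at a boundary point and tests the rank of the resulting matrix.

```python
import numpy as np

# Made-up instance: A has full row rank, and gamma stands for a hypothetical
# nonzero element of the generalized gradient of g+ at a point of Ω ∩ ∂Ω1.
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
gamma = np.array([0.0, 0.0, 1.0])

M = np.vstack([A, gamma])
# full rank of the stacked matrix <=> the set {a_1, a_2, gamma} is
# linearly independent, i.e. the hypothesis of Lemma 3 holds at this point
independent = np.linalg.matrix_rank(M) == M.shape[0]
```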

Theorem 2. Assume that the conditions of Lemma 3 hold. If θ > max{l1(l2 + l3)/δ², l1 r/υ} + 1, then any state vector of neural network (5), from any initial state x0 ∈ B(s, r), converges to the feasible region Ω in finite time and stays there thereafter, where υ is as in Lemma 1 and δ is as in Lemma 3.

Proof. From Corollary 1, the solution of the differential inclusion (5) exists globally. Suppose x(t) is a solution trajectory of (5). Using Proposition 1, we have

dk(x(t))/dt = ẋ(t)T γ,  (21)

for each γ ∈ ∂k(x(t)). Consequently there exist ψ(t) ∈ ∂f(x(t)) and γ(t) ∈ ∂k(x(t)) such that

dk(x(t))/dt = (−ψ(t) − θγ(t))T γ(t).  (22)

If x(t) ∉ Ω, that is, k(x(t)) > 0, then from Lemma 3, ∥γ(t)∥ > δ. Also ∥γ(t)∥ < l2 + l3 and ∥ψ(t)∥ < l1; therefore

dk(x(t))/dt < l1(l2 + l3) − θδ² < −δ².  (23)

Integrating (23) from t0 = 0 to s, we have

k(x(s)) < k(x(t0)) − δ²s.  (24)

Therefore if s = k(x(t0))/δ², then k(x(s)) = 0, so any solution trajectory of the differential inclusion (5) converges to Ω in finite time. Now we prove that the trajectory stays in Ω after reaching it. If this were not the case, there would exist s1 > k(x(t0))/δ² at which the state trajectory leaves Ω, and s2 > s1 such that x(t) ∉ Ω for almost all t ∈ (s1, s2). Integrating (23) from s1 to s2, we obtain

∫_{s1}^{s2} [dk(x(t))/dt] dt = k(x(s2)) − k(x(s1)) < −δ²(s2 − s1) < 0.

But k(x(s1)) = 0; hence k(x(s2)) < 0. This contradicts the nonnegativity of k, and the proof of the theorem is complete. □

Lemma 4. The optimal solution set Ω∗ is a convex set.

Proof. Assume that x, y ∈ Ω∗ and let w = αx + (1 − α)y for some 0 < α < 1. Obviously, by the convexity of the feasible region, w ∈ Ω. We now prove that w ∈ Ω∗. If this were not the case, f(w) > f(x) = f(y). By the pseudoconvexity of f we have

γT(x − w) = (1 − α)γT(x − y) < 0, ∀γ ∈ ∂f(w),  (25)

and

γT(y − w) = αγT(y − x) < 0, ∀γ ∈ ∂f(w).  (26)

As α, 1 − α > 0, from (25) γT(x − y) < 0 and from (26) γT(y − x) < 0, which is an obvious contradiction. □

Lemma 5. If f is pseudoconvex on Ω and continuously differentiable on Ω∗, then for each x, x∗ ∈ Ω∗, ∇f(x)T(x − x∗) = 0.

Proof. For x, x∗ ∈ Ω∗, let us define

Q(α) = f(αx∗ + (1 − α)x).  (27)

From Lemma 4, Ω∗ is convex, and hence for each 0 ≤ α ≤ 1, f(αx∗ + (1 − α)x) = f(x) = f(x∗); that is, Q(α) is constant for 0 ≤ α ≤ 1. Consequently dQ(α)/dα = 0 for 0 < α < 1. From Proposition 1 we have

dQ(α)/dα = (x∗ − x)T ∇f(αx∗ + (1 − α)x).  (28)

Assume 0 < αn < 1 and αn → 0; then from (28) we obtain

dQ(αn)/dαn = (x∗ − x)T ∇f(αn x∗ + (1 − αn)x) = 0.  (29)

Taking the limit of (29) as n → ∞, we get ∇f(x)T(x − x∗) = 0. □

Theorem 3. Assume that the conditions of Lemma 3 hold. From any initial state x0 ∈ B(s, r), any state vector of neural network (5) converges to an optimal solution of problem (3) if θ > max{l1(l2 + l3)/δ², l1 r/υ} + 1, in any one of the following cases:
(a) f is pseudoconvex on Ω and continuously differentiable on Ω∗;
(b) f is pseudoconvex on Ω and Ω∗ is a singleton;
(c) f is convex on Ω;
where υ is as in Lemma 1 and δ is as in Lemma 3.

Proof. We prove the theorem in three steps.

Step I: We prove that for each x ∈ Ω, x∗ ∈ Ω∗, γ ∈ ∂f(x), η ∈ ∂g+(x) and ζ ∈ ∂h(x),

−[γ + θ(η + ζ)]T(x − x∗) ≤ 0,  (30)

with equality only if x ∈ Ω∗. Assume x ∈ Ω − Ω∗. From the convexity of g+ and h we have

ηT(x∗ − x) ≤ g+(x∗) − g+(x) = 0,  (31)

and

ζT(x∗ − x) ≤ h(x∗) − h(x) = 0.  (32)

Also, from f(x∗) < f(x) and the pseudoconvexity of f,

γT(x∗ − x) < 0.  (33)

Combining (31)–(33) shows that inequality (30) is strict if x ∉ Ω∗. If x ∈ Ω∗, we show that (30) holds under each of the conditions (a)–(c):
Case (a): From Lemma 5, γT(x∗ − x) = 0; combining this with (31) and (32) yields the result.
Case (b): In this case x = x∗, so γT(x∗ − x) = 0; combining this with (31) and (32) yields the result.
Case (c): By the convexity of f,

γT(x∗ − x) ≤ f(x∗) − f(x) = 0.  (34)

Therefore (30) holds for each x ∈ Ω, and this inequality is strict for x ∉ Ω∗, which completes Step I.


Step II: From Theorem 2, there exists T > 0 such that for each t ≥ T , x(t ) ∈ Ω . Define V (x(t ), x∗ ) = ∥x(t )− x∗ ∥2 /2, for each x∗ ∈ Ω ∗ . In this step we prove that if x(t ) does not converge to an element of Ω ∗ , then there exists a positive constant c such that for each x∗ ∈ Ω ∗ , lim V (x(t ), x∗ ) > c .

(35)

t →∞

From dV (x(t ), x∗ )/dt = x˙ (t )T (x(t )− x∗ ), for each t ≥ T , there exist γ (t ) ∈ ∂ f (x), η(t ) ∈ ∂ g + (x) and ζ (t ) ∈ ∂ h(x) such that dV (x(t ), x∗ ) dt

= −[γ (t ) + θ (η(t ) + ζ (t ))]T (x(t ) − x∗ ).

(36)

From (30), dV (x(t ), x∗ )/dt ≤ 0, for each t ≥ T . From Theorem 1, x(t ) is bounded; therefore V (x(t ), x∗ ) and consequently ∥x(t )− x∗ ∥ is convergent as t → ∞. If there does not exist c > 0 such that (35) holds for each x∗ ∈ Ω ∗ , then there exists sequence {x∗n } ⊂ Ω ∗ such that lim ∥x(t ) − x∗n ∥ < 1/n.

(37)

t →∞

We know that Ω ∗ is compact; thus, there exists a subsequence {x∗nk } of {x∗n } which is convergent. Suppose that x∗nk → x¯ ∈ Ω ∗ , as k → ∞, then

∥x(t ) − x¯ ∥ < ∥x(t ) − x∗nk ∥ + ∥x∗nk − x¯ ∥.

(38)

By limiting (38), as t → ∞, we obtain lim ∥x(t ) − x¯ ∥ <

t →∞

1 nk

.

(39)

Again by limiting (39) as k → ∞, we get limt →∞ ∥x(t ) − x¯ ∥ = 0. Consequently x(t ) → x¯ ∈ Ω ∗ which is a contradiction hence the existence of c is proved. Step III (main step): Let F (x) = −∂ f (x) − θ ∂ k(x). Suppose for a state vector of neural network (5) there does not exist x∗ ∈ Ω ∗ such that x(t ) → x∗ . Define B=



{z ∈ ℜn : ∥z − x∗ ∥ <



c },

By integrating (43) from T to l > T we obtain V (x(l), x∗ ) = V (x(T ), x∗ ) − σ l.

− µT (z − x∗ ) < −σ .

4. Illustrative examples In this section, we discuss the simulation results of the recurrent neural network (5) for solving the optimization problems in three numerical examples. In each example, at first we find a suitable penalty parameter to guarantee convergence to an optimal solution. For this reason, we do the following.

Here c is the same constant as found in Step II. Then Ω − B is a compact set with no intersection with Ω∗; thus, from Step I, for each x∗ ∈ Ω∗, z ∈ Ω − B, and µ ∈ F(z), −µᵀ(z − x∗) < 0. We now prove that there exists σ > 0 such that for each x∗ ∈ Ω∗, z ∈ Ω − B, and µ ∈ F(z),

−µᵀ(z − x∗) < −σ.    (40)

If this is not the case, then there exist zn ∈ Ω − B and µn ∈ F(zn) such that

−µnᵀ(zn − x∗) > −1/n.    (41)

From the compactness of Ω − B, the sequence {zn} has a convergent subsequence; for simplicity, we assume that {zn} itself is such a subsequence. Suppose zn → z̄ for some z̄ ∈ Ω − B. F is an upper semicontinuous set-valued map with compact convex values; thus µn → µ̄ for some µ̄ ∈ F(z̄). Taking the limit of (41) as n → ∞, we get

−µ̄ᵀ(z̄ − x∗) = 0.    (42)

But this contradicts −µ̄ᵀ(z̄ − x∗) < 0. Therefore, (40) holds for each x∗ ∈ Ω∗ and z ∈ Ω − B. From Step II and (35), for each t > T, x(t) ∈ Ω − B. Thus, for an arbitrary x∗ ∈ Ω∗ and each t > T, there exists µ(t) ∈ F(x(t)) such that

dV(x(t), x∗)/dt = ẋ(t)ᵀ(x(t) − x∗) = −µ(t)ᵀ(x(t) − x∗) < −σ.    (43)

Integrating (43) from T to t gives

V(x(t), x∗) ≤ V(x(T), x∗) − σ(t − T),    t > T.    (44)

If l > T + V(x(T), x∗)/σ, then from (44), V(x(l), x∗) < 0, which contradicts V(x(l), x∗) ≥ 0. The proof of the theorem is now complete. □

The penalty parameter θ of the proposed neural network can be determined by the following procedure:
• Find a feasible solution s.
• Find an open ball B(s, r) such that Ω ⊂ B(s, r).
• According to Lemma 1, find υ > 0 such that g(s) − g(x) < −υ.
• Find δ > 0 such that the results of Lemma 3 hold; that is, for each x ∈ B(s, r) − Ω and γ ∈ ∂k(x), ∥γ∥ > δ.
• Find the Lipschitz constants l1, l2 and l3.
• Set θ = max{l1(l2 + l3)/δ², l1 r/υ} + 1.

For each example, it is shown that the solution trajectory of the state vector converges to an optimal solution. The simulation results are further illustrated using boxplots. A boxplot summarizes a set of data measured on an interval scale and is often used in exploratory data analysis: it shows the shape of the distribution, its central value, its variability, and any outliers. The plot consists of the most extreme values in the data set (the maximum and minimum), the lower and upper quartiles, and the median.

Example 1. Consider the following nonsmooth nonconvex optimization problem:

min (x1² + x3)/x2 + |2 + x2| + |x1 + x2 + 1|
s.t. Σ_{i=1}^{3} xi = 0,
     Σ_{i=1}^{3} |xi| ≤ 1,
     x2 ≥ 1/5.    (45)

It can be verified that the objective function is pseudoconvex on the feasible region. In this example, s = (−0.2, 0.2, 0) is a feasible solution. We can take r = 1.5; then Ω1 ⊂ B(s, r). Also g(s) = −0.6, and for each x ∈ D, as defined in (6), we have g(x) ≥ 0.5. Therefore, according to Lemma 1, we can take υ = 1.2. One can also check that the results of Lemma 3 hold for δ = √2, and that we can take l1 = 140, l2 = l3 = √3. Consequently,

l1(l2 + l3)/δ² = 122.5,    l1 r/υ = 2.16.

Therefore, by Theorem 3, for convergence it suffices to take θ ≥ 123. The optimal solution of this problem is (0.3, 0.2, −0.5). The convergence behaviors of the solution trajectories with different initial states are shown in Fig. 1. Fig. 2 depicts a boxplot of the objective function values resulting from 100 random initial states of the neurodynamic model within a fixed time window (t = 0.1). It shows that there is no outlier.
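The last bullet of the procedure above reduces to a one-line computation once the constants s, r, υ, δ and the Lipschitz constants are known. A minimal sketch (the function name is ours), exercised with the constants reported for Examples 2 and 3 below:

```python
def penalty_parameter(l1: float, l2: float, l3: float,
                      delta: float, r: float, upsilon: float) -> float:
    """theta = max{l1*(l2 + l3)/delta^2, l1*r/upsilon} + 1, per the procedure above."""
    return max(l1 * (l2 + l3) / delta ** 2, l1 * r / upsilon) + 1.0

# Constants reported for Example 2: l1 = 3.28, l2 = 45, l3 = 3, delta = 1, r = 6, upsilon = 5
theta2 = penalty_parameter(3.28, 45.0, 3.0, 1.0, 6.0, 5.0)   # max{157.44, 3.94} + 1

# Constants reported for Example 3: l1 = 36, l2 = 2, l3 = 0, delta = 2, r = 2, upsilon = 1
theta3 = penalty_parameter(36.0, 2.0, 0.0, 2.0, 2.0, 1.0)    # max{18, 72} + 1
```

The two intermediate maxima, 157.44 and 72, match the values derived in the text for those examples.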


A. Hosseini et al. / Neural Networks 44 (2013) 78–86

Fig. 1. Transient and convergent behaviors of the state vector of the neural network (5) with θ = 123 and different initial states toward an optimal solution in Example 1.

Fig. 2. A boxplot of 100 objective function values after t = 0.1 resulting from 100 random initial values in Example 1.

Fig. 3. Transient and convergent behaviors of the state vector of the neural network (5) with θ = 158 and different initial states toward an optimal solution in Example 2.
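The five-number summary that a boxplot such as the one in Fig. 2 displays can be computed with the standard library. A small illustration (the sample data are made up):

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Extremes, quartiles and median: the five numbers a boxplot is drawn from.
minimum, maximum = min(data), max(data)
q1, median, q3 = statistics.quantiles(data, n=4)  # lower quartile, median, upper quartile

five_number_summary = (minimum, q1, median, q3, maximum)
```

Points falling farther than 1.5 times the interquartile range (q3 − q1) beyond the quartiles are conventionally flagged as outliers, which is the criterion implied when the text reports "no outlier".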

Example 2. Consider a quadratic fractional optimization problem with equality and inequality constraints as follows:

min f(x) = (xᵀQx + aᵀx + a0)/(cᵀx + c0)
s.t. x1 + x2 − x3 = 3,
     x1 − 2x2 + x4 = 0,
     x1² + 2x2⁴ − x3 ≤ 10,
     −x1 − x2 − x3 + x4 ≤ 5,
     x2² − 2x1x2 + x3 + x4² ≤ 10,
     x1² + x2² + x3² ≤ 5,    (46)

where

Q = [ −1, 0.5, 1, 0; 0.5, 5.5, −1, −0.5; 1, −1, 1, 0; 0, −0.5, 0, 0 ],
a = (1, −1, −1, 1)ᵀ, a0 = −2, c = (1, 1, 1, −1)ᵀ, c0 = 6.

Fig. 4. A boxplot of 100 objective function values after t = 0.5 × 10⁻³ resulting from 100 random initial values in Example 2.
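The problem data of Example 2 can be checked numerically. The sketch below (helper names are ours, and it assumes the matrix and vectors are read off the statement above correctly) evaluates the fractional objective and the two equality constraints at the feasible point s = (1, 1, −1, 1) used in the text:

```python
# Problem data of Example 2 (as read from the statement above).
Q = [[-1.0, 0.5, 1.0, 0.0],
     [0.5, 5.5, -1.0, -0.5],
     [1.0, -1.0, 1.0, 0.0],
     [0.0, -0.5, 0.0, 0.0]]
a, a0 = [1.0, -1.0, -1.0, 1.0], -2.0
c, c0 = [1.0, 1.0, 1.0, -1.0], 6.0

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def f(x):
    """Quadratic fractional objective f(x) = (x'Qx + a'x + a0) / (c'x + c0)."""
    xQx = sum(x[i] * Q[i][j] * x[j] for i in range(4) for j in range(4))
    return (xQx + dot(a, x) + a0) / (dot(c, x) + c0)

s = [1.0, 1.0, -1.0, 1.0]                  # feasible point used in the text
residual_eq1 = s[0] + s[1] - s[2] - 3.0    # x1 + x2 - x3 = 3
residual_eq2 = s[0] - 2.0 * s[1] + s[3]    # x1 - 2x2 + x4 = 0
value_at_s = f(s)                          # (5.5 + 2 - 2) / 6 = 5.5/6
```

Both equality residuals vanish at s, and Q is symmetric, as the example requires.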

It is easy to see that the objective function is pseudoconvex on the feasible region (Liu et al., 2012, Example 3). In this example, s = (1, 1, −1, 1) is a feasible solution. We can take r = 6; then Ω1 ⊂ B(s, r). Also g(s) = −5, and for each x ∈ D, as defined in (6), we have g(x) ≥ 0. Therefore, according to Lemma 1, we can take υ = 5. One can also check that the results of Lemma 3 hold for δ = 1, and that we can take l1 = 3.28, l2 = 45 and l3 = 3. Consequently,

l1(l2 + l3)/δ² = 157.44,    l1 r/υ = 3.94.

Therefore, by Theorem 3, for convergence it suffices to take θ ≥ 158. Fig. 3 shows that the state vector x(t) converges to the optimal solution (2.0598, −0.4858, −1.4260, −3.0313). Fig. 4 depicts another boxplot of the objective function values resulting from 100 random initial states of the neurodynamic model within a fixed time window (t = 0.5 × 10⁻³). It shows that there is no outlier either.

Example 3. Consider the problem of minimizing the condition number of a nonzero matrix:

min κ(S)
s.t. S ∈ Λ,    (47)

where S is a symmetric matrix and Λ is a compact convex set in ℜⁿˣⁿ. If λmax(S) and λmin(S) are the maximum and minimum eigenvalues of S, respectively, then the condition number κ(S) is defined as

κ(S) = λmax(S)/λmin(S).

Consider the diagonal matrix

S = [ aᵀx + a0, 0; 0, cᵀx + c0 ],

where x = (x1, x2, x3, x4)ᵀ ∈ ℜ⁴, a = (−2, −1, 2, 0), c = (1, −1, 2, 1), a0 = 4, c0 = 2, and

Λ = {x ∈ ℜ⁴ : x1² + x2² ≤ 1, x3² + x4² ≤ 1 and x ≥ 0}.

The objective function κ(S) is pseudoconvex in x on Λ (Chen, Womersley, & Ye, 2011; Maréchal & Ye, 2010) and can be written as

κ(S) = (aᵀx + a0)/(cᵀx + c0)  if aᵀx + a0 ≥ cᵀx + c0,
κ(S) = (cᵀx + c0)/(aᵀx + a0)  if cᵀx + c0 ≥ aᵀx + a0.

We use neural network (5) to solve this problem. In this example, s = (0, 0, 0, 0) is a feasible solution. We can take r = 2; then Ω1 ⊂ B(s, r). Also g(s) = −1, and for each x ∈ D, as defined in (6), we have g(x) ≥ 0. Therefore, according to Lemma 1, we can take υ = 1. One can also check that the results of Lemma 3 hold for δ = 2, and that we can take l1 = 36, l2 = 2 and l3 = 0. Consequently,

l1(l2 + l3)/δ² = 18,    l1 r/υ = 72.

Therefore, by Theorem 3, for convergence it suffices to take θ ≥ 72. Convergence of the condition number to the optimal value κ(S) = 1 is shown in Fig. 5, where several initial states were randomly selected. Fig. 6 depicts another boxplot of the objective function values corresponding to 100 random initial states of the neurodynamic model within a fixed time window of two seconds. Again, there is no outlier.

Fig. 5. Transient and convergent behaviors of the minimal condition number computed by the recurrent neural network (5) with θ = 72 and different initial states in Example 3.

Fig. 6. A boxplot of 100 objective function values after t = 2 resulting from 100 random initial values in Example 3.
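The piecewise expression for κ(S) above is straightforward to evaluate directly. A small sketch (the sample points are our own illustration, not from the paper):

```python
def kappa(x):
    """Condition number of S = diag(a.x + a0, c.x + c0) from Example 3,
    written in the piecewise fractional form given above."""
    a, a0 = (-2.0, -1.0, 2.0, 0.0), 4.0
    c, c0 = (1.0, -1.0, 2.0, 1.0), 2.0
    p = sum(ai * xi for ai, xi in zip(a, x)) + a0   # a^T x + a0
    q = sum(ci * xi for ci, xi in zip(c, x)) + c0   # c^T x + c0
    return p / q if p >= q else q / p

# At the feasible point s = (0, 0, 0, 0): kappa = a0/c0 = 4/2 = 2.
# Where a^T x + a0 = c^T x + c0, the two eigenvalues coincide and kappa = 1;
# x = (2/3, 0, 0, 0) is such a point and lies in Lambda (both entries equal 8/3).
```

This confirms that the optimal value κ(S) = 1 reported in the text is attainable on Λ: it is reached wherever the two diagonal entries of S agree.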

5. Conclusions

This paper presents a novel neurodynamic model for nonsmooth optimization, described by a differential inclusion. The global convergence of its state vector to the optimal solution set is theoretically proved, provided that the objective function of the nonsmooth optimization problem is regular and pseudoconvex on the feasible region. The new model is more general than existing neural network models for constrained pseudoconvex optimization, because more general inequality constraints are considered herein. Besides simulation results on numerical examples, results for minimizing the condition number over a constrained set of matrices are also delineated. Future work will aim at designing more effective neurodynamic models for solving a general class of nonconvex optimization problems.

Acknowledgments

J. Wang's work was supported by the Research Grants Council of the Hong Kong Special Administrative Region, China, under Grants CUHK416811E and CUHK416812E, and by the National Natural Science Foundation of China under Grant 61273307. The third author would like to thank the School of Control Science and Engineering, Dalian University of Technology, for its support during the research visit of the first author there.

References

Arrow, K. J., & Intriligator, M. D. (1981). Handbook of mathematical economics. Amsterdam: North-Holland.
Aubin, J. P., & Cellina, A. (1984). Differential inclusions: set-valued maps and viability theory. Berlin, Germany: Springer.
Bian, W., & Xue, X. (2009). Subgradient-based neural networks for nonsmooth optimization problems. IEEE Transactions on Neural Networks, 20, 1024–1038.
Chen, X., Womersley, R., & Ye, J. (2011). Minimizing the condition number of a Gram matrix. SIAM Journal on Optimization, 21(1), 127–148.
Clarke, F. H. (1983). Optimization and nonsmooth analysis. New York: Wiley-Interscience.
Filippov, A. F. (1988). Differential equations with discontinuous right-hand sides. Mathematics and its applications (Soviet series). Boston, MA: Kluwer.
Forti, M., Nistri, P., & Quincampoix, M. (2004). Generalized neural network for nonsmooth nonlinear programming problems. IEEE Transactions on Circuits and Systems I: Regular Papers, 51, 1741–1754.
Guerraggio, A., & Molho, E. (2004). The origin of quasi-concavity: a development between mathematics and economics. Historia Mathematica, 30, 62–75.
Guo, Z., Liu, Q., & Wang, J. (2011). A one-layer recurrent neural network for pseudoconvex optimization subject to linear equality constraints. IEEE Transactions on Neural Networks, 22(12), 1892–1900.
Hölder, O. (1889). Über einen Mittelwertsatz. Nachr. Ges. Wiss. Göttingen.
Hosseini, A., & Hosseini, S. M. (2013). A new steepest descent differential inclusion-based method for solving general nonsmooth convex optimization problems. Journal of Optimization Theory and Applications. http://dx.doi.org/10.1007/s10957-012-0258-4.


Hosseini, A., Hosseini, S. M., & Soleimani-damaneh, M. (2011). A differential inclusion-based approach for solving nonsmooth convex optimization problems. Optimization. http://dx.doi.org/10.1080/02331934.2011.613993.
Islam, S. M. N., & Craven, B. D. (2005). Some extensions of nonconvex economic modeling: invexity, quasimax and new stability conditions. Journal of Optimization Theory and Applications, 125, 315–330.
Jensen, J. L. W. (1906). Sur les fonctions convexes et les inégalités entre les valeurs moyennes. Acta Mathematica, 30, 175–193.
Kennedy, M., & Chua, L. (1988). Neural networks for nonlinear programming. IEEE Transactions on Circuits and Systems, 35(5), 554–562.
Li, G., Song, S., & Wu, C. (2010). Generalized gradient projection neural networks for nonsmooth optimization problems. Science in China Series F: Information Sciences, 53, 990–1004.
Liu, Q., Guo, Z., & Wang, J. (2012). A one-layer recurrent neural network for constrained pseudoconvex optimization and its application for portfolio optimization. Neural Networks, 26(1), 99–109.
Liu, S., & Wang, J. (2006). A simplified dual neural network for quadratic programming with its KWTA application. IEEE Transactions on Neural Networks, 17, 1500–1510.
Liu, Q., & Wang, J. (2011). A one-layer recurrent neural network for constrained nonsmooth optimization. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 40(5), 1323–1333.
Liu, Q., & Wang, J. (2013). A one-layer projection neural network for nonsmooth optimization subject to linear equalities and bound constraints. IEEE Transactions on Neural Networks, 24(5), 812–824.
Maréchal, P., & Ye, J. (2010). Optimizing condition numbers. SIAM Journal on Optimization, 20(2), 935–947.
Minkowski, H. (1910). Geometrie der Zahlen. Leipzig: Teubner.

Tank, D. W., & Hopfield, J. J. (1986). Simple neural optimization networks: an A/D converter, signal decision circuit, and a linear programming circuit. IEEE Transactions on Circuits and Systems, 33(5), 533–541.
Wang, J. (1994). A deterministic annealing neural network for convex programming. Neural Networks, 7(4), 629–641.
Wang, J., Hu, Q., & Jiang, D. (1999). A Lagrangian network for kinematic control of redundant robot manipulators. IEEE Transactions on Neural Networks, 10, 1123–1132.
Xia, Y. (1996). A new neural network for solving linear programming problems and its application. IEEE Transactions on Neural Networks, 7, 525–529.
Xia, Y., Feng, G., & Wang, J. (2004). A recurrent neural network with exponential convergence for solving convex quadratic program and linear piecewise equations. Neural Networks, 17(7), 1003–1015.
Xia, Y., Feng, G., & Wang, J. (2008). A novel recurrent neural network for solving nonlinear optimization problems with inequality constraints. IEEE Transactions on Neural Networks, 19(8), 1340–1353.
Xia, Y., Leung, H., & Wang, J. (2002). A projection neural network and its application to constrained optimization problems. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, 49, 442–457.
Xia, Y., & Wang, J. (1995). Neural network for solving linear programming problems with bound variables. IEEE Transactions on Neural Networks, 6, 515–519.
Xia, Y., & Wang, J. (2004). A general projection neural network for solving monotone variational inequalities and related optimization problems. IEEE Transactions on Neural Networks, 15, 318–328.
Xia, Y., & Wang, J. (2005). A recurrent neural network for solving nonlinear convex programs subject to linear constraints. IEEE Transactions on Neural Networks, 16(2), 379–386.