A Novel Neural Network for Solving Convex Quadratic Programming Problems Subject to Equality and Inequality Constraints

Xinjian Huang a,b,∗, Baotong Cui a,b

a Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, China
b School of IoT Engineering, Jiangnan University, Wuxi 214122, China
∗ Corresponding author: Xinjian Huang.
Abstract

This paper proposes a neural network model for solving convex quadratic programming (CQP) problems, whose equilibrium points coincide with the Karush-Kuhn-Tucker (KKT) points of the CQP problem. Using an equality transformation and the Fischer-Burmeister (FB) function, we construct the neural network model and present the KKT condition for the CQP problem. Compared with existing neural networks for solving such problems, the proposed neural network model has fewer variables and neurons, which makes circuit realization easier. Moreover, the proposed neural network is asymptotically stable in the sense of Lyapunov and converges to an exact optimal solution of the CQP problem. The simulation results show that the proposed network is feasible and efficient.

Keywords: Neural Network, Convex Quadratic Programming, NCP Function, Stability.
1. Introduction

Quadratic programming problems have attracted much attention in recent years due to their wide applications in science and engineering, including regression analysis [1], robot control [2], signal processing [3], image fusion [4], filter design and pattern recognition [5]. In many real-time applications, the optimization problems have a time-varying nature and have to be solved in real time. However, most conventional methods, such as Lagrange methods [5], interior-point methods [5], descent methods [5] and penalty function methods [6], might not be efficient enough on digital computers, since the computing time required for a solution depends largely on the dimension and structure of the optimization problem and on the complexity of the algorithm used. One promising approach for handling these optimization problems is to employ an artificial neural network methodology based on circuit implementation [7]. The neural network methodology arises in a wide variety of applications, including pattern recognition [5], system identification and control [8], character recognition [9], image compression [10], model predictive control [11] and stock market prediction [12]. In the past two years, the neural network methodology has also been extensively investigated for solving problems in electromagnetic theory [13], combustion theory [14], nanotechnology [15], plasma physics [16], thin film flow [17], fluid mechanics [18] and magnetohydrodynamics [19], which has extended the application fields of neural networks. The main advantage of the neural network approach to optimization is that the dynamic solution procedure is inherently parallel and distributed. Therefore, the neural network approach can solve optimization problems orders of magnitude faster than the most popular optimization algorithms executed on general-purpose digital computers. In addition, a neural network for solving optimization problems is hardware-implementable, that is to say, it can be realized with integrated circuits.

The neural network approach to programming problems was first proposed by Tank and Hopfield [20]. Since then, different neural networks for solving different kinds of programming problems have been extensively studied and many results have been obtained. In [21], based on the gradient method and the penalty function method, Kennedy and Chua propose a neural network for solving nonlinear programming problems. To avoid penalty parameters, Rodriguez-Vazquez et al. [22] propose a switched-capacitor neural network for solving a class of optimization problems. Applying Lagrange multiplier theory, Wu and Tam [23] propose a Lagrange network for solving quadratic programming problems, and Effati and Baymani [24] propose a Lagrange network for solving convex nonlinear programming problems. Huang [25] proposes a novel method to deal with inequality constraints in Lagrangian neural networks by redefining the Lagrange multipliers as quadratic functions; the method can solve some nonlinear programming and quadratic programming problems. In recent years, more neural networks for linear programming, bilinear programming, nonlinear bilevel programming, quadratic programming and convex programming have been presented. For instance, using an NCP function, Effati and Nazemi [26] propose a recurrent neural network for solving linear and quadratic programming problems, which is proved to be stable in the sense of Lyapunov and globally convergent to an exact optimal solution of the programming problem. Effati et al. [27] present a projection neural network for solving bilinear programming problems; in that work, the bilinear programming problems and the
mixed-integer bilinear programming problems are reformulated as linear complementarity problems, which can be solved by projection neural networks. In [28], Sha et al. propose a delayed projection neural network for solving quadratic programming problems. Based on the steepest descent method, Nazemi and Effati [29] present a gradient model for dealing with CQP problems; the main idea is to convert the CQP problem into an equivalent unconstrained minimization problem with an objective energy function. Liu et al. [30] propose a delayed neural network for solving a class of linear projection equations and some quadratic programming problems; using Lyapunov-Krasovskii theory and the linear matrix inequality (LMI) approach, the proposed neural network is proved to be globally asymptotically stable and globally exponentially stable. In [31], the authors propose a neural network for dealing with the nonlinear bilevel programming problem, which is an NP-hard problem; in this model, the initial point has a sensitive influence on the transient behavior of the neural network, so the initial point must satisfy a constrained condition. Based on projection theory, Liu and Wang [32] propose a projection neural network for constrained quadratic minimax optimization. A neural network model with equality constraints is presented in [33]; the influence of the parameter k on the convergence rate is also analysed there.

This paper proposes a new neural network for solving CQP problems based on the KKT optimality conditions and an NCP function, and the existence of the solution of the neural network is also discussed. The proposed network is reliable and simple in structure. Moreover, this neural network is proved to be globally stable in the sense of Lyapunov and can obtain an exact optimal solution of the original optimization problem.

The remainder of the paper is organized as follows. In Section 2, we describe the problem to be investigated and give some lemmas. In Section 3, a novel neural network is formulated for solving CQP problems. The convergence and stability of the proposed network are analyzed in Section 4. In Section 5, simulation results are given to substantiate the theoretical arguments. Some conclusions are drawn in Section 6.

Throughout this paper, R^n denotes the space of n-dimensional real column vectors and R^n_+ denotes the space of n-dimensional positive real column vectors, while R_+ denotes the set of positive real numbers. In what follows, ‖·‖_2 denotes the l2 norm on R^n and the superscript T denotes the transpose. For a differentiable function Γ : R^n → R, ∇Γ ∈ R^n stands for its gradient.
2. Problem Formulation

Consider a CQP problem of the following form:

    min  f(x) = (1/2) x^T W x + c^T x                                (1)
    s.t. Ax = b,                                                     (2)
         Bx ≤ d,                                                     (3)

where x = (x_1, x_2, . . . , x_n)^T ∈ R^n, W ∈ R^{n×n} is a real symmetric positive matrix, c ∈ R^n, A ∈ R^{m×n} with rank(A) = m (0 < m < n), B ∈ R^{p×n}, b ∈ R^m and d ∈ R^p.

A vector x is called a feasible solution to the CQP problem if x satisfies constraints (2) and (3). The collection of all feasible points is called the feasibility set, denoted by Ω_0 := {x ∈ R^n | Ax − b = 0, Bx − d ≤ 0}. A feasible point x* is said to be a local optimal solution to the CQP problem if there exists an open neighborhood S ⊂ Ω_0 of x* such that for any x ∈ S,

    f(x) ≥ f(x*).                                                    (4)

x* is strict if (4) holds strictly whenever x ≠ x*, and x* is the global optimal solution to the CQP problem if inequality (4) holds for any x ∈ Ω_0. For any differentiable mapping Φ = (Φ_1, . . . , Φ_m)^T : R^n → R^m, ∇Φ = [∇Φ_1(x), . . . , ∇Φ_m(x)]^T ∈ R^{m×n} denotes the transposed Jacobian of Φ at x.

Definition 2.1 ([34]). x* is said to be an equilibrium point of ẋ = f(t, x) if f(t, x*) ≡ 0 for all t ≥ 0.

Definition 2.2 ([34]). Let D ⊆ R^n be an open neighborhood of x*. A continuously differentiable function L : R^n → R is said to be a Lyapunov function at the state x* for a system ẋ = f(t, x) if

    L(x*) = 0,  L(x) > 0,  ∀x ∈ D \ {x*},
    dL(x(t))/dt = [∇L(x(t))]^T f(t, x) ≤ 0,  ∀x ∈ D.

Definition 2.3 ([34]). Let x(t) be a solution trajectory of a system ẋ = f(t, x), and let X* denote the set of equilibrium points of this equation. The solution trajectory is said to be globally convergent to the set X* if x(t) satisfies

    lim_{t→∞} dist(x(t), X*) = 0,

where dist(x(t), X*) = inf_{y∈X*} ‖x(t) − y‖. In particular, if the set X* has only one point x*, then lim_{t→∞} x(t) = x*, and the system ẋ = f(t, x) is said to be globally asymptotically stable at x* if the system is also stable at x* in the sense of Lyapunov.

Lemma 2.1 ([26]). (i) An isolated equilibrium point x* of a system ẋ = f(t, x) is Lyapunov stable if there exists a Lyapunov function over some neighborhood D of x*. (ii) An isolated equilibrium point x* of a system ẋ = f(t, x) is asymptotically stable if there exists a Lyapunov function over some neighborhood D of x* such that dL(x(t))/dt < 0, ∀x ∈ D \ {x*}.

Lemma 2.2 ([6]). x* ∈ R^n is the optimal solution of (1)-(3) if and only if there exist y* ∈ R^p and u* ∈ R^m satisfying the following KKT conditions:

    W x* + c + A^T u* + B^T y* = 0,                                  (5)
    Ax* = b,                                                         (6)
    y* ≥ 0,  d − Bx* ≥ 0,  y*^T (d − Bx*) = 0.                       (7)

x* is called a KKT point of (1)-(3), and the pair (y*, u*)^T is called the Lagrangian multiplier vector corresponding to x*.
Lemma 2.3 ([6]). If W is a positive definite matrix, then x∗ is the optimal solution of (1)-(3) if and only if x∗ is a KKT point of (1)-(3).
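As a concrete illustration of Lemma 2.2 and Lemma 2.3, the KKT conditions (5)-(7) can be checked numerically for a candidate triple (x*, u*, y*). The following minimal NumPy sketch is ours and not part of the original paper; the function name and tolerance are arbitrary choices.

    import numpy as np

    def is_kkt_point(W, c, A, b, B, d, x, u, y, tol=1e-8):
        """Numerically check the KKT conditions (5)-(7) of problem (1)-(3)."""
        stationarity = W @ x + c + A.T @ u + B.T @ y      # left-hand side of (5)
        primal_eq = A @ x - b                             # equality constraint, (6)
        slack = d - B @ x                                 # d - Bx >= 0, part of (7)
        complementarity = y @ slack                       # y^T (d - Bx) = 0, part of (7)
        return (np.linalg.norm(stationarity) <= tol
                and np.linalg.norm(primal_eq) <= tol
                and np.all(slack >= -tol)
                and np.all(y >= -tol)                     # y >= 0, part of (7)
                and abs(complementarity) <= tol)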
Lemma 2.4 ([35]). If A is an n × n nonsingular matrix, then the homogeneous equation Ax = 0 has only the trivial solution x = 0.

Lemma 2.5 ([36]). (LaSalle Invariant Set Theorem) Consider the system ẋ = f(t, x) with f continuous, and let V(x) be a scalar function with continuous first partial derivatives. Assume that: (i) for some l > 0, the region Ω_l defined by V(x) < l is bounded; (ii) V̇(x) ≤ 0 for all x in Ω_l. Let M be the largest invariant set contained in the set of all points within Ω_l where V̇(x) = 0. Then every solution x(t) originating in Ω_l tends to M as t → ∞.
3. Neural Network Formulation
In this section, we consider the quadratic programming problem (1)-(3), present a corresponding neural network with fewer state variables and a simple structure, and then show the advantages of the new neural network model compared with existing ones. First, we state the following results for (1)-(3).

Definition 3.1 ([37, 38]). A function φ : R^n_+ × R^n_+ → R^n is called a nonlinear complementarity problem (NCP) function if it satisfies

    φ(a, b) = 0  ⟺  a ≥ 0,  b ≥ 0,  a ◦ b = 0,                       (8)

where ◦ denotes the Hadamard product.

A popular NCP function is the FB function, which is defined as φ_FB(a, b) = a + b − √(a² + b²), where a ∈ R^n_+, b ∈ R^n_+, a² = a ◦ a and b² = b ◦ b.
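Since the FB function acts componentwise through the Hadamard product, it is straightforward to evaluate. The following small NumPy sketch is our own illustration (the name phi_fb and the test values are arbitrary) of the equivalence stated in Definition 3.1.

    import numpy as np

    def phi_fb(a, b):
        """Componentwise Fischer-Burmeister function: a + b - sqrt(a^2 + b^2)."""
        a = np.asarray(a, dtype=float)
        b = np.asarray(b, dtype=float)
        return a + b - np.sqrt(a * a + b * b)

    # phi_FB(a, b) = 0 exactly when a >= 0, b >= 0 and a o b = 0 componentwise:
    print(phi_fb([0.0, 2.0], [3.0, 0.0]))   # [0. 0.]      -- complementary pair
    print(phi_fb([1.0], [1.0]))             # [0.5857...]  -- complementarity violated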
Theorem 3.1. x* is the optimal solution of (1)-(3) if and only if there exists y* ≥ 0 such that (x*, y*)^T satisfies

    G(W x* + c + B^T y*) + Q(Ax* − b) = 0,                           (9)
    φ_FB(d − Bx*, y*) = 0,                                           (10)

where G = I − A^T (AA^T)^{−1} A, Q = A^T (AA^T)^{−1}, and I is an identity matrix of proper dimension.

Proof: If x* is the optimal solution of problem (1)-(3), then, according to Lemma 2.2, there exists (x*, u*, y*)^T ∈ R^n × R^m × R^p with y* ≥ 0 such that equations (5)-(7) are satisfied. First, based on the proof of Lemma 1 in [39], the solutions of (5) and (6) are equivalent to the solutions of (9). Next, we prove that the solutions of (10) and of (7) are the same, that is, y* ≥ 0, d − Bx* ≥ 0 and y*^T (d − Bx*) = 0 if and only if φ_FB(d − Bx*, y*) = 0; this follows directly from Definition 3.1. This completes the proof.
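For completeness, the matrices G and Q of Theorem 3.1 can be formed directly from A. The sketch below is our own illustration (for large or ill-conditioned A A^T one would solve linear systems instead of forming the explicit inverse).

    import numpy as np

    def projection_matrices(A):
        """Return G = I - A^T (A A^T)^{-1} A and Q = A^T (A A^T)^{-1} for a full row rank A."""
        n = A.shape[1]
        AAT_inv = np.linalg.inv(A @ A.T)   # invertible because rank(A) = m
        Q = A.T @ AAT_inv
        G = np.eye(n) - Q @ A
        return G, Q

    # Sanity check on the equality-constraint matrix of Example 1 in Section 5:
    A = np.array([[3.0, -3.0, -2.0,  1.0],
                  [4.0,  1.0, -1.0, -2.0]])
    G, Q = projection_matrices(A)
    print(np.allclose(G @ A.T, 0.0))      # G annihilates the rows of A: True
    print(np.allclose(A @ Q, np.eye(2)))  # A Q = I: True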
For solving problem (1)-(3), the aim is to construct a continuous-time dynamical system that realizes the reformulated KKT conditions (9) and (10). Based on Theorem 3.1, we propose a neural network model for solving problem (1)-(3), with its dynamical equations given by

    dx/dt = −G(W x + c + B^T y) − Q(Ax − b),                         (11)
    dy/dt = −φ_FB(d − Bx, y),                                        (12)

with initial point z_0 = (x_0, y_0)^T. For convenience, we denote

    η(z) = ( G(W x + c + B^T y) + Q(Ax − b),  φ_FB(d − Bx, y) )^T.   (13)

Thus the neural network (11)-(12) can be written as

    dz(t)/dt = −k η(z),                                              (14)
    z(t_0) = z_0,                                                    (15)

where k ∈ R_+ is a scale parameter, which influences the convergence rate of the proposed neural network (14) and (15). An indication of how the neural network (14) and (15) can be implemented in hardware is shown in Fig. 1.

Figure 1: A simplified block diagram for the neural network (14) and (15). [The diagram consists of gain blocks W, A, B, B^T, G and Q, a φ_FB block, summing junctions fed by b, c and d, and integrators with gain k whose outputs are the states x and y.]

Remark 1. The proposed neural network (14) and (15) is based on the equivalent KKT conditions (9) and (10), in which the KKT conditions are applied to a nonlinear programming problem. Furthermore, the matrix W in (1) is a real symmetric positive matrix. Thus the proposed neural network (14) and (15) cannot be used for solving linear programming (LP) problems.
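The dynamics (14) and (15) can be simulated with any standard ODE integrator. The forward-Euler sketch below is our own illustration (step size, iteration limit and stopping rule are arbitrary choices) and reuses the helpers phi_fb and projection_matrices sketched earlier; it is a numerical stand-in for, not a description of, the circuit of Fig. 1.

    import numpy as np

    def solve_cqp_network(W, c, A, b, B, d, x0, y0, k=1.0, dt=1e-3, steps=200000):
        """Forward-Euler simulation of the neural network (14)-(15)."""
        G, Q = projection_matrices(A)
        x = np.asarray(x0, dtype=float).copy()
        y = np.asarray(y0, dtype=float).copy()
        for _ in range(steps):
            dx = -(G @ (W @ x + c + B.T @ y) + Q @ (A @ x - b))       # right-hand side of (11)
            dy = -phi_fb(d - B @ x, y)                                # right-hand side of (12)
            x += k * dt * dx
            y += k * dt * dy
            if max(np.linalg.norm(dx), np.linalg.norm(dy)) < 1e-8:    # near an equilibrium of (14)
                break
        return x, y

Note that a larger k simply rescales time in (14), which is consistent with the faster convergence observed for larger k in the simulations of Section 5.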
In order to see how well the presented neural network (14) and (15) can be applied to solve problem (1)-(3), we compare it with several existing neural network models. Kennedy and Chua [21] propose a neural network for solving convex nonlinear programming problems with penalty parameters. Used for solving (1)-(3), this model has the form

    dx/dt = −(W x + c + λ((B^T (Bx − d))^+ + A^T (Ax − b))),         (16)

where λ is a penalty parameter and

    (Bx − d)^+ = ([b_1 x − d_1]^+, [b_2 x − d_2]^+, . . . , [b_p x − d_p]^+)^T,
    [b_j x − d_j]^+ = max(b_j x − d_j, 0),  j = 1, 2, . . . , p.

The model (16) cannot find an exact optimal solution for a finite λ and is hard to implement when λ is large. Thus, this network only converges to an approximate solution of (1)-(3) for any initial point and any given finite penalty parameter. Comparatively, the proposed neural network model (14) and (15) globally converges to an exact optimal solution of (1)-(3).

In [26], the authors propose the following neural network for solving problem (1)-(3):

    dx(t)/dt = −k(W x + c + A^T u + B^T y),
    du(t)/dt = k(−y + (y + Bx − d)^+),                               (17)
    dy(t)/dt = k(Ax − b),

where x ∈ R^n, u ∈ R^m and y ∈ R^p are three state variables. The neural network (17) has three state variables and n + m + p neurons, whereas the model (14) and (15) proposed in this article has only two state variables and n + p neurons. Thus the model (14) and (15) is much simpler to implement with integrated circuits.

In [40], a gradient neural network model is used for solving problem (1)-(3). Utilizing the penalty method, the constrained optimization problem (1)-(3) can be approximated by the following unconstrained minimization problem:

    minimize  E(z) = (1/2) ‖ϕ(z)‖²,

where z = (x, u, y)^T ∈ R^{n+m+p} and

    ϕ(z) = ( W x + c + A^T u + B^T y,  b − Ax,  φ_FB(y, d − Bx) )^T.

The gradient neural network model for solving problem (1)-(3) is then given by

    dz(t)/dt = −k ∇E(z(t)),                                          (18)
    z(0) = z_0.                                                      (19)

It is noticeable that the advantage of the gradient neural network model is that it can be obtained directly from the derivatives of the minimization function. Its disadvantage is that the gradient neural network model (18) and (19) only converges to an approximate solution of problem (1)-(3). Moreover, the circuit implementation of the neural network model (18) and (19) is complex due to the calculation of the derivative ∇ϕ(z). To see this more clearly, a comparison of the proposed neural network with the above models and three additional models is provided in Table 1.

Table 1: Comparison of related neural networks in terms of the number of neurons for (1)-(3).

    model            condition on f     number of neurons
    model in [33]    strictly convex    n + m + p
    model in [41]    strictly convex    3n + m
    model in [42]    convex on R^n      n + m
    (17)             strictly convex    n + m + p
    (18)-(19)        strictly convex    n + m + p
    (14)-(15)        strictly convex    n + p

Remark 2. From Table 1, we can see that the proposed neural network model (14)-(15) has the fewest neurons, that is to say, it makes circuit realization easier.

4. Stability and Convergence Analysis

In this section, we discuss the stability and convergence properties of the proposed neural network (14)-(15). We first show that the Jacobian matrix ∇η(z) of the mapping defined in (13) is nonsingular. To show this, the following lemmas are needed.

Lemma 4.1 ([43]). Let f : R^n_+ → R^n_+ be defined by f(x) = √x for all x ∈ R^n_+. Then the Jacobian matrix of f at x = (x_1, x_2)^T ∈ R^n_+, x ≠ 0, is given by

    ∇f(x) = L_w^{−1} = (1/(2√x_1)) I,                                                     if x_2 = 0,

    ∇f(x) = L_w^{−1} = [ b               c x_2^T/‖x_2‖
                         c x_2/‖x_2‖     a I + (b − a) x_2 x_2^T/‖x_2‖² ],                 if x_2 ≠ 0,

where w = √x, a = (√ρ_2 − √ρ_1)/(ρ_2 − ρ_1), b = (1/4)(1/√ρ_2 + 1/√ρ_1), c = (1/4)(1/√ρ_2 − 1/√ρ_1) with ρ_i = x_1 + (−1)^i ‖x_2‖, i = 1, 2, and ∇f(x) is positive definite for all x ∈ R^n_+.

Lemma 4.2 ([44]). For all (x, y)^T ∈ R^n_+ × R^n_+ with x² + y² ∈ R^n_+, φ_FB(x, y) is continuously differentiable at (x, y)^T, and V_i ∈ ∇φ_FB(x, y), i = 1, 2, . . . , n, has the following property:

    V_i = ( I − L_w^{−1} L_x,  I − L_w^{−1} L_y ),

where w = √z, z = x² + y², I is an identity matrix, and

    L_x = [ x_1   x_2^T
            x_2   x_1 I_{n−1} ]   for x = (x_1, x_2)^T ∈ R × R^{n−1},

    L_y = [ y_1   y_2^T
            y_2   y_1 I_{n−1} ]   for y = (y_1, y_2)^T ∈ R × R^{n−1}.

Lemma 4.3 ([45]). Let x ∈ R^n_+, y ∈ R^n_+ and w = √(x² + y²). Then we have

    (L_w − L_x)(L_w − L_y) > 0,   (L_w − L_x) > 0,   (L_w − L_y) > 0.

Theorem 4.1. The Jacobian matrix ∇η(z) of the mapping η defined in (13) is nonsingular if and only if the matrix GWG + GA^T Q^T is positive semidefinite.

Proof: By Lemma 4.1 and Lemma 4.2, the Jacobian matrix of η(z) has the following structure:

    ∇η(z) = [ GW + QA                                                      G B^T
              (−diag(∇_{(Bx−d)_k} φ_FB((d − Bx)_k, y_k))_{k=1}^{p}) B      diag(∇_{y_k} φ_FB((d − Bx)_k, y_k))_{k=1}^{p} ]

          = [ GW + QA                                                      G B^T
              (−diag(I + L_{w_{mi}}^{−1} L_{(d−Bx)_{mi}})_{i=1}^{p}) B     diag(I − L_{w_{mi}}^{−1} L_{y_{mi}})_{i=1}^{p} ]          (20)

          = [ GW + QA                            G B^T
              −(I + L_w^{−1} L_{(d−Bx)}) B       I − L_w^{−1} L_y ],                                                                  (21)

where w_{mi} = √((d − Bx)_{mi}² + y_{mi}²), i = 1, 2, . . . , p, L_w = diag(L_{w_{m1}}, L_{w_{m2}}, . . . , L_{w_{mp}}), L_{(d−Bx)} = diag(L_{(d−Bx)_{m1}}, L_{(d−Bx)_{m2}}, . . . , L_{(d−Bx)_{mp}}) and L_y = diag(L_{y_{m1}}, L_{y_{m2}}, . . . , L_{y_{mp}}). Then ∇η(z) is nonsingular if and only if the matrix

    ξ(z) = [ GW + QA                       G B^T
             −(L_w + L_{(d−Bx)}) B         L_w − L_y ]

is nonsingular. Next, we prove this by contradiction. Assume ζ = (ζ_1, ζ_2)^T ∈ R^n × R^p, ζ ≠ 0, and ξ(z)ζ = 0, that is,

    (WG + A^T Q^T) ζ_1 − B^T (L_w + L_{(d−Bx)}) ζ_2 = 0,             (22)
    B G ζ_1 + (L_w − L_y) ζ_2 = 0.                                   (23)

From (23), we have

    ζ_2^T (L_w + L_{(d−Bx)}) B G ζ_1 + ζ_2^T (L_w + L_{(d−Bx)})(L_w − L_y) ζ_2 = 0.      (24)

From (24) and Lemma 4.3, we have ζ_1^T G B^T (L_w + L_{(d−Bx)}) ζ_2 < 0. Premultiplying (22) by ζ_1^T G yields

    ζ_1^T G (WG + A^T Q^T) ζ_1 − ζ_1^T G B^T (L_w + L_{(d−Bx)}) ζ_2 = 0.

So we have ζ_1^T (GWG + GA^T Q^T) ζ_1 < 0, which shows that the matrix GWG + GA^T Q^T is negative definite and contradicts the assumption that GWG + GA^T Q^T is positive semidefinite; hence ζ_1 = 0. By (23), we then get (L_w − L_y) ζ_2 = 0, i.e.

    ζ_2^T (L_w − L_y) ζ_2 = 0.                                       (25)

From Lemma 4.3 and (25), we get ζ_2 = 0. Thus the Jacobian matrix of η is nonsingular. This completes the proof.
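The positive semidefiniteness condition of Theorem 4.1, which is re-checked for each example in Section 5, can be tested numerically via the eigenvalues of the symmetric part of GWG + G A^T Q^T. This sketch is our own and reuses projection_matrices from Section 3.

    import numpy as np

    def check_theorem_41(W, A, tol=1e-9):
        """Check whether M = G W G + G A^T Q^T is positive semidefinite."""
        G, Q = projection_matrices(A)
        M = G @ W @ G + G @ A.T @ Q.T
        eigvals = np.linalg.eigvalsh((M + M.T) / 2.0)   # eigenvalues of the symmetric part
        return bool(np.all(eigvals >= -tol)), eigvals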
We now investigate the relationship between the equilibrium points of (14) and (15) and the solutions of problem (1)-(3).

Theorem 4.2. Let (x*, y*, u*)^T ∈ R^n × R^p × R^m satisfy the KKT equations (5)-(7). Then z* = (x*, y*)^T ∈ R^n × R^p is an equilibrium point of the neural network (14) and (15). Conversely, if z* = (x*, y*)^T is an equilibrium point of the neural network (14) and (15) and the Jacobian matrix ∇η(z) in (21) is nonsingular, then x* is a KKT point of problem (1)-(3).

Proof: Suppose that (x*, u*, y*)^T satisfies the KKT system (5)-(7). Consider the following Lyapunov function:

    E(z) = (1/2) ‖η(z)‖².                                            (26)

From Lemma 2.2, we get that x* is the optimal solution of (1)-(3). From Theorem 3.1, we have η(z*) = 0. Furthermore,

    ∇E(z*) = ∇η(z*)^T η(z*),                                         (27)

where ∇η(z) is the Jacobian matrix of η(z) and is nonsingular, as proved in Theorem 4.1. By (27), we get ∇E(z*) = 0. Hence z* is an equilibrium point of the neural network (14) and (15). Conversely, from Lemma 2.4 and (27) we know that η(z*) = 0, since ∇η(z*) is nonsingular. From Theorem 3.1, we then get that x* is the optimal solution of problem (1)-(3). This completes the proof.

Now we state the main results of this section.

Theorem 4.3. The equilibrium point of the proposed neural network (14) and (15) is unique.

Proof: Since problem (1)-(3) has the unique optimal solution x*, the reformulated KKT conditions (9) and (10) also have the unique solution (x*, y*)^T. Moreover, from Theorem 4.2, the reformulated KKT conditions (9) and (10) have the same solution as the proposed neural network (14) and (15). Thus the equilibrium point of the network (14) and (15) is unique.

Theorem 4.4. Let z* = (x*, y*)^T be an isolated equilibrium point of the proposed neural network model (14) and (15). Then z* is asymptotically stable in the sense of Lyapunov, where x* is the optimal solution of (1)-(3).

Proof: It is clear that E(z) ≥ 0 and E(z*) = 0. Moreover, since z* is an isolated equilibrium point of (14) and (15), there exists a neighborhood Ω* ⊆ R^n × R^p of z* such that ∇E(z*) = 0 and ∇E(z) ≠ 0 for all z ∈ Ω* \ {z*}. We claim that E(z) > 0 for any z ∈ Ω* \ {z*}. Otherwise, if there exists z̄ ∈ Ω* \ {z*} satisfying E(z̄) = 0, then by (14) and (27) we have ∇E(z̄) = 0, i.e., z̄ is also an equilibrium point of (14) and (15), which contradicts the assumption that z* is an isolated equilibrium point in Ω*. In addition, from (26) we have

    dE(z)/dt = [∇E(z(t))]^T dz(t)/dt = −k ‖∇E(z(t))‖² ≤ 0,           (28)

and

    dE(z(t))/dt < 0,  ∀z(t) ∈ Ω* \ {z*}.                             (29)

Then by Lemma 2.1 (ii), we conclude that z* is asymptotically stable in the sense of Lyapunov. This completes the proof.
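In simulations, the Lyapunov function E(z) of (26) provides a convenient convergence monitor: by (28) its value should be nonincreasing along a computed trajectory. A minimal sketch of such a monitor (our own naming, reusing phi_fb and the stacked state z = (x, y)) is given below.

    import numpy as np

    def eta(z, W, c, A, b, B, d, G, Q):
        """The mapping eta(z) of (13) with z = (x, y) stacked into one vector."""
        n = W.shape[0]
        x, y = z[:n], z[n:]
        top = G @ (W @ x + c + B.T @ y) + Q @ (A @ x - b)
        bottom = phi_fb(d - B @ x, y)
        return np.concatenate([top, bottom])

    def energy(z, *problem_data):
        """E(z) = 0.5 * ||eta(z)||^2, the Lyapunov function (26)."""
        return 0.5 * np.linalg.norm(eta(z, *problem_data)) ** 2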
Theorem 4.5. Suppose that z = z(t, z_0) is a trajectory of (14) and (15) with initial point z_0 = z(0, z_0), and that the level set S(z_0) = {z ∈ R^n × R^p : E(z) ≤ E(z_0)} is bounded. Then D(z_0) = {z(t, z_0) | t ≥ 0} is bounded and there exists z̄ such that lim_{t→∞} z(t, z_0) = z̄.

Proof: First, assume that the equilibrium point of the network (14) and (15) is z*, that is, ∇E(z*) = 0. Calculating the derivative of E(z) along the trajectory z(t, z_0), t ≥ 0, yields

    dE(z)/dt = [∇E(z(t))]^T dz(t)/dt = −k ‖∇E(z(t))‖² ≤ 0.           (30)

Thus, along the trajectory z(t, z_0), t ≥ 0, E(z) is monotone nonincreasing. Therefore D(z_0) ⊆ S(z_0), that is, D(z_0) = {z(t, z_0) | t ≥ 0} is bounded. Next, since D(z_0) is a bounded set of points, take a strictly monotone increasing sequence {t̄_n}, 0 ≤ t̄_1 ≤ t̄_2 ≤ · · · ≤ t̄_n ≤ · · ·, t̄_n → ∞; then {z(t̄_n, z_0)} is a bounded sequence composed of infinitely many points. Thus there exists a limit point z̄, that is, there exists a subsequence {t_n} ⊆ {t̄_n}, t_n → ∞, such that

    lim_{n→∞} z(t_n, z_0) = z̄,

where z̄ satisfies

    dE(z(t))/dt = 0,

which indicates that z̄ is an ω-limit point of D(z_0). By Lemma 2.5, it follows that z(t, z_0) → z̄ ∈ N as t → ∞, where N is the largest invariant set in K = {z(t, z_0) | dE(z(t, z_0))/dt = 0}. From (14), (15) and (30), one has

    dE(z(t))/dt = 0  ⟺  dx/dt = 0 and dy/dt = 0,

thus z̄ ∈ Q by N ⊆ K ⊆ Q, where Q denotes the optimal point set of (1)-(3). Therefore, from any initial state z_0, the trajectory z(t, z_0) of (14) and (15) tends to z̄. The proof is complete.

From Theorem 4.4 and Theorem 4.5, we immediately obtain the following corollary.

Corollary 4.1. If Q = {(x, y)^T}, then the neural network (14) and (15) for solving (1)-(3) is globally asymptotically stable and convergent to the unique equilibrium point z = (x, y)^T.
5. Simulations

In this section, we give several examples to illustrate the effectiveness of the proposed neural network (14) and (15).

Example 1 ([39]). Consider the following quadratic programming problem:

    min  f(x) = 3x_1² + 3x_2² + 4x_3² + 5x_4² + 3x_1x_2 + 5x_1x_3 + x_2x_4 − 11x_1 − 5x_4
    s.t. 3x_1 − 3x_2 − 2x_3 + x_4 = 0,
         4x_1 + x_2 − x_3 − 2x_4 = 0,
         −x_1 + x_2 ≤ −1,
         −2 ≤ 3x_1 + x_3 ≤ 4.

With a simple calculation, we obtain that the matrix GWG + GA^T Q^T in Theorem 4.1 is

    GWG + GA^T Q^T = [ 2.0623   0.7855   3.3391   2.8478
                       0.7855   3.1834  −1.6125   3.9689
                       3.3391  −1.6125   8.2907   1.7266
                       2.8478   3.9689   1.7266   6.8166 ],

and the eigenvalues of this matrix are λ_1 = 11.7395, λ_2 = 8.6135, λ_3 = λ_4 = 0. It is easy to verify that this matrix is positive semidefinite, thus the problem can be solved by the proposed network (14) and (15). This problem has the optimal solution x* = (0.5, −0.5, 1.5, 0)^T. Fig. 2 shows that the trajectories of the proposed neural network for this problem with k = 1 and 10 random initial points converge to the optimal solution. The l2 norm error between z(t) and z* for different values of k and the initial point z_0 = (1, −1, 1, −1, 1, −1, 1)^T is shown in Fig. 3; it shows that a larger k leads to a better convergence rate.

Figure 2: Transient behaviors x_i (i = 1, 2, 3, 4) of the neural network (14) and (15) with 10 random initial points in Example 1.

Figure 3: Convergence behaviors of ‖z(t) − z*‖_2 in Example 1 with z_0 = (1, −1, 1, −1, 1, −1, 1)^T.
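For reproducibility, the data of Example 1 can be assembled as below and passed to the sketches given earlier (check_theorem_41, solve_cqp_network). The listing is our own and only restates the problem data, with the double-sided constraint −2 ≤ 3x_1 + x_3 ≤ 4 split into two rows of B.

    import numpy as np

    # f(x) = (1/2) x^T W x + c^T x with the coefficients read off the objective of Example 1
    W = np.array([[6.0, 3.0, 5.0,  0.0],
                  [3.0, 6.0, 0.0,  1.0],
                  [5.0, 0.0, 8.0,  0.0],
                  [0.0, 1.0, 0.0, 10.0]])
    c = np.array([-11.0, 0.0, 0.0, -5.0])
    A = np.array([[3.0, -3.0, -2.0,  1.0],
                  [4.0,  1.0, -1.0, -2.0]])
    b = np.zeros(2)
    B = np.array([[-1.0, 1.0,  0.0, 0.0],    # -x1 + x2 <= -1
                  [ 3.0, 0.0,  1.0, 0.0],    #  3x1 + x3 <=  4
                  [-3.0, 0.0, -1.0, 0.0]])   # -3x1 - x3 <=  2
    d = np.array([-1.0, 4.0, 2.0])

    print(check_theorem_41(W, A)[0])         # True, consistent with the eigenvalues reported above
    x_opt, _ = solve_cqp_network(W, c, A, b, B, d, x0=np.ones(4), y0=np.zeros(3))
    print(np.round(x_opt, 3))                # should approach x* = (0.5, -0.5, 1.5, 0)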
Example 2 ([28]). Consider the following quadratic program with equality and inequality constraints:

    min  f(x) = 0.4x_1² + 0.3x_2² − 0.1x_1x_2 − 0.2x_1 − 0.4x_2 + 0.7x_3
    s.t. x_1 − x_2 + x_3 = 5,
         0.9x_1 + 0.2x_2 − 0.2x_3 ≤ 4,
         0.2x_1 + 0.7x_2 − 0.1x_3 ≤ 10.

It is easy to verify that the matrix GWG + GA^T Q^T in Theorem 4.1 is also positive semidefinite. From the proposed network (14) and (15), we obtain the equilibrium point x* = (1.8051, −0.3192, 3.5957)^T of this problem. Fig. 4 shows the phase diagram of the state variables (x_1(t), x_2(t), x_3(t))^T with 12 different initial points. Moreover, Fig. 5 shows the l2 norm error between z(t) and z* for different values of the parameter k and the initial point z_0 = (1/2, −1/2, 1/2, −1/2, 1/2, −1/2, 1/2)^T, from which we can conclude that the convergence rate of z(t) grows in direct proportion to the parameter k.

Figure 4: Phase diagram of the neural network (14) and (15) with 12 different initial points in Example 2.

Figure 5: Convergence behaviors of ‖z(t) − z*‖_2 in Example 2 with z_0 = (1/2, −1/2, 1/2, −1/2, 1/2, −1/2, 1/2)^T.

Example 3 ([33]). Consider the following quadratic program with inequality constraints:

    min  f(x) = 10x_1² + 2x_2² + 2x_3² − 2x_1x_2 − 6x_1x_3 + 2x_2x_3
    s.t. −1 ≤ −x_1 + x_2 ≤ 0,
         −1 ≤ −3x_1 + x_3 ≤ 1,
          1 ≤ x_2 + x_3 ≤ 2.

This is a quadratic program with inequality constraints only, where A = 0, b = 0 and c = 0. Then we get G = I and Q = 0. It is easy to see that the matrix GWG + GA^T Q^T = W is positive definite, and the neural network (14) and (15) is effective for this problem. The simulation results show that the output trajectory x(t) of the proposed model converges to the optimal solution (1/4, 1/4, 3/4)^T of this problem with 10 random initial points, as shown in Fig. 6. Furthermore, Fig. 7 shows the l2 norm error between z(t) and z* for different values of the parameter k. We can see that a larger k leads to a better convergence rate.

Figure 6: Transient behaviors x_i (i = 1, 2, 3) of the neural network (14) and (15) with 10 random initial points in Example 3.

Figure 7: Convergence behaviors of ‖z(t) − z*‖_2 in Example 3 with z_0 = (−0.1, 0.1, −0.1, 0.1, −0.1, 0.1, −0.1, 0.1, −0.1)^T.
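Each double-sided constraint of Example 3 contributes two rows to B in the standard form Bx ≤ d, so p = 6, which matches the nine-dimensional initial vector z_0 = (x_0, y_0) used for Fig. 7 (n + p = 3 + 6). A sketch of this conversion (our own) is given below.

    import numpy as np

    W = np.array([[20.0, -2.0, -6.0],
                  [-2.0,  4.0,  2.0],
                  [-6.0,  2.0,  4.0]])      # from f(x) = (1/2) x^T W x with no linear term
    B = np.array([[-1.0,  1.0,  0.0],       # -x1 + x2 <= 0
                  [ 1.0, -1.0,  0.0],       #  x1 - x2 <= 1
                  [-3.0,  0.0,  1.0],       # -3x1 + x3 <= 1
                  [ 3.0,  0.0, -1.0],       #  3x1 - x3 <= 1
                  [ 0.0,  1.0,  1.0],       #  x2 + x3 <= 2
                  [ 0.0, -1.0, -1.0]])      # -x2 - x3 <= -1
    d = np.array([0.0, 1.0, 1.0, 1.0, 2.0, -1.0])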
Example 5 ([47]). Let us consider a constrained least-squares approximation problem with following properties: Find the pa2 3 rameters of the combination of exponential and polynomial funcThis is a quadratic program with inequality constraints, where tions y(x) = a4 e x + a3 x3 + a2 x2 + a1 x1 + a0 , which fits the data given in Table.2 and subjects to the constraints 8.1 ≤ y(1.3) ≤ A = 0, b = 0, and c = 0. Then we get G = I and Q = 0. It 8.3, 3.4 ≤ y(2.8) ≤ 3.5, and 2.25 ≤ y(4.2) ≤ 2.26. is easy to know that the matrix GWG + GAT QT = W is positive definite, and the neural network (14) and (15) to solve this x 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 problem is effective. All simulation results show that the outy 7.6 7.2 7.9 8 6.2 6.2 3 0.8 1.2 5.8 put trajectory x(t) of the proposed model converges to an optimal solution ( 41 , 14 , 34 )T of this problem with 10 random initial Table 2: Approximation data for Example 5. points, which is shown in Fig.6. Futhermore, Fig.7 show the ∗ convergence rate that the l2 normal error between z(t) and z This problem can be formulated as: with different convergence rate parameter k. We can get that a larger k leads to a better convergence rate. min Ex − e2 Example 4 ([46]). Consider the following CQP problem in (1)(3) with inequality constraints with the following properties:
s.t. U x ∈ Ω,
⎡ ⎤ 1 0 ... 0 0 0⎥⎥ ⎢⎢⎢ 4 ⎥ ⎢⎢⎢−2 4 1 0 . . . 0 0⎥⎥⎥⎥⎥ ⎢⎢⎢ ⎢⎢⎢ 0 −2 4 1 0 . . . 0⎥⎥⎥⎥⎥ ⎢⎢⎢ .. .. . . . ⎥⎥⎥ .. .. W = ⎢⎢⎢⎢⎢ ... . . . .. ⎥⎥⎥⎥ , . . ⎥⎥ ⎢⎢⎢ ⎢⎢⎢ 0 . . . 0 −2 4 1 0⎥⎥⎥⎥ ⎥ ⎢⎢⎢ 0 . . . 0 −2 4 1⎥⎥⎥⎥ ⎢⎢⎣ 0 ⎦ 0 0 0 . . . 0 −2 4 T c = −1 −1 −1 . . . −1 −1 ,
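The banded matrix W of Example 4 (4 on the main diagonal, 1 on the first superdiagonal, −2 on the first subdiagonal) and the accompanying data can be generated for any size n; the sketch below is our own, under our reading of the displayed pattern.

    import numpy as np

    def example4_data(n):
        """Build the n x n banded W, c = -1, and B = -W, d = c of Example 4."""
        W = (4.0 * np.eye(n)
             + 1.0 * np.eye(n, k=1)      # first superdiagonal
             - 2.0 * np.eye(n, k=-1))    # first subdiagonal
        c = -np.ones(n)
        return W, c, -W, c               # W, c, B = -W, d = c

    W50, c50, B50, d50 = example4_data(50)    # the 50 x 50 case shown in Fig. 8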
Example 5 ([47]). Let us consider a constrained least-squares approximation problem with the following properties: find the parameters of the combination of exponential and polynomial functions y(x) = a_4 e^x + a_3 x³ + a_2 x² + a_1 x + a_0 which fits the data given in Table 2 and is subject to the constraints 8.1 ≤ y(1.3) ≤ 8.3, 3.4 ≤ y(2.8) ≤ 3.5 and 2.25 ≤ y(4.2) ≤ 2.26.

Table 2: Approximation data for Example 5.

    x   0     0.5   1     1.5   2     2.5   3     3.5   4     4.5
    y   7.6   7.2   7.9   8     6.2   6.2   3     0.8   1.2   5.8

This problem can be formulated as

    min  ‖E x − e‖²
    s.t. U x ∈ Ω,

where x = (x_1, x_2, x_3, x_4, x_5)^T = (a_4, a_3, a_2, a_1, a_0)^T,

    E = [ 1   1.649   2.718   4.482   7.389   12.183   20.086   33.116   54.598   90.017
          0   0.125   1       3.375   8       15.625   27       42.875   64       91.125
          0   0.25    1       2.25    4       6.25     9        12.25    16       20.25
          0   0.5     1       1.5     2       2.5      3        3.5      4        4.5
          1   1       1       1       1       1        1        1        1        1      ]^T,

    e = (7.6, 7.2, 7.9, 8, 6.2, 6.2, 3, 0.8, 1.2, 5.8)^T,

    U = [  3.669   2.197    1.69   1.3   1
          16.445  21.952    7.84   2.8   1
          66.686  74.088   17.64   4.2   1 ],

and Ω = {x ∈ R³ : 8.1 ≤ x_1 ≤ 8.3, 3.4 ≤ x_2 ≤ 3.5, 2.25 ≤ x_3 ≤ 2.26}. Fig. 10 depicts the transient behaviors x_i (i = 1, 2, 3, 4, 5) of the neural network (14) and (15) with initial point x_0 = (1, −1, 1, −1, 1)^T in Example 5. The approximation curve obtained by the proposed neural network is shown in Fig. 11. We can see that the neural network (14) and (15) solves the constrained least-squares approximation problem efficiently.

Figure 10: Transient behaviors x_i (i = 1, 2, 3, 4, 5) of the neural network (14) and (15) with initial point x_0 = (1, −1, 1, −1, 1)^T in Example 5.

Figure 11: The approximation data and the approximation curve obtained in Example 5.
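Rather than typing the rounded entries of E and U, the matrices of Example 5 can be rebuilt from the basis functions (e^x, x³, x², x, 1) evaluated at the data abscissae of Table 2 and at the constrained points 1.3, 2.8 and 4.2; the sketch below is our own.

    import numpy as np

    x_data = np.arange(0.0, 5.0, 0.5)      # 0, 0.5, ..., 4.5 (first row of Table 2)
    y_data = np.array([7.6, 7.2, 7.9, 8.0, 6.2, 6.2, 3.0, 0.8, 1.2, 5.8])

    def basis(t):
        """Rows [e^t, t^3, t^2, t, 1] of the model y(t) = a4 e^t + a3 t^3 + a2 t^2 + a1 t + a0."""
        t = np.atleast_1d(np.asarray(t, dtype=float))
        return np.column_stack([np.exp(t), t**3, t**2, t, np.ones_like(t)])

    E = basis(x_data)              # 10 x 5, the matrix E above (up to rounding)
    e = y_data                     # data values to be fitted
    U = basis([1.3, 2.8, 4.2])     # 3 x 5, the constraint matrix U
    # The box set Omega then reads 8.1 <= (Ux)_1 <= 8.3, 3.4 <= (Ux)_2 <= 3.5, 2.25 <= (Ux)_3 <= 2.26.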
6. Conclusion

In this paper, we have proposed a new neural network for solving CQP problems. Based on convex programming theory, the FB function, the KKT optimality conditions, Lyapunov stability theory and invariant set theory, the constructed neural network can find the optimal solution of the primal CQP problem. Compared with existing neural networks, the structure of the proposed network is reliable and efficient. Moreover, the proposed neural network has fewer variables and neurons. We also analyzed the influence of the parameter k on the convergence rate of the trajectory and on the convergence behavior of ‖z(t) − z*‖_2, and found that a larger k leads to a better convergence rate. Simulation results illustrate the performance of the proposed neural network.
Acknowledgement The authors would like to thank the Editor-in-Chief, Associate Editor and the reviewers for their insightful and constructive comments, which help to enrich the content and improve the presentation of the results in this paper. This work was supported by National Natural Science Foundation of China (No. 61473136) and the 111 Project (No. B12018).
References
[1] Y. Xia, H. Leung, E. Bosse, Neural data fusion algorithms based on a linearly constrained least square method, IEEE Transactions on Neural Networks 13 (2) (2002) 320–329.
[2] E. Al-Gallaf, K. A. Mutib, H. Hamdan, Artificial neural network dexterous robotics hand optimal control methodology: grasping and manipulation forces optimization, Artificial Life and Robotics 15 (4) (2010) 408–412.
[3] C. Svarer, Neural Networks for Signal Processing, Electronics Institute, Technical University of Denmark, Denmark, 1994.
[4] A. Malek, M. Yashtini, Image fusion algorithms for color and gray level images based on LCLS method and novel artificial neural network, Neurocomputing 73 (4-6) (2010) 937–943.
[5] S. Boyd, L. Vandenberghe, Convex Optimization, Cambridge University Press, New York, 2004.
[6] M. S. Bazaraa, H. D. Sherali, C. M. Shetty, Nonlinear Programming: Theory and Algorithms, 3rd edition, John Wiley & Sons, 2006.
[7] Y. Xia, J. Wang, A general methodology for designing globally convergent optimization neural networks, IEEE Transactions on Neural Networks 9 (6) (1998) 1331–1343.
[8] D. Zissis, E. K. Xidias, D. Lekkas, A cloud based architecture capable of perceiving and predicting multiple vessel behaviour, Applied Soft Computing 35 (2015) 652–661.
[9] C. Oprean, L. Likforman-Sulem, A. Popescu, C. Mokbel, Handwritten word recognition using Web resources and recurrent neural networks, International Journal on Document Analysis and Recognition 18 (4) (2015) 287–301.
[10] A. J. Hussain, D. Al-Jumeily, N. Radi, P. Lisboa, Hybrid neural network predictive-wavelet image compression system, Neurocomputing 151, Part 3 (2015) 975–984.
[11] H.-G. Han, L. Zhang, Y. Hou, J.-F. Qiao, Nonlinear model predictive control based on a self-organizing recurrent neural network, IEEE Transactions on Neural Networks and Learning Systems 27 (2) (2016) 402–415.
[12] L. A. Laboissiere, R. A. Fernandes, G. G. Lage, Maximum and minimum stock price forecasting of Brazilian power distribution companies based on artificial neural networks, Applied Soft Computing 35 (2015) 66–74.
[13] J. A. Khan, M. A. Z. Raja, M. M. Rashidi, M. I. Syam, A. M. Wazwaz, Nature-inspired computing approach for solving non-linear singular Emden-Fowler problem arising in electromagnetic theory, Connection Science 27 (4) (2015) 377–396.
[14] M. A. Z. Raja, Solution of the one-dimensional Bratu equation arising in the fuel ignition model using ANN optimised with PSO and SQP, Connection Science 26 (3) (2014) 195–214.
[15] M. A. Z. Raja, U. Farooq, N. I. Chaudhary, A. M. Wazwaz, Stochastic numerical solver for nanofluidic problems containing multi-walled carbon nanotubes, Applied Soft Computing 38 (2016) 561–586.
[16] M. A. Z. Raja, Stochastic numerical treatment for solving Troesch’s problem, Information Sciences 279 (2014) 860–873.
[17] M. A. Z. Raja, J. A. Khan, T. Haroon, Stochastic numerical treatment for thin film flow of third grade fluid using unsupervised neural networks, Journal of the Taiwan Institute of Chemical Engineers 48 (2015) 26–39.
[18] M. A. Z. Raja, F. H. Shah, A. A. Khan, N. A. Khan, Design of bio-inspired computational intelligence technique for solving steady thin film flow of Johnson-Segalman fluid on vertical cylinder for drainage problems, Journal of the Taiwan Institute of Chemical Engineers 60 (2016) 59–75.
[19] M. A. Z. Raja, R. Samar, T. Haroon, S. M. Shah, Unsupervised neural network model optimized with evolutionary computations for solving variants of nonlinear MHD Jeffery-Hamel problem, Applied Mathematics and Mechanics 36 (12) (2015) 1611–1638.
[20] D. Tank, J. Hopfield, Simple ’neural’ optimization networks: An A/D converter, signal decision circuit, and a linear programming circuit, IEEE Transactions on Circuits and Systems 33 (5) (1986) 533–541.
[21] M. P. Kennedy, L. O. Chua, Neural networks for nonlinear programming, IEEE Transactions on Circuits and Systems 35 (5) (1988) 554–562.
[22] A. Rodriguez-Vazquez, R. Dominguez-Castro, A. Rueda, J. L. Huertas, E. Sanchez-Sinencio, Nonlinear switched capacitor ’neural’ networks for optimization problems, IEEE Transactions on Circuits and Systems 37 (3) (1990) 384–398.
[23] A. Wu, P. K. S. Tam, A neural network methodology and strategy of quadratic optimization, Neural Computing & Applications 8 (4) (1999) 283–289.
[24] S. Effati, M. Baymani, A new nonlinear neural network for solving convex nonlinear programming problems, Applied Mathematics and Computation 168 (2) (2005) 1370–1379.
[25] Y. Huang, Lagrange-type neural networks for nonlinear programming problems with inequality constraints, in: Proceedings of the 44th IEEE Conference on Decision and Control and the 2005 European Control Conference (CDC-ECC ’05), 2005, pp. 4129–4133.
[26] S. Effati, A. Nazemi, Neural network models and its application for solving linear and quadratic programming problems, Applied Mathematics and Computation 172 (1) (2006) 305–331.
[27] E. Sohrab, M. Amin, E. Mohammad, An efficient projection neural network for solving bilinear programming problems, Neurocomputing 168 (2015) 1188–1197.
[28] C. Sha, H. Zhao, F. Ren, A new delayed projection neural network for solving quadratic programming problems with equality and inequality constraints, Neurocomputing 168 (2015) 1164–1172.
[29] A. Nazemi, S. Effati, An application of a merit function for solving convex programming problems, Computers & Industrial Engineering 66 (2) (2013) 212–221.
[30] Q. Liu, J. Cao, Y. Xia, A delayed neural network for solving linear projection equations and its analysis, IEEE Transactions on Neural Networks 16 (4) (2005) 834–843.
[31] Y. Lv, T. Hu, G. Wang, Z. Wan, A neural network approach for solving nonlinear bilevel programming problem, Computers & Mathematics with Applications 55 (12) (2008) 2823–2829.
[32] Q. Liu, J. Wang, A projection neural network for constrained quadratic minimax optimization, IEEE Transactions on Neural Networks and Learning Systems 26 (11) (2015) 2891–2900.
[33] A. Nazemi, A neural network model for solving convex quadratic programming problems with some applications, Engineering Applications of Artificial Intelligence 32 (2014) 54–62.
[34] S. Sastry, Nonlinear Systems: Analysis, Stability and Control, Springer, 1999.
[35] D. Serre, Matrices: Theory and Applications, Springer, 2002.
[36] J. J. E. Slotine, W. Li, Applied Nonlinear Control, Prentice Hall, 1991.
[37] S. Pan, J. S. Chen, A semismooth Newton method for SOCCPs based on a one-parametric class of SOC complementarity functions, Computational Optimization and Applications 45 (1) (2010) 59–88.
[38] S. Hu, Z. Huang, J. S. Chen, Properties of a family of generalized NCP-functions and a derivative free algorithm for complementarity problems, Journal of Computational and Applied Mathematics 230 (1) (2009) 69–82.
[39] Y. Yang, J. Cao, X. Xu, M. Hu, Y. Gao, A new neural network for solving quadratic programming problems with equality and inequality constraints, Mathematics and Computers in Simulation 101 (2014) 103–112.
[40] A. Nazemi, M. Nazemi, A gradient-based neural network method for solving strictly convex quadratic programming problems, Cognitive Computation 6 (3) (2014) 484–495.
[41] S. Zhang, A. Constantinides, Lagrange programming neural networks, IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing 39 (7) (1992) 441–452.
[42] Q. Liu, J. Wang, J. Cao, A delayed Lagrangian network for solving quadratic programming problems with equality constraints, Vol. 3971 of Lecture Notes in Computer Science, Springer Berlin Heidelberg, 2006, Ch. 56, pp. 369–378.
[43] J. Sun, L. Zhang, A globally convergent method based on Fischer-Burmeister operators for solving second-order cone constrained variational inequality problems, Computers & Mathematics with Applications 58 (10) (2009) 1936–1946.
[44] J. Sun, J. S. Chen, C. H. Ko, Neural networks for solving second-order cone constrained variational inequality problem, Computational Optimization and Applications 51 (2) (2012) 623–648.
[45] M. Fukushima, Z. Luo, P. Tseng, Smoothing functions for second-order-cone complementarity problems, SIAM Journal on Optimization 12 (2) (2001) 436–460.
[46] D. Sun, A class of iterative methods for solving nonlinear projection equations, Journal of Optimization Theory and Applications 91 (1) (1996) 123–140.
[47] Q. Liu, J. Cao, Global exponential stability of discrete-time recurrent neural network for solving quadratic programming problems subject to linear constraints, Neurocomputing 74 (2011) 3494–3501.