Neurocomputing 30 (2000) 117-128
Solving for a quadratic programming with a quadratic constraint based on a neural network frame

Ying Tan*, Chao Deng

Department of Electronic Engineering and Information Science, University of Science and Technology of China, P.O. Box 4, Hefei 230027, China

Received 2 June 1997; accepted 22 March 1999
Abstract

In many applications, a class of optimization problems called quadratic programming with a special quadratic constraint (QPQC) often occurs, such as in the fields of maximum entropy spectral estimation, FIR filter design with time-frequency constraints, and design of an FIR filter bank with the perfect reconstruction property. In order to deal with this kind of optimization problem, and inspired by the computational virtue of analog or dynamic neural networks, a feedback neural network is proposed in this paper for solving this class of QPQC computation problems in real time. The stability, convergence and computational performance of the proposed neural network are analyzed and proved in detail so as to theoretically guarantee the computational effectiveness and capability of the network. From the theoretical analysis it turns out that the solution of a QPQC problem is just the generalized minimum eigenvector of the objective matrix with respect to the constraint matrix. A number of simulation experiments are given to further support our theoretical analysis and illustrate the computational performance of the proposed network. (c) 2000 Elsevier Science B.V. All rights reserved.

Keywords: Artificial neural network; Quadratic programming; Generalized eigen-decomposition; QPQC; Analog-circuit neural network; Real-time optimization; Minimum eigenvector and eigenvalue
1. Introduction

In the last 40 years, various dynamic solvers have been proposed for solving constrained optimization problems. The dynamical systems approach for solving
constrained optimization problems was first proposed by Pyne [4], and later studied by Rybashov, Karpinskaya and others [6]. In the discipline of constrained optimization computation, linear programming (LP) is most useful because of its fundamental role and wide applicability. Optimization problems with nonlinear objective functions are usually approximated by a second-order (quadratic) system and solved approximately by a standard quadratic programming (QP) technique [3]. Traditional methods for solving LP or QP problems typically involve an iterative process that suffers from long computational time, which limits their use in many practical fields [3]. Recently, owing to renewed interest in artificial neural networks (ANN), several neural-network-based dynamic solvers have been proposed. Dynamic solvers for optimization problems are especially useful in real-time applications with time-dependent cost functions, e.g., on-line optimization for robotics or satellite guidance. ANNs have already been applied to several classes of constrained optimization problems and have shown promise in solving such problems more efficiently.

Linear and nonlinear programming solvers are a class of analog "neural" optimizers intended for the solution of constrained optimization problems (also called nonlinear programming problems) in real time. Tank and Hopfield [12] proposed a linear programming model with a neural organization. The authors showed that their circuit evolves in time seeking a minimum of an energy function, but they did not prove that this minimum corresponds to the solution of the problem under consideration. Kennedy and Chua [1] proposed a modified model and canonical circuit models that are superior to the Tank and Hopfield model, and showed, using nonlinear circuit theory, that the system converges to a stable equilibrium point without oscillation. They also derived a relationship between the solution of the network and that of the optimization problem, thus providing the foundations of a synthesis procedure for "neural" nonlinear programming solvers. Later, many researchers [5,2,13,14,8,11,9] proposed a number of models. In [5], Rodriguez-Vazquez et al. proposed a neural network for exactly solving linear and quadratic programming problems and gave its switched-capacitor implementation. In 1992, Maa and Shanblatt summarized and proved the computational capability of neural networks for LP and QP [2]. Wu et al. gave a combined model for exactly solving QP problems based on the duality theory of nonlinear programming [13,14]. The first author proposed a feedback neural network model for some specific nonlinear programming problems and developed several applications of it in the signal processing and communications domains [8,11,9].

Among the models currently available in the literature, the majority are proposed and used for LP and QP problems under different conditions and with different benefits. So far there is no special model for the quadratic programming with a special quadratic constraint (QPQC) problem that frequently occurs in many applications, as described in the next section. This motivates our study of a new model, based on the neural network framework, for this class of QPQC problems.

In the next section we explain in some detail the problem and the basic representation of QPQC problems. In Section 3 we describe the neural network and its working process. In Section 4 we analyze the performance of the proposed network theoretically and establish some important theorems.
In Section 5 we illustrate the capability of the network to solve QPQC problems in terms of some computer-simulated experiments. Finally, in Section 6 we present some conclusions.
2. Problem statement

In many actual applications, such as maximum entropy (ME) adaptive spectral estimation, adaptive filtering, FIR filter design with time-frequency constraints, design of PR FIR filter banks, etc., the corresponding problems to be solved can almost always be simplified to a quadratically constrained least-squares optimization problem [7,10]. For example, the design of perfect reconstruction FIR filter banks needs to minimize the stopband energy of the prototype filter under some PR constraints. If we design the filter bank by optimizing the impulse response of the prototype directly, we can formulate the objective function of the stopband energy as E_s = X^T Q X and the PR constraints as X^T A_i X = 1 (i = 1, ..., c). Here the matrices Q and A_i are all symmetric and positive-definite [14], and are calculated from the required specifications of the prototype filter. All of the optimization computation problems mentioned above can be simplified and then expressed in a unified mathematical formulation as

    minimize   X^T Q X,
    subject to X^T A X = 1,    (1)

where Q and A are symmetric positive semi-definite matrices and X denotes an N-dimensional column vector. Throughout the paper, we call the nonlinear optimization problem described by Eq. (1) a quadratic programming with a quadratic constraint (QPQC, for short) problem.
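To make the formulation concrete, the following sketch (not part of the original paper) solves a small QPQC instance of Eq. (1) with a general-purpose constrained optimizer and compares the optimal value with the smallest generalized eigenvalue of the pair (Q, A), anticipating the analysis of Section 4. The matrices and the starting point are illustrative assumptions, not values from the paper.

    # Sketch: a small QPQC instance (1) solved with a generic constrained optimizer.
    # Q and A below are illustrative; any symmetric positive (semi-)definite pair works.
    import numpy as np
    from scipy.optimize import minimize
    from scipy.linalg import eigh

    Q = np.array([[4.0, 1.0],
                  [1.0, 3.0]])      # objective matrix
    A = np.array([[2.0, 0.5],
                  [0.5, 1.0]])      # constraint matrix

    objective = lambda x: x @ Q @ x                                  # X^T Q X
    constraint = {"type": "eq", "fun": lambda x: x @ A @ x - 1.0}    # X^T A X = 1

    x0 = np.array([1.0, 0.0])       # illustrative starting point
    res = minimize(objective, x0, constraints=[constraint], method="SLSQP")
    print("minimizer:", res.x, "objective:", res.fun)

    lam = eigh(Q, A, eigvals_only=True)          # generalized eigenvalues of (Q, A)
    print("smallest generalized eigenvalue:", lam[0])   # matches res.fun for this start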
3. Neural network formulation for the QPQC problem

In order to solve the QPQC problem of Eq. (1), we must map the objective function and the constraint into a closed-loop (feedback) network. Once one or more constraint violations occur, the magnitude and direction of the violation of the constraint(s) are properly fed back to the inputs of the neurons so as to adjust the states of the neurons of the network correctly. In this way the overall energy of the network always decreases until it attains a minimum. Once the energy of the network attains its minimum, the states of the neurons of the network are taken to be a minimizer of the original problem, and hence a solution of the corresponding QPQC problem.

First, by penalty function theory we replace the constrained nonlinear optimization of Eq. (1) by the following unconstrained optimization:

    minimize  J(X, mu) = X^T Q X + mu P(X),    (2)
where mu is a positive constant, called the penalty parameter, and P(X) is a penalty function satisfying the following conditions: (1) P(X) is continuous; (2) P(X) >= 0 for all X; (3) P(X) = 0 if and only if X^T A X = 1. In the sequel, we choose the penalty function P(X) = (X^T A X - 1)^2. Thus the problem at hand is transformed into an unconstrained nonlinear optimization problem. The solution of the unconstrained nonlinear optimization problem (2) is just the solution of the constrained nonlinear optimization problem (1) as mu approaches positive infinity. In many actual applications, a satisfactory approximate solution can be obtained if the penalty parameter mu is chosen as a sufficiently large positive constant. So, in the following analysis, we assume that mu is fixed at some appropriately chosen positive value.

Here we would like to construct a neural network for which J is the energy function. Obviously, this network must have N neurons, where N is the dimension of the unknown vector variable X. Let v_k be the output of the kth neuron and thus the kth element of the output vector V = (v_1, ..., v_N)^T of the network. So we have

    v_k(t) = f(u_k(t)),  k = 1, 2, ..., N,    (3)
where u_k(t) and v_k(t) are the input and output, respectively, of the kth neuron at time t. The activation function of each neuron, f(.), is assumed to be a monotonically increasing continuous function, such as the sigmoidal activation function. For J to be the energy function, the network dynamics should be such that the time derivative of J is negative. For the energy function of Eq. (2), the time derivative is given by

    dJ/dt = sum_{k=1}^{N} (dJ/dv_k(t)) f'(u_k(t)) (du_k(t)/dt),    (4)
where f'(u) is the derivative of f(u) with respect to u. Now, suppose we define the dynamics of the kth neuron as

    du_k(t)/dt = -dJ/dv_k(t),  together with Eq. (3).    (5)
Since f(u) is a monotonically increasing continuous function, it can easily be deduced from Eqs. (3)-(5) that the neural network with dynamics given by Eq. (5) has stable stationary points at the local minima of the energy function J, since the time derivative of J is non-positive, i.e., dJ/dt = -sum_k f'(u_k)(dJ/dv_k)^2 <= 0. A conceptual implementation of the neural network given by Eqs. (2)-(5) is shown in Fig. 1, which consists of an analog multiplier, an integrator, and an amplifier with bias. The amplifier shown with an arrow across it represents one with an adjustable gain. In the next section, we analyze the performance of the proposed network in detail and show that a minimizer of J corresponds to a generalized minimum eigenvector of Q with respect to the matrix A, and is also an approximate solution of the original problem of Eq. (1). Similar to other existing models, this kind of analog neural network can be used to solve the nonlinear optimization problem of Eq. (1) or (2) in real time when implemented with analog very-large-scale integrated (VLSI) circuit technology.
Fig. 1. Schematic diagram and building blocks of the proposed neural network: (a) block diagram of the network, (b) components of PE-Q and PE-A in (a), and (c) basic building blocks.
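A minimal numerical sketch of the dynamics (3)-(5) is given below, assuming the simplest admissible activation f(u) = u (monotonically increasing, so the stability argument still applies) and explicit Euler integration of Eq. (5). The matrices, the penalty parameter, the step size and the iteration count are illustrative choices, not values from the paper.

    # Sketch of the network dynamics (3)-(5) with identity activation: the dynamics
    # reduce to a gradient flow on J(V, mu) = V^T Q V + mu (V^T A V - 1)^2.
    import numpy as np

    def simulate(Q, A, mu, v0, dt=1e-4, steps=200_000):
        v = v0.astype(float).copy()
        for _ in range(steps):
            grad = 2.0 * Q @ v + 4.0 * mu * (v @ A @ v - 1.0) * (A @ v)  # gradient of J
            v -= dt * grad                                               # Euler step of Eq. (5)
        return v

    # Illustrative 2-D example (not from the paper); its minimum generalized
    # eigenvalue is 2, so the constraint error should settle near 2/(2*50) = 0.02.
    Q = np.array([[4.0, 1.0], [1.0, 3.0]])
    A = np.array([[2.0, 0.5], [0.5, 1.0]])
    mu = 50.0
    rng = np.random.default_rng(0)
    v = simulate(Q, A, mu, rng.standard_normal(2))
    print("steady state:", v, "constraint error:", abs(v @ A @ v - 1.0))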
4. Theoretical analysis

First, it is easy to show that the proposed neural network of Eqs. (3)-(5) is globally Lyapunov stable, since the time derivative of the energy function J of the network is non-positive, i.e., dJ/dt <= 0, provided the activation function of each neuron of the network is chosen as a monotonically increasing function. What follows is the steady-state performance analysis of the proposed neural network. According to Eq. (2), the gradient vector g(X) and the Hessian matrix H(X) of the cost function (that is, the energy function J) are given, respectively, by

    g(X) = grad J = 2QX + 4 mu (X^T A X - 1) A X,    (6)

    H(X) = grad^2 J = 2Q + 8 mu A X X^T A + 4 mu (X^T A X - 1) A.    (7)

(A finite-difference check of these two formulas is sketched below.)
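The following sketch verifies the analytic gradient (6) and Hessian (7) against central finite differences at a random point; the matrices, the penalty parameter and the test point are arbitrary illustrative choices, not data from the paper.

    # Sanity check of Eqs. (6) and (7) by central finite differences.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 4
    B = rng.standard_normal((n, n)); Q = B @ B.T + n * np.eye(n)   # symmetric PD
    C = rng.standard_normal((n, n)); A = C @ C.T + n * np.eye(n)
    mu = 20.0
    X = rng.standard_normal(n)

    J = lambda x: x @ Q @ x + mu * (x @ A @ x - 1.0) ** 2
    def grad(x):                                                    # Eq. (6)
        return 2 * Q @ x + 4 * mu * (x @ A @ x - 1.0) * (A @ x)
    H = 2 * Q + 8 * mu * np.outer(A @ X, A @ X) + 4 * mu * (X @ A @ X - 1.0) * A   # Eq. (7)

    eps = 1e-6
    g_fd = np.array([(J(X + eps * e) - J(X - eps * e)) / (2 * eps) for e in np.eye(n)])
    H_fd = np.column_stack([(grad(X + eps * e) - grad(X - eps * e)) / (2 * eps)
                            for e in np.eye(n)])
    print("max gradient error:", np.max(np.abs(grad(X) - g_fd)))   # ~1e-6 or smaller
    print("max Hessian error:", np.max(np.abs(H - H_fd)))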
So, according to Eqs. (2), (6) and (7), we have the following two theorems for our neural network (here ||X||_A = (X^T A X)^{1/2} denotes the norm of X associated with the matrix A).

Theorem 1. X is a stationary point of J if and only if X is a generalized eigenvector of Q relative to the matrix A corresponding to a generalized eigenvalue lambda, with ||X||_A = sqrt(1 - lambda/(2 mu)).

Theorem 2. X_bar is a global minimizer of J if and only if X_bar is a minimum generalized eigenvector of Q corresponding to the minimum generalized eigenvalue lambda_min, with ||X_bar||_A = sqrt(1 - lambda_min/(2 mu)).
Theorem 1 follows immediately from Eq. (6) and the definition of the generalized eigenvalue problem of a matrix pair. The proof of Theorem 2 is given in the appendix. An important point to note from Theorem 2 is that the norm of the solution is predetermined by the values of mu and lambda_min: the higher the value of mu, the closer this norm is to unity. Further, the value of lambda_min is needed for choosing mu before the network starts to evolve. Since lambda_min will not be known a priori, we can only give a coarse lower bound mu_0 for mu, based on the traces of the matrices Q and A, to guide the choice of mu; this guarantees the existence of the solution of our neural network. As a result we suggest the following practical lower bound: mu > mu_0 = Tr(Q)/(2 Tr(A)), where Tr(.) denotes the trace of a matrix.

We draw four significant corollaries from Theorem 2, given below.

Corollary 1. The value of mu should be such that mu > lambda_min/2.

This result comes from Theorem 2 by keeping the quantity under the square root non-negative when the network has settled.

Corollary 2. For a given mu, every local minimizer of J is also a global minimizer.

This result comes from Theorem 2 by recognizing that H(X_bar) is at least positive semi-definite when X_bar is a local minimizer of J.

Corollary 3. The minimizer of J is unique (except for its sign).

Corollary 4. The eigenvectors of Q associated with the N - 1 non-minimal generalized eigenvalues correspond to saddle points of J.

This comes from the fact that H(X_bar) is indefinite at a stationary point X_bar that is an eigenvector corresponding to a non-minimum eigenvalue. A numerical illustration of Corollaries 1 and 4 is sketched below.
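The sketch below illustrates Corollaries 1 and 4 numerically, using the 2-D example matrices that appear in Section 5; the construction of the stationary points follows Theorem 1, and everything else is an illustrative check rather than part of the paper.

    # Corollary 1 (choice of mu) and Corollary 4 (saddle points), checked numerically.
    import numpy as np
    from scipy.linalg import eigh

    Q = np.array([[18.0, 5.0], [5.0, 2.0]])
    A = np.array([[5.0, 2.0], [2.0, 3.0]])
    mu = 10.0

    lam, W = eigh(Q, A)                  # ascending generalized eigenvalues, W^T A W = I
    mu0 = np.trace(Q) / (2 * np.trace(A))
    print("mu0 =", mu0, "  lambda_min/2 =", lam[0] / 2)   # mu0 exceeds lambda_min/2 here

    def hessian(x):                      # Eq. (7)
        return 2 * Q + 8 * mu * np.outer(A @ x, A @ x) + 4 * mu * (x @ A @ x - 1.0) * A

    for lk, wk in zip(lam, W.T):
        xk = wk * np.sqrt(1.0 - lk / (2 * mu))     # stationary point from Theorem 1
        eigs = np.linalg.eigvalsh(hessian(xk))
        print(f"eigenvalue {lk:.4f}: Hessian spectrum {eigs}")
    # The minimum eigenpair yields a positive semi-definite Hessian; the other yields
    # an indefinite one (a saddle of J), as Corollary 4 states.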
From the notions above, we have the following concluding theorem for our proposed neural network described by Eq. (5).

Theorem 3. From any initial state, the neural network described by Eq. (5) approaches a stable state. Furthermore, the steady-state output of the neural network is just the minimum generalized eigenvector of Q associated with A, provided mu > lambda_min/2. Moreover, the output of the neural network is also a satisfactory approximate solution of our QPQC optimization problem when the penalty parameter mu is chosen as a sufficiently large positive constant (mu > Tr(Q)/(2 Tr(A))).
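Theorem 3 can be read as a closed-form recipe for the network's steady state. The sketch below (an illustration, not the paper's procedure) constructs that state from a generalized eigen-decomposition, checks that the gradient (6) vanishes there, and shows that the rescaled vector solves Eq. (1); the matrices and the helper name qpqc_steady_state are assumptions introduced here.

    # Theorem 3 as a recipe: predicted steady state of the network and, after rescaling
    # onto the constraint surface, the QPQC solution of Eq. (1).
    import numpy as np
    from scipy.linalg import eigh

    def qpqc_steady_state(Q, A, mu):
        lam, W = eigh(Q, A)                    # ascending generalized eigenvalues, W^T A W = I
        lam_min, w_min = lam[0], W[:, 0]
        assert mu > lam_min / 2.0, "Theorem 3 requires mu > lambda_min / 2"
        x_bar = w_min * np.sqrt(1.0 - lam_min / (2.0 * mu))   # steady state (Theorem 2)
        x_opt = w_min                                          # rescaled: X^T A X = 1
        return x_bar, x_opt, lam_min

    # Illustrative matrices (not from the paper).
    Q = np.array([[6.0, 2.0, 0.0], [2.0, 5.0, 1.0], [0.0, 1.0, 4.0]])
    A = np.eye(3) + 0.1 * np.ones((3, 3))
    mu = 100.0
    x_bar, x_opt, lam_min = qpqc_steady_state(Q, A, mu)

    grad = 2 * Q @ x_bar + 4 * mu * (x_bar @ A @ x_bar - 1.0) * (A @ x_bar)   # Eq. (6)
    print("gradient norm at steady state:", np.linalg.norm(grad))   # ~0
    print("constraint at rescaled solution:", x_opt @ A @ x_opt)    # = 1
    print("objective value:", x_opt @ Q @ x_opt, " lambda_min:", lam_min)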
5. Numerical simulations

To illustrate the ability of our proposed neural network to solve QPQC problems, i.e., to perform the generalized eigen-decomposition, in real time, we have carried out a number of simulation experiments.

First, we performed an experiment for the two-dimensional case. Suppose Q = [18, 5; 5, 2] and A = [5, 2; 2, 3]; then mu_0 = 1.25, and we choose mu = 10 and 100 (for this example, the theoretical generalized minimum eigenvalue is lambda_min = 0.26794919243112). We simulated the neural network with two random initial states in MATLAB. The results are shown in Figs. 2 and 3. In Fig. 2, with mu = 10.0, the resultant minimum eigenvector is a_hat = [-0.17598381194478, 0.65678053310018]^T, corresponding to the minimum eigenvalue lambda_hat = 0.26794919243117, and the error of the constraint is delta = 1.339745099323e-2. In Fig. 3, with mu = 100.0, the resultant minimum eigenvector is a_hat = [-0.17705591358840, 0.66078182822853]^T, corresponding to lambda_hat = 0.26794919243113, and the error of the constraint is delta = 1.33946488425e-3. It can be seen again that the higher the value of mu, the smaller the error of the constraint. So, if we choose an appropriately large value of mu, the constraint accuracy is completely able to meet the requirements of most concrete application problems in practice.
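These two-dimensional results can be cross-checked against the closed-form prediction of Theorems 1 and 2 (a sketch; eigenvectors are determined only up to sign, so the signs may differ from those quoted above):

    # Cross-check of the first 2-D experiment against Theorems 1-2.
    import numpy as np
    from scipy.linalg import eigh

    Q = np.array([[18.0, 5.0], [5.0, 2.0]])
    A = np.array([[5.0, 2.0], [2.0, 3.0]])

    lam, W = eigh(Q, A)
    lam_min, w = lam[0], W[:, 0]
    print("lambda_min =", lam_min)             # 0.26794919... as quoted in the text

    for mu in (10.0, 100.0):
        x_hat = w * np.sqrt(1.0 - lam_min / (2.0 * mu))
        delta = abs(x_hat @ A @ x_hat - 1.0)   # constraint error
        print(f"mu = {mu}: steady state = {x_hat}, constraint error = {delta:.6e}")
    # Expected: about 1.34e-2 for mu = 10 and about 1.34e-3 for mu = 100, with x_hat
    # agreeing (up to sign) with the eigenvectors reported above.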
Fig. 2. Dynamic behavior of the network: (a) trajectories of the output of the network with two random initial states and mu = 10; (b) phase-plane traces of the network's state corresponding to (a).
Fig. 3. Dynamic behavior of the network: (a) trajectories of the output of the network with two random initial states and mu = 100; (b) phase-plane traces of the network's state corresponding to (a).
Fig. 4. Dynamic behavior of the network for a symmetric matrix: (a) trajectories of the output of the network with mu = 100; (b) corresponding phase-plane traces.
Q"[15,1,5; 1,13,2; 5,2,11], A"[11,1,2; 1,1,3; 2,3,13], then k "0.78, and choose k"100.0, (thus j "0.70865255402532). The evolutionary results of the network
are shown in Fig. 4. The resultant minimum generalized eigenvector a( "[!0.13362898462716, 0.00591631836095, 0.26824746465222], corresponding to jK "0.70865255402540, the error of the constrained condition d"
354561066859e!3, at time from 0 to 5 ls. For the non-symmetrical case, we suppose Q"[89.0434, 38.2002, 10.6773; 4.7274, 79.0866, 4.0530; 4.0869, 10.6926, 67.0491], A"[49.5399, 27.6868, 31.3546; 15.5941, 22.9994, 16.8984; 10.7686, 31.0746, 22.9243], then k "1.23177368127747, and also choose k"100.0, (j "1.32293116236482). The evolving results of our NN are shown in Fig. 5 with
a random initial state. The resultant minimum eigenvector a( "[0.07939229341264, 0.04971020051557, 0.06316860088638], corresponding to jK "1.32289543875905,
Fig. 5. Dynamic behavior of the network for a non-symmetric matrix: (a) trajectories of the output of the network with mu = 100; (b) corresponding phase-plane traces.
In order to test the capability of our network on larger problems, we simulated the network once more on a problem with ten variables. In this experiment we generated two ten-dimensional symmetric positive-definite matrices, used as the objective matrix Q and the constraint matrix A, respectively. Here we also chose mu = 100.0 and started the network from a random initialization. The evolutionary trajectories of the network's outputs are plotted in Fig. 6. The figure shows that the network evolves to its global minimum-energy state in real time from any random initial state and gives the solution of the corresponding QPQC problem as the steady-state output of the neurons after the network has settled.
Fig. 6. Evolutionary trajectories of the network's outputs for a problem with ten free variables and mu = 100.
The minimum eigenvector and eigenvalue obtained by our network are, respectively, a_hat = [-1.5285e-01, 2.4749e-02, -3.3519e-01, -1.6487e-01, -4.5486e-02, 4.3189e-01, 1.9244e-01, -2.7200e-01, -2.2819e-01, 3.7305e-01]^T and lambda_hat = 1.166895344903e-02 (lambda_min = 1.166242829904e-02). The error of the constraint is delta = 1.662876823064e-03, over the time interval from 0 to 50 microseconds.
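A rough software counterpart of this ten-variable experiment is sketched below: it generates its own random symmetric positive-definite matrices (so the specific numbers necessarily differ from those reported above), integrates the gradient-flow version of Eq. (5) with identity activation using a stiff ODE solver, and compares the settled state with a direct generalized eigen-decomposition.

    # Sketch of a ten-variable run: random SPD matrices (not the paper's), identity
    # activation, and an ODE solver standing in for the analog dynamics of Eq. (5).
    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.linalg import eigh

    rng = np.random.default_rng(7)
    n, mu = 10, 100.0
    B = rng.standard_normal((n, n)); Q = B @ B.T / n + np.eye(n)
    C = rng.standard_normal((n, n)); A = C @ C.T / n + np.eye(n)

    def flow(t, v):                                  # dv/dt = -grad J(v), cf. Eqs. (5), (6)
        return -(2 * Q @ v + 4 * mu * (v @ A @ v - 1.0) * (A @ v))

    v0 = rng.standard_normal(n)
    v0 /= np.sqrt(v0 @ A @ v0)                       # start on the constraint surface
    sol = solve_ivp(flow, (0.0, 200.0), v0, method="LSODA", rtol=1e-10, atol=1e-12)
    v = sol.y[:, -1]

    lam_min = eigh(Q, A, eigvals_only=True)[0]
    print("Rayleigh quotient of settled state:", (v @ Q @ v) / (v @ A @ v))
    print("lambda_min from eigh:", lam_min)
    print("constraint error:", abs(v @ A @ v - 1.0),
          "predicted lambda_min/(2*mu):", lam_min / (2 * mu))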
6. Conclusions

The problem of solving a class of quadratic programs with a quadratic constraint is addressed in this paper. The solution to this kind of optimization problem is the generalized minimum eigenvector of the matrix Q relative to the matrix A. The network is globally Lyapunov stable (i.e., the domain of attraction of the equilibrium point is the whole space) and, moreover, has a unique minimizer; it should find extensive applications in many fields such as FIR filter design with time- and frequency-domain constraints, prototype filter design in perfect reconstruction cosine-modulated filter banks, and so on. Our research results on the design of perfect reconstruction filter banks with the proposed neural network will be reported in other papers [7]. Finally, several numerical simulations are given in the paper to verify the correctness of our theoretical derivations and illustrate the real-time generalized eigen-decomposition ability of our network.
Acknowledgements The authors wish to thank the anonymous reviewers for their valuable and meticulous comments. The guidance and help of Professor Z. He at Southeast University are also gratefully acknowledged. This project was supported by Anhui Natural Science Foundation, China.
Appendix A

Proof of Theorem 2.

Proof of the "if" part: From the hypothesis, we have Q X_bar = lambda_min A X_bar, where lambda_min = 2 mu (1 - b) is the smallest eigenvalue of the generalized eigenvalue problem and b = X_bar^T A X_bar. Hence, from Theorem 1, X_bar is an equilibrium point of J. To prove that X_bar is a global minimizer of J, we have to show that J(X) - J(X_bar) >= 0 for all X in S, where S is the feasible region of the constrained problem. Suppose X = X_bar + P with P in S; then

    J(X) - J(X_bar) = X^T Q X + mu P(X) - X_bar^T Q X_bar - mu P(X_bar)
                    = 2 P^T Q X_bar + P^T Q P - lambda_min (2 P^T A X_bar + P^T A P) + mu (2 P^T A X_bar + P^T A P)^2
                    = 2 P^T (Q X_bar - lambda_min A X_bar) + (P^T Q P - lambda_min P^T A P) + mu (2 P^T A X_bar + P^T A P)^2,    (A.1)

where the second equality uses lambda_min = 2 mu (1 - b). Since lambda_min is the smallest generalized eigenvalue of the matrix Q relative to A, we have Q X_bar = lambda_min A X_bar and P^T Q P >= lambda_min P^T A P for any P. Substituting these relations into Eq. (A.1), we obtain

    J(X) - J(X_bar) >= 0.    (A.2)
Proof of the "only if" part: From the hypothesis and using Theorem 1, we get Q X_bar = lambda_m A X_bar with lambda_m = 2 mu (1 - ||X_bar||_A^2) for some m in {1, ..., N}, where ||X_bar||_A^2 = X_bar^T A X_bar is the squared norm of X_bar associated with the matrix A. The Hessian matrix at the point X_bar is H(X_bar) = 2Q - 2 lambda_m A + 8 mu A X_bar X_bar^T A. For any X in S, multiplying the Hessian matrix on both sides by X, we obtain

    X^T H(X_bar) X = 2 X^T (Q - lambda_m A) X + 8 mu X^T A X_bar X_bar^T A X
                   = 2 (X^T Q X - lambda_m X^T A X) + 8 mu (X^T A X_bar)^2.    (A.3)

The second term of Eq. (A.3) is always non-negative and vanishes for some X distinct from X_bar; hence, if the Hessian matrix is to be non-negative definite, the first term must also be non-negative, i.e., X^T Q X >= lambda_m X^T A X. As X is an arbitrary vector in the feasible region, according to the generalized Rayleigh principle this inequality holds for all X only if lambda_m = lambda_min. So we have X^T H(X_bar) X >= 0, i.e., the Hessian matrix H(X_bar) is non-negative definite. That is to say, X_bar is the global minimizer of our QPQC problem when mu is fixed at some appropriately chosen positive value, according to classical constant-coefficient differential equation theory. []
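The key inequality used in the "if" part, P^T Q P >= lambda_min P^T A P (the generalized Rayleigh principle), can be spot-checked numerically; the sketch below uses random symmetric positive-definite matrices and random directions, which are illustrative assumptions only.

    # Numerical spot-check of the generalized Rayleigh inequality used in the proof.
    import numpy as np
    from scipy.linalg import eigh

    rng = np.random.default_rng(3)
    n = 5
    B = rng.standard_normal((n, n)); Q = B @ B.T + np.eye(n)
    C = rng.standard_normal((n, n)); A = C @ C.T + np.eye(n)
    lam_min = eigh(Q, A, eigvals_only=True)[0]

    worst = min(p @ Q @ p - lam_min * (p @ A @ p)
                for p in rng.standard_normal((1000, n)))
    print("min of P^T Q P - lambda_min P^T A P over 1000 random P:", worst)  # >= 0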
References

[1] M.P. Kennedy, L.O. Chua, Neural networks for nonlinear programming, IEEE Trans. Circuits Systems 35 (5) (1988) 554-562.
[2] C.Y. Maa, M. Shanblatt, Linear and quadratic programming neural network analysis, IEEE Trans. Neural Networks 3 (6) (1992) 580-594.
[3] P.M. Pardalos, J.B. Rosen, Constrained Global Optimization: Algorithms and Applications, Springer, Berlin, Germany, 1987.
[4] I.B. Pyne, Linear programming on an electronic analogue computer, Trans. Amer. Inst. Electr. Eng. 75 (1956) 139-143.
[5] A. Rodríguez-Vázquez, R. Domínguez-Castro, A. Rueda, J.L. Huertas, E. Sánchez-Sinencio, Nonlinear switched-capacitor "neural" networks for optimization problems, IEEE Trans. Circuits Systems 37 (3) (1990) 384-397.
[6] M.V. Rybashov, Gradient method of solving linear and quadratic programming problems on electronic analog computers, Automat. Remote Control 26 (12) (1965) 2079-2089.
[7] Y. Tan, X.Q. Gao, Z.Y. He, Neural networks design approach for cosine-modulated FIR filter banks and compactly supported wavelets with almost PR property, Signal Processing 69 (1) (1998) 29-48.
[8] Y. Tan, Z.Y. He, A neural network structure for determination of minimum eigenvalue and its corresponding eigenvector of a symmetrical matrix, in: Proceedings of the IEEE International Conference on Neural Networks and Signal Processing (ICNNSP'95), Nanjing, China, December 1995, pp. 512-516.
[9] Y. Tan, Z.Y. He, Arbitrary FIR filter synthesis with neural network, Neural Process. Lett. 8 (1) (1998) 9-13.
[10] Y. Tan, Z.Y. He, An efficient design method for cosine-modulated QMF banks satisfying PR property, Int. J. Circuit Theory Appl. 2 (5) (1998) 539-546.
[11] Y. Tan, Z.K. Liu, On matrix eigen-decomposition by neural networks, Neural Network World (Int. J. Neural Mass-Parallel Comput. Inform. Systems) 8 (3) (1998) 337-352.
[12] D.W. Tank, J.J. Hopfield, Simple 'neural' optimization networks: an A/D converter, signal decision circuit, and a linear programming circuit, IEEE Trans. Circuits Systems 33 (5) (1986) 533-541.
[13] X.Y. Wu, Y.S. Xia, J. Li, W.K. Chen, A high-performance neural network for solving linear and quadratic programming problems, IEEE Trans. Neural Networks 7 (3) (1996) 643-651.
[14] S.H. Zak, V. Upatising, S. Hui, Solving linear programming problems with neural networks: a comparative study, IEEE Trans. Neural Networks 6 (1) (1995) 94-103.
Ying Tan was born in Yingshan county, Sichuan Province, China, in September 1964. He received his B.Sc., M.Sc., and Ph.D. degrees in 1985, 1988, and 1997, respectively, all in electronic engineering, from Xidian University, Xi'an, and Southeast University, Nanjing, China. From 1989 to 1994 he was a research scientist and lecturer at the university. He is now an associate professor and postdoctoral research fellow at the University of Science and Technology of China (USTC), Hefei, P.R. China. His current research interests include neural network theory and its applications, intelligent computational science, signal/image processing, wavelet transforms, pattern recognition, intelligent systems, as well as statistical signal analysis and processing. He has published more than 60 journal and conference papers in these areas so far. He has served as a reviewer for several international and domestic core journals, and has received several academic prizes and awards from the universities where he has worked and from the state. He is currently a member of the IEEE Signal Processing and Communications Societies, a member of the IEE Signal Processing Society, and a senior member of the China Institute of Electronics (CIE).
Chao Deng received her B.Sc. degree from Nanjing University of Aeronautics and Astronautics in 1988 and her M.Sc. degree in electronic engineering from the University of Science and Technology of China in 1996. She is now pursuing a Ph.D. degree in computer science at USTC. From 1988 to 1993 she was a research fellow at the East-China Electronic Engineering Institute. Her current research interests are in the fields of neural network learning algorithms, artificial intelligence, and KDD. She is a student member of the IEEE.