Neurocomputing 64 (2005) 359–374
An improved neural network for convex quadratic optimization with application to real-time beamforming

Youshen Xia^a,*, Gang Feng^b

^a Department of Applied Mathematics, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
^b Department of Manufacturing Engineering and Engineering Management, The City University of Hong Kong, Hong Kong

Received 21 April 2004; available online 8 January 2005
Communicated by T. Heskes
Abstract

This paper develops an improved neural network for solving convex quadratic optimization problems with general linear constraints. Compared with the existing primal–dual neural network and dual neural network for solving such problems, the proposed neural network has a lower implementation complexity. Unlike the Kennedy–Chua neural network, the proposed neural network converges to an exact optimal solution. Analytical results and illustrative examples show that the proposed neural network converges quickly to the optimal solution. Finally, the proposed neural network is effectively applied to real-time beamforming.
© 2004 Elsevier B.V. All rights reserved.

Keywords: Quadratic optimization; Recurrent neural network; Convergence analysis; Real-time beamforming
*Corresponding author. E-mail address: [email protected] (Y. Xia).
doi:10.1016/j.neucom.2004.11.009
1. Introduction

Consider the following quadratic optimization problem:

minimize   (1/2) x^T Q x + c^T x
subject to Bx = b,  Ax ≤ d,  l ≤ x ≤ h,          (1)
where Q ∈ R^{n×n} is a symmetric positive definite matrix, B ∈ R^{m×n}, A ∈ R^{r×n}, c, h, l ∈ R^n, b ∈ R^m, and d ∈ R^r. It is well known that quadratic optimization problems arise in a wide variety of scientific and engineering applications, including regression analysis, image and signal processing, parameter estimation, filter design, and robot control [1]. Many of these problems are time-varying in nature and thus have to be solved in real time [6,16]. Because of the nature of digital computers, conventional numerical optimization techniques may not be effective for such real-time applications.

Neural networks are composed of many massively connected neurons. The main advantage of the neural network approach to optimization is that the solution procedure is inherently parallel and distributed. Unlike other parallel algorithms, neural networks can be implemented physically in designated hardware such as application-specific integrated circuits, where optimization is carried out in a truly parallel and distributed manner. Because of this inherently parallel and distributed information processing, the convergence rate of the solution process does not decrease as the size of the problem increases. Therefore, the neural network approach can solve optimization problems with running times that are orders of magnitude faster than those of the most popular optimization algorithms executed on general-purpose digital computers [3]. Neural networks for optimization have received tremendous interest in recent years [2,5,7,10,12–15,17].

At present, there are several recurrent neural networks for solving the quadratic optimization problem (1). Kennedy and Chua [7] presented a primal neural network. Because that network contains a finite penalty parameter, it converges only to an approximate solution. To avoid the penalty parameter, we proposed a primal–dual neural network and a dual neural network [12,14]. The primal–dual neural network has a two-layer structure, and the dual neural network requires computing an inverse matrix; thus, both neural networks have a model complexity problem. Moreover, none of the existing neural networks for solving (1) can be guaranteed to converge exponentially to the optimal solution of (1). Thus, studying alternative neural networks with a low complexity and a fast convergence rate is of importance and significance. The objective of this paper is to develop an improved neural network for solving (1) with a low complexity and fast convergence. The proposed neural network has a one-layer structure and does not require computing an inverse matrix. In adaptive antenna systems, the beamforming processor is used to generate an optimal set of beams to track the mobiles within the coverage area of the base station. The implemented algorithms have to rapidly enhance the desired signal and suppress noise and interference at the output of an array of sensors; thus, real-time solution algorithms are highly desirable.
As another objective of this paper, the proposed neural network is applied to real-time beamforming. Theoretical results and illustrative examples show that the proposed neural network achieves good performance with a fast convergence rate.
2. Neural network models

In this section, we reformulate problem (1) and then develop a neural network model for solving it.

2.1. Problem reformulation

According to the Karush–Kuhn–Tucker (KKT) conditions for (1) [1], x* is an optimal solution of (1) if and only if there exist y* ∈ R^r and w* ∈ R^m such that (x*, y*, w*) satisfies Bx* = b, Ax* ≤ d, y* ≥ 0, l ≤ x* ≤ h, and

g_1(x* − (Qx* + c + A^T y* − B^T w*)) = x*,

where g_1(x) = [g_1(x_1), …, g_1(x_n)]^T and, for i = 1, …, n,

g_1(x_i) = l_i if x_i < l_i,   x_i if l_i ≤ x_i ≤ h_i,   h_i if x_i > h_i.

Let g_2(y) = [g_2(y_1), …, g_2(y_r)]^T with g_2(y_i) = max{0, y_i}. According to the well-known projection theorem [1], the above KKT condition can be equivalently represented by

x = g_1(x − α(Qx + c + A^T y − B^T w)),   y = g_2(y + α(Ax − d)),   Bx = b,

where α is a positive constant. That is,

[x; y; w] = [ g_1(x − α(Qx + c + A^T y − B^T w)) ;  g_2(y + α(Ax − d)) ;  w − α(Bx − b) ].          (2)
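As a minimal illustration (not part of the original paper), the two projection operators above are simple componentwise clipping operations; the following Python sketch assumes NumPy arrays l and h for the bounds.

import numpy as np

def g1(x, l, h):
    # Projection onto the box [l, h]: l_i if x_i < l_i, x_i if l_i <= x_i <= h_i, h_i if x_i > h_i.
    return np.clip(x, l, h)

def g2(y):
    # Projection onto the nonnegative orthant: max{0, y_i} componentwise.
    return np.maximum(y, 0.0)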
For simplicity, we denote z = [x; y; w] ∈ R^{n+m+r} and g(z) = [g_1(x); g_2(y); w], and we let

W = [ I_n − αQ   −αA^T   αB^T
      αA          I_r     O_12
      −αB         O_21    I_m ],      p = α[−c; −d; b],

where I_n ∈ R^{n×n}, I_r ∈ R^{r×r}, and I_m ∈ R^{m×m} are identity matrices, and O_12 ∈ R^{r×m} and O_21 ∈ R^{m×r} are zero matrices. Then the KKT condition can be written in the compact vector form

g(Wz + p) = z.          (3)
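For concreteness, the matrix W and vector p in (3) can be assembled directly from the problem data. The sketch below is an illustrative assumption about the layout (the function name build_W_p is ours, not the paper's).

import numpy as np

def build_W_p(Q, c, A, d, B, b, alpha=1.0):
    # Assemble W = I - M and p so that the KKT condition reads g(Wz + p) = z, as in (3).
    n, r, m = Q.shape[0], A.shape[0], B.shape[0]
    W = np.block([
        [np.eye(n) - alpha * Q, -alpha * A.T,     alpha * B.T],
        [alpha * A,             np.eye(r),        np.zeros((r, m))],
        [-alpha * B,            np.zeros((m, r)), np.eye(m)],
    ])
    p = alpha * np.concatenate([-c, -d, b])
    return W, p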
2.2. Neural network model

In [12], we presented the following primal–dual neural network model:

dz/dt = (I + M^T){g(Wz + p) − z},          (4)

where z ∈ R^{n+m+r}, I ∈ R^{(n+m+r)×(n+m+r)} is an identity matrix,

M = α[ Q    A^T    −B^T
       −A   O_11   O_12
       B    O_21   O_22 ],

and O_11 ∈ R^{r×r} and O_22 ∈ R^{m×m} are zero matrices. Based on (3) and (4), we propose a modified neural network for solving (1), defined as follows.

State equation:

dz/dt = λ{g(Wz + p) − z}.          (5)

Output equation:

x(t) = Dz(t),

where λ > 0 is a design parameter, z ∈ R^{n+m+r} is a state vector, x ∈ R^n is an output vector, D = [I_1, O], I_1 ∈ R^{n×n} is an identity matrix, O ∈ R^{n×(m+r)} is a zero matrix, and W = I_{n+m+r} − M. The proposed neural network can be implemented by a circuit with the single-layer structure shown in Fig. 1. The circuit consists of (n + m + r)(n + m + r + 1) summers, n + m + r integrators, (n + m + r)^2 weighted connections, and n + r activation functions for g_1(x_i) and g_2(y_i).

2.3. Comparison

We now compare the proposed neural network with existing neural networks for solving (1). First, unlike the Kennedy–Chua neural network, the proposed neural network is guaranteed to converge to an exact optimal solution, since it contains no penalty parameter.
Fig. 1. Architecture of the modified neural network in (5).
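As a rough numerical sketch of the dynamics in (5) (not the authors' analogue-circuit implementation), the state equation can be integrated with a simple forward-Euler scheme; the step size, iteration count and helper names below are illustrative assumptions, and build_W_p refers to the sketch after Eq. (3).

import numpy as np

def simulate(Q, c, A, d, B, b, l, h, alpha=1.0, lam=50.0, dt=1e-4, steps=20000, z0=None):
    # Forward-Euler integration of dz/dt = lam * (g(Wz + p) - z), Eq. (5).
    n, r, m = Q.shape[0], A.shape[0], B.shape[0]
    W, p = build_W_p(Q, c, A, d, B, b, alpha)
    def g(z):
        x, y, w = z[:n], z[n:n + r], z[n + r:]
        return np.concatenate([np.clip(x, l, h), np.maximum(y, 0.0), w])
    z = np.zeros(n + r + m) if z0 is None else np.asarray(z0, dtype=float).copy()
    for _ in range(steps):
        z = z + dt * lam * (g(W @ z + p) - z)
    return z[:n]  # output equation x(t) = Dz(t): the first n components of the state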
Next, it can be seen that the primal–dual neural network has a two-layer structure, and that the total numbers of multiplications and additions/subtractions performed per iteration by the primal–dual neural network in (4) are 2(n + m + r)^2 and 2(n + m + r)(n + m + r + 1), respectively, while the corresponding numbers for the proposed neural network in (5) are (n + m + r)^2 and (n + m + r)(n + m + r + 1). Therefore, the proposed neural network has a lower model complexity than the primal–dual neural network [12]. Third, as a one-layer neural network, we developed a dual neural network model for solving (1), given by

dz/dt = f(Ŵz + q − z) − (Ŵz + q),          (6)
where z ∈ R^{n+m+r},

Ŵ = [ Q^{-1}     Q^{-1}B^T     −Q^{-1}A^T
      BQ^{-1}    BQ^{-1}B^T    −BQ^{-1}A^T
      −AQ^{-1}   −AQ^{-1}B^T   AQ^{-1}A^T ],      q = [−Q^{-1}c; −BQ^{-1}c; AQ^{-1}c],

and f(z) = [g_1(x); w; g_d(y)], where g_1(x) is defined in (2), g_d(y) = [g_d(y_1), …, g_d(y_r)]^T, and, for i = 1, …, r,

g_d(y_i) = y_i if y_i ≤ d_i,   d_i if y_i > d_i.

Since the weight matrix Ŵ involves the inverse matrix Q^{-1}, the dual neural network has a higher model complexity than the modified neural network. Moreover, in contrast with the three existing neural networks for solving (1), the present neural network is theoretically proven to converge quickly to the unique optimal solution of (1). Finally, since M is asymmetric and positive semi-definite, existing convergence results ([7,11–15,17] and references therein) cannot ascertain the convergence of the proposed neural network. However, the convergence and the convergence rate of the proposed neural network can be obtained here.
3. Convergence results

In this section, we show that the modified neural network is globally convergent to an optimal solution with a fast convergence rate. A definition and a lemma are first introduced.

Definition 1. A neural network is said to have an exponential convergence to x* if there exists T_0 > t_0 such that the output trajectory x(t) of the network satisfies ‖x(t) − x*‖ = O(e^{−η(t − t_0)}) for all t ≥ T_0, where η is a positive constant independent of the initial point.

Lemma 1. (i) For any initial point there exists a unique continuous solution z(t) of (5). (ii) The set of equilibrium points of (5) is nonempty. (iii) Let z(t) = (x(t), y(t), w(t)) denote the state trajectory of (5) with initial point z_0 = (x_0, y_0, w_0). Then z(t) ∈ X × R_+^r × R^m if x_0 ∈ X and y_0 ≥ 0.

Proof. (i) It is easy to see that the right-hand side of (5) is Lipschitz continuous, so for any initial point there exists a unique continuous solution z(t) of (5). Since (1) has a solution, the set of equilibrium points of (5) is nonempty. Similar to the analysis given in [13], we can see that z(t) ∈ X × R_+^r × R^m if x_0 ∈ X and y_0 ≥ 0, and that z(t) approaches X × R_+^r × R^m exponentially if x_0 ∉ X and y_0 ≥ 0. □

We now establish our main results on the proposed neural network.
Theorem 1. The state trajectory of the proposed neural network in (5) is globally convergent to an equilibrium point of (5).

Proof. Let z(t) be the state trajectory of (5) with initial point z_0. Without loss of generality we assume that z_0 ∈ X × R_+^r × R^m, where R_+^r = {y ∈ R^r | y ≥ 0} and X = {x | l ≤ x ≤ h}. Consider the following Lyapunov function:

E(z) = {z − g(Wz + p)}^T (Mz − p) − (1/(2α)) ‖z − g(Wz + p)‖^2 + (1/(2α)) ‖z − z*‖^2,

where z* is an equilibrium point of (5), M is defined in (4), and α > 0 is defined in (2). Similar to the analysis given in [15], we can obtain E(z) ≥ (1/2)‖z − z*‖^2 and

dE(z)/dt ≤ −λ((I − W)z − p)^T (z − z*) − λ{z − g(Wz + p)}^T M {z − g(Wz + p)}.

Since z* is an equilibrium point of (5), z* = g(Wz* + p). Then

((I − W)z − p)^T (z − z*) ≥ 0   for all z ∈ X × R_+^r × R^m.

It follows that

((I − W)z − p)^T (z − z*) ≥ (z − z*)^T (I − W)(z − z*) ≥ 0,

since I − W = M is positive semi-definite. Thus,

dE(z)/dt ≤ −λ(z − z*)^T (I − W)(z − z*) − λ{z − g(Wz + p)}^T (I − W){z − g(Wz + p)}.

Because E(z) is radially unbounded, for any initial point z_0 ∈ X × R_+^r × R^m there exists a convergent subsequence {z(t_k)} such that lim_{k→∞} z(t_k) = ẑ with dE(ẑ)/dt = 0. It can be seen that dE(ẑ)/dt = 0 implies

(ẑ − z*)^T (I − W)(ẑ − z*) + λ{ẑ − g(Wẑ + p)}^T (I − W){ẑ − g(Wẑ + p)} = 0.

That is,

(ẑ − z*)^T (I − W)(ẑ − z*) = 0,   {ẑ − g(Wẑ + p)}^T (I − W){ẑ − g(Wẑ + p)} = 0.
Let ẑ = (x̂, ŷ, ŵ) ∈ R^n × R^r × R^m and z* = (x*, y*, w*) ∈ R^n × R^r × R^m. Then (ẑ − z*)^T (I − W)(ẑ − z*) = 0 implies that (x̂ − x*)^T Q(x̂ − x*) = 0. Since Q is positive definite, x̂ = x* and Bx̂ = b. On the other hand, {ẑ − g(Wẑ + p)}^T (I − W){ẑ − g(Wẑ + p)} = 0 implies that

g_1(x̂ − (Qx̂ + c + A^T ŷ − B^T ŵ)) − x̂ = 0.

Since (ŷ − y*)^T (d − Ax̂) = 0, we have ŷ^T (d − Ax̂) = (y*)^T (d − Ax̂) = 0. It follows that ŷ = g_2(ŷ + Ax̂ − d). Therefore, ẑ = (x̂, ŷ, ŵ) satisfies z = g(Wz + p); that is, ẑ is an equilibrium point of (5). Now consider another function

Ê(z) = {z − g(Wz + p)}^T ((I − W)z − p) − (1/(2α)) ‖z − g(Wz + p)‖^2 + (1/(2α)) ‖z − ẑ‖^2,

where ẑ = (x̂, ŷ, ŵ). Similar to the previous analysis, we have dÊ(z)/dt ≤ 0 and

lim_{k→∞} Ê(z(t_k)) = Ê(ẑ) = 0.

So, for any ε > 0 there exists t_q > 0 such that for t_k ≥ t_q we have Ê(z(t_k)) < ε^2/2. Since Ê(z(t)) decreases as t → +∞, for t ≥ t_q,

‖z(t) − ẑ‖ ≤ (2Ê(z(t)))^{1/2} ≤ (2Ê(z(t_k)))^{1/2} < ε.

Then lim_{t→∞} z(t) = ẑ. Therefore, the state trajectory of the proposed neural network is globally convergent to an equilibrium point of (5). □

Theorem 2. The output trajectory of the proposed neural network converges globally to the unique optimal solution of (1) with the following rate:

‖x(t) − x*‖^2 ≤ γ / (αλ(t − t_0))   for all t > t_0,
where γ is a positive constant. Moreover, the output trajectory of the proposed neural network has an exponential convergence rate.

Proof. First, let z(t) = (x(t), y(t), w(t)) be the state trajectory of (5). Then

dE(z)/dt ≤ −λ(z − z*)^T (I − W)(z − z*).

Note that

I − W = α[ Q    A^T    −B^T
           −A   O_11   O_12
           B    O_21   O_22 ].

Then

(z(t) − z*)^T (I − W)(z(t) − z*)
   = α [x(t) − x*; y(t) − y*; w(t) − w*]^T [ Q, A^T, −B^T; −A, O_11, O_12; B, O_21, O_22 ] [x(t) − x*; y(t) − y*; w(t) − w*]
   = α(x(t) − x*)^T Q(x(t) − x*).

By Theorem 1 it follows that

‖x(t) − x*‖^2 ≤ 2E(z(t)) ≤ 2E(z(t_0)) − 2λα ∫_{t_0}^{t} (x(s) − x*)^T Q(x(s) − x*) ds   for all t > t_0.

Then ‖x(t) − x*‖^2 decreases as t → +∞ and

λα ∫_{t_0}^{t} (x(s) − x*)^T Q(x(s) − x*) ds ≤ E(z(t_0))   for all t > t_0.
Let μ > 0 be the minimal eigenvalue of Q. Then

∫_{t_0}^{t} ‖x(s) − x*‖^2 ds ≤ E(z(t_0)) / (λαμ)   for all t > t_0.

Thus,

‖x(t) − x*‖^2 ≤ ‖x(t̂) − x*‖^2 = (1/(t − t_0)) ∫_{t_0}^{t} ‖x(s) − x*‖^2 ds ≤ E(z(t_0)) / (λαμ(t − t_0))   for all t > t_0,

where t_0 < t̂ < t. Let γ = E(z(t_0))/μ. Then

‖x(t) − x*‖^2 ≤ γ / (λα(t − t_0))   for all t > t_0.   □
Furthermore, we show that the output trajectory of the proposed neural network has an exponential convergence rate. Since the state trajectory z(t) = (x(t), y(t), w(t)) of the proposed neural network is globally convergent to (x*, y*, w*), there exists T > t_0 such that (x(t), y(t), w(t)) ≈ (x*, y*, w*) when t ≥ T. Similar to the analysis given in [12], using the projection technique we have

(z − z*)^T (I + M^T)(g(Wz + p) − z) ≤ −(z − z*)^T M(z − z*)   for all z ∈ R^{n+m+r}.

Note that y(t) − y* ≈ 0, w(t) − w* ≈ 0, g_2(y(t) + Ax(t) − d) − y(t) ≈ 0, and Bx(t) − b ≈ 0. Then

(z − z*)^T (I + M^T)(g(Wz + p) − z) ≈ (x − x*)^T (Q + I)(g_1(x − α(Qx + c + A^T y − B^T w)) − x).

Thus,

(x(t) − x*)^T (Q + I)(g_1(x(t) − α(Qx(t) + c + A^T y(t) − B^T w(t))) − x(t)) ≤ −(z(t) − z*)^T M(z(t) − z*) = −α(x(t) − x*)^T Q(x(t) − x*).

It follows that

(x(t) − x*)^T (Q + I) dx/dt ≤ −λ(z(t) − z*)^T M(z(t) − z*) = −λα(x(t) − x*)^T Q(x(t) − x*).

Let V(x) = (1/2)‖Q_1(x − x*)‖^2, where Q_1^2 = I + Q. Then there exists β_0 > 0 such that

dV(x)/dt ≤ −β_0 λα V(x),

and thus V(x(t)) ≤ V(x(t_0)) e^{−β(t − t_0)}, where β = β_0 λα. Then ‖x(t) − x*‖^2 = O(e^{−β(t − t_0)}). Therefore, the output trajectory of the proposed neural network converges exponentially to the unique optimal solution of (1).

Remark 1. Compared with all existing results [11–15,18], the proposed neural network is guaranteed to have a fast convergence rate provided that the design parameters α and λ are large enough.
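As an illustrative, entirely optional numerical check of the exponential rate in Definition 1 and Theorem 2, one can record ‖x(t) − x*‖^2 along a simulated trajectory of (5) and fit the decay exponent by least squares; the sketch below assumes the error samples are taken at a uniform time step dt.

import numpy as np

def estimate_decay_rate(errors, dt):
    # Fit log(||x(t) - x*||^2) = log(C) - beta * t and return the estimated beta.
    errors = np.asarray(errors, dtype=float)
    t = dt * np.arange(len(errors))
    mask = errors > 1e-14          # discard samples that have reached numerical zero
    slope, _intercept = np.polyfit(t[mask], np.log(errors[mask]), 1)
    return -slope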
Finally, we use an example to further show that the proposed neural network has a faster convergence rate than the three existing neural networks.

Example 1. Consider the following quadratic programming problem:

minimize   f(x) = 1.05x_1^2 + x_2^2 + x_3^2 + x_4^2 − 2x_1x_2 − 2x_1 − 2x_4
subject to 2x_1 + x_2 + x_3 + 4x_4 = 7,
           2x_1 + 2x_2 + 2x_4 = 6,
           −1 ≤ x_1 ≤ 1.5,  −1 ≤ x_2 ≤ 0.5,  −1 ≤ x_3 ≤ 1.5,  −1 ≤ x_4 ≤ 1.          (7)

Problem (7) has a unique optimal solution x* = [1.5, 0.5, −0.5, 1]^T. Let A = O, d = 0,

Q = [ 2.1   −2    0    0
      −2     2    0    0
       0     0    2    0
       0     0    0    2 ],      c = [−2; 0; 0; −2],

h = [1.5; 0.5; 1.5; 1],      l = [−1; −1; −1; −1],

and

B = [ 2  1  1  4
      2  2  0  2 ],      b = [7; 6].
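The data of problem (7) in the form (1) can be entered as follows. This sketch simply mirrors the reconstruction above (so the numerical entries should be checked against the original printing) and reuses the hypothetical simulate routine from Section 2, with the parameter values λ = 50 and α = 1 used later in this example.

import numpy as np

Q = np.array([[2.1, -2.0, 0.0, 0.0],
              [-2.0, 2.0, 0.0, 0.0],
              [0.0,  0.0, 2.0, 0.0],
              [0.0,  0.0, 0.0, 2.0]])
c = np.array([-2.0, 0.0, 0.0, -2.0])
B = np.array([[2.0, 1.0, 1.0, 4.0],
              [2.0, 2.0, 0.0, 2.0]])
b = np.array([7.0, 6.0])
A = np.zeros((1, 4))                # A = O and d = 0: no inequality constraints active
d = np.zeros(1)
l = np.array([-1.0, -1.0, -1.0, -1.0])
h = np.array([1.5, 0.5, 1.5, 1.0])

x = simulate(Q, c, A, d, B, b, l, h, alpha=1.0, lam=50.0)
print(x)   # expected to approach x* = [1.5, 0.5, -0.5, 1] under this reconstruction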
Then problem (7) can be written in the form (1). We solve (7) using the proposed neural network. All simulation results show that the proposed neural network always converges to an equilibrium point of (5) and that its output trajectory converges exponentially to x*. For example, let λ = 50 and α = 1.
Fig. 2. Convergence behavior of ‖x(t) − x*‖ based on the modified neural network in (5) with 10 random initial points in Example 1.
Table 1
Results for four methods in Example 1, where e = [1, …, 1]^T

                        KCNN method    PDNN method    DNN method    Modified method
Test 1
  Initial point         0 ∈ R^4        0 ∈ R^6        0 ∈ R^6       0 ∈ R^6
  CPU time (s)          28.23          1.27           22.25         0.22
  Iteration number      32093          1917           11167         373
  l2-norm error         0.063          0.0129         0.0868        5.54 × 10^−5
Test 2
  Initial point         e ∈ R^4        e ∈ R^6        e ∈ R^6       e ∈ R^6
  CPU time (s)          27.98          1.21           22.41         0.44
  Iteration number      32081          1873           11171         681
  l2-norm error         0.063          0.099          0.0706        2.75 × 10^−4
Test 3
  Initial point         −e ∈ R^4       −e ∈ R^6       −e ∈ R^6      −e ∈ R^6
  CPU time (s)          28.34          1.26           21.97         0.22
  Iteration number      32098          1901           11171         373
  l2-norm error         0.0632         0.0137         0.15          6.04 × 10^−5
Fig. 2 displays the convergence behavior of ‖x(t) − x*‖ based on (5) with 10 random initial points. For comparison, we solve this example using the Kennedy–Chua neural network (KCNN) with the penalty parameter set to 500, the primal–dual neural network (PDNN), the dual neural network (DNN), and the modified neural network, respectively. Their computational results are listed in Table 1, where the accuracy is measured by the l_2-norm error ‖x − x*‖_2. From Table 1 we see that the modified neural network not only gives a more accurate solution but also has a faster convergence rate than the other methods.
4. Application to real-time beamforming

Adaptive antenna techniques offer the possibility of increasing the performance of wireless communication systems by maximizing directional gain and enhancing protection against multipath fading [8]. The beamforming processor is used to generate an optimal set of beams to track the mobiles within the coverage area of the base station. The implemented algorithms have to rapidly enhance the desired signal and suppress noise and interference at the output of an array of sensors; thus, real-time solution algorithms are very desirable. Consider a linear array of n sensors. Let x_l(k) denote the received signal at the lth sensor, which consists of the desired signal, interferences, and measurement noise. Then
the n-dimensional vector of signals received from the n sensors is given by

x(k) = Σ_{i=1}^{M} a(θ_i) s_i(k) + v(k),

where s_i(k) is the kth sample of the signal transmitted by the ith user, a(θ_i) is the n-dimensional array response vector of the ith user, and v(k) is the n-dimensional vector of white noise samples. Since a linearly constrained minimum power (LCMP) beamformer can be obtained by minimizing the output energy, the LCMP beamformer output is an estimate of the desired signal, given by y(t) = w^H x(t), where the weight vector w is chosen according to the following optimization problem:

minimize   w^H R_x w
subject to c^H w = 1,          (8)

where H denotes the Hermitian transpose, R_x = E[x(t)x(t)^H] is the data covariance matrix, and c is the n × 1 constraint vector. For a robust LCMP beamformer [4,9], we consider here w to be an optimal solution of the following optimization problem:

minimize   w^H R_x w
subject to c^H w ≥ 1,  Im{c^H w} = 0.          (9)

Since the robust LCMP problem (9) is complex valued, we first convert it into a real-valued optimization problem. Substituting w = w_I + i w_II, R_x = R_I + i R_II, and c = c_I + i c_II into (9), problem (9) is equivalently written as

minimize   (1/2) x^T R x
subject to Dx ≥ 1,  Bx = 0,          (10)

where T denotes the transpose, x = [w_I; w_II] ∈ R^{2n}, and

R = [ R_I    −R_II
      R_II    R_I ],      B = [−c_II^T, c_I^T],      D = [c_I^T, c_II^T].
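The real-valued data (R, B, D) in (10) can be assembled mechanically from the complex quantities R_x and c. The following sketch is an illustration under the stated substitution; the function name and return layout are our assumptions, not from the paper.

import numpy as np

def to_real_qp(Rx, c):
    # Map the complex LCMP data (Rx, c) to the real-valued form (10), with x = [w_I; w_II].
    RI, RII = Rx.real, Rx.imag
    cI, cII = c.real, c.imag
    R = np.block([[RI, -RII],
                  [RII,  RI]])
    D = np.concatenate([cI,  cII]).reshape(1, -1)   # Re{c^H w} >= 1  ->  D x >= 1
    B = np.concatenate([-cII, cI]).reshape(1, -1)   # Im{c^H w} =  0  ->  B x =  0
    return R, B, D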
Conventional approaches, such as the least-mean-square (LMS) algorithm and the recursive least-squares (RLS) algorithm [4], have been used to find an optimal solution of the LCMP problem. We now apply the proposed neural network to obtain the robust optimal solution of the LCMP problem in real time. The corresponding neural network becomes

State equation:

dz/dt = λ [ −Rx + D^T y + B^T v
            (y − Dx + 1)^+ − y
            −Bx ],          (11)

where (·)^+ denotes max{0, ·}.

Output equation:

u(t) = x(t),

where z = [x; y; v] ∈ R^{2n+2} is a state vector and u is an output vector.
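A rough discrete-time sketch of (11) (forward Euler with a hypothetical step size, not the authors' implementation) could look as follows, with scalar multipliers y and v because (10) has a single inequality and a single equality constraint.

import numpy as np

def lcmp_network(R, B, D, lam=1.0, dt=1e-4, steps=50000):
    # Forward-Euler integration of the beamforming network (11); returns x = [w_I; w_II].
    x = np.zeros(R.shape[0])
    y = 0.0                          # multiplier of D x >= 1
    v = 0.0                          # multiplier of B x = 0
    d_row, b_row = D.ravel(), B.ravel()
    for _ in range(steps):
        dx = -R @ x + d_row * y + b_row * v
        dy = max(y - d_row @ x + 1.0, 0.0) - y
        dv = -(b_row @ x)
        x = x + dt * lam * dx
        y = y + dt * lam * dy
        v = v + dt * lam * dv
    return x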
We now compare the complexity of the proposed neural network with that of the LMS and RLS algorithms. It is easy to see that the proposed neural network requires 4n^2 + 4n multiplications per iteration. For the LMS algorithm applied to the LCMP beamformer, the updating step requires 8n multiplications per iteration, while for the RLS algorithm the updating step requires 16n^2 + 14n multiplications per iteration. Thus, the proposed neural network has a lower computational complexity than the RLS algorithm. The proposed neural network does not have a lower computational complexity than the LMS algorithm, but it is more robust against outliers. Moreover, the proposed algorithm has a faster convergence rate than both the LMS algorithm and the RLS algorithm. As a corollary of Theorem 2, we have the following result.

Corollary 1. The output trajectory of the proposed neural network in (11) is globally convergent to a robust optimal solution of the LCMP problem with the fast convergence rate given in Theorem 2.

Example 2. Consider a ten-sensor uniform linear array receiving three uncorrelated, narrow-band signals corrupted by additive white noise in a stationary environment. A desired signal with signal-to-noise ratio (SNR) 10 dB impinges on the array from φ_0 = 0.0275π. Two interferences with interference-to-noise ratio (INR) 20 dB are located at φ_1 = 0.22π and φ_2 = 0.40π, respectively. Our simulation results are averaged over 100 independent trials.
Fig. 3. Comparison of the beampatterns of two different beamformers in Example 2, where the solid line corresponds to the modified neural network and the dotted line to the Kennedy–Chua network.
Fig. 4. Comparison of the beampatterns of two different beamformers in Example 2, where the solid line corresponds to the modified neural network and the dotted line to the RLS algorithm.
In addition, to compute the covariance matrix of x(k) we use the following average estimate:

R_x ≈ (1/N) Σ_{k=1}^{N} x(k) x(k)^H,

where N is the number of samples.
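For reference, here is a hedged sketch of how the Example 2 snapshots and the average covariance estimate above could be generated; the half-wavelength element spacing, the electrical-angle convention, the snapshot count N and all helper names are our assumptions, not details given in the paper.

import numpy as np

def steering(n, phi):
    # Steering vector of an n-sensor uniform linear array; phi in radians (assumed convention).
    return np.exp(1j * np.arange(n) * phi)

def sample_covariance(n=10, N=200, snr_db=10.0, inr_db=20.0, seed=0):
    # Generate N snapshots of the assumed Example 2 scenario and return Rx = (1/N) sum_k x(k) x(k)^H.
    rng = np.random.default_rng(seed)
    phis = [0.0275 * np.pi, 0.22 * np.pi, 0.40 * np.pi]      # desired signal and two interferers
    amps = [10 ** (snr_db / 20.0), 10 ** (inr_db / 20.0), 10 ** (inr_db / 20.0)]
    A = np.column_stack([steering(n, p) for p in phis])
    S = np.diag(amps) @ ((rng.standard_normal((3, N)) + 1j * rng.standard_normal((3, N))) / np.sqrt(2))
    V = (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))) / np.sqrt(2)   # unit-power noise
    X = A @ S + V
    return (X @ X.conj().T) / N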
We compare the proposed neural network with the Kennedy–Chua network and the RLS algorithm. In the three algorithms, λ = 1, the initial estimate of the weight vector is zero, and the penalty parameter contained in the Kennedy–Chua network is 100. Fig. 3 compares the array beampatterns obtained using the proposed neural network in (11) and the Kennedy–Chua network. Fig. 4 compares the array beampatterns obtained using the proposed neural network in (11) and the RLS algorithm. It can be seen that the peaks of the mainlobes obtained by the proposed neural network are closer to the desired signal direction than those of the Kennedy–Chua network and the RLS algorithm. Moreover, the nulls at the interference directions are distorted by the Kennedy–Chua network and the RLS algorithm, whereas no such distortion occurs with the proposed network.

References

[1] D.P. Bertsekas, Parallel and Distributed Computation: Numerical Methods, Prentice-Hall, Englewood Cliffs, NJ, 1989.
[2] A. Bouzerdoum, T.R. Pattison, Neural network for quadratic optimization with bound constraints, IEEE Trans. Neural Networks 4 (2) (1993) 293–304.
[3] A. Cichocki, R. Unbehauen, Neural Networks for Optimization and Signal Processing, Wiley, England, 1993.
[4] H. Cox, R. Zeskind, M. Owen, Robust adaptive beamforming, IEEE Trans. Acoust. Speech Signal Process. 35 (1987) 1365–1376.
[5] Z.G. Hou, A hierarchical optimization neural network for large-scale dynamic systems, Automatica 37 (2001) 1931–1940.
[6] N. Kalouptisidis, Signal Processing Systems, Theory and Design, Wiley, New York, 1997.
[7] M.P. Kennedy, L.O. Chua, Neural networks for nonlinear programming, IEEE Trans. Circuits Syst. 35 (5) (1988) 554–562.
[8] I.S. Reed, J.D. Mallett, L.E. Brennan, Rapid convergence rate in adaptive arrays, IEEE Trans. Aerosp. Electron. Syst. AES-10 (1974) 853–863.
[9] S. Vorobyov, A. Gershman, Z.Q. Luo, Robust adaptive beamforming using worst-case performance optimization: a solution to the signal mismatch problem, IEEE Trans. Signal Process. 51 (2003) 313–324.
[10] Z.S. Wang, J.Y. Cheung, Y.S. Xia, J.D. Chen, Minimum fuel neural networks and their applications to overcomplete signal representations, IEEE Trans. Circuits Syst. Part I 47 (8) (2000) 1146–1159.
[11] X. Gao, L.Z. Liao, W. Xue, A neural network for a class of convex quadratic minimax problems with constraints, IEEE Trans. Neural Networks 15 (2004) 622–628.
[12] Y.S. Xia, A new neural network for solving linear and quadratic programming problems, IEEE Trans. Neural Networks 7 (4) (1996) 1544–1547.
[13] Y.S. Xia, J. Wang, On the stability of globally projected dynamic systems, J. Optim. Theory Appl. 106 (1) (2000) 129–150.
[14] Y.S. Xia, J. Wang, A dual neural network for kinematic control of redundant robot manipulators, IEEE Trans. Syst. Man Cybern. Part B 31 (1) (2001) 147–154.
[15] Y.S. Xia, J. Wang, Global asymptotic and exponential stability of a dynamic neural network with asymmetric connection weights, IEEE Trans. Autom. Control 46 (4) (2001) 635–638.
[16] T. Yoshikawa, Foundations of Robotics: Analysis and Control, MIT Press, Cambridge, MA, 1990.
[17] S.H. Żak, V. Upatising, S. Hui, Solving linear programming problems with neural networks: a comparative study, IEEE Trans. Neural Networks 6 (1995) 94–104.
[18] Y.S. Xia, G. Feng, J. Wang, A primal–dual network for on-line resolving constrained kinematic redundancy, IEEE Trans. Syst. Man Cybern. Part B (2005).

Youshen Xia received the B.S. and M.S. degrees in computational mathematics from Nanjing University, China, in 1982 and 1989, respectively. He received the Ph.D. degree from the Department of Automation and Computer-Aided Engineering, The Chinese University of Hong Kong, in 2000. His present research interests include system identification, signal and image processing, and the design and analysis of recurrent neural networks for constrained optimization and their engineering applications.
Gang Feng received the B.Eng. and M.Eng. degrees in automatic control (electrical engineering) from Nanjing Aeronautical Institute, China, in 1982 and 1984, respectively, and the Ph.D. degree in electrical engineering from the University of Melbourne, Australia, in 1992. He has been with the City University of Hong Kong since 2000 and was with the School of Electrical Engineering, University of New South Wales, Australia, from 1992 to 1999. He was awarded an Alexander von Humboldt Fellowship in 1997–1998, and was a visiting Fellow at the National University of Singapore (1997) and at Aachen Technology University, Germany (1997–1998). He has authored and/or coauthored more than 90 refereed international journal papers and numerous conference papers. His current research interests include robust adaptive control, signal processing, piecewise linear systems, and intelligent systems and control. Dr. Feng is an associate editor of IEEE Transactions on Fuzzy Systems, IEEE Transactions on Systems, Man and Cybernetics, Part C, and the Journal of Control Theory and Applications, and was an associate editor of the Conference Editorial Board of the IEEE Control Systems Society.