Weighted least squares design of 2-D FIR filters using a matrix-based generalized conjugate gradient method

Weighted least squares design of 2-D FIR filters using a matrix-based generalized conjugate gradient method

Available online at www.sciencedirect.com Journal of the Franklin Institute 353 (2016) 1759–1780 www.elsevier.com/locate/jfranklin Weighted least sq...

2MB Sizes 0 Downloads 32 Views

Available online at www.sciencedirect.com

Journal of the Franklin Institute 353 (2016) 1759–1780 www.elsevier.com/locate/jfranklin

Weighted least squares design of 2-D FIR filters using a matrix-based generalized conjugate gradient method$ Ruijie Zhaoa, Xiaoping Laib, Zhiping Linc,n a

School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai 264209, China b Institute of Information and Control, Hangzhou Dianzi University, Hangzhou 310018, China c School of Electrical and Electronic Engineering, Nanyang Technological University, 639798 Singapore, Singapore Received 27 November 2015; accepted 1 March 2016 Available online 10 March 2016

Abstract One of the important issues in the weighted least-squares (WLS) design of two-dimensional (2-D) finite impulse response (FIR) filters is the computational complexity of the design algorithms. This paper presents a matrix formulation of the design problem and derives a matrix-based generalized conjugate gradient (GCG) algorithm. By defining a linear operator acting on the coefficient matrix of the filter, the optimality condition of the design problem is expressed as a linear operator equation. By proving the boundedness, self-adjointness and positive-definiteness of the linear operator in the Hilbert space of the coefficient matrix, a GCG algorithm designated for the Hilbert matrix space is used to solve the linear operator equation. The convergence of the proposed algorithm in finite steps is established and its high computational efficiency is demonstrated by design examples and comparison with some existing methods. & 2016 The Franklin Institute. Published by Elsevier Ltd. All rights reserved.

1. Introduction Design problems of two-dimensional (2-D) filters have received a great deal of attention due to their widespread applications in image processing, sonar and radar signal processing, geophysical signal processing and so on [1,2]. Since finite impulse response (FIR) filters are ☆ This work was supported in part by the National Nature Science Foundation of China under Grants 61304142 and 61573123, in part by the National Basic Research Program of China under Grants 2012CB821200. n Corresponding author. E-mail addresses: [email protected] (R. Zhao), [email protected] (X. Lai), [email protected] (Z. Lin).

http://dx.doi.org/10.1016/j.jfranklin.2016.03.001 0016-0032/& 2016 The Franklin Institute. Published by Elsevier Ltd. All rights reserved.

1760

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

inherently stable and can have linear phase responses, they are widely used in practice. Many methods have been reported for the design of 2-D FIR filters over the past forty years, e.g., the frequency sampling methods [3], the singular-value decomposition (SVD)-based methods [2], the frequency transformation methods [4–8], and the optimal design methods [9–24]. Minimax and weighted least squares (WLS) are two commonly used criterions for optimal design of FIR filters, e.g., [9–18,20–22,24,25]. Among them, minimax designs minimize the maximum approximation errors, and WLS designs minimize the approximation error energies. While the former generally requires the use of sophisticated optimization techniques, the latter can be solved analytically. A more general design criterion is the least lp-norm (pZ 2) criterion, which includes least squares (p ¼ 2) as its special case and approaches minimax when p goes to 1. It is known that the least lp-norm and minimax solution filters can be obtained by solving a sequence of WLS subproblems [19,26], or equivalently, the WLS algorithms for 2-D FIR filters are often the iteration cores of the least lp-norm and minimax designs of 2-D FIR filters. Moreover, obtaining a 2-D FIR filter within a few seconds is often expected by the users in an interactive design. In such cases, the computational complexity of the WLS algorithms for 2-D FIR filters is still a very important issue to consider. However, in design methods presented in [19,21–23,27–29], the coefficient matrix of the 2-D FIR filter is rearranged into a vector, generally leading to a heavy computational burden, e.g., OðN 6 Þ for the WLS design of an 2-D FIR filter of size N  N. After vectorizing the filter coefficients, those methods solve for the vectorized filter coefficients using design algorithms for 1-D filters. To this end, we refer to those methods as 1-D or vectorized methods in this paper. In order to reduce the computational complexity, several matrix-based design methods (also known as 2-D design methods) have been proposed to formulate the design problems in terms of the coefficient matrix and solve for it directly rather than vectorizing it first [9–18,20,24]. In [9–12], unweighted least squares (LS) designs of 2-D FIR filters are investigated and some analytical solutions in matrix form have been obtained. But unweighted LS designs have relatively large magnitude errors in the passband and stopband edges due to the Gibbs phenomenon. To overcome this disadvantage, WLS designs are considered in [13–18]. When the weighting functions are separable, close-form solutions have been obtained in [18]. In other cases, several iterative solution methods have been presented [13–17]. The algorithms in [14–16] only allow 0-1 valued, two-valued, and four-valued weighting functions, respectively. The WLS design with arbitrary weighting functions is considered in [13] and two matrix-based iterative algorithms are presented: one is a fixed point algorithm and the other is a generalized conjugate gradient (GCG) algorithm in Hilbert space. By defining the design problem on a uniform frequency grid, the fast 2-D DCT, IDCT, DST and IDST have been used in the two algorithms, resulting in much lower computational complexity than conventional 1-D methods. Algorithms in [17] also allow arbitrary weighting functions, and are very efficient, but the iteration numbers required for their convergences increase quickly with the increase of the filter sizes and transition band widths. In this paper, the WLS design problem of 2-D FIR filters is firstly formulated in a matrix form as that in [17]. The known basis functions, weighting function, desired frequency response and the unknown impulse response all appear as matrices. Then, another GCG algorithm is derived from the matrix-based problem formulation for the WLS design of quadrantally symmetric 2-D FIR filters with arbitrary weighting functions. The optimality condition of the design problem is obtained and written as a linear operator equation. By proving the boundedness, self-adjointness and positive-definiteness of the associated linear operator, the GCG method in [30] designated

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

1761

for linear operator equations in a Hilbert space is adapted to solve the optimality condition. The resultant matrix-based GCG algorithm has the property of termination in finite steps. Theoretically, the computational complexity of the algorithm is in the order of OðN 5 Þ for an N  N filter, lower than the vectorized WLS algorithms. Taking into account of the good convergence property and an allowable convergence tolerance, its actual complexity is only about OðN 4 Þ in many 2-D filter design examples we have run, including those to be presented in this paper. Compared with the GCG algorithm in [13] (we refer to it as DCT-based GCG algorithm), the proposed matrix-based GCG algorithm also has fewer computations. This paper is organized as follows. Section 2 is dedicated to the WLS design methods. After the conventional vectorized method and the DCT-based GCG algorithm are reviewed in Section 2.1 Section 2.2, respectively, the matrix-based GCG algorithm is developed in Section 2.3. Then, the complexity analysis of the proposed algorithm is given in Section 2.4. Design examples and comparisons with existing methods are provided in Section 3 to illustrate the good performance of the proposed matrix-based GCG algorithm. Conclusions are drawn in Section 4.

2. WLS design methods for 2-D FIR filters 2.1. Conventional vectorized method A 2-D FIR filter of size N 1  N 2 can be characterized by its frequency response Hðejω1 ; ejω2 Þ ¼

NX 1  1 NX 2 1

hðn1 ; n2 Þe  jðn1 ω1 þn2 ω2 Þ ;

ð1Þ

n1 ¼ 0 n2 ¼ 0

where fhðn1 ; n2 Þ; n1 ¼ 0; 1; ⋯; N 1  1; n2 ¼ 0; 1; ⋯; N 2  1g is its impulse response, ω1 and ω2 are the horizontal and vertical frequencies, respectively. The frequency response (1) is often expressed as Hðejω1 ; ejω2 Þ ¼ Mðω1 ; ω2 Þejφðω1 ;ω2 Þ ; where Mðω1 ; ω2 Þ is the magnitude response and φðω1 ; ω2 Þ is the phase response, both of which are real functions of ω1 and ω2. In this paper, we consider the quadrantally (or fourfold) symmetric 2-D filters whose impulse response hðn1 ; n2 Þ satisfies the restriction [1]: hðn1 ; n2 Þ ¼ hðn1 ; N 2  1  n2 Þ ¼ hðN 1  1 n1 ; n2 Þ. In this case, the phase response φðω1 ; ω2 Þ is linear with respect to ω1 and ω2, and the magnitude response Mðω1 ; ω2 Þ is symmetric about the ω1 and ω2 axes where Mðω1 ; ω2 Þ ¼

L1 X L2 X

aij ϕi ðω1 ; N 1 Þϕj ðω2 ; N 2 Þ

i¼1j¼1

with ( Lk ¼

N k =2;

for even N k ;

ðN k þ 1Þ=2;

for odd N k ;

ð2Þ

1762

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

for k ¼ 1,2, ϕn ðω; NÞ0 s are functions 8 pffiffiffi 1= 2 > > > < cos ððn 1ÞωÞ   ϕn ðω; N Þ ¼ > 2n 1 > > ω : cos 2

of ω defined by for n ¼ 1 and odd N; for n a 1 and odd N; for even N;

and aij's are real coefficients related to fhðn1 ; n2 Þg by some rules. For example, if both N1 and N2 are even, aij's are related to fhðn1 ; n2 Þg by [17]: aij ¼ 4 hðL1  i; L2  jÞ; for i ¼ 1; 2; ⋯; L1 ; j ¼ 1; 2; ⋯; L2 . Conventional vectorized methods [19,21–23,27–29] i ¼ 1; 2; …; L1 ; j ¼ 1; 2; …; L2 in a vector,

compact

the

coefficients

aij,

a ¼ ½a11 ; a12 ; …; a1L2 ; a21 ; a22 ; …; aL1 L2 T ; where the superscript “T” denotes the transpose operation. Then, the magnitude response Mðω1 ; ω2 Þ in (2) can be rewritten as Mðω1 ; ω2 Þ ¼ ϕðω1 ; ω2 ÞT a;

ð3Þ

where ϕðω1 ; ω2 Þ is a vector of the form ϕðω1 ; ω2 Þ ¼ ½ϕ11 ϕ21 ; ϕ11 ϕ22 ; …; ϕ11 ϕ2L2 ; ϕ12 ϕ21 ; ϕ12 ϕ22 ; …; ϕ1L1 ϕ2L2 T ; with ϕ1i ϕ2j standing for ϕi ðω1 ; N 1 Þϕj ðω2 ; N 2 Þ. As previously mentioned, the magnitude response Mðω1 ; ω2 Þ described in (2) is symmetric with respect to ω1 and ω2. To this end, when we approximate a desired magnitude response with Mðω1 ; ω2 Þ, we only need to consider the approximation on the first quarter-plane Ω ¼ fðω1 ; ω2 Þj0r ω1 r π; 0r ω2 r πg. Let Dðω1 ; ω2 Þ be the desired magnitude response, which is a real-valued function of ω1 and ω2. We discretize Ω by an M 1  M 2 rectangular frequency grid defined as ^ ¼ fðω1i ; ω2j Þji ¼ 1; 2; …; M 1 ; j ¼ 1; 2; …; M 2 g; Ω

ð4Þ

where 0 r ω1i r π; and 0 r ω2j r π. Then, the weighted square sum of the approximation error is EðaÞ ¼

M1 X M2 X

Wðω1i ; ω2j ÞjMðω1i ; ω2j Þ  Dðω1i ; ω2j Þj2 ;

ð5Þ

i¼1j¼1

where Wðω1 ; ω2 Þ is a nonnegative weighting function. Without loss of generality, we assume that maxi;j fWðω1i ; ω2j Þg ¼ 1. It is not difficult to know from (3) and (5) that E(a) can be rewritten as EðaÞ ¼ ðFa  dÞT WðFa  dÞ;

ð6Þ

where F is an ðM 1 M 2 Þ  ðL1 L2 Þ matrix, W is an ðM 1 M 2 Þ  ðM 1 M 2 Þ diagonal matrix whose diagonal entries are Wðω1i ; ω2j Þ and d is an ðM 1 M 2 Þ  1 vector with components Dðω1i ; ω2j Þ. In conventional 1-D methods, the WLS design of a quadrantally symmetric 2-D filter is to minimize E(a) with respect to the coefficient vector a, min EðaÞ: a

ð7Þ

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

1763

The problem is then generally solved for a analytically by [19] a ¼ ðFT WFÞ  1 FT Wd:

ð8Þ

In the WLS design method described above, the length and width of the frequency grid are recommended as M 1 Z 4N 1 and M 2 Z 4N 2 for a tradeoff between the approximation accuracy and computational complexity. When M1 and M2 are large, a large memory space and long CPU time would be needed to compute the high-dimensional coefficient vector a from (8).

2.2. DCT-based GCG algorithm The impulse response coefficients, hðn1 ; n2 Þ's, of a 2-D FIR filter and their associated coefficients, aij's, are naturally in matrix forms. In the following sections, the WLS design problem of 2-D FIR filters will be reformulated in the way that the filter coefficients aij's are in a matrix form, rather than the vector form as in the previous subsection. To that end, we formulate the magnitude response (2) of a quadrantally symmetric filter as Mðω1 ; ω2 Þ ¼ ψ T ðω1 ; N 1 ÞAψðω2 ; N 2 Þ;

ð9Þ

where ψðω; NÞ is a vector given by 8h  i T > > p1ffiffi ; cos ðωÞ; cos ð2ωÞ; …; cos N  1 ω ; for odd N; > 2 < 2  

  T ψ ðω; N Þ ¼ ω 3ω N 1 > > ω ; for even N; cos ; cos ; …; cos > : 2 2 2 and A is the L1  L2 real coefficient matrix of the filter with its ij-th entry being aij. Aravena and Gu [13] considered the WLS approximation of 2-D linear-phase FIR filters on a uniform frequency grid:     2i 1 2j  1 ω1i ; ω2j ¼ π; π ; i ¼ 1; 2; …; M 1 ; j ¼ 1; 2; …; M 2 : ð10Þ 2M 1 2M 2 For quadrantally symmetric filters, the approximation problem formulated in [13] can be reformulated as follows: 1 min 〈A; pðAÞ〉  〈A; Dw 〉 þ c A 2

ð11Þ

where c is a constant, Dw is an L1  L2 matrix given by Dw ¼

M1 X M2       4 X D ω1i ; ω2j W ω1i ; ω2j ψ ðω1i ; N 1 Þψ T ω2j ; N 2 ; M1M2 i ¼ 1 j ¼ 1

pðÞ is a linear operator acting on the L1  L2 real matrix space RL1 L2 defined by pðXÞ ¼

M1 X M2   4 X ψ T ðω1i ; N 1 ÞXψ ω2j ; N 2 M1M2 i ¼ 1 j ¼ 1

Wðω1i ; ω2j Þψðω1i ; N 1 Þψ T ðω2j ; N 2 Þ;

X A RL1 L2 ;

ð12Þ

1764

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

and 〈; 〉 denotes the inner product in RL1 L2 defined by 〈X; Y〉 ¼ trfXT Yg;

X; Y ARL1 L2 ;

ð13Þ

where trfg represents the trace of a matrix. Aravena and Gu did not compute the operator pðÞ directly from (12). Instead, by assuming the frequency grid in (10), they computed pðÞ using the fast 2-D DCT and IDCT techniques as follows: pðXÞ ¼ DCT2ððIDCT2ðX; M 1 ; M 2 Þ○WÞ; M 1 ; M 2 ; ½L1 ; L2 Þ;

ð14Þ

where “○” denotes the Hadamard matrix product (the entry-wise product [31]), W is the M 1  M 2 matrix with ij-th entry wij, DCT2ð; M 1 ; M 2 ; ½L1 ; L2 Þ represents the first L1 rows and L2 columns of the M 1  M 2 -point fast 2-D DCT, and IDCT2ð; M 1 ; M 2 Þ represents the M 1  M 2 -point fast 2-D IDCT. Using (14), the computation complexity of pðÞ would be reduced from OðL1 L2 M 1 M 2 Þ to OðM 1 M 2 log 2 ðM 1 M 2 ÞÞ. With the DCT-based formula (14) for pðÞ, Aravena and Gu then gave in [13] without derivation the following DCT-based GCG algorithm for problem (11): Algorithm 1. DCT-based GCG Algorithm Step 1: For an initial iterative matrix A0 , let R0 ¼ pðA0 Þ Dw , G0 ¼  R0 and set n¼ 0. Step 2: Let Anþ1 ¼ An þ αn Gn , where n ;Rn 〉 αn ¼  〈G〈Gn ;pðG : n Þ〉

ð15Þ

Step 3: Let Gnþ1 ¼  Rnþ1 þ λn Gn , where Rnþ1 ¼ pðAnþ1 Þ Dw ;

ð16Þ

〈Rnþ1 ;pðGn Þ〉 〈Gn ;pðGn Þ〉 :

ð17Þ

λn ¼

If n ¼ L1 L2 , terminate the algorithm. Otherwise, let n ¼ n þ 1 and go back to Step 2.□

2.3. Proposed matrix-based GCG method In this subsection, we will reformulate the WLS design problem (7) on the arbitrarily defined ^ in (4) in terms of matrix norm. Substituting the discretized rectangular frequency grid Ω frequency points ω1i and ω2j into vectors ψðω1 ; N 1 Þ and ψðω2 ; N 2 Þ in (9) and compacting all vectors ψðω1i ; N 1 Þ in an L1  M 1 matrix P and all vectors ψðω2j ; N 2 Þ in an L2  M 2 matrix Q, yields P ¼ ½ψðω11 ; N 1 Þ; ψðω12 ; N 1 Þ; …; ψðω1;M 1 ; N 1 Þ; Q ¼ ½ψðω21 ; N 2 Þ; ψðω22 ; N 2 Þ; …; ψðω2;M 2 ; N 2 Þ: We define by D the M 1  M 2 matrix with ij-th entries dij ¼ Dðω1i ; ω2j Þ. Then, the weighted square sum of the approximation error in (5) can be reformulated as   1 E ðAÞ ¼ J PT AQ  D ○Wð2Þ J 2F ; ð18Þ

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

1765

1 where J  J F denotes the Frobenius matrix norm and Wð2Þ defines the matrix whose ij-th entry is pffiffiffiffiffiffi wij . The design problem (7) can be rewritten in terms of coefficient matrix A as

ð19Þ

min EðAÞ: A

The cost EðAÞ is strictly convex if and only if the inequality   1 J PT XQ ○Wð2Þ J F 40;

ð20Þ

holds for all L1  L2 nonzero real matrices X [17]. Throughout the paper, we assume that inequality (20) holds, which means the problem (19) has a unique solution. Actually, because P and Q are of full rank if M 1 Z L1 and M 2 Z L2 [17], the condition (20) can be easily satisfied in practical designs. Obviously, EðAÞ is a quadratic function of A, and the optimality condition of the problem (19) is ∇EðAÞ ¼ 0, where ∇EðAÞ is the gradient of the cost function EðAÞ with respect to the matrix variable A given by [15], ∇EðAÞ ¼ 2P½ðPT AQ DÞ○WQT :

ð21Þ

Hence, we obtain a matrix equation P½ðPT AQÞ○WQT ¼ PðD○WÞQT :

ð22Þ

The conjugate gradient (CG) method is a very efficient method for solving systems of linear equations or unconstrained nonlinear optimization problems [32]. It guarantees the optimal solution in a finite number of steps when the cost function is quadratic. In [30], the original CG method in a real vector space was extended to a general form in a Hilbert space H for solving the linear operator equation TðxÞ ¼ d, where TðÞ is a linear operator, xA H and d A H. For convenience, the GCG method in [30] is given in Appendix A of this paper. It can be seen from Appendix A that the GCG method in [30] is an algorithmic framework solving the linear operator equation TðxÞ ¼ d in the Hilbert space H. In order to obtain a concrete GCG algorithm for solving Eq. (22) for the coefficient matrix A A RL1 L2 , the coefficient matrix space RL1 L2 should be equipped with a proper inner product. In this paper, we use the same inner product defined by (13) as that in [13]. With the inner product defined by (13), RL1 L2 becomes a Hilbert space. A proper linear operator TðÞ corresponding to linear equation Eq. (22) should also be defined such that Eq. (22) becomes a linear operator equation with the operator being bounded, selfadjoint and positive-definite. To this end, define the linear operator TðÞ acting from RL1 L2 to RL1 L2 by TðXÞ ¼ P½ðPT XQÞ○WQT ;

X A RL1 L2 :

ð23Þ

Then, Eq. (22) becomes a linear operator equation as follows: TðAÞ ¼ P½ðPT AQÞ○WQT ¼ PðD○WÞQT :

ð24Þ

The boundedness, self-adjointness and positive-definiteness of operator TðÞ are affirmed in the following Theorem, the proof of which is given in Appendix B. Theorem 1. TðÞ defined in (23) is a bounded, self-adjoint and positive-definite linear operator. By replacing the linear operator TðÞ with Eq. (22) and using Eq. (13) as the inner product in the Hilbert space RL1 L2 , we obtain the following matrix-based GCG algorithm that solves the WLS design problem (19) of the 2-D FIR filter for its coefficient matrix A.

1766

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

Algorithm 2. Matrix-based GCG Algorithm Step 1: For the specified magnitude response Dðω1 ; ω2 Þ, filter lengths N1 and N2, choose a ^ , an initial iterative matrix A0 and an error tolerance ε40. Construct the frequency grid Ω desired magnitude matrix D and weight matrix W, and then evaluate the matrices P and Q. Let G0 ¼  ∇EðA0 Þ and set n ¼ 0. Step 2: Let Anþ1 ¼ An þ αn Gn , where 1 n ;∇EðAn Þ〉 αn ¼  〈G : ð25Þ 2 〈Gn ;TðGn Þ〉 Step 3: Let Gnþ1 ¼  ∇EðAnþ1 Þ þ λn Gn , where ∇EðAnþ1 Þ ¼ ∇EðAn Þ þ 2αn TðGn Þ; ð26Þ λn ¼

〈∇EðAnþ1 Þ;TðGn Þ〉 : 〈Gn ;TðGn Þ〉

ð27Þ

If JAnþ1  An J F oε, terminate the algorithm. Otherwise, let n ¼ n þ 1 and go back to Step 2.□

For the initial iterative matrix A0 , A0 ¼ ðPPT Þ  1 PðD○WÞQT ðQQT Þ  1

ð28Þ

is a good choice. The updating Eq. (26) is obtained by substituting Anþ1 ¼ An þ αn Gn into (21), ∇EðAnþ1 Þ ¼ ∇EðAn Þ þ 2αn P½PT Gn QÞ○WQT ¼ ∇EðAn Þ þ 2αn TðGn Þ: As for the direction matrices fGn g generated by the above algorithm, we have the following theorem. The proof of this theorem can be found in Appendix C. Theorem 2. The direction matrices fGn g generated by the matrix-based GCG algorithm satisfy 〈Gi ; TðGj Þ〉 ¼ 0;

for ia j:

From Theorem 2, the direction matrix set fGi ; i ¼ 0; 1; …; ng generated by the matrix-based GCG algorithm is a linearly independent subset of the L1 L2 dimensional space RL1 L2 . Therefore, we immediately have the following theorem. Theorem 3. The matrix-based GCG algorithm terminates in not more than L1 L2 iterations. In practical designs, a small tolerance, say ε ¼ 10  5 , is allowed for the error norm of the coefficient matrices in two successive iterations. This will further reduce the iteration number. Comment 1: Our GCG algorithm has similar iteration steps as the DCT-based GCG algorithm. Actually, when the frequency grid is taken as the same as (10) and if ∇EðAn Þ and TðÞ are respectively replaced by 2Rn and pðÞ, our GCG algorithm almost reduces to the DCT-based GCG algorithm. The only but very important difference between the two GCG algorithms is on the computations of the operator TðÞ (pðÞ in the DCT-based GCG algorithm) and the matrix ∇EðAnþ1 Þ (Rnþ1 in the DCT-based GCG algorithm). Our GCG algorithm calculates TðÞ using the matrix equation Eq. (24). It is different from the fast DCT based Eq. (14) used by the GCG algorithm in [13] to compute pðÞ. Our algorithm computes ∇EðAnþ1 Þ using iterative Eq. (26), which is also different from Eq. (16) used in [13] to calculate Rnþ1 .

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

1767

Comment 2: The WLS design of a 2-D FIR filter is essentially a WLS bivariate function approximation problem, i.e., to fit the desired magnitude response surface Dðω1 ; ω2 Þ in the WLS sense using a linear combination of the basis functions ½1; cos ðω1 Þ; cos ð2ω1 Þ; … ½1; cos ðω2 Þ; cos ð2ω2 Þ; …, where  denotes the Kronecker product. It is not difficult to show that, by properly choosing a set of basis functions and correspondingly defining the matrices P, Q, D, and W, the matrix-based GCG algorithm can also be applied to some other surface fitting problems defined on a rectangular bivariate grid. An example of such problems is the WLS design of variable fractional delay 1-D FIR filters [33], in which the basis functions are ½1; cos ðωÞ; cos ð2ωÞ; …  ½1; p; p2 ; …, where p denotes the variable fractional delay of filters. Obviously, the DCT-based GCG algorithm cannot be applied to this problem. 2.4. Analysis of computational complexity and storage requirement We first give a memory requirement comparison between the proposed matrix-based GCG algorithm and the conventional vectorized method described in Section 2.1. The vectorized method has to store the matrices F and FT WF, requiring M 1 M 2 L1 L2 and L21 L22 memory cells, respectively. While, the proposed algorithm only needs a total of 2M 1 M 2 þ L1 M 1 þ L2 M 2 þ 4L1 L2 memory cells to store matrices W, D, An , Gn , P, Q, ∇EðAn Þ and TðGn Þ. If we let N 1 ¼ N 2 ¼ N and M 1 ¼ M 2 ¼ MðM Z 4NÞ for simplicity, then the memory spaces required by the vectorized algorithm and the matrix-based GCG algorithm are of orders OðN 2 ðM 2 þ N 2 ÞÞ and OðMðM þ NÞÞ, respectively. Obviously, the proposed matrix-based GCG algorithm needs much less memory space than the conventional vectorized one. For the amount of computation, besides the computation of matrix F, the conventional vectorized method (8) needs M 1 M 2 ðL1 L2 þ L21 L22 Þ multiplications to compute FT WF and OðL31 L32 Þ multiplications to compute the inversion of FT WF. In addition, it needs M 1 M 2 ð1 þ L1 L2 Þ þ L21 L22 multiplications to complete the remaining matrix multiply operations in (8). On the other hand, the matrix-based GCG algorithm only needs L1 L2 ðM 1 þ M 2 þ 3Þ þ M 1 M 2 ðL1 þ L2 þ 1Þ multiplications in one iteration, or a maximum number of L1 L2 ½L1 L2 ðM 1 þ M 2 þ 3Þ þ M 1 M 2 ðL1 þ L2 þ 1Þ multiplications for the maximum number of L1 L2 iterations. Again, if we let N 1 ¼ N 2 ¼ N and M 1 ¼ M 2 ¼ M (assumed to be proportional to N), the complexity of the conventional vectorized method is OðN 6 Þ and that of the matrix-based GCG algorithm is r OðN 5 Þ. In many practical designs we have run, the proposed algorithm can achieve a high design precision within 4N iterations. In this case, the complexity of the matrix-based GCG algorithm reduces to OðN 4 Þ. As for the DCT-based GCG algorithm in [13], the computational cost in one iteration mainly depends on the fast 2-D DCT and IDCT used to compute the operator pðÞ. Since there are at least M 1 M 2 ð2 log 2 M 1 M 2 þ 10Þ multiplications in one fast 2-D DCT/IDCT when M1 and M2 are integer powers of 2, 2M 1 M 2 ð2 log 2 M 1 M 2 þ 10Þ þ M 1 M 2 multiplications are needed by the operator pð:Þ according to (14), and thus the complexity of the DCT-based GCG algorithm is OðM 1 M 2 log 2 ðM 1 M 2 ÞÞ. Correspondingly, the main cost in one iteration of our algorithm is the computation of the operator TðÞ, requiring M 1 M 2 ðL1 þ L2 Þ þ L1 L2 ðM 1 þ M 2 Þ multiplications, and thus the complexity of our algorithm is OðmaxðM 1 M 2 L1 ; M 1 M 2 L2 ; L1 L2 M 1 ; L1 L2 M 2 ÞÞ. Theoretically, if M1 and M2 are proportional to L1 and L2, the computational complexity of the DCT-based GCG algorithm is lower than that of our algorithm. However, from a large number of design examples, we find that our algorithm is more efficient than that in [13]. One reason is that the algorithm in [13] needs to compute the operator pðÞ two times in one iteration, one for pðGn Þ and the other for pðAnþ1 Þ, and thus needs four times the computational cost of one 2-D DCT.

1768

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

As a result, the DCT-based GCG algorithm requires a total of 4M 1 M 2 ð2 log 2 M 1 M 2 þ 10Þ þ2M 1 M 2 þ 3L1 L2 multiplications in one iteration. Our algorithm just needs to compute the operator Tð:Þ one time. Another reason is that, we generally have M 1 Z 4N 1 and M 2 Z 4N 2 . This leads to more multiplications are required by the DCT-based GCG algorithm for filters of general orders. As an example, we consider the case of M 1 ¼ M 2 ¼ 4N 1 ¼ 4N 2 ¼ M. Then,  9 it needs at least M 2 ð16 log 2 M þ 42Þ multiplications to compute the two pðÞ's, and M 2 32 Mþ1 multiplications to compute TðÞ. For filter order N ¼ N ¼ 64, we have 1 2   9  16 log 2 M þ 42 ¼ 170 4 32 M þ 1 ¼ 73 , which implies that the ratio of multiplications needed by the DCT-based GCG to our GCG is 2.33. When the filter orders are lower or M44N 1 ¼ 4N 2 , this ratio would be even larger. In addition, there are large number of order sorting operations in 2-D DCT. As for the memory requirement, the DCT-based GCG algorithm requires 2M 1 M 2 þ 4L1 L2 cells to store matrices W, D, An , Gn , Rn and pðGn Þ. Table 1 Multiplications and memory cells required by the conventional vectorized method, the DCT-based GCG algorithm, the matrix iterative algorithms in [17] and the matrix-based GCG algorithm. Method

Number of memory cells

Number of multiplications

Conventional vectorized method DCT-based GCG algorithm Matrix iterative algorithms in [17] Matrix-based GCG algorithm

M 1 M 2 L1 L2 þ L21 L22

M 1 M 2 ð2L1 L2 þ L21 L22 Þ þ OðL31 L32 Þ þ M 1 M 2 þ L21 L22

2M 1 M 2 þ L1 M 1 þ L2 M 2 þ 4L1 L2

L1 L2 ðM 1 þ M 2 þ 3Þ þ M 1 M 2 ðL1 þ L2 þ 1Þn

Numbers of multiplications in one iteration.

1 0.8 0.6

transition band

stopband

0.4 ω2 in unit of π

n

2M 1 M 2 þ 4L1 L2 4M 1 M 2 ð2 log 2 M 1 M 2 þ 10Þ þ 2M 1 M 2 þ 3L1 L2 n 2M 1 M 2 þ 2L1 M 1 þ 2L2 M 2 þ 2L1 L2 L1 L2 ðM 1 þ M 2 Þ þ M 1 M 2 ðL1 þ L2 þ 1Þn

0.2

passband

0 -0.2 -0.4 -0.6 -0.8 -1 -1

-0.5

0

0.5

ω1 in unit of π

Fig. 1. Specifications of the diamond filter in Example 1.

1

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

1769

As another comparison, the matrix iterative algorithms I, II proposed in [17] need L1 L2 ðM 1 þ M 2 Þ þ M 1 M 2 ðL1 þ L2 þ 1Þ multiplications in each iteration, slightly less than those by the proposed matrix-based GCG algorithm. However, simulation results presented in Section 3 show that the matrix-based GCG algorithm consumes much less design time. This is due to the good convergence property of the matrix-based GCG algorithm, i.e., it requires fewer iterations than that required by the algorithms in [17] for convergence. In addition, the algorithms in [17] need a total of 2M 1 M 2 þ 2L1 M 1 þ 2L2 M 2 þ 2L1 L2 memory cells to store matrices W, D, An , P, Q, ðPPT Þ  1 P, QT ðQQT Þ  1 and ðPPT Þ  1 PðD○WÞQT ðQQT Þ  1 . All of the above comparison analyses are summarized in Table 1, where the numbers of multiplications for the DCT-based GCG algorithm, the algorithms in [17] and the matrix-based GCG algorithm are for one iteration. 3. Design examples and comparisons Three design examples are presented in this section to illustrate the good performance of the matrix-based GCG algorithm through comparisons with existing methods. All these designs are implemented in Matlab programs on a Dell vostro-5470 laptop with an Intel (R) Core (TM) i54210U CPU (2.4-GHz) and an 8-Gb memory. In order to compare with existing methods, we use ^ defined by (10) as in [13], and the values of M1 and M2 are taken as the same frequency grid Ω M 1 ¼ 4N 1 and M 2 ¼ 4N 2 . Example 1. Design of diamond filters and comparisons with vectorized methods. The passband Ωp and stopband Ωs of the diamond filters are given by Ωp ¼ fðω1 ; ω2 Þj jω1 j þ jω2 j r 0:66πg; Ωs ¼ fðω1 ; ω2 Þj jω1 j þ jω2 j Z0:7πg; and depicted in Fig. 1. The weights on the passband, stopband and transition band are taken to be 1, 0.2 and 0, respectively. In this example, the conventional vectorized method described in Section 2.1, the eigenfilter method in [29], a semidefinite programming (SDP) [34] method, and the proposed matrix-based GCG algorithm are used to design the diamond filters of various sizes. In the eigenfilter method, the eigenvector corresponding to the minimum eigenvalue of the norm matrix is computed using the inverse power method. In the SDP method, we first transform the problem (7) into a equivalent SDP form, similar as in [21], and then solve it using the CVX software package with Table 2 Comparison among the conventional vectorized method, the eigenfilter method, the SDP method and the matrix-based GCG algorithm in Example 1. Filter size Conventional vectorized method

11  11 31  31 51  51 71  71

Eigenfilter method

SDP method

Matrix-based GCG algorithm Nit

Tcpu(s)

Ew

7.4530 8 5.4965 9 3.1020 13 1.4000 18

0.0010 0.0068 0.029 0.081

7.4530 5.4965 3.1020 1.4000

Tcpu(s)

Ew

Tcpu(s)

Ew

Tcpu(s) Ew

0.0021 0.19 2.1 12.1

7.4530 5.4965 3.1020 1.4000

0.0024 0.21 2.1 12.1

7.4866 5.4988 3.1023 1.4001

0.21 2.7 40.7 233

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

Fig. 2. Magnitude response of the 51  51 diamond filter designed by the matrix-based GCG algorithm.

1 0.8 passband

0.6

β

0.4 ω in unit of π 2

1770

0.2

stopband

transition band 0 ϖ

-0.2 -0.4 -0.6 -0.8 -1 -1

-0.5

0

0.5

ω in unit of π 1

Fig. 3. Specifications of the fan filter in Example 2.

1

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

1771

the SeDuMi solver. The error tolerance in the matrix-based GCG algorithm is taken to be ε ¼ 10  5 . The design time (Tcpu), the weighted error sums (Ew in (5)), and the iteration numbers (Nit) required by the matrix-based GCG algorithm are listed in Table 2. Fig. 2 shows the magnitude response of the diamond filter of size 51  51 designed by our algorithm. From Table 2 we see that, all the four methods give almost the same values of Ew, which means they have almost the same design precision, but the matrix-based GCG algorithm is much more computationally efficient than other three methods, especially with the increase in filter size. For example, for the design of the 71  71 filter, the design times by the conventional vectorized method, eigenfilter method and SDP method are respectively 149, 149 and 2877 times of that by our algorithm. Example 2. Design of fan filters and comparisons with matrix-based algorithms. We compare the proposed algorithm with the matrix iterative algorithm II in [17] (denoted by MI algorithm for simplicity) and the DCT-based GCG algorithm in [13] through the design example provided in [17], i.e., fan filters with the passband Ωp and stopband Ωs in the first quadrant described as Ωp ¼ fðω1 ; ω2 Þj0 r ω1 r ω2 tan β; 0 r ω2 r πg; Ωs ¼ fðω1 ; ω2 Þjðω2 tan β þ ϖÞr ω1 r π; 0r ω2 r πg; Table 3 Comparison among the MI algorithm, the DCT-based GCG algorithm and the matrix-based GCG algorithm for filters of various sizes with ϖ ¼ 0:06π in Example 2. Filter size

11  11 31  31 51  51 71  71

MI algorithm

DCT-based GCG algorithm

Matrix-based GCG algorithm

Nit

Tcpu(s)

Ew

Nit

Tcpu(s)

Ew

Nit

Tcpu(s)

Ew

12(15) 27(7) 89(9) 402(5)

0.0019 0.012 0.086 0.53

10.5473 4.9838 1.3125 0.2911

8 12 21 50

0.0058 0.058 0.36 1.4

10.5473 4.9838 1.3125 0.2910

8 12 21 50

0.00061 0.0043 0.017 0.071

10.5473 4.9838 1.3125 0.2910

Table 4 Comparison among the MI algorithm, the DCT-based GCG algorithm and the matrix-based GCG algorithm for 51  51 filters with various transition band widths in Example 2. ϖ

0:04π 0:06π 0:08π 0:10π 0:12π

MI algorithm

DCT-based GCG algorithm

Matrix-based GCG algorithm

Nit

Tcpu(s)

Ew

Nit

Tcpu(s)

Ew

Nit

Tcpu(s)

Ew

31(6) 89(9) 355(6) 1087(9) 3329(7)

0.036 0.086 0.24 0.82 2.1

6.3128 1.3125 2:7140  10  1 5:4946  10  2 1:1381  10  2

13 21 47 86 127

0.15 0.36 0.63 1.0 1.5

6.3128 1.3125 2:7136  10  1 5:4787  10  2 1:0778  10  2

13 21 47 86 127

0.012 0.017 0.033 0.051 0.089

6.3128 1.3125 2:7136  10  1 5:4787  10  2 1:0778  10  2

1772

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780 90 80

MI Algorithm Matrix-based GCG Algorithm

70

E(An)

60 50 40 30 20 10 0 5

10

15

20

25

30

Iteration number n

Fig. 4. Convergence of the MI and matrix-based GCG algorithms for the design of the 51  51 filter with ϖ ¼ 0:06π in Example 2. 800 MI Algorithm

700

Matrix-based GCG Algorithm

Number of iterations

600

500

400

300

200

100

0

20

30

40

50

60

70

80

filter size N

Fig. 5. Iteration number versus filter size N for the MI and matrix-based GCG algorithms for ϖ ¼ 0:06π in Example 2.

where β ¼ π=6 and ϖ is a parameter used to control the transition band width. The weights on the passband, stopband and transition band are taken to be 1, 0.5 and 0, respectively. Fig. 3 shows the design specifications. For fair comparisons, we use the same termination condition described in Step 3 of our algorithm with ε ¼ 10  5 and the same initial iterative matrix (28) in all three algorithms. Tables 3 and 4 list the iteration numbers (Nit), design time (Tcpu) and weighted square error sums (Ew) obtained by the three algorithms. Note that for the MI algorithm the iteration numbers needed in its first phase are included in the bracket just behind the iteration numbers required in its second phase (see [17] for more details of the MI algorithm). Table 3 shows the design results for filters of various sizes with ϖ ¼ 0:06π. Firstly, from the table we find that, compared with the MI algorithm, the matrix-based GCG algorithm needs

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

1773

much fewer iterations for convergence. Fig. 4 shows the convergence of the two algorithms for the design of a 51  51 filter. Obviously, the proposed algorithm converges faster. Moreover, this advantage becomes more obvious when the filter size increases as shown in Fig. 5, which depicts 3500 MI Algorithm Matrix-based GCG Algorithm

Number of iterations

3000 2500 2000 1500 1000 500 0 0.02

0.04

0.06

0.08

0.1

0.12

ϖ in unit of π

Fig. 6. Iteration number versus ϖ for the MI and matrix-based GCG algorithms with N ¼ 51 in Example 2.

Fig. 7. Magnitude response of the 51  51 fan filter with ϖ ¼ 0:1π designed by the matrix-based GCG algorithm in Example 2.

1774

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

the relations between the iteration number and filter size for the two algorithms. Because fewer iterations are needed, the matrix-based GCG algorithm consumes much less design time, although it takes more multiplications in one iteration than the MI algorithm as mentioned in Section 2.4. Secondly, by comparing the results obtained by the DCT-based and the matrix-based GCG algorithms, it is found that the two algorithms take the same iterations and return the same weighted square error sum. This is because they share the same iteration framework. However, we find from the design time that the proposed algorithm is computationally more efficient than the DCT-based algorithm, confirming the complexity analysis in Section 2.4. Finally, the design results show that the proposed algorithm has the capability of precisely designing 2-D FIR filters in just a small fraction of a second using a personal computer. Table 4 lists the design results for 51  51 filters with various ϖ's, i.e., various transition band widths. Similarly, the matrix-based GCG algorithm is more efficient than the others. Especially, the matrix-based GCG algorithm needs much fewer iterations and less design time to obtain filters with smaller weighted square error sums than the MI algorithm. The efficiency superiority of the matrix-based GCG algorithm over the MI algorithm becomes more obvious when the transition band width becomes larger. In more detail, Fig. 6 depicts the change of the iteration numbers with ϖ for the two algorithms. With the increase of the transition band width, the iteration number of the matrix-based GCG algorithm increases much slower than that of the MI algorithm. It is interesting that for all of the three algorithms, the numbers of the required iterations increase with the transition band width. For the MI algorithm, increase in the transition band will lead to increase in the spectral radius of its iterative operator, and thus leads to decrease of the convergence rates [17]. For the GCG algorithms, the increase in the transition band makes the min ðTÞ ratio λλmax ðTÞ decrease, where λmin ðTÞ and λmax ðTÞ represent the maximum and minimum eigenvalues of the operator T defined in (23), therefore, convergence rate of the GCG algorithms

1 0.8 0.6 stopband

ω2 in unit of π

0.4 0.2 0

θ

passband

-0.2 -0.4 -0.6 -0.8 -1 -1

-0.5

0 ω1 in unit of π

0.5

1

Fig. 8. Specifications of the ideal fan filter used in [8] in Example 3.

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

1775

min ðTÞ decreases [30]. For example, in the design of 51  51 fan filters, the values of λλmax ðTÞ are 4 respectively 0.135, 0.0413, 0.0115, 0.00276 and 7:28  10 for ϖ ¼ 0:04π, 0:06π, 0:08π, 0:1π and 0:12π. Finally, Fig. 7 gives the magnitude response of a 51  51 fan filter with ϖ ¼ 0:1π designed by the proposed algorithm.

Example 3. Design of fan filters and comparisons with the frequency transformation algorithm. Frequency transformation methods are also commonly used techniques to design 2-D filters [4–8]. We give in this example a comparison of the matrix-based GCG algorithm with the efficient frequency transformation method recently proposed in [8] by designing fan FIR filters with specifications described in [8], i.e., the 39  39 filters with the ideal passband and stopband as depicted in Fig. 8. 1 0.8 0.6 stopband

ω2 in unit of π

0.4 0.2 0

θ

passband

-0.2 -0.4

transition band

-0.6 -0.8 -1 -1

-0.5

0 ω1 in unit of π

0.5

1

Fig. 9. Specifications of the fan filter used in our matrix-based GCG algorithm in Example 3. Table 5 Comparison between the frequency transformation algorithm in [8] and the matrix-based GCG algorithm in Example 3. θ

151 201 251 301 351 401

Algorithm in [8]

Matrix-based GCG algorithm

Emp

Ems

Emp

Ems

4:6288  10  2 6:0825  10  2 1:4777  10  1 9:0295  10  2 5:6098  10  2 6:7904  10  2

1:1245  10  2 1:2220  10  1 3:0419  10  1 4:5105  10  1 5:9034  10  1 7:2282  10  1

1:9993  10  3 1:8478  10  3 2:0888  10  3 2:6925  10  3 3:9819  10  3 3:7196  10  3

1:6289  10  3 1:5390  10  3 1:6957  10  3 1:8390  10  3 2:9231  10  3 2:9902  10  3

1776

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

For the matrix-based GCG algorithm, we use the passband Ωp and stopband Ωs shown in Fig. 9 and described by

 Ωp ¼ ðω1 ; ω2 Þj0r ω1 r π; 0 r ω2 r minfω1 tan θ; πg ;

 Ωs ¼ ðω1 ; ω2 Þj0r ω1 rπ; ð0:2π þ ω1 tan θÞr ω2 r π ; in the first quadrant. The weights are taken to be 1 on both the passband and stopband and, and 0 on the transition band. The tolerance parameter is taken to be ε ¼ 10  5 . Design results are summarized in Table 5, where Emp and Ems represent the maximum magnitude errors in the passband and stopband, respectively. Evidently, the matrix-based GCG algorithm has obtained 2-D filters with much better magnitude behaviors for all values of the passband angle θ. Our experience based on a large number of design examples has convinced us that filters designed by our algorithm have better performance than those designed by various frequency transformation methods. 4. Concluding remarks The WLS design of 2-D FIR filters is studied and a matrix-based GCG algorithm is presented. The algorithm retains the natural coefficient matrix form of the filters, and is guaranteed to converge to the optimal solution in a finite number of iterations. Therefore, 2-D FIR filters can be efficiently designed with high precision. It has been shown by the complexity analysis and design results that the proposed algorithm is more efficient than the conventional vectored method, the eignfilter methods, the SDP method, the MI algorithm, the DCT-based GCG algorithm and the frequency transformation methods. Design examples demonstrate the high precision and high computational efficiency of the proposed algorithm. For future works, we could use the proposed matrix-based GCG algorithm as an iteration core in any iterative reweighted least-squares algorithm to efficiently solve the least lp-norm (p Z 2, including the minimax with p ¼ 1) design of quadrantally symmetric 2-D FIR filters. We may also generalize the matrix-based GCG algorithm to the designs of linear-phase centro-symmetric and non-linear phase 2-D FIR filters. Appendix A. GCG method in [30] Consider a linear operator equation TðxÞ ¼ d in a Hilbert space H equipped with an inner product 〈; 〉H , where x A H and d A H. When the linear operator TðÞ is bounded, self-adjoint and positive-definite, the GCG method in [30] for solving TðxÞ ¼ d can be stated as follows. Algorithm 3. GCG Method in [30] Step 1: Given x0 A H, let r 0 ¼ Tðx0 Þ  d, g0 ¼  r 0 and n ¼ 0. Step 2: Let xnþ1 ¼ xn þ αn gn , where n ;gn 〉H : αn ¼  〈g〈r;Tðg n n Þ〉H Step3: Let gnþ1 ¼ r nþ1 þ λn gn , where   r nþ1 ¼ r n þ αn T gn ; λn ¼

〈Tðgn Þ; r nþ1 〉H : 〈gn ; Tðgn Þ〉H

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

1777

If r nþ1 ¼ 0, terminate the algorithm. Otherwise, let n ¼ n þ 1 and go back to Step 2.□

Appendix B. Proof of Theorem 1 Proof. Firstly, because TðÞ is defined on the finite dimensional space RL1 L2 , it is bounded. For two arbitrary matrices X; Y A RL1 L2 , we have   〈T ðXÞ; Y〉 ¼ trf P ðPT XQÞ○WQT ÞT Yg ¼ trfQ½ðPT XQÞ○WT PT Yg ¼ trf½ðPT XQÞ○WT PT YQg     1 1 ¼ trf½ PT XQ ○Wð2Þ T ½ PT YQ ○Wð2Þ g and,

     〈X; T ðYÞ〉 ¼ trfXT P PT YQ ○W QT g    ¼ trfQT XT P PT YQ ○W g     ¼ trfðPT XQÞT PT YQ ○W g h i    1 1 ¼ trf½ PT XQ ○Wð2Þ T PT YQ ○Wð2Þ g:

Thus, 〈TðXÞ; Y〉 ¼ 〈X; TðYÞ〉, which means TðÞ is a self-adjoint linear operator [35]. We have from the above derivation and equation Eq. (20) that h i    1 1 〈X; T ðXÞ〉 ¼ trf½ PT XQ ○Wð2Þ T PT XQ ○Wð2Þ g   1 ¼ J PT XQ ○Wð2Þ J 2F 40 holds for all L1  L2 nonzero real matrices X A RL1 L2 . Therefore, TðÞ is positive-definite [36]. The proof is completed.□

Appendix C. Proof of Theorem 2 Proof. (Induction) Let i,j ¼ 0,1 and ia j. For the case of i ¼ 1; j ¼ 0, we have 〈G1 ; TðG0 Þ〉 ¼ 〈 ∇EðA1 Þ þ λ0 G0 ; TðG0 Þ〉 ¼ 〈 ∇EðA1 Þ; TðG0 Þ〉 þ λ0 〈G0 ; TðG0 Þ〉: Substituting (27) into the above equation yields 〈G1 ; TðG0 Þ〉 ¼ 0. Considering the selfadjointness of the operator TðÞ, we further have 〈TðG1 Þ; G0 〉 ¼ 0, thereby 〈G0 ; TðG1 Þ〉 ¼ 0. Hence, 〈Gi ; TðGj Þ〉 ¼ 0;

i; j ¼ 0; 1; ia j:

Now, assuming 〈Gi ; TðGj Þ〉 ¼ 0;

i; j ¼ 0; 1; …; n; i a j;

ðC:1Þ

1778

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

we want to prove 〈Gi ; TðGj Þ〉 ¼ 0;

i; j ¼ 0; 1; …; n þ 1; ia j:

ðC:2Þ

We first prove (C.2) for i ¼ n þ 1 and j ¼ 0; 1; …; n. When i ¼ n þ 1 and j¼ n, it is easy to see from (27) that 〈Gnþ1 ; TðGn Þ〉 ¼ 〈  ∇EðAnþ1 Þ þ λn Gn ; TðGn Þ〉 ¼ 〈 ∇EðAnþ1 Þ; TðGn Þ〉 þ λn 〈Gn ; TðGn Þ〉 ¼ 0:

ðC:3Þ

In the following, we prove 〈Gnþ1 ; TðGj Þ〉 ¼ 0 for j ¼ 0; 1; …; n 1. Using the updating formula, Anþ1 can be written as Anþ1 ¼ Ajþ1 þ

n X

α k Gk

ðC:4Þ

k ¼ jþ1

for any j ¼ 0; 1; …; n  1. Substituting (C.4) into (21) gives ∇EðAnþ1 Þ ¼ ∇EðAjþ1 Þ þ

n X

2αk P½ðPT Gk QÞ○WQT ¼ ∇EðAjþ1 Þ þ 2

k ¼ jþ1

n X

αk TðGk Þ:

k ¼ jþ1

Then we have for j ¼ 0; 1; …; n  1 〈Gj ; ∇EðAnþ1 Þ〉 ¼ 〈Gj ; ∇EðAjþ1 Þ þ 2

n X k ¼ jþ1

* þ2 Gj ;

n X

αk TðGk Þ〉 ¼ 〈Gj ; ∇EðAjþ1 Þ〉 +

αk TðGk Þ :

k ¼ jþ1

The second term in the right-hand side is zero by the induction assumption (C.1). For the first term, noting that ∇EðAjþ1 Þ ¼ ∇EðAj Þ þ 2αj P½PT Gj QÞ○WQT ¼ ∇EðAj Þ þ 2αj TðGj Þ; and using the formula (25), we have 〈∇EðAjþ1 Þ; Gj 〉 ¼ 0:

ðC:5Þ

Therefore, 〈Gj ; ∇EðAnþ1 Þ〉 ¼ 0;

j ¼ 0; 1; …; n;

ðC:6Þ

by noting (C.5) also holds for j ¼ n. By replacing Gj with Gj ¼  ∇EðAj Þ þ λj  1 Gj  1 , Eq. (C.6) becomes 〈  ∇EðAj Þ þ λj  1 Gj  1 ; ∇EðAnþ1 Þ〉 ¼ 0; where j ¼ 1; 2; …; n. Hence, using (C.5) and noting ∇EðA0 Þ ¼  G0 , we obtain 〈∇EðAj Þ; ∇EðAnþ1 Þ〉 ¼ 0;

j ¼ 0; 1; …; n:

ðC:7Þ

Then, we have for j ¼ 0; 1; …; n  1, 〈Gnþ1 ; TðGj Þ〉 ¼  〈∇EðAnþ1 Þ; TðGj Þ〉 þ λn 〈Gn ; TðGj Þ〉 ¼  〈∇EðAnþ1 Þ; TðGj Þ〉: Substituting Gj ¼ 

Ajþ1  Aj αj

into the right-hand side of the above equation, we further obtain

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

1779

    1 〈Gnþ1 ; T Gj 〉 ¼  ∇EðAnþ1 Þ; T Ajþ1  Aj αj     1  ¼ ∇EðAnþ1 Þ; ∇E Ajþ1  ∇E Aj 2αj By virtue of (C.7), it follows that 〈Gnþ1 ; TðGj Þ〉 ¼ 0;

j ¼ 0; 1; …; n  1:

ðC:8Þ

From (C.3), (C.8) and the self-adjoint property of TðÞ, we also have 〈Gi ; TðGnþ1 Þ〉 ¼ 0;

i ¼ 0; 1; …; n:

Thus, combining with (C.1), we get 〈Gi ; TðGj Þ〉 ¼ 0;

i; j ¼ 0; 1; …; n þ 1; ia j:

Then, it can be concluded by induction that 〈Gi ; TðGj Þ〉 ¼ 0 for any i a j. This completes the proof.□

References [1] J.S. Lim, Two-Dimensional Signal and Image Processing, Prentice Hall, New Jersey, 1989. [2] W.-S. Lu, A. Antoniou, Two-Dimensional Digital Filters, Marcel Dekker, New York, 1992. [3] P. Stubberud, System functions for 2D linear phase frequency sampling filters, J. Frankl. Inst. 341 (4) (2004) 325–342. [4] S.-C. Pei, J.-J. Shyu, Design of two-dimensional FIR digital filters by McClellan transformation and least-squares contour mapping, Signal Process. 44 (1) (1995) 19–26. [5] C.C. Tseng, Design of two-dimensional FIR digital filters by McClellan transform and quadratic programming, IEE Proc.-Vis. Image Signal Process. 148 (5) (2001) 325–331. [6] H.C. Reddy, I.H. Khoo, P.K. Rajan, 2-D symmetry: theory and filter design applications, IEEE Circuits Syst. Mag. 3 (3) (2002) 4–23. [7] J.C. Liu, Y.L. Tai, Design of 2-D wideband circularly symmetric FIR filters by multiplierless high-order transformation, IEEE Trans. Circuits Syst.-I 58 (4) (2011) 746–754. [8] Y.M. Wang, J.G. Yue, Y.Q. Su, H. Liu, Design of two-dimensional zero-phase FIR digital filter by McClellan transformation and interval global optimization, IEEE Trans. Circuits Syst.-II 60 (3) (2013) 167–171. [9] M.O. Ahmad, J.D. Wang, An analytic least mean square solution to the design problem of two-dimensional FIR filters with quadrantally symmetric or antisymmetric frequency response, IEEE Trans. Circuits Syst. 36 (7) (1989) 968–979. [10] W.P. Zhu, M.O. Ahmad, M.N.S. Swamy, An analytical approach for obtaining a closed-form solution to the leastsquare design problem of 2-D zero-phase FIR filters, IEEE Trans. Circuits Syst.-II 41 (12) (1994) 796–807. [11] W.P. Zhu, M.O. Ahmad, M.N.S. Swamy, A closed-form solution to the least-square design problem of 2-D linearphase FIR filters, IEEE Trans. Circuits Syst.-II 44 (12) (1997) 1032–1039. [12] W.P. Zhu, M.O. Ahmad, M.N.S. Swamy, A least-square design approach for 2-D FIR filters with arbitrary frequency response, IEEE Trans. Circuits Syst.-II 46 (8) (1999) 1027–1034. [13] J.L. Aravena, G. Gu, Weighted least mean square design of 2-D FIR digital filters: the general case, IEEE Trans. Signal Process. 44 (10) (1996) 2568–2578. [14] G. Gu, J.L. Aravena, Weighted least mean square design of 2-D FIR digital filters, IEEE Trans. Signal Process. 42 (11) (1994) 3178–3187. [15] R. Zhao, X. Lai, A fast matrix iterative technique for the WLS design of 2-D quadrantally symmetic FIR filters, Multidimens. Syst. Signal Process. 22 (4) (2011) 303–317. [16] R. Zhao, X. Lai, Fast two-dimensional weighted least squares techniques for the design of two-dimensional finite impulse response filters, J. Control Theory Appl. 11 (2) (2013) 141–146.

1780

R. Zhao et al. / Journal of the Franklin Institute 353 (2016) 1759–1780

[17] R. Zhao, X. Lai, Efficient 2-D based algorithms for WLS design of 2-D FIR filters with arbitrary weighting functions, Multidimens. Syst. Signal Process. 24 (3) (2013) 417–434. [18] M.T. Hanna, Weighted least squares design of two-dimensional zero-phase FIR filters in the continuous frequency domain, IEEE Trans. Circuits Syst.-II 43 (7) (1996) 534–537. [19] E. Gislason, M. Johansen, K. Conradsen, B.K. Erboll, S.K. Jacobsen, Three different criteria for the design of 2-D zero phase FIR digital filters, IEEE Trans. Signal Process. 41 (10) (1993) 3070–3074. [20] C.H. Hsieh, C.M. Kuo, Y.D. Jou, Y.L. Han, Design of two-dimensional FIR digital filters by a two-dimensional WLS technique, IEEE Trans. Circuits Syst.-II 44 (5) (1997) 348–358. [21] W.S. Lu, A unified approach for the design of 2-D digital filters via semidefinite programming, IEEE Trans. Circuits Syst.-I 49 (6) (2002) 814–826. [22] X.P. Lai, Y. Cheng, A sequential constrained least-square approach to minimax design of 2-D FIR filters, IEEE Trans. Circuits Syst.-II 54 (11) (2007) 994–998. [23] W.S. Lu, T. Hinamoto, Two-dimensional digital filters with sparse coefficients, Multidimens. Syst. Signal Process. 22 (1) (2011) 173–189. [24] X.Y. Hong, X.P. Lai, R.J. Zhao, Matrix-based algorithms for constrained least-squares and minimax designs of 2-d FIR filters, IEEE Trans. Signal Process. 64 (14) (2013) 3620–3631. [25] A. Kumar, G.K. Singh, R.S. Anand, A simple design method for the cosine-modulated filter banks using weighted constrained least square technique, J. Frankl. Inst. 348 (4) (2011) 606–621. [26] C.S. Burrus, J.A. Barreto, I.W. Selesnick, Iterative reweighted least-squares design of FIR filters, IEEE Trans. Signal Process. 42 (11) (1994) 2926–2936. [27] A. Tkacenko, P.-P. Vaidyanathan, T.-Q. Nguyen, On the eigenfilter design method and its applications: a tutorial, IEEE Trans. Circuits Syst.-II 50 (9) (2003) 497–517. [28] B.-D. Patil, P.-G. Patwardhan, V.-M. Gadre, Eigenfilter approach to the design of one-dimensional and multidimensional two-channel linear-phase FIR perfect reconstruction filter banks, IEEE Trans. Circuits Syst.-I 55 (11) (2008) 3542–3551. [29] S.-C. Pei, C.-C. Tseng, A new eigenfilter based on total least squares error criterion, IEEE Trans. Circuits Syst.-I 48 (6) (2001) 699–709. [30] J.W. Daniel, The conjugate gradient method for linear and nonlinear operator equations, SIAM J. Numer. Anal. 4 (1) (1967) 10–26. [31] R.A. Horn, C.R. Johnson, Topics in Matrix Analysis, Cambridge University Press, Cambridge, U.K., 1994. [32] A. Antoniou, W.S. Lu, Practical Optimization: Algorithms and Engineering Applications, Springer, New York, 2007. [33] T.B. Deng, Symmetric structures for odd-order maximally flat and weighted-least-squares variable fractional-delay filters, IEEE Trans. Circuits Syst.-I 54 (12) (2007) 2718–2732. [34] S. Boyd, L. Vandenberghe, Convex Optimization, Cambridge University Press, Cambridge, 2004. [35] I. Gohberg, S. Hohberg, M.A. Kaashoek, Basic Classes of Linear Operators, Birkhauser, Basel, 2003. [36] N. Dunford, J.T. Schwartz, Linear Operators, Part 2, Spectral Theory, Self Adjoint Operators in Hilbert Space, Wiley-Interscience, New York, 1963.