Journal of Systems Engineering and Electronics Vol. 19, No. 2, 2008, pp.381–384
New recursive algorithm for matrix inversion

Cao Jianshu & Wang Xuegang
Coll. of Electronic Engineering, Univ. of Electronic Science and Technology of China, Chengdu 610054, P. R. China
(Received October 5, 2006)
Abstract: To reduce the computational complexity of matrix inversion, which dominates the processing in many practical applications, two numerically efficient recursive algorithms (called algorithms I and II, respectively) are presented. Algorithm I computes the inverse of a matrix whose leading principal minors are all nonzero. Algorithm II, derived by improving algorithm I, evaluates the inverse of an arbitrary nonsingular matrix. The implementation of either algorithm involves only matrix-vector multiplications and vector outer products; these operations are computationally fast and highly parallelizable. MATLAB simulations show that both recursive algorithms are valid.
Keywords: recursive algorithm, matrix inversion, matrix-vector product, leading principal minor (LPM).
1. Introduction

In many practical applications, such as radar, communication, and sonar[1−7], the chief processing burden usually stems from the inversion of a large matrix. Generally, there are two methodologies for reducing this computational load. The first is the reduced-rank method[1−2], which mitigates the complexity by reducing the matrix dimensionality while avoiding significant degradation in performance. The second is to develop a fast matrix inversion algorithm that can be laid out in a highly parallel/pipelined hardware structure. In recent years, several recursive algorithms have been studied extensively[3−7], resulting in computationally efficient implementations of matrix inversion. Nevertheless, each of these methods applies only to a matrix of special form. A recursive algorithm for the Moore-Penrose pseudoinverse[8] can be used to calculate the inverse of an arbitrary nonsingular matrix; unfortunately, it requires a matrix-matrix multiplication costing O(N^3) flops at every iteration, where N denotes the dimension of the matrix.

In this article, we first present a recursive inversion algorithm for a matrix whose leading principal minors are all nonzero. Interestingly, an arbitrary nonsingular matrix can be made to satisfy this constraint on its leading principal minors (LPMs) merely by permuting its columns. Hence, a new recursive method is developed that couples the above algorithm with column permutations. The new method is valid for the inversion of an arbitrary nonsingular matrix. Its implementation involves matrix-vector multiplications and vector outer products; these operations are computationally fast and highly parallelizable[9,10].
2. Iterative inversion of a matrix whose LPMs are all nonzero

Let A ∈ C^{N×N} be a nonsingular matrix, and let A_i denote the LPM of A of order i, where

    A_i ≠ 0,  i = 1, 2, . . . , N                                  (1)

Define I = [e_1, e_2, . . . , e_N] to be the N × N identity matrix and B = A − I = [b_1, b_2, . . . , b_N]. Using the identity

    A = I + (A − I)                                                (2)

we obtain

    A = I + B = I + Σ_{i=1}^{N} b_i e_i^T                          (3)

where (·)^T represents transpose. Observing (3), the recursive formulation with respect to A can clearly be defined as follows
    Ā_i = Ā_{i−1} + b_i e_i^T                                      (4)

in which Ā_0 = I, A = Ā_N, and i = 1, 2, . . . , N. Consequently, we can get

    |Ā_i| = |Ā_{i−1} + b_i e_i^T| = A_i ≠ 0                        (5)

where |·| stands for the matrix determinant. It means that Ā_i and Ā_{i−1} are both invertible. Hence, applying the matrix inversion lemma[6], (4) becomes

    Ā_i^{−1} = Ā_{i−1}^{−1} − (Ā_{i−1}^{−1} b_i)(e_i^T Ā_{i−1}^{−1}) / (1 + e_i^T Ā_{i−1}^{−1} b_i)    (6)

Let

    q = Ā_{i−1}^{−1} b_i                                           (7)

    p = 1 + q(i)                                                   (8)

therefore, (6) simplifies to

    Ā_i^{−1} = Ā_{i−1}^{−1} − (q/p) Ā_{i−1}^{−1}(i, :)             (9)

in which q(i) is the ith entry of vector q and Ā_{i−1}^{−1}(i, :) represents the ith row vector of Ā_{i−1}^{−1}. To summarize, the presented algorithm consists of the following steps.

Algorithm I
Step 1  Initialization
        B = A − I = [b_1, b_2, . . . , b_N]
        Ā_0^{−1} = I = [e_1, e_2, . . . , e_N]
Step 2  Iterative operations: for i = 1, 2, . . . , N
        q = Ā_{i−1}^{−1} b_i
        p = 1 + q(i)
        Ā_i^{−1} = Ā_{i−1}^{−1} − (q/p) Ā_{i−1}^{−1}(i, :)
Step 3  Compute A^{−1}
        A^{−1} = Ā_N^{−1}
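The listing above can be sketched in Python with NumPy (the paper's own simulations use MATLAB; the function name `inverse_lpm` is ours). The rank-one update on the last line of the loop is exactly (9):

```python
import numpy as np

def inverse_lpm(A):
    """Algorithm I: recursive inverse of a matrix whose leading
    principal minors are all nonzero (condition (1))."""
    A = np.asarray(A, dtype=complex)
    N = A.shape[0]
    B = A - np.eye(N)                  # columns b_i = a_i - e_i
    Ainv = np.eye(N, dtype=complex)    # A_bar_0^{-1} = I
    for i in range(N):
        q = Ainv @ B[:, i]             # Eq. (7)
        p = 1 + q[i]                   # Eq. (8)
        Ainv -= np.outer(q / p, Ainv[i, :])   # rank-one update, Eq. (9)
    return Ainv
```

For example, A = [[2, 1], [1, 3]] has LPMs 2 and 5, both nonzero, and `inverse_lpm(A)` agrees with `numpy.linalg.inv(A)`.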
It is easy to verify that Ā_{i−1}^{−1} is of the form [a_1^{(i−1)}, a_2^{(i−1)}, . . . , a_{i−1}^{(i−1)}, e_i, . . . , e_N], where a_1^{(i−1)}, a_2^{(i−1)}, . . . , a_{i−1}^{(i−1)} are the first i − 1 columns of Ā_{i−1}^{−1}. Hence, only the first i − 1 columns of Ā_{i−1}^{−1} have to be updated when generating Ā_i^{−1}. The number of complex multiplications required to obtain Ā_i^{−1} from Ā_{i−1}^{−1} is about (2i − 1)N. The dominant operations, matrix-vector products, are well suited to parallel/pipelined implementation with a systolic array, which is especially attractive for developing hardware systems in practical applications[9−10].
3. Iterative inversion of a nonsingular matrix

Algorithm I proposed in Section 2 can be appropriately modified to compute the inverse of an arbitrary nonsingular matrix A ∈ C^{N×N}. To do so, the following preliminary result is required.

Lemma 1  Let F = [f_1, f_2, . . . , f_N] ∈ C^{N×N} be a nonsingular matrix, and let F_{j−1} denote the LPM of F of order j − 1, 2 ≤ j ≤ N. If F_{j−1} ≠ 0, then there exists a matrix F̂ whose LPM of order j satisfies F̂_j ≠ 0, where F̂ is obtained by interchanging f_j and f_{j1} of F, j ≤ j1 ≤ N.

Next, we state our key analysis results in the following theorem.

Theorem 1  Let A ∈ C^{N×N}, rank(A) = N, and let j_1, j_2, . . . , j_i, . . . , j_N ∈ {1, 2, . . . , N} be N distinct indices, not necessarily in natural order, 1 ≤ i ≤ N. Let a_{j1}, a_{j2}, . . . , a_{ji}, . . . , a_{jN} be the columns of A indexed by j_1, j_2, . . . , j_N. Then
(i) There exist a_{j1}, a_{j2}, . . . , a_{ji}, . . . , a_{jN} such that

    Ā_i = [a_{j1}, a_{j2}, . . . , a_{ji}, e_{i+1}, . . . , e_N]   (10)

is a nonsingular matrix for each i.
(ii) Let Ā_0 = I; then

    Ā_i = Ā_{i−1} + (a_{ji} − e_i) e_i^T                           (11)

holds for every i.
(iii) The a_{ji} required in (10) can be found by solving

    max_{a_{ji} ∈ M}  abs(Ā_{i−1}^{−1}(i, 1:i) a_{ji}(1:i))        (12)

where M is the set formed by the columns of A excluding a_{j1}, a_{j2}, . . . , a_{j(i−1)}; Ā_{i−1}^{−1}(i, 1:i) and a_{ji}(1:i) are the vectors formed by the first i entries of the ith row of Ā_{i−1}^{−1} and of a_{ji}, respectively; and abs(·) stands for the complex magnitude.
(iv)

    A^{−1} = P Ā_N^{−1}                                            (13)

where P is the permutation matrix P = [e_{j1}, e_{j2}, . . . , e_{jN}].

By theorem 1 and algorithm I, the procedure for computing A^{−1} is described in the following.

Algorithm II
Step 1  Initialization
        Ā_0^{−1} = I = [e_1, e_2, . . . , e_N]
        P = I = [e_1, e_2, . . . , e_N]
        C = A
Step 2  Iterative operations: for i = 1, 2, . . . , N
        Part 1
        (1) d = Ā_{i−1}^{−1}(i, 1:i) C(1:i, i:N)
        (2) Find the maximum-magnitude entry d(k) of vector d, 1 ≤ k ≤ N − i + 1.
        (3) Let index j = k + i − 1; interchange the ith and jth columns of C and of P, respectively.
        (4) Let b = C(:, i) − e_i
        Part 2
        q = Ā_{i−1}^{−1} b
        p = 1 + q(i)
        Ā_i^{−1} = Ā_{i−1}^{−1} − (q/p) Ā_{i−1}^{−1}(i, :)
Step 3  Reorder the rows of Ā_N^{−1} according to P, i.e. A^{−1} = P Ā_N^{−1}
in which Ā_{i−1}^{−1}(i, :) is the ith row of Ā_{i−1}^{−1}, the vector Ā_{i−1}^{−1}(i, 1:i) consists of the first i entries of the ith row of Ā_{i−1}^{−1}, C(:, i) is the ith column of C, and C(1:i, i:N) is the submatrix formed by rows 1 through i and columns i through N of C. Algorithm II requires approximately (i − 1)(N − i + 1) + (2i − 1)N complex multiplications for the ith iteration and supports parallel implementation with VLSI[9−10].

Compared with algorithm I, we note that, aside from Step 3, the main difference lies in Part 1 of Step 2. In Part 1, the major computational demand comes from calculating d, with a complexity of approximately (i − 1)(N − i + 1) at the ith stage. In Step 3, the permutation is achieved by merely reordering the rows of Ā_N^{−1} according to P, without any computational burden.

In practical programming of algorithm II, we define a vector p̂ = [j_1, j_2, . . . , j_N] to represent the matrix P = [e_{j1}, e_{j2}, . . . , e_{jN}]. Since the indices of the interchanged columns of C and P are identical (see Step 2, Part 1, (3)), we have C(:, i) = A(:, p̂(i)) and C(1:i, i:N) = A(1:i, p̂(i:N)), where p̂(i) denotes the ith entry of p̂; p̂(i:N) consists of entries i through N of p̂; A(:, p̂(i)) is the p̂(i)th column of A; and A(1:i, p̂(i:N)) is the submatrix formed by rows 1 through i and columns p̂(i), p̂(i+1), . . . , p̂(N) of A. It means that interchanging the ith and jth columns of C (and P) can be implemented simply by interchanging the ith and jth entries of p̂, which significantly reduces the amount of data transferred and saves chip memory.

It is worth pointing out that many types of matrices occurring in applications (e.g., positive/negative definite matrices and upper/lower triangular matrices) satisfy condition (1). For these matrices, we should employ algorithm I rather than algorithm II, which reduces the computational complexity by nearly 15 percent. Compared with the algorithm presented in Ref. [8] (called the trace-method), algorithm II provides an even greater computational saving (see Table 1).

Table 1  Comparison of the number of complex multiplications required by algorithm II and the trace-method

Algorithm     | ith iteration           | Total
Algorithm II  | (3i − 2)N − (i − 1)^2   | (7N^3 − N)/6
Trace-method  | N^3                     | N^4
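The complete procedure, including the p̂ bookkeeping described above, can be sketched in Python with NumPy (a sketch under the paper's definitions; the names `inverse_general` and `perm` are ours, with `perm` playing the role of p̂):

```python
import numpy as np

def inverse_general(A):
    """Algorithm II: recursive inverse of an arbitrary nonsingular matrix,
    with column pivoting tracked by the index vector p-hat (`perm`)."""
    A = np.asarray(A, dtype=complex)
    N = A.shape[0]
    C = A.copy()
    perm = np.arange(N)                # p-hat: C[:, k] == A[:, perm[k]]
    Ainv = np.eye(N, dtype=complex)    # A_bar_0^{-1} = I
    for i in range(N):
        # Part 1: pivot search, Eq. (12) -- among the remaining columns,
        # pick the one maximizing |Ainv(i, 1:i) * C(1:i, j)|
        d = Ainv[i, :i + 1] @ C[:i + 1, i:]
        j = i + int(np.argmax(np.abs(d)))
        C[:, [i, j]] = C[:, [j, i]]    # interchange columns i and j of C
        perm[[i, j]] = perm[[j, i]]    # same interchange on p-hat
        b = C[:, i].copy()
        b[i] -= 1                      # b = C(:, i) - e_i
        # Part 2: rank-one update, Eqs. (7)-(9)
        q = Ainv @ b
        p = 1 + q[i]
        Ainv -= np.outer(q / p, Ainv[i, :])
    # Step 3: A^{-1} = P * A_bar_N^{-1}, a pure row reordering
    out = np.empty_like(Ainv)
    out[perm, :] = Ainv
    return out
```

Note that the pivot score d(k) equals the candidate value of p, so maximizing |d(k)| also keeps the divisor in (9) well away from zero.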
4. Simulation

MATLAB simulations indicate that each of the above algorithms gives the same result as "inv()", the matrix inversion function of MATLAB. Owing to the page limit, further details of the simulations for large complex matrices are not discussed herein. We construct a simple example that demonstrates algorithm II via a 3 × 3 matrix inversion.
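The worked 3 × 3 example itself does not survive in this extraction, so the following self-contained script plays the same role with a matrix of our own choosing (its (1,1) entry is 0, so condition (1) fails and the pivoting of algorithm II is actually exercised):

```python
import numpy as np

# Our own 3x3 example, not the paper's: LPM of order 1 is zero.
A = np.array([[0.0, 2.0, 1.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
N = 3
C = A.copy()
perm = np.arange(N)          # p-hat
Ainv = np.eye(N)             # A_bar_0^{-1}
for i in range(N):
    d = Ainv[i, :i + 1] @ C[:i + 1, i:]    # pivot scores, Eq. (12)
    j = i + int(np.argmax(np.abs(d)))
    C[:, [i, j]] = C[:, [j, i]]
    perm[[i, j]] = perm[[j, i]]
    b = C[:, i].copy()
    b[i] -= 1                              # b = C(:, i) - e_i
    q = Ainv @ b
    Ainv -= np.outer(q / (1 + q[i]), Ainv[i, :])   # Eq. (9)
inv_A = np.empty_like(Ainv)
inv_A[perm, :] = Ainv                      # A^{-1} = P * A_bar_N^{-1}
print(perm)    # [1 0 2]: columns 1 and 2 of A were interchanged at i = 0
print(inv_A)   # equals [[0, 1, 0], [0, 0, 1], [1, 0, -2]]
```

Only one interchange occurs (p̂ = [2, 1, 3] in the paper's 1-based notation), and the result satisfies A · A^{−1} = I.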
5. Conclusion

In this article, we introduced a new recursive algorithm (algorithm II) for faster implementation of matrix inversion. The algorithm, whose dominant operations are matrix-vector products and vector outer products, maps easily onto systolic array structures for parallel implementation. Moreover, if the matrix satisfies condition (1), algorithm II reduces to algorithm I, which decreases the computational cost by nearly 15 percent. MATLAB simulations support both algorithms.
References
[1] Jin Y, Friedlander B. Reduced-rank adaptive detection of distributed sources using subarrays. IEEE Trans. on Signal Processing, 2005, 53(1): 13–25.
[2] Peckham C D, Haimovich A M, Ayoub T F, et al. Reduced-rank STAP performance analysis. IEEE Trans. on Aerospace and Electronic Systems, 2000, 36(2): 664–676.
[3] Sarestoniemi M, Matsumoto T, Kansanen K. Core matrix inversion techniques for SC/MMSE MIMO turbo equalization. Proc. of Vehicular Technology Conference, 2004, 1: 394–398.
[4] Brennan L, Mallett J, Reed I. Adaptive arrays in airborne MTI radar. IEEE Trans. on Antennas and Propagation, 1976, 24(5): 607–615.
[5] Kumar R. A fast algorithm for solving a Toeplitz system of equations. IEEE Trans. on Acoustics, Speech, and Signal Processing, 1985, 33(1): 254–267.
[6] Alexander S T, Ghirnikar A L. A method for recursive least squares filtering based upon an inverse QR decomposition. IEEE Trans. on Signal Processing, 1993, 41(1): 20–30.
[7] Asif A, Moura J M F. Block matrices with L-block-banded inverse: inversion algorithms. IEEE Trans. on Signal Processing, 2005, 53(2): 630–642.
[8] Zhang Xianda. Matrix analysis and applications. Beijing: Tsinghua University Press, 2004: 93.
[9] Xiong Jun, Liao Guisheng. Recursive algorithm of adaptive weight extraction of space-time signal processing for airborne radars. Proc. of Radar, 1996: 86–90.
[10] Takashi Yahagi. Fast algorithms and parallel signal processing. Beijing: Science Press, 2003: 203.
Cao Jianshu was born in 1970. He is pursuing a Ph. D. degree at the University of Electronic Science and Technology of China (UESTC). His research interests include signal processing for airborne radar and real-time signal processing. E-mail: js[email protected]

Wang Xuegang was born in 1962. He is a professor at UESTC. His scientific interests include radar signal processing and real-time signal processing.