Algorithms and Parallel VLSI Architectures III M. Moonen and F. Catthoor (Editors) © 1995 Elsevier Science B.V. All rights reserved.
SUBSPACE METHODS IN SYSTEM IDENTIFICATION AND SOURCE LOCALIZATION

P.A. REGALIA
Département Signal et Image
Institut National des Télécommunications
9, rue Charles Fourier
91011 Evry cedex, France
[email protected]
ABSTRACT. Subspace methods have received increasing attention in signal processing and control in recent years, due to their successful application to the problems of multivariable system identification and source localization. This paper gives a tutorial overview of these two applications, in order to draw out features common to both problems. In particular, both problems are based on finding spanning vectors for the null space of a spectral density matrix characterizing the available data. This is expressed numerically in various formulations, usually in terms of extremal singular vectors of a data matrix, or in terms of orthogonal filters which achieve decorrelation properties of filtered data sequences. In view of this algebraic similarity, algorithms designed for one problem may be adapted to the other. In both cases, though, successful application of subspace methods depends on some knowledge of the required filter order of spanning vectors for the desired null space. Data encountered in real applications rarely give rise to finite order filters if theoretically "exact" subspace fits are desired. Accordingly, some observations on the performance of subspace methods in "reduced order" cases are developed.

KEY WORDS. Subspace estimation, autonomous model, system identification, source localization.

1 INTRODUCTION
Subspace methods have become an attractive numerical approach to practical problems of modern signal processing and control. The framework of subspace methods has evolved simultaneously in source localization [1], [2], [6], [8], and in system identification [4], [10]. Although these application areas have different physical origins, the mathematical structure
of the problems they aim to solve is laced with parallels. The intent of this paper is to provide a brief overview of the structural similarities between system identification and source localization. To the extent that a common objective may be established for seemingly different application areas, numerical algorithms in one area find an immediate range of applications in neighboring areas. Our presentation is not oriented at the numerical algorithm level, but rather abstracted one level to the common algebraic framework which underlies subspace methods. Section 2 reviews the underlying signal structure suited for subspace methods, in terms of an autonomous model plus white noise. Section 3 interprets the underlying signal structure in the context of multivariable system identification. Section 4 then shows how this same signal structure intervenes in the broadband source localization problem, and stresses similarities in objectives with the system identification problem. Section 5 then examines the approximation obtained in a particular system identification problem when the order chosen for the identifier is too small, as generically occurs in practice where real data may not admit a finite dimensional model. We shall see that subspace methods decompose the available data into an autonomous part plus white noise, even though this may not be the "true" signal structure.

2 BACKGROUND
Most subspace methods are designed for observed vector-valued signals (denoted $\{y(\cdot)\}$) consisting of a usable signal $\{s(\cdot)\}$ and an additive disturbance term $\{b(\cdot)\}$, as in
\[
y(n) = s(n) + b(n).
\]
We assume that these (column) vectors consist of $p$ elements each. In most subspace applications, one assumes that the disturbance term is statistically independent of the usable signal, and that it is white:
\[
E[b(n)\, b^*(m)] = \begin{cases} \sigma^2 I_p, & m = n; \\ 0, & m \neq n. \end{cases}
\]
(Here and in what follows, the superscript $*$ will denote (conjugate) transposition.)
The usable signal is often assumed to satisfy an autonomous model of the form
\[
B_0\, s(n) + B_1\, s(n-1) + \cdots + B_M\, s(n-M) = 0, \quad \text{for all } n, \tag{1}
\]
for some integer $M$. Here the matrices $B_k$ are "row matrices," i.e., a few row vectors stacked atop one another. Examples of this relation will be brought out in Sections 3 and 4. If we consider the covariance matrix of the usable signal, viz.
\[
E\left[ \begin{bmatrix} s(n) \\ s(n-1) \\ \vdots \\ s(n-M) \end{bmatrix} [\,\cdot\,]^* \right]
= \begin{bmatrix}
R_0 & R_1 & \cdots & R_M \\
R_1^* & R_0 & \ddots & \vdots \\
\vdots & \ddots & \ddots & R_1 \\
R_M^* & \cdots & R_1^* & R_0
\end{bmatrix}
\triangleq \mathcal{T}_M,
\quad \text{where } R_k = E[s(n)\, s^*(n-k)] = R_{-k}^*,
\]
then the assumption that $\{s(\cdot)\}$ satisfies an autonomous model implies that
\[
\begin{bmatrix}
R_0 & R_1 & \cdots & R_M \\
R_1^* & R_0 & \ddots & \vdots \\
\vdots & \ddots & \ddots & R_1 \\
R_M^* & \cdots & R_1^* & R_0
\end{bmatrix}
\begin{bmatrix} B_0^* \\ B_1^* \\ \vdots \\ B_M^* \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}.
\tag{2}
\]
This suggests that the matrix coefficients $B_k$ of the autonomous model could be found by identifying null vectors of the matrix $\mathcal{T}_M$.
3 SYSTEM IDENTIFICATION

Consider the inputs and outputs of a multivariable linear system concatenated into a single time-series vector:
\[
s(n) = \begin{bmatrix} s_1(n) \\ s_2(n) \end{bmatrix}
\begin{matrix} \}\ p-r \text{ inputs} \\ \}\ r \text{ outputs} \end{matrix}
\]
Suppose the inputs and outputs are related by a linear system with unknown transfer function $H(z)$:
\[
s_2(n) = H(z)\, s_1(n). \tag{3}
\]
(This notation means that $s_2(n)$ is the output sample at time $n$ from a linear system with transfer matrix $H(z)$ when driven by the sequence $\{s_1(\cdot)\}$, with $s_1(n)$ the most recent input sample.) Suppose that $H(z)$ is a rational function. This means that $H(z)$ can be written in terms of a matrix fraction description [3]
\[
H(z) = [D(z)]^{-1} N(z) \tag{4}
\]
for two (left coprime) matrix polynomials
\[
\begin{aligned}
N(z) &= N_0 + N_1 z^{-1} + \cdots + N_M z^{-M} & [r \times (p-r)] \\
D(z) &= D_0 + D_1 z^{-1} + \cdots + D_M z^{-M} & (r \times r)
\end{aligned}
\]
The relations (3) and (4) combine as
\[
D(z)\, s_2(n) = N(z)\, s_1(n),
\]
which is to say
\[
D_0 s_2(n) + D_1 s_2(n-1) + \cdots + D_M s_2(n-M) = N_0 s_1(n) + N_1 s_1(n-1) + \cdots + N_M s_1(n-M).
\]
This in turn may be rearranged as
\[
[\,\underbrace{N_0\ \ {-D_0}}_{B_0}\ \ \underbrace{N_1\ \ {-D_1}}_{B_1}\ \ \cdots\ \ \underbrace{N_M\ \ {-D_M}}_{B_M}\,]
\begin{bmatrix} s(n) \\ s(n-1) \\ \vdots \\ s(n-M) \end{bmatrix} = 0,
\quad \text{for all } n,
\]
which leads to a simple physical interpretation: the autonomous relation (1) holds if and only if the signal $\{s(\cdot)\}$ contains the inputs and outputs of a finite-dimensional linear system. We also see that the coefficients of a matrix fraction description may be concatenated into null vectors of the matrix $\mathcal{T}_M$. One subtle point does arise in this formulation: the dimension of the null space of $\mathcal{T}_M$ may exceed $r$ (the number of outputs) [2], in such a way that uniqueness of a determined matrix fraction description is not immediately clear. Some greater insight may be obtained by writing the subspace equations in the frequency domain. To this end, consider the power spectral density matrix
\[
\mathcal{S}_s(e^{j\omega}) = \sum_{k=-\infty}^{\infty} E[s(n)\, s^*(n-k)]\, e^{-jk\omega} \qquad (p \times p)
\]
which is nonnegative definite for all $\omega$. At the same time, let
\[
B(z) = B_0 + B_1 z^{-1} + \cdots + B_M z^{-M}, \qquad [(p-r) \times p]
\]
where the matrix coefficients $B_k$ are associated with the autonomous signal model. By taking Fourier transforms of (2), one may verify
\[
B(e^{j\omega})\, \mathcal{S}_s(e^{j\omega}) = 0, \quad \text{for all } \omega. \tag{5}
\]
The row vectors of $B(e^{j\omega})$ then span the null space of $\mathcal{S}_s(e^{j\omega})$ as a function of frequency. In case $B(z)$ consists of a single row vector, it is straightforward to verify that the smallest order $M$ for the vector polynomial $B(z)$ which may span this null space [as in (5)] is precisely the smallest integer $M$ for which the block Toeplitz matrix $\mathcal{T}_M$ becomes singular [as in (2)].

For the system identification context studied thus far, one may verify that the spectral density matrix $\mathcal{S}_s(e^{j\omega})$ may be decomposed as
3"(eJ~) : [H(eJ )] Sa~ (e j~) [Ip-r H*(eJ~)], where OO
=
•
is the power spectral density matrix of the input sequence {sl(')}. This shows that the rank of S,(e jw) is generically equal to the number of free inputs to the system (= p - r ) , assuming further dependencies do not connect the components of {sl(')} (persistent excitation). As the outputs {s2(.)} are filtered versions of {Sl(.)}, their inclusion does not alter the rank of
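This rank statement can be checked numerically. The following rough sketch (an averaged periodogram; the estimator, segment length, and the synthetic data, regenerated from the previous example, are all illustrative choices) estimates $\mathcal{S}_s(e^{j\omega})$ and inspects its singular values bin by bin:

```python
import numpy as np

def spectral_density(s, nfft=256):
    """Crude averaged-periodogram estimate of S_s(e^{jw}), returned as an
    (nfft, p, p) array of cross-spectral matrices, one per frequency bin."""
    N, p = s.shape
    nseg = N // nfft
    S = np.zeros((nfft, p, p), dtype=complex)
    for m in range(nseg):
        F = np.fft.fft(s[m * nfft:(m + 1) * nfft], axis=0)   # (nfft, p)
        S += F[:, :, None] * F[:, None, :].conj() / nfft     # outer products
    return S / nseg

rng = np.random.default_rng(0)
s1 = rng.standard_normal(200_000)
s2 = 0.5 * s1 + 0.25 * np.r_[0.0, s1[:-1]]     # same toy system as before
s = np.column_stack([s1, s2])

sv = np.linalg.svd(spectral_density(s), compute_uv=False)   # per-bin values
print(sv.mean(axis=0))   # one O(1) singular value, one ~ 0 at every bin:
                         # rank S_s(e^{jw}) = p - r = 1, as claimed.
```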
Next, we can observe that
\[
[\,N(e^{j\omega})\ \ {-D(e^{j\omega})}\,]\, \mathcal{S}_s(e^{j\omega}) = 0, \quad \text{for all } \omega.
\]
With a little further work, one may show that provided the $r$ row vectors of the matrix $[\,N(e^{j\omega})\ \ {-D(e^{j\omega})}\,]$ are linearly independent for (almost) all $\omega$ (which amounts to saying that the normal rank of $[\,N(z)\ \ {-D(z)}\,]$ is full), then the ratio $[D(z)]^{-1} N(z)$ must furnish the system $H(z)$. Note that if $N(z)$ and $D(z)$ are both multiplied from the left by an invertible matrix (which may be a function of $z$), the ratio $[D(z)]^{-1} N(z)$ is left unaltered. As a particular case, consider a Gramian matrix
\[
[\,N(e^{j\omega})\ \ {-D(e^{j\omega})}\,] \begin{bmatrix} N^*(e^{j\omega}) \\ -D^*(e^{j\omega}) \end{bmatrix} = F(e^{j\omega})\, F^*(e^{j\omega}),
\]
with the $r \times r$ matrix $F(z)$ minimum phase (i.e., causal and causally invertible). It is then easy to verify that the row vectors of the matrix
\[
[F(e^{j\omega})]^{-1}\, [\,N(e^{j\omega})\ \ {-D(e^{j\omega})}\,]
\]
are orthonormal for all $\omega$, and thus yield orthonormal spanning vectors for the null space of $\mathcal{S}_s(e^{j\omega})$. The system identification problem is then algebraically equivalent to finding orthonormal spanning vectors for the null space of $\mathcal{S}_s(e^{j\omega})$.
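A pointwise sketch of this orthonormalization is below (Python/NumPy; the shapes and the example polynomials are hypothetical). One caveat: a frequency-by-frequency Cholesky factor normalizes the rows but does not by itself enforce that $F(z)$ be minimum phase; a genuine spectral factorization algorithm is needed for that.

```python
import numpy as np

def orthonormal_null_basis(G):
    """Given G(w) = [N(e^{jw})  -D(e^{jw})] as an (nfft, r, p) array, return
    F(w)^{-1} G(w), where F(w) F(w)^* is the Gramian of the rows of G(w);
    the rows of the result are orthonormal at every frequency."""
    gram = G @ np.conj(np.swapaxes(G, -1, -2))    # (nfft, r, r) Gramians
    F = np.linalg.cholesky(gram)                  # F F^* = Gramian
    return np.linalg.solve(F, G)

# Example with N(z) = 0.5 + 0.25 z^{-1} and D(z) = 1 (r = 1, p = 2):
w = 2 * np.pi * np.arange(256) / 256
Nw = 0.5 + 0.25 * np.exp(-1j * w)
G = np.stack([Nw, -np.ones_like(Nw)], axis=-1)[:, None, :]   # (256, 1, 2)
Q = orthonormal_null_basis(G)
QQ = Q @ np.conj(np.swapaxes(Q, -1, -2))
print(np.allclose(QQ, np.eye(1)))     # True: rows orthonormal for all w
```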
4 BROADBAND SOURCE LOCALIZATION
Source localization algorithms aim to determine the direction of arrival of a set of waves impinging on a sensor array. We review the basic geometric structure of this problem, in order to obtain the same characterization exposed in the system identification context. The outputs of a $p$-sensor array are now modelled as
\[
y(n) = \mathcal{A}(z)\, s(n) + b(n),
\]
where the elements of $s(n)$ are mutually independent source signals, and where $\{b(\cdot)\}$ is an additive white noise vector. The columns of $\mathcal{A}(z)$ contain the successive transfer functions connecting a source at a given spatial location to the successive array outputs. Each column of $\mathcal{A}(z)$ is thus called a steering vector, which models spatial and frequential filtering effects proper to the transmission medium and array geometry. The problem is to deduce the spatial locations of the emitting sources, given the array snapshot sequence $\{y(n)\}$.
The spectral density matrix from the sensor outputs now becomes
\[
\mathcal{S}_y(e^{j\omega}) = \sum_{k=-\infty}^{\infty} E[y(n)\, y^*(n-k)]\, e^{-jk\omega} = \mathcal{A}(e^{j\omega})\, \mathcal{S}_s(e^{j\omega})\, \mathcal{A}^*(e^{j\omega}) + \sigma^2 I_p,
\]
provided the noise term $\{b(\cdot)\}$ is indeed white. Here $\mathcal{S}_s(e^{j\omega})$ is the power spectral density matrix of the emitting sources. Provided the number of sources is strictly less than the number of sensors, the first term on the right-hand side (i.e., the signal-induced component) is rank deficient for all $\omega$. It turns out that its null space completely characterizes the solution of the problem [6], [8]. For if we find orthonormal spanning vectors for the null space of the signal-induced term, then we will have constructed the orthogonal complement space to that spanned by the columns of $\mathcal{A}(e^{j\omega})$. This, combined with knowledge of the array response pattern versus emitter localization, is sufficient to recover the information contained in $\mathcal{A}(z)$, namely the spatial locations of the sources [8]. More detail on constructing orthonormal spanning vectors for this null space, in the context of adaptive filtering, is developed in [2], [6] and the references therein.

We can observe some superficial similarities between the system identification problem and the source localization problem. In both cases, the usable signal component induces a rank-deficient power spectral density matrix, and in both cases, the information sought (a linear system or spatial location parameters) is entirely characterized by the null space of the singular spectral density matrix in question. Accordingly, algorithms designed for subspace system identification can be used for subspace source localization, and vice-versa. See, e.g., [5], [9].
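As an illustration of how the same null-space machinery drives localization, here is a minimal noise-subspace scan in the spirit of broadband MUSIC (a sketch only: the plane-wave steering model, the array geometry, and all parameter names are assumptions made for this example, not notation from the paper):

```python
import numpy as np

def broadband_subspace_scan(Sy, sensor_pos, thetas, n_sources, fs=1.0, c=1.0):
    """Score candidate arrival angles by the inverse projection of a
    plane-wave steering vector onto the noise subspace of Sy(e^{jw}),
    accumulated over frequency bins.  Sy: (nfft, p, p) spectral matrices."""
    nfft, p, _ = Sy.shape
    freqs = np.fft.fftfreq(nfft, d=1.0 / fs)
    scores = np.zeros(len(thetas))
    for i, theta in enumerate(thetas):
        delays = sensor_pos * np.sin(theta) / c    # per-sensor travel delays
        acc = 0.0
        for k in range(1, nfft // 2):              # positive-frequency bins
            _, E = np.linalg.eigh(Sy[k])           # ascending eigenvalues
            En = E[:, : p - n_sources]             # noise-subspace basis
            a = np.exp(-2j * np.pi * freqs[k] * delays) / np.sqrt(p)
            acc += 1.0 / (np.linalg.norm(En.conj().T @ a) ** 2 + 1e-12)
        scores[i] = acc
    return scores      # peaks over `thetas` indicate source directions
```

The smallest-eigenvector computation applied to $\mathcal{T}_M$ above reappears here frequency by frequency, which is precisely the algebraic overlap between the two problems.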
5 THE UNDERMODELLED CASE
The development thus far has, for convenience, assumed that the order $M$ of the autonomous signal model (1) was available. In practice, the required filter order $M$ is highly signal-dependent, posing the obvious dilemma of how to properly choose $M$ in cases where a priori information on the data is inadequate.
to show the result. Note also that, as expected, a vector in the null space of $\mathcal{T}_M$ yields the coefficients of the ARMA model in question.

Suppose that the actual sequences $\{s_1(\cdot)\}$ and $\{s_2(\cdot)\}$ are related as
\[
s_2(n) = \sum_{k=0}^{\infty} h_k\, s_1(n-k). \tag{6}
\]
To avoid an argument that says we can increase the chosen order $M$ until we hit the correct value, we assume that the transfer function
\[
H(z) = \sum_{k=0}^{\infty} h_k\, z^{-k}
\]
is infinite dimensional (i.e., not rational). In this case, the covariance matrix $\mathcal{T}_M$ will have full rank irrespective of what value we choose for the integer $M$.

Consider then trying to find an ARMA signal model which is "best compatible" with $\mathcal{T}_M$. To this end, let two sequences $\{\hat{s}_1(\cdot)\}$ and $\{\hat{s}_2(\cdot)\}$ be related as
\[
\sum_{k=0}^{M} a_k\, \hat{s}_2(n-k) = \sum_{k=0}^{M} b_k\, \hat{s}_1(n-k),
\]
where the coefficients $\{a_k\}$ and $\{b_k\}$ remain to be determined. We note here that, with
\[
\hat{s}(n) = \begin{bmatrix} \hat{s}_1(n) \\ \hat{s}_2(n) \end{bmatrix}
\quad \text{and} \quad
\hat{\mathcal{T}}_M = E\left[ \begin{bmatrix} \hat{s}(n) \\ \hat{s}(n-1) \\ \vdots \\ \hat{s}(n-M) \end{bmatrix} [\,\cdot\,]^* \right],
\]
we shall have
\[
\hat{\mathcal{T}}_M \begin{bmatrix} b_0 \\ -a_0 \\ b_1 \\ -a_1 \\ \vdots \\ b_M \\ -a_M \end{bmatrix} = 0, \tag{7}
\]
so that $\hat{\mathcal{T}}_M$ is always singular. Set now
\[
\hat{y}(n) = \hat{s}(n) + \begin{bmatrix} b_1(n) \\ b_2(n) \end{bmatrix},
\]
where the disturbance terms $\{b_1(\cdot)\}$ and $\{b_2(\cdot)\}$ are chosen to render $\{\hat{y}(\cdot)\}$ compatible with the true data. In particular, the covariance matrix built from $\hat{y}(n), \ldots, \hat{y}(n-M)$ takes the form
\[
\hat{\mathcal{T}}_M + \mathcal{T}_b,
\]
where $\mathcal{T}_b$ is the covariance matrix built from $\{b_1(\cdot)\}$ and $\{b_2(\cdot)\}$. This becomes compatible
with the true data provided we set
\[
\mathcal{T}_b = \mathcal{T}_M - \hat{\mathcal{T}}_M.
\]
Given only that $\hat{\mathcal{T}}_M$ is singular, though, a standard result in matrix approximation theory gives
\[
\|\mathcal{T}_b\| = \|\mathcal{T}_M - \hat{\mathcal{T}}_M\| \geq \lambda_{\min}(\mathcal{T}_M).
\]
As a particular case, consider the choice
\[
\hat{\mathcal{T}}_M = \mathcal{T}_M - \lambda_{\min} I.
\]
This retains a block Toeplitz structure as required, but is now positive semi-definite.
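A small self-contained sketch of this choice (Python/NumPy; the non-rational example system $h_k = 1/(k+1)^2$ and the truncation length are arbitrary illustrative choices) confirms that the shift leaves a singular, positive semi-definite block Toeplitz matrix whose null vector carries the model coefficients in the layout of (7):

```python
import numpy as np

# Exact covariances for s = [s1; s2], with s1 unit-variance white noise and
# s2(n) = sum_k h_k s1(n-k) for the non-rational system h_k = 1/(k+1)^2.
M, K = 1, 500                               # model order; series truncation
h = 1.0 / np.arange(1, K + 1) ** 2
r = np.array([h[: K - k] @ h[k:] for k in range(M + 1)])   # r_k = sum h_i h_{i+k}

def R(k):                                   # R_k = E[s(n) s^*(n-k)]
    return np.array([[float(k == 0), h[0] * (k == 0)],
                     [h[k],          r[k]]])

T = np.block([[R(j - i) if j >= i else R(i - j).T
               for j in range(M + 1)] for i in range(M + 1)])

lam, V = np.linalg.eigh(T)
T_hat = T - lam[0] * np.eye(T.shape[0])     # block Toeplitz, PSD, singular
v = V[:, 0]                                 # null vector [b0, -a0, b1, -a1]
print(lam[0] > 0.0)                         # True: the data are undermodelled
print(np.linalg.norm(T_hat @ v))            # ~ 0
```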
For this choice, we have similarly
\[
E[\hat{s}_1(n)\, \hat{s}_1(n-k)] = (1 - \lambda_{\min})\, \delta_k,
\]
so that the matching of cross-correlation terms from (8) may be expressed as
\[
\hat{h}_k = \begin{cases} \dfrac{h_k}{1 - \lambda_{\min}}, & k = 0, 1, \ldots, M; \\[2mm] 0\ (= h_k), & k = -1, -2, \ldots, -M. \end{cases}
\]
This shows that the first few terms of the impulse response of $\hat{H}(z)$ agree to within a factor $1/(1 - \lambda_{\min})$ with those produced by the true system $H(z)$. Similarly, we can also observe that
\[
E[s_2(n)\, s_2(n-k)] = \sum_{i=0}^{\infty} h_i\, h_{i+k} \triangleq r_k,
\]
if $\{s_1(\cdot)\}$ is unit-variance white noise. This gives the $k$th term of the autocorrelation sequence associated to $H(z)$. For the reconstructed model, we likewise have
\[
E[\hat{s}_2(n)\, \hat{s}_2(n-k)] = (1 - \lambda_{\min}) \sum_{i=0}^{\infty} \hat{h}_i\, \hat{h}_{i+k} = (1 - \lambda_{\min})\, \hat{r}_k.
\]
The matching properties (10) then show that
\[
\hat{r}_k = \begin{cases} \dfrac{r_0 - \lambda_{\min}}{1 - \lambda_{\min}}, & k = 0; \\[2mm] \dfrac{r_k}{1 - \lambda_{\min}}, & k = 1, 2, \ldots, M; \end{cases}
\]
which reveals how the correlation sequences compare. A slightly different strategy is investigated in [7], which builds the function $\hat{H}(z)$ from an extremal eigenvector of a Schur complement of $\mathcal{T}_M$. This can improve the impulse and correlation matching properties considerably [7].
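The impulse-response matching can be verified numerically. In the following self-contained sketch (same hypothetical non-rational system as above; SciPy is used only to expand $\hat{H}(z) = b(z)/a(z)$ into its impulse response), the first $M+1$ terms of $(1 - \lambda_{\min})\hat{h}_k$ reproduce $h_k$:

```python
import numpy as np
from scipy.signal import lfilter

M, K = 3, 500
h = 1.0 / np.arange(1, K + 1) ** 2          # non-rational "true" system
r = np.array([h[: K - k] @ h[k:] for k in range(M + 1)])

def R(k):
    return np.array([[float(k == 0), h[0] * (k == 0)],
                     [h[k],          r[k]]])

T = np.block([[R(j - i) if j >= i else R(i - j).T
               for j in range(M + 1)] for i in range(M + 1)])

lam, V = np.linalg.eigh(T)
lam_min, v = lam[0], V[:, 0]
b, a = v[0::2], -v[1::2]                    # layout of the null vector (7)

imp = np.zeros(M + 1); imp[0] = 1.0
h_hat = lfilter(b, a, imp)                  # hat-h_0, ..., hat-h_M
print((1.0 - lam_min) * h_hat)              # matches h[0..M] ...
print(h[: M + 1])                           # ... term by term
```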
6 CONCLUDING REMARKS
We have shown how the system identification and source localization problems may be addressed in a common framework. In both cases, the desired information is characterized in terms of spanning vectors of the null space of a power spectral density matrix. Numerical methods for determining the null space have appeared in different state space formulations [2], [4], [6], [10], which are, for the most part, oriented around orthogonal transformations applied directly to the available data.

We have also examined the influence of undermodelling. Some recent work in this direction [7] shows that subspace methods correspond to total least-squares equation error methods. This can yield weaker subspace fits in undermodelled cases compared to Hankel norm or $\mathcal{H}_\infty$ subspace fits. The reduced order system so constructed, however, is intimately connected to low rank matrix approximation, which in turn can be expressed in terms of
interpolation properties relating the impulse and correlation sequences between the true system and its reduced order approximant. More detail on these interpolation properties is available in [7].

References
[1] J. A. Cadzow, "Multiple source location -- The signal subspace approach," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 38, pp. 1110-1125, July 1990.
[2] I. Fijalkow, Estimation de Sous-Espaces Rationnels, doctoral thesis, École Nationale Supérieure des Télécommunications, Paris, 1993.
[3] T. Kailath, Linear Systems, Prentice-Hall, Englewood Cliffs, NJ, 1980.
[4] M. Moonen, B. De Moor, L. Vandenberghe, and J. Vandewalle, "On- and off-line identification of linear state-space models," Int. J. Control, vol. 49, pp. 219-232, 1989.
[5] P. A. Regalia, "Adaptive IIR filtering using rational subspace methods," Proc. ICASSP, San Francisco, March 1992.
[6] P. A. Regalia and Ph. Loubaton, "Rational subspace estimation using adaptive lossless filters," IEEE Trans. Signal Processing, vol. 40, pp. 2392-2405, October 1992.
[7] P. A. Regalia, "An unbiased equation error identifier and reduced order approximations," IEEE Trans. Signal Processing, vol. 42, pp. 1397-1412, June 1994.
[8] G. Su and M. Morf, "The signal subspace approach for multiple wide-band emitter location," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 31, pp. 1502-1522, December 1983.
[9] F. Vanpoucke and M. Moonen, "A state space method for direction finding of wideband emitters," Proc. EUSIPCO-94, Edinburgh, Sept. 1994, pp. 780-783.
[10] M. Verhaegen and P. Dewilde, "Subspace model identification (Pts. 1 and 2)," Int. J. Control, vol. 56, pp. 1187-1241, 1992.