Copyright © IFAC System Identification, Copenhagen, Denmark, 1994
COMPUTING SOLUTIONS OF THE FRISCH SCHEME USING A POTENTIAL REDUCTION METHOD¹

Johan DAVID and Bart DE MOOR²

ESAT, Katholieke Universiteit Leuven, Kardinaal Mercierlaan 94, 3001 Leuven (Heverlee), Belgium, tel 32/16/22 09 31, fax 32/16/22 18 55, email: bart.demoor@esat.kuleuven.ac.be
Abstract. Despite the fact that the Frisch scheme has been studied for quite some time now, only solutions where the rank of the matrix is one lower than its order are well understood and characterized. Good numerical methods to compute solutions with lower rank do not even exist. In this paper an algorithm is presented to compute solutions of the Frisch scheme when the resulting rank is lower than the order of the matrix minus one. The algorithm is based on a potential reduction algorithm for convex optimization. The problem is, however, not convex. The resulting solution is not always guaranteed to achieve the minimal possible rank. However, a reduced rank solution is always returned.
Key Words. Identification; optimization; Frisch scheme; linear matrix inequalities.
1. INTRODUCTION
The Frisch scheme deals with the problem of identifying linear relations between measured data. Suppose n variables are measured over m time instants. The measurements are aggregated in an m x n real matrix M. It is assumed that the number of measurements exceeds the number of variables, m > n, so that the matrix M has more rows than columns. This assumption of overdetermination is of course necessary, since otherwise the problem would be trivial. An existing linear relation will reveal itself via an n-vector x that belongs to the kernel of the matrix M:

M x = 0.

The number of independent linear relations is indicated by the algebraic rank r of M. The corank of M is defined as n - rank(M). The corank equals the number of linearly independent relations between the variables.

Typically, however, measurements are noisy, so that no exact linear relations between the data exist. A fundamental assumption on the noise is that it is additive and uncorrelated. The measured matrix M can then be written as

M = M̄ + N,

where M̄ denotes the noisefree data and N the noise. The noise on the different channels is assumed to be uncorrelated, hence N'N is diagonal and ker(N) = 0. No linear relations between the noise and the exact data exist either:

M̄'N = 0.

The measured, exact and noise Grammians are then denoted by A = M'M, Ā = M̄'M̄ and N'N, so that A = Ā + N'N.

The Frisch scheme can now be formulated as follows: given a positive definite matrix A, find a diagonal, positive definite matrix D such that:

1. Ā = A - D is positive semidefinite.
2. The corank of Ā is maximal.

¹ The following text presents research results obtained within the framework of the Belgian programme on interuniversity attraction poles (IUAP-17, IUAP-50) initiated by the Belgian State, Prime Minister's Office, Science Policy Programming. The scientific responsibility is assumed by its authors.

² Bart De Moor is a senior research associate with the N.F.W.O. (Belgian National Fund for Scientific Research).
The maximization of the corank is necessary since the goal is to find the maximum number of linear relations. Generically, the rank of A cannot be reduced arbitrarily: there exists a generic lower bound, called the Wilson-Lederman bound; see e.g. De Moor (1988) and references therein. The rank of A can be reduced below this bound only in special cases.

Although the Frisch scheme has been studied for a long time already, its solutions are only well characterized when the minimal rank of Ā = A - D is just one lower than the rank of A, and necessary and sufficient conditions for this case are well understood by now (Kalman, 1982). In this paper a method to find solutions of lower rank is presented for the cases where these conditions are not met. It is based on a potential reduction method for convex optimization. Section 2 briefly describes how the potential reduction method can be used. Section 3 contains some numerical examples. The conclusions are summarized in section 4.
2. THE POTENTIAL REDUCTION METHOD
_H- 1 9
In this section we describe how a potential reduc-
n
+
subject to D
where E, is an n x n-matrix with all zeros except a one on position (i, i). The Newton algorithm is:
Dlt+l = Dk - ~diag(H-lg). The damping factor et still has to be determined. To ensure that Dlt+l is again inside the feasible set if Dk is feasible, the Nesterov-Nemirovsky damping factor is taken. This damping factor depends on the so-called Newton decrement o(D) (Nesterov and Nemirovsky, 1993):
> 0, -
> 0,
o(D) = IIH(D)-1/2 g(D)II.
where D is a diagonal matrix. The solutions where A - D is rank deficient, are situated on the boundary of the convex set of positive (semi) definite matrices. The constraints D > 0 are also convex. The idea is to start inside the feasible set and to move to the boundary. Therefore the following objective function is constructed:
t/J(D) = qlogdet(A-D)-logdet
(~
L tI;(A - D)-l/~ Ei(A - D)-1/2I1F + i_I
The Frisch scheme can be reformulated as minimum rank problem involving linear matrix inequalities: D
argmin 111(1 - q) tI
tion method can be used to solve reduced rank linear matrix inequality (LMI) problems. The potential reduction method is an interior point method for the optimization of a linear objective function over a convex set. The Frisch scheme is however not a convex problem. Nevertheless, an extension of the potential method for convex optimization can be used to find solutions quickly. A more detailed description of the algorithm can be found in David (1994).
min Rank A - D
=
The Nesterov-Nemirovsky damping factor is then: et {
et
= ~ ~fo(D) > .25, =1 Ifo(D) ~ .25.
If v is known, o(D) can easily be computed as:
A~D )
o(D) =
It
v,(A - D)-1/2 E,(A -
The first term is -00 where A - D is rank deficient. The second term becomes +00 on the boundary of the domain D > 0 and A-D > O. It
+~tVl 1138
D)-1/211 F
1=1
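For concreteness, a single damped Newton step of this scheme is sketched below in Python with numpy. This is only an illustrative sketch of the equations above, not the authors' implementation; the helper names, the default q = 3 and the use of numpy.linalg.lstsq to solve the least squares problem are our own choices.

    import numpy as np

    def inv_sqrt(M):
        # Symmetric inverse square root of a positive definite matrix.
        w, V = np.linalg.eigh(M)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    def newton_step(A, d, q=3.0):
        # One damped Newton step on the linearized potential phi(D).
        # d holds the diagonal of D; returns (new d, delta, direction v).
        n = len(d)
        S = inv_sqrt(A - np.diag(d))                    # (A - D)^(-1/2)
        # Least squares problem for v = -H^(-1) g.  Since E_i = e_i e_i',
        # S E_i S is the outer product of the i-th column of S with itself.
        cols = [np.outer(S[:, i], S[:, i]).ravel() for i in range(n)]
        M = np.vstack([np.column_stack(cols), np.diag(1.0 / d)])
        b = np.concatenate([(q - 1.0) * np.eye(n).ravel(), np.ones(n)])
        v, *_ = np.linalg.lstsq(M, b, rcond=None)
        # Newton decrement delta(D) = sqrt(v' H v).
        X = sum(v[i] * np.outer(S[:, i], S[:, i]) for i in range(n))
        delta = np.sqrt(np.linalg.norm(X, 'fro')**2 + np.sum((v / d)**2))
        # Nesterov-Nemirovsky damping.
        alpha = 1.0 if delta <= 0.25 else 1.0 / (1.0 + delta)
        return d + alpha * v, delta, v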
Starting from an initial feasible point, the algorithm will move towards a minimum of φ(D). Each of the updates will be inside the feasible set. To stop the algorithm the following criterion is used: it can be shown that when D is near the boundary of A - D >= 0,
( || sum_{i=1}^n v_i (A - D)^{-1/2} E_i (A - D)^{-1/2} ||_F / (1 - q) )^2        (1)
is a good estimate of the number of decreasing eigenvalues. When it is rounded towards the nearest integer, it indicates the number of decreasing eigenvalues of A - D. Let ncf be that number. The stopping criterion is then: sort the eigenvalues in ascending order and check if the eigenvalue indicated by ncf is smaller than a certain tolerance. If so, stop; else continue. The resulting rank of A - D is n - ncf.
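Continuing the Python sketch above (again with our own naming, and the tolerance left as a free parameter), the stopping test reads:

    def stopping_test(A, d, v, q=3.0, tol=1e-6):
        # Estimate via (1) how many eigenvalues of A - D are decreasing
        # and check whether they have all dropped below the tolerance.
        n = len(d)
        S = inv_sqrt(A - np.diag(d))
        X = sum(v[i] * np.outer(S[:, i], S[:, i]) for i in range(n))
        ncf = min(n, int(round((np.linalg.norm(X, 'fro') / (1.0 - q))**2)))
        eigs = np.linalg.eigvalsh(A - np.diag(d))       # ascending order
        return ncf > 0 and eigs[ncf - 1] < tol, ncf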
Fig. 1. The logarithm of the eigenvalues for a simple 2 x 2 example as a function of the number of iterations. One eigenvalue decreases to 0. The resulting rank of A - D is thus 1. Note how the eigenvalue decreases linearly on this logarithmic plot.
To start, the algorithm needs an initial point inside the feasible set. We suggest to start from the analytic center of the constraints A - D > 0 and D > 0 (Boyd and El Ghaoui, 1993). The minimum rank problem is, however, not convex: there is not a single minimum. The intuitive idea behind the algorithm is that if the rank of A - D at some boundary point is lower than at other boundary points, φ(D) will be steeper towards this point, so that the method is more likely to converge to it. Numerical experiments show that this is often the case. However, the optimization method can get stuck in minima other than the optimal one. Still, these minima are reduced rank solutions. This is illustrated in the next section.
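The analytic center itself is the minimizer of the barrier term -log det(A - D) - log det D. A naive way to compute it is sketched below; the formulation and the use of a general purpose optimizer instead of a dedicated Newton method are our own simplifications:

    from scipy.optimize import minimize

    def analytic_center(A):
        # Analytic center of {A - D > 0, D > 0} over diagonal D.
        def barrier(d):
            lam = np.linalg.eigvalsh(A - np.diag(d))
            if lam.min() <= 0.0 or d.min() <= 0.0:
                return np.inf                   # outside the feasible set
            return -np.sum(np.log(lam)) - np.sum(np.log(d))
        # d = (lambda_min(A)/2) * ones is a strictly feasible start.
        d0 = 0.5 * np.linalg.eigvalsh(A)[0] * np.ones(A.shape[0])
        return minimize(barrier, d0, method='Nelder-Mead').x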
3. NUMERICAL EXAMPLES

In this section some numerical examples are shown. First, 2 x 2 examples are solved to illustrate some facts graphically. Consider a first example: a 2 x 2 positive definite matrix A. The lowest possible rank for A - D is 1. The rank minimization algorithm starts at the analytic center of A - D > 0 and D > 0. The evolution of the eigenvalues is shown in figure 1.

If D is represented by D = diag(x, y), the steps of the algorithm can be represented in the xy-plane. This is shown in figure 2, where also the boundaries of the feasible region are plotted. The boundary of the feasible region consists of the axes x = 0 and y = 0. The curved line represents the boundary where det(A - D) = 0. All the points on that boundary represent matrices of rank 1. The algorithm starts in the analytic center of the feasible region: (3.49, 3.06). In a few steps the algorithm moves to a singular point on the boundary: (5.86, 5.13).

In figure 2 also the paths followed by the optimization algorithm are shown, starting from 40 points around the analytic center. All the paths tend to the boundary of the feasible region where the rank of A - D equals 1.

Fig. 2. The upper figure shows the steps taken by the algorithm starting from the analytic center of the feasible region. The curved line is the boundary where A - D is rank deficient. The algorithm quickly moves from the analytic center of the feasible region towards the boundary. The lower figure shows the paths followed starting from 40 points around the analytic center. All the paths move to the boundary where the rank of A - D equals 1.

In the previous example, all the rank deficient matrices had rank 1. Consider the following 2 x 2 matrix, where the minimal possible rank is 0:

A = [ 10  0 ; 0  5 ].

Again the algorithm is started from the analytic center of the feasible region. In this case two eigenvalues tend to 0. The successive steps converge to the intersection of the two boundaries x = 10 and y = 5, where the rank of A - D is 0. This is shown in figures 3 and 4.

In figure 4 also the paths followed by the optimization algorithm are shown, starting from 40 points around the analytic center of the feasible region. All the paths tend to the corner point of the boundary of the feasible region where the rank of A - D equals 0.
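With the routines sketched in section 2 (our notation, not the authors' code), this diagonal example can be reproduced along the following lines:

    A = np.diag([10.0, 5.0])
    d = analytic_center(A)          # for this A, approximately (5, 2.5)
    for k in range(100):
        d, delta, v = newton_step(A, d, q=3.0)
        done, ncf = stopping_test(A, d, v, q=3.0)
        if done:
            break
    print(d)                                    # tends to (10, 5)
    print(np.linalg.eigvalsh(A - np.diag(d)))   # both eigenvalues near 0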
Fig. 3. The logarithm of the eigenvalues for a special 2 x 2 example as a function of the number of iterations. Two eigenvalues decrease to 0. The resulting rank of A - D is thus 0. Note how the eigenvalues decrease linearly on this logarithmic plot.
Fig. 4. The upper figure shows the steps taken by the algorithm starting from the analytic center of A - D > 0 and D > 0. The point with the lowest rank of A - D is (10, 5). The algorithm moves quickly towards that point. The lower figure shows the paths followed by the optimization algorithm starting from 40 points around the analytic center. All the paths move to the corner point of the boundary where the rank of A - D is 0.
The method is not restricted to small examples. Consider the 10 x 10 matrix A10 whose numerical values are shown in the appendix. This matrix is positive definite and of full rank. It is constructed such that the minimal rank for the Frisch scheme is known: A10 = B B' + D1, with B a random 10 x 3 matrix and D1 a diagonal matrix with strictly positive elements. The minimal obtainable rank of A - D is thus 3. The eigenvalues as a function of the iterations are shown in figure 5. Seven eigenvalues decrease, so the rank of the resulting matrix is 3, and the optimal D is exactly D1. Note that this is not a typical order 10 example, since the Wilson-Lederman bound for n = 10 equals 6.
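This construction is easy to reproduce; the sketch below uses our own random numbers, not the exact A10 of the appendix:

    rng = np.random.default_rng(1994)
    B = rng.standard_normal((10, 3))
    d1 = rng.uniform(0.5, 2.0, size=10)   # strictly positive diagonal of D1
    A10 = B @ B.T + np.diag(d1)           # minimal achievable rank of A - D is 3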
Fig. 5. The logarithm of the eigenvalues for an artificially constructed 10 x 10 example as a function of the number of iterations. Seven eigenvalues decrease to 0. The resulting rank of A - D is thus 3.
As a last example, the evolution of the eigenvalues for a random 7 x 7 matrix A7 is shown in figure 6. The numerical values of the entries are shown in the appendix. Two eigenvalues decrease, so the resulting rank of A - D is 5. This is higher than the Wilson-Lederman bound, which is 4 for a 7 x 7 matrix. This is correct, as the Wilson-Lederman bound is only a generic lower bound. The resulting elements of D are:
D = diag(0.008258, 2.475, 0.2606, 0.0109, 0.8981, 0.01552, 0.01883).

Fig. 6. The logarithm of the eigenvalues for a random 7 x 7 example as a function of the number of iterations. Two eigenvalues decrease to 0. The resulting rank of A - D is thus 5.
4. CONCLUSIONS

In this paper an algorithm is presented to find reduced rank solutions for the Frisch scheme. The algorithm is based on an algorithm for convex optimization, although the problem itself is not convex.
The algorithm can evolve to points that are not optimal. However, the solutions are always of reduced rank. The same algorithm can also be used to compute solutions for extensions of the Frisch scheme, e.g. when it is known that certain cross-correlations exist between channels. This can be done by introducing non-zero parameters in off-diagonal elements of D. The only restriction is that D has to remain symmetric (and positive definite).
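In the least squares formulation of section 2 this extension amounts, in our reading, to enlarging the set of basis matrices: besides the diagonal E_i, one symmetric basis matrix is added per allowed cross-correlation. The function below is a hypothetical sketch of that bookkeeping, not part of the paper's implementation:

    def basis_matrices(n, offdiag=()):
        # Diagonal basis E_i plus one symmetric basis matrix per
        # allowed cross-correlation (i, j).
        Es = [np.diag((np.arange(n) == i).astype(float)) for i in range(n)]
        for i, j in offdiag:
            E = np.zeros((n, n))
            E[i, j] = E[j, i] = 1.0
            Es.append(E)
        return Es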
5. REFERENCES

Boyd, S. and L. El Ghaoui (1993). Method of centers for minimizing generalized eigenvalues. Linear Algebra and its Applications, 188-189, 63-113.

David, J. (1994). Algorithms for Analysis and Design of Robust Controllers. Ph.D. thesis, Dept. Elec. Eng., K.U.Leuven, Leuven.

De Moor, B. (1988). Mathematical Concepts and Techniques for Modelling of Static and Dynamic Systems. Ph.D. thesis, Dept. Elec. Eng., K.U.Leuven, Leuven.

Kalman, R.E. (1982). System identification from noisy data. In: Proc. Int. Symp. on Dynamical Systems (Gainesville, Florida, 1981) (A. Bednarek, Ed.), Academic Press, New York.

Nesterov, Y. and A. Nemirovsky (1993). Interior Point Polynomial Methods in Convex Programming: Theory and Applications. SIAM, Philadelphia.
APPENDIX
A10 =

 3.670   1.780  -1.680   1.295  -1.644   2.474   0.247   4.011   2.127  -0.348
 1.780   1.570  -0.215  -0.020  -0.797   0.905  -0.185   1.468   0.736   0.928
-1.680  -0.215   2.822  -1.429   0.710  -0.714  -0.468  -2.125  -1.870   2.319
 1.295  -0.020  -1.429   1.954  -0.662   1.269   0.449   2.146   1.250  -1.479
-1.644  -0.797   0.710  -0.662   1.756  -1.418  -0.142  -2.073  -0.940   0.037
 2.474   0.905  -0.714   1.269  -1.418   3.654   0.353   3.922   1.111   0.435
 0.247  -0.185  -0.468   0.449  -0.142   0.353   0.699   0.589   0.347  -0.614
 4.011   1.468  -2.125   2.146  -2.073   3.922   0.589   6.453   2.492  -0.776
 2.127   0.736  -1.870   1.250  -0.940   1.111   0.347   2.492   1.879  -1.558
-0.348   0.928   2.319  -1.479   0.037   0.435  -0.614  -0.776  -1.558   3.582

A7 =

 6.157  -5.434   4.433   1.944  -1.170   1.227   2.560
-5.434  13.450  -4.839  -0.442   4.976  -1.737  -3.578
 4.433  -4.839   9.128  -5.036  -2.579  -2.126   0.975
 1.944  -0.442  -5.036   6.401   4.124  -0.608   3.822
-1.170   4.976  -2.579   4.124   6.122   0.278  -1.911
 1.227  -1.737  -2.126  -0.608   0.278   6.669   3.881
 2.560  -3.578   0.975   3.822  -1.911   3.881   7.352