Worst-case identification in ℓ2: linear and nonlinear algorithms

Jonathan R. Partington

Systems & Control Letters 22 (1994) 93-98
School of Mathematics, University of Leeds, Leeds LS2 9JT, UK

Received 9 January 1993; revised 25 April 1993

Abstract: We consider worst-case identification in the ℓ2 norm. Given an unknown system h ∈ ℓ1, one wishes to choose bounded inputs u ∈ ℓ∞ such that, given finitely many corrupted output measurements {y(k): 0 ≤ k < n} of y = h * u + η, where η is noise, assumed small, one can construct an approximation g with ||g − h||_2 → 0 as ||η||_∞ → 0 and n → ∞. It is shown that inputs can be chosen such that, to identify a sequence of length n with an ℓ2 error of O(||η||_∞), one requires only O(n) measurements. A numerical example is included.

Keywords: Worst-case identification; approximation; least squares; linear programming; input design.

1. Introduction

One of the key problems of worst-case identification is the reconstruction of a system from partial and inaccurate information. From this information one constructs a model, an approximation to the true system, in such a way that, in the limit as the number of measurements goes to infinity and the noise level goes to zero, the approximation converges to the true system in some suitable topology.

In this paper we shall be concerned with identifying an unknown discrete-time system from a finite collection of input and output measurements. Formally, let h = (h(0), h(1), ...) ∈ ℓ1 correspond to an unknown linear discrete-time system and let u = (u(0), u(1), ...) ∈ ℓ∞ be a given bounded input which we can choose. Given corrupted data

    y(k) = Σ_{j=0}^{k} h(j) u(k − j) + η(k)   for k = 0, ..., n − 1,

where η, unknown, is a bounded noise sequence, the worst-case identification problem in ℓp (p = 1 and p = 2 being the most important cases) is to construct an approximation h_n(y(0), ..., y(n − 1)) to h such that

    lim_{n→∞, ε→0}  sup_{||η||_∞ ≤ ε}  ||h_n − h||_p = 0.   (1.1)

We refer to ε as the noise level. If (1.1) is satisfied, we say that u determines a robustly convergent algorithm. The ℓ1 case has been considered by many authors. It was shown in [12, 20] that there are inputs u which do determine a robustly convergent algorithm. For example, one can use Galois sequences, which are sequences consisting only of 1's and −1's such that every finite sequence of 1's and −1's occurs consecutively as a subsequence. Note that we shall always assume that our inputs are bounded, i.e. u ∈ ℓ∞, and there is no loss of generality in assuming that ||u||_∞ ≤ 1. However, in [6] it was shown that taking u to be an impulse does not give a robustly convergent algorithm, although such algorithms can be constructed given some a priori information on h.



In [13] a necessary and sufficient condition for u to determine a robustly convergent algorithm was given, namely that there exists a number δ > 0 such that ||h * u||_∞ ≥ δ ||h||_1 for all h ∈ ℓ1. A more general result was given in [17]. A finite version of the problem is of interest. Suppose that h(t) = 0 for t > n. How many measurements are needed to determine h to within an error bounded by a fixed constant times the noise level ε? Results in [4, 7, 18] indicate that in the ℓ1 case a number of measurements growing exponentially in n is needed.

Although the ℓ2 norm is not commonly used for robust control design (unlike the ℓ1 and H∞ norms), it does have a hybrid interpretation as a system-induced norm (ℓ2 inputs to bounded outputs). It is also of great importance in other contexts such as LQG control. We shall show in this work how to design inputs for the ℓ2 case such that, in order to obtain n coefficients accurately, it is only necessary to make O(n) measurements. Naturally, if the ℓ2 error is bounded by Kε then the ℓ1 error is bounded by Kε√n, and so for small values of n these inputs perform quite well in the ℓ1 case too. Note that in both the ℓ1 and ℓ2 cases the number ε is an absolute lower bound for the identification error, since a change of ε in h(0) changes each coefficient of y by at most ε and hence is indistinguishable from a noise contribution bounded by ε.
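(The ℓ1/ℓ2 comparison invoked here is simply the Cauchy–Schwarz inequality applied to a sequence x supported on n terms:

    ||x||_1 = Σ_{k=0}^{n−1} 1 · |x(k)| ≤ √n (Σ_{k=0}^{n−1} |x(k)|^2)^{1/2} = √n ||x||_2,

so an ℓ2 error bound of Kε for a length-n estimate yields an ℓ1 bound of Kε√n.)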

2. Input design for the ℓ2 case

Suppose now that h = (h(0), h(1), ...) is an unknown (real) system and that u = (u(0), ..., u(N − 1), 0, 0, ...) is a finite (real) input with ||u||_∞ ≤ 1. We write h_n for the truncated system h_n = (h(0), ..., h(n − 1), 0, 0, ...), which we expect to be close to h in the ℓ1 norm. (For the neglected components of h to have a small effect on the output we do require h ∈ ℓ1, even though we are only seeking ℓ2 convergence.) Now suppose we have available the corrupted output measurements y(k) = (h * u)(k) + η(k), for k = 0, ..., n + N − 2, with ||η||_∞ ≤ ε. Note that |y(k) − (h_n * u)(k)| ≤ ε + ||h − h_n||_1, i.e. the identification problem for h reduces to the identification problem for h_n with an extra noise component of ||h − h_n||_1. So, for the moment, let us suppose that h is itself a finite sequence of length n.

We propose three methods of obtaining identified models. The first is the classic least-squares method and is linear; the second uses linear programming and is a nonlinear algorithm in the spirit of [17]; the third is a linear algorithm which is slightly more complicated to implement. We write Y(z) = Σ y(k) z^k, H(z) = Σ h(k) z^k and U(z) = Σ u(k) z^k, and similarly for E, F and G below.
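The reduction just described is easy to check numerically. The following minimal sketch (not from the paper; the example system h and the names used are illustrative only) generates measurements y = h * u + η and verifies that replacing h by its truncation h_n perturbs each sample by at most ε + ||h − h_n||_1.

```python
# A minimal sketch (not from the paper) of the measurement model and of the
# truncation argument above: replacing h by h_n perturbs each output sample
# by at most ||h - h_n||_1, since ||u||_inf <= 1.  The system h is illustrative.
import numpy as np

rng = np.random.default_rng(1)
n, N, eps = 8, 8, 0.01
h = 0.5 ** np.arange(40)                      # some h in l1
u = rng.choice([-1.0, 1.0], size=N)           # bounded input, ||u||_inf <= 1
eta = rng.uniform(-eps, eps, size=n + N - 1)  # noise, ||eta||_inf <= eps

y = np.convolve(h, u)[:n + N - 1] + eta       # measurements y(k), k = 0..n+N-2
y_trunc = np.convolve(h[:n], u)               # output of the truncated system h_n

tail = np.abs(h[n:]).sum()                    # ||h - h_n||_1
assert np.max(np.abs(y - y_trunc)) <= eps + tail + 1e-12
```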

Theorem 2.1. Let h, u, y and η be as above. Suppose that |Σ_{k=0}^{N−1} u(k) z^k| ≥ m on the unit circle. Define e = (e(0), ..., e(n − 1), 0, 0, ...) to be a solution to the least-squares minimization problem

    min_{e = (e(0), ..., e(n−1), 0, 0, ...)}  Σ_{k=0}^{n+N−2} ((e * u − y)(k))^2.   (2.1)

Let f = (f(0), ..., f(n − 1), 0, 0, ...) denote a solution to the minimization problem

    min_{f = (f(0), ..., f(n−1), 0, 0, ...)}  max_{0 ≤ k ≤ n+N−2} |(f * u − y)(k)|.   (2.2)

Also define g = (g(0), ..., g(n − 1), 0, 0, ...) to correspond to the orthogonal projection of the L2 function G(z) = Y(z)/U(z) = Σ_{k=−∞}^{∞} g(k) z^k onto the space of polynomials of degree at most (n − 1). Then the ℓ2 errors between e, f and g and the true system h are bounded by

    ||e − h||_2 ≤ 2ε√(n + N − 1)/m,
    ||f − h||_2 ≤ 2ε√(n + N − 1)/m, and   (2.3)
    ||g − h||_2 ≤ ε√(n + N − 1)/m.

Thus, if N = n and m ≥ α√n for some absolute constant α > 0, then ||e − h||_2 ≤ 2√2 ε/α, ||f − h||_2 ≤ 2√2 ε/α and ||g − h||_2 ≤ √2 ε/α.

Proof. We observe first that ||h * u − y||_2 ≤ ε√(n + N − 1), hence ||e * u − y||_2 ≤ ε√(n + N − 1) and thus ||(e − h) * u||_2 ≤ 2ε√(n + N − 1). Also, since ||f * u − h * u||_∞ ≤ 2ε, and (f − h) * u has length at most n + N − 1, we have that ||(f − h) * u||_2 ≤ 2ε√(n + N − 1). The first two results now follow on working in the Hardy space H2 of functions analytic on the unit disc and with square-summable Taylor coefficients. Noting that e * u corresponds to E(z)U(z), we see that

    ||e − h||_2 = ||E − H||_{H2} ≤ ||(E − H)U||_2 ||U^{−1}||_{L∞} ≤ ||(E − H)U||_2 / m = ||(e − h) * u||_2 / m,

with a similar result for f. To obtain the final result, note that

    ||g − h||_2 ≤ ||Y(z)/U(z) − H(z)||_2 ≤ (1/m) ||Y(z) − H(z)U(z)||_2 ≤ ε√(n + N − 1)/m.   □

We remark here that the minimization problem in (2.1) is a standard least-squares one, and that (2.2) can be solved easily by linear programming; see e.g. [19, Ch. 2] for details. Calculating g is slightly more complicated, as one must reject the anti-analytic part of Y(z)/U(z), but it is still fairly standard.
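To make this remark concrete, here is a minimal computational sketch, not taken from the paper: the helper names, the use of scipy.optimize.linprog and the FFT grid size are our own illustrative choices, and y denotes the vector of the n + N − 1 measurements. It computes e by ordinary least squares, f by recasting the Chebyshev problem (2.2) as a linear program, and g by sampling Y(z)/U(z) on a fine grid of the unit circle and keeping the Fourier coefficients of index 0, ..., n − 1 (thereby discarding the anti-analytic part).

```python
# Sketch (not the paper's code) of the three estimators of Theorem 2.1 using
# NumPy/SciPy.  The helper names and the FFT grid size are illustrative choices.
import numpy as np
from scipy.optimize import linprog

def conv_matrix(u, n):
    """(n+N-1) x n matrix T with (T a)(k) = (a * u)(k) for a of length n."""
    N = len(u)
    T = np.zeros((n + N - 1, n))
    for j in range(n):
        T[j:j + N, j] = u
    return T

def estimate_ls(u, y, n):
    """e: least-squares solution of (2.1), minimising sum_k ((e*u - y)(k))^2."""
    T = conv_matrix(u, n)
    return np.linalg.lstsq(T, y, rcond=None)[0]

def estimate_cheb(u, y, n):
    """f: Chebyshev solution of (2.2), minimising max_k |(f*u - y)(k)|, as an LP."""
    y = np.asarray(y, dtype=float)
    T = conv_matrix(u, n)
    m = T.shape[0]
    # variables (f(0), ..., f(n-1), t); minimise t subject to |T f - y| <= t
    c = np.concatenate([np.zeros(n), [1.0]])
    A = np.block([[T, -np.ones((m, 1))], [-T, -np.ones((m, 1))]])
    b = np.concatenate([y, -y])
    res = linprog(c, A_ub=A, b_ub=b, bounds=[(None, None)] * n + [(0, None)])
    return res.x[:n]

def estimate_proj(u, y, n, grid=4096):
    """g: projection of Y(z)/U(z) onto polynomials of degree < n.  Sample Y/U
    on the unit circle and keep the Fourier coefficients of index 0, ..., n-1."""
    z = np.exp(2j * np.pi * np.arange(grid) / grid)
    Y = np.polyval(np.asarray(y)[::-1], z)     # Y(z) = sum_k y(k) z^k
    U = np.polyval(np.asarray(u)[::-1], z)
    coeffs = np.fft.fft(Y / U) / grid          # coeffs[k] ~ g(k) for 0 <= k < n
    return coeffs[:n].real
```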

Corollary 2.2. Let h be any sequence in ℓ1 and u an input of length n which satisfies |Σ_{k=0}^{n−1} u(k) z^k| ≥ α√n for |z| = 1. With η as before, and y = h * u + η, let e = (e(0), ..., e(n − 1), 0, 0, ...), f = (f(0), ..., f(n − 1), 0, 0, ...) and g = (g(0), ..., g(n − 1), 0, 0, ...) be as in Theorem 2.1 with N = n. Then the ℓ2 errors between e, f and g and the true system h are bounded by

    ||e − h||_2 ≤ 2√2 (ε + ||h − h_n||_1)/α + ||h − h_n||_2,
    ||f − h||_2 ≤ 2√2 (ε + ||h − h_n||_1)/α + ||h − h_n||_2, and   (2.4)
    ||g − h||_2 ≤ √2 (ε + ||h − h_n||_1)/α + ||h − h_n||_2,

where h_n is the truncated system (h(0), ..., h(n − 1), 0, 0, ...).

Proof. Given the observations made at the beginning of this section, Theorem 2.1 implies that ||e − h_n||_2 ≤ 2√2 (ε + ||h − h_n||_1)/α, with similar bounds on ||f − h_n||_2 and ||g − h_n||_2; hence the result follows.   □

It is an open problem, however, whether there exists a fixed α > 0 and real polynomials u_n of degree (n − 1), whose coefficients lie between +1 and −1, with

    min_{|z|=1} |u_n(z)| ≥ α√n.   (2.5)

However, we can overcome this difficulty by a trick which enables us to pass from complex inputs of length n to real inputs of length 3n − 1. It is known that there do exist complex polynomials U_n of degree (n − 1) with coefficients of modulus one such that min_{|z|=1} |U_n(z)|/√n → 1 as n → ∞; however, these are not known explicitly (see [8, 9]). In [1, 2] some more explicit polynomials are given whose coefficients have modulus at most one and which satisfy (2.5) for suitable values of α. Now if u = v + iw is a complex polynomial, decomposed into two real polynomials, we can use the real sequence corresponding to ũ(z) = v(z) + z^{2n−1} w(z), since if ||(e − h) * ũ||_∞ ≤ 2ε, then ||(e − h) * u||_∞ ≤ 2√2 ε, and we can divide out u as in the proof of Theorem 2.1. That is, a complex input of length n can be replaced by a real one of length 3n − 1, with a slightly worse error bound (a small numerical sketch of this construction is given below).

The real case is important in its own right, and for small values of n some simple explicit inputs can be obtained from [10, 3]. Using inputs which only take the values ±1, Table 1 lists some sequences (from [10, 3]) which seem to be the most useful, in that the lower bound is not much less than √n. Some computer searching produced some polynomials which have more general coefficients and larger minima; these are listed in Table 2. It would be interesting to know what the optimal polynomials of each degree actually are. A plot of the last of these (evaluated at |z| = 1) is given in Figure 1. We shall give some examples of the use of these polynomials in the next section. Note that [3] does provide some explicit sequences of arbitrarily long length giving lower bounds on the circle of around n^{0.43}; hence determining, say, 100 coefficients from an experiment of length about 200 is certainly feasible.
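As an illustration of the passage from a complex input of length n to a real input of length 3n − 1 described above, the following sketch (not from the paper) builds ũ from the real and imaginary parts of a complex polynomial with unimodular coefficients. The quadratic-phase coefficients exp(iπk^2/n) are used purely as a stand-in for the polynomials of [1, 2, 8, 9]; whatever polynomial is used, the minimum modulus actually achieved on the circle should be checked numerically, as is done here.

```python
# Sketch (not the paper's code) of the real input u~(z) = v(z) + z^(2n-1) w(z)
# built from a complex polynomial u = v + i w with unimodular coefficients.
# The quadratic-phase coefficients exp(i*pi*k^2/n) below are only a stand-in
# for the constructions of [1, 2, 8, 9].
import numpy as np

def min_modulus(coeffs, grid=100000):
    """Approximate min over |z| = 1 of |sum_k coeffs[k] z^k| by dense sampling."""
    z = np.exp(2j * np.pi * np.arange(grid) / grid)
    return np.abs(np.polyval(np.asarray(coeffs)[::-1], z)).min()

n = 32
u_complex = np.exp(1j * np.pi * np.arange(n) ** 2 / n)   # coefficients of modulus 1
v, w = u_complex.real, u_complex.imag                    # u = v + i w

# real input of length 3n - 1: coefficients v(0..n-1), then n-1 zeros, then w(0..n-1)
u_real = np.concatenate([v, np.zeros(n - 1), w])
assert len(u_real) == 3 * n - 1 and np.max(np.abs(u_real)) <= 1.0

print("min |u(z)| on the circle:", round(min_modulus(u_complex), 3),
      " compare sqrt(n) =", round(np.sqrt(n), 3))
```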

The following result is of limited practical use, but it does give an infinite form of the finite result of Theorem 2.1.

Corollary 2.3. There are absolute constants K and L, and an infinite bounded input u, such that if h is any sequence of length n, then h can be identified to within an ℓ2 error Kε by taking at most Ln measurements of h * u + η.

Proof. We use the Byrnes–Körner–Kahane polynomials as in [8, 9], for orders n_p, decomposing them into real and imaginary parts, v_p and w_p, as above, and constructing a power series

    U(z) = (v_1 + z^{2n_1+1} w_1) + z^{d_2}(v_2 + z^{2n_2+1} w_2) + ··· + z^{d_p}(v_p + z^{2n_p+1} w_p) + ···,

where d_{p+1} ≥ d_p + 4n_p + 2 (which guarantees that, if the degree of h is at most n_p, the part of h * u corresponding to v_p and w_p does not interfere with earlier convolutions). For example, n_p = 4^p and d_p = 2 × 4^p would do. Then to identify h, when h has up to n_p coefficients, to within an error at most Kε requires at most d_{p+1} measurements, by the arguments used above.   □

Moreover, u does determine a robustly convergent algorithm for ℓ2 identification which converges for any h ∈ ℓ1 (cf. Corollary 2.2).

Table 1
n     Sequence                                        Lower bound on circle
5     (1, −1, 1, 1, 1)                                1.658
11    (−1, 1, −1, 1, 1, −1, −1, 1, 1, 1, 1)           2.06
13    (1, −1, 1, −1, 1, 1, −1, −1, 1, 1, 1, 1, 1)     3.03

Table 2
n     Sequence                                        Lower bound on circle
4     (0.43, 0.95, 1, −1)                             1.38
5     (0.5, 1, 1, −1, 0.5)                            1.73
7     (0.5, 1, 1, 0, −1, 1, −0.5)                     2
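The 'lower bound on circle' column of Tables 1 and 2 can be checked numerically by dense sampling of |u(z)| on the unit circle; the short sketch below (not from the paper) does this for a few of the tabulated sequences.

```python
# Sketch: check the minimum modulus on |z| = 1 of some inputs from Tables 1 and 2.
import numpy as np

def min_modulus(coeffs, grid=200000):
    z = np.exp(2j * np.pi * np.arange(grid) / grid)
    return np.abs(np.polyval(np.asarray(coeffs)[::-1], z)).min()

for u in [(1, -1, 1, 1, 1),                               # Table 1, n = 5
          (1, -1, 1, -1, 1, 1, -1, -1, 1, 1, 1, 1, 1),    # Table 1, n = 13
          (0.5, 1, 1, 0, -1, 1, -0.5)]:                   # Table 2, n = 7
    print(len(u), round(min_modulus(u), 3),
          "  alpha =", round(min_modulus(u) / np.sqrt(len(u)), 3))
```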

Fig. 1. Plot of the last polynomial of Table 2, (0.5, 1, 1, 0, −1, 1, −0.5), evaluated at |z| = 1.

We remark here that the choice of inputs for worst-case identification is related to the 'persistently exciting' condition used in stochastic identification; see e.g. [11] for details.

3. Numerical example

Consider the system with transfer function

    H(z) = 3(z^2 + 1)/(z^2 + 2z + 5) = 0.6 − 0.24z + 0.576z^2 − 0.1824z^3 − 0.0422z^4 + 0.0534z^5 − 0.0129z^6 − 0.0055z^7 + 0.0048z^8 − 0.0008z^9 + ···,

which has been considered for the purposes of H∞ identification in [5, 14-16]. (Recall that for a system with impulse response h = (h(0), h(1), ...) we adopt the convention of writing its transfer function as H(z) = Σ_{k=0}^{∞} h(k) z^k.)

Two inputs from Section 2 were used. Firstly, the first 7 coefficients were determined using the input (0.5, 1, 1, 0, −1, 1, −0.5) (with α = 2/√7) and noise level ε = 0.05 (so that 13 output measurements were made, each corrupted by an amount randomly chosen in the interval [−0.05, 0.05]). Then the first 13 coefficients were determined using the input (1, −1, 1, −1, 1, 1, −1, −1, 1, 1, 1, 1, 1) (with α = 3.03/√13) and noise level ε = 0.025 (so that 25 output measurements were made). All three methods of Theorem 2.1 were used, with the following results. Note that in the two examples considered, e and g were almost identical; however, this need not always be the case.

First case

Solution: e = (0.61, −0.25, 0.57, −0.20, −0.029, 0.051, −0.0027).
Errors: ℓ2: 0.030; Hankel norm: 0.039; H∞: 0.048; ℓ1: 0.079; ||h − h_n||_1: 0.012; ||h − h_n||_2: 0.0074; ℓ2 error bound (2.4): 0.2394.

Solution: f = (0.63, −0.26, 0.56, −0.19, −0.041, 0.034, 0.0073).
Errors: ℓ2: 0.048; Hankel norm: 0.059; H∞: 0.085; ℓ1: 0.12; ||h − h_n||_1: 0.012; ||h − h_n||_2: 0.0074; ℓ2 error bound (2.4): 0.2394.


Solution: g = (0.61, −0.25, 0.57, −0.20, −0.029, 0.052, −0.0027).
Errors: ℓ2: 0.030; Hankel norm: 0.040; H∞: 0.048; ℓ1: 0.078; ||h − h_n||_1: 0.012; ||h − h_n||_2: 0.0074; ℓ2 error bound (2.4): 0.1234.

Second case

Solution: e = (0.596, −0.247, 0.581, −0.187, −0.049, 0.054, −0.0096, 0.0009, 0.0088, −0.0006, −0.0032, 0.0030, −0.0051).
Errors: ℓ2: 0.016; Hankel norm: 0.019; H∞: 0.025; ℓ1: 0.052; ||h − h_n||_1: 0.00011; ||h − h_n||_2: 0.000076; ℓ2 error bound (2.4): 0.0846.

Solution: f = (0.597, −0.250, 0.578, −0.180, −0.043, 0.060, −0.013, 0.0025, 0.0060, −0.0031, −0.0053, −0.0016, −0.0014).
Errors: ℓ2: 0.016; Hankel norm: 0.021; H∞: 0.030; ℓ1: 0.044; ||h − h_n||_1: 0.00011; ||h − h_n||_2: 0.000076; ℓ2 error bound (2.4): 0.0846.

Solution: g = (0.596, −0.247, 0.580, −0.187, −0.050, 0.054, −0.010, 0.0011, 0.0086, 0.0002, −0.0038, 0.0035, −0.0058).
Errors: ℓ2: 0.017; Hankel norm: 0.020; H∞: 0.026; ℓ1: 0.054; ||h − h_n||_1: 0.00011; ||h − h_n||_2: 0.000076; ℓ2 error bound (2.4): 0.0423.
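For reference, the first experiment above can be repeated with a few lines of code; the sketch below (not from the paper) computes the impulse response of H(z), simulates the 13 noisy measurements, forms the least-squares estimate e and evaluates the bound (2.4). Since the noise realisation is random, the resulting errors will only roughly match the figures quoted above.

```python
# Sketch (not the paper's code) reproducing the first experiment of Section 3.
import numpy as np

def impulse_response(nterms):
    """Taylor coefficients h(k) of H(z) = 3(1 + z^2)/(5 + 2z + z^2) about z = 0."""
    num, den = [3.0, 0.0, 3.0], [5.0, 2.0, 1.0]
    h = np.zeros(nterms)
    for k in range(nterms):
        acc = num[k] if k < 3 else 0.0
        acc -= sum(den[j] * h[k - j] for j in range(1, min(k, 2) + 1))
        h[k] = acc / den[0]
    return h

rng = np.random.default_rng(0)
n, eps = 7, 0.05
u = np.array([0.5, 1, 1, 0, -1, 1, -0.5])            # Table 2 input, min |U| = 2
h = impulse_response(60)                             # h decays like 5^(-k/2)
h_n = h[:n]

y = np.convolve(h, u)[:2 * n - 1]                    # 13 measurements of h * u
y += rng.uniform(-eps, eps, size=len(y))             # noise with ||eta||_inf <= eps

T = np.zeros((len(y), n))                            # convolution matrix for (2.1)
for j in range(n):
    T[j:j + len(u), j] = u
e = np.linalg.lstsq(T, y, rcond=None)[0]             # least-squares estimate

alpha = 2 / np.sqrt(n)
tail1, tail2 = np.abs(h[n:]).sum(), np.linalg.norm(h[n:])
bound = 2 * np.sqrt(2) * (eps + tail1) / alpha + tail2     # bound (2.4), about 0.24
err = np.sqrt(np.linalg.norm(e - h_n) ** 2 + tail2 ** 2)   # ||e - h||_2
print("l2 error of e:", round(err, 4), "   bound (2.4):", round(float(bound), 4))
```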

References

[1] E. Beller and D.J. Newman, The minimum modulus of polynomials, Proc. Amer. Math. Soc. 45 (1974) 463-465.
[2] G. Benke, On the minimum modulus of trigonometric polynomials, Proc. Amer. Math. Soc. 114 (1992) 757-761.
[3] F.W. Carroll, D. Eustice and T. Figiel, The minimum modulus of polynomials with coefficients of modulus one, J. London Math. Soc. 16 (2) (1977) 76-82.
[4] M.A. Dahleh, T. Theodosopoulos and J.N. Tsitsiklis, The sample complexity of worst-case identification of F.I.R. linear systems, Systems Control Lett. 20 (1993) 157-166.
[5] G. Gu and P.P. Khargonekar, Linear and nonlinear algorithms for identification in H∞ with error bounds, IEEE Trans. Automat. Control 37 (1992) 953-963.
[6] C.A. Jacobson, C.N. Nett and J.R. Partington, Worst case system identification in ℓ1: optimal algorithms and error bounds, Systems Control Lett. 19 (1992) 419-424.
[7] B. Kacewicz and M. Milanese, On the optimal experiment design in the worst-case ℓ1 system identification, in: Proc. 31st IEEE Conf. on Decision and Control (1992).
[8] J.-P. Kahane, Sur les polynômes à coefficients unimodulaires, Bull. London Math. Soc. 12 (1980) 321-342.
[9] T.W. Körner, On a polynomial of Byrnes, Bull. London Math. Soc. 12 (1980) 219-224.
[10] J.E. Littlewood, On polynomials Σ ±z^m, Σ e^{α_m i} z^m, z = e^{θi}, J. London Math. Soc. 41 (1966) 367-376.
[11] L. Ljung, System Identification: Theory for the User (Prentice-Hall, Englewood Cliffs, NJ, 1987).
[12] P.M. Mäkilä, Robust identification and Galois sequences, Int. J. Control 54 (1991) 1189-1200.
[13] P.M. Mäkilä and J.R. Partington, Worst-case identification from closed-loop time series, Proc. Amer. Control Conf., Chicago 1 (1992) 301-306.
[14] J.R. Partington, Robust identification in H∞, J. Math. Anal. Appl. 166 (1992) 428-441.
[15] J.R. Partington, Robust identification and interpolation in H∞, Int. J. Control 54 (1991) 1281-1290.
[16] J.R. Partington, Algorithms for identification in H∞ with unequally spaced function measurements, Int. J. Control 58 (1993) 21-31.
[17] J.R. Partington, Interpolation in normed spaces from the values of linear functionals, Bull. London Math. Soc., to appear.
[18] K. Poolla and A. Tikku, On the time complexity of worst-case system identification, IEEE Trans. Automat. Control, to appear.
[19] K. Trustrum, Linear Programming (Routledge and Kegan Paul, London, 1971).
[20] D.N.C. Tse, M.A. Dahleh and J.N. Tsitsiklis, Optimal asymptotic identification under bounded disturbances, Proc. 30th IEEE Conf. on Decision and Control, Brighton (1991) 623-628.