Robustness in H∞ identification


Automatica 36 (2000) 1685-1691

Brief Paper

P. M. Mäkilä*, J. R. Partington

Automation and Control Institute, Tampere University of Technology, P.O. Box 692, FIN-33101 Tampere, Finland
School of Mathematics, University of Leeds, Leeds LS2 9JT, UK

Received 3 November 1998; revised 18 October 1999; received in final form 29 February 2000

Abstract

We consider identification (in both the time domain, for suitable choices of inputs, and the frequency domain) using a deterministic formulation that is less conservative than the usual worst-case approach, in that it allows for a certain number of outliers. Two new identification methods, the Orlicz method and the Top K method, are proposed and analysed. Theoretical $H_\infty$ error bounds for the identified models are provided, and the proposed methods are illustrated by a numerical example. © 2000 Elsevier Science Ltd. All rights reserved.

Keywords: System identification; Worst-case identification; Noise models; H-infinity; Robustness

1. Introduction

We shall consider two possible formulations of an $H_\infty$ identification experiment: for general background and further details of these we refer the reader to the survey articles (Milanese & Vicino, 1993; Mäkilä, Partington & Gustafsson, 1995; Ninness & Goodwin, 1995) and the book (Partington, 1997). The first is a time-domain experiment, in which a discrete-time stable linear time-invariant system is specified by

$$y(t) = \sum_{k=0}^{\infty} g(k)\,u(t-k) + v(t) \tag{1}$$

with bounded input $u$, that we may choose (without loss of generality $\|u\|_\infty = 1$), output $y$, that we measure, an unknown impulse response $g \in \ell_1$, and a disturbance $v$ that is small in some sense. (Here $\|u\|_\infty = \sup_{t \ge 0} |u(t)|$ denotes the $\ell_\infty$ norm of the sequence $u$.) In the standard formulation of this experiment, it is assumed that there is an $\varepsilon > 0$ such that

$$|v(t)| \le \varepsilon \quad \text{for all } t. \tag{2}$$
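As a concrete illustration of the time-domain experiment (1) under the standard noise bound (2), the following Python sketch simulates a measured output record. The impulse response `g`, the random ±1 input and the uniformly distributed noise are illustrative assumptions, not choices made in the paper, and the function name `simulate_output` is hypothetical.

```python
import random


def simulate_output(g, u, v):
    """Return y(t) = sum_k g(k) u(t-k) + v(t) for t = 0..len(u)-1.

    The input is taken to be zero for negative times (an assumption).
    """
    y = []
    for t in range(len(u)):
        last = min(t, len(g) - 1)
        acc = sum(g[k] * u[t - k] for k in range(last + 1))
        y.append(acc + v[t])
    return y


random.seed(0)
g = [0.5 ** k for k in range(20)]                     # assumed (truncated) impulse response
N = 50
u = [random.choice((-1.0, 1.0)) for _ in range(N)]    # input with sup-norm 1
eps = 0.05
v = [random.uniform(-eps, eps) for _ in range(N)]     # disturbance satisfying (2)
y = simulate_output(g, u, v)
```

Because the noise here obeys the uniform bound (2), this corresponds to the standard worst-case setting; the later sections relax exactly this assumption.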

This paper was not presented at any IFAC meeting. This paper was recommended for publication in revised form by Associate Editor T. Sugie under the direction of Editor R. Tempo.
* Corresponding author. Tel.: 00-358-3-365-2332; fax: 00-358-3-3652340. E-mail address: [email protected] (P. M. Mäkilä).

Note that $v$ may well be partly deterministic, and may be correlated with $u$. Now $g$ determines a transfer function

$$G(z) = \sum_{k=0}^{\infty} g(k)\,z^k, \tag{3}$$

which is analytic and bounded on the open unit disc (that is, it lies in $H_\infty$) and also continuous on the closed disc. That is, $G$ lies in the disc algebra $A(\mathbb{D})$. Clearly, $G(1/z)$ is the usual z-transform of $(g(k))$. The $H_\infty$ norm of $G$ is given by

$$\|G\|_\infty = \sup_{|z|<1} |G(z)| = \sup_{|z|=1} |G(z)|.$$

The problem data $y(0), \ldots, y(N-1)$ are used to construct an identified model $g_N$, with corresponding transfer function $\tilde G_N$. The worst-case identification error is defined by

$$e_{N,\varepsilon} = \sup_{\|v\|_\infty \le \varepsilon} \|g - g_N\|_\infty, \tag{4}$$

in other words, we maximize the identification error over all possible disturbances of size $\varepsilon$. The identification algorithm is said to converge robustly if, for each $g \in \ell_1$,

$$e_{N,\varepsilon} \to 0 \quad \text{as } N \to \infty \text{ and } \varepsilon \to 0. \tag{5}$$

In the frequency-domain experiment, we are provided with problem data

$$a_k = G(z_k) + v_k, \quad k = 1, \ldots, N, \tag{6}$$

where $z_1, \ldots, z_N$ are points on the unit circle (cf. Helmicki, Jacobson & Nett, 1991).
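The frequency-domain data (6) and the $H_\infty$ norm can be illustrated with a short Python sketch. It assumes a finite impulse response, equally spaced points on the unit circle (indexed from 0 rather than 1) and complex noise of modulus at most `eps`; the helper names `transfer` and `hinf_norm_grid` are hypothetical, and the grid maximum is only a lower estimate of the true supremum.

```python
import cmath
import math
import random


def transfer(g, z):
    """Evaluate G(z) = sum_k g(k) z**k by Horner's rule."""
    acc = 0j
    for coeff in reversed(g):
        acc = acc * z + coeff
    return acc


def hinf_norm_grid(g, points=2048):
    """Estimate sup_{|z|=1} |G(z)| from below on a uniform grid."""
    return max(abs(transfer(g, cmath.exp(2j * math.pi * m / points)))
               for m in range(points))


random.seed(1)
g = [0.5 ** k for k in range(20)]                           # assumed impulse response
N = 16
z = [cmath.exp(2j * math.pi * k / N) for k in range(N)]     # assumed equally spaced points
eps = 0.01
v = [cmath.rect(random.uniform(0.0, eps), random.uniform(0.0, 2 * math.pi))
     for _ in range(N)]                                     # noise with |v_k| <= eps
a = [transfer(g, zk) + vk for zk, vk in zip(z, v)]          # problem data as in (6)
```

For this particular `g`, all coefficients are positive, so the supremum is attained at $z = 1$ and equals the $\ell_1$ norm of `g`.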

0005-1098/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved. PII: S0005-1098(00)00074-1


Again, we want to identify $G$ by forming a model $\hat G_N$, based on the problem data, so that the worst-case error

$$e_{N,\varepsilon} = \sup_{\|v\|_\infty \le \varepsilon} \|G - \hat G_N\|_\infty \tag{7}$$

satisfies the worst-case convergence condition (5).

The question naturally arises whether such identification experiments can succeed if the size of the disturbances is measured by a smaller norm than the $\ell_\infty$ norm; this permits a larger collection of disturbances of size $\varepsilon$ so that, for example, outliers in the measured data can be tolerated. In Mäkilä and Partington (1999), this question was analysed for the time-domain $\ell_1$ experiment, in which the model error is measured in the $\ell_1$ norm

$$\|(g(k))\|_1 = \sum_{k=0}^{\infty} |g(k)|. \tag{8}$$

It is our purpose here to give a complete answer for the $H_\infty$ norm, both in the time- and frequency-domain experiments.

The paper is organized as follows. Section 2 starts with a brief review of the relevant mathematical results that are instrumental for the development of the new robustness results for $H_\infty$ identification. For the first time, robust convergence results are given for $H_\infty$ identification for a setup that allows a milder condition than the $\ell_\infty$ norm condition on uniformly bounded noise. New identification algorithms, including the Orlicz method and the Top K method, are suggested and analysed. Furthermore, some generalizations are studied, including a new construction of a Schauder basis in $\ell_1$. This is used to obtain robustly convergent $H_\infty$ algorithms for a rich model set. A numerical example is considered in Section 3. Some conclusions are given in Section 4.

2. Worst-case convergence in $H_\infty$ identification

2.1. An abstract set-up for identification

In order to study robust convergence issues in a more general context than is usual, the following abstract framework was set up in Mäkilä and Partington (1999). Let $(f_k)$ be a uniformly bounded sequence of functionals on a separable normed space $X$, and suppose that we have a sequence of seminorms $\|\cdot\|_N$ on $\mathbb{R}^N$ or $\mathbb{C}^N$, each dominated by the infinity norm, so that

$$\|(v_1, \ldots, v_N)\|_N \le \max_{1 \le k \le N} |v_k| \quad \text{for all } (v_1, \ldots, v_N), \tag{9}$$

and that we want to form a model to identify $x \in X$, using the values $(a_k)_{k=1}^N = (f_k(x) + v_k)_{k=1}^N$. We would like the algorithm to be robustly convergent in the sense that the identified model $\hat x_N$ satisfies $\|\hat x_N - x\| < \delta$ provided that $N$ is large and $\|(v_1, \ldots, v_N)\|_N$ is small enough. Thus, we define

$$e_{N,\varepsilon} = \sup_{\|v\|_N \le \varepsilon} \|\hat x_N - x\| \tag{10}$$

and analyse the question whether we obtain worst-case convergence in the sense of (5), for each $x \in X$.

Theorem 1 (Mäkilä & Partington, 1999). There is a robustly convergent algorithm for obtaining an approximate model $\hat x_N$ to $x$ such that

$$\lim_{N \to \infty,\ \varepsilon \to 0} e_{N,\varepsilon} = 0 \quad \text{for each } x \in X, \tag{11}$$

if and only if there is a number $\delta > 0$ such that for each $x$ we have

$$\liminf_{n \to \infty} \|(f_1(x), \ldots, f_n(x))\|_n \ge \delta \|x\|, \tag{12}$$

and in this case it can be achieved by an algorithm minimizing $\|(f_j(z) - a_j)_{j=1}^n\|_n$ over $z$ lying in suitable subspaces of $X$.

Theorem 1 was used to show that we cannot expect robust convergence in the $\ell_1$ identification problem if we weaken the conditions on the disturbance too much.

Corollary 2 (Mäkilä & Partington, 1999). Let $1 \le p < \infty$. Then there is no robustly convergent algorithm for worst-case time-domain identification in $\ell_1$ in the presence of disturbances measured by the seminorms

$$\|(v_1, \ldots, v_n)\|_n = \left( \frac{1}{n} \sum_{k=1}^{n} |v_k|^p \right)^{1/p}. \tag{13}$$

However, it was shown that certain other norms (Orlicz norms) can indeed be used in place of the $\ell_p$ norms as a measure of the size of the disturbance, if, for example, the input sequence is a Galois sequence (cf. Mäkilä, 1991). We shall see that similar results hold in the $H_\infty$ situation.

2.2. Frequency-domain identification

To analyse the $H_\infty$ identification experiments, we shall begin with the simplest case to discuss, namely the frequency-domain experiment (6). Let $M \ge 1$ be an integer. We define norms on $\mathbb{R}^M$ or $\mathbb{C}^M$ as follows. Let $K$ be a fixed integer satisfying $1 \le K \le M$, and let

$$\|(w_1, \ldots, w_M)\|_K = K^{-1} \sup_{S \subseteq \{1, \ldots, M\},\ |S| = K} \ \sum_{k \in S} |w_k|. \tag{14}$$

Here $|S|$ denotes the cardinality of a set $S$. Special cases here include $K = 1$, when we obtain the $\ell_\infty$ norm, and $K = M$, when we obtain a normalized $\ell_1$ norm, $(1/M) \sum_{k=1}^{M} |w_k|$.
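The norm (14) is simply the average of the $K$ largest moduli, since the supremum over index sets of size $K$ is attained by the $K$ largest entries. A minimal Python sketch (the function name `top_k_norm` is a hypothetical choice) makes the two special cases easy to check:

```python
def top_k_norm(w, K):
    """Norm (14): the mean of the K largest values of |w_k|."""
    if not 1 <= K <= len(w):
        raise ValueError("K must satisfy 1 <= K <= len(w)")
    magnitudes = sorted((abs(x) for x in w), reverse=True)
    return sum(magnitudes[:K]) / K


w = [3.0, -4.0, 1.0, 0.0]
sup_norm = top_k_norm(w, 1)        # K = 1: the l_infinity norm
mean_abs = top_k_norm(w, len(w))   # K = M: the normalized l_1 norm
outlier = top_k_norm([1.0, 0.0, 0.0, 0.0], 4)   # a single outlier is averaged down
```

A single outlier of size 1 among $M$ otherwise exact measurements has norm $1/K$, which is how a larger $K$ discounts isolated large errors.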


An alternative norm that we shall use is the particular Orlicz norm used in Mäkilä and Partington (1999). Let $\phi(t)$ be a positive, continuous, monotonically increasing convex function such that $\phi(0) = 0$. We can define

$$\|(w_1, \ldots, w_M)\|_\phi = \inf\left\{ \rho > 0 : \sum_{k=1}^{M} \phi(|w_k|/\rho) \le 1 \right\}. \tag{15}$$

For more on such norms see Lindenstrauss and Tzafriri (1977, p. 115). The usual $\ell_p$ norms arise from the functions $\phi(t) = t^p$. We now specify the choice $\phi(t) = \exp(2 - 2/t)$ for $0 \le t \le 1$ and extend linearly on $(1, \infty)$ with gradient 2, so that it remains convex. We now define the norm

$$\|w\|' = \|w\|_\phi / \|(1, \ldots, 1)\|_\phi. \tag{16}$$

This norm is dominated by the $\ell_\infty$ norm. Also $\|(1, 0, \ldots, 0)\|' = 2/(2 + \log M)$.

The following lemma will aid with calculations in $H_\infty$ identification. We recall the notation $\lfloor x \rfloor$ for the greatest integer less than or equal to $x$.

Lemma 3. Let $Z_M = \{z_1, \ldots, z_M\}$ be a set of distinct points on the unit circle with maximum angular gap $\Delta_M$. Let $p$ be a polynomial of degree $n$ and let $\delta$ satisfy $0 < \delta < 1$. Then $|p(z_k)| \ge \delta \|p\|_\infty$ for at least $K$ values of $k$, where

$$K = \left\lfloor \frac{2(1-\delta)}{n \Delta_M} \right\rfloor. \tag{17}$$
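The normalized Orlicz norm of (15) and (16) has no closed form in general, but since the sum in (15) decreases as $\rho$ grows it can be evaluated by bisection. The Python sketch below is one possible way to do this, not the authors' implementation; the function names and the tolerance are assumptions. The normalizer uses $\|(1, \ldots, 1)\|_\phi = (2 + \log M)/2$, which follows from solving $M \exp(2 - 2\rho) = 1$.

```python
import math


def phi(t):
    """exp(2 - 2/t) on (0, 1], extended linearly with gradient 2 beyond 1."""
    if t <= 0.0:
        return 0.0
    if t <= 1.0:
        return math.exp(2.0 - 2.0 / t)
    return 1.0 + 2.0 * (t - 1.0)


def orlicz_norm(w, tol=1e-12):
    """Norm (15): the smallest rho with sum_k phi(|w_k| / rho) <= 1."""
    mags = [abs(x) for x in w]
    if max(mags) == 0.0:
        return 0.0
    lo, hi = 0.0, max(mags)
    while sum(phi(m / hi) for m in mags) > 1.0:   # grow hi until it is feasible
        hi *= 2.0
    while hi - lo > tol * hi:                     # bisect on the feasibility boundary
        mid = 0.5 * (lo + hi)
        if sum(phi(m / mid) for m in mags) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi


def normalized_orlicz_norm(w):
    """Norm (16): divide by the Orlicz norm of the all-ones vector."""
    return orlicz_norm(w) * 2.0 / (2.0 + math.log(len(w)))
```

For example, `normalized_orlicz_norm([1.0] + [0.0] * 9)` should agree with the value $2/(2 + \log 10)$ stated after (16), up to the bisection tolerance.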

Proof. Without loss of generality $\|p\|_\infty = 1$, and this is attained at some point $\zeta$ on the unit circle. By Bernstein's inequality (see e.g. Zygmund, 1988, Vol. II, p. 276 or Partington, 1997, p. 99), $\|p'\|_\infty \le n$. Hence,

$$|p(z_k)| \ge |p(\zeta)| - n|z_k - \zeta| \ge \delta, \tag{18}$$

provided that $|z_k - \zeta| \le (1-\delta)/n$. By the definition of $\Delta_M$, there are at least $K$ points of $Z_M$ satisfying this condition, and the result follows. □

This now enables us to show that the conditions of Theorem 1 are satisfied for frequency-domain $H_\infty$ identification, with suitable choices of the norms defined above, provided that $\Delta_M \to 0$ as $M \to \infty$. The next theorem has a rather technical appearance but it essentially expresses the fact that, in general, a function $f \in A(\mathbb{D})$ will be large at several points of $Z_M$ if it is well approximated by polynomials.

Theorem 4. Let $Z_M$ be as in Lemma 3. Let $f \in A(\mathbb{D})$ and let $p$ be a polynomial of degree $n$ with $\|f - p\|_\infty = d$. Then, provided that $1 \le K \le M$ and $K \le \lfloor 2(1-\delta)/(n\Delta_M) \rfloor$, one has

$$\|(f(z_1), \ldots, f(z_M))\|_K \ge \delta(\|f\|_\infty - d) - d \tag{19}$$

and

$$\|(f(z_1), \ldots, f(z_M))\|' \ge \frac{2 + \log K}{2 + \log M}\, \delta(\|f\|_\infty - d) - d. \tag{20}$$

It follows that, provided that $\Delta_M \to 0$, we can choose $K = K(M)$ such that

$$\liminf_{M \to \infty} \|(f(z_1), \ldots, f(z_M))\|_K \ge \delta \|f\|_\infty \tag{21}$$

and, provided that $\Delta_M$ tends to zero as fast as some negative power of $M$, we can choose $n = n(M) \to \infty$ and $K = K(M)$ depending on $M$ such that there is an $\eta > 0$ with $\log K(M)/\log M \ge \eta$, in which case we also have

$$\liminf_{M \to \infty} \|(f(z_1), \ldots, f(z_M))\|' \ge \delta \eta \|f\|_\infty. \tag{22}$$
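Lemma 3, on which the bound (19) rests, can be checked numerically for a specific polynomial. The sketch below assumes $M$ equally spaced points (so that $\Delta_M = 2\pi/M$), uses the test polynomial $p(z) = ((1+z)/2)^n$, and estimates $\|p\|_\infty$ on a fine grid; all of these are illustrative choices and the function names are hypothetical.

```python
import cmath
import math


def poly_eval(coeffs, z):
    """Evaluate sum_j coeffs[j] * z**j by Horner's rule."""
    acc = 0j
    for c in reversed(coeffs):
        acc = acc * z + c
    return acc


def lemma3_check(coeffs, M, delta, grid=4096):
    """Return (K from (17), number of points with |p(z_k)| >= delta*sup, sup, moduli)."""
    n = len(coeffs) - 1
    gap = 2.0 * math.pi / M                       # maximum angular gap, equal spacing
    K = math.floor(2.0 * (1.0 - delta) / (n * gap))
    # Grid maximum: a lower estimate of the true sup norm on the circle.
    sup = max(abs(poly_eval(coeffs, cmath.exp(2j * math.pi * m / grid)))
              for m in range(grid))
    moduli = [abs(poly_eval(coeffs, cmath.exp(2j * math.pi * k / M)))
              for k in range(M)]
    count = sum(1 for x in moduli if x >= delta * sup)
    return K, count, sup, moduli


n = 4
coeffs = [math.comb(n, j) / 2 ** n for j in range(n + 1)]   # p(z) = ((1 + z)/2)**4
K, count, sup, moduli = lemma3_check(coeffs, M=64, delta=0.5)
top_k = sum(sorted(moduli, reverse=True)[:K]) / K            # norm (14) of (p(z_k))
```

In this example the lemma guarantees only $K = 2$ large values, while many more sample points actually exceed $\delta \|p\|_\infty$, which illustrates that the bound is conservative. With $f = p$ (so $d = 0$), inequality (19) reduces to `top_k >= delta * sup`.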

Proof. We obtain (19) by estimating the $K$-norm of $(p(z_1), \ldots, p(z_M))$, given that at least $K$ of the values of $p(z_k)$ are of modulus at least $\delta \|p\|_\infty$, and using the elementary inequalities $\|p\|_\infty \ge \|f\|_\infty - d$ and

$$\|(f(z_1), \ldots, f(z_M))\|_K \ge \|(p(z_1), \ldots, p(z_M))\|_K - d. \tag{23}$$

To obtain (20) we note that, with $\rho = [(2 + \log K)/2]\, \delta \|p\|_\infty$, we obtain

$$\sum_{k=1}^{M} \phi(|p(z_k)|/\rho) \ge K \exp(2 - (2 + \log K)) = 1 \tag{24}$$

and hence $\|(p(z_1), \ldots, p(z_M))\|_\phi \ge \rho$, which again yields the correct inequality for $f$. The final part of the theorem follows immediately, since we can choose a suitable $p = p_n$ for each $n$ such that $\|f - p_n\|_\infty \to 0$. □

Typical values of the constants involved are $\Delta_M \sim 2\pi/M$ (if the points are approximately equally spaced), $\delta = 1/2$, $n \sim M^{\alpha}$, $K \sim M^{1-\alpha}$, where $\alpha$ is chosen according to the hypothesized noise model: if the number $K$ of expected outliers is small, then we should choose $\alpha$ a little less than 1, and obtain a high-degree polynomial model; if it is large, we should choose $\alpha$ closer to zero.

Let us see how this yields a constructive algorithm for frequency-domain identification. Given $M$, one can seek an identified model $\tilde G$ of degree $n$ by minimizing

$$\|(p(z_1) - a_1, \ldots, p(z_M) - a_M)\|' \tag{25}$$

over polynomials $p$ of degree $n$. This is a differentiable norm, and the problem is one of convex optimization. Alternatively, one can use the norm $\|\cdot\|_K$, which leads to a linear programming problem. The identification error is then bounded by

$$\|G - \tilde G\|_\infty \le \frac{4}{\delta\eta}\left(\mathrm{dist}(G, P_n) + \|(v_k)\|'\right) + \mathrm{dist}(G, P_n), \tag{26}$$

where $P_n$ is the subspace consisting of all polynomials of degree $n$ (see Mäkilä & Partington, 1999). The term involving the distance of the plant from $P_n$ is one that commonly arises in the analysis of modelling errors (cf. Helmicki et al., 1991; Partington, 1997), and gives an explicit uniform bound over relatively compact uncertainty sets in which the unknown system may be supposed to lie. For example, if $G$ is analytic in a disc of radius $\rho > 1$ and bounded by $M$ there, then $\mathrm{dist}(G, P_n) \le M\rho^{-n-1}$ (see Pinkus, 1985).

It is possible to use other model sets consisting of rational functions, because a version of Bernstein's inequality holds for these too. See Dudley Ward and Partington (1996) for further details.

Proposition 5. Let $1 \le p < \infty$. Then there is no robustly convergent algorithm for worst-case frequency-domain identification in $A(\mathbb{D})$ in the presence of disturbances measured by the seminorms given in (13).

Proof. It is sufficient to verify that there is no $\delta > 0$ such that

$$\liminf_{n \to \infty} \left( \frac{1}{n} \sum_{k=1}^{n} |f(z_k)|^p \right)^{1/p} \ge \delta \|f\|_\infty \tag{27}$$

for all $f \in A(\mathbb{D})$. Let $N$ be a positive integer and divide the circumference of the unit circle into $N$ equal arcs $I_1, \ldots, I_N$. Then, for at least one of these, say $I_j$, we have

$$\liminf_{n \to \infty} \frac{1}{n} \left| \{ k : 1 \le k \le n,\ z_k \in I_j \} \right| \le \frac{1}{N}. \tag{28}$$

If we now choose a function $f$ in the disc algebra such that $\|f\|_\infty = 1$ and $|f(z)| \le 1/N$, except for $z \in I_j$, then

$$\liminf_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} |f(z_k)|^p \le \liminf_{n \to \infty} \frac{1}{n} \sum_{k \le n,\ z_k \in I_j} |f(z_k)|^p + \limsup_{n \to \infty} \frac{1}{n} \sum_{k \le n,\ z_k \notin I_j} |f(z_k)|^p \le \frac{1}{N} + \frac{1}{N^p} \le \frac{2}{N}.$$

Since $N$ can be taken arbitrarily large, there is no $\delta$ satisfying (27). □

2.3. Some generalizations

The norm defined in (14) is a special case of the following situation. Let $K$ be a fixed integer satisfying $1 \le K \le M$, and let

$$\|(w_1, \ldots, w_M)\|_{q,K} = \sup_{S \subseteq \{1, \ldots, M\},\ |S| = K} \left( K^{-1} \sum_{k \in S} |w_k|^q \right)^{1/q}, \tag{29}$$

where $1 \le q < \infty$. Recall that $|S|$ denotes the cardinality of a set $S$. The earlier cases in (14) are obtained with $q = 1$. For $q = 2$ and $K = M$, we obtain a normalized $\ell_2$ norm (that is, a least squares setup), $[(1/M) \sum_{k=1}^{M} |w_k|^2]^{1/2}$.

The $K$-norm part of Theorem 4 generalizes to the above $q$-$K$-norm setup.

Theorem 6. Let $Z_M$ be as in Lemma 3. Let $f \in A(\mathbb{D})$ and let $p$ be a polynomial of degree $n$ with $\|f - p\|_\infty = d$. Let $1 \le q < \infty$. Then provided that $1 \le K \le M$ and $K \le \lfloor 2(1-\delta)/(n\Delta_M) \rfloor$, one has

$$\|(f(z_1), \ldots, f(z_M))\|_{q,K} \ge \delta(\|f\|_\infty - d) - d. \tag{30}$$

It follows that, provided that $\Delta_M \to 0$, we can choose $K = K(M)$ such that

$$\liminf_{M \to \infty} \|(f(z_1), \ldots, f(z_M))\|_{q,K} \ge \delta \|f\|_\infty. \tag{31}$$
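The $q$-$K$ norm (29) can be computed in the same way as (14), because the supremum is again attained by the $K$ largest moduli. The following Python sketch (with the hypothetical name `q_top_k_norm`) shows how the least squares setup and the Top K norm arise as special cases:

```python
import math


def q_top_k_norm(w, K, q=1.0):
    """Norm (29): the q-th power mean of the K largest values of |w_k|."""
    if not 1 <= K <= len(w):
        raise ValueError("K must satisfy 1 <= K <= len(w)")
    if q < 1.0:
        raise ValueError("q must satisfy q >= 1")
    magnitudes = sorted((abs(x) for x in w), reverse=True)
    return (sum(m ** q for m in magnitudes[:K]) / K) ** (1.0 / q)


rms = q_top_k_norm([3.0, 4.0], 2, q=2.0)            # q = 2, K = M: normalized l_2 norm
top2 = q_top_k_norm([3.0, -4.0, 1.0], 2, q=1.0)     # q = 1: the norm (14)
spike = q_top_k_norm([1.0] + [0.0] * 99, 100, q=2.0)  # one outlier among 100 samples
```

With $q = 2$ and $K = M$ a single outlier of size 1 among 100 samples has norm 0.1, so such a disturbance counts as small under the least squares seminorm even though it is large in the $\ell_\infty$ sense.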

The proof is completely analogous to the $K$-norm part of Theorem 4 and is hence omitted.

We have so far described robust convergence analysis mostly for the particular case that the linear subspace $X_n$ used in the estimation algorithms is spanned by the first $n$ unit polynomials $\{z^{i-1}\}_{i=1}^{n}$, where $z^0$ denotes the unit 1. In the time domain these unit polynomials correspond to the unit vectors $\{e_i\}_{i=1}^{n}$, where $e_i = \{e_i(k)\}_{k \ge 0}$ is defined by $e_i(k) = \delta_{i-1,k}$. Here $\delta_{ij} = 1$ if $i = j$, and $\delta_{ij} = 0$ otherwise.

The unit vectors $\{e_i\}$ have the very nice property that they form a (Schauder) basis in both $\ell_1$ and $\ell_2$ (see e.g. Partington (1997) for definitions and results on bases). Furthermore, $\{z^{i-1}\}_{i \ge 1}$ and $\{e_i\}_{i \ge 1}$ are an orthonormal basis of $H_2(\mathbb{D})$ and $\ell_2$, respectively. Note that $\{z^{i-1}\}_{i \ge 1}$ is not a basis for the disc algebra $A(\mathbb{D})$ (see e.g. Partington, 1997).

Fortunately, in system identification, it suffices to consider a subspace of $A(\mathbb{D})$ in which $\{z^{i-1}\}_{i \ge 1}$ is a basis. Namely, let $A_{as}(\mathbb{D})$ denote the subspace of $A(\mathbb{D})$ consisting of functions $f(z) = \sum_{i \ge 1} a_i z^{i-1}$, which are analytic in the open unit disk and whose Taylor coefficients are absolutely summable, that is $\sum_{i \ge 1} |a_i| < \infty$.

One reason for the popularity of finite impulse response (FIR) models in various signal processing applications is the ease with which the performance of FIR model-based estimation methods can be analysed due to the simple nature of the associated unit bases $\{z^{i-1}\}_{i \ge 1}$ and $\{e_i\}_{i \ge 1}$.

Infinite impulse response (IIR) models have several attractive properties but are more difficult to analyse and may suffer from stability problems. If the linear subspace $X_n$ is spanned by $n$ IIR vectors, such as Laguerre or Kautz basis vectors, there are some technical difficulties in analysis, especially for non-Hilbert space settings, such as for $\ell_1$ performance analysis.


These technical difficulties are partly due to the fact that orthonormal IIR bases of $H_2(\mathbb{D})$ (or of $\ell_2$) are typically not bounded bases in $A_{as}(\mathbb{D})$ (or in $\ell_1$), and so an $H_2(\mathbb{D})$ expansion of an arbitrary element in $A_{as}(\mathbb{D})$ in terms of an IIR orthonormal basis need not converge to the element in the $H_\infty$ (nor in the $\ell_1$) sense, and need not even be well defined in $A_{as}(\mathbb{D})$. Hence it is difficult to use the powerful Hilbert space orthonormality-based techniques to analyse the $A_{as}(\mathbb{D})$ case when the subspaces $\{X_n\}$ are spanned by IIR elements. Note, however, that there are other techniques not based on orthonormality ideas that can be used to analyse more general rational model sets in $H_\infty$ identification (Partington, 1997). However, we shall here analyse a situation which generalizes the unit polynomial setup to more general FIR setups (this case has special interest due to the importance of FIR models in various signal processing and control applications).

So we shall now generalize from the case of $X_n$ being spanned by the unit polynomials $\{z^{i-1}\}_{i=1}^{n}$ to a situation where $X_n$ is spanned by more general FIR vectors. The reason why this is useful is due to the following result. We shall in the sequel let $(x, y) = \sum_{k \ge 0} x(k) y(k)$ denote the usual inner product in $\ell_2$.

Theorem 7. Let the sequence $\{b_i\}_{i \ge 1}$ be an orthonormal basis in $\ell_2$, such that $b_i = \{b_i(k)\}_{k \ge 0}$ satisfies $b_i(k) = 0$ for any $k > s(i) - 1$, where $s(i) \ge i$ is some strictly monotonically increasing integer-valued function of $i$. (That is, the $b_i$ are orthonormal FIR vectors.) Let $q(i)$ denote the number of non-zero $b_i(k)$ in $b_i$. Suppose that $\sup_i q(i) < \infty$. Furthermore, let

$$\sup_k \sum_{i \ge 1} |b_i(k)| < \infty. \tag{32}$$

Then, $\{b_i\}_{i \ge 1}$ is a basis of $\ell_1$ and the basis expansion of each element $g \in \ell_1$ is given by the corresponding $\ell_2$ basis expansion of $g$, that is, $g = \sum_{i \ge 1} c_i b_i$, where the equality means convergence in the $\ell_1$ norm and

$$c_i = (g, b_i), \quad i \ge 1. \tag{33}$$

Similarly, the polynomials $\{\hat b_i(z) = \sum_{k=0}^{s(i)-1} b_i(k) z^k\}_{i \ge 1}$ form a basis of $A_{as}(\mathbb{D})$ and for any $\hat g(z) = \sum_{k \ge 0} g(k) z^k \in A_{as}(\mathbb{D})$

$$\hat g = \sum_{i \ge 1} c_i \hat b_i, \tag{34}$$

where the equality means convergence in the $H_\infty$ norm, and the $c_i$ are given by (33).

Proof. As $\|b_i\|_2 = 1$ and $q(i)$ is uniformly bounded in $i$, it follows that

$$\sup_i \|b_i\|_1 < \infty, \tag{35}$$

that is, $\{b_i\}$ is a uniformly bounded sequence in the $\ell_1$ sense. Let us estimate

$$\sum_{i \ge 1} |c_i| = \sum_{i \ge 1} \left| \sum_{k \ge 0} g(k) b_i(k) \right| \le \|g\|_1 \sup_k \sum_{i \ge 1} |b_i(k)| < \infty, \tag{36}$$

that is $\{c_i\}_{i \ge 1} \in \ell_1$, where we have used (32). Denote $h = \sum_{i \ge 1} c_i b_i$. Then

$$\|h\|_1 \le \sum_{i \ge 1} |c_i| \|b_i\|_1 < \infty, \tag{37}$$

by (35) and (36). Hence $h$ is indeed an element in $\ell_1$. But $h - g$ is the zero element in $\ell_2$. Hence $h = g$ also in $\ell_1$. Let finally $g = \sum_{i \ge 1} a_i b_i$ be an arbitrary expansion of $g$ in $\ell_1$ in terms of $\{b_i\}$. But then $f = g - \sum_{i=1}^{m} a_i b_i$ satisfies, for any large enough $m$, $|f(k)| < 1$ for all $k$, so that $\sum_{i \ge 1} a_i b_i$ must tend to $g$ also in the $\ell_2$ sense. But as $\{b_i\}_{i \ge 1}$ is a basis of $\ell_2$ it follows that $g$ has a unique $\ell_2$ expansion in terms of $\{b_i\}$, and hence $a_i = c_i$ for all $i \ge 1$. The $A_{as}(\mathbb{D})$ part of the theorem follows trivially from the $\ell_1$ part. This completes the proof. □

Theorems 1, 6, and 7 imply that we can get robust convergence in $A_{as}(\mathbb{D})$ with respect to the noise norm (29) using an identification algorithm $\{\tilde G_N\}$ minimizing $\|(a_k - \tilde G_N(z_k))_{k=1}^{N}\|_{q,K}$ over the subspace $X_n$ spanned by $\{\hat b_i(z)\}_{i=1}^{n}$ if $K$ and $n$ are chosen suitably as a function of the number of available data points $N$. For this to be possible the measurements $\{a_k\}_{k=1}^{N}$ given by (6) should be evaluated at points $z_1, \ldots, z_N$ on the unit circle such that their maximum angular gap $\Delta_N$ tends to zero when $N \to \infty$. Let $s(n) \sim n^{\beta}$ for some $\beta \ge 1$. Typical values of the other constants involved are then $\Delta_N \sim 2\pi/N$ (if the points are approximately equally spaced), $\delta = 1/2$, $n \sim N^{\alpha}$, $K \sim N^{1-\alpha\beta}$, where $0 < \alpha < 1$ is such that $\alpha\beta < 1$.

Remark. Note that taking $q = 2$, $K = N$, that is the usual least squares setup, does NOT result in robust convergence, but in worst-case divergence (Partington & Mäkilä, 1995; Partington, 1997).

2.4. Time-domain identification

Harrison, Ward and Gamble (1996), in the course of analysing the sample complexity of worst-case $H_\infty$ identification, showed that by a suitable choice of inputs, the time-domain identification experiment could be reduced to the frequency-domain experiment.
To do this, one chooses an integer $N$, and concatenates inputs of the form

$$c_k = (\cos((N-1)k\lambda), \cos((N-2)k\lambda), \ldots, \cos(k\lambda), 1) \tag{38}$$

and

$$s_k = (\sin((N-1)k\lambda), \sin((N-2)k\lambda), \ldots, \sin(k\lambda), 0), \tag{39}$$


where $0 \le k < N$ and $\lambda = 2\pi/N$. If the true system $G(z) = \sum_{j} g(j) z^j$ is really given by a polynomial of degree at most $N-1$, then the convolution $g * c_k$ yields as one of its output values the quantity $\sum_{j=0}^{N-1} g(j)\cos(jk\lambda)$ and $g * s_k$ yields as one of its output values the quantity $\sum_{j=0}^{N-1} g(j)\sin(jk\lambda)$. This enables one easily to produce the value of $G(\omega^k)$, where $\omega = e^{i\lambda}$ is an $N$th root of unity.

For general $G$, we do not obtain exactly the value of $G(\omega^k)$ by this method: however, provided that $g \in \ell_1$ (our standing assumption), if we write $P_N G$ and $P_N g$ for the truncations of $G$ and $g$ to $N$ terms, then, under the assumption that the entire input string $u$ has $\ell_\infty$ norm at most 1, we are able to obtain the value of $(P_N G)(\omega^k)$ to within an error of at most $2\|g - P_N g\|_1$, and hence we obtain the value of $G$ to within an error of at most $2\|g - P_N g\|_1 + \|G - P_N G\|_\infty$, which is at most $3\|g - P_N g\|_1$.

Thus, we can reduce the time-domain $H_\infty$ identification experiment to a frequency-domain experiment, and again we can tolerate disturbances that need not be small in the $\ell_\infty$ norm.

The sample complexity of such an experiment is not too large in the $H_\infty$ case (unlike in the $\ell_1$ case). That is, the number of measurements (length of the input/output sequence) required to identify all models of degree $N$ to within an error of at most a fixed constant times $\|v\|$ grows as a power of $N$, rather than the exponential growth seen in the $\ell_1$ situation. We refer to Partington (1997) for a fuller discussion of this point.
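The reduction from time-domain to frequency-domain data can be checked on a small noise-free FIR example. The Python sketch below applies each cosine and sine probe separately from zero initial conditions (rather than as one concatenated input string, which is a simplifying assumption) and takes the final entry of the sine probe as $\sin(0) = 0$, so that the output at time $N-1$ equals $\sum_j g(j)\sin(jk\lambda)$. The impulse response and the function names are hypothetical.

```python
import cmath
import math


def probe_inputs(N, k):
    """Cosine and sine probes of length N for frequency index k."""
    lam = 2.0 * math.pi / N
    c = [math.cos((N - 1 - t) * k * lam) for t in range(N)]
    s = [math.sin((N - 1 - t) * k * lam) for t in range(N)]   # final entry is sin(0) = 0
    return c, s


def response_at(g, u, t):
    """Noise-free output (g * u)(t), with u(t) = 0 for t < 0."""
    last = min(t, len(g) - 1)
    return sum(g[j] * u[t - j] for j in range(last + 1))


def frequency_sample(g, N, k):
    """Recover G(omega**k) from the outputs at time N-1 of the two probes."""
    c, s = probe_inputs(N, k)
    return complex(response_at(g, c, N - 1), response_at(g, s, N - 1))


g = [0.3, -0.2, 0.5, 0.1]      # assumed FIR system of degree 3 <= N - 1
N = 8
omega = cmath.exp(2j * math.pi / N)
recovered = [frequency_sample(g, N, k) for k in range(N)]
exact = [sum(g[j] * omega ** (j * k) for j in range(len(g))) for k in range(N)]
```

For this FIR system the recovered and exact values agree up to rounding; for a general $g \in \ell_1$ the truncation error discussed above would appear in addition.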

3. Numerical example

Since the methods presented above have only been shown to be superior to classical techniques in worst-case error analysis, any 'typical' numerical example is likely to be somewhat misleading. We are grateful to a referee for encouraging us to reconsider which sort of systems and disturbances are likely to provide the most illuminating results. Accordingly, we take as our unknown transfer function the continuous-time example $G(s) = \exp(-s)/(s^2 + 2s + 5)$. This system is rather poorly approximable by finite-dimensional systems (asymptotically, the $n$th Hankel singular value, which gives a lower bound on the $H_\infty$ approximation error by degree-$n$ systems, is $1/(\pi^2 n^2)$ (Glover, Lam & Partington, 1990)); thus we shall see the effects of undermodelling as well as measurement disturbance.

We transform to the disc, by writing $z = (1-s)/(1+s)$, and consider FIR (polynomial) models of degree 5, based on 42 equally spaced frequency-response measurements. Note that only half the measurements are used, since we require our model to satisfy $\tilde G(\bar z) = \overline{\tilde G(z)}$. The effects of two different noise models are compared:

(A) Noise with amplitude uniformly distributed in [0, 1/2], and with random argument. The largest value was of size 0.4673.

(B) Noise constructed with amplitude distributed according to a Cauchy distribution, renormalized to have maximum absolute value 0.2 and mean absolute value 0.0404. The three largest values were of sizes 0.2, 0.1877 and 0.0835.

The opportunity was taken to compare the effects of minimizing several quantities: the classical least-squares method ($\ell_2$ norm), the Chebyshev method ($\ell_\infty$ norm), the sum of absolute deviations ($\ell_1$ norm), the norm $\|\cdot\|_K$ defined in (14) for values of $K$ between 4 and 10, and finally the Orlicz norm given above.

In case (A), the first 3 methods produced errors of 0.1907, 0.2536, and 0.2225 respectively; the best $K$ was $K = 8$, with an error of 0.1495, and the Orlicz method produced an error of 0.1653.

In case (B), the first 3 methods produced errors of 0.1275, 0.1599, and 0.1399 respectively; taking $K = 10$ produced an error of 0.0949, and the Orlicz method produced an error of 0.1198. The best identified model found, that with error 0.0949, was $0.0778z^5 + 0.0406z^4 - 0.0823z^3 + 0.0707z^2 + 0.1346z + 0.0524$.

Thus, in this example, the Orlicz and 'Top K' methods show themselves to be better at filtering out outliers than some simpler methods, as we would expect. The drawback is that they are computationally a little more expensive.
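The data-generation step of this example can be sketched in Python as follows. The placement of the 42 sample points (offset by half a step so that none falls on $z = -1$, where $s$ is infinite), the random seed and the function names are assumptions; only the transfer function, the bilinear map and noise model (A) come from the text.

```python
import cmath
import math
import random


def G_continuous(s):
    """G(s) = exp(-s) / (s**2 + 2 s + 5)."""
    return cmath.exp(-s) / (s * s + 2.0 * s + 5.0)


def G_disc(z):
    """Value on the disc via z = (1 - s)/(1 + s), i.e. s = (1 - z)/(1 + z)."""
    return G_continuous((1.0 - z) / (1.0 + z))


M = 42
theta = [2.0 * math.pi * (k + 0.5) / M for k in range(M)]   # assumed point placement
samples = [G_disc(cmath.exp(1j * t)) for t in theta]

random.seed(2)
# Noise model (A): amplitude uniform in [0, 1/2], uniformly random argument.
noise_A = [cmath.rect(random.uniform(0.0, 0.5), random.uniform(0.0, 2.0 * math.pi))
           for _ in range(M)]
data_A = [g + v for g, v in zip(samples, noise_A)]
```

With this placement the sample points come in conjugate pairs, so the noise-free responses satisfy the same symmetry that is required of the model, which is why only half of the measurements carry independent information about a real-coefficient model.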

4. Conclusions

We have studied $H_\infty$ identification using a deterministic (non-probabilistic) formulation that is less conservative than the usual worst-case approach, in that it allows for a certain number of outliers. More specifically, it allows more general, averaging, noise conditions than the standard supremum norm condition on uniformly bounded noise. The new theory has suggested two new identification methods, the Orlicz and Top K methods, with good robustness properties. It would be of interest to study the rate of convergence of identified models using the proposed methods for various classes of stable systems.

Acknowledgements Financial support to P.M.M. from the Academy of Finland (grant no. 40536) is gratefully acknowledged.


References

Dudley Ward, N. F., & Partington, J. R. (1996). Robust identification in the disc algebra using rational wavelets and orthonormal basis functions. International Journal of Control, 64, 409-423.
Glover, K., Lam, J., & Partington, J. R. (1990). Rational approximation of a class of infinite-dimensional systems I: Singular values of Hankel operators. Mathematics of Control, Signals, and Systems, 3, 325-344.
Harrison, K. J., Ward, J. A., & Gamble, D. K. (1996). Sample complexity of worst-case H∞-identification. Systems & Control Letters, 27, 255-260.
Helmicki, A. J., Jacobson, C. A., & Nett, C. N. (1991). Control oriented system identification: a worst-case/deterministic approach in H∞. IEEE Transactions on Automatic Control, 36, 1163-1176.
Lindenstrauss, J., & Tzafriri, L. (1977). Classical Banach spaces I: sequence spaces. Berlin: Springer.
Mäkilä, P. M. (1991). Robust identification and Galois sequences. International Journal of Control, 54, 1189-1200.
Mäkilä, P. M., & Partington, J. R. (1999). On robustness in system identification. Automatica, 35, 907-916.
Mäkilä, P. M., Partington, J. R., & Gustafsson, T. K. (1995). Worst-case control-relevant identification. Automatica, 31, 1799-1819.
Milanese, M., & Vicino, A. (1993). Information based complexity and nonparametric worst-case system identification. Journal of Complexity, 9, 427-446.
Ninness, B. M., & Goodwin, G. (1995). Estimation of model quality. Automatica, 31, 1771-1797.
Partington, J. R. (1997). Interpolation, identification and sampling. Oxford: Oxford University Press.
Partington, J. R., & Mäkilä, P. M. (1995). Worst-case analysis of the least-squares method and related identification methods. Systems & Control Letters, 24, 193-200.
Pinkus, A. (1985). n-widths in approximation theory. Berlin: Springer.
Zygmund, A. (1988). Trigonometric series. Cambridge: Cambridge University Press.


Pertti M. Mäkilä was born in Turku, Finland, in 1954. He received the M.Sc. and Ph.D. degrees in chemical engineering from Åbo Akademi University, Turku, Finland, in 1978 and 1983, respectively. He has been a Senior Fulbright Scholar at the University of California, Berkeley, and a Gendron Fellow at the Pulp and Paper Research Institute of Canada in Montreal and Vancouver. He was professor in control engineering at Luleå University of Technology, Luleå, Sweden, from 1994 to 1996. Since 1996, he has been professor in automation engineering at Tampere University of Technology, Tampere, Finland. His current research interests include robust control, system modelling and identification, and signal processing.

Jonathan R. Partington was born in Norwich, UK, in 1955. He received the Ph.D. degree from the Pure Mathematics Department, University of Cambridge, in 1980. He continued to work in abstract functional analysis until 1985, when he moved to the Cambridge University Engineering Department as a Senior Research Associate. In 1989 he was appointed as a Lecturer in the School of Mathematics at the University of Leeds, where he is now Professor of Applied Functional Analysis. He has written two books, "An introduction to Hankel operators" (1988) and "Interpolation, identification and sampling" (1997). His current research interests include applications of functional analysis to systems theory, particularly approximation and identification.