Automatica 37 (2001) 573–580
Brief Paper
Robust maximum likelihood estimation in the linear model★

Giuseppe Calafiore*, Laurent El Ghaoui

Dipartimento di Automatica e Informatica, Politecnico di Torino, Cso Duca degli Abruzzi 24, 10129 Torino, Italy
Electrical Engineering and Computer Sciences Department, University of California at Berkeley, USA

Received 15 April 1999; revised 4 July 2000; received in final form 6 October 2000
Abstract

This paper addresses the problem of maximum likelihood parameter estimation in linear models affected by Gaussian noise whose mean and covariance matrix are uncertain. The proposed estimate maximizes a lower bound on the worst-case (with respect to the uncertainty) likelihood of the measured sample, and is computed by solving a semidefinite optimization problem (SDP). The problem of linear robust estimation is also studied in the paper, and the statistical and optimality properties of the resulting linear estimator are discussed. © 2001 Elsevier Science Ltd. All rights reserved.

Keywords: Robust estimation; Distributional robustness; Least squares; Convex optimization; Linear matrix inequalities
1. Introduction

The problem of estimating parameters from noisy observed data has a long history in engineering and in experimental science in general. When the observations and the unknown parameters are related by a linear model, and a stochastic setting is assumed, the application of the maximum likelihood (ML) principle (see for instance the monograph by Berger & Wolpert, 1988) leads to the well-known least-squares (LS) parameter estimate. However, the well-established ML principle assumes that the true parametric model for the data is exactly known, an assumption seldom verified in practice, where models only approximate reality (Knight, 2000, Chapter 5). This paper introduces a family of estimators that are based on a robust version of the ML principle, in which uncertainty in the underlying statistical model is explicitly taken into account. In particular, we will study estimators that maximize a lower bound on the worst-case (with respect
[★ This paper was not presented at any IFAC meeting. This paper was recommended for publication in revised form by Associate Editor T. Sugie under the direction of Editor Roberto Tempo. This work was supported in part by Italian CNR funds. Research of the second author was partially supported by a National Science Foundation CAREER award. * Corresponding author. Tel.: +39-011-564-7066; fax: +39-011-564-7099. E-mail address: [email protected] (G. Calafiore).]
to model uncertainty) value of the likelihood function. Next, we will analyze the case of linear robust estimation, and discuss the bias, variance and optimality properties of the resulting estimator. The minimax approach to robustness undertaken here is in the spirit of the distributional robustness approach discussed in Huber (1981) for parametrized families of distributions. In our case, the minimax is performed with respect to unknown-but-bounded parameters appearing in the underlying statistical model. The techniques introduced in this paper may also be viewed as the stochastic counterpart of the deterministic robust estimation methods that appeared recently in Chandrasekaran, Golub, Gu, and Sayed (1998) and El Ghaoui and Lebret (1997). In particular, the model uncertainty is here represented using the linear fractional transformation (LFT) formalism (El Ghaoui & Lebret, 1997), which allows one to treat cases where the regression matrix has a particular form, such as Toeplitz or Vandermonde, and where the uncertainty affects the data in a structured way. Robust estimation trades the accuracy that is best achieved using standard techniques such as LS or total least squares (TLS) (Van Huffel & Vandewalle, 1991) for robustness, i.e. insensitivity with respect to parameter variations. In this latter context, links between robust estimation, sensitivity, and regularization techniques, such as Tikhonov regularization (Tikhonov & Arsenin, 1977), may be found in Björck (1991), Eldén (1985), El Ghaoui and Lebret (1997), and the references therein.
0005-1098/01/$ — see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0005-1098(00)00189-8
In this paper, we will study a mixed-uncertainty problem, where the regression matrix is affected by deterministic, structured and norm-bounded uncertainty, while the measurement is affected by Gaussian noise whose covariance matrix is also uncertain. In this setting, we will compute a robust (with respect to the deterministic model uncertainty) estimate via semidefinite programming (SDP). An example of application of the introduced theory to the estimation of dynamic parameters of a robot manipulator from real experimental data is presented in Section 4.

Notation. For a square matrix X, X ≻ 0 (resp. X ⪰ 0) means that X is symmetric and positive definite (resp. positive semidefinite). λ_max(X), where X = X^T, denotes the maximum eigenvalue of X. ‖X‖ denotes the operator (maximum singular value) norm of X. For P ∈ ℝ^{n×n} with P ≻ 0, and x̄ ∈ ℝⁿ, the notation x ∼ N(x̄, P) means that x is a Gaussian random vector with expected value x̄ and covariance matrix P.
2. Problem statement

We consider the problem of estimating a parameter from noisy observations that are related to the unknown parameter by a linear statistical model. To set up the problem, we take the Bayesian point of view, and assume an a priori normal distribution on the unknown parameter x ∈ ℝⁿ, i.e. x ∼ N(x̄, P(Δ_p)), where x̄ ∈ ℝⁿ is the expected value of x, and the a priori covariance P(Δ_p) ∈ ℝ^{n×n} depends on a matrix Δ_p of uncertain parameters, as will be discussed in detail in Section 2.1. Similarly, the observations vector y ∈ ℝ^m is assumed to be independent of x, with normal distribution y ∼ N(ȳ, D(Δ_d)), where ȳ ∈ ℝ^m and D(Δ_d) ∈ ℝ^{m×m}. The linear statistical model further assumes that the expected values of x and y are related by a linear relation which, in our case, is also uncertain:

  ȳ = C(Δ_c)x̄.

Given some a priori estimate x_s of x, and given the vector of measurements y_s, we seek an estimate of x̄ that maximizes a lower bound on the worst-case (with respect to the uncertainty) a posteriori probability of the observed event. When no deterministic uncertainty is present in the model, this is the celebrated maximum likelihood (ML) approach to parameter estimation, which enjoys special properties such as efficiency and unbiasedness (see for instance Berger and Wolpert (1988), Goodwin and Payne (1997), and Ljung (1987)). For the important special case of linear estimation, we will discuss in Section 3.2 how these properties extend to the robust estimator, and how the resulting estimate is related to the minimum a posteriori variance estimator.

To cast our problem in an ML setting, the log-likelihood function L is defined as the logarithm of the a posteriori joint probability density of x, y:

  L(x̄, Δ | x_s, y_s) = log(f_x(x_s) f_y(y_s)),

where f_x, f_y are the probability density functions of x, y, respectively. Since x, y are independent Gaussian vectors, maximizing the log-likelihood is equivalent to minimizing the following function:

  ℓ(x̄, Δ) = (x_s − x̄)^T P^{-1}(Δ_p)(x_s − x̄) + (y_s − C(Δ_c)x̄)^T D^{-1}(Δ_d)(y_s − C(Δ_c)x̄),

where Δ is the total uncertainty matrix, containing the blocks Δ_p, Δ_d, Δ_c. We notice that, for fixed Δ, computing the ML estimate reduces to solving the following standard norm minimization problem:

  x̂_ML(Δ) = arg min_{x̄} ‖F(Δ)x̄ − g(Δ)‖,

where

  F(Δ) = [D^{-1/2}(Δ_d)C(Δ_c); P^{-1/2}(Δ_p)],  g(Δ) = [D^{-1/2}(Δ_d)y_s; P^{-1/2}(Δ_p)x_s].  (1)

If now Δ is allowed to vary in a given norm-bounded set, as made precise in Section 2.1, we define the worst-case maximum likelihood (WCML) estimate x̂_WCML as

  x̂_WCML = arg min_{x̄} max_{Δ∈𝔇₁} ‖F(Δ)x̄ − g(Δ)‖.  (2)

The WCML estimate therefore provides a guaranteed level of the likelihood function, for any possible value of the uncertainty. In Section 2.1, we detail the uncertainty model used throughout the paper, and state a fundamental technical lemma.
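For a fixed value of the uncertainty, the estimate above is an ordinary stacked least-squares problem combining the prior and the measurements. A minimal numeric sketch (synthetic data; all matrices below are illustrative choices, not from the paper) is:

```python
# Sketch: for FIXED uncertainty, the ML estimate solves the stacked
# least-squares problem (1), xhat = argmin_x ||F x - g||, with
#   F = [D^{-1/2} C; P^{-1/2}],  g = [D^{-1/2} y_s; P^{-1/2} x_s].
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 5
C = rng.standard_normal((m, n))             # (nominal) regression matrix
P = np.eye(n)                               # a priori covariance of x
D = 0.1 * np.eye(m)                         # measurement-noise covariance
x_s = np.zeros(n)                           # a priori estimate of x
y_s = C @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.standard_normal(m)

Dih = np.linalg.inv(np.linalg.cholesky(D))  # a square-root weight D^{-1/2}
Pih = np.linalg.inv(np.linalg.cholesky(P))
F = np.vstack([Dih @ C, Pih])
g = np.concatenate([Dih @ y_s, Pih @ x_s])
x_ml, *_ = np.linalg.lstsq(F, g, rcond=None)

# optimality: the residual gradient F^T (F x - g) vanishes at x_ml
assert np.allclose(F.T @ (F @ x_ml - g), 0.0, atol=1e-8)
```

The point of the paper is precisely that this simple computation is no longer available once Δ is unknown-but-bounded.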
2.1. LFT uncertainty models

We consider matrices subject to structured uncertainty in the so-called linear-fractional (LFT) form

  M(Δ) = M + LΔ(I − HΔ)^{-1}R,  (3)

where M, L, H, R are constant matrices, while the uncertainty matrix Δ belongs to the set 𝔇₁ ≜ {Δ ∈ 𝒟 : ‖Δ‖ ≤ 1}, where 𝒟 is a linear subspace. The norm used is the spectral (maximum singular value) norm. The subspace 𝒟, referred to as the structure subspace in the sequel, defines the structure of the perturbation, which is otherwise only bounded in norm. Together, the matrices M, L, H, R and the subspace 𝒟 constitute a linear-fractional representation of an uncertain model. From now on we make the standard assumption that all LFT models are well posed over 𝔇₁, meaning that det(I − HΔ) ≠ 0 for all Δ ∈ 𝔇₁; see Fan, Tits, and Doyle
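Once its representation matrices are fixed, the LFT (3) can be evaluated directly. A minimal sketch (illustrative matrices; the well-posedness test via the determinant is an assumption of this sketch, mirroring the definition above):

```python
# Sketch: evaluating the LFT (3), M(Delta) = M + L Delta (I - H Delta)^{-1} R,
# with an explicit well-posedness check det(I - H Delta) != 0.
import numpy as np

def eval_lft(M, L, H, R, Delta):
    """Evaluate M(Delta) = M + L @ Delta @ inv(I - H @ Delta) @ R."""
    q = Delta.shape[1]                     # Delta is p x q, H is q x p
    I_HD = np.eye(q) - H @ Delta
    if abs(np.linalg.det(I_HD)) < 1e-12:
        raise ValueError("LFT is not well posed at this Delta")
    return M + L @ Delta @ np.linalg.solve(I_HD, R)

# Illustrative data: a 2x2 uncertain matrix with a 2x2 uncertainty block
M = np.array([[1.0, 0.0], [0.0, 2.0]])
L = np.eye(2)
H = 0.3 * np.eye(2)                        # ||H|| < 1 => well posed on ||Delta|| <= 1
R = 0.5 * np.eye(2)

assert np.allclose(eval_lft(M, L, H, R, np.zeros((2, 2))), M)  # Delta = 0 recovers M
```

Note that for small Δ the map behaves affinely, M(Δ) ≈ M + LΔR, while the (I − HΔ)^{-1} term generates the rational dependence on the uncertain parameters mentioned below.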
(1991). We also introduce the following linear subspace B(𝒟), referred to as the scaling subspace:

  B(𝒟) ≜ {(S, T, G) : SΔ = ΔT, GΔ = −Δ^T G^T for every Δ ∈ 𝒟}.

LFT models of uncertainty are general and now widely used in robust control (Fan et al., 1991; Zhou, Doyle, & Glover, 1996), especially in conjunction with SDP techniques (see for instance Asai, Hara, & Iwasaki, 1996), in identification (Wolodkin, Rangan, & Poolla, 1997), and in filtering (El Ghaoui & Calafiore, 1999; Xie, Soh, & de Souza, 1994). This uncertainty framework includes the case when parameters perturb each coefficient of the data matrices in a polynomial or rational manner, as stated in the representation lemma in Dussy and El Ghaoui (1998). The main results in this paper rely on the following lemma, which provides a sufficient condition for a linear matrix inequality (LMI) (see Boyd, El Ghaoui, Feron, and Balakrishnan, 1994) to hold for any allowed value of the uncertainty.
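The defining conditions of B(𝒟) are easy to check numerically for a concrete structure. A minimal sketch (the two-repeated-scalar-block structure below is an illustrative choice, not taken from the paper):

```python
# Sketch: for the structure D = { diag(d1*I_k, d2*I_k) : d1, d2 real },
# triples (S, T, G) with S = T = diag(S1, S2) (S_i symmetric) and
# G = diag(G1, G2) (G_i skew-symmetric) satisfy the defining conditions
# S Delta = Delta T and G Delta = -Delta^T G^T of the scaling subspace B(D).
import numpy as np

rng = np.random.default_rng(1)
k = 3
Z = np.zeros((k, k))

S1 = rng.standard_normal((k, k)); S1 = S1 + S1.T    # symmetric blocks
S2 = rng.standard_normal((k, k)); S2 = S2 + S2.T
G1 = rng.standard_normal((k, k)); G1 = G1 - G1.T    # skew-symmetric blocks
G2 = rng.standard_normal((k, k)); G2 = G2 - G2.T
S = T = np.block([[S1, Z], [Z, S2]])
G = np.block([[G1, Z], [Z, G2]])

for d1, d2 in [(0.7, -0.3), (1.0, 1.0), (-0.2, 0.9)]:
    Delta = np.block([[d1 * np.eye(k), Z], [Z, d2 * np.eye(k)]])
    assert np.allclose(S @ Delta, Delta @ T)
    assert np.allclose(G @ Delta, -Delta.T @ G.T)
```

Such commuting scalings are what allow Lemma 1 below to absorb the structured uncertainty into a single LMI condition.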
Lemma 1. Consider the LFT model (3) in the matrix variable Δ ∈ ℝ^{p×q}, and let W be a given real symmetric matrix. If there exists a triple (S, T, G) ∈ B(𝒟) such that

  S ≻ 0,  T ≻ 0  (4)

and

  [M L; I 0]^T W [M L; I 0] − [R H; 0 I]^T [T G; G^T −S] [R H; 0 I] ≻ 0,  (5)

then the LFT model (3) is well posed for all Δ ∈ 𝔇₁ and

  [M^T(Δ) I] W [M^T(Δ) I]^T ≻ 0 for all Δ ∈ 𝔇₁.  (6)

Conditions (4) and (5) are also necessary when 𝒟 = ℝ^{p×q}, i.e. in the case of unstructured perturbation.

The proof of this lemma is reported in the Appendix.

Remark 2. The above lemma provides in general only a sufficient condition. A discussion of the tightness of this condition and of its approximation error is beyond the scope of this paper; general results and further discussion may be found in Ben-Tal, El Ghaoui, and Nemirovskii (2000) and El Ghaoui, Oustry, and Lebret (1998).

3. Robust estimation

3.1. Robust ML estimation

In order to compute a robust estimate, we first set up the complete uncertainty model for (2) in LFT form. Let C(Δ_c) be given in the LFT form

  C(Δ_c) = C + L_c Δ_c (I − H_c Δ_c)^{-1} R_c,

and let

  D^{-1/2}(Δ_d) = D^{-1/2} + L_d Δ_d (I − H_d Δ_d)^{-1} R_d,
  P^{-1/2}(Δ_p) = P^{-1/2} + L_p Δ_p (I − H_p Δ_p)^{-1} R_p

be the LFT representations of the Cholesky factors of D(Δ_d) and P(Δ_p), respectively. Then, using the common rules for LFT operations (see for instance Zhou et al., 1996), we obtain an LFT representation

  [F(Δ)  g(Δ)] = [F  g] + L Δ (I − H Δ)^{-1} [R_F  R_g],  (7)

where F(Δ), g(Δ) are given in (1), Δ is a structured matrix containing the (possibly repeated) blocks Δ_c, Δ_d, Δ_p on the diagonal, and F = [D^{-1/2}C; P^{-1/2}], g = [D^{-1/2}y_s; P^{-1/2}x_s] are the nominal values in (1).

Using the Schur complement rule (see for instance Boyd et al., 1994), the WCML estimation problem (2) may then be cast as a robust semidefinite optimization problem (SDP, see El Ghaoui et al., 1998):

  γ_WCML = arg min_{x, γ} γ

subject to

  [I  F(Δ)x − g(Δ); (F(Δ)x − g(Δ))^T  γ] ⪰ 0 for all Δ ∈ 𝔇₁.  (8)

The WCML estimate x̂_WCML is defined as the value of x at the optimum of problem (8). However, the solution of the above problem is in general numerically hard to compute (Ben-Tal et al., 2000; El Ghaoui et al., 1998). In order to obtain a computable solution, we shall apply the robustness lemma to the robust LMI constraint (8). In this way, we obtain a convex inner approximation of the feasible set, and the minimization of γ subject to this new constraint will provide an upper bound on the optimal objective of (8). The so-obtained solution x̂_RML will be called a robust maximum likelihood estimate (RML). This is summarized in the following theorem.

Theorem 3. The robust maximum likelihood estimate x̂_RML is obtained solving the SDP

  γ_RML = arg min_{x, S, T, G, γ} γ  (9)

subject to

  (S, T, G) ∈ B(𝒟),  S ≻ 0,  T ≻ 0,

  [Θ(S, G, T)  [Fx − g; R_F x − R_g]; [Fx − g; R_F x − R_g]^T  γ] ⪰ 0,
where

  Θ(S, G, T) = [I − L T L^T   −L(T H^T + G); −(H T + G^T) L^T   S − H T H^T − H G − G^T H^T].  (10)

The optimal upper bound γ_RML is exact (i.e. γ_RML = γ_WCML) when 𝒟 = ℝ^{p×q}.

Proof. The result in the theorem follows immediately from the application of Lemma 1 to the LMI constraint in (8). In particular, the result is obtained setting

  M(Δ) ≜ [F^T(Δ)  0; g^T(Δ)  0] = [F^T  0; g^T  0] + [R_F^T; R_g^T] Δ^T (I − H^T Δ^T)^{-1} [L^T  0],

  W ≜ [0  0  x̃; 0  I  0; x̃^T  0  γ],  x̃ = [x; −1].  □

Remark 4. When there is no model uncertainty, we can set L = 0, H = 0, R_F = 0, R_g = 0. In this case, the results of the previous theorem reduce to the standard LS estimate, and the robust estimate is consistent with the idealized (uncertainty-free) model.

While the result of the previous theorem is useful to obtain a numerical estimate of the parameters, the complicated non-linear dependence of the estimate on the data x_s, y_s makes it awkward to study further the statistical properties of the resulting estimator. To pursue this study, in Section 3.2 we consider the additional constraint that the estimator should be linear in the data.

3.2. Robust linear estimation

The goal of this section is to compute a robust estimate which is linear in the observations. This is done in order to recover some of the nice features related to linear estimators, and to allow for further analysis of the bias and variance characteristics of the estimate. To this end, let K be an unknown gain matrix, and let x = Kz, with

  z ≜ [y_s^T  x_s^T]^T,  K ≜ [K_y  K_x],  A ≜ [F^T  R_F^T]^T,  h ≜ [g^T  R_g^T]^T ≜ 𝒢 z,

where 𝒢 is some given matrix that can be deduced from (1) and (7). Then, the main result on the optimal robust linear estimate (RLE) is provided by the following theorem.

Theorem 5. Let

  λ_RLE = arg min_{K, S, T, G, λ} λ  (11)

subject to

  (S, T, G) ∈ B(𝒟),  S ≻ 0,  T ≻ 0,

  [Θ(S, G, T)  AK − 𝒢; (AK − 𝒢)^T  λI] ⪰ 0,  (12)

and let K_RLE be the value of K at the optimum of (11); then

  x̂_RLE = K_RLE z ≜ K_x x_s + K_y y_s

is a robust linear estimate (RLE) guaranteeing that ‖F(Δ)x̂_RLE − g(Δ)‖² ≤ γ_RLE for all admissible values of the uncertainty, where γ_RLE = ‖z‖² λ_RLE. Thus, γ_RLE is a minimized upper bound on γ_RML (i.e. γ_RLE ≥ γ_RML), and the optimal gain K_RLE is independent of the observations z.

Proof. We start from the result of Theorem 3, assume that z ≠ 0 (the case z = 0 may be trivially treated separately) and introduce a new variable K such that x = Kz. We now rewrite problem (9) in the equivalent form

  γ_RML = arg min_{K, S, T, G, γ} γ

subject to

  (S, T, G) ∈ B(𝒟),  S ≻ 0,  T ≻ 0,

  [Θ(S, G, T)  (AK − 𝒢)z; z^T(AK − 𝒢)^T  γ] ⪰ 0.  (13)

The previous is only a restatement of (9), and the resulting estimate is not yet linear in the observations, as the optimal gain K will depend on z. However, we now show that condition (13) is satisfied whenever condition (12) is satisfied. This is because, taking Schur complements, (13) is equivalent to Θ ≻ 0 and

  γ > z^T (AK − 𝒢)^T Θ^{-1} (AK − 𝒢) z.  (14)

Since

  z^T (AK − 𝒢)^T Θ^{-1} (AK − 𝒢) z ≤ ‖z‖² λ_max((AK − 𝒢)^T Θ^{-1} (AK − 𝒢)),

(14) is implied by γ > ‖z‖² λ_max((AK − 𝒢)^T Θ^{-1} (AK − 𝒢)). Introducing the new variable λ = γ/‖z‖², this latter condition, together with Θ ≻ 0, may be restated in the form of (12), applying again the Schur complement rule. As the initial constraint has been replaced by a more stringent one, it immediately follows that all solutions to
(11) will be feasible for (9); therefore x̂_RLE will be a robust estimate, and γ_RLE an upper bound on γ_RML. Minimizing over λ (i.e. solving problem (11)) amounts to finding the best possible upper bound on γ_RML, under the premise that the estimate is linear in the samples. □
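The two elementary facts used in this proof can be illustrated numerically. A minimal sketch (random data, purely illustrative):

```python
# Sketch of the two facts used in the proof of Theorem 5: the
# Schur-complement equivalence  [Theta, u; u^T, gamma] >= 0  iff
# gamma >= u^T Theta^{-1} u  (for Theta > 0), and the eigenvalue bound
# z^T Q z <= ||z||^2 lambda_max(Q), which removes z from the constraint.
import numpy as np

rng = np.random.default_rng(3)
k = 4
A = rng.standard_normal((k, k))
Theta = A @ A.T + np.eye(k)                 # Theta > 0
u = rng.standard_normal(k)
q = u @ np.linalg.solve(Theta, u)           # u^T Theta^{-1} u

def is_psd(M, tol=1e-9):
    return np.min(np.linalg.eigvalsh(M)) >= -tol

def bordered(gamma):
    return np.block([[Theta, u[:, None]], [u[None, :], np.array([[gamma]])]])

assert is_psd(bordered(q + 0.1)) and not is_psd(bordered(q - 0.1))

# the bound that turns (14) into the z-independent condition (12):
N = rng.standard_normal((k, 3))             # stands in for AK - G
z = rng.standard_normal(3)
Q = N.T @ np.linalg.solve(Theta, N)
assert z @ Q @ z <= (z @ z) * np.max(np.linalg.eigvalsh(Q)) + 1e-12
```

The price of removing z from the constraint is exactly the gap between the quadratic form and its eigenvalue bound, which is why γ_RLE only upper-bounds γ_RML.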
3.3. Bias and variance of the robust linear estimate

In this section, we examine the bias and variance characteristics of the linear robust estimator, and present a result for the computation of a robust linear unbiased estimator. The estimation bias is defined as b ≜ E{x̂ − x̄}, where x̄ is the (unknown) expected value of x, and x̂ is a linear estimate of the form

  x̂ = K_x x_s + K_y y_s.  (15)

We then have that b = b(Δ_c) = E(K_x x_s + K_y y_s − x̄) = B(Δ_c)x̄, where

  B(Δ_c) = B + K_y L_c Δ_c (I − H_c Δ_c)^{-1} R_c;  B ≜ K_x − I + K_y C;  (16)

therefore, the bias is a linear function of the unknown mean x̄, with uncertain coefficients. Notice that the robust estimate will in general be affected by bias. A condition for having robustly zero bias is of course B(Δ_c) = 0, that is B = 0 and K_y L_c = 0. The first condition requires in particular that K_x = I − K_y C, which means that the estimate should be in the classical 'innovations' form x̂ = x_s + K_y(y_s − C x_s). The second condition requires orthogonality between the gain K_y and the matrix L_c describing the uncertainty on the regression matrix C(Δ_c). In particular, when there is no uncertainty on C (but we still allow for uncertainty in the covariance matrices D(Δ_d), P(Δ_p)), we have L_c = 0, and we can have unbiased estimates provided that B = 0, i.e. K_x = I − K_y C. Further, notice that both conditions impose linear constraints on the gain K, which may easily be added to the constraints of problem (11) (i.e. the resulting problem is still an SDP), in order to compute linear robust unbiased estimates. Notice also that the additional constraints on the gain will not destroy feasibility, but simply decrease the level of the achievable robust ML performance. Our result on robust linear unbiased estimation (RLUE) is summarized in the following theorem.

Theorem 6. Assume no uncertainty acts on the regression matrix C, i.e. L_c = 0. Let

  x̂_RLUE = x_s + K_RLUE (y_s − C x_s),

where K_RLUE is the value of K_y at the optimum of

  λ_RLUE = arg min_{K_y, S, T, G, λ} λ

subject to

  (S, T, G) ∈ B(𝒟),  S ≻ 0,  T ≻ 0,

  [Θ(S, G, T)  A[K_y  I − K_y C] − 𝒢; (A[K_y  I − K_y C] − 𝒢)^T  λI] ⪰ 0.

Then x̂_RLUE is a robust linear unbiased estimate (RLUE) guaranteeing that ‖F(Δ)x̂_RLUE − g(Δ)‖² ≤ γ_RLUE for all admissible values of the uncertainty, where γ_RLUE = ‖z‖² λ_RLUE. Here γ_RLUE is the best upper bound on γ_RLE achievable under the additional premise that the robust estimate be linear and unbiased; therefore γ_RLUE ≥ γ_RLE ≥ γ_RML.

The proof of the above theorem follows from the previous discussion.

We now discuss the a posteriori covariance properties of the estimator in the parameter space. For a generic linear estimate of the form (15), the a posteriori covariance matrix is defined as Σ ≜ E{(x̂ − x̄)(x̂ − x̄)^T}. A standard manipulation then yields

  Σ(Δ) = K_x P(Δ_p) K_x^T + K_y D(Δ_d) K_y^T + B(Δ_c) x̄ x̄^T B^T(Δ_c),  (17)

where B(Δ_c) is defined as in (16). We notice that (17) depends on the unknown mean x̄; therefore an empirical estimate of the covariance may be obtained by substituting the estimated value x̂ in place of the unknown x̄. However, the significance of the covariance matrix in the case of a biased estimate may be questionable. A more interesting result is obtained in the case of unbiased estimates (obtained by means of Theorem 6, when C is exactly known), where Σ(Δ) reduces to

  Σ(Δ) = K_x P(Δ_p) K_x^T + K_y D(Δ_d) K_y^T.

Remark 7. Notice that, using standard rules for operations with LFTs, one can determine an LFT representation for Σ(Δ), and use it in a recursive estimation framework, collecting a new observation y_s, and setting x_s ← x̂_RLUE, P(Δ_p) ← Σ(Δ), etc. Further study is however needed to analyze the behavior of the recursive RLUE.

It is important at this point to make some observations. It is well known that, when the linear model is perfectly known, the application of the ML principle provides estimates which are unbiased and efficient, in the sense that the a posteriori covariance in parameter space reaches the Cramér–Rao lower bound; see for instance Knight (2000, Section 6.4). In our context of robust estimation, we saw that unbiased estimates can be
obtained (Theorem 6) only if no uncertainty acts on the regression matrix C, while still allowing for uncertainty in P(Δ_p) and D(Δ_d). When this is not the case, estimation bias is unavoidable, due to imperfect knowledge of the linear relation ȳ = C(Δ_c)x̄ between the mean values of x and y. As for the a posteriori parameter covariance, one may ask how the robust maximization of the likelihood function is related to robust minimization of the a posteriori covariance. The answer is that our robust ML linear estimator is also the one that minimizes an upper bound on the worst-case a posteriori covariance; therefore the estimate provided by Theorem 6 is also a minimum-variance unbiased estimate (MVUE), in the robust sense explained above. The reason for this resides in a deep and general duality between the maximization of the log-likelihood function and the minimization (in the matrix sense) of the a posteriori covariance. For linear models, both problems have an equivalent formulation, provided that suitable dual bases are chosen to write the optimization problem; for a thorough discussion of this issue, the reader is referred to Kailath, Sayed, and Hassibi (2000, Chapter 15). Finally, we remark that the robust estimation framework proposed in this paper has similarities with the minimax approach to ML estimates and minimax variance estimates discussed in Huber (1981, Chapter 4), where robustness issues are considered with respect to parametrized families of distributions.
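The zero-bias condition B = K_x − I + K_y C = 0 of (16), i.e. the innovations form, is easy to verify numerically. A minimal sketch (synthetic data, not from the paper's experiment):

```python
# Sketch: with C exactly known, the innovations form
#   xhat = x_s + K_y (y_s - C x_s),  i.e.  K_x = I - K_y C,
# makes B = K_x - I + K_y C in (16) vanish for ANY gain K_y, hence zero
# bias; a Monte Carlo run confirms E{xhat} ~ xbar.
import numpy as np

rng = np.random.default_rng(4)
n, m = 3, 5
C = rng.standard_normal((m, n))
K_y = 0.1 * rng.standard_normal((n, m))      # an arbitrary gain
K_x = np.eye(n) - K_y @ C
B = K_x - np.eye(n) + K_y @ C
assert np.allclose(B, 0.0)

xbar = np.array([1.0, -2.0, 0.5])
N = 20000
x_s = xbar + rng.standard_normal((N, n))               # x_s ~ N(xbar, I)
y_s = C @ xbar + 0.5 * rng.standard_normal((N, m))     # y_s ~ N(C xbar, 0.25 I)
xhat = x_s + (y_s - x_s @ C.T) @ K_y.T
assert np.linalg.norm(xhat.mean(axis=0) - xbar) < 0.1  # empirical bias ~ 0
```

The same computation with a perturbed C in the data generation (but not in the gain) would show the bias term B(Δ_c)x̄ reappearing, as discussed above.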
4. Example

In this section, we report a result of the application of the presented methodology to the experimental estimation of dynamic parameters of a SCARA two-link IMI manipulator available at the Politecnico di Torino Robotics Lab. Details related to the experimental setup, manipulator model and data treatment are discussed in Calafiore and Indri (1998). The goal is to estimate eight dynamic and friction parameters of the manipulator from noisy joint torque data. Denoting by q^T = [q_1, q_2] the joint positions, and by τ^T = [τ_1, τ_2] the measured joint torques, the following manipulator model was developed for identification:

  τ = C(q, q̇, q̈)θ + d,

where θ ∈ ℝ⁸ is the vector of identifiable parameters, d ∈ ℝ² is a zero-mean Gaussian noise vector, and C(q, q̇, q̈) ∈ ℝ^{2×8} is the nominal regression matrix, which is a non-linear function of q, q̇, q̈. Since these data are experimentally acquired from the system, or reconstructed by means of filters, the regression matrix is intrinsically uncertain. We therefore developed a linearized model for the uncertainty in the LFT form, and collected data at six time instants on the reference trajectory. The relative torque measurements and LFT regression models were then stacked, obtaining the augmented regression model

  τ̃ = (C̃ + L̃ Δ̃ R̃)θ + d̃,

where τ̃ ∈ ℝ¹², C̃ ∈ ℝ^{12×8}, and Δ̃ = diag(δ₁Ĩ, …, δ₆Ĩ). The robust estimate was then computed solving (9) by means of a Matlab code based on the LMITool SDP solver (El Ghaoui & Commeau, 1999). The robust solution was then compared to the standard LS estimate. It is worth recalling that on the estimation data set the LS estimate yields a smaller residual than the robust estimate: indeed, we found e_LS ≜ ‖D^{-1/2}(τ̃ − C̃θ̂_LS)‖ = 3.24 and e_RML ≜ ‖D^{-1/2}(τ̃ − C̃θ̂_RML)‖ = 3.54. However, the quality of the two parameter estimates should be compared on data sets different from the one used for the estimation (validation data sets). We therefore computed the residuals e_LS and e_RML using data collected over several different trajectories. The average residual turned out to be 6.2 for the robust estimate and 6.76 for the LS estimate. More interestingly, comparing the peak values of the residuals, we obtain 10.1 for the robust estimate and 15.1 for the LS estimate, which is about 50% worse. Also, we noticed that the robust estimate is more 'regular' than the LS estimate; regularity may be measured by the variance of the residuals, which was 2.65 for the robust estimate and 4.52 for the LS estimate, i.e. again about 70% worse than the robust estimate. The actual data used in this experiment are available from the authors on request.

5. Conclusions

In this paper, we have shown that the maximum likelihood estimation problem with uncertainty in the regression matrix and in the observations covariance can be solved in a worst-case setting using convex programming. This implies that, in practice, these problems can be solved efficiently in polynomial time using available software. The paper also presents specialized results for robust linear estimation and unbiased robust linear estimation.
In particular, this latter estimator recovers most of the nice features of standard ML estimators, and appears suitable for implementation in a recursive estimation framework. Robust estimation has been applied to an experimental problem of manipulator parameter identification, which has inherent uncertainty in the regression matrix. The reported results show that a consistent improvement can be obtained over standard estimation methods.

Appendix. Proof of Lemma 1

We first observe that the lower-right block of the matrix subtracted in (5) is H^T T H + H^T G + G^T H − S; therefore condition (5) implies well-posedness of the LFT (3), see for instance Fan et al. (1991) for a proof. Now, if the LFT for
M(Δ) is well posed, then condition (6) is satisfied if and only if

  f₁(u, p) ≜ [u; p]^T [M L; I 0]^T W [M L; I 0] [u; p] > 0

for all u, p such that p = Δ(Ru + Hp) for some Δ ∈ 𝔇₁. Let then q = Ru + Hp, and (S, T, G) ∈ B(𝒟), with T ≻ 0. The condition p = Δq for some Δ ∈ 𝔇₁ implies that q^T G p = q^T G Δ q = 0, by skew-symmetry of GΔ. In addition, we have

  q^T T q − p^T S p = q^T (T − Δ^T S Δ) q = q^T T^{1/2} (I − T^{-1/2} Δ^T Δ T^{1/2}) T^{1/2} q ≥ 0.

In the above, we have used the fact that SΔ = ΔT, and that the matrix T^{-1/2} Δ^T Δ T^{1/2} is actually symmetric, and has eigenvalues less than or equal to one. We conclude that

  [u; p]^T [R H; 0 I]^T [T G; G^T −S] [R H; 0 I] [u; p] ≥ 0

for every u, p such that p = Δ(Ru + Hp) for some Δ ∈ 𝔇₁, and every triple (S, T, G) ∈ B(𝒟) with T ≻ 0. Based on this fact, we obtain a sufficient condition for (6) to hold, namely that for every non-zero pair (u, p), we have

  [u; p]^T [M L; I 0]^T W [M L; I 0] [u; p] > [u; p]^T [R H; 0 I]^T [T G; G^T −S] [R H; 0 I] [u; p]

for some triple (S, T, G) ∈ B(𝒟) with T ≻ 0. The above condition is exactly the one stated in the lemma.

It remains to prove that our condition is also necessary in the unstructured case, 𝒟 = ℝ^{p×q}. In this case, the set B(𝒟) reduces to the set of triples (S, T, G) with S = τI_p, T = τI_q, τ ∈ ℝ, and G = 0. First, we note that the well-posedness sufficient condition H^T T H + H^T G + G^T H − S ≺ 0 for some (S, T, G) ∈ B(𝒟) is equivalent to ‖H‖ < 1, which is the exact well-posedness condition. Second, we note that for every u, p, we have p = Δ(Ru + Hp) for some Δ, ‖Δ‖ ≤ 1, if and only if

  f₀(u, p) ≜ [u; p]^T [R H; 0 I]^T [I_q 0; 0 −I_p] [R H; 0 I] [u; p] ≥ 0.

We note that the above inequality is strictly feasible, that is, there exists a pair (u₀, p₀) such that f₀(u₀, p₀) > 0 (take, for instance, p₀ = 0 and u₀ such that Ru₀ ≠ 0). In this case, the S-procedure (Boyd et al., 1994) provides a necessary and sufficient condition for the quadratic constraint f₁(u, p) > 0 to hold for every non-zero pair (u, p) such that f₀(u, p) ≥ 0. This condition is that there exists a scalar τ ≥ 0 such that, for every non-zero (u, p), we have f₁(u, p) > τ f₀(u, p). This is exactly the condition of the lemma in the unstructured case, written with S = τI_p, T = τI_q and G = 0. □
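The two intermediate facts in the proof (the vanishing cross term and the scaling inequality) can be checked numerically for a concrete structure. A minimal sketch (the repeated-scalar-block structure is an illustrative choice):

```python
# Sketch of the two facts used in the proof: for (S, T, G) in B(D) with
# T > 0 and p = Delta q, ||Delta|| <= 1,
#   (i)  q^T G p = 0             (skew-symmetry of G Delta),
#   (ii) q^T T q - p^T S p >= 0.
import numpy as np

rng = np.random.default_rng(5)
k = 3
Z = np.zeros((k, k))
d1, d2 = 0.6, -0.8                           # |d_i| <= 1
Delta = np.block([[d1 * np.eye(k), Z], [Z, d2 * np.eye(k)]])
s1, s2 = 2.0, 0.5                            # positive block scalings => T > 0
S = T = np.block([[s1 * np.eye(k), Z], [Z, s2 * np.eye(k)]])
G1 = rng.standard_normal((k, k)); G1 = G1 - G1.T
G2 = rng.standard_normal((k, k)); G2 = G2 - G2.T
G = np.block([[G1, Z], [Z, G2]])

q = rng.standard_normal(2 * k)
p = Delta @ q
assert abs(q @ G @ p) < 1e-10                # (i)
assert q @ T @ q - p @ S @ p >= -1e-10       # (ii)
```

Here (ii) reduces to Σᵢ sᵢ(1 − dᵢ²)‖qᵢ‖² ≥ 0, which is exactly the mechanism exploited in the proof above.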
References

Asai, T., Hara, S., & Iwasaki, T. (1996). Simultaneous modelling and synthesis for robust control by LFT scaling. Proceedings of the 13th IFAC world congress, vol. G, San Francisco, CA, USA (pp. 309–314).
Ben-Tal, A., El Ghaoui, L., & Nemirovskii, A. (2000). Robust semidefinite programming. In R. Saigal, L. Vandenberghe, & H. Wolkowicz (Eds.), Handbook of semidefinite programming. Dordrecht: Kluwer Academic Publishers.
Berger, J. O., & Wolpert, R. (1988). The likelihood principle. Hayward, CA: Institute of Mathematical Statistics.
Björck, A. (1991). Component-wise perturbation analysis and error bounds for linear least squares solutions. BIT, 31, 238–244.
Boyd, S., El Ghaoui, L., Feron, E., & Balakrishnan, V. (1994). Linear matrix inequalities in system and control theory. Studies in Applied Mathematics. Philadelphia: SIAM.
Calafiore, G., & Indri, M. (1998). Experiment design for robot dynamic calibration. Proceedings of the 1998 IEEE international conference on robotics and automation, Leuven, May 16–21.
Chandrasekaran, S., Golub, G., Gu, M., & Sayed, A. H. (1998). Parameter estimation in the presence of bounded data uncertainties. SIAM Journal on Matrix Analysis and Applications, 19(1), 235–252.
Dussy, S., & El Ghaoui, L. (1998). Measurement-scheduled control for the RTAC problem. International Journal of Robust and Nonlinear Control, 8(4–5), 377–400.
Eldén, L. (1985). Perturbation theory for the least-squares problem with linear equality constraints. BIT, 24, 472–476.
El Ghaoui, L., & Calafiore, G. (1999). Deterministic state prediction under structured uncertainty. Proceedings of the American control conference, San Diego, California.
El Ghaoui, L., & Commeau, J.-L. (1999). lmitool version 2.0, January. Available via http://www.ensta.fr/gropco.
El Ghaoui, L., & Lebret, H. (1997). Robust solutions to least-squares problems with uncertain data. SIAM Journal on Matrix Analysis and Applications, 18(4), 1035–1064.
El Ghaoui, L., Oustry, F., & Lebret, H. (1998). Robust solutions to uncertain semidefinite programs. SIAM Journal on Optimization, 9(1), 33–52.
Fan, M. K. H., Tits, A. L., & Doyle, J. C. (1991). Robustness in the presence of mixed parametric uncertainty and unmodeled dynamics. IEEE Transactions on Automatic Control, 36(1), 25–38.
Goodwin, G. C., & Payne, R. L. (1997). Dynamic system identification: Experiment design and data analysis. New York: Academic Press.
Huber, P. J. (1981). Robust statistics. New York: Wiley.
Kailath, T., Sayed, A. H., & Hassibi, B. (2000). Linear estimation. Englewood Cliffs, NJ: Prentice-Hall.
Knight, K. (2000). Mathematical statistics. New York: Chapman & Hall/CRC.
Ljung, L. (1987). System identification: Theory for the user. Englewood Cliffs, NJ: Prentice-Hall.
Tikhonov, A., & Arsenin, V. (1977). Solutions of ill-posed problems. New York: Wiley.
Van Huffel, S., & Vandewalle, J. (1991). The total least-squares problem: Computational aspects and analysis. Philadelphia, PA: SIAM.
Wolodkin, G., Rangan, S., & Poolla, K. (1997). An LFT approach to parameter estimation. Proceedings of the American control conference, Albuquerque, New Mexico.
Xie, L., Soh, Y. C., & de Souza, C. E. (1994). Robust Kalman filtering for uncertain discrete-time systems. IEEE Transactions on Automatic Control, 39(6), 1310–1314.
Zhou, K., Doyle, J. C., & Glover, K. (1996). Robust and optimal control. Upper Saddle River, NJ: Prentice-Hall.

Giuseppe Carlo Calafiore was born in Torino, Italy, in December 1969. He received the Laurea degree in Electrical Engineering from Politecnico di Torino in 1993, and the Doctorate degree in Information and System Theory from Politecnico di Torino in 1997. Since 1998 he has held the position of Assistant Professor at the Dipartimento di Automatica e Informatica, Politecnico di Torino. Dr. Calafiore held visiting positions at
the Information Systems Laboratory, Stanford University, in 1995; at the École Nationale Supérieure de Techniques Avancées (ENSTA), Paris, in 1998; and at the University of California at Berkeley, in 1999. His research interests are in the fields of convex optimization, randomized algorithms, and identification, prediction and control of uncertain systems.
Laurent El Ghaoui graduated from the École Polytechnique, Palaiseau, France, in 1985, and obtained a Ph.D. in Aeronautics and Astronautics from Stanford University in 1990. He held a research position at the Laboratoire Systèmes et Perception, Arcueil, France, from 1990 to 1992. From 1992 until 1999, he was a faculty member at the École Nationale Supérieure de Techniques Avancées in Paris. He is currently Associate Professor in the Department of Electrical Engineering and Computer Sciences of the University of California at Berkeley.