Robust Optimization

Copyright © IFAC Robust Control Design, Budapest, Hungary, 1997

Laurent El Ghaoui
Laboratoire de Mathematiques Appliquees, Ecole Nationale Superieure de Techniques Avancees,
32, Blvd. Victor, 75739 Paris, France. [email protected]

ABSTRACT

A large class of engineering problems can be formulated as semidefinite programs (SDPs), which are (convex) optimization problems involving a linear objective and linear matrix inequality (LMI) constraints. In practice, the data in these problems is often uncertain. We show how a technique used in robust control can be adapted to any SDP. The approach widens the range of applications of robust control theory to a host of engineering problems. We illustrate the approach with (linear) least-squares problems with uncertain data.

1. Introduction

The basic idea of the LMI method in control is to transform, or approximate, a given control problem into an optimization problem with a linear objective and linear matrix inequality (LMI) constraints. An LMI constraint on a vector x ∈ Rᵐ is one of the form

  F(x) = F₀ + Σ_{i=1}^m xᵢFᵢ ⪰ 0,   (1)

where the symmetric matrices Fᵢ = Fᵢᵀ ∈ R^{N×N}, i = 0, ..., m, are given. The minimization problem

  minimize cᵀx subject to F(x) ⪰ 0,   (2)

where c ∈ Rᵐ, is called a semidefinite program (SDP).

The above framework is particularly efficient for the following reasons.

Efficient numerical solution. SDPs can be solved very efficiently using recent interior-point methods (the global optimum is found in modest computing time). This brings a numerical solution to problems for which no analytical or closed-form solution is known.

Multicriteria problems. The approach makes it possible to impose many different (possibly conflicting) specifications in the design process, allowing the designer to explore tradeoffs and analyze limits of performance and feasibility. This offers a drastic advantage over design methods that rely on a single criterion deemed to reflect all design constraints; the choice of a relevant criterion is sometimes a nontrivial task.

Wide applicability. The techniques used in the approach are relevant far beyond control and estimation. This opens exciting avenues of research in which seemingly very different problems are analyzed and solved in a unified framework. For example, the method known in LMI-based control as the S-procedure can be successfully applied in combinatorial optimization, leading to efficient relaxations of hard problems.

Robustness against uncertainty. The approach is very well suited to problems with uncertainty in the data. Based on a deterministic description of uncertainty (with detailed structure and hard bounds), a systematic procedure makes it possible to formulate an SDP whose solution is robust. This statement has implications for a wide scope of engineering problems, where measurement noise, modelling errors, etc., are often present.

The last point is our main topic. Our exposition draws from the papers [3, 5].
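As an aside not in the original paper, the LMI constraint (1) is easy to check numerically for a given x: evaluate F(x) and test whether its smallest eigenvalue is nonnegative. A minimal sketch (using numpy; the matrices F₀, F₁, F₂ are made up for illustration):

```python
import numpy as np

# Hypothetical data for an LMI F(x) = F0 + x1*F1 + x2*F2 >= 0 (PSD).
F0 = np.array([[2.0, 0.0], [0.0, 2.0]])
F1 = np.array([[1.0, 0.0], [0.0, -1.0]])
F2 = np.array([[0.0, 1.0], [1.0, 0.0]])

def lmi_value(x, Fs):
    """Evaluate F(x) = F0 + sum_i x_i * F_i."""
    F = Fs[0].copy()
    for xi, Fi in zip(x, Fs[1:]):
        F += xi * Fi
    return F

def is_feasible(x, Fs, tol=1e-9):
    """x satisfies the LMI iff the smallest eigenvalue of F(x) is >= 0."""
    return np.linalg.eigvalsh(lmi_value(x, Fs)).min() >= -tol

print(is_feasible([0.0, 0.0], [F0, F1, F2]))  # F(0) = 2I is PSD -> True
print(is_feasible([0.0, 3.0], [F0, F1, F2]))  # 2I + 3*F2 has eigenvalue -1 -> False
```

Solving the SDP (2) itself requires an interior-point solver; this sketch only illustrates what membership in the feasible set means.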

2. LMI Problems with Uncertain Data

2.1 Uncertainty models

In many practical applications, the data of the problem (the vector c and the matrices Fᵢ that appear in (2)) is not exactly known. There are many methods to cope with uncertainty in optimization programs. One is stochastic programming, where a distribution of the uncertain parameters is assumed to be known, and quantities such as the mean and covariance matrix of the objective are sought [2]. These methods are usually quite intensive numerically, and do not provide guarantees in terms of realizability of the proposed solution.

Following a philosophy consistent with robust control, one may instead assume a deterministic description of uncertainty (such as unknown-but-bounded parametric variations), and seek solutions that are guaranteed to remain feasible (and achieve a desired level of suboptimality) for every allowable perturbation.

To take this route, we must first describe the way uncertainty perturbs the "nominal" data of the problem. We assume that the perturbed value of the data is described by

  F(x, Δ) = F(x) + LΔ(I − DΔ)⁻¹R(x) + R(x)ᵀ(I − ΔᵀDᵀ)⁻¹ΔᵀLᵀ,

where the uncertainty matrix Δ is unknown-but-bounded, the matrices L, R, D are given, and the matrix functions F(·) and R(·) are affine in the decision variable x (with F(x) symmetric for every x).

Specifically, we assume that Δ is restricted to a given linear subspace V (say, the subspace of diagonal matrices of appropriate size), and otherwise bounded in norm (largest singular value) by a given "perturbation level" ρ. We denote by V_ρ the "perturbation set" {Δ ∈ V : ||Δ|| ≤ ρ}.

The above model seems very specialized; however, it can be used for a wide scope of perturbation structures. It covers in particular the case when the elements of the matrix F(x) are perturbed (in a polynomial, or even rational) manner by a perturbation vector. This is thanks to the "linear-fractional representation" lemma recalled in [5]. For completeness, we should also be equipped with a model of the way the perturbation corrupts the objective vector c. It turns out that we may, without loss of generality, assume that c is independent of the perturbation.
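As a numerical illustration (not part of the original development), the linear-fractional model above can be evaluated directly. The sketch below, with made-up sizes and data, builds F(x, Δ) for a fixed x and checks two basic properties of the model: it reduces to the nominal F(x) at Δ = 0, and it remains symmetric for every admissible Δ (since (I − ΔᵀDᵀ)⁻¹ = (I − DΔ)⁻ᵀ, the second term is the transpose of the first):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fixed data: F(x) is 3x3, Delta is 2x2, so L is 3x2,
# R(x) is 2x3, and D is 2x2 (kept small so that I - D*Delta stays invertible).
F_x = np.diag([3.0, 2.0, 1.0])        # F(x) for some fixed x (symmetric)
L = rng.standard_normal((3, 2))
R_x = rng.standard_normal((2, 3))     # R(x) for the same x
D = 0.1 * rng.standard_normal((2, 2))

def perturbed_F(delta):
    """F(x, Delta) = F(x) + L Delta (I - D Delta)^{-1} R(x) + (transpose term)."""
    M = L @ delta @ np.linalg.inv(np.eye(2) - D @ delta) @ R_x
    return F_x + M + M.T

# At Delta = 0 the model reduces to the nominal data.
assert np.allclose(perturbed_F(np.zeros((2, 2))), F_x)

# F(x, Delta) stays symmetric for an arbitrary admissible Delta.
Fp = perturbed_F(0.5 * rng.standard_normal((2, 2)))
assert np.allclose(Fp, Fp.T)
```

Robust feasibility of x then amounts to F(x, Δ) ⪰ 0 for every Δ in the perturbation set, which is the problem addressed in the next subsection.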

2.2 Robust decision problems

The robust optimization problem is then the following:

  minimize cᵀx subject to F(x, Δ) ⪰ 0 for every Δ ∈ V_ρ.   (3)

In general, this problem is of high complexity (intractable), even though it is often convex [1]. We may compute an upper bound on the objective value thanks to the following lemma (which can be traced back to [4]).

Lemma 2.1 Let F = Fᵀ, L, R, D be real matrices of appropriate size. Let V be a subspace of R^{p×q}, and denote by B the linear set of matrix triples (S, G, T) such that

  SΔ = ΔT and GΔ = −ΔᵀGᵀ for every Δ ∈ V.

We have det(I − DΔ) ≠ 0 and

  F + LΔ(I − DΔ)⁻¹R + Rᵀ(I − DΔ)⁻ᵀΔᵀLᵀ ≻ 0

for every Δ ∈ V, ||Δ|| ≤ ρ, if there exists a triple (S, T, G) ∈ B such that S ≻ 0, T ≻ 0, and

  [F  Rᵀ; R  0] − [L  0; D  I] [ρ²S  G; Gᵀ  −T] [L  0; D  I]ᵀ ≻ 0.   (4)

A direct application of the above lemma shows that an upper bound on the objective value of problem (3) can be found by solving the "augmented" SDP

  inf cᵀx subject to (S, T, G) ∈ B, S ≻ 0, T ≻ 0, and (4) with F = F(x), R = R(x).

The above reduction is not an exact solution of the problem, except when the set V is the whole space (we refer to this case as the unstructured case). The main advantage of the above approximation is that it reduces an originally very hard problem to a (numerically) tractable one.

3. Examples

In the following, we provide examples of applications of the above approach, taken from linear algebra.

3.1 Robust least squares

Consider the problem of finding a solution x to an overdetermined set of equations Ax ≈ b, where the data matrices A ∈ R^{n×m}, b ∈ Rⁿ are given. In practice, the data matrices are often subject to (not necessarily small) deterministic perturbations.

First, we may assume that the given model is not a single pair (A, b), but a family of matrices (A + ΔA, b + Δb), where Δ = [ΔA Δb] is an unknown-but-bounded matrix; precisely, ||[ΔA Δb]||_F ≤ ρ, where ρ ≥ 0 is given. For x fixed, we define the worst-case residual as

  r(A, b, ρ, x) ≜ max_{||[ΔA Δb]||_F ≤ ρ} ||(A + ΔA)x − (b + Δb)||.   (5)

We say that x is a Robust Least Squares (RLS) solution if x minimizes the worst-case residual r(A, b, ρ, x). (The RLS solution trades accuracy for robustness, at the expense of introducing bias.)

In many applications, the perturbation matrices ΔA, Δb have a known structure, and this information is not taken into account in the "additive model" above. For instance, ΔA might have a Toeplitz structure inherited from A. In this case, the worst-case residual (5) might be a very conservative estimate. We are led to consider the following Structured RLS (SRLS) problem. Given A₀, ..., A_p ∈ R^{n×m} and b₀, ..., b_p ∈ Rⁿ, we define for every δ ∈ Rᵖ,

  A(δ) ≜ A₀ + Σ_{i=1}^p δᵢAᵢ,  b(δ) ≜ b₀ + Σ_{i=1}^p δᵢbᵢ.   (6)

For ρ ≥ 0 and x ∈ Rᵐ, we define the structured worst-case residual as

  r_s(A, b, ρ, x) ≜ max_{||δ|| ≤ ρ} ||A(δ)x − b(δ)||.   (7)

We say that x is a Structured Robust Least Squares (SRLS) solution if x minimizes the worst-case residual r_s(A, b, ρ, x).

The above approach can be applied to this problem, once it is realized that the nominal least-squares problem, which is to minimize ||Ax − b||, can be written as an SDP:

  minimize λ subject to [λI  Ax − b; (Ax − b)ᵀ  λ] ⪰ 0.

In [3], it is shown that, in the additive model, the RLS solution is unique, as the solution to the strictly convex problem

  minimize ||Ax − b|| + ρ√(||x||² + 1).

The approach is very similar to a Tikhonov regularization: the "regularization parameter" is simply the perturbation level ρ. The cost of solving the above problem (via interior-point methods) is roughly equal to one SVD of A. In the general case (structured perturbation model), the robust solution is more expensive to compute; still, the problem is reduced to a (tractable) SDP. For more details, we refer to [3].
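As a numerical check not in the original paper, the additive worst-case residual admits the closed form ||Ax − b|| + ρ√(||x||² + 1): no perturbation with ||[ΔA Δb]||_F ≤ ρ can exceed it, and it is attained by a rank-one perturbation aligned with the residual. A sketch with made-up data (numpy):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)
x = np.linalg.lstsq(A, b, rcond=None)[0]   # any fixed x; here the LS solution
rho = 0.3

# Closed-form worst-case residual for the additive model.
wc = np.linalg.norm(A @ x - b) + rho * np.sqrt(np.linalg.norm(x)**2 + 1)

# Random perturbations [dA db] with Frobenius norm rho never exceed it.
for _ in range(1000):
    P = rng.standard_normal((6, 4))
    P *= rho / np.linalg.norm(P)           # scale to the boundary ||P||_F = rho
    dA, db = P[:, :3], P[:, 3]
    assert np.linalg.norm((A + dA) @ x - (b + db)) <= wc + 1e-9

# The worst case is attained by the rank-one perturbation rho * u * v^T,
# where u aligns with the residual and v = [x; -1] (normalized).
r = A @ x - b
u = r / np.linalg.norm(r)
v = np.concatenate([x, [-1.0]]) / np.sqrt(np.linalg.norm(x)**2 + 1)
P_star = rho * np.outer(u, v)
dA, db = P_star[:, :3], P_star[:, 3]
attained = np.linalg.norm((A + dA) @ x - (b + db))
assert abs(attained - wc) < 1e-9
```

Minimizing wc over x (rather than fixing x as above) yields the RLS solution; as noted in the text, that minimization is a strictly convex problem akin to Tikhonov regularization.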

3.2 Accurate inverses

Consider the problem of computing the inverse of a nonsingular matrix A. The classical notion of inverse (whichever way it is numerically computed) neglects the possibility of large and structured perturbations.

To make this point clear, let us first see why the classical definition of the inverse of a matrix neglects the possibility of large perturbations. Consider the scalar equation ax = 1, where a is unknown-but-bounded (say, a ∈ I = [a − ρ, a + ρ], where ρ > 0 is given and a > ρ). The possible values of the solution lie in the interval X = [(a + ρ)⁻¹, (a − ρ)⁻¹]. In the absence of more information about the "distribution" of a in the interval I, the "best" value of the inverse is not a⁻¹ (the classical inverse). A more accurate value is the center of the interval X, that is, a/(a² − ρ²).

Perturbation structure is also neglected in the classical definition of a (matrix) inverse. Consider again a scalar equation ax = 1, where a = c², and the "Cholesky factor" c is unknown-but-bounded (say, c ∈ X = [c − ρ, c + ρ]). As before, we may define an "accurate inverse" as the center of the set of possible values of c⁻², which is (a + ρ²)/(a − ρ²)². Note that this value is in general different from its "unstructured" counterpart.

The above remarks call for a precise study of the effect of not necessarily small, structured perturbations on the inverse of A. For this we introduce a very general model for the perturbation structure. We assume that the perturbation is a p × q matrix Δ that is restricted to a given linear subspace V ⊆ R^{p×q}. We then assume that the perturbed value of A can be written in the "linear-fractional representation" (LFR)

  A(Δ) = A + LΔ(I − DΔ)⁻¹R,   (8)

where A is the (square, invertible) "nominal" value, and L, R, D are given matrices of appropriate size. The norm used to measure the perturbation size is the largest singular value norm, ||Δ||. We say that X is an accurate inverse over V_ρ for the structured matrix A if it minimizes the maximal inversion error at X, defined as

  max { ||A(Δ)⁻¹ − X|| : Δ ∈ V, ||Δ|| ≤ ρ }.   (9)

It turns out that the above problem is amenable to the technique described above. In the case when the perturbation is additive (the perturbed value of A is A + Δ, where ||Δ|| ≤ ρ and otherwise unknown), we can show that the accurate inverse takes the form X(ρ) = (AᵀA − ρ²I)⁻¹Aᵀ.

The accurate inverse above comes up in the Total Least Squares (TLS) problem. Precisely, the solution of the TLS problem

  minimize ||[ΔA Δb]|| subject to (A + ΔA)x = b + Δb,

where b ∈ Rⁿ is given, is (except in degenerate cases)

  x_TLS = (AᵀA − ρ²_TLS I)⁻¹Aᵀb = X(ρ_TLS)b,

where ρ_TLS = σ_min([A b]). The solution of the standard Least Squares (LS) problem, min_x ||Ax − b||, involves the classical inverse: x_LS = X(0)b. The solution to the TLS problem is x_TLS = X(ρ_TLS)b, where X(ρ_TLS) is the accurate inverse of A. The TLS method thus amounts to first computing the smallest perturbation level ρ_TLS necessary to make the linear system Ax = b consistent. Then, the TLS solution is computed via the accurate inverse (with level ρ_TLS), while the LS solution uses the standard inverse. This is coherent with the observation that "the TLS solution is more accurate than the LS one" [6]. Indeed, the TLS solution works with an inverse matrix that best approximates the possible values of (A + ΔA)⁻¹, over the smallest perturbation range making the system consistent.
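The TLS connection can be illustrated numerically (this example is not from the original paper): computing x_TLS both by the classical SVD construction of [6] and by the "accurate inverse" formula X(ρ_TLS)b gives the same vector. A sketch with made-up data (numpy):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((8, 3))
b = rng.standard_normal(8)

# Classical TLS via the SVD of the compound matrix [A b]:
# x_TLS = -v[:m] / v[m], where v is the right singular vector of [A b]
# associated with its smallest singular value.
Ab = np.column_stack([A, b])
U, s, Vt = np.linalg.svd(Ab)
v = Vt[-1]                       # right singular vector for sigma_min([A b])
x_svd = -v[:3] / v[3]

# "Accurate inverse" route: x_TLS = (A^T A - rho_TLS^2 I)^{-1} A^T b,
# with rho_TLS = sigma_min([A b]).
rho_tls = s[-1]
x_acc = np.linalg.solve(A.T @ A - rho_tls**2 * np.eye(3), A.T @ b)

assert np.allclose(x_svd, x_acc)

# For comparison, the LS solution uses the "standard inverse" X(0).
x_ls = np.linalg.solve(A.T @ A, A.T @ b)
```

By construction x_ls minimizes ||Ax − b||, so its nominal residual is never larger than that of x_TLS; the TLS solution instead best accounts for perturbations of A itself, as discussed above.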

4. Concluding Remarks

We have sought to show that the techniques now widely used in robust control can be applied to any problem with uncertain data, provided (i) the "nominal" problem is an SDP, and (ii) the perturbation model is known. This approach is particularly relevant in structured linear algebra, to compute robust or accurate solutions to structured linear equations.

5. References

[1] A. Ben-Tal and A. Nemirovski. Robust convex programming. To appear in IMA J. Numer. Anal., March 1998.
[2] M.A.H. Dempster. Stochastic programming. Academic Press, 1980.
[3] L. El Ghaoui and H. Lebret. Robust solutions to least-squares problems with uncertain data matrices. SIAM J. Matrix Anal. Appl., October 1997. To appear.
[4] M. K. H. Fan, A. L. Tits, and J. C. Doyle. Robustness in the presence of mixed parametric uncertainty and unmodeled dynamics. IEEE Trans. Aut. Control, 36(1):25-38, Jan 1991.
[5] F. Oustry, L. El Ghaoui, and H. Lebret. Robust solutions to uncertain semidefinite programs. Submitted to SIAM J. Opt., 1996.
[6] S. Van Huffel and J. Vandewalle. The total least squares problem: computational aspects and analysis, volume 9 of Frontiers in Applied Math. SIAM, Philadelphia, PA, 1991.
[7] K. Zhou, J. Doyle, and K. Glover. Robust and Optimal Control. Prentice Hall, New Jersey, 1995.
